The Vowel Demonstrator uses a speech synthesizer to illustrate the basics of vowel production.
This figure shows the source spectrum of the vowel. The acoustic source for vowels is vibration of the vocal folds, which repeatedly open and close the glottis (the space between them). These vibrations result from muscle actions that both 1) push air between the vocal folds, tending to separate them; and 2) tense the vocal folds, tending to pull them back together. Because the vocal folds are elastic, these forces repeatedly push the vocal folds gradually apart and then snap them back together more rapidly. Since the open/close cycles are regular and recur at approximately equal time intervals of T0, the resulting sound is periodic, with an amplitude spectrum composed of harmonics (integer multiples) of the fundamental frequency f0 = 1/T0. The shape of the spectral envelope (that is, the imaginary curve connecting the peaks of the harmonics) is determined by the exact shape (gradual opening, rapid closure) of the open/close cycle.
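The relationship between the glottal period and the harmonics can be sketched in a few lines of Python. This is only an illustration, not the Vowel Demonstrator's own code; the function name `harmonic_frequencies` and the example values (a 10 ms period, a 550 Hz upper limit) are assumptions chosen for the example.

```python
def harmonic_frequencies(period_s, max_hz):
    """List the harmonic frequencies (Hz) up to max_hz for glottal period T0 (s)."""
    f0 = 1.0 / period_s           # fundamental frequency, f0 = 1/T0
    count = int(max_hz // f0)     # number of harmonics at or below max_hz
    return [k * f0 for k in range(1, count + 1)]


# Hypothetical example: a 10 ms glottal cycle corresponds to f0 = 100 Hz.
for f in harmonic_frequencies(period_s=0.010, max_hz=550.0):
    print(f"{f:.0f} Hz")
```

Halving the period doubles f0, so the harmonics are spaced twice as far apart and fewer of them fall below the same upper limit.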
This figure shows the filtering effect of the current vocal tract configuration. The sound produced by the glottis is filtered as it passes through the vocal tract because the vocal tract has resonances which amplify some frequencies more than others. Each configuration of the vocal tract acts as a different acoustic filter because it has different resonances. The resonances of the vocal tract result in more amplification and higher output at some frequencies and less amplification and lower output at other frequencies. The bands of higher output are called formants. Vowels are characterized primarily by the frequencies of the first three formants, labeled F1, F2, and F3 (in ascending order). The relative amplitude of the formants is a secondary factor that affects vowel quality.
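To make the idea of a resonance-based filter concrete, the sketch below computes the gain curve of a cascade of two-pole digital resonators, the form commonly used in Klatt-style formant synthesizers. Whether the Vowel Demonstrator's synthesizer works this way is an assumption; the function names, the 10 kHz sample rate, and the formant frequencies and bandwidths are hypothetical values picked for illustration.

```python
import cmath
import math


def resonator_gain_db(freq_hz, formant_hz, bandwidth_hz, sample_rate_hz=10000.0):
    """Gain (dB) at freq_hz of one two-pole resonator centred near formant_hz."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz * t)
         * math.cos(2.0 * math.pi * formant_hz * t))
    a = 1.0 - b - c                      # normalises the gain to 0 dB at 0 Hz
    z_inv = cmath.exp(-2j * math.pi * freq_hz * t)
    h = a / (1.0 - b * z_inv - c * z_inv ** 2)
    return 20.0 * math.log10(abs(h))


def vocal_tract_gain_db(freq_hz, formants):
    """Total gain (dB) of resonators in cascade; gains in dB add."""
    return sum(resonator_gain_db(freq_hz, f, bw) for f, bw in formants)


# Hypothetical (frequency, bandwidth) pairs in Hz for F1, F2, F3.
example_formants = [(500.0, 60.0), (1500.0, 90.0), (2500.0, 150.0)]

for freq in range(0, 3001, 250):
    print(f"{freq:5d} Hz  {vocal_tract_gain_db(freq, example_formants):6.1f} dB")
```

In the printed table the gain is highest near 500, 1500, and 2500 Hz, which is where the formant peaks of this example filter lie.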
Click or drag on the “meters” (red or blue) below the graph of the filter to move the corresponding cursor (the red or blue vertical line) to one of the peaks in the graph. When the cursor is at a peak in the filter, it measures a formant frequency. The formant frequency is displayed in the meter, e.g., “1525 Hz”. The gain of the filter at the formant frequency is displayed at the top of the graph, in dB (e.g., “22 dB”). The corresponding cursors (red or blue vertical lines) in the graph of the source spectrum (to the left) and the output spectrum (to the right) display the corresponding amplitudes at the same frequency.
This figure shows the spectrum of the output sound. The output sound, the sound that would be recorded in front of the mouth, is the result of the filter acting on the source spectrum. It reflects both the downward tilt of the source spectrum and the resonances of the filter created by the vocal tract. At any given frequency, the amplitude of the output sound (in decibels, or dB) is equal to the amplitude of the glottal source plus the amplification (also called “gain”) from the filter.
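The additive relationship in decibels follows from the fact that a filter multiplies linear amplitudes, and the logarithm turns that product into a sum. The short sketch below checks this numerically; the helper names and the amplitude and gain values are hypothetical and serve only as an example.

```python
import math


def to_db(amplitude_ratio):
    """Convert a linear amplitude ratio to decibels."""
    return 20.0 * math.log10(amplitude_ratio)


def output_level_db(source_db, filter_gain_db):
    """Output level at one frequency: source level plus filter gain, both in dB."""
    return source_db + filter_gain_db


# Hypothetical values at a single frequency.
source_amplitude = 0.05   # linear amplitude of one source harmonic
filter_gain = 12.0        # linear amplification applied by the filter

multiplied = to_db(source_amplitude * filter_gain)
added = output_level_db(to_db(source_amplitude), to_db(filter_gain))
print(f"multiply then convert: {multiplied:.2f} dB")
print(f"convert then add:      {added:.2f} dB")
```

Both lines print the same value, which is why the amplitude shown by a cursor in the output spectrum equals the source amplitude plus the filter gain at that frequency.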
Click or drag on the slider’s handle to adjust the fundamental frequency. The synthesizer will play the glottal source sound while you are adjusting the fundamental. Note that the spacing between harmonics changes as the fundamental frequency changes, because the harmonics are at frequencies which are multiples of the fundamental frequency (2 x f0, 3 x f0, 4 x f0, etc.). The shape of the source spectrum is determined by the open/close cycle, which is set in the speech synthesizer. Note also that the envelope of the source spectrum by and large does not change when the vocal tract configuration changes.
This figure shows a sagittal section of a vocal tract, including the lips, tongue, jaw, and teeth. Different vowels are produced by different configurations of these articulators. Moving the articulators changes the shape and size of the hollow spaces (cavities) in the vocal tract, which in turn changes its resonances. The most important factor is the position of the tongue in the oral cavity. Tongue position is described by how high, and how far back, the tongue is in the mouth. In the Vowel Demonstrator, vocal tract configuration is controlled by positioning the cursor in the vowel quadrilateral.
Another important factor in vowel production is lip rounding. With the tongue in a given position, the character of a vowel can be changed markedly by rounding or unrounding the lips.
Click on the “lip rounding” checkbox to the right to change the lip posture.
The vowel quadrilateral is a schematic roughly indicating tongue position for the production of various rounded and unrounded vowels. The left side of the quadrilateral indicates vowels in which the tongue is positioned towards the front of the mouth; the right side indicates vowels with the tongue towards the back. The top of the vowel quadrilateral indicates vowels with the tongue raised close to the roof of the mouth; the bottom of the vowel quadrilateral indicates low tongue and jaw positions that produce a more open vocal tract.
Click or drag on the vowel quadrilateral to change the vocal tract configuration and listen to the corresponding synthetic vowel.
Click on the “IPA cardinals” checkbox to the right to see the IPA symbols for rounded and unrounded vowels on the vowel quadrilateral.