











Video Phonetics Database and Program, Model 5150
The Video Phonetics Database and Program, Model 5150, is a powerful new option for CSL and
Multi-Speech that displays a video window in conjunction with acoustic data. The data are
linked in time for synchronous real-time display or for in-depth analysis via cursor
movement along the waveform. The audio from standard Windows AVI files can be saved in
WAV (.wav) or CSL (.nsp) format.
Video data can be acquired using generic multimedia cards. The program supports
standard video inputs (e.g., Cinepak AVI). The audio information is loaded
separately into a waveform display window for analysis.
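The time linkage between the waveform cursor and the video window can be illustrated with a small calculation. The sketch below is not taken from the Video Phonetics program; it only assumes that the audio has a fixed sample rate and the video a fixed frame rate, and the function names `sample_to_frame` and `frame_to_sample_range` are hypothetical. Exact fractions are used so that frame boundaries are not shifted by floating-point rounding.

```python
import math
from fractions import Fraction


def sample_to_frame(sample_index, sample_rate, fps):
    """Return the index of the video frame shown at a given audio sample.

    Assumes audio and video start at the same instant and that both
    rates are constant (an assumption made for this illustration).
    """
    if sample_index < 0:
        raise ValueError("sample_index must be non-negative")
    seconds = Fraction(sample_index, sample_rate)
    return math.floor(seconds * Fraction(fps))


def frame_to_sample_range(frame_index, sample_rate, fps):
    """Return (start, end) audio samples covered by a frame; end is exclusive."""
    samples_per_frame = Fraction(sample_rate) / Fraction(fps)
    start = math.ceil(frame_index * samples_per_frame)
    end = math.ceil((frame_index + 1) * samples_per_frame)
    return start, end


# A cursor placed half a second into a 44.1 kHz recording,
# with video at 30 frames per second, falls on frame 15.
print(sample_to_frame(22050, 44100, 30))
```

Moving the cursor by fewer samples than one frame spans (1470 samples in this example) leaves the displayed frame unchanged, which is why video resolution in time is much coarser than the waveform's.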
Applications
Video data suitable for capture might include a close-up of lip movement, a picture of
the client's face, endoscopic images of the larynx, jaw movement, or a speaker's
posture.
This capability can be used in a variety of applications such as teaching acoustic
phonetics, linguistics research, articulation training, and second language training.
All of the powerful acoustic analysis features of Multi-Speech are available in this
program. When a video file is loaded or an IPA symbol is selected from a displayed chart,
a video box opens and is associated with the active window. The video data are loaded,
and the audio portion of the AVI file is displayed as a waveform. The IPA character, if
available, is entered on the IPA transcription line at the start of the waveform.
Video Database
The Video Phonetics Database is included with the Video Phonetics Program. It was
developed at the University of Victoria in Victoria, BC, Canada. This database consists of
more than 200 audio/video files in AVI file format, illustrating sample pronunciations of
the International Phonetic Alphabet (IPA) symbols. At least one AVI file exists for each
IPA symbol, and two files are provided for most. The total size of the database is about
80 Mbytes.
Video clips of each sample pronunciation are loaded from the database and displayed on
the computer screen along with a waveform display of the audio signal. When the clip is
played with audio, the data cursor in the waveform display tracks the current output
location. A spectrogram can easily be generated in another window. Alternatively, two
video clips may be loaded, each associated with a different waveform window.

The Video Phonetics option allows video/audio files to be loaded and acoustically
analyzed. In the example above, a pseudopalate was worn and the utterance "Kay is in
Lincoln Park" was captured along with the synchronized video and Palatometer display
showing linguapalatal contact. The video and Palatometer display (during the "n") are
linked to the cursor position on the spectrogram and waveform (with IPA transcription).
The video samples in the Video Phonetics Database are taken from the camera positions
that best represent the articulation in question. In some cases, this means displaying
the subject's lips and jaw movement, while in other cases, it means displaying the tongue
position or movement closely enough to observe oral gestures. In still other cases, the
significant articulation may take place in the pharynx or at the glottis, and an
endoscopic view is used.
These video samples provide an excellent means of comparing articulatory gestures and
their concomitant speech sounds, for such applications as training in acoustic phonetics,
articulation training, or pronunciation training for second language instruction. Four
video IPA charts and lists, which display the IPA symbols, are included with the video IPA
program: the Vowel Chart, the Pulmonic Consonants Chart, the Non-Pulmonic Consonants
Chart, and the Other IPA Characters List.

Video Phonetics includes four IPA charts (pulmonic consonant chart displayed above) where
a user can select any phoneme with a click of the mouse and then view the video and
acoustic data associated with that sound for one or two speakers for each symbol.
Audio and Video Acquisition
The program reads standard AVI files and separates the audio waveform from the video into
a linked window so that the signals can be analyzed. While any video card (not included
with the program) can be used, the audio input on most video capture cards is not good
enough for reliable acoustic analysis. Users should use a high-end video card or a
separate audio system. The Video Phonetics Database was recorded using the miroVIDEO
DC30plus for video and the CSL, Model 4300B, for the audio signals. Capturing
palatometric data together with video required two computers.

A videostroboscopic image from Kay's Digital Strobe shows the slow-motion vibratory
characteristics of the vocal folds. Also displayed are the waveform and a narrow-band
spectrogram.
Summary
The Video Phonetics Database and Program provide a rich array of audio/video samples
covering the complete IPA symbol set. Together they make an excellent educational tool.
The program includes a comprehensive set of acoustic analysis tools so that the
physiological basis of the sounds associated with each video can be readily studied.
Current CSL, Model 4500 and 4150, software and database options include:
- Analysis-Synthesis Laboratory (ASL), Model 5104
- Applied Speech Science for Dysarthrias, Model 5153
- Applied Speech Science for Voice & Resonance Disorders, Model 5156
- Auditory Feedback Tools, Model 3506
- Disordered Voice Database, Model 4337
- Games, Model 5167
- Motor Speech Profile, Model 5141
- Multi-Dimensional Voice Program, Model 5105
- Neuroscience for Human Communications, Model 5155
- Palatometer Database, Model 4333
- Phonetic & Perception Simulation Programs, Model 5151
- Phonetic Database, Model 4332
- Real-Time EGG Analysis, Model 5138
- Real-Time Pitch, Model 5121
- Real-Time Spectrogram, Model 5129
- Respiration, Phonation and Prosody Simulation, Model 5152
- Signal Enhancement Program, Model 5142
- Sona-Match, Model 5127
- Speech Articulation: Animation of Muscle Vectors, Model 5154
- Video Phonetics Program and Database, Model 5150
- Voice Range Profile, Model 4326