Video Phonetics Database and Program, Model 5150

The Video Phonetics Database and Program, Model 5150, is a powerful new option for CSL and Multi-Speech that displays a video window in conjunction with acoustic data. The two are linked in time, either for synchronous real-time display or for in-depth analysis by moving the cursor along the waveform. The audio from standard Windows™ AVI files can be saved in WAV (.wav) or CSL (.nsp) format.

Video data can be acquired using generic multimedia cards. The program supports standard video inputs (e.g., Cinepak’s AVI). The audio information is loaded separately into a waveform display window for analysis.

Applications

Video data suitable for capture might include a close-up of lip movement, a picture of the client’s face, endoscopic images of the larynx, jaw movement, or a speaker’s posture.

This capability can be used in a variety of applications such as teaching acoustic phonetics, linguistics research, articulation training, and second language training.

All of the powerful acoustic analysis features of Multi-Speech are available in this program. When a video file is loaded or an IPA symbol is selected from a displayed chart, a video box opens and is associated with the active window. The video data are loaded, and the audio portion of the AVI file is displayed as a waveform. The IPA character, if available, is entered on the IPA transcription line at the start of the waveform.

Video Database

The Video Phonetics Database is included with the Video Phonetics Program. It was developed at the University of Victoria in Victoria, BC, Canada. This database consists of more than 200 audio/video files in AVI file format, illustrating sample pronunciations of the International Phonetic Alphabet (IPA) symbols. At least one AVI file exists for each IPA symbol, and two files are provided for most. The total size of the database is about 80 Mbytes.

Video clips of each sample pronunciation are loaded from the database and shown on the computer screen along with a waveform display of the audio signal. When a clip is played with audio, the data cursor in the waveform display tracks the current output location. A spectrogram can easily be generated in another window. Alternatively, two video clips may be loaded, each associated with a different waveform window.
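The time linkage between the waveform cursor and the video can be illustrated with a little arithmetic. The following is a minimal sketch, not the program's actual implementation: the function names, the 44,100 Hz sample rate, and the 30 frames-per-second video rate are all assumptions chosen for illustration. It converts a cursor position (an audio sample index) to the video frame shown at that instant, and back.

```python
import math
from fractions import Fraction


def sample_to_frame(sample_index: int, sample_rate: int, fps) -> int:
    """Return the index of the video frame displayed at an audio sample.

    sample_rate (Hz) and fps (frames per second) are assumed values
    supplied by the caller; they are not taken from the product.
    """
    if sample_index < 0:
        raise ValueError("sample_index must be non-negative")
    # Exact rational arithmetic avoids rounding errors at frame boundaries.
    position = Fraction(sample_index, sample_rate) * Fraction(fps)
    return math.floor(position)


def frame_to_sample(frame_index: int, sample_rate: int, fps) -> int:
    """Return the first audio sample that falls inside a video frame."""
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    return math.ceil(Fraction(frame_index) * sample_rate / Fraction(fps))


if __name__ == "__main__":
    # Cursor halfway through one second of 44.1 kHz audio, 30 fps video.
    print(sample_to_frame(22050, 44100, 30))
    print(frame_to_sample(15, 44100, 30))
```

Because one video frame spans many audio samples, moving the cursor within a frame leaves the displayed image unchanged, while stepping through frames moves the cursor in jumps of sample_rate / fps samples.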

Automatically Analyze Video and Audio
The Video Phonetics option allows video/audio files to be loaded and acoustically analyzed. In the example above, a pseudopalate was worn and the utterance "Kay is in Lincoln Park" was captured along with the synchronized video and Palatometer display showing linguapalatal contact. The video and Palatometer display (during the "n") are linked to the cursor position on the spectrogram and waveform (with IPA transcription).

The video samples in the Video Phonetics Database are taken from various camera positions that best represent the articulation in question. In some cases, this means displaying the subject’s lips and jaw movement, while in other cases, it means displaying the tongue position or movement closely enough to observe oral gestures. In still other cases, the significant articulation may take place in the pharynx or at the glottis, and an endoscopic view is used. These video samples provide an excellent means of comparing articulatory gestures and their concomitant speech sounds, for such applications as training in acoustic phonetics, articulation training, or pronunciation training for second language instruction.

Four video IPA charts and lists, which display the IPA symbols, are included with the video IPA program: the Vowel Chart, the Pulmonic Consonants Chart, the Non-Pulmonic Consonants Chart, and the Other IPA Characters List.

Includes Four IPA Charts
Video Phonetics includes four IPA charts (pulmonic consonant chart displayed above) where a user can select any phoneme with a click of the mouse and then view the video and acoustic data associated with that sound for one or two speakers for each symbol.

Audio and Video Acquisition

The program reads standard AVI files and separates the audio waveform from the video into a linked window so that the signals can be analyzed. While any video card (not included with the program) can be used, the audio input on most video capture cards is not good enough for reliable acoustic analysis. Users should use a high-end video card or a separate audio system. The Video Phonetics Database was recorded using the miroVIDEO DC30plus for video and the CSL, Model 4300B, for the audio signals. Capturing palatometric data together with video required two computers.
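As background on the file formats involved: AVI and WAV files are both RIFF containers, distinguished by a four-character form type ("AVI " with a trailing space, or "WAVE") that follows the 4-byte "RIFF" tag and a 4-byte little-endian size field. The sketch below, with a hypothetical function name, shows how a file's first 12 bytes can be inspected to tell the two apart. It is offered only as an illustration of the container layout and says nothing about how the Video Phonetics Program itself reads files or about the CSL (.nsp) format.

```python
import struct
from typing import Optional, Tuple


def read_riff_header(header: bytes) -> Optional[Tuple[str, int]]:
    """Return (form_type, payload_size) from a RIFF header, or None.

    header should hold at least the first 12 bytes of the file.
    The function name is hypothetical and used for illustration only.
    """
    if len(header) < 12:
        return None
    # Layout: 4-byte tag, 4-byte little-endian size, 4-byte form type.
    magic, size, form = struct.unpack("<4sI4s", header[:12])
    if magic != b"RIFF":
        return None
    return form.decode("ascii", errors="replace"), size


if __name__ == "__main__":
    # Build two synthetic headers rather than relying on real files.
    avi_header = b"RIFF" + struct.pack("<I", 1024) + b"AVI "
    wav_header = b"RIFF" + struct.pack("<I", 36) + b"WAVE"
    print(read_riff_header(avi_header))
    print(read_riff_header(wav_header))
```

To check a real file, the first 12 bytes can be read in binary mode and passed to the function; a form type of "AVI " indicates a video container whose audio stream would still need to be extracted before analysis.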

Videostroboscopic Image from Digital Strobe
A videostroboscopic image from Kay's Digital Strobe shows the slow motion vibratory characteristics of the vocal folds. Also displayed are the waveform and a narrow-band spectrogram.

Summary

The Video Phonetics Database and Program provides a rich array of audio/video samples covering the complete IPA symbol set. It is an excellent educational tool. The program includes a comprehensive set of acoustic analysis tools so that the physiological basis of the sounds associated with the video can be readily studied.

Current CSL, Model 4500 and 4150, software and database options include:

Copyright © 1996-2008 KayPENTAX. All rights reserved.