COMMUNIKAY Vol. 3, No. 2
Multi-Speech Added to Kay's Acoustic Product Line

Kay has introduced a new, low-cost speech analysis product called Multi-Speech, which uses standard Windows® multimedia hardware (e.g., Sound Blaster boards) to capture, analyze, and play speech samples. Multi-Speech is the latest addition to Kay's family of acoustic analysis products, which includes the Computerized Speech Lab (CSL), DSP Sona-Graph, and Visi-Pitch II. Although there is overlap in functionality, each of these serves specific needs across the wide spectrum of Kay's customers. Multi-Speech is designed for clinics, universities, or individuals with limited budgets, or for those who wish to outfit multiple computers with sophisticated speech analysis software.

Multi-Speech contains virtually all of the powerful analysis and editing features of the core CSL software. As a Windows application, the program has a different look and feel from CSL, but CSL users will quickly acclimate to features with which they are familiar. Analysis tools include spectrograms, power spectra, LPC analysis, long term average spectra, pitch, amplitude, and cepstrum. On-line Windows Help, impressive editing features, logging of extracted parameters, macros, and IPA annotation are a sampling of the utilities available in this feature-rich program. An extensive tutorial is supplied to introduce new users to menu selections and program operations.

Like all programs based on multimedia hardware, Multi-Speech is somewhat limited by the performance specifications of these sound cards, notably their signal-to-noise acquisition quality (usually 50-60 dB). Unless careful attention is given to signal level during acquisition (often resulting in multiple input attempts to obtain full dynamic range or avoid peak clipping), the resolution of the acquired data may not be adequate for certain clinical, and most research, applications. However, Multi-Speech can read both .wav and .nsp file formats.
As a result, it is an ideal option for CSL, Visi-Pitch II, or other systems with robust hardware platforms for data acquisition (>80 dB signal-to-noise) which can store files in .nsp or .wav file formats.

Depending on the host computer's speed, Multi-Speech provides rapid analysis of acquired speech, but not in true real-time. As a result, it should not be considered a visual feedback tool (like Visi-Pitch II), which is required for many therapy applications.

Multi-Speech brings Kay quality to the inexpensive multimedia hardware environment for speech acquisition and analysis. It is an impressive software package with a rich array of features for the price.

Swallowing Workstation Granted FDA Clearance

Kay's Swallowing Workstation has recently received FDA clearance. Integrated with all hardware components on a mobile cart and with Windows-based software, the workstation provides clinicians with a rich set of tools for observing swallowing processes in real-time and making quantitative measurements. Further information is available upon request.

Overview of Current CSL Options

The listing below briefly summarizes the various CSL options. Also included with the descriptions are the primary application areas for each product.

Synthesis Program, Model 4304. Extends the LPC (linear predictive coding) analysis capabilities of core CSL software by providing synthesis of the speech signal, editing of extracted speech parameters (including formant and pitch manipulation), and resynthesis. Primary application: teaching and linguistics research.

Multi-Dimensional Voice Program (MDVP), Model 4305. Clinical quantitative assessment of voice quality, calculating over 20 parameters on a single sustained phonation. Graphically and numerically compares extracted parameters with built-in database. Primary application: clinical voice.

Voice Range Profile, Model 4326. Plots maximum and minimum amplitude (dB SPL) of sustained phonation throughout the subject's fundamental frequency range.
The two-dimensional plot is a revealing "snapshot" (also called phonetogram) for detecting subtle changes of vocal function even in professional singers. Primary application: clinical voice.

Sona-Match, Model 4327. Real-time spectral display of vowel formant frequencies and sibilant "shape". Primary application: general purpose clinical and teaching.

Delayed Auditory Feedback, Model 4328. High fidelity auditory feedback which is delayed by user-defined intervals from 25-500 msec. Primary application: clinical fluency training.

Real-Time Spectrogram, Model 4329. True real-time display of the "gold standard" spectrogram. Primary application: general purpose clinical and teaching.

CSL-Pitch, Model 4331. Real-time fundamental frequency and energy displays with statistical analysis. Primary application: general purpose clinical.

Phonetic Database, Model 4332. CD-ROM of more than 4,000 digitized samples from 45 languages. Documentation also provides consonant/vowel charts for each language along with IPA transcription. Primary application: teaching.

Palatometer Database, Model 4333. A total of 146 files of spoken English (two speakers) which include acoustic and palatometric data acquired with Kay's Palatometer. Primary application: teaching and clinical articulation.

IPA Transcription Tutorial, Model 4335. Multimedia learning program for the International Phonetic Alphabet. Sound, text, IPA symbols, spectrographic analysis, and selected linguapalatal displays are included to help students learn IPA. Primary application: teaching.

Disordered Voice Database, Model 4337. CD-ROM of 1,400 normal and disordered voices collected at the Massachusetts Eye and Ear Infirmary Voice and Speech Lab. Both sustained utterances and connected speech samples on each subject. Primary application: teaching/research and clinical voice.

EGG Processing, Model 4338. Real-time display of EGG waveform and analysis of various quotients (requires electroglottograph).
Primary application: clinical voice.

Motor Speech Profile, Model 4341. Extracts parameters relevant to dysarthric speech. Protocols provided for eliciting speech utterances with automatic statistical calculations and report summary. Primary application: clinical motor speech.

Signal Enhancement in Noise, Model 4342. Powerful editing tool for enhancing signals corrupted by noise. Primary application: forensics and law enforcement.

Auditory Perception Program and Database, Model 4343. Combination of software (for signal presentation and subject scoring) and CD-ROM database to help construct perception experiments. Primary application: teaching and research.

Condenser Microphone, Model 4302. Head-mounted, professional grade condenser microphone. Supplied with VRP but available as option. Primary application: research and clinical usage.

DAT Interface and Four Channel Input, Model 4311. Allows DAT recorded data to be directly linked to PC via CSL without resampling; also provides two additional channels to CSL hardware, allowing four channels of concurrent data acquisition. Primary application: research and clinical.

Direct-to-Disk, Model 4321. Allows input signal to be written directly to the computer's hard drive. Primary application: when signal length is many minutes in duration.

Programmer's Kit, Model 4322. Instructions on how to use the CSL input/output hardware in conjunction with other software. Primary application: computer programmers.

What is the difference between the various lens adapters Kay offers for rigid and flexible endoscopes?

Lens adapters serve two functions. First, they are a coupling and focusing device between the endoscope and the CCD camera. Second, they magnify the endoscopic image by varying amounts depending on the focal length of the lens. Longer focal length results in greater magnification and some, though minimal, loss of light.
Kay offers three different lens adapters, which are specified according to their focal lengths (28 mm, 35 mm, and 45 mm). The 35 mm lens adapter is generally considered optimal for Kay's rigid endoscope and most flexible endoscopes. However, a 35 mm lens adapter may not be best suited for other rigid endoscopes. Because an endoscope has its own magnification, it is the combination of the particular endoscope you use and the focal length of the lens adapter that determines the size of the image you see on the monitor. If you think the image on your monitor is too large or too small, you may want to consider using a different lens adapter.

Can I use percutaneous EMG with the Swallowing Workstation?

Percutaneous EMG is not included with the workstation, but an existing EMG system can be used in conjunction with it. Assuming you have a percutaneous EMG system that includes one or more analog outputs, you can connect it to one of the swallowing system's auxiliary channels. The EMG data can then be correlated with other data (video or other physiologic signals) acquired concurrently with the system.

How do you recommend keeping the large number of switches in their proper positions on the front of the strobe's VCR? Our facility has had problems a couple of times due to a switch position getting changed.

The Mitsubishi VCR (BV-2000) has 14 switches on the front panel; if one of these is inadvertently moved to the incorrect position (e.g., when the equipment is moved), operational problems can occur. We have observed that some facilities tape the switches in their proper position; others place small colored dots on the side of the correct switch position. More recently, Kay has made available a clear plastic switch template which adheres to the front of the VCR and keeps the switches from being toggled. The template has to be removed if you wish to change a setting, but it comes off easily. These templates are available at no charge to strobe customers.
Contact the customer service department by phone (Ext. 144) or e-mail to: service@kayelemetrics.com.

Changing the dynamic range of spectrograms within CSL

Spectrographic analysis settings are typically adjusted by varying filter bandwidth (FFT size) to produce wideband or narrowband displays (see CommuniKay Vol. 2, No. 2). Users may occasionally need to adjust dynamic range in a spectrogram, depending on the acquired signal. Dynamic range determines what dB levels are translated into grey scale levels in the spectrogram. In the CSL program, dynamic range is changed by adjusting the Darkness Scale (for command line users, set spg.scale n...).

To make dynamic range adjustments, click Main Menu selection Analysis, and then the following selections in the pull-down menu: "Spectrogram...", then "Options...", then "Darkness Scale". The default settings of 18 dB to 48 dB (relative dB scale) generally produce good spectrograms of speech samples. To change the settings, enter new minimum and maximum values, and click "Smooth" to create a linear scale; then reanalyze to see the results.

You may wish to maintain the same range (30 dB), but "slide" the window; for example, you could change the range from 25 dB to 55 dB when there is ambient noise that you do not want to see spectrographically. Alternatively, you may wish to decrease (e.g., to 20 dB) or increase (e.g., to 40 dB) the dynamic range. If you want to see darker formants in a speech signal, or view low energy nasal formants, decrease the range (e.g., 18 dB to 38 dB). Experiment with the darkness scale settings on the particular signals you are evaluating to determine what provides you with the most meaningful analysis representation.
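
For readers who like to see the arithmetic, the effect of the Darkness Scale settings can be sketched in a few lines of Python. This is only an illustration, not CSL's actual implementation: the function name `darkness`, the 0.0 (white) to 1.0 (black) output scale, and the strictly linear mapping with clipping at both ends are all assumptions made for the example.

```python
def darkness(level_db: float, min_db: float = 18.0, max_db: float = 48.0) -> float:
    """Map a relative dB level to a darkness value between 0.0 and 1.0.

    Hypothetical illustration only: assumes a linear grey scale in which
    levels at or below min_db print white (0.0) and levels at or above
    max_db print fully dark (1.0).
    """
    if max_db <= min_db:
        raise ValueError("max_db must be greater than min_db")
    fraction = (level_db - min_db) / (max_db - min_db)
    # Clip so that out-of-range levels saturate at white or black.
    return min(1.0, max(0.0, fraction))


# Default 18-48 dB window: a 33 dB component sits mid-scale.
print(darkness(33.0))              # 0.5
# Narrower 18-38 dB window: the same component prints darker.
print(darkness(33.0, 18.0, 38.0))  # 0.75
# Window slid to 25-55 dB: ambient noise at 22 dB is no longer drawn.
print(darkness(22.0, 25.0, 55.0))  # 0.0
```

Under these assumptions, narrowing the range darkens a component of a given level, while sliding the window upward removes low-level noise from the display, which matches the behavior described above.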
Using Kay's Databases for Teaching Applications

Kay offers four databases for a variety of speech/voice applications to complement the CSL, Multi-Speech, and Visi-Pitch II analysis systems: the Phonetic Database (CD-ROM with more than 4500 sample files), the Disordered Voice Database (CD-ROM with more than 1400 sample files), the Palatometer Database (five disk set with 146 files), and the Perception Database (CD-ROM with more than 2500 sample files). The databases are designed to aid in a variety of teaching, clinical, and research applications. Convenient access to these carefully collected samples provides a valuable resource for teaching acoustic phonetics, ear-training for vocal pathology, IPA transcription, and speech perception.

The Phonetic Database, Model 4332, was developed at the University of Victoria, with contributions from others, to provide representative speech samples from 45 languages. The 4500 different files can be used to study the acoustic properties of various languages. These speech samples can also be used in conjunction with the IPA Transcription Tutorial, Model 4335, as a source for transcription exercises, by loading the Phonetic Database CD-ROM and importing the desired files. Tags (markers) with remarks, as well as IPA characters, can be added at segments of interest in the speech tokens. The tag comments can be used as hints for students when the IPA Tutorial is used as a teaching exercise.

The Disordered Voice Database, Model 4337, was collected at the Massachusetts Eye and Ear Infirmary Voice and Speech Lab. This database provides a rich source of both spoken speech and sustained phonation from over 700 patients with various voice pathologies. Samples from normal voices are also included. Using MDVP or CSL, these samples can be loaded, played, and analyzed. Students can then compare their perceptual voice evaluation with analysis results obtained from the MDVP program.
In addition, the Auditory Perception Program and Database, Model 4343, can be used for more structured ear-training exercises or psychoacoustic experiments.

The Palatometer Database, Model 4333, contains examples of the articulatory patterns used by two different speakers reading standardized passages. Differences and similarities can be readily seen. Students can also see the correlation between physiology and resultant acoustic signal by comparing the spectrograms and palatograms.

The Perception Database (included with Model 4343 noted above) consists of samples specifically designed for teaching psychoacoustic concepts and speech perception (see CommuniKay Vol. 2, No. 3 for a review of this product). Files contained in stored lists for demonstrating specific principles can be presented to the subject for training or experimental applications. Users can construct their own lists for presentation to students. This product is ideal for teaching a course on perception.

All of the databases provide a rich source of material to train the ear, learn IPA transcription, and teach the student the correlation between perceptual evaluation and acoustic measurement. The databases include comprehensive documentation and work with CSL, Visi-Pitch II, or Multi-Speech.

Please look for Kay products on display at the following conferences, workshops, and congresses.

On September 6-7, Kay will once again present The Advanced Stroboscopy Workshop: Operations and Interpretation, to be held in Lincoln Park, New Jersey. The workshop is designed for current stroboscope users, including laryngologists, speech-language pathologists, and voice scientists. This workshop and others are offered as part of Kay's commitment to providing customers with operational support and continuing education in applications related to Kay products.

The complete workshop will run for two days from 8:30 a.m. to 5:30 p.m. The first day features advanced operations and will be taught by Kay's product specialists. Brief lecture segments will be interspersed with hands-on exercises. On the second day, David Eibling, M.D., University of Pittsburgh, and Rebecca Leonard, Ph.D., University of California-Davis, will discuss interpretation of stroboscopy procedures using samples from their own practices. Attendees are invited to bring their own difficult cases (VHS or S-VHS format) for review and discussion.

Because of the hands-on nature of the workshop, registration is limited. The cost for the two days is $495. For either day alone, the cost is $295. The fee does not include transportation or accommodations; however, it does include workshop materials, refreshments, lunch, and dinner on Friday evening. Note that registration and payment of the course fee are required a minimum of four weeks prior to the workshop. Refunds will be granted provided that written notice of cancellation is received at least two weeks prior to the start of the workshop.

For further information, call 1-800-289-5297 (in USA and Canada only) or (201) 628-6200, Ext. 162. For international customers, our fax number is (201) 628-6363.
Copyright © 1996-2008 KayPENTAX. All rights reserved.