Last updated: May 07, 2009

KayPENTAX in the News...Recently Published Studies Involving KayPENTAX Instrumentation

Current postings (see below) include 368 article abstracts that represent a sampling of the recent publications citing our instrumentation.

By Product Line:

Acoustic Analysis

Aerodynamics

Flexible Endoscopes (including Transnasal Esophagoscopes)

Pulsed Dye Laser (PDL) - KayPENTAX/Cynosure

Stroboscopy

Swallowing

Acoustic Analysis

“Laryngeal function after supracricoid laryngectomy,” Saito, Koichiro, Koji Araki, Kaoru Ogawa, and Akihiro Shiotani, Otolaryngology-Head and Neck Surgery, Vol. 140, No. 4, pp. 487-492, April 2009.

Objective: The purpose of this study was to assess laryngeal function after supracricoid laryngectomy.

Study Design: Case series.

Subjects and Methods: Supracricoid laryngectomy (SCL) has been performed in our institution for 24 selected patients with laryngeal cancer since December 2000. Reconstruction was performed through cricohyoidoepiglottopexy for 23 patients and cricohyoidopexy for 1 patient. Seven patients had ipsilateral arytenoid removal, and 15 patients underwent SCL as salvage surgery. A retrospective chart review was performed to assess postoperative speech and swallowing function. Stroboscopy and/or fiberscopy of the neoglottis were used to assess postoperative speech kinetics. Acoustic parameters were measured to evaluate vocal function, and several questionnaires were used to evaluate post-operative quality of life (QOL).

Results: In the absence of postoperative complications, stoma closure and normal diet intake were achieved 1 month after surgery. The neoglottis comprises the arytenoid(s), epiglottis, and pyriform sinus mucosa. Several different combinations of vibrating regions were observed among patients during phonation. Although vocalization sounded rough and breathy, vocal communication was possible with little inconvenience.

Conclusion: Acceptable functional recovery and tolerable QOL can be achieved after SCL.

The KayPENTAX Multi-Dimensional Voice Program (MDVP), Model 5105, was used in this study to perform acoustic analyses of voice quality.

 

“Marsupialization of Vocal Fold Retention Cysts: Voice Assessment and Surgical Outcomes,” Hsu, Cheng-Ming, Gian Luca Armas, and Chih-Ying Su, Annals of Otology, Rhinology & Laryngology, Vol. 118 (4), pp. 270-275, April 2009.

Objectives: Although total excision remains the standard treatment for vocal fold retention cysts, postoperative deficits and damage to the vocal folds still occur. Marsupialization is a more conservative technique and can prevent these complications.

Methods: In this prospective clinical series, 25 patients underwent the marsupialization procedure. Under a direct laryngomicroscope, the cystic wall margin was retracted medially with microforceps. An incision was made with microscissors encircling the equator of the cyst. The cyst contents drained from the cystic cavity when the capsule was sectioned. For 7 patients with concomitant marked vocal fold atrophy, strap muscle transposition laryngoplasty was simultaneously performed.

Results: All patients had complete preoperative and postoperative voice parameter analyses. A subjective improvement in voice quality was reported by 23 of the 25 patients (92%). A small recurrent vocal fold cyst was detected in 1 patient. Small vocal fold deficits and sulcus vocalis were detected in 2 and 4 patients, respectively. Only 1 patient described a worse voice after operation. No other complications were noted.

Conclusions: Marsupialization of vocal fold retention cysts is a simple, relatively safe, and effective surgical treatment. Voice improvement, a low incidence of recurrence, and minimal vocal fold deficits demonstrate the validity of this technique. Marked preoperative vocal fold atrophy or postoperative glottal gap can be managed with medialization laryngoplasty.

In this study, the KayPENTAX RLS system was used to perform laryngostroboscopy, the KayPENTAX CSL, Model 4300B, to perform acoustic analysis, and the KayPENTAX Aerophone II, Model 6800, to measure aerodynamic parameters.

 

“Menstrual Cycle Influences on Voice and Speech in Adolescent Females,” Meurer, Elisea M., Vera Garcez, Helena von Eye Corleta, and Edison Capp, Journal of Voice, Vol. 23 No. 1, pp. 62-70, January 2009.

Summary: The objective of this study is to characterize voice intensity and stability of fundamental frequency, formants and diadochokinesis, vocal modulations, rhythms, and speed of speech in adolescents during follicular and luteal phases of the menstrual cycle. Twenty-three adolescent females who were nonusers of oral contraceptives participated in a cross-sectional study of menstrual cycle influences on voicing and speaking tasks. Acoustic analyses were performed during both phases of the menstrual cycle using the Kay Elemetrics Computer Speech Lab Software Package. Data were analyzed using Student’s paired sample t test. Phono-articulatory parameters were similar in both phases of the menstrual cycle (fundamental frequency: 192.6 ± 23.9 Hz; minimum formant: 891.7 ± 110.3 Hz; and maximum formant: 2471.5 ± 203.6 Hz). In diadochokinesis, they had a speed of 5.6 ± 0.6 seg/s and vocal intensity was 61.5 ± 2.6 dB. The mean values for the variations in voice modulations were as follows: anger (21.7 ± 8.7 Hz) < normal state (23.4 ± 12.4 Hz) < sadness (24.9 ± 10.5 Hz) < exclamatory sentence (29.3 ± 12.4 Hz) < interrogative sentence (33.1 ± 12.4 Hz) < happiness (33.3 ± 12.1 Hz). Combining both phases of the menstrual cycle, the speed of speech was 5.2 ± 0.6 seg/s in meaningful sentences and 1.9 ± 0.2 seg/s in meaningless sentences. In conclusion, the adolescents showed similar voice fundamental frequency and intensity, formants, speed of speech, and suprasegmental speech parameters. The results shown in this study may be used as standard of acoustic phono-articulatory for adolescents.

Researchers in this study used the KayPENTAX Computerized Speech Lab (CSL) in conjunction with the KayPENTAX Motor Speech Profile (MSP), Model 5141, to collect and analyze data.

 

“Role of Vortices in Voice Production: Normal versus Asymmetric Tension,” Khosla, Sid, Shanmugam Murugappan, Randal Paniello, Jun Ying, and Ephraim Gutmark, Laryngoscope, Vol. 119, No. 1, pp. 216-221, January 2009.

Objective: Decreasing the closing speed of the vocal folds can reduce loudness and energy in the higher frequency harmonics, resulting in reduced voice quality. Our aim was to study the correlation between higher frequencies and the intraglottal vorticity (which contributes to rapid closing by producing transient negative intraglottal pressures).

Methods: Using six excised canine larynges (three with symmetric and three with asymmetric, periodic vocal fold motion), intraglottal vorticity was calculated from 2D velocity fields measured using particle imaging velocimetry.

Results: There is a strong correlation between intraglottal vorticity and acoustic energy in the higher frequencies; in periodic asymmetric motion, the vorticity and higher frequencies are both reduced.

Conclusion: For unilateral vocal fold paralysis, these findings suggest one reason why periodic, asymmetric motion may produce an abnormal voice. Further study will help determine when and why reinnervation, as opposed to medialization, may result in better voice quality.

In this study, the KayPENTAX CSL was used in conjunction with the Real-Time EGG, Model 5138, to record acoustic, EGG, and trigger signals.

 

“Metallic Voice: Physiological and Acoustic Features,” Hanayama, Eliana Midori, Zuleica Antonia Camargo, Domingos Hiroshi Tsuji and Silvia Maria Rebelo Pinho, Journal of Voice, Vol. 23 No. 1, pp. 62-70, January 2009.

Summary: The metallic voice is usually confused with ring or nasality by singers and nontrained listeners, who are not used to perceptual vocal analysis. They believe a metallic voice results from a rise in fundamental frequency. A diagnostic error in this aspect may lead to lowering pitch, an incorrect procedure that could cause vocal overload and fatigue. The purpose of this article is to study the quality of metallic voice considering the correlation between information of the physiological and acoustic plans, based on a perceptive consensual assumption. Fiberscopic video pharyngolaryngoscopy was performed on 21 professional singers while speaking vowel [e] – in normal and metallic modes to observe muscular movements and structural changes of the velopharynx, pharynx, and larynx. Vocal samples captured simultaneously to the fiberscopic examination were acoustically analyzed. Frequency and amplitude of the first four formants (F1, F2, F3 and F4) were extracted by means of linear predictor coefficients (LPC) spectrum and were statistically analyzed.

Audio samples in this study were digitized, and the acoustic analysis was performed, using the KayPENTAX Computerized Speech Lab (CSL), Model 4300B.

 

“High-Speed Laryngeal Imaging Compared With Videostroboscopy in Healthy Patients,” Kendall, Katherine A., Archives of Otolaryngology-Head & Neck Surgery, Vol. 135, No. 3, pp. 274-281, March 2009.

Objectives: To describe normal vocal fold vibratory characteristics recorded with high-speed digital imaging (HSV) of the larynx.

Design: Prospective study of healthy subjects who volunteered to undergo laryngeal HSV and videostroboscopy. Image analysis was randomly assigned to 3 blinded raters.

Setting: Community-based clinic with a specialty in laryngology.

Participants: Fifty healthy subjects aged 21 to 65 years who were nonsmokers and who had no voice problems, laryngopharyngeal reflux, or reactive airway disease.

Main Outcome Measures: The following characteristics of vibration were described: glottal configuration, phase closure, vibratory symmetry, mucosal wave propagation, amplitude of vibration, and periodicity of vibration. Interrater and intrarater reliabilities were calculated for both imaging modalities.

Results: The range of findings for each measure is described. The comparison of videostroboscopy ratings with ratings from HSV studies did not reveal any significant difference between the 2 modalities for any of the measures except for the assessment of periodicity. Aperiodic vibratory characteristics were noted on 30% of the videostroboscopy studies (n = 15) and in only 4% of the HSV studies (n = 2) (P < .001). Although interrater and intrarater agreement were considered to be generally acceptable, a significant rater effect was identified.

Conclusion: This preliminary study describes a range of normal values for vocal fold vibratory characteristics as recorded with laryngeal HSV, providing a basis for comparison of studies in patients with voice problems.

In this study, the KayPENTAX Visi-Pitch IV, Model 3950, was used for acoustic analysis. The KayPENTAX RLS, Model 9100B, and HSV, Model 9700, were used to perform standard videostroboscopy and high-speed digital imaging of the larynx, respectively.

 

“Laryngeal and vocal evaluation in untreated growth hormone deficient adults,” Barreto, Valéria, M.P., Jeferson S. D’Ávila, Neuza J. Sales, Maria Inês R. Gonçalves, Juliane Dantas Seabra, Roberto Salvatori, and Manuel J. Aguiar-Oliveira, Otolaryngology-Head and Neck Surgery, Vol. 140, No. 1, pp. 37-42, January 2009.

Objective: To evaluate the consequences of lifetime, severe and untreated isolated growth hormone deficiency (IGHD) on vocal and laryngeal function.

Study Design: Cross-sectional.

Subjects and Methods: A total of 23 IGHD adult subjects and 22 controls were administered a questionnaire about vocal complaints and harmful voice habits, and underwent videolaryngostroboscopic examination, voice evaluation by perceptual-auditory analysis with GRBAS scale including grade of dysphonia, roughness, breathiness, asthenia and strain items, objective voice evaluation by maximum phonation time (MPT), and acoustic analysis.

Results: There was no difference in vocal complaints between IGHD subjects and controls. Vocal abuse and smoking were more frequent in IGHD subjects. IGHD subjects presented higher values for roughness, breathiness, and strain. Laryngopharyngeal reflux (LPR) signs and laryngeal constriction were more frequent in IGHD individuals. MPT was similar in the two groups. Fundamental frequency was higher in IGHD females and males. Harmonic-to-noise ratio was higher in IGHD in both genders and shimmer was lower in IGHD females.

Conclusion: IGHD subjects have higher prevalence of signs of LPR and laryngeal constriction, with high pitch in both genders, which suggests a prominent role of IGHD on these parameters.

The KayPENTAX Multi-Speech, Model 3700, was used to perform the acoustic analysis in this study.

 

“Does an Exercise Aimed at Improving Swallow Function Have an Effect on Vocal Function in the Healthy Elderly?” Easterling, Caryn, Dysphagia, Vol. 23, Number 3, pp. 317-326, September 2008.

Abstract: Age-related sarcopenia or muscle wasting contributes to changes in the ability to perform activities of daily living, changes in deglutition, and changes in vocal function. The Shaker Exercise, an isometric and isokinetic exercise, has been shown to strengthen suprahyoid muscles and increase deglutitive anteroposterior (AP) upper esophageal sphincter (UES) opening diameter. The aim of this study was to determine if this exercise has an effect on the age-related changes in vocal function and deglutition in healthy older adults. Eleven females and 10 males, aged 65-78 years (mean = 70 ± 4 years) and with a negative history for dysphagia and voice disorders, participated by exercising three times per day for 6 weeks. Five age-matched controls did not perform the exercise. Acoustic analysis of voice and biomechanical analysis of deglutition were performed before and after 6 weeks of exercise. Controls participated in voice analysis only. Dysphonia Severity Index (DSI), a multivariate voice index, was used to compare voice production initially and after 6 weeks. Deglutitive biomechanical measures increased and DSI scores improved in 10 of 21 participants following 6 weeks of exercise. DSI for controls did not change over the 6-week period. Ten of 21 exercise participants experienced improved deglutitive biomechanics and DSI scores. Accuracy of exercise performance, compliance, and/or disclosed alterations in health status may contribute to the lack of deglutitive and DSI change in the participants who did not experience change in function. A large randomized control study, including periodic monitoring of health status, exercise performance accuracy, and compliance is warranted to evaluate the effect of this exercise on deglutition as well as voice. The Shaker Exercise could be recommended as a preventative measure to diminish the effect of sarcopenia on the muscles used in deglutition and voice and alter the progression of the characteristic senescent voice and swallow changes.

The KayPENTAX Computerized Speech Lab (CSL), Model 4400, was used in this study to analyze the digitized vowel samples for habitual fundamental frequency, minimum intensity, and maximum pitch range.

 

“Evidence for Distinguishing Pressed, Normal, Resonant, and Breathy Voice Qualities by Laryngeal Resistance and Vocal Efficiency in Vocally Trained Subjects,” Grillo, Elizabeth U., and Katherine Verdolini, Journal of Voice, Vol. 22 No. 5, pp. 546-552, September 2008.

Summary: The purpose of this study was to determine if pressed, normal, resonant, and breathy voice qualities can be distinguished from one another by laryngeal resistance (LR; cm H2O/l/s) and/or vocal efficiency (VE; dB/cm H2O × l/s) in vocally trained subjects. The experimental design was a within-subjects repeated measures design. Independent variables were pressed, normal, resonant, and breathy voice qualities. Dependent variables were LR and VE. Participants were 13 women of age 18-45 years with established vocal expertise. After a brief training phase, subjects were asked to produce each of the voice qualities on the pitch A3 (220 Hz) at a constant, individually identified comfortable dB level (±1 dB), during a repeated consonant-vowel utterance of /pi pi pi pi pi/. Results indicated that LR but not VE reliably distinguished pressed, normal, and breathy voice. Neither of the measures, however, distinguished normal from resonant voice, which were distinguished perceptually. The results suggest that LR may provide a useful tool for studying the coordinative dynamics of pressed, normal, and breathy voice qualities.

In this study, a KayPENTAX Aerophone II was used to capture aerodynamic and acoustic data.
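
For readers unfamiliar with the two measures named in the abstract above, the following Python sketch shows one way they might be computed from averaged readings. It assumes that laryngeal resistance is pressure divided by airflow and that vocal efficiency is intensity divided by the product of pressure and airflow, as the units quoted in the abstract suggest. The function names and sample values are hypothetical and are not taken from the study or from the Aerophone II software.

```python
def laryngeal_resistance(pressure_cm_h2o: float, airflow_l_per_s: float) -> float:
    """Assumed definition: pressure / airflow, in cm H2O/(l/s)."""
    if airflow_l_per_s <= 0:
        raise ValueError("airflow must be positive")
    return pressure_cm_h2o / airflow_l_per_s


def vocal_efficiency(intensity_db: float, pressure_cm_h2o: float,
                     airflow_l_per_s: float) -> float:
    """Assumed definition: intensity / (pressure x airflow), in dB/(cm H2O x l/s)."""
    aerodynamic_product = pressure_cm_h2o * airflow_l_per_s
    if aerodynamic_product <= 0:
        raise ValueError("pressure x airflow must be positive")
    return intensity_db / aerodynamic_product


# Hypothetical readings for a single utterance (illustrative only).
lr = laryngeal_resistance(pressure_cm_h2o=8.0, airflow_l_per_s=0.2)
ve = vocal_efficiency(intensity_db=75.0, pressure_cm_h2o=8.0, airflow_l_per_s=0.2)
```

Under these assumed definitions, a pressed voice (higher pressure, lower airflow) would yield a larger LR value than a breathy voice, which is consistent with LR separating those qualities in the study.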
 

“Vocal Changes in Patients Using Nasal Continuous Positive Airway Pressure,” Hamdan, Abdul-Latif, Omar Sabra, Hani Rifai, Dollen Tabri, and Ahmad Hussari, Journal of Voice, Vol. 22 No. 5, pp. 603-606, September 2008.

Summary: The aim of this prospective study is to assess the vocal changes in patients using nasal continuous positive airway pressure (CPAP). A total of 18 subjects using nasal CPAP were assessed by grading their voice perceptually as G0 for normal voice and G3 for severe hoarseness. Acoustic analysis was also performed and the following parameters were measured: fundamental frequency, habitual pitch, shimmer, relative average perturbation, voice turbulence index, and noise-to-harmonic ratio. The same was done for a control group matched according to age and gender. There was a statistically significant difference in the perceptual evaluation between the CPAP group and controls, with more patients in the former group having moderate hoarseness. There was also an increase in the perturbation parameters and a decrease in the fundamental frequency and habitual pitch in the CPAP group compared to controls. The increase in shimmer was statistically significant. The usage of nasal CPAP seems to induce vocal changes that are perceived as mild to moderate hoarseness, together with an increase in the perturbation parameters. These seem to be secondary to the upper airway dryness reported in these patients. The hypothetical effect of nasal CPAP on the sol layer of the vocal folds is discussed. 

Researchers in this study used the KayPENTAX Visi-Pitch, Model 3300, to perform their acoustic analysis.

 

“Visualization of Speech Patterns for Language Learning,” Molholt, Garry and Fenfang Hwu (2008).  Chapter 5 of The Path of Speech Technologies in CALL: Foundations, Prototypes, and Evaluation, edited by V. Melissa Holland and F. Pete Fisher, New York: Routledge, 2008.

The Adapting Speech Technology section of this publication includes two chapters on how component speech technologies are selected and adapted for second language instruction—how they are explored, tested, and shaped for CALL (computer-assisted language learning) functions. The Molholt and Hwu chapter covers speech visualization technology for production diagnosis and instruction. The Delmonte chapter covers speech synthesis for an array of activities in an intelligent tutoring framework.

The researchers who wrote the chapter on visualization of speech patterns for language learning used the KayPENTAX Visi-Pitch for accent reduction.

 

“Phase Asymmetries in Normophonic Speakers: Visual Judgments and Objective Findings,” Bonilha, Heather Shaw and Dimitar D. Deliyski, American Journal of Speech-Language Pathology, Vol. 17, No. 4, pp. 367-376, November 2008.

Purpose: To ascertain the amount of phase asymmetry of the vocal fold vibration in normophonic speakers via visualization techniques and compare findings for habitual and pressed phonations. 

Method: Fifty-two normophonic speakers underwent stroboscopy and high-speed videoendoscopy (HSV). The HSV images were further processed into 4 visual displays: HSV playbacks, digital kymography (DKG) playbacks, mucosal wave kymography playbacks, and static kymographic images of the medial line from the DKG playback. Two types of phase asymmetries, left and right and anterior-posterior, were rated on a scale from 1 to 5. Objective measures of left-right phase asymmetry were obtained. 

Results: The majority of normophonic speakers (81%) were noted to display anterior-posterior asymmetry; however, 66% of those were characterized as mild. Seventy-nine percent of participants were noted to display left-right asymmetry; however, 72% of those were mild. A moderate relationship between the objective measures and subjective ratings was found.

Conclusions: Most normophonic speakers exhibit mild left-right and anterior-posterior asymmetries for both habitual and pressed phonations. Asymmetries were noted more often during habitual than pressed phonations, and when visualized by HSV and kymography than stroboscopy. Differences between objective measures and visual judgments support the need to quantify vocal fold vibratory features.

In this study, a KayPENTAX Digital Strobe, Model 9100B, coupled to a 70-degree rigid endoscope, Model 9106, was used along with a laryngeal contact microphone to track vocal fold vibratory frequency. Recordings from a KayPENTAX High-Speed Video System, Model 9700, were synchronized with acoustic recordings captured with the KayPENTAX CSL, Model 4400, to allow for perceptual judgments, acoustic measurements, and comparisons between physical and acoustic events.

 

“Intra-Oral Pressure-Based Voicing Control of Electrolaryngeal Speech with Intra-Oral Vibrator,” Takahashi, Hirokazu, Masayuki Nakao, Yataro Kikuchi, and Kimitaka Kaga, Journal of Voice, Vol. 22 No. 4, pp. 420-429, July 2008.

Summary: In normal speech, coordinated activities of intrinsic laryngeal muscles suspend a glottal sound at utterance of voiceless consonants, automatically realizing a voicing control. In electrolaryngeal speech, however, the lack of voicing control is one of the causes of unclear voice, voiceless consonants tending to be misheard as the corresponding voiced consonants. In the present work, we developed an intra-oral vibrator with an intra-oral pressure sensor that detected utterance of voiceless phonemes during the intra-oral electrolaryngeal speech, and demonstrated that an intra-oral pressure-based voicing control could improve the intelligibility of the speech. The test voices were obtained from one electrolaryngeal speaker and one normal speaker. We first investigated on the speech analysis software how a voice onset time (VOT) and first formant (F1) transition of the test consonant-vowel syllables contributed to voiceless/voiced contrasts, and developed an adequate voicing control strategy. We then compared the intelligibility of consonant-vowel syllables among the intra-oral electrolaryngeal speech with and without online voicing control. The increase of intra-oral pressure, typically with a peak ranging from 10 to 50 gf/cm2, could reliably identify utterance of voiceless consonants. The speech analysis and intelligibility test then demonstrated that a short VOT caused the misidentification of the voiced consonants due to a clear F1 transition. Finally, taking these results together, the online voicing control, which suspended the prosthetic tone while the intra-oral pressure exceeded 2.5 gf/cm2 and during the 35 milliseconds that followed, proved efficient to improve the voiceless/voiced contrast.

KayPENTAX’s Multi-Speech, Model 3700, was the speech analysis software used in this study.
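
The voicing control rule described in the abstract above (suspend the tone while intra-oral pressure exceeds 2.5 gf/cm2 and for the 35 milliseconds that follow) can be illustrated with a short Python sketch. This is only an illustration under assumptions: the function name, the fixed sampling period, and the sample pressure trace are hypothetical, and the authors' actual implementation is not described on this page.

```python
def voicing_gate(pressures, sample_period_ms, threshold=2.5, hold_ms=35.0):
    """Return one boolean per pressure sample: True where the tone is on.

    pressures        -- intra-oral pressure samples in gf/cm2 (hypothetical input)
    sample_period_ms -- time between samples in milliseconds (assumed constant)
    threshold        -- pressure above which the tone is suspended
    hold_ms          -- extra time the tone stays off after pressure falls
    """
    tone_on = []
    hold_remaining = 0.0  # ms of suspension still owed after the last high sample
    for pressure in pressures:
        if pressure > threshold:
            hold_remaining = hold_ms      # restart the hold window
            tone_on.append(False)
        elif hold_remaining > 0:
            hold_remaining -= sample_period_ms
            tone_on.append(False)         # still inside the hold window
        else:
            tone_on.append(True)
    return tone_on


# Hypothetical trace sampled every 10 ms: a short pressure burst, then silence.
trace = [0.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0]
gate = voicing_gate(trace, sample_period_ms=10.0)
```

With a 10 ms sampling period the hold window is rounded up to four samples, so in this sketch the tone resumes on the fifth sample after the pressure falls below the threshold.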

 

“Effect of cochlear implantation on nasality in children,” Nguyen, Lily H.P., Jennifer Allegro, Aaron Low, Blake Papsin, and Paolo Campisi, Ear, Nose & Throat Journal, Vol. 87 No. 3, pp. 138-143, March 2008.

Abstract: Hypernasality is a commonly perceived characteristic of speech in deaf adults and children, but the mechanism of this abnormal nasal resonance is poorly understood. The impact of cochlear implantation on nasalance measures in children with severe auditory deprivation has not been previously reported. We conducted a study of nasality in 6 deaf children who had undergone cochlear implantation. Voice recordings were obtained before surgery and 6 months after activation of the implants. The MacKay-Kummer SNAP Test—which consists of a syllable-repetition subtest and a picture-cued subtest—was used to obtain nasalance scores for oral (bilabial, alveolar, velar, and sibilant) and nasal phonemes. Before cochlear implantation, mean nasalance scores were significantly higher than normal during the production of oral phonemes for both subtests (p ≤ 0.05). Six months after activation, the nasalance measures for all components of the syllable-repetition subtest had been restored to within 1 standard deviation of normal. For all oral phonemes of the picture-cued sub-test, the elevated nasalance scores were consistently lower after cochlear implant activation, although the difference was statistically significant only for velar tasks. Nasalance scores for nasal phonemes were within 1 standard deviation of normal both before and after implant activation. Our study showed that cochlear implantation partially corrects elevated nasalance measures. Disturbances in nasal resonance may be caused in part by the inability of deaf speakers to monitor velopharyngeal valving with auditory feedback. The trend toward improved nasalance scores after implantation highlights the role of auditory feedback in monitoring velopharyngeal function. Visual biofeedback may be required to further normalize hypernasal speech in profoundly deaf children.

The KayPENTAX Nasometer II, Model 6400, was used to obtain nasalance scores in this study.
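
As background for the nasalance scores discussed above, nasalance is commonly defined as nasal acoustic energy expressed as a percentage of combined nasal and oral acoustic energy. The Python sketch below illustrates that ratio only; the function name and the sample energy values are hypothetical, and it does not reproduce the Nasometer II's signal processing.

```python
def nasalance_percent(nasal_energy: float, oral_energy: float) -> float:
    """Return nasal / (nasal + oral) as a percentage."""
    total = nasal_energy + oral_energy
    if total <= 0:
        raise ValueError("total acoustic energy must be positive")
    return 100.0 * nasal_energy / total


# Hypothetical energies: the nasal channel carries one quarter of the total.
score = nasalance_percent(nasal_energy=1.0, oral_energy=3.0)
```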

 

“Pitch Discrimination and Pitch Matching Abilities with Vocal and Nonvocal Stimuli,” Moore, Robert E., Julie Estis, Susan Gordy-Hickey, and Christopher Watts, Journal of Voice, Vol. 22 No. 4, pp. 399-407, July 2008.

Summary: Various stimulus types have been investigated in pitch discrimination and pitch matching tasks. However, previous studies have not explored the use of recorded samples of an individual’s own voice in performing these two tasks. The purpose of this study was to investigate pitch discrimination and pitch matching abilities using three stimuli conditions (participant’s own voice, a neutral female voice, and nonvocal complex tones) to determine if pitch discrimination and/or pitch matching abilities are influenced by the type of stimuli presented. Results of the pitch discrimination tasks yielded no significant difference in discrimination ability for the three stimuli. For the pitch matching tasks, a significant difference was found for the participants’ voice versus tonal stimuli. There was no significant difference in pitch matching ability between the neutral female voice and the tonal stimuli. There was no significant correlation between pitch discrimination and pitch matching abilities for any of the three stimuli types. These results suggest that it is easier to match the pitch of one’s own voice than to match the pitch of a neutral female voice and nonvocal complex tones, although no difference was found for pitch discrimination abilities. One possible implication of this study is that differences in matching the pitch of one’s own voice compared to matching other stimuli types may help to differentiate the source of singing inaccuracy (motor vs. discrimination skills). 

The acoustic analysis in this study was conducted using the KayPENTAX Computerized Speech Lab (CSL) in conjunction with the Multi-Dimensional Voice Program (MDVP).

 

“Baseline laryngeal effects among individuals with dust mite allergy,” Krouse, John H., James P. Dworkin, Michael A. Carron, and Robert J. Stachler, Otolaryngology-Head and Neck Surgery, Vol. 139, No. 1, pp. 149-151, July 2008.

Objective: To examine baseline effects of perennial allergy on laryngeal appearance, laryngeal function, and perceived vocal handicap among individuals without current allergy or voice symptoms.

Data Sources: This pilot study included 47 adults: 21 with positive and 26 with negative skin test responses for the dust mite, Dermatophagoides pteronyssinus.

Methods: Subjects were tested for sensitivity to dust mite antigen by prick testing. Laryngeal appearance and function were studied with laryngovideostroboscopy, acoustic and speech aerodynamic analysis, and voice sampling. These parameters were blindly analyzed by three trained examiners. Subjects also completed the Voice Handicap Index (VHI) as a measure of vocal handicap.

Results: Subjects allergic to dust mites perceived significantly greater vocal handicap on the VHI than did nonallergic subjects. No significant differences were noted between groups in laryngeal appearance or function.

Conclusion: These pilot data suggest that, at baseline, allergic individuals perceived greater vocal handicap than their nonallergic counterparts (P = 0.04), even in the absence of current allergy symptoms or observable physical or functional abnormalities. These preliminary observations can serve as an impetus for further research into this important area, including the potential interrelationship between acid reflux disease and allergic laryngeal inflammation.

In this study, laryngeal anatomy and physiology were analyzed using the KayPENTAX Digital Strobe system with a 70-degree rigid endoscope. Acoustic and speech aerodynamic parameters and subglottal pressure were assessed using the KayPENTAX Computerized Speech Lab (CSL) and the KayPENTAX Aerophone.

 

“Voice of Postradiotherapy Nasopharyngeal Carcinoma Patients: Evidence of Vocal Tract Effect,” Lin, Emily, Tzer-Zen Hwang, Jeremy Hornibrook, and Tika Ormond, Journal of Voice, Vol. 22 No. 3, pp. 351-364, May 2008.

Summary: This study was aimed at identifying acoustic and physiological measures useful for monitoring voice changes in postnasopharyngeal patients with nonlaryngeal malignancies, and providing evidences of vocal tract effect on voice through comparisons between individuals with and without intact vocal tract. Simultaneous acoustic-electroglottographic signals recorded during phonation of vowels /i/ and /a/ sustained at habitual, high, and low pitch levels were compared among 10 postradiotherapy patients with nasopharyngeal carcinoma (NPC), 10 voice patients (VPs) with intact vocal tract, and 10 healthy individuals with normal voice (NORM). Results from a series of discriminant analyses revealed that the NPC group generally exhibited lower signal-to-noise (SNR) and open quotient (OQ) and higher Formant 1 frequency (F1) and speed quotient (SQ) than the NORM group. Unlike both VP and NORM groups, the NPC group failed to show a pitch effect on all voice measures, including OQ, SQ, percent jitter, percent shimmer, and SNR, suggesting an effect of radiotherapy and/or vocal tract on laryngeal behaviors. For the vowel /i/, on the other hand, only the NPC and NORM groups showed a pattern of pitch-dependent F1 raising, a reflection of increased pharyngeal narrowing. These findings suggested that the pitch effect on laryngeal behaviors differed not only between individuals with intact vocal tract and those without but also between those with structural and dynamic changes of vocal tract.

The KayPENTAX Electroglottograph (EGG), Model 6103, was used to acquire three EGG measures (i.e., Fo, OQ, and SQ) in this study.
 

“Effects of Bilateral Subthalamic Nucleus Stimulation and Medication on Parkinsonian Speech Impairment,” D’Alatri, Lucia D., Gaetano Paludetti, M. Fiorella Contarino, Stefania Galla, Maria Raffaella Marchese, and Anna Rita Bentivoglio, Journal of Voice, Vol. 22 No. 3, pp. 365-372, May 2008.

Summary: This study aimed to assess quantitatively the effect of bilateral subthalamic nucleus (STN) stimulation and medication on hypokinetic parkinsonian dysarthria. Twelve Italian patients (11 males and 1 female) with idiopathic Parkinson’s disease (mean age 60.29 ± 7.50 years) and bilateral STN implantation were studied. Neurological assessments and acoustic recordings were performed in four clinical conditions combining stimulation and medication to assess the degree of motor disabilities and speech impairment. Acoustic analysis was performed by means of the Multidimensional Voice Program and the Advanced Motor Speech Profile (Kay Elemetrics, Lincoln Park, NJ). None of the evaluated parameters deteriorated after STN deep brain stimulation. STN stimulation significantly improved motor performances and vocal tremor and provided a major stability to glottal vibration. Effect of stimulation on these parameters was superior to that of levodopa. No significant variations were observed in perceptual evaluation and in acoustic parameters related to prosody, articulation, and intensity after either stimulation or medication. The improvement of acoustic parameters related to glottal vibration and voice tremor was not accompanied by a substantial effect on speech intelligibility. STN stimulation was more effective on global motor limb dysfunctions than on dysarthria, but we did not report negative consequences on speech.

Acoustic analysis, in this study, was performed using the KayPENTAX Computerized Speech Lab (CSL), Model 4300B, in conjunction with the KayPENTAX MDVP, Model 5105, and Advanced Motor Speech Profile, Model 5141.

 

“Acoustic Changes in Chinese Patients With Cancer-Related Unilateral Vocal Fold Paralysis After Medialization Thyroplasty,” Ng, Manwa L., Ripley K. Wong, William I. Wei, Y.H. Wong, and Paul K. Y. Lam, Contemporary Issues in Communication Science and Disorders, Vol. 35, pp. 17-24, Spring 2008.

Abstract. The present study investigated the change in the voice quality of patients with unilateral vocal fold paralysis (UVFP) of benign and malignant causes after medialization thyroplasty. Thirty-four native Cantonese adults who had been diagnosed with UVFP participated in the study. Acoustical parameters including the average voice fundamental frequency, percent jitter, relative average perturbation (RAP), percent shimmer, and noise-to-harmonic ratio (NHR) were measured from the sustained vowel /a/ that was recorded before and after the thyroplasty procedure. Maximum phonation time (MPT) was also obtained.

Results indicated that, for both benign and malignant patients, all acoustical parameters except for NHR showed improvement after thyroplasty: Percent jitter, RAP, and percent shimmer values were significantly reduced, and MPT was significantly lengthened. Our findings support the notion that medialization thyroplasty is a useful palliative procedure to improve voice production in Cantonese-speaking UVFP patients. Despite the cancerous condition, it is still beneficial to malignant UVFP patients, and better voice quality can be achieved.

In this study, acoustic analysis was performed using the KayPENTAX Multi-Dimensional Voice Program (MDVP), Model 5105.
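
As a rough illustration of the perturbation measures named in the abstract above (percent jitter, RAP, and percent shimmer), the following Python sketch applies the common textbook definitions to a list of cycle periods and a list of cycle peak amplitudes. It is an assumption-laden sketch, not the MDVP implementation: the function names are hypothetical, and pitch-period extraction from the recorded signal is assumed to have been done already.

```python
from statistics import mean


def percent_jitter(periods):
    """Mean absolute difference between consecutive periods,
    expressed as a percentage of the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return 100.0 * mean(diffs) / mean(periods)


def relative_average_perturbation(periods):
    """Mean absolute deviation of each period from the average of
    itself and its two neighbours, as a percentage of the mean period."""
    deviations = [
        abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3.0)
        for i in range(1, len(periods) - 1)
    ]
    return 100.0 * mean(deviations) / mean(periods)


def percent_shimmer(amplitudes):
    """Mean absolute difference between consecutive peak amplitudes,
    expressed as a percentage of the mean amplitude."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return 100.0 * mean(diffs) / mean(amplitudes)


# Illustrative values only (periods in ms); not data from the study.
example_periods = [5.0, 5.1, 4.9, 5.0, 5.05]
example_amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]
print(percent_jitter(example_periods))
print(relative_average_perturbation(example_periods))
print(percent_shimmer(example_amplitudes))
```

Lower values on all three measures indicate a more regular cycle-to-cycle signal, which is the direction of change the abstract reports after thyroplasty.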

 

“Respiratory and Laryngeal Function During Spontaneous Speaking in Teachers With Voice Disorders,” Lowell, Soren Y., Julie M. Barkmeier-Kraemer, Jeanette D. Hoit, and Brad H. Story, Journal of Speech, Language, and Hearing Research (JSLHR), Vol. 51 No. 2, pp. 333-349, April 2008.

Purpose: To determine if respiratory and laryngeal function during spontaneous speaking were different for teachers with voice disorders compared with teachers without voice problems.

Method: Eighteen teachers, 9 with and 9 without voice disorders, were included in this study. Respiratory function was measured with magnetometry, and laryngeal function was measured with electroglottography during 3 spontaneous speaking tasks: a simulated teaching task at a typical loudness level, a simulated teaching task at an increased loudness level, and a conversational speaking task. Electroglottography measures were also obtained for 3 structured speaking tasks: a paragraph reading task, a sustained vowel, and a maximum phonation time vowel.

Results: Teachers with voice disorders started and ended their breath groups at significantly smaller lung volumes than teachers without voice problems during teaching-related speaking tasks; however, there were no between-group differences in laryngeal measures. Task-related differences were found on several respiratory measures and on one laryngeal measure.

Conclusions: These findings suggest that teachers with voice disorders used different speech breathing strategies than teachers without voice problems. Implications for clinical management of teachers with voice disorders are discussed.

The KayPENTAX Electroglottograph (EGG) was used to assess laryngeal adduction characteristics during continuous speaking in this study.

  

“Fundamental Frequency Change During Offset and Onset of Voicing in Individuals with Parkinson Disease,” Goberman, Alexander M., and Michael Blomgren, Journal of Voice, Vol. 22 No. 2, pp. 178-191, March 2008.

Summary: After years of treatment with the medication levodopa, most individuals with Parkinson disease (PD) experience fluctuations in response to their medications. Although relatively consistent perceptual voice improvements have been documented to correspond with these fluctuations, consistent quantitative data to support this finding are lacking. This mismatch may have occurred because most of this phonation research has centered on long-term phonatory measures (i.e., across speaking samples and prolonged vowel tasks). The current study examined short-term phonatory behavior in individuals with PD, specifically examining fundamental frequency (F0) at the offset and onset of phonation, before and after a voiceless consonant. The F0 analysis at phonatory offset supported the conclusion that individuals with PD have difficulty with the rapid offset of voicing, and that they are stopping vocal fold vibration primarily through vocal fold abduction (without adding tension). The F0 analysis at phonatory onset revealed that all groups use some laryngeal tension at the initiation of voicing. The tension was lowest for the PD participants who were in their OFF medication state, and it was highest for the age-matched control participants and the PD participants in their ON medication states.

The speech samples in this study were digitized and analyzed using a KayPENTAX Computerized Speech Lab (CSL), Model 4400.

 

“The Relationship Between Perceptual Evaluation and Objective Multiparametric Evaluation of Dysphonia Severity,” Hakkesteegt, Marieke M., Michael P. Brocaar, Marjan H. Wieringa, and Louw Feenstra, Journal of Voice, Vol. 22 No. 2, pp. 138-145, March 2008.

Summary: The purpose of this study was to investigate the usefulness of the Dysphonia Severity Index (DSI) as an objective multiparametric measurement in assessing dysphonia. The DSI was compared with the score on Grade of the GRBAS scale. Investigated was also whether the DSI is related to severity of dysphonia, which was represented by different diagnosis groups. Furthermore, it was investigated whether the DSI can differentiate between a group of patients and a control group. A total of 294 patients with different voice pathologies were included. A control group consisted of 118 volunteers without any voice complaints. The voices of all participants were perceptually evaluated on Grade, and the DSI was measured. The groups of patients with voice complaints have a lower DSI and higher scores on Grade than the control group. The DSI discriminates between patients with nonorganic voice disorders, vocal fold mass lesions, and vocal fold paresis/paralysis. To determine whether the DSI discriminates between patients and controls, the sensitivity and specificity for different DSI cutoff points were calculated. With a DSI cutoff of 3.0, maximum sensitivity (0.72) and specificity (0.75) were found. We conclude that the DSI is a useful instrument to objectively measure the severity of dysphonia.

In this study, the KayPENTAX Multi-Speech, Model 3700, was used for acoustic analysis of sound files.
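
The cutoff analysis described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' procedure: the function name is hypothetical, the scores are made up, and it assumes that a score below the cutoff counts as a positive (dysphonic) result, in line with the abstract's statement that patients have a lower DSI than controls.

```python
def sensitivity_specificity(patient_scores, control_scores, cutoff):
    """Return (sensitivity, specificity) for a 'score < cutoff means positive' rule.

    Assumption: lower scores indicate more severe dysphonia.
    """
    true_positives = sum(1 for score in patient_scores if score < cutoff)
    true_negatives = sum(1 for score in control_scores if score >= cutoff)
    sensitivity = true_positives / len(patient_scores)
    specificity = true_negatives / len(control_scores)
    return sensitivity, specificity


# Illustrative scores only; these are not data from the study.
patients = [1.0, 2.5, 3.5, 4.0]
controls = [2.0, 3.0, 4.5, 5.0]
print(sensitivity_specificity(patients, controls, cutoff=3.0))
```

Repeating the call over a range of candidate cutoffs shows the usual trade-off: raising the cutoff increases sensitivity at the cost of specificity.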

 

“Pitch Deviation Analysis of Pathological Voice in Connected Speech,” Laflen, Brandon J., Cathy L. Lazarus, and Milan R. Amin, Annals of Otology, Rhinology & Laryngology, Vol. 117 No. 2, pp. 90-97, February 2008.

Objectives: This study compares normal and pathologic voices using a novel voice analysis algorithm that examines pitch deviation during connected speech. The study evaluates the clinical potential of the algorithm as a mechanism to distinguish between normal and pathologic voices using connected speech.

Methods: Adult vocalizations from normal subjects and patients with known benign free-edge vocal fold lesions were analyzed. Recordings had been previously obtained in quiet under controlled conditions. Two phrases and sustained /a/ were recorded per subject. The subject populations consisted of 10 normal and 31 abnormal subjects. The voice analysis algorithm generated 2-dimensional patterns that represent pitch deviation in time and under variable window widths. Measures were collected from these patterns for window widths between 10 and 250 ms. For comparison, jitter and shimmer measures were collected from sustained /a/ by means of the Computerized Speech Lab (CSL). A t-test and tests of sensitivity and specificity assessed discrimination between normal and abnormal populations.

Results: More than 58% of the measures collected from connected speech outperformed the CSL jitter and shimmer measures in population discrimination. Twenty-five percent of the experimental measures (including /a/) indicated significantly different populations (p < .01%).

Conclusions: The results demonstrate that the algorithm distinguishes between normal and abnormal populations by use of samples of connected speech.

The KayPENTAX Computerized Speech Lab (CSL) was used to obtain the jitter and shimmer measures analyzed in this study.

 

“The Effects of Frequency Range, Vowel, Dynamic Loudness Level, and Gender on Nasalance in Amateur and Classically Trained Singers,” Jennings, Jori Johnson, and David P. Kuehn, Journal of Voice, Vol. 22 No. 1, pp. 75-89, January 2008.

Summary: This study addresses two questions: (1) How much nasality is present in classical Western singing? (2) What are the effects of frequency range, vowel, dynamic level, and gender on nasality in amateur and classically trained singers? The Nasometer II 6400 by KayPENTAX (Lincoln Park, NJ) was used to obtain nasalance values from 21 amateur singers and 25 classically trained singers while singing an ascending five-tone scalar passage in low, mid, and high frequency ranges. Each subject sang the scalar passage at both piano and mezzo-forte dynamic loudness levels on each of the five cardinal vowels (/a/, /e/, /i/, /o/, /u/). A repeated mixed-model analysis indicated a significant main effect for the amateur/classically trained distinction, dynamic loudness level, and vowel, but not for frequency range or gender. The amateur singers had significantly higher nasalance scores than classically trained singers in all ranges and on all vowels except /o/. Dynamic loudness level had a significant effect on nasalance for all subject groups except for female majors in the mid- and high-frequency ranges. The vowel, /i/, received significantly higher nasalance than all of the other vowels. Although results of this study show that dynamic loudness level, vowel, and level of training in classical singing have a significant effect on nasality, nasalance scores for most subjects were relatively low. Only six of the subjects, all of whom were amateur singers, had average nasalance scores that could be considered hypernasal (i.e., a nasalance average of 22 or above).

The KayPENTAX Nasometer II, Model 6400, was used in this study to obtain nasalance values from each subject.
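
For readers unfamiliar with the measure, nasalance is commonly defined as nasal acoustic energy divided by the sum of nasal and oral acoustic energy, expressed as a percentage. The Python sketch below illustrates that definition together with the threshold of 22 mentioned in the abstract. It is a hedged illustration and not the Nasometer's internal processing: the function names and the per-frame energy inputs are assumptions.

```python
def nasalance_percent(nasal_energy, oral_energy):
    """Nasalance for one analysis frame: nasal / (nasal + oral) * 100."""
    total = nasal_energy + oral_energy
    if total <= 0:
        raise ValueError("combined nasal and oral energy must be positive")
    return 100.0 * nasal_energy / total


def mean_nasalance(frames):
    """Average nasalance over (nasal_energy, oral_energy) frame pairs."""
    scores = [nasalance_percent(nasal, oral) for nasal, oral in frames]
    return sum(scores) / len(scores)


def exceeds_threshold(average_nasalance, threshold=22.0):
    """True if the average reaches the threshold cited in the abstract (22)."""
    return average_nasalance >= threshold


# Illustrative frame energies only; not data from the study.
frames = [(1.0, 3.0), (1.0, 9.0)]
average = mean_nasalance(frames)
print(average, exceeds_threshold(average))
```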

 

“Vocal Improvement After Voice Therapy in Unilateral Vocal Fold Paralysis,” Schindler, Antonio, Alessandro Bottero, Pasquale Capaccio, Daniela Ginocchio, Fulvio Adorni, and Francesco Ottaviani, Journal of Voice, Vol. 22 No. 1, pp. 113-118, January 2008.

Summary: Unilateral vocal fold paralysis (UVFP) is associated with changes in acoustic and aerodynamic voice measurements and can have a significant impact on a patient’s quality of life. Few objective data regarding the efficacy of voice therapy for UVFP exist. The aim of this study was to retrospectively analyze voice modifications in a group of patients with UVFP before and after voice therapy. Forty patients with UVFP of different etiology were included in the study. Each subject had voice therapy with an experienced speech/language pathologist twice a week; the mean number of sessions was 12.6. A multidimensional assessment protocol was used; it included videoendoscopy, the maximum phonation time (MPT), the GIRBAS scale, spectrograms and a perturbation analysis, and the Voice Handicap Index (VHI). Pre- and posttreatment data were compared by means of the Wilcoxon and Student’s t tests. A complete glottal closure was seen in 8 patients before voice therapy and in 14 afterward. Mean MPT increased significantly. In the perceptual assessment, the difference was significant for five out of six parameters. A significant improvement was found on spectrographic analysis; as for perturbation analysis, the differences in jitter, shimmer, and noise-to-harmonic ratio values were significant. VHI values showed a clear and significant improvement. A significant improvement of voice quality and quality of life after voice therapy is an often reached and reasonable goal in patients with UVFP.

In this study, the KayPENTAX Multi-Dimensional Voice Program (MDVP), Model 5105, was used in conjunction with the CSL, Model 4300, to perform objective voice evaluation.

 

“Acoustic Analyses of Sustained and Running Voices From Patients With Laryngeal Pathologies,” Zhang, Yu, and Jack J. Jiang, Journal of Voice, Vol. 22 No. 1, pp. 1-9, January 2008.

Summary: In this paper, we investigated the acoustic characteristics of sustained and running vowels from normal subjects and patients with laryngeal pathologies. Perturbation methods (including jitter and shimmer), signal-to-noise ratio (SNR), and nonlinear dynamic methods (such as correlation dimension and second-order entropy) were used to analyze sustained and running vowels. We found that the sustained vowels and running voices from normal subjects and patients with laryngeal pathologies had low-dimensional dynamic characteristics. For sustained vowels, the analyses of jitter, shimmer, correlation dimension, and second-order entropy revealed significant differences between normal and pathological voices. For running voices, jitter and shimmer did not statistically discriminate between normal and pathological voices, but a significant difference was found for SNR, correlation dimension, and second-order entropy. The results suggest that nonlinear dynamic analysis and traditional SNR analysis may be valuable for the analysis of sustained and running vowels; perturbation analysis may be applicable for the analysis of sustained vowels but should be applied with caution for running voice analysis.

The voice samples used in this study were selected from the KayPENTAX Disordered Voice Database, Model 4337.

 

“An Exploration of Skin Acceleration Level as a Measure of Phonatory Function in Singing,” Lamarche, Anick, and Sten Ternström, Journal of Voice, Vol. 22 No. 1, pp. 10-22, January 2008.

Summary: Two kinds of fluctuations are observed in phonetogram recordings of singing. Sound pressure level (SPL) can vary due to vibrato and also due to the effect of open and closed vowels. Since vowel variation is mostly a consequence of vocal tract modification and is not directly related to phonatory function, it could be helpful to suppress such variation when studying phonation. Skin acceleration level (SAL), measured at the jugular notch and on the sternum, might be less influenced by effects of the vocal tract. It is explored in this study as an alternative measure to SPL. Five female singers sang vowel series on selected pitches and in different tasks. Recorded data were used to investigate two null hypotheses: (1) SPL and SAL are equally influenced by vowel variation and (2) SPL and SAL are equally correlated to subglottal pressure (PS). Interestingly, the vowel variation effect was small in both SPL and SAL. Furthermore, in comparison to SPL, SAL correlated weakly to PS. SAL exhibited practically no dependence on fundamental frequency, rather, its major determinant was the musical dynamic. This results in a non-sloping, square-like phonetogram contour. These outcomes show that SAL potentially can facilitate phonetographic analysis of the singing voice.

In this study, both the microphone/sound level meter and the pressure transducer were connected to the KayPENTAX Computerized Speech Lab (CSL), Model 4500, which acquires up to 4 channels of data.

 

“Mucosal Wave: A Normophonic Study Across Visualization Techniques,” Shaw, Heather S., and Dimitar D. Deliyski, Journal of Voice, Vol. 22 No. 1, pp. 23-33, January 2008.

Summary: Visualization of vocal fold vibration is essential for accurate diagnoses and optimal treatment of persons with voice disorders. Recently, scientific and anecdotal reports have evidenced an increased amount of variation in the diagnostically relevant features of extent and symmetry of mucosal wave magnitude in normophonic speakers. The objectives of this study were to preliminarily ascertain the variation in mucosal wave magnitude and symmetry for normophonic speakers as assessed via standard and novel techniques, and compare findings across modal and pressed phonations. A correlational design with a multiple baseline across visualization methods approach was used. Mucosal wave presence, magnitude, and symmetry from 52 normophonic speakers were judged via stroboscopy, high-speed videoendoscopy (HSV) playback, mucosal wave playback, and mucosal wave kymography playback. Results demonstrate a prevalence of atypical magnitude and symmetry of mucosal wave during modal and pressed phonations by normophonic persons, differences across techniques, and a relationship between judgments and habitual fundamental frequency. Given the prevalence of mucosal wave magnitude and symmetry variations in the normophonic population, overdiagnosis may be possible without caution. The various visualization techniques provided unique information suggesting that it may be beneficial to use both full view and kymographic visualization techniques in combination. A major restriction of the current commercial HSV systems is the frame rate, typically limited to 2000 frames per second, which appears insufficient for most female habitual phonations.

In this study, the KayPENTAX Digital Strobe, Model 9100B, coupled to a KayPENTAX 70-degree rigid endoscope, Model 9106, was used along with a laryngeal contact microphone to track vocal fold vibratory frequency. Also used in the study were the KayPENTAX High-Speed Video (HSV) system, Model 9700, and xenon light source, Model 7152. The KayPENTAX Multi-Dimensional Voice Program (MDVP), Model 5105, coupled with a condenser head-mount microphone, was used to record the acoustic signal synchronized with the HSV recording.

 

“Influence of Speaker Gender on Listener Judgments of Tracheoesophageal Speech,” Eadie, Tanya L., Philip C. Doyle, Kerry Hansen, and Paul G. Beaudin, Journal of Voice, Vol. 22 No. 1, pp. 43-57, January 2008.

Summary: The objectives of this prospective and exploratory study are to determine: (1) naïve listener preference for gender in tracheoesophageal (TE) speech when speech severity is controlled; (2) the accuracy of identifying TE speaker gender; (3) the effects of gender identification on judgments of speech acceptability (ACC) and naturalness (NAT); and (4) the acoustic basis of ACC and NAT judgments. Six male and six female adult TE speakers were matched for speech severity. Twenty naïve listeners made auditory-perceptual judgments of speech samples in three listening sessions. First, listeners performed preference judgments using a paired comparison paradigm. Second, listeners made judgments of speaker gender, speech ACC, and NAT using rating scales. Last, listeners made ACC and NAT judgments when speaker gender was provided coincidentally. Duration, frequency, and spectral measures were performed. No significant differences were found for preference of male or female speakers. All male speakers were accurately identified, but only two of six female speakers were accurately identified. Significant interactions were found between gender and listening condition (gender known) for NAT and ACC judgments. Males were judged more natural when gender was known; female speakers were judged less natural and less acceptable when gender was known. Regression analyses revealed that judgments of female speakers were best predicted with duration measures when gender was unknown, but with spectral measures when gender was known; judgments of males was best predicted with spectral measures. Naïve listeners have difficulty identifying the gender of female TE speakers. Listeners show no preference for speaker gender, but when gender is known, female speakers are least acceptable and natural. The nature of the perceptual task may affect the acoustic basis of listener judgments.

The KayPENTAX Computerized Speech Lab (CSL), Model 4500, was used to perform the acoustic analyses in this study.

 

“Objective and Subjective Evaluation of Voice Quality in Multiple Sclerosis,” Dogan, Müzeyyen, Ipek Midi, Mine Almaz Yazici, Ismail Kocak, Dilek Günal, and Mehmet Ali Sehitoglu, Journal of Voice, Vol. 21 No. 6, pp. 735-740, November 2007.

Summary: The aim of this comparative, controlled, cross-sectional study is to evaluate the voice quality in patients with multiple sclerosis (MS) by subjective and objective methods. Female patients with MS (n = 27) and age- and sex-matched healthy controls (n = 27) were included in this study. Vocal functions were evaluated by a multidimensional set composed of videolaryngostroboscopic examination, acoustic analysis, and subjective measurements (GRBAS and “Voice Handicap Index”). Jitter percent, shimmer percent, and soft phonation index (SPI) values were higher in MS patients compared to controls (Jitt, P = 0.001; Shim, P = 0.033; SPI, P < 0.0001). Maximum phonation time was significantly shorter for MS patients compared to controls (P < 0.0001). Stroboscopic examination revealed that 16 out of 27 MS patients have a “posterior chink” as glottic closure pattern with higher SPI values (40%). Noise to harmonic ratio (NHR) and mean fundamental frequency (F0) values were similar for MS and control groups (NHR, P = 0.737; F0, P = 0.976). In this study, most of the MS patients had dysphonia due to weakness of voice. MS tends to worsen acoustic parameters including fundamental frequency, SPI, and jitter values. These results are consistent with the more asthenic voice quality observed in MS group.

The KayPENTAX Multi-Dimensional Voice Program (MDVP), Model 5105, was used in conjunction with the KayPENTAX Multi-Speech, Model 3700, for the acoustic analysis of voice samples in this study.

 

“Short-Term Effects of Endotracheal Intubation on Voice,” Hamdan, Abdul-Latif, Abla Sibai, Charbel Rameh, and Ghassan Kanazeh, Journal of Voice, Vol. 21 No. 6, pp. 762-768, November 2007.

Summary: The objective of this study was to examine the vocal symptoms and acoustic changes perceived in the short period after endotracheal intubation, and to find the association between these changes and the endotracheal tube parameters. A total of 35 subjects were included. They were examined preoperatively, and 2 and 24 hours postoperatively. The vocal symptoms of hoarseness, vocal fatigue, loss of voice, throat clearing, globus pharyngeus, throat pain, and the acoustic variables mainly average fundamental frequency, relative average perturbation, shimmer, noise to harmony ratio, voice turbulence index, habitual pitch, and maximum phonation time (MPT) were assessed as such and in relation to the following endotracheal tube parameters: duration of anesthesia, number of intubation attempts, size of the tube, cuff volume, cuff mean pressure, and the emergence. The association between anesthesia parameters with incidence of vocal complaints and changes in acoustic parameters were examined using logistic and linear regression. Vocal fatigue was associated significantly with the increase in cuff volume and the number of intubation attempts. Throat clearing was associated significantly with the increase in cuff mean pressure. Only the increase in habitual pitch was associated significantly with the increase in cuff volume. The acute short-term effect of endotracheal intubation on voice is significant. The most important endotracheal tube parameters that affect the vocal changes are the cuff mean pressure and volume. The laryngeal contribution to these vocal changes seems to be minimal. All vocal symptoms increased significantly except for globus pharyngeus at 2 hours postoperatively. The acoustic parameters did not change significantly except for a decrease in MPT. At 24 hours postoperatively, all vocal symptoms subsided with no significant difference to baseline value. The habitual pitch increased significantly, and the rest of the parameters remained comparable to baseline value.

The KayPENTAX Visi-Pitch, Model 3300, was used to perform the acoustic analysis in this study.

 

“Effect of Deep Brain Stimulation on Different Speech Subsystems in Patients with Multiple Sclerosis,” Putzer, Manfred, William John Barry, and Jean Richard Moringlane, Journal of Voice, Vol. 21 No. 6, pp. 741-753, November 2007.

Summary: The effect of deep brain stimulation on articulation and phonation subsystems in seven patients with multiple sclerosis (MS) was examined. Production parameters in fast syllable repetitions were defined and measured, and the phonation quality during vowel productions was analyzed. Speech material was recorded for patients (with and without stimulation) and for a group of healthy control speakers. With stimulation, the precision of glottal and supraglottal articulatory gestures is reduced, whereas phonation has a greater tendency to be hyperfunctional in comparison with the healthy control data. Different effects on the two speech subsystems are induced by electrical stimulation of the thalamus in patients with MS.

In this study, the KayPENTAX CSL, Model 4300B, was used to capture and digitize the EGG and microphone signals that were recorded simultaneously.

 

“Acoustic Voice Analysis of Prelingually Deaf Adults Before and After Cochlear Implantation,” Evans, Maegan K., and Dimitar D. Deliyski, Journal of Voice, Vol. 21 No. 6, pp. 669-682, November 2007.

Summary: It is widely accepted that many severe to profoundly deaf adults have benefited from cochlear implants (CIs). However, limited research has been conducted to investigate changes in voice and speech of prelingually deaf adults who receive CIs, a population well known for presenting with a variety of voice and speech abnormalities. The purpose of this study was to use acoustic analysis to explore changes in voice and speech for three prelingually deaf males pre- and postimplantation over 6 months. The following measurements, some measured in varying contexts, were obtained: fundamental frequency (F0), jitter, shimmer, noise-to-harmonic ratio, voice turbulence index, soft phonation index, amplitude- and F0-variation, F0-range, speech rate, nasalance, and vowel production. Characteristics of vowel production were measured by determining the first formant (F1) and second formant (F2) of vowels in various contexts, magnitude of F2-variation, and rate of F2-variation. Perceptual measurements of pitch, pitch variability, loudness variability, speech rate, and intonation were obtained for comparison. Results are reported using descriptive statistics. The results showed patterns of change for some of the parameters while there was considerable variation across the subjects. All participants demonstrated a decrease in F0 in at least one context and demonstrated a change in nasalance toward the norm as compared to their normal hearing control. The two participants who were oral language communicators were judged to produce vowels with an average of 97.2% accuracy and the sign-language user demonstrated low percent accuracy for vowel production.

The KayPENTAX Computerized Speech Lab (CSL), Model 4400, and Nasometer II, Model 6400, were used to obtain the objective data in this study. In addition to the core CSL program, other CSL applications that were used included the Motor Speech Profile (MSP), Model 5141, and Multi-Dimensional Voice Program (MDVP), Model 5105.

 

“Speech Breathing Behavior and Vocal Fold Function in Dysphonic Participants Before and After Therapy During Connected Speech: Preliminary Observations,” Schaeffer, Natalie, Contemporary Issues in Communication Science and Disorders, Vol. 34, pp. 61-72, Fall 2007.

Abstract. This research is an extension of the author’s previous research in which speech breathing values (on the respigraph) of participants with abuse-related dysphonia and those with normal voices were compared during connected speech. Results from the previous study revealed that the dysphonic group used significantly lower end-expiratory values (i.e., extended exhalation below resting expiratory levels) in comparison to the group with normal voices. The present study investigated speech breathing values (on the respigraph) simultaneously with vocal fold function (on the electroglottograph) in 10 dysphonic participants, before and after therapy, during connected speech. Preliminary results indicated a significant improvement in speech breathing data (higher end-expiratory levels) and a trend toward increased vocal fold symmetry (lower speed quotients) following therapy. Additionally, posttherapy perceptual ratings revealed significant improvements in the participants’ vocal quality when compared to pretherapy ratings.

In this study, vocal fold function, specifically contact quotient and contact index or speed quotient, was measured on EGG waveforms acquired using the KayPENTAX Electroglottograph, Model 6103.

 

“Effect of Syllable-Initial Voicing on Vowel Duration During Simultaneous Communication in Speech Produced by Inexperienced Signers: A Systematic Replication,” Allen, Kristin, Sarah Maisonet, Dale E. Metz, Nicholas Schiavetti, and Robert L. Whitehead, Contemporary Issues in Communication Science and Disorders, Vol. 34, pp. 101-105, Fall 2007.

Abstract. Under natural speaking conditions, or speaking alone (SA), vowels following word-initial voiced stop consonants are longer in duration than vowels following word-initial voiceless stops. This study investigated vowel durations following the production of word-initial voiced and voiceless stop consonants produced during simultaneous communication (SC) by recording inexperienced sign language users during SC and SA. Although the results indicated longer sentence and vowel durations for SC than SA, they showed no differences in the relative duration of vowels following voiced or voiceless stops. Vowel durations following voiced stop consonants were uniformly longer than vowel durations following voiceless stops across both speaking conditions. This finding is consistent with previous research indicating that global temporal alterations observed in SC do not degrade important temporal cues of spoken English. The findings are also consistent with the finding of D. E. Metz et al. (2006), who investigated experienced signers’ vowel durations under identical experimental conditions as the present study.

The KayPENTAX Computerized Speech Lab (CSL), Model 4300B, was used in this study to digitize and display the acoustic audio signals.

 

“Functional Analysis of Voice Using Simultaneous High-Speed Imaging and Acoustic Recordings,” Yan, Yuling, Edward Damrose, and Diane Bless, Journal of Voice, Vol. 21 No. 5, pp. 604-616, September 2007.

Summary: We present a comprehensive, functional analysis of clinical voice data derived from both high-speed digital imaging (HSDI) of the larynx and simultaneously acquired acoustic recordings. The goals of this study are to: (1) correlate dynamic characteristics of the vocal folds derived from direct laryngeal imaging with indirectly acquired acoustic measurements; (2) define the advantages of using a combined imaging/acoustic approach for the analysis of voice condition; and (3) identify new quantitative measures to evaluate the regularity of the vocal fold vibration and the complexity of the vocal output—these measures will be key to successful diagnosis of vocal abnormalities. Image- and acoustic-based analyses are performed using an analytic phase plot approach previously introduced by our group (referred to as ‘Nyquist’ plot). Fast Fourier Transform (FFT) spectral analyses are performed on the same data for a comparison. Clinical HSDI and acoustic recordings from subjects having normal and specific voice pathologies, including muscular tension dysphonia (MTD) and recurrent respiratory papillomatosis (RRP) were analyzed using the Nyquist plot approach. The results of these analyses show that a combined imaging/acoustic analysis approach provides better characterization of the vibratory behavior of the vocal folds as it correlates with vocal output and pathology.

Researchers used a KayPENTAX High-Speed Video System for simultaneous image and acoustic data acquisition in this study.

 

“The Effectiveness of Oral Resonance Therapy on the Perception of Femininity of Voice in Male-to-Female Transsexuals,” Carew, Lisa, Georgia Dacakis, and Jennifer Oates, Journal of Voice, Vol. 21 No. 5, pp. 591-603, September 2007.

Summary: Ten male-to-female transsexuals participated in five sessions of oral resonance voice therapy targeting lip spreading and forward tongue carriage. Acoustic analysis of recordings made pre- and posttherapy found that participant formant frequency values (F1, F2, and F3) from the vowels /a/, /i/, and /U/, as well as fundamental frequency (F0), underwent a general increase posttherapy. F3 values, in particular, increased significantly posttreatment. Trends in listener ratings of these recordings showed that the majority of participants were perceived to sound more feminine following treatment. Participants’ self-ratings of their voices pre- and posttreatment also indicated that participants perceived their voices as sounding more feminine and that they were more satisfied with their voices following treatment. The present study supports the findings of previous studies that have demonstrated that resonance characteristics in male-to-female transsexuals can be changed to more closely approximate those of females through oral resonance therapy. This intervention study also demonstrates that a spontaneous increase in F0 is achieved during the course of therapy. Further, this study provides preliminary evidence to suggest that oral resonance therapy may be effective in increasing femininity of voice in male-to-female transsexual clients.

The KayPENTAX CSL, Model 4300B, was used to perform the acoustic analyses in this study.

 

“Acoustic Analysis of the Interaction of Choral Arrangements, Musical Selection, and Microphone Location,” Morris, Richard J., Ashley J. Mustafa, Christopher R. McCrea, Linda P. Fowler, and Christopher Aspaas, Journal of Voice, Vol. 21 No. 5, pp. 568-575, September 2007.

Summary: Acoustic differences were evaluated among three choral arrangements and two choral textures recorded at three microphone locations. A choir was recorded when singing two musical selections of different choral texture, one homophonic and one polyphonic. Both musical selections were sung in three choral arrangements: block sectional, sectional-in-columns, and mixed. Microphones were placed at the level of the choristers, the conductor, and the audience. The recordings at each location were analyzed using long-term average spectrum (LTAS). The LTAS from the mixed arrangement exhibited more signal amplitude than the other arrangements in the range of 1000-3500 Hz. When considering the musical selections, the chorus produced more signal amplitude in the region of 1800-2200 Hz for the homophonic selection. In addition, the LTAS produced by the choir for the homophonic selection varied across the microphone locations. As for the microphone location, the LTAS of the signal detected directly in front of the chorus had a greater slope than the other two locations. Thus, the acoustic signal near the choristers differed from the signals near the conductor and in the audience. Conductors may be using acoustic information from the region of the second and third formants when they decide how to arrange a choir for a particular musical selection.

In this study, productions from the choir were digitized via digital-analog-digital connections between the DAT and the KayPENTAX CSL, Model 4400. The CSL was also used to analyze these files.

 

“The Role of Pitch Memory in Pitch Discrimination and Pitch Matching,” Moore, Robert E., Casie Keaton, and Christopher Watts, Journal of Voice, Vol. 21 No. 5, pp. 560-567, September 2007.

Summary: Accurate control of vocal pitch (fundamental frequency) requires coordination of sensory and motor systems. Previous research has supported the relationship between perceptual accuracy and vocal pitch matching accuracy. The purpose of this study was to investigate the role of memory for pitch in pitch matching and pitch discrimination ability. Three experimental tasks were used. First, a pitch matching task was completed, in which the participants listened to target tones and vocally matched the pitch of the tones. The second task was a pitch discrimination task that required the participants to judge the pitch (same or different) of complex tone pairs. The third task was a pitch discrimination with memory interference task, which was similar to the pitch discrimination task except that interference tones were added. Results of the pitch matching and pitch discrimination tasks yielded a significant correlation between pitch discrimination and pitch matching. These results support earlier findings of a relationship between pitch discrimination and pitch matching abilities. The results also suggest a possible role of pitch memory in both tasks. These findings may have implications for abilities related to accurate pitch control.

The KayPENTAX Multi-Dimensional Voice Program (MDVP) was used in conjunction with the KayPENTAX CSL to calculate the fundamental frequency of each recorded sample.

 

“The Effects of Three Nebulized Osmotic Agents in the Dry Larynx,” Tanner, Kristine, Nelson Roy, Ray M. Merrill, and Mark Elstad, Journal of Speech, Language, and Hearing Research, Vol. 50 No. 3, pp. 635-646, June 2007.

Purpose: This investigation examined the effects of nebulized hypertonic saline, isotonic saline (IS), and sterile (hypotonic) water on phonation threshold pressure (PTP) and self-perceived phonatory effort (PPE) following a surface laryngeal dehydration challenge.

Method: In a double-blind, randomized experimental trial, 60 vocally healthy women (n = 15 per group) underwent a laryngeal desiccation challenge involving oral breathing for 15 min using medical-grade dry air (RH < 1%). Three of the four groups then received nebulized isotonic saline (0.9% NaCl), hypertonic saline (7% NaCl), or sterile (hypotonic) water, respectively; the 4th group served as a nontreatment control. PTP and PPE were estimated for high-pitched productions at baseline, immediately postdesiccation, and at 5, 30, 35, and 50 min postnebulization.

Results: PTP increased significantly for all groups following the desiccation challenge. PTP values were, on average, 0.5 cm H2O greater immediately postdesiccation versus baseline. In contrast, PTP values did not change significantly following the administration of nebulized treatments, although a temporary trend toward a reduction in PTP was observed for the IS group. Unexpectedly, PPE ratings decreased significantly after the desiccation challenge. In general, PPE ratings were poorly correlated with PTP measures.

Conclusions: A laryngeal desiccation challenge (i.e., temporary exposure to extremely low relative humidity while breathing transorally) significantly increased PTP. Although interesting trends emerged, none of the nebulized treatments significantly enhanced recovery from the negative effects of desiccation on PTP. In light of very low correlations between PTP and PPE, serious questions are raised regarding presumed associations between these measures.

The KayPENTAX Multi-Speech, Model 3700, was used to acquire Fo data in this study.

 

“F2 Locus Equations: Phonetic Descriptors of Coarticulation in 17- to 22-Month-Old Children,” Gibson, Terrie and Ralph N. Ohde, JSLHR, Vol. 50 No. 1, pp. 97-108, February 2007.

The general purpose of this research was to describe coarticulation across voiced stop consonant place of articulation in 10 children younger than 2 years of age. A total of 1,182 voiced stop CV productions was analyzed using the locus equation metric, which yielded 3 regression lines that described the relation of F2 onset and F2 vowel for /bV/, /dV/, and /gV/ productions. The results revealed significant differential effects for slope and y-intercept as a function of stop consonant place of articulation. The ordering of the mean slope values for stop consonant place of articulation was /g/>/b/ and /d/, indicating that /g/ was produced with significantly greater coarticulation than /b/ or /d/. However, the unique vowel allophonic pattern of [g] coarticulation reported in the literature for English-speaking adults was generally not learned by these young children. Group and individual coarticulation trends are described in relation to developmental theories of sound acquisition. Results suggest that early coarticulation patterns are phoneme specific.

In this study, CV productions for 7 children were digitized for acoustic analysis using the KayPENTAX Computerized Speech Lab (CSL), while the CV productions of the remaining 3 children were analyzed using the KayPENTAX Multi-Speech, Model 3700.
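
As a rough illustration of the locus equation metric mentioned in the abstract above, the following Python sketch fits a regression line of F2 onset on F2 vowel for one stop consonant. The frequency values and the function name `fit_locus_equation` are invented for illustration; they are not data or code from the study.

```python
def fit_locus_equation(f2_vowel, f2_onset):
    """Least-squares fit of: F2 onset = slope * F2 vowel + intercept."""
    n = len(f2_vowel)
    if n != len(f2_onset) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 tokens")
    mean_x = sum(f2_vowel) / n
    mean_y = sum(f2_onset) / n
    # Sum of squares of x and cross-products of x and y about their means.
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f2_vowel, f2_onset))
    if sxx == 0:
        raise ValueError("F2 vowel values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical F2 values in Hz for a few CV tokens (not study data).
f2_vowel_hz = [1000.0, 1500.0, 2000.0, 2500.0]
f2_onset_hz = [1300.0, 1700.0, 2100.0, 2500.0]

slope, intercept = fit_locus_equation(f2_vowel_hz, f2_onset_hz)
print(f"slope = {slope:.2f}, y-intercept = {intercept:.0f} Hz")
```

A slope closer to 1 means F2 onset varies more with the following vowel (more coarticulation), while a slope closer to 0 means the onset stays nearer a fixed locus.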

 

“Subjective and Objective Evaluation of Voice Quality in Patients with Asthma,” Dogan, Muzeyyen, Emel Eryuksel, Ismail Kocak, Turgay Celikel, and Mehmet Ali Sehitoglu, Journal of Voice, Vol. 21 No. 2, pp. 224-230, March 2007.

Summary: Objectives: To evaluate the voice quality in patients with mild-to-moderate asthma by subjective and objective methods. Study design: Comparative, controlled, cross-sectional study. Methods: Patients with mild-to-moderate asthma (n = 40) and age- and sex-matched healthy controls (n = 40) were included. Acoustic analyses were performed by the Multi-Dimensional Voice Program (MDVP; Kay Elemetrics Corporation, Lincoln Park, NJ) and the movements of the vocal cords were examined by videolaryngostroboscopy (VLS). In addition, the duration of illness, maximum phonation time, “s/z” values, and vital capacity were evaluated. Voice Handicap Index (VHI) and GRB scales were used for subjective evaluations. Results: Maximum phonation time values were significantly shorter both in male and female asthma patients compared with controls (P < 0.0001). Also, average shimmer values in MDVP were higher for both sexes in the patient group compared with controls (P = 0.002 and P = 0.04, respectively). There was a significant difference between female patients and sex-matched controls with regard to mean noise-to-harmonic ratio values (P = 0.006). Female patients with asthma had higher average jitter values compared with sex-matched controls (P < 0.0001). A significant difference was noted between asthma and control groups with regard to GRB scale (P < 0.0001, P < 0.001, and P < 0.0001, respectively). The VHI score was above the normal limit in 16 (40%), and VLS findings were abnormal in 39 (97.5%) asthmatics. Conclusion: In asthmatic patients, maximum phonation time, frequency, and amplitude perturbation parameters were impaired, but the vital capacity and the duration of illness did not correlate with these findings.

In this study, the voice analysis was performed using the KayPENTAX Multi-Dimensional Voice Program.

 

“Adductor Spasmodic Dysphonia Versus Muscle Tension Dysphonia: Examining the Diagnostic Value of Recurrent Laryngeal Nerve Lidocaine Block,” Roy, Nelson, Marshall E. Smith, Brynn Allen, and Ray M. Merrill, Annals of Otology, Rhinology & Laryngology, Vol. 116 (3), pp. 161-168, March 2007.

Objectives: Differentiating adductor spasmodic dysphonia (ADSD) from muscle tension dysphonia (MTD) can be difficult. This investigation examined the precision of response to unilateral lidocaine block of the recurrent laryngeal nerve (RLN block) as a potential diagnostic test to discriminate ADSD from MTD.

Methods: Patients with ADSD (n = 23) and MTD (n = 20) were audio-recorded before and during RLN block. The patients completed self-ratings of dysphonia severity, vocal effort, and laryngeal tightness, and blinded listeners completed auditory-perceptual ratings of overall severity, breathiness, and strain of voice samples before and during the block.

Results: Repeated-measures analysis of variance, with “group” (ADSD/MTD) as the between-subjects variable and “time” (before block/during block) as the within-subjects variable, confirmed significant “time” effects, but no significant “group-by-time” interaction effects, indicating that both disorder groups responded favorably to RLN block, according to patient- and listener-based ratings. Furthermore, low estimates of sensitivity and specificity and weak receiver operating characteristic curves confirmed that a positive response to the RLN block test did not distinguish ADSD from MTD.

Conclusions: We conclude that RLN block offers little discriminatory value in the differential diagnosis of ADSD versus MTD, and a positive response to RLN block should not be considered confirmatory of ADSD.

The KayPENTAX Multi-Speech, Model 3700, was used to digitize and store speech samples recorded before and during RLN block in this study.

 

“Evolution of Vocal Fold Nodules from Childhood to Adolescence,” De Bodt, M.S., K. Ketelslagers, T. Peeters, FL. Wuyts, F. Mertens, J. Pattyn, L. Heylen, A. Peeters, A. Boudewyns, and P. Van de Heyning, Journal of Voice, Vol. 21 No. 2, pp. 151-156, March 2007.

Summary: Bilateral (quasi) symmetrical lesions of the anterior third of the vocal folds, commonly called vocal fold nodules (VFNs), are the most frequent vocal fold lesions in childhood caused by vocal abuse and hyperfunction. This study evaluates their long-term genesis with or without surgery and voice therapy. A group of 91 postmutational adolescents (mean age, 16 years), in whom VFNs were diagnosed in childhood, were questioned to analyze the evolution of their complaints. Thirty-four of them could be clinically reexamined by means of the European Laryngological Society-protocol, including a complete laryngological investigation and voice assessment. A total of 21% of the questioned group (n = 91) had voice complaints persisting into postpubescence with a statistically significant difference (P ≤ 0.001) between boys (8%) and girls (37%). VFNs were still present in 47% of the girls and 7% of the boys of the clinically evaluated group (n = 34). Analysis of the data before and after puberty shows that the variables gender, allergy, and degree of dysphonia (“G”) in childhood enable a fairly correct prediction of persisting voice complaints in adolescence (sensitivity of 89% and specificity of 67%). The results of this study show a clearly different evolution for both sexes, with significantly higher long-term risks for dysphonic girls with allergy.

In this study, the KayPENTAX Voice Range Profile (VRP) and Multi-Dimensional Voice Program (MDVP) were used in conjunction with the Computerized Speech Lab (CSL) for acoustic analysis.

 

“Long-Term Outcome of Hyperfunctional Voice Disorders Based on a Multiparameter Approach,” Van Lierde, K.M., S. Claeys, M. De Bodt, and P. van Cauwenberge, Journal of Voice, Vol. 21 No. 2, pp. 179-188, March 2007.

Summary: The purpose of this study is to determine the long-term voice outcome (6.1 years after a well-defined voice treatment program) of hyperfunctional voice disorders in 27 subjects. All patients showed a muscle tension pattern type I (MTP I). Perceptual ratings, aerodynamic and acoustical analyses, Dysphonia Severity Index (DSI), and Voice Handicap Index (VHI) were performed. The laryngovideostroboscopic images indicated that 51% of the subjects still show pathological laryngological findings. The negative evolution of the DSI from -1 to -3.2 is in agreement with this finding. Analysis of the components of the DSI shows that the main responsible variable for this negative change is the lowest intensity (I-low) that increased by 8.1 dB, indicating that subjects generally speak too loudly, which is a typical problem for vocal hyperfunction. The VHI-score indicates an unimportant psychosocial impact of the voice disorder. The more objective and laryngostroboscopic findings indicate a chronic situation for a substantial part of the subjects and even a worse situation for some of them. Whether the long-term voice outcome results can be changed with the insertion of several follow-up voice rehabilitation sessions over the years remains unanswered and is a subject for further research.

Acoustic analysis in this study was performed using the KayPENTAX Multi-Dimensional Voice Program (MDVP) and Voice Range Profile (VRP) in conjunction with the CSL.
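
The size of the DSI change reported in the abstract above can be checked roughly against the index's weighting. The Python sketch below assumes the formula published by Wuyts et al. (2000), DSI = 0.13 × MPT + 0.0053 × F0-high − 0.26 × I-low − 1.18 × jitter (%) + 12.4, which is not stated in the abstract itself. The function name is hypothetical and all input values except the 8.1 dB rise in I-low are invented.

```python
def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_pct):
    """DSI under the assumed Wuyts et al. (2000) weighting.

    mpt_s:       maximum phonation time in seconds
    f0_high_hz:  highest attainable fundamental frequency in Hz
    i_low_db:    lowest attainable intensity in dB
    jitter_pct:  jitter in percent
    """
    return (0.13 * mpt_s
            + 0.0053 * f0_high_hz
            - 0.26 * i_low_db
            - 1.18 * jitter_pct
            + 12.4)


# Invented baseline values; only the 8.1 dB increase in I-low comes from the abstract.
before = dysphonia_severity_index(15.0, 600.0, 60.0, 1.0)
after = dysphonia_severity_index(15.0, 600.0, 60.0 + 8.1, 1.0)

delta = after - before
print(f"DSI change from an 8.1 dB rise in I-low alone: {delta:.2f}")
```

Under that assumption, the rise in I-low by itself lowers the DSI by about 2.1 points, which is close to the reported change from -1 to -3.2.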

 

“Current Diagnosis and Treatment of Laryngocele in Adults,” Dursun, Gursel, Ozan B. Ozgursoy, Suha Beton, and Hunkar Batikhan, Otolaryngology-Head and Neck Surgery, Vol. 136, No. 2, pp. 211-215, February 2007.

Objectives: To evaluate the treatment outcome of a series of laryngoceles and to comment on the current diagnosis and management of laryngoceles.

Study Design and Setting: A retrospective review of charts, radiological and histopathological notes, videolaryngostroboscopic records, and acoustic voice analyses of patients with laryngocele treated over a 10-year period was undertaken.

Results: Seven patients had internal laryngoceles; one had external; another one had combined laryngocele. Patients with internal laryngocele underwent endoscopic CO2 laser resection, while those with external or combined laryngocele were treated via external approach. Quality of voice was improved and no recurrences were encountered during the follow-up. No evidence of laryngeal cancer was found on the histological examinations.

Conclusions: Endoscopic CO2 laser resection of internal laryngocele provides a reliable and cost-effective method that minimizes hospitalization and the need for tracheotomy. We believe that advances in the applications of laser in microlaryngosurgery will alter the traditional management of all types of laryngoceles.

The acoustic analysis in this study was performed using the KayPENTAX Computerized Speech Lab in conjunction with the Multi-Dimensional Voice Program (MDVP).

 

“The Effect of Visible Speech in the Perceptual Rating of Pathological Voices,” Martens, Jan W.M.A.F., Huib Versnel, and Philippe H. Dejonckere, Archives of Otolaryngology-Head & Neck Surgery, Vol. 133 No. 2, pp. 178-185, February 2007.

Objectives: To test a simple method for improving consistency among raters for the perceptual evaluation of pathological voice quality by providing visible speech (spectrogram) as additional information because, to date, the interrater variability still limits the widespread clinical use of the best available rating system.

Design: Experimental comparison between 2 different ways (with and without the addition of visible speech) of perceptual rating by trained professionals of recorded pathological voices. Furthermore, the correlation between acoustical (jitter, shimmer, and noise-harmonic ratio) and perceptual parameters was investigated in both rating conditions.

Subjects: Six experts evaluated 70 recorded pathological voices using the GIRBAS (grade, instability, roughness, breathiness, asthenicity, and strain) scale in 2 separate sessions: first, conventionally, without visible speech as additional information, and several months later, with visible speech as additional information.

Main Outcome Measures: The κ interrater agreement and the correlation coefficient between GIRBAS scores and acoustic measures.

Results: We found a significant effect of visible speech on the agreement between the raters. The interrater agreement according to κ statistics was significantly stronger with the addition of visible speech than without for rating grade, roughness, and breathiness. The correlation between acoustical and perceptual parameters showed no significant effect of visible speech.

Conclusion: The addition of visible speech to the perceptual evaluation of pathological voices is an interesting clinical asset to enhance its reliability. The addition of visible speech to the clinical setting is feasible, since affordable computer programs are currently available that can provide the spectrogram in quasi-real time while conversing with the patient. The acoustical analysis might be applied in addition to perceptual rating in a multi-dimensional approach to assess voice quality.

In this study, the KayPENTAX Multi-Dimensional Voice Program (MDVP) was used for acoustical analysis; spectrograms were generated using the Kay DSP Sona-Graph, Model 7800.

 

“Effects of Vocal Training and Phonatory Task on Voice Onset Time,” McCrea, Christopher R. and Richard J. Morris, Journal of Voice, Vol. 21 No. 1, pp. 54-63, January 2007.

Summary: Objectives/Hypothesis: The purpose of this study was to examine the temporal-acoustic differences between trained singers and nonsingers during speech and singing tasks.

Methods: Thirty male participants were separated into two groups of 15 according to level of vocal training (i.e., trained or untrained). The participants spoke and sang carrier phrases containing English voiced and voiceless bilabial stops, and voice onset time (VOT) was measured for the stop consonant productions.

Results: Mixed analyses of variance revealed a significant main effect between speech and singing for /p/ and /b/, with VOT durations longer during speech than singing for /p/, and the opposite true for /b/. Furthermore, a significant phonatory task by vocal training interaction was observed for /p/ productions.

Conclusions: The results indicated that the type of phonatory task influences VOT and that these influences are most obvious in trained singers secondary to the articulatory and phonatory adjustments learned during vocal training.

The KayPENTAX Computerized Speech Lab (CSL), Model 4300B, was used to perform the acoustic analysis in this study.

 

“Transoral Approach to Laser Thyroarytenoid Myoneurectomy for Treatment of Adductor Spasmodic Dysphonia: Short-Term Results,” Su, Chih-Ying, Hui-Ching Chuang, Shang-Shyue Tsai, and Jeng-Fen Chiu, Annals of Otology, Rhinology & Laryngology, Vol. 116 (1), pp. 11-18, January 2007.

Objectives: The surgical technique for the resection of the recurrent laryngeal nerve for adductor spasmodic dysphonia (ASD) has high late failure rates. During the past decade, botulinum toxin has emerged as the treatment of choice for ASD. Although effective, it also has significant disadvantages, including a temporary effect and an unpredictable dose-response relationship. In this study we investigated the effectiveness of a new transoral approach to laser thyroarytenoid myoneurectomy for treatment of ASD.

Methods: Fourteen patients with ASD underwent transoral laser myoneurectomy of bilateral thyroarytenoid muscles. Under general anesthesia, an operating microscope and a carbon dioxide laser were used to perform myectomy of the mid-posterior belly of bilateral thyroarytenoid muscles together with neurectomy of the terminal nerve fibers among the deep muscle bundles. Care was taken not to damage the vocalis ligaments, arytenoid cartilages, and lateral cricoarytenoid muscles. Preoperative and postoperative videolaryngostroboscopy and vocal assessments were studied.

Results: The 13 patients who completed more than 6 months of follow-up were enrolled in this study. Moderate and marked vocal improvement was achieved in 92% of the patients (12 of 13) after laser surgery during an average follow-up period of 17 months (range, 6 to 31 months). No vocal fold atrophy or paralysis was observed in any patient. None of the patients had a recurrence during the follow-up period.

Conclusions: Transoral laser myoneurectomy of bilateral thyroarytenoid muscles is a relatively simple, effective, and valuable technique for the treatment of ASD. The durability of outcome achieved with this procedure is encouraging.

A KayPENTAX Computerized Speech Lab (CSL), Model 4300B, was used to measure acoustic parameters including mean fundamental frequency, noise-to-harmonics ratio, jitter, and shimmer.

 

“The Speaker’s Formant,” Bele, Irene Velsvik, Journal of Voice, Vol. 20 No. 4, pp. 555-578, December 2006.

Summary: The current study concerns speaking voice quality in two groups of professional voice users, teachers (n = 35) and actors (n = 36), representing trained and untrained voices. The voice quality of text reading at two intensity levels was acoustically analyzed. The central concept was the speaker’s formant (SPF), related to the perceptual characteristics “better normal voice quality” (BNQ) and “worse normal voice quality” (WNQ). The purpose of the current study was to get closer to the origin of the phenomenon of the SPF, and to discover the differences in spectral and formant characteristics between the two professional groups and the two voice quality groups. The acoustic analyses were long-term average spectrum (LTAS) and spectrographical measurements of formant frequencies. At very high intensities, the spectral slope was rather quadrangular without a clear SPF peak. The trained voices had a higher energy level in the SPF region compared with the untrained, significantly so in loud phonation. The SPF seemed to be related to both sufficiently strong overtones and a glottal setting, allowing for a lowering of F4 and a closeness of F3 and F4. However, the existence of SPF also in LTAS of the WNQ voices implies that more research is warranted concerning the formation of SPF, and concerning the acoustic correlates of the BNQ voices.

The spectrograms in this study were generated with a KayPENTAX DSP Sona-Graph, Model 5500.

 

“Intelligibility of Tracheoesophageal Speech in Noise,” McColl, Douglas A., Journal of Voice, Vol. 20 No. 4, pp. 605-615, December 2006.

Summary: The purpose of this investigation is to determine the extent to which background noise negatively impacts the intelligibility of tracheoesophageal (TE) speech. Four male TE speakers provided speech samples that were recorded in quiet and in noise conditions. The listener/subjects occupied a sound-treated booth and were presented with two tasks. In Task 1, the subjects were required to transcribe TE speech stimuli recorded in quiet. In Task 2, the subjects were required to transcribe TE speech stimuli recorded in noise. Repeated measures 2 x 4 factorial analyses of variance were calculated for the dataset. The results of the statistical analysis revealed that the TE speech produced in quiet was significantly more intelligible to the listeners than the TE speech produced in noise for three of the four TE speakers. Furthermore, the results seem to support the hypothesis that the activation of a Lombard effect in TE speakers may detract from their overall speech intelligibility.

In this study, the acoustical analyses of the vocal parameters (jitter, shimmer, and noise-to-harmonic ratio) were performed using the KayPENTAX CSL, Model 4300B, in conjunction with the Multi-Dimensional Voice Program (MDVP), Model 5105.

 

“The Effect of Perceptual Training on Inexperienced Listeners’ Judgments of Dysphonic Voice,” Eadie, Tanya L. and Carolyn R. Baylor, Journal of Voice, Vol. 20 No. 4, pp. 527-544, December 2006.

Objectives/hypothesis: The purpose of this study was (1) to determine whether changes in intra- and interrater reliability occur for inexperienced listeners’ judgments of overall severity, roughness, and breathiness in dysphonic and normal speakers after 2 hours of listener training; and (2) to determine the acoustic bases of inexperienced listeners’ judgments before and after training.

Study Design: Prospective, single group, pre- and postdesign.

Methods: Thirty adult dysphonic and six normal speaker samples were selected from a database. Samples included 21 test stimuli and 15 training stimuli of both sustained vowels and connected speech. Sixteen inexperienced listeners judged all samples for overall severity, roughness, and breathiness using visual analog scales. Each listener provided pretraining ratings at baseline. Listeners were then trained using 15 anchor voice samples and 15 training stimuli. During training, listeners were provided with definitions of rating dimensions, accuracy feedback, and anchor samples. Listeners then judged test stimuli in a posttraining session. Speaker samples also were analyzed acoustically.

Results: Intrarater reliability was least variable for judgments of overall severity, but improved further with training. Listener judgments of roughness and breathiness in vowels were least reliable at baseline, but they significantly improved between listeners after training. Finally, measures of cepstral peak prominence significantly predicted all voice quality judgments except roughness in vowels, which was predicted by shimmer. The acoustic bases of group perceptual judgments did not seem to change with training.

Conclusions: These findings have implications for developing training programs in perceptual evaluation and mapping relationships between acoustic and perceptual characteristics of voice disorders.

Voice samples from the KayPENTAX Disordered Voice Database and Program, Model 4337, were used in this study; the KayPENTAX Multi-Dimensional Voice Program (MDVP), Model 5105, was used for acoustic analysis.

 

“The Effect of Speaking Context on Elicitation of Habitual Pitch,” Zraick, Richard I., Mollie A. Gentry, Laura Smith-Olinde, and Brent A. Gregg, Journal of Voice, Vol. 20 No. 4, pp. 251-262, December 2006.

Summary: The purpose of this study was to investigate whether there was an effect of speaking context on the elicitation of habitual pitch [speaking fundamental frequency (SFF)]. Six simulated speaking contexts were created (speaking during a voice evaluation, speaking in public, speaking to a peer, speaking to a superior, speaking to a subordinate, and speaking to a parent or spouse), and the SFF for 30 adult women with normal voice was compared across these contexts. A one-way analysis of variance (ANOVA) revealed a statistically significant (P < 0.001) effect of simulated speaking context on SFF, with post hoc analyses indicating a statistically significant difference in SFF while “speaking to a superior” (P < 0.001) and “speaking to a subordinate” (P < 0.001). Possible reasons for an effect of speaking context are discussed. Also, the implications of the use of varied speaking contexts when eliciting SFF are discussed, as is the possibility of an effect of speaking context on the elicitation of other clinically useful voice parameters.

In this study, voice samples were captured with the KayPENTAX Computerized Speech Lab (CSL), Model 4500, and were analyzed with the KayPENTAX Multi-Dimensional Voice Program (MDVP), Model 5105.

 

“Ambulatory Monitoring of Disordered Voices,” Hillman, Robert E., James T. Heaton, Asa Masaki, Steven M. Zeitels, and Harold A. Cheyne, Annals of Otology, Rhinology & Laryngology, Vol. 115 (11), pp. 795-801, November 2006.

Objectives: Recently developed systems for ambulatory monitoring of voice use employ miniature accelerometers placed at the base of the anterior neck to sense phonation. As it is hoped that such systems will help improve the clinical assessment and management of voice disorders, this study was undertaken to determine the impact of dysphonia severity on the accuracy of accelerometer-based estimates of vocal function.

Methods: Simultaneous recordings were made of oral acoustic (microphone) and neck skin acceleration signals for 6 normal speakers and 18 patients with voice disorders (mild to severe dysphonia) as they performed several speech tasks. Measures of phonation time, fundamental frequency, and sound pressure level were extracted from the two types of signals and compared.

Results: It was generally demonstrated that accelerometer-based measures closely approximated corresponding measurements obtained from a microphone signal across all levels of dysphonia severity. Furthermore, there was evidence that in some cases the accelerometer may actually represent a more robust approach for estimating phonation parameters in disordered voices.

Conclusions: The results generally support the recent application of accelerometers as phonation sensors in ambulatory voice monitoring systems that can be used in the clinical assessment and management of voice disorders.

In this study, the KayPENTAX Ambulatory Phonation Monitor (APM), Model 3200, was used to perform both data collection and analysis.

 

“Birth Control Pills and Nonprofessional Voice: Acoustic Analyses,” Amir, Ofer, Tal Biron-Shental, and Esther Shabtai, Vol. 49 No. 5, pp. 1114-1126, October 2006.

Purpose: Two studies are presented here. Study 1 was aimed at evaluating whether the voice characteristics of women who use birth control pills that contain different progestins differ from the voice characteristics of a control group. Study 2 presents a meta-analysis that combined the results of Study 1 with those from 3 recent studies that compared voices of women who use and do not use birth control pills.

Method: In Study 1, voice samples from 30 women with no history of voice training, who use pills with different progestins (drospirenone, desogestrel, gestodene), and 10 women who do not use the pill were recorded at specific time points across the menstrual cycle and were analyzed acoustically. In Study 2, results from Study 1 were analyzed jointly with results from three recent studies, which used similar methodologies.

Results: Results of Study 1 did not reveal acoustic differences in sustained phonation of vowels across the pill groups and controls. Results of the meta-analysis performed in Study 2 indicated that pill users exhibited lower jitter and shimmer values on sustained vowels, whereas no difference of fundamental frequency was observed among women who use the pill.

Conclusions: These results support findings from previous studies, which suggested that no adverse effect on voice was detected among nonprofessional speakers who use new-generation monophasic birth control pills, for the measures studied. Furthermore, results of the meta-analysis suggested that some acoustic properties of the voice, which are reflected in perturbation measures in sustained vowels, may be improved among women who use the pill.

The KayPENTAX Multi-Dimensional Voice Program (MDVP), Model 5105, was used to perform the acoustic analysis in this study.
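
The perturbation measures mentioned in the abstract above (jitter and shimmer) can be illustrated with a minimal sketch. It assumes the common "local" definition, the mean absolute difference between consecutive glottal cycles expressed as a percentage of the mean. The function name and the cycle values are hypothetical, and this is not a reproduction of the MDVP algorithm or of the study's data.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean value.

    Applied to cycle durations this gives a local jitter estimate;
    applied to cycle peak amplitudes it gives a local shimmer estimate.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical cycle-by-cycle measurements from a sustained vowel.
periods_ms = [5.00, 5.02, 4.98, 5.01, 4.99]   # glottal cycle durations
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]   # peak amplitude of each cycle

jitter_percent = local_perturbation(periods_ms)
shimmer_percent = local_perturbation(amplitudes)
print(f"jitter: {jitter_percent:.2f}%  shimmer: {shimmer_percent:.2f}%")
```

Lower values indicate more regular vocal fold vibration, which is the sense in which the meta-analysis reports lower jitter and shimmer among pill users.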

 

“Articulation Rate and Vowel Space Characteristics of Young Males with Fragile X Syndrome: Preliminary Acoustic Findings,” Zajac, David J., Joanne E. Roberts, Elizabeth A. Hennon, Adrianne A. Harris, Elizabeth F. Barnes, and Jan Misenheimer, Vol. 49 No. 5, pp. 1147-1155, October 2006.

Purpose: Increased speaking rate is a commonly reported perceptual characteristic among males with fragile X syndrome (FXS). The objective of this preliminary study was to determine articulation rate—one component of perceived speaking rate—and vowel space characteristics of young males with FXS.

Method: Young males with FXS (n = 38), developmental age (DA)-matched males (n = 21), and chronological age (CA)-matched males (n = 16) were audiotaped while engaged in spontaneous conversation and a picture-naming task. Articulation rate in syllables per second during intelligible utterances and vowel space area/dispersion measures were acoustically determined for each speaker.

Results: Males with FXS did not articulate significantly faster than CA-matched males. Area and dispersion of the acoustic vowel space also were similar between the 2 groups. Males with FXS, however, used significantly shorter utterances and had a tendency to pause less often than CA-matched males. In addition, males with FXS exhibited greater intraspeaker variability of formants associated with the vowel /a/.

Conclusions: These preliminary findings suggest that articulation rate may not be a primary factor contributing to perceived speaking rate of males with FXS. Limitations of the study relative to speech production tasks and utterance intelligibility are discussed.

Researchers in this study used the KayPENTAX Computerized Speech Lab (CSL), Model 4400, in the acoustic analysis performed.

 

“Response of the Female Vocal Quality and Resonance in Professional Voice Users Taking Oral Contraceptive Pills: A Multiparameter Approach,” Van Lierde, Kristiane M., Sofie Claeys, Marc De Bodt, and Paul Van Cauwenberge, Laryngoscope, Vol. 116, pp. 1894-1898, October 2006.

Objectives: The purpose of this study was to analyze the vocal quality and resonance (nasality and nasalance values) during the menstrual cycle in professional voice users using oral contraceptive pills (OCPs). Although professional voice users are more sensitive and aware of their vocal quality, no changes of voice and resonance characteristics were expected because OCPs create a stable hormonal balance through the menstrual cycle.

Study Design: The authors conducted a comparative study of 24 healthy, young professional voice users using OCPs. One assessment was performed between the 10th and 17th day of pill intake, when hormonal levels reached a steady state. The second assessment was performed during the first 3 days of menses, when no pills were taken and hormonal levels were minimized.

Methods: Subjective (perceptual evaluation of voice and nasality) and objective (aerodynamic, voice range, acoustic, Dysphonia Severity Index [DSI], nasometer) assessment techniques were used.

Results: The Mann-Whitney U test showed no significant difference between the perceptual evaluation of the voice and the nasality in the two assessments. The paired Student t test showed no significant difference regarding the maximum phonation time, the vocal performance, the acoustic parameters, and the DSI.

Conclusion: These findings indicate that OCPs do not have an impact on the objective and subjective voice and resonance parameters in young professional voice users. This information is specifically relevant to professional voice users who are more aware of vocal quality changes and ear, nose and throat specialists/voice therapists who treat professional voice users with voice problems/disorders. Further research regarding the impact of increased vocal load during the premenstrual or menstrual phase in professional voice users using OCPs should be considered.

The KayPENTAX Nasometer, Model 6200, was used to obtain nasalance values in this study. The KayPENTAX Voice Range Profile (VRP) and Multi-Dimensional Voice Program (MDVP) were used to obtain voice range measures and perform the acoustic analysis, respectively.

 

“Clinical Evaluation of Parkinson’s-Related Dysphonia,” Sewall, Gregory K., Jack Jiang, and Charles N. Ford, Laryngoscope, Vol. 116, pp. 1740-1744, October 2006.

Objective/Hypothesis: Nearly one third of patients with idiopathic Parkinson’s disease (IPD) cite dysphonia, characterized subjectively as causing a harsh and breathy voice, as their most debilitating deficit. Medical or behavioral treatments may lead to voice improvement. The purpose of this study was to determine (1) whether vocal fold injection of Cymetra (micronized form of collagen, elastin, proteoglycans; Lifecell Co.) is associated with changes in dysphonic voice characteristics in subjects with IPD, as judged perceptually using a standard instrument, the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V), and (2) which acoustic and aerodynamic measurements of voice are most reflective of any observed perceptual changes in voice.

Study Design: Prospective clinical evaluation of patients with Parkinson’s-related dysphonia (PRD).

Methods: Six patients with PRD were evaluated before treatment for the presence of dysphonia and glottal gap. All subjects underwent trans-oral vocal fold collagen injection using topical anesthesia in the otolaryngology clinic as part of their clinical care. At the initial clinic visit, and 10 to 14 days after vocal fold collagen injection, patients were asked to complete the Voice Handicap Index (VHI), a questionnaire concerning voice-related quality of life, and perceptual analyses of voice quality were performed. In addition, patients underwent acoustic (pitch/loudness range, maximum phonation time [MPT]) and aerodynamic (phonation threshold pressure [PTP]) voice analysis.

Results: Five of six subjects had self-perceived improvements in voice after treatment, as determined by the VHI (range, +8 to -24). All five subjects who completed testing demonstrated decreased PTP (range, -1.3 to -2.7, P = .002). Five of six subjects demonstrated statistically significant improvements in MPT (range, -2 to 16 s, P = .05). Five of six subjects had improved pitch range (-26 to 343 Hz), whereas all subjects had increased intensity range (0.6 to 23 dB) after injection.

Conclusion: Trans-oral collagen injection in patients with PRD is safe, well tolerated, and is an effective temporary method of subjectively improving voice and speech in selected patients with IPD. Reduction of glottal gap with collagen improved MPT and subglottal PTP. The resulting gain of vocal efficiency may reduce vocal fatigue and provide a useful adjunct to voice therapy for PRD.

In this study, all vocal tasks were recorded using the KayPENTAX Computerized Speech Lab (CSL) and Multi-Dimensional Voice Program (MDVP). The Real-Time Pitch (RTP) program was used to analyze MPT, pitch, and loudness range.

 

“Multiparametric Evaluation of Dysphonic Severity,” Estella P.-M. Ma and Edwin M.-L. Yiu, Journal of Voice, Vol. 20 No. 3, pp. 380-390, September 2006.

Summary: In recent years, the multiparametric approach for evaluating perceptual rating of voice quality has been advocated. This study evaluates the accuracy of predicting perceived overall severity of voice quality with a minimal set of aerodynamic, voice range profile (phonetogram), and acoustic perturbation measures. One hundred and twelve dysphonic persons (93 women and 19 men) with laryngeal pathologies and 41 normal controls (35 women and six men) with normal voices participated in this study. Perceptual severity judgement was carried out by four listeners rating the G (overall grade) parameter of the GRBAS scale. The minimal set of instrumental measures was selected based on the ability of the measure to discriminate between dysphonic and normal voices, and to attain at least a moderate correlation with perceived overall severity. Results indicated that perceived overall severity was best described by maximum phonation time of sustained /a/, peak intraoral pressure of the consonant-vowel /pi/ strings production, voice range profile area, and acoustic jitter. Direct-entry discriminant function analysis revealed that these four voice measures in combination correctly predicted 67.3% of perceived overall severity levels.

Researchers in this study used the KayPENTAX CSL, Model 4300B, to capture all voice recordings. The KayPENTAX MDVP was then used to perform acoustic perturbation analysis and the KayPENTAX Aerophone II, Model 6800, the aerodynamic evaluation.

 

“Acoustic Signal Typing for Evaluation of Voice Quality in Tracheoesophageal Speech,” van As-Brooks, Corina, J., Florien J. Koopmans-van Beinum, Louis C.W. Pols, and Frans J.M. Hilgers, Journal of Voice, Vol. 20 No. 3, pp. 355-368, September 2006.

Summary: Because of the aperiodicity of many tracheoesophageal voices, acoustic analysis of the tracheoesophageal voice is less straightforward than that of the normal voice. This study presents the development and testing of an acoustic signal typing system based on visual inspection of a narrow-band spectrogram that can be used by researchers for classification of voice quality in tracheoesophageal speech. In addition to this classification system, a selection of acoustic measures [median fundamental frequency, standard deviation of fundamental frequency, jitter, percentage of voiced (%Voiced), harmonics-to-noise ratio (HNR), glottal-to-noise excitation (GNE) ratio, and band energy difference (BED)] was computed to provide more insight into the acoustic components of tracheoesophageal voice quality. For clinical relevance, relationships between the acoustic signal types and an overall judgment of the voice were investigated as well. Results showed that the four acoustic signal types form a good basis for performing more acoustic analyses and give an impression of the overall quality of the voice.

The KayPENTAX Computerized Speech Lab, Model 4300B, together with a DAT recorder, was used to record and digitize the speech data in this study.

 

“Maximum Duration of Sustained /s/ and /z/ and the s/z Ratio with Controlled Intensity,” Marylou Pausewang Gelfer and John F. Pazera, Journal of Voice, Vol. 20 No. 3, pp. 369-379, September 2006.

Summary: The purpose of this study was to compare maximum prolongations of controlled-intensity /s/ vs. /z/ in young healthy male and female adults and to compare the s/z ratio in young men and women. Twenty young adult men and 20 young adult women were included in this study. Participants produced 10 trials of /s/ and 10 of /z/ with a controlled intensity of 60-dB sound-pressure level (SPL). Maximum prolongations and s/z ratio were determined by three different methods: based on the longest out of 10 trials, the longest of 3 trials, and an average of the first 3 trials. Results revealed that based on averaged group data, /s/ and /z/ seemed to be prolonged for similar durations. Men consistently prolonged both phonemes significantly longer than women. There were no significant differences in s/z ratio between men and women. However, when individual data were reviewed, it seemed that some subjects consistently prolonged /s/ for a longer duration than /z/, some subjects prolonged /z/ longer than /s/, and some subjects actually produced approximately equal durations of the two phonemes. It was further noted that /s/ durations were more favorably impacted by practice than /z/ durations.

In this study, participants were screened for normal voice production using the KayPENTAX Multi-Dimensional Voice Program (MDVP) software in conjunction with the KayPENTAX Computerized Speech Lab (CSL).
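
The three ways of deriving the s/z ratio described in the abstract above can be sketched as follows. This is only an illustration: the durations are hypothetical, the function names are invented here, and "the longest of 3 trials" is assumed to mean the first three trials, which the abstract does not state.

```python
from statistics import mean


def summarize(durations):
    """Summarise one phoneme's maximum-prolongation trials (seconds) three ways."""
    return {
        "longest_of_10": max(durations),
        "longest_of_first_3": max(durations[:3]),   # assumption: first three trials
        "mean_of_first_3": mean(durations[:3]),
    }


def sz_ratios(s_durations, z_durations):
    """Return the s/z ratio under each summarising method."""
    s, z = summarize(s_durations), summarize(z_durations)
    return {method: s[method] / z[method] for method in s}


# Hypothetical durations in seconds for ten trials of each phoneme.
s_durations = [18.0, 20.0, 19.0, 22.0, 21.0, 20.5, 19.5, 23.0, 22.5, 21.5]
z_durations = [19.0, 21.0, 20.0, 20.0, 22.0, 21.0, 23.0, 20.0, 21.5, 22.0]

ratios = sz_ratios(s_durations, z_durations)
for method, ratio in ratios.items():
    print(f"{method}: {ratio:.3f}")
```

With these made-up numbers the ratio shifts slightly depending on the method, which is the kind of method dependence the study set out to examine.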

 

“Physiologic Features of Vocal Fatigue: Electromyographic Spectral-Compression in Laryngeal Muscles,” Boucher, Victor J., Christian Ahmarani, and Tareck Ayad, Laryngoscope, Vol. 116 No. 6, pp. 959-965, June 2006.

Objectives: This study addresses the problem of defining observable attributes of “vocal fatigue” as a physiologic condition. The aim was to determine the applicability of electromyography (EMG) spectral compression in observing fatigue in laryngeal muscles arising from prolonged vocal effort.

Study Design: Single institution, nonrandomized, prospective analysis of subjects evaluated in an academic, tertiary care center.

Methods: In adapting EMG techniques, we report pretest observations that bear on the choice of voicing tasks serving to induce and estimate muscle fatigue and the selection of muscles that are particularly involved in effortful vocalization. On this basis, an experiment was designed where intramuscular EMG was used to record lateral cricoarytenoid potentials of seven subjects at regular intervals across a 12 to 14 hour period (50 samples per subject). Between each of these samples, the participants were required to produce loud speech for 3 minutes with peaks of 74 dBA at 1 meter.

Results: The results show fatigue-related spectral compression for all subjects and nonlinear changes across time indicating critical values beyond which fatigue is persistent.

Conclusions: Spectral compression appears to present a robust attribute of fatigue-related changes in muscles involved in vocalization. There are several implications with respect to research on the prevention of acquired voice pathologies.

In this study, the KayPENTAX CSL, Model 4400, was used to obtain fundamental frequency measures.

 

“Functional Outcomes after CO2 Laser Treatment of Early Glottic Carcinoma,” Ledda, Gian Peppino, Nancy Grover, Vishal Pundir, Ernestina Masala, and Roberto Puxeddu, Laryngoscope, Vol. 116 No. 6, pp. 1007-1011, June 2006.

Objective: To analyze vocal outcome after endoscopic CO2 laser treatment of early glottic carcinoma by perceptive and objective assessment.

Study Design: Retrospective study.

Methods: Retrospective analysis of 141 consecutive patients undergoing surgery for previously untreated early glottic carcinoma between October 1993 and July 2003. Five types of laser cordectomies as classified by the European Laryngological Society classification were performed. Comparison of voice results between the different types of cordectomies as well as with a control group was performed.

Results: There was no significant difference in the vocal parameters between subepithelial and subligamental cordectomies and controls (P > .05). There was, however, a significant difference between the groups of transmuscular, total, and extended cordectomies and controls (P < .05).

Conclusions: Good oncologic results and vocal outcomes with no difference between controls and subepithelial and subligamental cordectomies support the use of CO2 laser endoscopic surgery as the first line of treatment for early glottic cancer.

Researchers used the KayPENTAX CSL, Model 4300, in conjunction with the MDVP to perform the acoustic analysis in this study.

 

“Aerodynamic Analysis of Male-to-Female Transgender Voice,” Gorham-Rowan, Mary and Richard Morris, Journal of Voice, Vol. 20 No. 2, pp. 251-262, June 2006.

Summary: The attainment of a feminine-sounding voice is a highly desirable goal among male-to-female transgender (MFT) persons, but this goal may be difficult for many to accomplish. The characteristics associated with a feminine vocal quality include increases in fundamental frequency and in vocal breathiness. In this study, we used inverse-filtering of the airflow signal to indirectly assess vocal fold function in 13 MFT persons. Each participant was asked to sustain the vowel /a/ first in her biological male voice and then again in her female voice. In addition, these vowel productions were compared with vowels produced by age-matched biologic women and men. The results of the study revealed a significant increase in maximum flow declination rate during female voice production. Perceptual ratings of a feminine voice were associated with a fundamental frequency (Fo) of 180 Hz or greater, although Fo did not differ significantly between male and female voice production. These results are discussed relative to the mechanisms that obtained a feminine-sounding voice.

Acoustic analysis in this study was performed using the KayPENTAX Computerized Speech Lab (CSL) in conjunction with the Multi-Dimensional Voice Program.

 

“Relationship Between Masking Levels and Phonatory Stability in Normal-Speaking Women,” Carole T. Ferrand, Journal of Voice, Vol. 20 No. 2, pp. 223-228, June 2006.

Summary: Disruption of auditory feedback such as masking has been shown to influence vocal production. A reliable finding is an increase in intensity level; an increase in fundamental frequency (Fo) is a less robust finding. Research is lacking concerning the effects of auditory masking on measures of phonatory stability such as jitter and harmonics-to-noise ratio (HNR). This study investigated changes in intensity, Fo, jitter, and HNR in 22 normally speaking college-aged women. Subjects produced the vowel /a/ under three conditions: no masking level (0-dB ML), 50-dB ML, and 80-dB ML. Significant differences between conditions emerged for intensity; means for the other measures were not significantly different. Intraindividual differences between conditions for each variable are discussed in the framework of auditory versus kinesthetic feedback.

The KayPENTAX Computerized Speech Lab (CSL), Model 4300, was used to perform the acoustic analysis in this study.

 

“Predictors of Laryngeal Complications in Patients Implanted with the Cyberonics Vagal Nerve Stimulator,” Shaw, Gary Y., Philip Sechtem, Jeff Searl, and Emily S. Dowdy, Annals of Otology, Rhinology & Laryngology, Vol. 115 (4), pp. 260-267, April 2006.

Objectives: Since its approval by the US Food and Drug Administration in 1997 for management of medically refractory seizures, more than 35,000 patients have been implanted with the Cyberonics vagal nerve stimulator. Preliminary reports described transient vocal changes in the majority of subjects, which were thought to be short-term. However, these reports were for the most part based upon perceptual evaluations by the subjects themselves. Later reports described possibly more permanent recurrent laryngeal nerve injury and recommended measuring the nerve diameter to use the safest spiral cuff electrode. To date, no study has systematically evaluated vocal fold mobility in subjects before and after implantation. The objectives of this study were to determine the true incidence of both short- and long-term recurrent laryngeal nerve injuries and determine whether there are any potential indicators to predict in which patients long-term nerve deficits may develop.

Methods: Thirteen subjects underwent preimplantation laryngeal electromyography, videolaryngoscopy, measurement of the maximum phonation time, Voice Handicap Index determination, and Consensus Auditory-Perceptual Evaluation of Voice. Two weeks after implantation, all subjects underwent videolaryngoscopy. Three months after implantation and activation of the device, all subjects were evaluated.

Results: Six of the 13 subjects had significant vocal fold mobility abnormalities at 2 weeks. Significant electromyographic abnormalities were detected before implantation in 5 subjects. All 5 of the subjects, at 3 months after implantation, had prolonged left vocal fold paresis.

Conclusions: The authors conclude that perioperative vocal fold paresis occurs in approximately 50% of subjects. Further, laryngeal electromyography performed before implantation of the vagal nerve stimulator is a statistically significant predictor (p < .05) of which patients may be at risk for extended vocal fold abnormalities. Possible explanations for this phenomenon are offered. Surgical modifications to limit vagal nerve injury are offered.

In this study, videolaryngoscopy was performed with a KayPENTAX RLS, Model 9100, using a 70-degree rigid endoscope. Acoustic analysis was performed using the KayPENTAX CSL, Model 4300B. Additionally, all patients underwent flexible laryngoscopic examination using the KayPENTAX fiberoptic flexible endoscope, Model FNL-10RP3.

 

“Adaptation to an Electropalatograph Palate: Acoustic, Impressionistic, and Perceptual Data,” McLeod, Sharynne and Jeff Searl, American Journal of Speech-Language Pathology, Vol. 15, pp. 192-206, May 2006.

Purpose: The purpose of this study was to evaluate adaptation to the electropalatograph (EPG) from the perspective of consonant acoustics, listener perceptions, and speaker ratings.

Method: Seven adults with typical speech wore an EPG and pseudo-EPG palate over 2 days and produced syllables, read a passage, counted, and rated their adaptation to the palate. Consonant acoustics, listener ratings, and speaker ratings were analyzed.

Results: The spectral mean for the burst (/t/) and frication (/s/) was reduced for the first 60-120 min of wearing the pseudo-EPG palate. Temporal features (stop gap, frication, and syllable duration) were unaffected by wearing the pseudo-EPG palate. The EPG palate had a similar effect on consonant acoustics as the pseudo-EPG palate. Expert listener ratings indicated minimal to no change in speech naturalness or distortion from the pseudo-EPG or EPG palate. The sounds /tʃ, dʒ, ʃ, s, z, ʒ/ were most likely to be affected. Speaker self-ratings related to oral comfort, speech, tongue movement, appearance, and oral sensation were negatively affected by the presence of the palatal devices.

Conclusions: Speakers detected a substantial difference when wearing a palatal device, but the effects on speech were minimal based on listener ratings. Spectral features of consonants were initially affected, although adaptation occurred. Wearing an EPG or pseudo-EPG palate for approximately 2 hr results in relatively normal-sounding speech with acoustic features similar to a no-palate condition.

Acoustic analysis in this study was performed using the KayPENTAX Computerized Speech Lab, Model 4400.

 

“Effects of Prolonged Loud Reading on Normal Adolescent Male Voices,” Kelchner, Lisa N., Margaret M. Toner, and Linda Lee, Language, Speech, and Hearing Services in Schools (LSHSS), Vol. 37 No. 2, pp. 96-103, April 2006.

Purpose: The purpose of this article was to test the effects of vocal loading in healthy, peripubescent teenage boys. It was hypothesized that select acoustic measures, ratings of physical appearance of the larynx, and self-ratings of physical effort and vocal quality in the experimental group would significantly change in response to 2 hr of prolonged loud reading.

Method: In this prospective, repeated measures study, 25 boys aged 13-16 years were randomly assigned to either an experimental group (2 hr of continuous loud reading) or a control group (silent reading with brief periods of conversation). Pre-post acoustic, videoendoscopic, and perceptual data including self-ratings were collected. Postreading recovery changes were tracked by monitoring average reading fundamental frequency (F0) and intensity for 20 min following cessation of the reading task.

Results: The experimental group demonstrated statistically significant differences before and after prolonged loud reading for three variables: F0 (p < .01), self-ratings of vocal quality (p < .01), and physical effort (p < .01). No pre-post changes were evident in the control group. In the experimental group, posttest return of F0 to pretest levels occurred within 20 min. Self-ratings revealed that the boys felt that their voice quality worsened and physical effort increased during the experimental task. Expert ratings did not detect any significant differences in either the perceptual quality of the experimental group’s voices or their videoendoscopic images.

Implications: These findings demonstrate that prolonged loud reading can induce temporary but measurable changes in F0 and in self-perception of vocal function in adolescent males who are experiencing a period of rapid laryngeal growth. The underlying mechanism for these changes remains unclear and warrants continued investigation. Furthermore, the results suggest that in the pubescent male population, comparable vocal loading tasks encountered in daily use should not result in long-term negative effects.

The KayPENTAX Computerized Speech Lab (CSL), Model 4400, was used to record the acoustic signals in this study.

 

“Laser-Assisted Voice Adjustment (LAVA) in Transsexuals,” Orloff, Lisa A., Andrea P. Mann, John F. Damrose, and Stephen N. Goldman, Laryngoscope, Vol. 116 No. 4, pp. 655-660, April 2006.

Objective: The objective of this study was to evaluate results of laser-assisted voice adjustment (LAVA) surgery in male-to-female (MTF) transsexual patients with androphonia.

Methods: The authors conducted a prospective case-control study of MTFs who underwent CO2 laser vocal fold vaporization between 1997 and 2003. Thirty-one patients were self-referred for voice feminization. Pre- and postoperative evaluations were completed. Patients’ voices were recorded to obtain Fo before and after surgery. Voice Handicap Index (VHI) questionnaires were completed by post-LAVA patients. A panel of blinded listeners identified patients as male or female based on samples of connected speech recorded over the telephone.

Results: Mean follow-up (23 weeks) revealed pitch increases averaging 26 Hz. Self-evaluations revealed increases in voice femininity, congruity with self-image, and satisfaction. However, the evaluations also showed decreased vocal quality, loudness, and vocal range. Mean VHI was consistent with VHI scores associated with Reinke’s edema. Six of 10 patients were consistently perceived as female.

Conclusions: LAVA provides a conservative treatment for androphonia. Post-operative voice therapy may optimize outcomes.

 

The acoustic analysis in this study was performed using the KayPENTAX Visi-Pitch II, Model 3300.

 

“The Singing Power Ratio as an Objective Measure of Singing Voice Quality in Untrained Talented and Nontalented Singers,” Watts, Christopher, Kathryn Barnes-Burroughs, Julie Estis, and Debra Blanton, Journal of Voice, Vol. 20 No. 1, pp. 82-88, March 2006.

Summary: A growing body of contemporary research has investigated differences between trained and untrained singing voices. However, few studies have separated untrained singers into those who do and do not express abilities related to singing talent, including accurate pitch control and production of a pleasant timbre (voice quality). This investigation studied measures of the singing power ratio (SPR), which is a quantitative measure of the resonant quality of the singing voice. SPR reflects the amplification or suppression in the vocal tract of the harmonics produced by the sound source. This measure was acquired from the voices of untrained talented and nontalented singers as a means to objectively investigate voice quality differences. Measures of SPR were acquired from vocal samples with fast Fourier transform (FFT) power spectra to analyze the amplitude level of the partials in the acoustic spectrum. Long-term average spectra (LTAS) were also analyzed. Results indicated significant differences in SPR between groups, which suggest that vocal tract resonance, and its effect on perceived vocal timbre or quality, may be an important variable related to the perception of singing talent. LTAS confirmed group differences in the tuning of vocal tract harmonics.

 

The KayPENTAX Computerized Speech Lab (CSL), Model 4400, was used to determine the singing power ratio (SPR) and the long-term average spectra (LTAS) of the subjects in this study.
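
As a rough illustration of how an SPR value might be read off a power spectrum, the sketch below assumes a commonly cited definition that the abstract itself does not spell out: the level difference in dB between the strongest spectral peak in the 2-4 kHz band and the strongest peak in the 0-2 kHz band. The function names, the synthetic two-tone signal, and the plain DFT are illustrative choices only; they are not the CSL procedure used in the study.

```python
import cmath
import math


def power_spectrum_db(samples, sample_rate):
    """Return (frequency_hz, level_db) pairs from a Hann-windowed naive DFT."""
    n = len(samples)
    windowed = [
        x * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(samples)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        level_db = 10 * math.log10(abs(coeff) ** 2 + 1e-12)
        spectrum.append((k * sample_rate / n, level_db))
    return spectrum


def singing_power_ratio(samples, sample_rate):
    """Assumed definition: peak level in 2-4 kHz minus peak level in 0-2 kHz (dB)."""
    spectrum = power_spectrum_db(samples, sample_rate)
    low_peak = max(db for f, db in spectrum if 0 < f < 2000)
    high_peak = max(db for f, db in spectrum if 2000 <= f <= 4000)
    return high_peak - low_peak


# Synthetic test signal: a strong 500 Hz partial and a weaker 3000 Hz partial.
sample_rate = 16000
n_samples = 512
signal = [
    math.sin(2 * math.pi * 500 * i / sample_rate)
    + 0.1 * math.sin(2 * math.pi * 3000 * i / sample_rate)
    for i in range(n_samples)
]

spr_db = singing_power_ratio(signal, sample_rate)
print(f"SPR: {spr_db:.1f} dB")
```

Under this definition, a value closer to zero means relatively more energy in the 2-4 kHz region. For real recordings an FFT routine would replace the naive DFT used here for self-containment.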

 

“The Effectiveness of Group Therapy for Students with Mild Voice Disorders: A Controlled Clinical Trial,” Simberg, Susanna, Eeva Sala, Jyrki Tuomainen, Jaana Sellman, and Anna-Maija Rönnemaa, Journal of Voice, Vol. 20 No. 1, pp. 97-109, March 2006.

Summary: Previous studies of students studying to be teachers have indicated that these students commonly have voice disorders. Ideally, voice disorders should be treated before students start their work as teachers, but the resources for this treatment are often limited. This study examines whether group voice therapy is effective for teacher students. Accordingly, 20 teacher students with mild voice disorders received group voice therapy (in three small groups), whereas 20 students with similar voice disorders served as a control group and consequently did not receive voice therapy. Two out of three outcome measures (perceptual evaluation of voice quality and a questionnaire on the occurrence of vocal symptoms) indicated significant changes in the treatment group compared with the control group. No differences between groups were noted in the laryngeal status. The results suggest that group voice therapy seems to be an effective method to treat students with mild voice disorders.

 

The KayPENTAX Multi-Dimensional Voice Program (MDVP), Model 5105, was used to prepare voice samples for the perceptual assessment of voice quality in this study.

 

“Vowel Effect on Glottal Parameters and the Magnitude of Jaw Opening,” Lim, Marilyn, Emily Lin, and Philip Bones, Journal of Voice, Vol. 20 No. 1, pp. 46-54, March 2006.

Summary: This study investigated the relationship among the magnitude of jaw opening, intrinsic fundamental frequency (F0), and glottal parameters in natural speech. Acoustic, jaw opening, and electroglottographic (EGG) signals were simultaneously recorded. The subjects were 10 healthy men with New Zealand English as their native language. Subjects were asked to repeat a standard nonemphasized sentence in which one of the target vowels (/a/, /e/, /i/, /o/, and /u/) was embedded in various contexts. The glottal parameters F0, open quotient (OQ), and speed quotient (SQ) were measured from the EGG signal. Results of a series of one-way repeated-measures analyses of variance (ANOVA) showed a significant vowel effect on the magnitude of jaw opening [F(4, 24) = 25.512, P < .001], F0 [F(4, 28) = 45.415, P < .001] and speed quotient [F(4, 28) = 5.233, P = .003], but not on the open quotient [F(4, 28) = 0.501, P = .735]. The magnitude of jaw opening was found to be inversely related with F0 (r = -0.624, n = 25, P = .0009). These findings showed that the magnitude of jaw opening was related to F0 and that jaw opening might be a control signal for simulation of long-term F0 variation to achieve a higher degree of naturalness in artificial voice.

 

Three glottal parameters (F0, OQ, and SQ) were measured in this study using the EGG signal obtained from the KayPENTAX Electroglottograph, Model 6103.

 

“Fundamental Frequency in Monolingual English, Bilingual English/Russian, and Bilingual English/Cantonese Young Adult Women,” Altenberg, Evelyn P. and Carole T. Ferrand, Journal of Voice, Vol. 20 No. 1, pp. 89-96, March 2006.

Summary: Mean F0 of nine young adult English/Russian female bilinguals and nine young adult English/Cantonese female bilinguals were examined from samples of connected speech in each language. Mean F0 were compared in each language and in English with those of a monolingual English control group of ten young adult female speakers. Acoustic measurements were analyzed with the Kay Elemetrics Multi-Speech program (Kay Elemetrics, Lincoln Park, NJ). The results indicate that the English/Russian bilinguals consistently had a higher mean F0 in Russian than in English. Mean F0 did not change with language switch for the English/Cantonese speakers. There were no significant differences between the groups in their English production. Clinical implications regarding norms for both monolingual and bilingual persons, as well as implications for understanding the nature of bilingualism, in particular code-switching, are discussed.

 

In this study, KayPENTAX’s Multi-Speech, Model 3700, was used for acoustic analysis.

 

“Chaos in Voice, From Modeling to Measurement,” Jiang, Jack J., Yu Zhang, and Clancy McGilligan, Journal of Voice, Vol. 20 No. 1, pp. 2-17, March 2006.

Summary: Chaos has been observed in turbulence, chemical reactions, nonlinear circuits, the solar system, biological populations, and seems to be an essential aspect of most physical systems. Chaos may also be central to the interpretation of irregularity in voice disorders. This presentation will summarize the results from a series of our recent studies. These studies have demonstrated the presence of chaos in computer models of vocal folds, experiments with excised larynges, and human voices. Methods based on nonlinear dynamics can be used to quantify chaos and irregularity in vocal fold vibration. Studies have suggested that disordered voices from laryngeal pathologies such as laryngeal paralysis, vocal polyps, and vocal nodules might exhibit chaotic behaviors. Conventional parameters, such as jitter and shimmer, may be unreliable for analysis of aperiodic and chaotic voice signals. Nonlinear dynamic methods, however, have differentiated between normal and pathological phonations and can describe the aperiodic and chaotic voice. Chaos theory and nonlinear dynamics can enhance our understanding and therefore our assessment of pathological phonation.

 

In this study, jitter and shimmer were estimated using the KayPENTAX Multi-Dimensional Voice Program (MDVP).

 

“Diagnosis of Unilateral Recurrent Laryngeal Nerve Paralysis: Laryngeal Electromyography, Subjective Rating Scales, Acoustic and Aerodynamic Measures,” Bielamowicz, Steven and Sheila V. Stager, Laryngoscope, Vol. 116, pp. 359-364, March 2006.

 

Objective/Hypothesis: To determine whether specific laryngeal electromyography (LEMG) patterns in patients with unilateral vocal fold paralysis/paresis (UVFP) are related to etiology of injury, time from onset of injury, patient perception of symptom severity, acoustic measures, and laryngeal aerodynamic measures.

 

Study Design: This is a retrospective review of 75 patients.

 

Methods: Each patient received LEMG, acoustic and aerodynamic testing, and a subjective rating scale assessment (the Glottal Closure Index). Statistical analysis by groups was performed using both χ² and single-factor analysis of variance testing.

 

Results: An iatrogenic etiology was associated with poor tone on LEMG (P = .05). Those individuals evaluated more than 3 months after onset demonstrated more nascent units, a sign of reinnervation, compared with individuals evaluated before 3 months (P < .02). Individuals with fewer normal motor units on LEMG had significantly higher mean translaryngeal air flows (P = .044). Individuals with poor recruitment had significantly shorter maximum phonation times (P = .034) and higher mean flows (P = .044). Individuals with better laryngeal tone as noted on LEMG had significantly lower mean flows (P = .06).

 

Conclusions: Specific LEMG patterns are related to the etiology of the UVFP and time course since recurrent laryngeal nerve injury. LEMG appears to reflect vocal fold muscle tone as seen on laryngeal function studies. In combination, these studies provide a cohesive assessment of laryngeal function in patients with UVFP.

 

All subjects in this study were examined using transnasal fiberoptic laryngoscopy without videostroboscopy. The flexible endoscope used was coupled to a KayPENTAX RLS 9100 light source. KayPENTAX’s CSL was also used in conjunction with the Real-Time Pitch Program to determine MPT.

 

“Acoustic Analysis of Snoring Sounds by a Multidimensional Voice Program,” Hara, Hirotaka, Naoko Murakami, Yuji Miyauchi, and Hiroshi Yamashita, Laryngoscope, Vol. 116, pp. 379-386, March 2006.

Objectives: This prospective study aimed to determine whether the acoustic characteristics of snoring sounds differed between simple snorers and patients with obstructive sleep apnea syndrome (OSAS) by using a multidimensional voice program (MDVP) that analyzes various aspects of voice.

 

Methods: Fifty-eight patients (48 men, 10 women) with a history of snoring were included in the study. All patients underwent conventional polysomnography (PSG). Twelve subjects were diagnosed as simple snorers and 46 subjects were diagnosed with OSAS. The mean body mass index (BMI) of simple snorers was 24.7 kg/m² and that of patients with OSAS was 25.8 kg/m². Natural overnight snoring was recorded from each subject while they slept during PSG. Using the multiple token protocols of MDVP, 30 snores from each subject were analyzed automatically. For data analysis, four markers were used: peak frequency, soft phonation index (SPI), noise to harmonics ratio (NHR), and power ratio.

 

Results: The Mann-Whitney U test revealed significant differences between the SPI, NHR, and power ratio of simple snorers and patients with OSAS. Simple snorers had a high SPI value. OSAS-related snorers demonstrated a high NHR and low power ratio.

 

Conclusions: MDVP can be used for snoring sound analysis as a noninvasive examination of sleep-related breathing disorders for differential diagnosis. However, a suitable option that is rapid and has an easy-to-use interface would be more advantageous for analyzing snoring sounds.

 

KayPENTAX’s Multi-Dimensional Voice Program (MDVP), Model 5105, was used to perform the snoring sound analysis on four markers—peak frequency, SPI, NHR, and power ratio—in this study.

 

“Treatment of Adductor-Type Spasmodic Dysphonia by Surgical Myectomy: A Preliminary Report,” Koufman, Jamie A., Catherine J. Rees, Stacey L. Halum, and David Blalock, Annals of Otology, Rhinology & Laryngology, Vol. 115 (2), pp. 97-102, February 2006. 

Objectives: Despite the belief that it represents a central neurologic dysfunctional process, adductor-type spasmodic dysphonia without tremor is usually effectively treated by injection of botulinum toxin A; however, in most cases such injections must be repeated every few months. A promising new surgical procedure is herein reported.

Methods: Under local anesthesia with intravenous sedation, a large laryngoplasty window is created, and under direct vision with intraoperative voice monitoring, fibers from the thyroarytenoid and lateral cricoarytenoid muscles are removed until breathiness occurs. The two sides are staged; that is, one side is done at a time, with surgery on the second side being performed 3 to 6 months after that on the first side, if needed.

Results: This was a retrospective, unblinded study of 5 patients who underwent myectomy of the thyroarytenoid and lateral cricoarytenoid muscles. The preliminary results show improved voice fluency in all patients at 5 to 19 months of follow-up. There was no period of prolonged breathiness or dysphagia in any of the patients, and there were no surgical complications.

Conclusions: Myectomy of the thyroarytenoid and lateral cricoarytenoid muscles is a promising new surgical treatment for adductor-type spasmodic dysphonia that may effectively mimic “permanent” botulinum toxin injections.

Acoustic analysis in this study was performed using the KayPENTAX Computerized Speech Lab (CSL) in conjunction with the CSL Pitch Program, Model 4331.

 

“The Effects of Fundamental Frequency Level on Voice Onset Time in Normal Adult Male Speakers,” McCrea, Christopher R. and Richard J. Morris, Journal of Speech, Language, and Hearing Research, Vol. 48 No. 5, pp. 1013-1024, October 2005.

The purpose of this study was to examine the effect of fundamental frequency (Fo) on stop consonant voice onset time (VOT). VOT was measured from the recordings of 56 young men reading phrases containing all 6 English voiced and voiceless stops in word-initial position across high-, medium-, and low-Fo levels. Separate analyses of variance for the voiced and voiceless stops revealed no significant main effect for Fo for the voiced stops but a significant Fo effect for the voiceless stops. Across the voiceless stops, productions at high Fos displayed significantly shorter VOTs than productions at low or mid Fos. The findings indicated that researchers must take into account the Fo level at which voiceless stop VOT is measured.

Voice recordings in this study were digitized and acoustically analyzed using a KayPENTAX Computerized Speech Lab (CSL), Model 4300B.

 

“An Acoustic Profile of Normal Swallowing,” Youmans, Scott R. and Julie A.G. Stierwalt, Dysphagia, Vol. 20, Number 3, pp. 195-209, Summer 2005.

Cervical auscultation has been proposed as a technique to augment the clinical evaluation of dysphagia to improve its accuracy in the diagnosis of dysphagia. Before using cervical auscultation to reliably diagnose disordered swallowing, it is necessary to first acoustically characterize normal swallowing for comparison with dysphagic swallowing. Ninety-seven healthy adult participants consumed teaspoon boluses of various consistencies while the sounds of swallowing were recorded. Descriptive statistics were reported for measures of duration, intensity, and frequency of the acoustic swallowing signal. Correlations between the variables and between bolus consistencies were computed. Overall, results compared favorably with previous research. Significant correlations were found among several of the variables, including an increasing duration of the acoustic swallowing signal with increasing age and decreasing intensity of the signal with increasing age. None of the variables differed significantly as a function of gender. Of potential clinical relevance, significant correlations between bolus consistencies for the duration and intensity variables indicated relative similarities across bolus consistencies. Duration and intensity of the acoustic signal appeared to be the most reliable of the variables measured. These results could serve as a reference point for future studies into normal swallowing across multiple bolus consistencies and volumes and eventually be compared with disordered swallowing.

The KayPENTAX Computerized Speech Lab (CSL), Model 4400, was used to record and analyze the acoustic signals in this study.

 

“Speech Motor Development During Acquisition of the Voicing Contrast,” Grigos, Maria I., John H. Saxman, and Andrew M. Gordon, Journal of Speech, Language, and Hearing Research, Vol. 48 No. 4, pp. 739-752, August 2005.

Lip and jaw movements were studied longitudinally in 19-month-old children as they acquired the voicing contrast for /p/ and /b/. A movement tracking system obtained lip and jaw kinematics as participants produced the target utterances /papa/ and /baba/. Laryngeal adjustments were also tracked through acoustically recorded voice onset time (VOT) of the consonants. Across this period of developmental phonological change, the children began to produce VOTs in 2 distinct categories for voiced and voiceless plosives. Specific kinematic differences were observed during oral opening and closing and between spatial and temporal parameters of movement. The development of the voicing contrast was most closely associated with changes in jaw kinematics for oral opening in comparison to that of the lip. Conversely, movements into oral closing were not accompanied by significant increases in jaw, upper lip, or lower lip displacement or velocity, although a decrease in jaw movement variability was found. There was no evidence of phoneme-specific movement differences between /p/ and /b/ in the children or in the adults studied. Spatial coupling between the jaw and upper lip changed significantly across sessions, whereas changes in temporal coupling were not observed. Findings indicate that oral opening and closing have different task requirements and that children modify their articulatory movements to meet the demands of each task. Overall, the findings illustrate how orofacial movements and laryngeal function change in parallel during linguistic development.

The KayPENTAX Computerized Speech Lab (CSL), Model 4300B, was used to perform the acoustic analysis in this study.
 

“Toward Diagnostic and Phenotype Markers for Genetically Transmitted Speech Delay,” Shriberg, Lawrence D., Barbara A. Lewis, J. Bruce Tomblin, Jan L. McSweeny, Heather B. Karlsson, and Alison R. Scheer, Journal of Speech, Language, and Hearing Research, Vol. 48 No. 4, pp. 834-852, August 2005.

Converging evidence supports the hypothesis that the most common subtype of childhood speech sound disorder (SSD) of currently unknown origin is genetically transmitted. We report the first findings toward a set of diagnostic markers to differentiate this proposed etiological subtype (provisionally termed speech delay-genetic) from other proposed subtypes of SSD of unknown origin. Conversational speech samples from 72 preschool children with speech delay of unknown origin from 3 research centers were selected from an audio archive. Participants differed on the number of biological, nuclear family members (0 or 2+) classified as positive for current and/or prior speech-language disorder. Although participants in the 2 groups were found to have similar speech competence, as indexed by their Percentage of Consonants Correct scores, their speech error patterns differed significantly in 3 ways. Compared with children who may have reduced genetic load for speech delay (no affected nuclear family members), children with possibly higher genetic load (2+ affected members) had (a) a significantly higher proportion of relative omission errors on the Late-8 consonants; (b) a significantly lower proportion of relative distortion errors on these consonants, particularly on the sibilant fricatives /s/, /z/, and /ʃ/; and (c) a significantly lower proportion of backed /s/ distortions, as assessed by both perceptual and acoustic methods. Machine learning routines identified a 3-part classification rule that included differential weightings of these variables. The classification rule had diagnostic accuracy value of 0.83 (95% confidence limits = 0.74-0.92), with positive and negative likelihood ratios of 9.6 (95% confidence limits = 3.1-29.9) and 0.40 (95% confidence limits = 0.24-0.68), respectively. The diagnostic accuracy findings are viewed as promising. The error pattern for this proposed subtype of SSD is viewed as consistent with the cognitive-linguistic processing deficits that have been reported for genetically transmitted verbal disorders.

In this study, speech samples were digitized at a 20-kHz sample rate using the KayPENTAX Computerized Speech Lab (CSL), Model 4300B.

 

“Genetics of Vocal Quality Characteristics in Monozygotic Twins: A Multiparameter Approach,” Van Lierde, Kristiane M., Bart Vinck, Sofia De Ley, Gregory Clement, and Paul Van Cauwenberge, Journal of Voice, Vol. 19 No. 4, pp. 511-518, December 2005.

The main purpose of this study was to determine the vocal quality characteristics among the 45 monozygotic cotwins (MT). As the performance of the voice is related to several genetically determined anatomical and physiological factors, the authors hypothesized that the vocal characteristics and the overall vocal quality by means of the Dysphonia Severity Index (DSI) will be identical in MT. An additional objective of this study was to determine whether sex and age influence vocal similarities in MT and to compare the voice characteristics of MT with the normative data of unrelated peers. As more environmental factors influence the aging of the voice, age-related differences were expected. No sex-related differences were expected. Subjective and objective assessment techniques determined the vocal quality. No significant differences were obtained, and most comparisons resulted in significant correlation coefficients. For the acoustic parameters jitter and shimmer only, no significant correlation coefficients could be obtained. It is clear that the perceptual voice characteristics, the laryngeal aerodynamic measurements of maximum phonation time (MPT), the vocal performances, and the overall vocal quality by means of the DSI are similar in MT. These vocal characteristics are not influenced either by the subjects’ age or sex and are situated within the normative range of unrelated peers. To what extent other aspects (environment, anxiety, tension, etc) might play a role in the acoustical dimensions regarding frequency and amplitude perturbation, which were in the normal range, is a subject of further research.

In this study, the KayPENTAX Voice Range Profile and Multi-Dimensional Voice Program (MDVP) were used in conjunction with Computerized Speech Lab (CSL) for voice analysis.

 

“Perturbation and Nonlinear Dynamic Analyses of Voices from Patients with Unilateral Laryngeal Paralysis,” Zhang, Yu, Jack J. Jiang, Laura Biazzo, and Malinda Jorgensen, Journal of Voice, Vol. 19 No. 4, pp. 519-528, December 2005.

This study used perturbation methods (eg, jitter and shimmer) and nonlinear dynamic methods (eg, phase space reconstruction and correlation dimension) to analyze sustained voices generated by normal subjects and patients with unilateral laryngeal paralysis. We found that normal and pathological voices had low-dimensional dynamic characteristics. For nearly periodic voices, jitter and shimmer values of pathological voices from patients with unilateral laryngeal paralysis were significantly different from normal voices. For nearly periodic and aperiodic voices, the correlation dimensions of pathological voices were statistically higher than normal voices. Receiver operating characteristic analysis was used to evaluate the diagnostic performances of jitter, shimmer, and correlation dimension. High sensitivity and specificity of these three acoustic analyses in distinguishing unilateral laryngeal paralysis patients from normal subjects were found. We concluded that combining traditional perturbation analysis and nonlinear dynamic analysis might provide efficient descriptions of pathological voices and represent a valuable tool for clinical diagnosis of laryngeal paralysis.

The voice samples examined in this study were from the KayPENTAX Disordered Voice Database and Program, Model 4337. Acoustic perturbation measures were obtained using the KayPENTAX Computerized Speech Lab (CSL) in conjunction with the Multi-Dimensional Voice Program (MDVP), Model 5105.

 

“The Relationship Between Vocal Pitch-Matching Skills and Pitch Discrimination Skills in Untrained Accurate and Inaccurate Singers,” Watts, Christopher, Robert Moore, and Kacia McCaghren, Journal of Voice, Vol. 19 No. 4, pp. 534-543, December 2005.

Few studies have compared the relationship between pitch discrimination accuracy and the accuracy of fundamental frequency (Fo) control. This study investigated the relationship between vocal pitch-matching skills, which is one method of testing Fo control, and pitch discrimination skills in untrained accurate and inaccurate singers, and the effect of timbre on their pitch discrimination accuracy. Data showed that accurate singers had more precise discrimination and pitch-matching abilities compared with their inaccurate counterparts. Pitch discrimination was differentially affected by the timbre (eg, spectral differences) of comparison tones. In addition, results showed a significant relationship between pitch discrimination abilities and pitch-matching accuracy. The results suggest that accurate Fo control is at least partially dependent on pitch discrimination abilities, which are important for accurate singing.

All singing samples in this study were recorded digitally using the KayPENTAX Computerized Speech Lab (CSL).


 

“Revisiting the Pitch Controversy: Changes in Speaking Fundamental Frequency (SFF) After Management of Functional Dysphonia,” Roy, Nelson and Heru Hendarto, Journal of Voice, Vol. 19 No. 4, pp. 582-591, December 2005.

Speaking fundamental frequency (SFF) and its perceptual correlate “habitual pitch” have been considered important and contentious parameters in voice assessment and treatment. In clinical circles, disagreement exists regarding the role of habitual pitch in the development, maintenance, and treatment of disordered voices. Despite these divergent opinions, few studies have examined changes in SFF after voice therapy. To determine whether consistent directional and magnitude changes in SFF occur after management, pretreatment and posttreatment audio recordings of 40 women with functional dysphonia were analyzed. All subjects were treated with manual circumlaryngeal therapy, a treatment approach that does not directly target pitch as a perceptual entity to be manipulated. Results indicated that, as a group, no significant change in mean SFF was observed after successful management. Although no consistent directional pattern was identified, 80% of the subjects experienced pitch changes greater than one semitone; this suggests that voice improvement is often accompanied by a shift in SFF. Clinical implications of the data are discussed.

In this study, the KayPENTAX Computerized Speech Lab (CSL) was used in conjunction with the Multi-Dimensional Voice Program (MDVP), Model 5105, to calculate the mean speaking fundamental frequency (SFF) of the subjects.

 

“Effects of Topical Anesthetic and Flexible Fiberoptic Laryngoscopy on Professional Sopranos,” Jacobs, Margaret A. and Dianna T. Kenny, Journal of Voice, Vol. 19 No. 4, pp. 645-664, December 2005.

This study examined the acoustic and perceptual effects of topical anesthetic and flexible fiberoptic laryngoscopy (FFL) against a control condition on the singing voices of ten professional sopranos. Recordings of a section of an aria, various scales, and a messa di voce exercise were obtained in the three experimental conditions. Acoustic analyses of the same aria section recorded during the three conditions were similar with respect to the distribution of energy across the spectrum (LTAS) and vibrato rate and extent. The ability of the participants to achieve their highest and lowest notes or to complete the messa di voce was also not affected by the anesthetic or FFL. Perceptual ratings of a variety of parameters by experienced singing teachers also revealed little difference across conditions with only “appropriate velopharyngeal closure” found to differ in one comparison. These results indicate that highly experienced operatic sopranos are either not affected by or appear to have the ability to compensate for the presence of anesthetic and the FFL. The most likely explanation is that this group of singers relied on a solid vocal technique. Results will need to be replicated on less accomplished singers before concluding that this medical procedure does not affect the operatic singing voice.

The KayPENTAX Computerized Speech Lab (CSL) and Multi-Dimensional Voice Program (MDVP) were used to analyze jitter, shimmer, and noise-to-harmonic ratio in this study.

 

“Objective Voice Measures in Nonsinging Patients with Unilateral Superior Laryngeal Nerve Paresis,” Robinson, Jamie L., Steven Mandel, and Robert Thayer Sataloff, Journal of Voice, Vol. 19 No. 4, pp. 665-667, December 2005.

The clinical value of objective voice measures in nonsinging patients with superior laryngeal nerve dysfunction is unknown. In this study, patients with symptomatic unilateral superior laryngeal nerve paresis were evaluated for maximum phonation time, frequency range of phonation, and mean flow rate. Patients with coexisting pathology, bilateral superior laryngeal nerve paresis, and those with recurrent laryngeal nerve paresis were excluded from this analysis. A total of 35 nonsinging patients, 14 men and 21 women, with unilateral superior laryngeal nerve paresis were examined between 1999 and 2002. The severity of superior laryngeal nerve paresis ranged from 25% to 85% of normal recruitment with a mean of 70% superior laryngeal nerve recruitment in men and 65% in women by electromyography. In both men and women with superior laryngeal nerve paresis, the maximum phonation time and frequency range of phonation were decreased and the mean air flow rate was increased when compared with normal population values. The jitter percent, shimmer percent, and noise-to-harmonic ratio were also increased in patients when compared with normative data. Selected objective voice measures are abnormal in voice patients with superior laryngeal nerve paresis, which suggests that the measures may be useful as outcome measures after therapy. More research is encouraged.

Acoustic analyses in this study were performed using the KayPENTAX Computerized Speech Lab (CSL) and Multi-Dimensional Voice Program (MDVP).

 

“Comparison of Vocal Characteristics of Future Professionals in Three Different University Majors,” Ng, Manwa L., Rita L. Bailey, and Lance R. Lippert, Contemporary Issues in Communication Science and Disorders, Vol. 32, pp. 142-150, Fall 2005.

Vocal characteristics of students in three preprofessional undergraduate programs (speech-language pathology and audiology, broadcast communication, and theater) were compared across the following parameters: a number of acoustic measurements, a perceptual voice evaluation by speech-language pathologists, and self-reported vocal quality ratings using a vocal use questionnaire. Results indicated some significant differences between different student groups. Specifically, students in the broadcast communication and theater programs were found to have significantly higher scores of self-identified and perceptual voice problems than students in the speech-language pathology and audiology program. Acoustic measurements appeared to confirm the results of the self-identified and perceptual data.

In this study, the KayPENTAX CSL, Model 4300B, was used to record and digitize speech samples, which were then used to calculate percent jitter values, percent shimmer values, and harmonic-to-noise ratios.

 

“Second Formant Frequency Transition in Diphthongs During Simultaneous Communication,” MacKenzie, Douglas J., Marietta Bennett, Trisha Breen, Anne Bufano, Janella Clarke, Jessica Eggleston, Katie Eye, and Russ Turner, Contemporary Issues in Communication Science and Disorders, Vol. 32, pp. 151-155, Fall 2005.

The magnitude of second formant frequency transitions in diphthongs produced during simultaneous communication (SC) was investigated by recording sign language users during SC and speech alone (SA). Results showed longer sentence durations in SC than SA but no differences in the absolute values of second formant frequency change during the production of diphthongs. These results are consistent with previous research indicating that temporal alterations in SC do not degrade acoustical characteristics of spoken English.

Researchers in this study used the KayPENTAX Computerized Speech Lab (CSL), Model 4300B, to digitize speech samples and to measure sentence durations and the magnitude of second formant frequency changes.

 

“Functional Significance of Arytenoid Adduction with the Suture Attaching to Cricoid Cartilage versus to Thyroid Cartilage for Unilateral Paralytic Dysphonia,” Su, Chih-Ying, Shang-Shyue Tsai, Hui-Ching Chuang, and Jeng-Fen Chiu, Laryngoscope, Vol. 115, pp. 1752-1759, October 2005.

Objective: In the treatment of unilateral paralytic dysphonia, traditional arytenoid adduction is designed to place suture through the muscular process of the arytenoid attaching anteriorly to the thyroid ala. In contrast with the suture direction of this technique, a new paramedian approach to arytenoid adduction anchors anteroinferiorly to the cricoid cartilage, mimicking the force action of the lateral cricoarytenoid muscle (the major adductor of the larynx). This study investigated the influence of these changes in suture direction on the vocal fold level as well as the vocal outcomes in these two techniques of arytenoid adduction.

 

Study Design: A prospective clinical series.

 

Methods: Thirty patients with unilateral paralytic dysphonia underwent medialization laryngoplasty with arytenoid adduction and strap muscle transposition. Under local anesthesia, the thyroid lamina on the involved side was paramedially separated. The inner perichondrium was carefully elevated away from the overlying thyroid cartilage, carrying the dissection posteriorly to the level of the superior and inferior cornua. The lamina was retracted laterally, the inner perichondrium was opened near the midpoint, and the lateral cricoarytenoid muscle identified. Tracing the muscle fibers posterosuperiorly, the muscular process of the arytenoid was identified. A 2-0 Prolene suture was placed through the muscular process and temporarily tied to the anterolateral aspect of the thyroid ala (AA-thyroid suture). Intraoperative acoustic and perceptual assessments were performed. After releasing the tie, the suture was anchored to the cricoid cartilage at the origin of the lateral cricoarytenoid muscle (AA-cricoid suture). Voice assessments were repeated, and the outcomes of the two tests were compared. The choice of the type of arytenoid adduction suture was made intraoperatively according to which condition provided better vocal performance. After securing the suture, a bipedicled strap muscle flap was transposed into the space between the lamina and inner perichondrium and the thyroid cartilages sutured back into place.

 

Results: The intraoperative acoustic and perceptual assessments revealed the vocal performance was significantly better with AA-cricoid suture than the AA-thyroid suture in this series. No major complications occurred in the study.

 

Conclusion: This study suggests that arytenoid adduction with suture attachment along the longitudinal axis of the lateral cricoarytenoid muscle to the cricoid cartilage is more physiologic and effective than attaching the suture to the thyroid ala. A paramedian approach to arytenoid adduction with or without strap muscle transposition is a safe and effective method for treatment of unilateral paralytic dysphonia.

In this study, the pre- and post-surgical videolaryngostroboscopy was performed using a KayPENTAX RLS, Model 9100B. For acoustic analysis, the KayPENTAX CSL, Model 4300B, was used. Aerodynamic parameters were determined using the KayPENTAX Aerophone II, Model 6800.

 

“Implantation of Esterified Hyaluronic Acid in Microdissected Reinke’s Space after Vocal Fold Microsurgery: First Clinical Experiences,” Finck, Camille and Philippe Lefebvre, Laryngoscope, Vol. 115, pp. 1841-1847, October 2005.

Objective: In this pilot study are presented the first clinical experiences of the use of a resorbable bioimplant made of esterified hyaluronic acid inserted in the microdissected superficial layer of the lamina propria (SLLP), also called Reinke’s space, after a flap excision procedure for a benign vocal fold lesion. Laryngeal and vocal evolution of implanted patients are depicted and discussed.

Study Design: Eleven bio-implants have been inserted in microdissected SLLP of 11 cases presenting with benign vocal fold lesions. The surgical procedure consisted of the excision of primary lesion by a microflap technique immediately followed by implantation of esterified hyaluronic acid in Reinke’s space.

 

Methods: All patients underwent rigid laryngoscopy and a microsurgical procedure under general anesthesia. The cordal lesion was treated with cold instrumentation of Bouchayer (7 cases) or with a mixed technique using CO2 laser (4 cases). After the classical freeing-up of Reinke’s space and the creation of a mucosal flap, a few fibers of esterified hyaluronic bioimplant are gently arranged in Reinke’s space before redraping the ligament and closing the cordal incision with a few drops of fibrin glue. Laryngeal and vocal assessments were performed pre- and postoperatively in all patients using videostroboscopy as well as perceptual and objective voice evaluation. All patients were followed in a longitudinal manner: between two and five postoperative evaluations were performed. The longest follow-up was 19 months and the shortest 2 months.

 

Results: All cases exhibited postsurgical improvement of the pliability of the SLLP. None of them developed an adverse scarring process. Improvement of SLLP’s pliability was maintained in time in all cases. Vocal improvement was observed in all. Temporary inflammation was noted in one case. There were no serious adverse effects apparent during the follow-up period.

 

Conclusion: Bio-implantation of esterified hyaluronic acid in Reinke’s space is technically easy and well tolerated. All treated cases exhibited good postoperative pliability of the SLLP compared with their preoperative evaluation.

 

A KayPENTAX CSL was used in this study to record patients’ voices, while the KayPENTAX MDVP was used to obtain voice quality objective data.

 

“Coordination of Oral and Laryngeal Movements in the Perceptually Fluent Speech of Adults Who Stutter,” Max, Ludo and Vincent L. Gracco, Journal of Speech, Language, and Hearing Research, Vol. 48 No. 3, pp. 524-542, June 2005.

This work investigated whether stuttering and nonstuttering adults differ in the coordination of oral and laryngeal movements during the production of perceptually fluent speech. This question was addressed by completing correlation analyses that extended previous acoustic studies by others as well as inferential analyses based on the within-subject central tendency and variability of acoustic and physiological indices of oral-laryngeal control and coordination. Stuttering and nonstuttering adults produced the target /p/ as the medial consonant in C1V1#C2V2C3 sequences (C = consonant; V = vowel or diphthong; # = word boundary) embedded in utterances differing in length and location of the target movements. No between-groups differences were found for across- or within-subject correlations between acoustic measures of stop gap and voice onset time (VOT). However, the acoustic data did show longer duration for devoicing interval and VOT in the stuttering versus nonstuttering individuals, in the absence of a difference for a proportional measure specifically reflecting oral-laryngeal relative timing. Analyses of combined kinematic and electroglottographic data revealed that the stuttering individuals’ speech was also characterized by (a) longer durations from bilabial closing movement onset and peak velocity to V1 vocal fold vibration offset and (b) greater within-subject variability for dependent variables that were physiological indices of devoicing interval and VOT, but again no between-groups differences were found for specific indices of oral-laryngeal relative timing. Overall, findings suggest that, for the production of voiceless bilabial stops in perceptually fluent speech, stuttering and nonstuttering adults differ in the duration of intervals defined by events within as well as across the oral and laryngeal subsystems, but the groups show similar patterns of relative timing for the involved oral and laryngeal movements.

In this study, the KayPENTAX Multi-Speech, Model 3700, was used to obtain and analyze acoustic measures.

 

“Quantifying the Effect of Compression Hearing Aid Release Time on Speech Acoustics and Intelligibility,” Jenstad, Lorienne M. and Pamela E. Souza, Journal of Speech, Language, and Hearing Research, Vol. 48 No. 3, pp. 651-667, June 2005.

Compression hearing aids have the inherent, and often adjustable, feature of release time from compression. Research to date does not provide a consensus on how to choose or set release time. The current study had 2 purposes: (a) a comprehensive evaluation of the acoustic effects of release time for a single-channel compression system in quiet and (b) an evaluation of the relation between the acoustic changes and speech recognition. The release times under study were 12, 100, and 800 ms. All of the stimuli were VC syllables from the Nonsense Syllable Task spoken by a female talker. The stimuli were processed through a hearing aid simulator at 3 input levels. Two acoustic measures were made on individual syllables: the envelope-difference index and CV ratio. These measurements allowed for quantification of the short-term amplitude characteristics of the speech signal and the changes to these amplitude characteristics caused by compression. The acoustic analyses revealed statistically significant effects among the 3 release times. The size of the effect was dependent on the characteristics of the phoneme. Twelve listeners with moderate sensorineural hearing loss were tested for their speech recognition for the same stimuli. Although release time for this single-channel, 3:1 compression ratio system did not directly predict overall intelligibility for these nonsense syllables in quiet, the acoustic measurements reflecting the changes due to release time were significant predictors of phoneme recognition. Increased temporal-envelope distortion was predictive of reduced recognition for some individual phonemes, which is consistent with previous research on the importance of relative amplitude as a cue to syllable recognition for some phonemes.

The KayPENTAX Multi-Speech, Model 3700, was used to perform the acoustic analysis in this study, specifically to obtain the CV ratio (i.e., the difference in decibels between adjacent consonants and vowels).
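
As an illustration only, the CV ratio described above (a decibel difference between a consonant and the adjacent vowel) could be sketched as follows. This is not the Multi-Speech procedure or the authors' method: the function names, the use of RMS level, and the synthetic segments are all assumptions made for the example.

```python
import math


def rms_db(samples):
    """Return the RMS level of a segment in decibels relative to amplitude 1.0."""
    if not samples:
        raise ValueError("segment is empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        raise ValueError("segment is silent; level is undefined")
    return 10.0 * math.log10(mean_square)


def cv_ratio_db(consonant, vowel):
    """Consonant level minus vowel level, in dB (hypothetical helper)."""
    return rms_db(consonant) - rms_db(vowel)


# Synthetic stand-ins for hand-segmented speech: the "consonant" has one
# tenth of the vowel's amplitude, so the difference is about -20 dB.
vowel = [math.sin(2 * math.pi * 0.05 * n) for n in range(400)]
consonant = [0.1 * s for s in vowel]

print(round(cv_ratio_db(consonant, vowel), 2))
```

In practice the consonant and vowel segments would come from segmented recordings; compression that raises low-level consonants relative to vowels would move this value toward 0 dB.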

 

“Voice-over: Perceptual and Acoustic Analysis of Vocal Features,” Medrado, Reny, Leslie Piccolotto Ferreira, and Mara Behlau, Journal of Voice, Vol. 19 No. 3, pp. 340-349, September 2005.

Voice-overs are professional voice users who use their voices to market products in the electronic media. The purposes of this study were to (1) analyze voice-overed and non-overed productions of an advertising text in two groups consisting of 10 male professional voice-overs and 10 male non-voice-overs; and (2) determine specific acoustic features of voice-over productions in both groups. A naïve group of listeners was engaged for the perceptual analysis of the recorded advertising text. The voice-overed production samples from both groups were submitted for analysis of acoustic and temporal features. The following parameters were analyzed: (1) the total text length, (2) the length of the three emphatic pauses, (3) values of the mean, (4) minimum, (5) maximum fundamental frequency, and (6) the semitone range. The majority of voice-overs and non-voice-overs were correctly identified by the listeners in both productions. However, voice-overs were more consistently correctly identified than non-voice-overs. The total text length was greater for voice-overs. The pause time distribution was statistically more homogeneous for the voice-overs. The acoustic analysis indicated that the voice-overs had lower values of mean, minimum, and maximum fundamental frequency and a greater range of semitones. The voice-overs carry the voice-overed features to their non-voice-overed production.

In this study, the Real-Time Pitch module of the KayPENTAX Visi-Pitch II was used to analyze both temporal and acoustic parameters.
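
For readers unfamiliar with the semitone range listed among the parameters above, a minimal sketch of the standard conversion from minimum and maximum fundamental frequency to semitones is given here. The function name and the example frequencies are hypothetical and are not taken from the study or from Visi-Pitch.

```python
import math


def semitone_range(f0_min_hz, f0_max_hz):
    """Span between two fundamental frequencies in semitones.

    Uses the equal-tempered relation: 12 semitones per doubling of frequency.
    """
    if f0_min_hz <= 0 or f0_max_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_max_hz / f0_min_hz)


# Hypothetical speaker whose F0 spans one octave, 100 Hz to 200 Hz.
print(semitone_range(100.0, 200.0))  # one octave is 12 semitones
```

Because the measure depends only on the ratio of maximum to minimum frequency, a speaker with lower overall F0 can still show a greater semitone range.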

 

“The Impact of Specific Exertion on the Efficiency and Ease of the Voice,” Bagnall, Alison D. and Kirsty McCulloch, Journal of Voice, Vol. 19 No. 3, pp. 384-390, September 2005.

Even though most singers and other professional voice users are encouraged to relax to optimize the quality and performance of the voice, observations of acclaimed singers, actors, and public speakers would suggest otherwise. These successful vocal performers appear to be energized, actively working and exerting themselves. For this reason, a study was designed to explore the role of exertion in maintaining and optimizing the voice. The focus of this study was the possibility that increasing exertion could improve the voice and might result in the voice user experiencing less strain and, therefore, more comfort and ease. Ten subjects were recorded before and after completing a workshop to develop their skills with precise use of effort involving selected parameters of the larynx and vocal tract. Self-reported ratings of degree of exertion and level of comfort were collected at the time of each recording. The preworkshop and postworkshop recordings were analyzed acoustically and perceptually to compare the degree of noise in the signal that corresponds with the efficiency of the voice. The results indicated that, for all subjects, the quality of the voice improved with an increase in the use of specific exertion. Furthermore, ease and comfort also significantly increased. 

The KayPENTAX Computerized Speech Lab (CSL) was used in conjunction with the KayPENTAX Multi-Dimensional Voice Program (MDVP) to perform the acoustic analysis in this study.


 

“Spectral Amplitude Measures of Adductor Spasmodic Dysphonic Speech,” Cannito, Michael P., Eugene H. Buder, and Lesya B. Chorna, Journal of Voice, Vol. 19 No. 3, pp. 391-410, September 2005.

Spectral amplitude measures are sensitive to varying degrees of vocal fold adduction in normal speakers. This study examined the applicability of harmonic amplitude differences to adductor spasmodic dysphonia (ADSD) in comparison with normal controls. Amplitudes of the first and second harmonics (H1, H2) and of harmonics affiliated with the first, second, and third formants (A1, A2, A3) were obtained from spectra of vowels /a/ and /i/ excerpted from connected speech. Results indicated that these measures could be made reliably in ADSD. With the exception of H1*-H2*, harmonic amplitude differences (H1*-A1, H1*-A2, and H1*-A3) exhibited significant negative linear relationships (P < 0.05) with clinical judgments of overall severity. The four harmonic amplitude differences significantly differentiated between pre-BT and post-BT (botulinum toxin treatment) productions (P < 0.05). After treatment, measurements from /a/ detected significant differences between ADSD and normal controls (P < 0.05), but measurements from /i/ did not. LTAS analysis of ADSD patients’ speech samples proved a good fit with harmonic amplitude difference measures. Harmonic amplitude differences also significantly correlated with perceptual judgments of breathiness and roughness (P < 0.05). These findings demonstrate high clinical applicability for harmonic amplitude differences for characterizing phonation in the speech of persons with ADSD, as well as normal speakers, and they suggest promise for future application to other voice pathologies.

In this study, the KayPENTAX CSL, Model 4300B, was used to digitize and analyze subjects’ speech samples.


“Comparisons of Voice Onset Time for Trained Male Singers and Male Nonsingers During Speaking and Singing,” McCrea, Christopher R. and Richard J. Morris, Journal of Voice, Vol. 19 No. 3, pp. 420-430, September 2005.

This study was designed to examine the temporal acoustic differences between male trained singers and nonsingers during speaking and singing across voiced and voiceless English stop consonants. Recordings were made of 5 trained singers and 5 nonsingers, and acoustically analyzed for voice onset time (VOT). A mixed analysis of variance showed that the male trained singers had significantly longer mean VOT than did the nonsingers during voiceless stop production. Sung productions of voiceless stops had significantly longer mean VOTs than did the spoken productions. No significant differences were observed for the voiced stops, nor were any interactions observed. These results indicated that vocal training and phonatory task have a significant influence on VOT.

The KayPENTAX CSL, Model 4300B, was used to perform the acoustic analysis in this study.

 

“Pitch Discrimination and Pitch Matching Abilities of Adults Who Sing Inaccurately,” Bradshaw, Elizabeth and Monica A. McHenry, Journal of Voice, Vol. 19 No. 3, pp. 431-439, September 2005.

Past research regarding singing ability has provided evidence that both supports and refutes a relationship between pitch discrimination ability and pitch production ability. Researchers have suggested that these skills improve with age. Despite this suggestion, most investigators studying singing ability have included only children as participants. Additionally, although many researchers have studied accurate singers, few have directly studied persons who do not sing accurately. We designed this study to examine the relationship between pitch discrimination ability and pitch production ability in inaccurate adult singers. Fifteen adults, aged 18 to 40 years, who met specific criteria qualified as inaccurate singers. Each participated in two tasks, a pitch discrimination task and a pitch production task. We used the Multi-Dimensional Voice Program-Advanced (Kay Elemetrics Corporation, Lincoln Park, NJ) to determine the frequency of each participant’s vocal productions during the pitch production task. We also used a Pearson product moment correlation to analyze the relationship between pitch discrimination and pitch production accuracy within a semitone of the target frequency. No meaningful relationship was found, and results were not statistically significant. However, the inaccurate singers in this study could be classified into two separate categories: those who discriminated pitches accurately but produced pitches inaccurately, and those who discriminated pitches inaccurately and produced pitches inaccurately. These findings may be of great importance to music educators and impact the focus of instruction when teaching an inaccurate singer to sing more accurately.

In this study, the KayPENTAX CSL was used in conjunction with the KayPENTAX Multi-Dimensional Voice Program (MDVP) for acoustic analysis.

 

“Vocal Projection in Actors: The Long Term Average Spectral Features That Distinguish Comfortable Acting Voice From Voicing with Maximal Projection in Male Actors,” Pinczower, Rachel and Jennifer Oates, Journal of Voice, Vol. 19 No. 3, pp. 440-453, September 2005.

This study explored whether acoustic and perceptual features could distinguish comfortable from maximally projected acting voice. Thirteen professional male actors performed a passage from William Shakespeare’s Julius Caesar twice. The first delivery used their comfortably projected voices, whereas the second used maximal projection. Acoustic measures, expert ratings, and self-ratings of projection and voice quality were investigated. Long-term average spectra (LTAS) and sound pressure level (SPL) analyses were conducted. Perceptual variables included projection, breathiness, roughness, and strain. When comparing the intensity difference between the higher (2-4 kHz) and lower (0-2 kHz) regions of the spectrum in voice samples from the maximal projected condition, LTAS analyses demonstrated increased acoustic energy in the higher part of the spectrum. This LTAS pattern was not as evident in the comfortable projected condition. These findings offered some preliminary support for the existence of an actor’s formant (prominent peak in the upper part of the spectrum) during maximal projection.

Researchers in this study used the KayPENTAX Multi-Speech, Model 3700, to determine mean and peak SPL.

 

“Acoustic Analysis of the Voice in Pediatric Cochlear Implant Recipients: A Longitudinal Study,” Campisi, P., A. Low, B. Papsin, R. Mount, R. Cohen-Kerem, and R. Harrison, Laryngoscope, Vol. 115, pp. 1046-1050, June 2005.

Objectives: To characterize inherent acoustic abnormalities of the deaf pediatric voice and the effect of artificially restoring auditory feedback with cochlear implantation.

Design: Inception cohort.

Settings: Academic referral center.

Patients: Twenty-one children with severe to profound hearing loss (15 prelingually deaf, 6 postlingually deaf) accepted into the cochlear implant program were followed for up to 6 months. Patients unable to perform the vocal exercises were excluded.

Interventions: Objective voice analysis was performed using the Computerized Speech Laboratory (Kay Elemetrics) prior to cochlear implantation, at the time of implant activation and at 2 and 6 months postactivation. Assessments were based on sustained phonations and dynamic ranges.

Main Outcome Measure: Fundamental frequency, long-term control of fundamental frequency (vF0) and long-term control of amplitude (vAM) were derived from sustained phonations. The dynamic frequency range was derived from scale exercises. Formant frequencies (F1, F2, F3) were determined using linear predictive coding.

Results: Fundamental frequency was not altered by implant activation or experience (P = 0.342). In profoundly deaf subjects, the most prevalent acoustic abnormality was poor long-term control of frequency (vF0, 2.81%) and long-term control of amplitude (vAM, 23.58%). Implant activation and experience had no effect on the long-term control of frequency (P = 0.106) but normalized the long-term control of amplitude (P = 0.007). The mean frequency range increased from 311.9 Hz preimplantation to 483.5 Hz postimplantation (P = 0.08). The F1/F2 ratio remained stable (P = 0.476).

Conclusion: In children, severe to profound deafness results in poor long-term control of frequency and amplitude. Cochlear implantation restores control of amplitude only and implies the need for additional rehabilitative strategies for restoration of control of frequency.

Researchers in this study used the KayPENTAX CSL, Model 4400, for data collection and analysis, in conjunction with KayPENTAX’s Multi-Dimensional Voice Program, Model 5105, and Real-Time Pitch, Model 5121.

 

“Influence of Data Acquisition Environment on Accuracy of Acoustic Voice Quality Measurements,” Deliyski, Dimitar D., Maegan K. Evans, and Heather S. Shaw, Journal of Voice, Vol. 19 No. 2, pp. 176-186, June 2005.

Accuracy of acoustic voice analysis is influenced by the quality of recording. Lately, articles have suggested that soundcards perform equivalently to specialized professional-grade data acquisition (DA) systems. The purpose of this study was to investigate the influence of DA environment (DA system and microphone) on acoustic voice quality measurement (VQM) while balancing for gender, age, intersubject and intrasubject variability, and analysis software. More specifically, the relative performance of different hardware environments and the relationship between their technical characteristics and VQM performance was investigated. The discretization error and the effective dynamic range of the different DA environments were measured. We used 3 software systems to record and measure separately 2000 acoustic samples of sustained phonation for fundamental frequency, jitter, and shimmer. Analyses of variance (ANOVA) were performed with these parameters as the dependent variables. The results of the study suggested that professional-grade DA hardware is strongly recommended to provide accurate and valid voice assessment. The fundamental frequency measurement differences across DA environments were highly correlated to the discretization error (r = 1.00), whereas jitter and shimmer were highly correlated to the effective dynamic range of the DA environments (r = -0.68 and r = 0.86, respectively).

The KayPENTAX Computerized Speech Lab (CSL), Model 4400, was one of the professional-grade data acquisition (DA) systems used by researchers in this study to determine the effect of DA environment (DA system and microphone) on acoustic voice quality measurement (VQM).


“The Effect of Speaking Sample Duration on Determination of Habitual Pitch,” Zraick, Richard I., Kasie Y. Birdwell, and Laura Smith-Olinde, Journal of Voice, Vol. 19 No. 2, pp. 197-201, June 2005.

The purpose of this study was to investigate whether duration of speaking affects the determination of habitual pitch. Five speaking periods commonly used to elicit habitual pitch in clinical voice evaluations were compared (1, 5, 15, 30, and 60 seconds). Thirty female speakers with normal voices participated. Results of a within-subject univariate F-test revealed a statistically significant (p < 0.001) difference in habitual pitch among the speaking periods. Habitual pitch for the 1-second and 60-second speaking periods was found to be statistically significantly (p < 0.05) different from all remaining speaking periods, and the habitual pitch for the 30-second speaking period was found to be statistically significantly (p < 0.05) different from the 60-second speaking period. Implications for the use of various speaking durations when determining habitual pitch are discussed, as is the possibility of a speaking duration effect on determination of other pitch-related voice parameters.

KayPENTAX’s Visi-Pitch II, Model 3300, was used to determine the speaking fundamental frequency (SFF) of the participants in this study.

 

“Spectral Moments of the Long-term Average Spectrum: Sensitive Indices of Voice Change After Therapy?,” Tanner, Kristine, Nelson Roy, Andrea Ash, and Eugene H. Buder, Journal of Voice, Vol. 19 No. 2, pp. 211-222, June 2005.

Voice clinicians require an objective, reliable, and relatively automatic method to assess voice change after medical, surgical, or behavioral intervention. This measure must be sensitive to a variety of voice qualities and severities, and preferably should reflect voice in continuous speech. The long-term average spectrum (LTAS) is a fast Fourier transform-generated power spectrum whose properties can be compared with a Gaussian bell curve using spectral moments analysis. Four spectral moments describe features of the LTAS: Spectral mean (Moment 1) and standard deviation (Moment 2) represent the spectrum’s central tendency and dispersion, respectively. Skewness (based on Moment 3) and kurtosis (based on Moment 4) represent the spectrum’s tilt and peakedness, respectively. To examine whether the first four spectral moments of the LTAS were sensitive to perceived voice improvement after voice therapy, this investigation compared pretreatment and posttreatment voice samples of 93 patients with functional dysphonia using spectral moments analysis. Inspection of the results revealed that spectral mean and standard deviation lowered significantly with perceived voice improvement after successful behavioral management (p < 0.001). However, changes in skewness and kurtosis were not significant. Furthermore, lowering of the spectral mean uniquely accounted for approximately 14% of the variance in the pretreatment to posttreatment changes observed in perceptual ratings of voice severity (p < 0.001), indicating that spectral mean (ie, Moment 1) of the LTAS may be one acoustic marker sensitive to improvement in dysphonia severity.

KayPENTAX’s Computerized Speech Lab (CSL), Model 4300B, was used to digitize the speech samples in this study; spectral moments were generated from the LTAS of each voice sample using the KayPENTAX Multi-Speech, Model 3700.
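
The four spectral moments described in the abstract above can be sketched in a few lines, treating the LTAS as a distribution of power over frequency. This is a minimal illustration under stated assumptions, not the Multi-Speech implementation: the function name, the toy spectrum, and the choice to report excess kurtosis (0 for a Gaussian) are all assumptions.

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (mean, standard deviation, skewness, excess kurtosis) of a spectrum.

    freqs_hz: bin centre frequencies; power: non-negative power per bin.
    The power values are normalised so they act as probability weights.
    """
    if len(freqs_hz) != len(power) or not power:
        raise ValueError("freqs_hz and power must be non-empty and equal in length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    weights = [p / total for p in power]

    mean = sum(w * f for w, f in zip(weights, freqs_hz))            # Moment 1
    variance = sum(w * (f - mean) ** 2 for w, f in zip(weights, freqs_hz))
    sd = math.sqrt(variance)                                        # from Moment 2
    if sd == 0:
        raise ValueError("all energy is in one bin; skewness and kurtosis are undefined")
    skewness = sum(w * (f - mean) ** 3 for w, f in zip(weights, freqs_hz)) / sd ** 3
    # Assumption: report excess kurtosis, so a Gaussian shape gives 0.
    kurtosis = sum(w * (f - mean) ** 4 for w, f in zip(weights, freqs_hz)) / sd ** 4 - 3.0
    return mean, sd, skewness, kurtosis


# Toy symmetric three-bin spectrum (hypothetical values, not study data).
mean, sd, skewness, kurtosis = spectral_moments([100.0, 200.0, 300.0], [1.0, 2.0, 1.0])
print(mean, round(sd, 2), skewness, kurtosis)
```

With a real LTAS, a shift of energy toward lower frequencies would lower the spectral mean, which is the direction of change the abstract associates with perceived voice improvement.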

 

“Evaluating the Influence of Warmup on Singing Voice Quality Using Acoustic Measures,” Amir, Ofer, Noam Amir, and Orit Michaeli, Journal of Voice, Vol. 19 No. 2, pp. 252-260, June 2005.

Vocal warmup is generally accepted as vital for singing performance. However, only a limited number of studies have evaluated this effect quantitatively. In this study, we evaluated the effect of vocal warmup on voice production, among young female singers, using a set of acoustic parameters. Warmup reduced frequency-perturbation (p < 0.001) and amplitude-perturbation values (p < 0.05). In addition, warmup increased singer’s formant amplitude (p < 0.05) and improved noise-to-harmonic ratio (p < 0.05). Tone-matching accuracy, however, was not affected by warmup. The effect of vocal warmup on frequency-perturbation parameters was more evident among mezzo-soprano singers than it was among soprano singers. It was also more evident in the low pitch-range than in the higher pitch-ranges (p < 0.05). The results of this study provide valid support for the advantageous effect of vocal warmup on voice quality and present acoustic analysis as a valuable and sensitive tool for quantifying this effect.

In this study, acoustic analysis was performed using the KayPENTAX Multi-Dimensional Voice Program (MDVP), Model 5105.

 

“Acoustic Prediction of Voice Type in Women with Functional Dysphonia,” Awan, Shaheen N. and Nelson Roy, Journal of Voice, Vol. 19 No. 2, pp. 268-282, June 2005.

The categorization of voice into quality type (ie, normal, breathy, hoarse, rough) is often a traditional part of the voice diagnostic. The goal of this study was to assess the contributions of various time and spectral-based acoustic measures to the categorization of voice type for a diverse sample of voices collected from both functionally dysphonic (breathy, hoarse, and rough) (n = 83) and normal women (n = 51). Before acoustic analyses, 12 judges rated all voice samples for voice quality type. Discriminant analysis, using the modal rating of voice type as the dependent variable, produced a 5-variable model (comprising time and spectral-based measures) that correctly classified voice type with 79.9% accuracy (74.6% classification accuracy on cross-validation). Voice type classification was achieved based on two significant discriminant functions, interpreted as reflecting measures related to “Phonatory Instability” and “F0 Characteristics.” A cepstrum-based measure (CPP/EXP ratio) consistently emerged as a significant factor in predicting voice type; however, variables such as shimmer (RMS dB) and a measure of low- vs. high-frequency spectral energy (the Discrete Fourier Transformation ratio) also added substantially to the accurate profiling and prediction of voice type. The results are interpreted and discussed with respect to the key acoustic characteristics that contributed to the identification of specific voice types, and the value of identifying a subset of time and spectral-based acoustic measures that appear sensitive to a perceptually diverse set of dysphonic voices.

Voice samples in this study were recorded and digitized using the KayPENTAX Computerized Speech Lab (CSL), Model 4300B.

 

“The Effect of Hemodialysis on Voice: An Acoustic Analysis,” Hamdan, Abdul-Latif, Walid Medawar, Abbas Younes, Hala Bikhazi, and Nabil Fuleihan, Journal of Voice, Vol. 19 No. 2, pp. 290-295, June 2005.

Because respiration is part of the well-coordinated process necessary for phonation, this study was conducted with the purpose of analyzing the effect of chronic hemodialysis on voice characteristics of patients with chronic renal failure. A total of 57 patients were recruited for the study, including 31 males and 26 females ranging in age from 16 to 85 years. Patients underwent evaluation of their voice directly before and after hemodialysis using the Kay Elemetrics Visi-Pitch (Model 3300; Kay Elemetrics Corporation, Lincoln Park, New Jersey). The vocal acoustic parameters studied include habitual pitch, pitch range, relative average perturbation, shimmer, noise-to-harmonic ratio, voice turbulence index, maximum phonation time, and voice energy. The data were analyzed using the paired t-test for the total sample and the nonparametric test for the female and male subgroups. The total sample analysis showed a statistically significant increase in the habitual pitch after the hemodialysis (p < 0.05), with a borderline increase in the pitch range and maximum phonation time (p < 0.10). In the female group, there was a statistically significant increase in the habitual pitch and a borderline increase in the relative average perturbation. In the male group, there was a significant increase in the habitual pitch with a borderline increase in maximum phonation time. Discussion of the aforementioned results is presented.

In this study, the VQA (Vocal Quality Assessment) and the PE (Pitch and Energy) modules of the KayPENTAX Visi-Pitch, Model 3300, were used to analyze a variety of acoustic parameters including habitual pitch, maximum phonation time, pitch range, voice energy, relative average perturbation, shimmer, NHR, and voice turbulence index.

 

“A Comparative Study of Acoustic Voice Measurements by Means of Dr. Speech and Computerized Speech Lab,” Smits, Ilse, Piet Ceuppens, and Marc S. De Bodt, Journal of Voice, Vol. 19 No. 2, pp. 187-196, June 2005.

In this study, the calculations and results of acoustic voice analysis as calculated by two different analysis systems (Doctor Speech (DRS), Tiger Electronics, Neu-Anspach, Germany, and Computerized Speech Lab (CSL), Kay Elemetrics Corporation, Lincoln Park, NJ) are compared. A group of 120 normal voices was selected for analysis of the objective parameters: fundamental frequency (F0), variation of F0 (F0SD), jitter, shimmer, and harmonics-to-noise ratio (HNR). The subject group was a random selection of normal voices of adults. The aim of this comparison was to find determined differences and similarities in data measurements between both systems to make data transfer possible. A significant correlation was found for F0, HNR, and shimmer relative. The correlation for jitter (relative and absolute) and F0SD was weak. DRS and CSL are not comparable in absolute figures, but their judgment against normative data is identical. Further research is necessary to explore the effect on pathological voices or child voices.

The KayPENTAX CSL, Model 4300B, in conjunction with KayPENTAX’s Multi-Dimensional Voice Program (MDVP), was used for acoustic voice analysis in this study.

 

“Influence of Aging and Sex on Voice Parameters in Patients with Unilateral Vocal Cord Paralysis,” Kandogan, Tolga, and Eberhart Seifert, Laryngoscope, Vol. 115, pp. 655-660, April 2005.

Objectives: The objective of this study was to investigate influences of aging and sex on different voice parameters in patients with unilateral vocal cord paralysis (VCP).

Study Design: Retrospective review of patients with unilateral VCP.

Material and Methods: Forty-seven patients, 22 males, 25 females (24-85 years), were enrolled in the study. The diagnosis of VCP was established by videolaryngostroboscopy. The acoustic parameters of jitter, shimmer, degree of subharmonic, noise to harmonic ratio, fundamental frequency, and maximal intensity were measured. The auditive voice analysis included roughness, breathiness, and hoarseness. Statistical analysis involved Pearson’s bivariate correlation coefficients and two-way analysis of variance with interaction variables.

Results: After unilateral VCP in the elderly, some sex- and age-related differences in the restriction of the voice can be documented.

Conclusion: In general, the investigated voice parameters showed similar tendencies to those in otherwise healthy aging persons.

Researchers in this study used the KayPENTAX Multi-Dimensional Voice Program (MDVP) to analyze jitter, shimmer, and NHR.

 

“Effects of genioglossal muscle advancement on speech,” Vähätalo, Kimmo, Juha-Pertti Laaksonen, Henna Tamminen, Olli Aaltonen, and Risto-Pekka Happonen, Otolaryngology-Head and Neck Surgery, Vol. 132 No. 4, pp. 636-640, April 2005.

Objective: The effects of the genioglossal muscle advancement on phonetic quality of speech were studied analyzing the acoustic features of vowel sounds.

Study Design and Setting: The study group consisted of 5 men suffering from partial upper airway obstruction during sleep. To prevent tongue base collapse, genioglossal muscle advancement was made with chin osteotomy without hyoid myotomy and suspension. The speech material consisted of 8 vowels produced in sentence context repeated 10 times before the operation, and 10 days and 6 weeks after the operation. The acoustic features of vowels were analyzed.

Results: The operation had no significant effects on vowel quality. For only 2 of the subjects did pitch change systematically due to the operation.

Conclusion: According to the acoustic analysis, genioglossal muscle advancement with chin osteotomy has no effects on vowel production. Some short-term changes were observed, but these changes were highly individual.

Significance: The operation seems to have no potential to change vowel production. 

In this study, the acoustic analysis of speech samples was performed using the KayPENTAX Computerized Speech Lab (CSL), Model 4300B.

 

“Adverse Effects of Environmental Noise on Acoustic Voice Quality Measurements,” Deliyski, Dimitar D., Heather S. Shaw, and Maegan K. Evans, Journal of Voice, Vol. 19, No. 1, pp. 15-28, March 2005.

An accurate analysis of voice quality is imperative when using acoustic measurements to diagnose vocal pathologies. It is known that noise has a significant effect on the reliability and validity of acoustic voice measurements, but the precise relationship has not been established. The purpose of this study was to investigate the influence of noise on the accuracy, reliability, and validity of acoustic voice quality measurements while balancing for gender, age, intersubject and intrasubject variability, microphones, computer hardware, analysis software, and type of noise. Level of noise was precisely controlled. The specific focus of interest was to determine the critical levels of noise that can invalidate voice quality measurements and to generate practical recommendations. Results suggest that the recommended, acceptable, and unacceptable levels of noise in the acoustic environment are above 42 dB, above 30 dB, and below 30 dB signal-to-noise ratio, respectively.

The KayPENTAX Computerized Speech Lab (CSL), Model 4400, was used in conjunction with the Multi-Dimensional Voice Program (MDVP), Model 5105, for voice analysis in this study.

 
“Vocal Range and Intensity in Actors: A Studio Versus Stage Comparison,” Emerich, Kate A., Ingo R. Titze, Jan G. Švec, Peter S. Popolo, and Gary Logan, Journal of Voice, Vol. 19, No. 1, pp. 78-83, March 2005.

A voice range profile (VRP) was obtained from each of eight professional actors and compared with two speech range profiles (SRPs). One speech profile was obtained during the dramatic reading of a scene in the laboratory and the other during a performance onstage in a professional theater. The objective was to determine the pitch and loudness ranges used by the actors in speech relative to the VRP. The principal question of interest was whether the actors stayed within the center of the VRP, or whether they tended to drift toward the boundaries of intensity and frequency. A second question was whether the performance within the laboratory accurately reflects that of a stage performance. The results suggest that some subjects tend to exceed the center of the VRP during the stage performance. It is hypothesized that these actors may stress their vocal mechanism during performance and are more likely candidates for vocal injury. 

The KayPENTAX Computerized Speech Lab (CSL), Model 4400, and Voice Range Profile (VRP) Program, Model 4326, were used to obtain the voice range profile of each subject in this study.

 

“The Efficacy of Resonance Method to Hyperfunctional Dysphonia from Physiological, Acoustic and Aerodynamic Aspects: The Preliminary Study,” Chen, Sheng Hwa, Jui-Lin Huang, and Wei-Shan Chang, Asia Pacific Journal of Speech, Language and Hearing, Vol. 8, No. 3, pp. 200-203, 2003.

Abstract: The purpose of this study is to investigate the treatment efficacy of Resonance Voice Therapy (RVT) to hyperfunctional dysphonic patients from physiological, acoustic and aerodynamic aspects. Twenty-one females, age range 23 to 56 years, diagnosed with hyperfunctional dysphonia in the Veterans General Hospital, Taipei were equally divided into seven groups. The therapy was provided by a speech pathologist once a week, for 90 minutes per session, for eight weeks. Laboratory tests of videostroboscopy, auditory perceptual judgment, acoustic analysis and aerodynamic analysis were done on all subjects before and after RVT. A paired t-test was used to analyze test-retest reliability and significance of therapy. After RVT, there is significant improvement of vocal fold closure, vocal fold vibration and voice quality. A significant decrease of shimmer and intra-oral pressure, and a significant increase in speaking frequency range, physiological frequency range and intensity range quantify subjective improvements. RVT reflects physiological changes in respiratory, laryngeal and resonant systems. This therapy technique proves to be an effective treatment method for hyperfunctional voice-disordered patients.

In this study, Kay’s RLS Model 9100 light source was used to perform videostroboscopic examinations; Kay’s VRP, Model 4326 was used to obtain maximum vocal performance data; and Kay’s Computerized Speech Lab (CSL), Model 4300B, was used for acoustic analysis.

 

“Identification of Cantonese Tones by Children with Sensory Hearing Impairment: Effects of Noise and Hearing Aid Frequency Response,” Morris, David and Joseph Kei, Asia Pacific Journal of Speech, Language and Hearing, Vol. 8, No. 3, pp. 212-220, 2003.

Abstract: The present study examined the tonal identification abilities of a control group (n = 23 normal hearing children, mean age = 13.6 years; SD = 1.9) and two experimental groups: a mild to moderate hearing-impaired group (n = 11, mean age = 15.8 years, SD = 1.9); and a moderately severe hearing-impaired group (n = 11, mean age = 14.3 years, SD = 3.7). The two experimental groups were tested monaurally in a sound field under two different hearing aid frequency responses and two noise conditions (quiet/noise). All participants in the experimental groups were of normal intelligence and free from other disabilities except hearing impairment. The results revealed that tonal identification ability was impeded as the degree of hearing impairment increased. A noise effect was found, showing a significantly lower tone identification score in noise than in quiet. The hearing aid frequency response effect, however, did not reach significance. The most common tonal identification errors made by the hearing-impaired groups were tone 6/tone 4 and tone 5/tone 2 confusions, with their reciprocals. These results highlight the difficulty of hearing-impaired children in identifying Cantonese tones, particularly in noisy environments.

Acoustical analysis of the subjects’ six word tokens was performed using Kay’s DSP Sona-Graph, Model 5500.

 

“Physiological Features of Dysarthria in Friedreich’s Ataxia,” Cahill, Louise M., Deborah G. Theodoros, Bruce E. Murdoch, and John MacMillan, Asia Pacific Journal of Speech, Language and Hearing, Vol. 8, No. 3, pp. 221-228, 2003.

Abstract: Due to the paucity of literature concerning the motor speech impairment in persons with Friedreich’s ataxia (FA), the aim of the study was to investigate the perceptual and physiological features of dysarthria in a 30-year-old male with FA, 22 years post diagnosis. The four speech subsystems were comprehensively evaluated using physiological measures of respiratory (Respitrace), laryngeal (Laryngograph, Aerophone II), velopharyngeal (Nasometer) and articulatory (lip and tongue pressure transduction systems) function. Perceptual speech evaluations included the Frenchay Dysarthria Assessment, the Assessment of Intelligibility of Dysarthric Speech and a perceptual analysis of a speech sample. The findings were compared with those of non-neurologically impaired controls, matched for age and sex. Results revealed marked impairment in respiratory, velopharyngeal and articulatory function, and mild laryngeal dysfunction. Based on these results the subject was rated as displaying a moderate mixed dysarthria (flaccid/ataxic), with a mild to moderate decrease in overall intelligibility. The results of the assessments will be discussed in relation to the possible effects of FA on motor speech function.

Researchers in this study used Kay’s Aerophone II, Model 6800 for aerodynamic evaluation; velopharyngeal function was assessed using Kay’s Nasometer, Model 6400.

 

“Task Specificity in Adductor Spasmodic Dysphonia Versus Muscle Tension Dysphonia,” Roy, Nelson, Manon Gouse, Shannon C. Mauszycki, Ray M. Merrill, and Marshall E. Smith, Laryngoscope, Vol. 115, pp. 311-316, February 2005.

Objectives/Hypothesis: Adductor spasmodic dysphonia (ADSD) has been characterized as a “task specific” laryngeal dystonia, meaning that the severity of dysphonia varies depending on the demands of the vocal task. Voice produced in connected speech as compared with sustained vowels is said to provoke more frequent and severe laryngeal spasms. This study examined the diagnostic value of “task specificity” as a marker of ADSD and its potential to differentiate ADSD from muscle tension dysphonia (MTD), a functional voice disorder that can often masquerade as ADSD.

Study Design: Case-control study.

Methods: Five listeners, blinded to the purpose of the study, used a 10 cm visual analogue scale to rate dysphonia severity of subjects with ADSD (n = 36) and MTD (n = 45) producing either connected speech or a sustained vowel “ah.” 

Results: In ADSD, dysphonia severity for connected speech (M = 6.22 cm, SD = 2.56) was rated significantly more severe than sustained vowel productions (M = 4.8 cm, SD = 2.8 [t (35) = 3.67, P < .001]). In MTD, however, no significant difference in severity was observed for the connected speech sample (M = 5.98 cm, SD = 2.83) versus the sustained vowel (M = 5.86 cm, SD = 2.87 [t (44) = 0.378, P = .707]). The receiver operating characteristic (ROC) curve, an index of the accuracy of task specificity as a diagnostic marker, revealed that a 1 cm difference criterion correctly identified 53% of ADSD cases (sensitivity) and 76% of MTD cases (specificity) (χ² (1) = 6.88, P = .0087).

Conclusions: Reduced dysphonia severity during sustained vowels supports task specificity in ADSD but not MTD and highlights a valuable diagnostic marker whose recognition should contribute to improved diagnostic precision.

In this study, the connected speech and sustained vowel samples were digitized using Kay’s Multi-Speech, Model 3700.

 

“Impact of Topical Anesthesia on Acoustic Characteristics of Voice During Laryngeal Telescopic Examination,” Yang, Cheng-Chien and Sheng Hwa Chen, Otolaryngology-Head and Neck Surgery, Vol. 132, No. 1, pp. 110-114, January 2005.

Objective: The purposes of this study are to investigate the impact of topical anesthetic alone and with concurrent laryngeal telescopic examination on acoustic characteristics of vocal fold function. Comparison with phonation in controlled conditions may imply diagnostic information from the examination.

Study Design: Thirty males evaluated as having a normal voice were included in the study. The subjects were asked to phonate sustained /i/ with a naturally comfortable pitch and loudness in three consecutive experimental sequences: “control condition,” “anesthetic condition,” and “telescopic condition.” Acoustic analysis of fundamental frequency, jitter, shimmer, and harmonic to noise ratio in the three different conditions was executed.

Results: The mean and standard deviation of Fo in control condition, anesthetic condition, and telescopic condition were 130.1 ± 18.5 Hz, 125.7 ± 19.7 Hz, and 172.2 ± 35.1 Hz, respectively. The telescopic condition showed more negative change than that in control condition and anesthetic condition in other parameters. There was a significant difference (P < 0.001) between control condition and telescopic condition in all four parameters.

Conclusions: This study showed that anesthesia has little effect on voice performance for subjects with a normal voice. On the other hand, the acoustic characteristics changed significantly during telescopic performance. When interpreting acoustic data, the abnormality of the acoustic characteristics might be the result of the procedures and not reflect vocal pathology. Laryngeal variations due to manipulation of the telescope should be ruled out.

Researchers in this study used Kay’s Rhinolaryngeal Stroboscope, Model 9100, for laryngeal examinations and Kay’s Computerized Speech Lab, Model 4300B, for acoustic analysis.

 

“Outcome of Laryngeal Manual Therapy in Four Dutch Adults with Persistent Moderate-to-Severe Vocal Hyperfunction: A Pilot Study,” Van Lierde, Kristiane M., Sophia De Ley, Gregory Clement, Marc De Bodt, and Paul Van Cauwenberge, Journal of Voice, Vol. 18, No. 4, pp. 467-474, December 2004.

A relatively new management strategy for the treatment of voice disorders is the use of laryngeal manual therapy. The main purpose of the present pilot study is to document the outcome of vocal quality after a well-defined laryngeal manual therapy (LMT) program. Four Dutch professional voice users with a persistent moderate or severe muscle tension dysphonia were studied pretreatment (1 week before LMT) and posttreatment (1 week after completion of manual therapy; 25 sessions). These subjects had received several months of traditional voice therapy, without any success. To measure and compare the effect of LMT, objective and subjective assessment techniques were used. Perceptual voice assessment included a perceptual rating of the voice using the GRBAS scale. Furthermore, the vocal quality in this population was modeled by means of the Dysphonia Severity Index (DSI). All of the subjects selected for LMT showed improvement in perceptual vocal quality and DSI values. As the DSI is a weighted variable including aerodynamic and acoustic measures, small improvements (closer to 5) are very indicative of vocal quality improvement. The use of LMT in professional voice users with persistent moderate-to-severe muscle tension dysphonia, especially in some subjects who have not responded to traditional voice therapy, is supported by this pilot study.

Kay’s CSL, Model 4300B, was used in conjunction with the Multi-Dimensional Voice Program (MDVP), Model 5105, and Voice Range Profile, Model 4326, to assess vocal quality in this study.

 

“Fragments of a Greek Trilogy: Impact on Phonation,” Ferrone, Carol, Grace Leung, and Lorraine Olson Ramig, Journal of Voice, Vol. 18, No. 4, pp. 488-499, December 2004.

This study documents the vocal characteristics of an actor before and after a series of eight performances involving extended voice use. The hypothesis was that this type of extended voice use would result in symptoms of vocal abuse and that damage to the actor’s voice would be evident in measures made after the performance series. Three pre-performance and three post-performance speech samples were gathered and analyzed using the CSL and Visi-Pitch II. Measurements taken included maximum phonational range; maximum sustained phonation; fundamental frequency during reading; maximum intensity levels; sound pressure levels for soft, moderate, and loud productions of sustained /a/; and perturbation including jitter, shimmer, harmonics-to-noise ratio, and an s/z ratio. Pre- and post-performance samples of the “Rainbow passage” and sustained vowel phonation were rated by a group of blinded listeners that included professional voice trainers and speech pathologists. In addition, sample lines from the performance were played for the listeners to judge whether this technique would result in symptoms of vocal abuse. Eleven out of 12 professional voice trainers rated that this technique would result in symptoms of vocal abuse. The data revealed post-performance improvement in phonational range, maximum intensity levels, perturbation measures, and s/z ratio. Measures of maximum sustained phonation, fundamental frequency, and sound pressure levels remained stable. Videoendoscopy revealed normal function of the larynx and vocal folds.

Kay’s Computerized Speech Lab (CSL), Model 4300B, was used for analyzing the Rainbow Passage and lines from the play. Kay’s Visi-Pitch II, Model 3300, was used to obtain measures of average fundamental frequency, jitter, shimmer, and noise-to-harmonic ratio on samples of a sustained /a/. Visi-Pitch II was also used to measure maximum phonational range.

 

“Voice Quality After Carbon Dioxide Laser and Conventional Surgery for T1A Glottic Carcinoma,” Schindler, Antonio, Francesca Palonta, Giuliana Preti, Francesco Ottaviani, Oskar Schindler, and Andrea Luigi Cavalot, Journal of Voice, Vol. 18, Number 4, pp. 545-550, December 2004.

The different types of small vocal fold tumor therapy allow the preservation of respiration and deglutition; the quality of phonation is the most important criterion for the patient. The aim of the study is to compare vocal function after treatment of T1a tumors by conventional and laser cordectomy. Fifty-seven male patients were included in the study: 27 underwent conventional cordectomy using an external approach, and 30 underwent an endoscopic microscopic laser cordectomy. Videolaryngoscopy was performed for each subject, and the maximal phonation time was measured. Spectrograms were recorded, and perturbation analysis was performed if a clear harmonic structure was visible. Voices were perceptually rated by two experienced phoniatricians using the GRBAS scale. Even though a slightly better voice was found after conventional surgery throughout the data, no statistically significant difference was measured in the two groups. The data on voice outcome per se do not indicate the selection of one surgical approach over another.

Kay’s Computerized Speech Lab (CSL), Model 4300, and Multi-Dimensional Voice Program, Model 5105, were used to perform objective voice evaluations in this study.

 

“Study of Lubricant-Induced Changes in Chronic Snorers,” Wijewickrama, Rohan C., David Blalock, and James W. Mims, Otolaryngology-Head and Neck Surgery, Vol. 131, No. 5, pp. 606-609, November 2004.

Objective: The efficacy of many of the noninvasive treatments for snoring has not been evaluated in controlled trials. This paper seeks to evaluate the efficacy of an oil-based spray in the treatment of snoring, in a double-blinded, placebo-controlled, crossover trial using objective acoustic analysis and subjective questionnaires.

Study Design and Setting: Participants were randomized to use both oil-based oral spray (treatment) and water-based oral spray (placebo) during a two-night in-home study period. Questionnaires were completed by participant and bed-partner in addition to audio-tape recordings which were analyzed for frequency, duration, and mean energy of snoring.

Results: Greatest snoring rate demonstrated 30% = benefit; 40% = no change; 30% = adverse effect (n = 20). Percent time snoring yielded: 30% benefit; 15% no change; 55% adverse effect (n = 12): benefit = 17%, no change = 33%, adverse effect = 50%. Bed-partner observations (n = 17) demonstrated 37% = benefit; 38% = no change; 25% adverse effect.

Conclusion/Significance: Objective and subjective evaluation of the performance of the oil-based Snoreless spray in comparison to placebo demonstrated a lack of efficacy in snoring reduction.

In this study, the audiotaped tokens (snores) were analyzed using Kay’s Computerized Speech Lab (CSL), Model 4300B.

 

“Factors Predicting Patient Perception of Dysphonia Caused by Benign Vocal Fold Lesions,” Behrman, Alison, Lucian Sulica, and Tina He, Laryngoscope, Vol. 114, pp. 1693-1700, October 2004.

Objective/Hypothesis: To assess factors that may be predictive of patient perception of dysphonia severity, as quantified by the Voice Handicap Index (VHI) score. We hypothesize that 1) level of vocal demand; 2) auditory-perceptual evaluation of dysphonia severity; and 3) vocal function, as defined by phonatory glottal closure and mucosal wave vibration, are the most significant predictors of VHI score.

Study Design: Retrospective review of 100 patients with benign vocal fold lesions.

Methods: Variables assessed for predictive value to VHI score are level of vocal demands, auditory-perceptual evaluation of dysphonia severity, integrity of mucosal wave vibration and phonatory glottal closure, lesion type, duration of current complaint, smoking, age, and sex. Harmonic to noise ratio was assessed in a subset of 50 patients.

Results: Patients with routine voice use had significantly lower VHI scores than those with more intensive (nonsinging/acting) vocal demands. Patients who quit smoking had greater VHI scores than those who currently smoke or never started. Patients with long-standing dysphonia tended to have lower VHI scores than those with shorter duration vocal complaints. Auditory-perceptual assessment of dysphonia severity and harmonic to noise ratio were weak predictors of VHI score. Age, sex, lesion type, phonatory glottal closure, and mucosal wave vibration were not significant predictors of VHI score.

Conclusions: Patient perception of dysphonia severity is independent of many factors commonly assessed during the evaluation of voice disorders. It appears to be an important independent element in the assessment of the effect of a benign vocal fold lesion and critical to therapeutic decision-making.

In this study, acoustic data were obtained using an external microphone input to Kay’s Digital Video Stroboscopy System and sampled at 51.2 k samples/s before the endoscopic examination. The digital files were then analyzed using Kay’s Computerized Speech Lab (CSL).

 

“Voice Changes after Androgen Therapy for Hypogonadotrophic Hypogonadism,” Akcam, Timur, Erol Bolu, Albert L. Merati, Coskun Durms, Mustafa Gerek, and Yalcin Ozkaptan, Laryngoscope, Vol. 114, pp. 1587-1591, September 2004.

Objectives/Hypothesis: Males with isolated hypogonadotropic hypogonadism (IHH) fail to undergo normal sexual development, including the lack of masculinization of the larynx. The objective of this study was to measure the mean vocal fundamental frequency (Mfo) in IHH patients and determine the impact of androgen treatment. An additional aim was to compare the Mfo between IHH patients and controls.

Study Design: Prospective observational study.

Methods: Twenty-four patients with IHH were identified along with 30 normal males and females. Voice recordings were obtained on all subjects. Androgen therapy was administered to the IHH patients. The Mfo and serum sex hormone levels were measured before treatment and at intervals during therapy. These results were compared with the pretreatment data within the IHH group. Voice parameters were also compared between the pre- and posttreatment IHH patients and the normal males and females.

Results: The Mfo in untreated IHH patients was 229 ± 41 Hz. This was intermediate between the normal male (150 ± 22 Hz, P < .001) and normal female patients (256 ± 29 Hz, P < .01). After treatment, the Mfo in the IHH group decreased to 173 ± 30 Hz (P < .0001); indeed, their posttreatment Mfo approached that of normal males (P < .08). Serum hormone levels responded to the injected testosterone, but these levels did not directly correlate with Mfo.

Conclusions: Mfo in IHH patients is intermediate between normal male and female levels. After treatment with testosterone, these values approach the range of normal males. This prospective study details the impact of androgens on the larynx and vocal function in patients with IHH.

Kay’s CSL, in conjunction with the Multi-Dimensional Voice Program (MDVP), Model 5105, was used to determine the acoustic parameters in this study.

 

“Quality-of-Life Outcomes Following Laryngeal Endoscopic Surgery for Non-Neoplastic Vocal Fold Lesions,” Johns, Michael M., C. Gaelyn Garrett, Joanna Hwang, Robert H. Ossoff, and Mark S. Courey, Annals of Otology, Rhinology & Laryngology, Vol. 113, Number 8, pp. 597-601, August 2004.

Preservation of the vocal fold cover during laryngeal surgery should optimize vocal outcomes for patients with benign glottal lesions. The aim of the study was to evaluate changes in the quality of life, perceptual voice evaluation, and acoustic and aerodynamic measures of patients before and after endoscopic laryngeal microsurgery for true vocal fold cysts, polyps, and scarring. Preoperative and postoperative Voice Handicap Index (VHI) scores, Short Form 36 scores, and perceptual, acoustic, and aerodynamic voice measures were obtained prospectively from 42 patients who underwent phonomicrosurgery from February 2000 through May 2003. The mean (±SD) preoperative VHI was 49.6 ± 21. The mean postoperative VHI score at a minimum of 3 months after surgery decreased to 26.8 ± 21 (p < .001). When divided by lesion type, VHI scores improved significantly after surgery for vocal fold polyps and cysts. Although patients with vocal fold scarring demonstrated improvement in VHI scores after surgery, statistical significance was not achieved. For the entire group, the Short Form 36 scores were not significantly different from US norms either before or after operation. The acoustic data showed statistically significant decreases in jitter (2.05% to 1.26%), shimmer (7.06% to 4.03%), and noise-to-harmonics ratio (0.18 to 0.13) after surgery (p < .05) in female patients. The upper pitch limit increased after surgery in women (495.3 Hz to 654.9 Hz, p < .001). These results indicate that the voice-related quality of life and some acoustic parameters improved significantly for patients who have undergone laryngeal microsurgery for vocal fold cysts and polyps. Vocal fold scarring remains a difficult clinical problem with less favorable outcomes following surgical treatment in this patient set.

In this study, acoustic data were collected and analyzed using Kay’s CSL, Model 4300, in conjunction with Kay’s Multi-Dimensional Voice Program (MDVP).

 

“The Effect of Task on Determination of Habitual Loudness,” Zraick, Richard I., Whitney Marshall, Laura Smith-Olinde, and James C. Montague, Journal of Voice, Vol. 18, Number 2, pp. 176-182, June 2004.

The purpose of this study was to investigate whether there is an effect of task on determination of habitual loudness. Four tasks commonly used to elicit habitual loudness were compared (automatic speech, elicited speech, spontaneous speech, and reading aloud). Participants were adult female speakers (N = 30) with normal voice. A one-way analysis of variance (ANOVA) revealed a statistically significant (p < 0.05) effect of task, with post-hoc analyses indicating that there was a statistically significant difference in habitual loudness elicited via automatic versus spontaneous speech (p < 0.05), and automatic speech versus reading aloud (p < 0.001). The issue of how habitual loudness is defined is considered. Implications of the use of one task for determination of habitual loudness are discussed, as is the possibility of a task effect on determination of other clinically useful vocal parameters.

In this study, Kay’s Visi-Pitch II, Model 3300, was used to determine whether each subject demonstrated normal vocal quality and speaking fundamental frequency and to obtain and analyze measurements of habitual vocal intensity.

 

“Vocal Aging and the Impact on Daily Life: A Longitudinal Study,” Verdonck-de Leeuw, Irma M. and Hans F. Mahieu, Journal of Voice, Vol. 18, Number 2, pp. 193-202, June 2004.

Longitudinal studies on vocal aging are scarce, and information on the impact of age-related voice changes on daily life is lacking. This longitudinal study reports on age-related voice changes and the impact on daily life over a time period of 5 years on 11 healthy male speakers, age ranging from 50 to 81 years. All males completed a questionnaire on vocal performance in daily life, and perceptual and acoustical analyses of vocal quality and analyses of maximum performance tasks of vocal function (voice range profile) were performed. Results showed a significant deterioration of the acoustic voice signal as well as increased ratings on vocal roughness judged by experts after the time period of 5 years. An increase of self-reported voice instability and the tendency to avoid social parties support these findings. Smoking males had a lower speaking fundamental frequency compared with nonsmoking males, and this seemed reversible for males who stop smoking. This study suggests a normal gradual vocal aging process with clear consequences in daily life, which should be taken into consideration in clinical practice as well as in studies concerning communication in social life.

Researchers in this study performed acoustic analyses of voice quality using Kay’s MDVP software in conjunction with the CSL, Model 4300B.

 

“Reliability of Calculating the Cepstral Peak without Linear Regression Analysis,” Heman-Ackah, Yolanda D., Journal of Voice, Vol. 18, Number 2, pp. 203-208, June 2004.

Measures of cepstral peak prominence, using the smoothing algorithm and linear regression analysis software developed by Hillenbrand, have been shown to be reliable predictors of dysphonia in voice samples. Recently, the Computerized Speech Laboratory [(CSL) Kay Elemetrics, Pine Brook, New Jersey] has introduced cepstral analysis as a component of that software package. The cepstral peak, in this instance, is calculated by the voice clinician analyzing the phonatory sample by subtracting the value of the peak from the apparent baseline signal. This study compares the ability of cepstral peak values calculated from the CSL software to predict dysphonia reliably with that of the values produced by the smoothing algorithm and linear regression analysis of Hillenbrand. The results of this study show that linear regression analysis is an important step in calculating the cepstral peak prominence, thus limiting the usefulness of software programs that do not employ this step.

The voice samples in this study underwent cepstral analysis using Kay's core CSL software accessed from samples acquired within MDVP (Advanced version).

 

“Association between birth control pills and voice quality,” Amir, Ofer and Liat Kishon-Rabin, Laryngoscope, Vol. 114, pp. 1021-1026, June 2004.

Objective/Hypothesis: The objective was to extend our knowledge of the effect of birth control pills on voice quality in women based on various acoustic measures.

Study Design: A longitudinal comparative study of 14 healthy young women over a 36- to 45-day period.

Methods: Voices of seven women who used birth control pills and seven women who did not were recorded repeatedly approximately 20 times. Voice samples were analyzed acoustically, using an extended set of frequency perturbation parameters (jitter, relative average perturbation, pitch period perturbation quotient), amplitude perturbation parameters (shimmer, amplitude average perturbation quotient), and noise indices (noise-to-harmonics ratio, voice turbulence index).

Results: Voice quality and stability were found to be better among the women who used birth control pills. Lower values were found for all acoustic measures with the exception of voice turbulence index. Results also provided preliminary indication for vocal changes associated with the days preceding ovulation.

Conclusions: In contrast to the traditional view of oral contraceptives as a risk factor for voice quality, and in keeping with the authors’ previous work, the data in the present study showed that not only did oral contraceptives have no adverse effect on voice quality but, in effect, most acoustic measures showed improved voice quality among women who used the birth control pill. The differences in the noise indices between groups may also shed light on the nature of the effect, which may bear more on vocal fold regulation of vibration than on glottal adduction.

Kay's CSL, Model 4300B, and Multi-Dimensional Voice Program (MDVP) were used to conduct the acoustic analyses in this study.
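
For readers unfamiliar with the perturbation measures named in this abstract, the sketch below shows the commonly used textbook definitions of percent jitter, relative average perturbation, and percent shimmer. It is an illustration only, not MDVP's implementation. The function names and the sample values are assumptions, and the glottal periods and peak amplitudes are taken as already extracted.

```python
def percent_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Applied to glottal periods this gives percent jitter; applied to
    peak amplitudes it gives percent shimmer.
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def relative_average_perturbation(periods):
    """Mean deviation of each period from its 3-point average, in percent."""
    deviations = [
        abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3.0)
        for i in range(1, len(periods) - 1)
    ]
    mean_period = sum(periods) / len(periods)
    return 100.0 * (sum(deviations) / len(deviations)) / mean_period


# Made-up example data: four glottal periods (ms) and four peak amplitudes.
periods_ms = [10.0, 10.2, 9.8, 10.0]
amplitudes = [1.00, 1.05, 0.95, 1.00]

jitter_pct = percent_perturbation(periods_ms)
rap_pct = relative_average_perturbation(periods_ms)
shimmer_pct = percent_perturbation(amplitudes)
```

Lower values indicate a steadier signal, which is the sense in which the abstract reports "lower values" for the women using birth control pills.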

 

“Medialization laryngoplasty with strap muscle transposition for vocal fold atrophy with or without sulcus vocalis,” Su, Chih-Ying, Shang-Shyue Tsai, Jeng-Fen Chiu, and Chu-An Cheng, Laryngoscope, Vol. 114, pp. 1106-1112, June 2004.

Objective: Vocal fold atrophy with or without sulcus vocalis may result in a spindle-shaped glottal incompetence (SGI). Because of varying drawbacks with all existing materials (e.g., Silastic block, Teflon, fat, etc.) used for medialization or augmentation of the atrophic vocal folds, there is a need to supplant these materials with a more stable, autologous tissue to correct the SGI.

Study Design: Thirty-two patients with vocal fold atrophy underwent medialization laryngoplasty with strap muscle transposition.

Methods: Under local or general anesthesia, the thyroid lamina on the more affected side was vertically incised 5 mm off the midline. The inner perichondrium was carefully elevated from the overlying thyroid ala. Care was taken not to enter the laryngeal lumen. After dividing the thyrohyoid and cricothyroid membranes, the lamina was retracted laterally. To accommodate the muscle flap more easily, the caudal edge of the lamina was trimmed using a small burr. A bipedicled strap muscle flap was then transposed into the space between the lamina and the paraglottic soft tissue. The thyroid cartilages were carefully sutured back in place. All patients underwent pre- and post-operative voice evaluations including laryngostroboscopy, perceptual assessment, and acoustic and aerodynamic analyses. Patients who had been followed up for more than 3 months were enrolled in the study.

Results: A total of 27 of the 32 patients with complete pre- and postoperative voice function measurements were included in the analysis. Vocal improvement was demonstrated in 26 of these 27 (96%) patients. No dyspnea or other major complications were noted in any patients.

Conclusions: The results indicate that medialization laryngoplasty with strap muscle transposition is a prosthesis-free, safe, and effective technique for correcting SGI caused by vocal fold atrophy.

Videolaryngostroboscopy was performed pre- and post-operatively using Kay’s RLS, Model 9100, stroboscopic light source. Kay’s CSL, Model 4300B, and Aerophone II, Model 6800, were also used in this study to perform the acoustic and aerodynamic analyses, respectively.

 

“Beware of the ‘telephone effect’: the influence of telephone transmission on the measurement of Formant Frequencies,” Künzel, Hermann J., Forensic Linguistics, Vol. 8, pp. 80-112, November 2001.

Abstract: Speech scientists often have to work with speech signals that have been transmitted over the telephone. Although the acoustic properties of telephone transmission such as the band-pass filter characteristics are well known, little attention has been paid to their effect on the measurement of speech parameters. This study deals with artifacts introduced by the lower cut-off slope of the transmission channel on vowel formants. For theoretical reasons, frequency components may be assumed to be attenuated more strongly the lower they are. Therefore F1 of most vowels can be expected to be affected most. Attenuation of the lower components of a formant will necessarily increase the relative weight of the higher components for the determination of a formant and thus cause an artificial upward shift of its center frequency. An empirical investigation with directly and telephone-transmitted samples from ten male and ten female subjects shows that the predicted effect on F1 does in fact occur for all tested vowels except /a/, whose F1 is too high to be affected by the slope of the band-pass. The consequences of measurement errors arising from such artifacts are discussed with special reference to speaker identification and empirical dialectology.

Kay’s Multi-Speech software was used to obtain the relevant acoustic measurements in this study.
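
The mechanism this abstract describes can be sketched numerically. The Python example below is a toy model, not the procedure used in the study or in Multi-Speech. It treats a formant as a bell-shaped spectral peak, applies a high-pass slope, and compares the amplitude-weighted center frequency before and after. The 300 Hz cutoff, the 12 dB per octave slope, the 100 Hz bandwidth, and all function names are assumptions chosen for illustration.

```python
import math


def resonance_gain(freq_hz, center_hz, bandwidth_hz):
    """Bell-shaped (Lorentzian) magnitude of a single formant peak."""
    return 1.0 / (1.0 + ((freq_hz - center_hz) / (bandwidth_hz / 2.0)) ** 2)


def highpass_gain(freq_hz, cutoff_hz, slope_db_per_octave):
    """Linear gain of an idealized lower cut-off slope (unity above cutoff)."""
    if freq_hz >= cutoff_hz:
        return 1.0
    octaves_below = math.log2(cutoff_hz / freq_hz)
    return 10.0 ** (-slope_db_per_octave * octaves_below / 20.0)


def centroid(freqs, gains):
    """Amplitude-weighted mean frequency."""
    return sum(f * g for f, g in zip(freqs, gains)) / sum(gains)


def centroid_shift(center_hz, bandwidth_hz=100.0, cutoff_hz=300.0, slope=12.0):
    """Upward shift (Hz) of the peak's centroid caused by the high-pass slope."""
    freqs = [float(f) for f in range(100, 1001, 10)]
    direct = [resonance_gain(f, center_hz, bandwidth_hz) for f in freqs]
    phone = [g * highpass_gain(f, cutoff_hz, slope) for f, g in zip(freqs, direct)]
    return centroid(freqs, phone) - centroid(freqs, direct)


low_f1_shift = centroid_shift(300.0)   # F1 sitting on the cut-off slope
high_f1_shift = centroid_shift(800.0)  # F1 well above the slope, as for /a/
```

In this toy model the low F1 moves upward noticeably while the high F1 barely moves, which mirrors the direction of the effect reported for all vowels except /a/.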

 

“The Effect of Auditory Feedback on Phonation Threshold Pressure Measurement,” Morgan, Michael D., Miguel A. Triana, and Thomas J. Milroy, Journal of Voice, Vol. 18, Number 1, pp. 46-55, March 2004.

The effect of auditory feedback on phonation threshold pressure (Pth) measurement was investigated in 14 females with normal, untrained voices. Two measurement systems (Glottal Enterprises MS 100—circumferentially vented mask and Kay Elemetrics Aerophone II—non-circumferentially vented mask) were examined under three conditions: (1) masked, (2) no mask, and (3) masked with enhanced auditory feedback-acoustic signal placed at ears through headphones. Masked with enhanced auditory feedback, in addition to subject training, significantly lowered Pth values regardless of mask design. The amount of auditory feedback provided by different mask designs was investigated and revealed a significant difference. Clinical significance of different auditory feedback levels provided by the two mask designs was investigated. Direct comparison of the mean values between systems was not possible because of each system’s design and calibration. Comparisons were accomplished by subtracting means of select-paired conditions (masked/no mask; masked/masked plus masked with enhanced auditory feedback) within each system and then comparing these difference scores from the same paired conditions between each system. No clinical significance in difference scores was revealed because of varying amounts of auditory feedback provided by the masks. Results support the use of enhanced auditory feedback, in addition to subject training, when measuring Pth.

In this study, Kay’s CSL, Model 4300B, and Multi-Dimensional Voice Program (MDVP) were used to ensure normal limits for Fo, jitter, and shimmer; Kay’s Real-Time Pitch Program was used to obtain each subject’s habitual pitch; Kay’s Voice Range Profile (VRP) program was used to obtain the nearest semitone to a subject’s habitual pitch; and Kay’s Aerophone II, Model 6800, was used to acquire oral pressures and airflow measures.

 

“Outcome of Laryngeal and Velopharyngeal Biofeedback Treatment in Children and Young Adults: A Pilot Study,” Van Lierde, Kristiane M., Sofie Claeys, Marc De Bodt, and Paul Van Cauwenberge, Journal of Voice, Vol. 18, Number 1, pp. 97-106, March 2004.

A relatively new management strategy for the treatment of voice disorders is the use of laryngeal (LB) and velopharyngeal biofeedback (VB). The main purpose of the present pilot study is to document the outcome of vocal and velopharyngeal performances after a well-defined LB and VB treatment. Four subjects were studied pretreatment (1 week before LB or VB treatment) and posttreatment (1 week after the LB or VB treatment). To measure and compare the effect of LB and VB, objective and subjective assessment techniques were used. Perceptual voice assessment included perceptual rating of the voice using the GRBAS scale. Furthermore, the vocal quality in this population is modeled by means of the Dysphonia Severity Index. For the objective assessment of nasal resonance, the Nasometer and the Glatzel test were used. A perceptual evaluation of speech, the Gutzmann test, and the tests from Bzoch were used as subjective assessment techniques. Both patients selected for LB and VB treatment showed improvement of their performances. The resulting improvement, as measured by means of an objective approach, is in agreement with the perceived (auditory) improvement of voice and resonance. The use of LB and VB treatment in patients, especially in some subjects who are not responding to traditional voice or velopharyngeal therapy, must be encouraged.

Researchers in this study used Kay’s Multi-Dimensional Voice Program (MDVP) to obtain measures of highest frequency, lowest intensity, maximum phonation time (MPT) and jitter. Kay’s Nasometer was used to obtain nasalance values.
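
The four measures listed above are the inputs to the Dysphonia Severity Index mentioned in the abstract. As a hedged illustration, the sketch below uses the weighting commonly cited from Wuyts et al. (2000); the abstract itself does not state the formula, so treat the coefficients as an assumption to be checked against that source. The function name and the sample values are hypothetical.

```python
def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_pct):
    """Weighted combination commonly cited for the DSI (assumed coefficients).

    mpt_s       maximum phonation time in seconds
    f0_high_hz  highest attainable fundamental frequency in Hz
    i_low_db    lowest attainable intensity in dB
    jitter_pct  jitter in percent
    """
    return (0.13 * mpt_s
            + 0.0053 * f0_high_hz
            - 0.26 * i_low_db
            - 1.18 * jitter_pct
            + 12.4)


# Made-up example values, not data from the study.
example_dsi = dysphonia_severity_index(
    mpt_s=20.0, f0_high_hz=800.0, i_low_db=55.0, jitter_pct=0.5
)
```

With these weights, higher scores correspond to better vocal quality; the scale is usually described as running from about +5 for perceptually normal voices to about -5 for severely dysphonic ones.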

 

“Relationship Among Glottal Area, Static Supraglottic Compression, and Laryngeal Function Studies in Unilateral Vocal Fold Paresis and Paralysis,” Bielamowicz, Steven, Ritu Kapoor, Jerome Schwartz, and Sheila V. Stager, Journal of Voice, Vol. 18, Number 1, pp. 138-145, March 2004.

In this study, we evaluated the relationship between laryngeal function measures and glottal gap ratio and normalized measures of supraglottic behaviors in patients with unilateral vocal fold paresis (UVFP). Thirty-one patients were found to have unilateral vocal fold paresis by videoendoscopy and laryngeal electromyography, and 13 controls participated in this study. Patients with UVFP demonstrated significantly larger glottal gap ratios (p = 0.016) than control subjects. The nonparalyzed or contralateral vocal fold was associated with significantly more static false vocal fold compression (p = 0.03) compared with the paralyzed vocal fold or with the controls. Patients with unilateral vocal fold paresis were divided into subgroups: those with normal or abnormal maximum phonation time, flow, or pressure measures. Smaller glottal gap ratios were identified in patients with normal maximum phonation times and flow measures. Greater false vocal fold activity was identified in unilateral vocal fold paresis patients with normal laryngeal function measures than in unilateral vocal fold paresis patients with abnormal measures. These findings suggest that some patients with documented unilateral paresis and glottal incompetence can compensate for vocal fold weakness such that their acoustic and aerodynamic measures are normal.

All subjects in this study were examined by transnasal fiberoptic laryngoscopy using Kay’s RLS, Model 9100, light source. Kay’s Model 9105, 70° rigid endoscope was also used to visualize the larynx of each subject. Still images were digitized using Kay’s videostroboscopy software, ver. 1.6.

 

“Spastic/Spasmodic vs. Tremulous Vocal Quality: Motor Speech Profile Analysis,” Lundy, Donna S., Soham Roy, Jun W. Xue, Roy R. Casiano, and Daniel Jassir, Journal of Voice, Vol. 18, Number 1, pp. 146-152, March 2004.

Strained, strangled, and tremulous vocal qualities that are typically seen in adductor spasmodic dysphonia (ADSD), voice tremor (Tremor), and the spastic dysarthria of amyotrophic lateral sclerosis (ALS) may sound similar and be difficult to differentiate. The purpose of this study was to determine if these vocal qualities of neurologic origin could be differentiated on the basis of acoustic and motor speech parameters. Three groups of subjects (ADSD, ALS, and Tremor) were analyzed by the Motor Speech Profile System (Kay Elemetrics, Lincoln Park, NJ) for fundamental frequency (Fo), standard deviation of Fo, diadochokinetic rate (ddk), standard deviation of ddk, mean intensity and standard deviation of ddk, frequency and amplitude variability in connected speech, and speaking rate in connected speech. Profiles of the three groups are presented with the significant features that differentiated one from the other.

All patients in this study underwent acoustic/motor speech analysis using the Motor Speech Profile, Model 4341, a software option for Kay’s Computerized Speech Lab.

 

“The Importance of the Voice in Male-to-Female Transsexualism,” Neumann, Kerstin and Cornelia Welzel, Journal of Voice, Vol. 18, Number 1, pp. 153-167, March 2004.

Transsexuality is a complex, permanent transposition involving a paradoxical feeling of belonging to the opposite sex. Furthermore, in the case of male-to-female transsexuals, the unchanged male voice, which is at odds with the female outward appearance, poses a serious obstacle to full social integration of the woman.

One way of permanently raising the fundamental frequency, requiring little effort, is modified cricothyroidopexy via miniplates, which has been used in our hospital since 1993 following a technique developed by Isshiki (thyroplasty type IV).

Until now, this operation has been performed on 67 female patients. To record the anatomical-morphological and functional data, preoperatively, post-operatively, and a year after the operation, a detailed voice diagnosis was made, laryngoscopy was carried out, X-rays were taken, and computer-assisted tomography was used to examine the larynx.

Thus far, the functional results have been good. On average, the fundamental frequency has been raised by about one fourth. Whereas none of the female patients had a female-speaking voice before the operation, after the operation, about 30% of the patients’ voices were in the female range, and 32% had at least a neutral-sounding voice.

Kay’s CSL, Model 4300B, was used to measure the voice range profile of subjects in this study.
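
The phrase "raised by about one fourth" in this abstract can be read in two ways: as a 25% increase in Hz or as the musical interval of a fourth. The abstract does not say which is meant, so the Python sketch below simply converts between frequency ratios and semitones for both readings. The function names and the 120 Hz starting value are illustrative assumptions.

```python
import math


def ratio_to_semitones(ratio):
    """Size of a frequency ratio in equal-tempered semitones."""
    return 12.0 * math.log2(ratio)


def semitones_to_ratio(semitones):
    """Frequency ratio corresponding to a number of semitones."""
    return 2.0 ** (semitones / 12.0)


# Reading 1: a 25% increase in Hz, expressed in semitones.
quarter_increase_st = ratio_to_semitones(1.25)

# Reading 2: a musical fourth (5 semitones), expressed as a ratio.
musical_fourth_ratio = semitones_to_ratio(5.0)

# Hypothetical preoperative speaking Fo of 120 Hz under each reading.
fo_after_reading_1 = 120.0 * 1.25
fo_after_reading_2 = 120.0 * musical_fourth_ratio
```

Under the first reading the shift is a little under 4 semitones; under the second it is a frequency increase of roughly one third.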

 

“Muscle tension dysphonia in patients who use computerized speech recognition systems,” Olson, David E.L., Raul M. Cruz, Krzysztof Izdebski, and Tracey Baldwin, Ear, Nose, and Throat Journal, Vol. 83, Number 3, pp. 195-198, March 2004.

The use of speech recognition systems as a replacement for other types of transcription systems is increasing rapidly, partly because many people are unable to use conventional keyboards as a result of upper-extremity repetitive strain injury (RSI). However, the frequent or continuous use of such systems can cause muscle tension dysphonia in some patients. The scientific literature suggests that there is an association between upper-extremity RSI and muscle tension dysphonia. We present a retrospective case series of five patients with workplace upper-extremity RSI who developed muscle tension dysphonia soon after they began using discrete computerized speech recognition software. The diagnosis of dysphonia was based on laryngovideostroboscopy, acoustic analyses, and voice load testing. All patients had normal voice when using everyday speech, but speaking into the computer resulted in the rapid onset of aperiodicity, strain, and a decrease in fundamental frequency. In three of the five patients, laryngovideostroboscopy showed posterior glottic overapproximation, but no other abnormalities. Treatment was centered on voice therapy and avoidance of long periods of using computerized speech recognition systems. The condition of three of the five patients improved with therapy. We conclude that computer speech recognition programs can lead to the onset of muscle tension dysphonia in some patients. These patients can be successfully treated with voice therapy.

Kay’s CSL, Model 4100, and a Kay stroboscopy system were used by researchers in this study to perform acoustic analysis and laryngovideostroboscopy, respectively.

 

“Laryngeal Function and Vocal Fatigue After Prolonged Reading in Individuals with Unilateral Vocal Fold Paralysis,” Kelchner, Lisa N., Linda Lee, and Joseph C. Stemple, Journal of Voice, Vol. 17, Number 4, pp. 513-528, December 2003.

The purpose of the present study was to examine the effect of prolonged loud reading, intended to induce fatigue, on vocal function in adults with unilateral vocal fold paralysis (UVFP). Subjects were 20 adults, 37-60 years old, with UVFP secondary to recurrent laryngeal nerve paralysis. Subjective ratings and instrumental measures of vocal function were obtained before and after reading. Statistical analysis revealed subjects rated their vocal quality and physical effort for voicing more severely following prolonged loud reading, whereas expert raters did not detect a significant perceptual difference in vocal quality. Reading fundamental frequency (Fo) was significantly increased following prolonged loud reading, as were mean airflow rates at all pitch conditions. Maximum phonation times for comfort and low pitches significantly decreased during posttests. Multiple regression analyses revealed significant associations between ratings of posttest physical effort and select posttest measures. Interpretation of results indicates the prolonged loud reading task was successful in vocally fatiguing most of the UVFP subjects. Key physiologic correlates of vocal fatigue, in individuals with UVFP, include further reduction of glottic efficiency, resulting in decreased regulation of glottic airflow and a temporary destabilization of speaking fundamental frequency.

Kay’s Visi-Pitch was used to collect data for measures of fundamental frequency and intensity during running speech; data for fundamental frequency, intensity, and noise-to-harmonic ratio were collected for sustained vowel /ah/ using Kay’s MDVP.

 

“Effects of Vocal Training on the Acoustic Parameters of the Singing Voice,” Mendes, Ana P., Howard B. Rothman, Christine Sapienza, and W.S. Brown, Jr., Journal of Voice, Vol. 17, Number 4, pp. 529-543, December 2003.

Vocal training (VT) has, in part, been associated with the distinctions in the physiological, acoustic, and perceptual parameters found in singers’ voices versus the voices of nonsingers. This study provided information on the changes in the singing voice as a function of VT over time. Fourteen college voice majors (12 females and 2 males; age range, 17-20 years) were recorded while singing, once a semester, for four consecutive semesters. Acoustic measures included fundamental frequency (Fo) and sound pressure level (SPL) of the 10% and 90% levels of the maximum phonational frequency range (MPFR), vibrato pulses per second, vibrato amplitude variation, and the presence of the singer’s formant. Results indicated that VT had a significant effect on the MPFR. Fo and SPL of the 90% level of the MPFR and the 90-10% range increased significantly as VT progressed. However, no vibrato or singers’ formant differences were detected as a function of training. This longitudinal study not only validates previous cross-sectional research, i.e., that VT has a significant effect on the singing voice, but also it demonstrates that these effects can be acoustically detected by the fourth semester of college vocal training.

In this study, Kay’s CSL, Model 4300B, was used to perform the acoustic analyses.

 

“Tracking Outcomes after Phonosurgery for Sulcus Vocalis: A Case Report,” Welham, Nathan V., Bernard Rousseau, Charles N. Ford, and Diane M. Bless, Journal of Voice, Vol. 17, Number 4, pp. 571-578, December 2003.

Outcomes data after a surgical or behavioral intervention should be tracked until stability is reached. Often it is unclear how long patients should be followed and at what point an outcome can be considered stable. These issues have implications for treatment decision making, efficacy measurement, and the design of research studies. Vocal function data were collected 24 hours before and at 1, 6, and 12 months after phonosurgery for sulcus vocalis. One data series was collected daily during the first month after surgery, providing a unique opportunity to study voice changes in the immediate postoperative period. The different vocal function indices (acoustic, perceptual, videostroboscopic, aerodynamic, psychosocial) demonstrated a general pattern of improvement after intervention; however, they appeared to reach stability at different times. This report reinforces the value of following patients until complete outcome stability.

Laryngeal imaging was performed using a Kay RLS, Model 9100, attached to a rigid endoscope; Kay’s VRP, in conjunction with CSL, Model 4300B, was used to measure Fo range.

 

“Early Results of Transcutaneous Injection Laryngoplasty with Micronized Acellular Dermis Versus Type-I Thyroplasty for Glottic Incompetence Dysphonia Due to Unilateral Vocal Fold Paralysis,” Lundy, Donna S., Roy R. Casiano, Mark E. McClinton, and Jun W. Xue, Journal of Voice, Vol. 17, Number 4, pp. 589-595, December 2003.

Medialization thyroplasty (type I) has become the gold standard to improve glottic closure due to unilateral vocal fold paralysis. A newer injection method utilizing homologous collagen from cadaveric human tissue has been described as an attractive alternative as no donor site is required, there is a very low risk of hypersensitivity, and the intact, acellular collagen fibers may suffer a reduced long-term reabsorption rate. Preliminary results on eight patients comparing presurgical and postsurgical parameters (perceptual, stroboscopic, acoustic, and aerodynamic) revealed comparable results when compared with a control group of individuals, age- and sex-matched, that had undergone standard medialization thyroplasty (type I). Further study is needed to assess the long-term results with this minimally invasive method of vocal fold medialization.

Researchers in this study used a Kay stroboscopy system to assess the degree of glottal closure; acoustic analysis, including jitter rate and noise-to-harmonic ratio, was performed using Kay’s MDVP.

 

“Maxillary obturators: the relationship between patient satisfaction and speech outcome,” Rieger, Jana M., John F. Wolfaardt, Naresh Jha, and Hadi Seikaly, Head and Neck, pp. 895-903, November 2003.

Abstract:

Background: Patient satisfaction with a maxillary obturator has been studied in relation to extent of surgical defect, sociodemographic characteristics, scores on mental health inventories, and psychosocial adjustment to illness scales. However, review of the literature reveals limited study of the relationship between patient satisfaction with an obturator and clinical speech outcome measures. The purpose of this study is to relate patient satisfaction scores obtained by questionnaire with those obtained by means of clinical speech measurements.

Methods: Acoustical, aeromechanical, and perceptual measurements of speech were collected for 20 patients after receiving a definitive obturator. Patient satisfaction with their obturator was later measured with the Obturator Functioning Scale (OFS).

Results: Results reveal that poorer aeromechanical speech results were associated with patient-reported avoidance of social events, whereas lower speech intelligibility outcomes were related to overall poorer perception of speech function on the OFS. Several background patient characteristics were significantly related to several responses on the OFS and to the aeromechanical assessment outcomes.

Conclusions: Results from instrumental assessments of speech seem to be informative regarding not only speech outcome but also a patient’s satisfaction with the obturator. Consideration of background patient characteristics is important when interpreting both clinically obtained and patient-perceived outcomes.

Kay’s Nasometer, Model 6200, was used to sample and evaluate the resonance balance of oral and nasal acoustic speech energy in this study.
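
The "resonance balance" referred to here is usually reported as a nasalance score: nasal acoustic energy as a percentage of combined nasal and oral energy. The Python sketch below shows that ratio only. It is not the Nasometer's signal processing, and the function name and input values are illustrative assumptions.

```python
def nasalance_percent(nasal_energy, oral_energy):
    """Nasal energy as a percentage of total (nasal + oral) energy."""
    total = nasal_energy + oral_energy
    if total <= 0:
        raise ValueError("total acoustic energy must be positive")
    return 100.0 * nasal_energy / total


# Made-up energy values in arbitrary units.
mostly_oral = nasalance_percent(nasal_energy=1.0, oral_energy=3.0)
mostly_nasal = nasalance_percent(nasal_energy=3.0, oral_energy=1.0)
```

Higher scores indicate that more of the speech energy is escaping through the nose.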

 

“Cymetra injection for unilateral vocal fold paralysis,” Karpenko, Andrew N., Robert J. Meleca, James P. Dworkin, and Robert J. Stachler, Annals of Otology, Rhinology & Laryngology, Vol. 112, Number 11, pp. 927-934, November 2003.

Cymetra has shown excellent tissue biocompatibility, a low rate of resorption, and no tissue reactivity when injected for treatment of facial wrinkling. On the basis of these findings, we hypothesize that injection of Cymetra into the thyroarytenoid muscle for treatment of glottal incompetence may demonstrate similar findings and lead to long-term improvement in voice quality and glottal gap closure. Ten patients with breathy dysphonia caused by unilateral vocal fold paralysis underwent transoral injection of Cymetra into the thyroarytenoid muscle. Each subject underwent preoperative and postoperative acoustic analysis, aerodynamic measures, taped voice sampling, and videostroboscopy. Significant improvements were identified in maximum phonation time, relative glottal area, and subjective judgment of glottal competency. These results were not maintained at the 3-month study interval. No significant change in quantitative or subjective voice quality was noted for the study group during the investigation. Resorption of Cymetra may play a significant role in contributing to these findings.

Kay’s RLS, Model 9100, was used in the study to observe laryngeal anatomy and phonatory biomechanics. Kay’s Computerized Speech Lab (CSL), Model 4300, and Aerophone II were used to perform the acoustic and speech aerodynamic analysis, respectively.

 

“Acoustic analysis of autologous fat injection versus thyroplasty in the same patient,” Hartl, Dana M., Ollivier Laccourreye, Jacqueline Vaissière, and Daniel F. Brasnu, Annals of Otology, Rhinology & Laryngology, Vol. 112, Number 11, pp. 987-992, November 2003.

We objectively measured the acoustic effects of treatment of unilateral vocal fold paralysis by injection of autologous fat and by polytetrafluoroethylene thyroplasty, in the same patient. To our knowledge, this is the first report comparing the two techniques by using the patient’s normal voice as the control. The voice of a male patient was recorded before and after onset of unilateral vocal fold paralysis, after treatment with autologous fat, and after polytetrafluoroethylene thyroplasty. Acoustic analysis was performed on a long-term average spectrum of text and on the MDVP (Kay Elemetrics) evaluation of the vowel /a/. Jitter and shimmer were not normalized, but they improved to a greater extent after fat injection. The cepstral peak prominence, spectral skewness, and long-term average spectrum returned to preparalytic values after both treatments, but improved to a greater extent after fat injection. This study showed that both techniques can return the voice to preparalytic values. Spectral measurements best reflected the voice improvement. Further prospective studies in a larger number of patients will be necessary to confirm these results and to determine the long-term objective voice outcome obtained with these techniques.

Kay’s CSL, Model 4300B, was used for the acoustic analysis in this study; Kay’s MDVP was used to determine jitter, shimmer, degree of voicelessness, and number of unvoiced segments for a 500-ms midvowel token.

 

“Voice and deglutition functions after the supracricoid and total laryngectomy procedures for advanced stage laryngeal carcinoma,” Dworkin, James Paul, Robert J. Meleca, Mark A Zacharek, Robert J. Stachler, Raza Pasha, G.G. Abkarian, Richard A. Culatta, and John R. Jacobs, Otolaryngology-Head and Neck Surgery, Vol. 129, Number 4, pp. 311-320, October 2003.

OBJECTIVE: This investigation compared speech and deglutition functions after alternative surgical treatments for advanced stage laryngeal carcinoma: the supracricoid laryngectomy (SCL) versus the total laryngectomy (TL).

STUDY DESIGN AND SETTING: Cohort investigation at Wayne State University School of Medicine.

METHODS: Quantitative studies of laryngeal biomechanics, acoustic and speech aerodynamic features, and deglutition skills of these individuals were coupled to listener and patient self-impressions of speech and voice characteristics for group comparative analyses.

RESULTS: Results revealed that patients from each subgroup performed comparably relative to speech intelligibility and voice quality disturbances. Videostroboscopy of the neoglottal mechanisms in these two populations helped to explain these outcomes. Acoustic and speech aerodynamic testing demonstrated variably abnormal features in both surgical subgroups. Whereas the SCL patients eventually achieved full oral diets, they required many sessions of swallowing therapy to obtain this objective and eliminate tube feeding supplementation. The TL patients did not evidence protracted swallowing difficulties or the need for specific exercises in order to remove their feeding tubes postoperatively. References to organ preservation strategies in lieu of surgical management are included for completeness purposes.

CONCLUSION: The SCL and TL surgical procedures for advanced stage laryngeal carcinoma resulted in equivalent speech and swallowing functional outcomes.

Kay’s CSL, Model 4300, and Aerophone II were used to collect all acoustic and speech aerodynamic data, respectively. Kay’s RLS, Model 9100, was used to observe structure and function of the neoglottis during all phonatory assessments.

 

“Functional outcomes after primary oropharyngeal cancer resection and reconstruction with the radial forearm free flap,” Seikaly, Hadi, Jana Rieger, John Wolfaardt, Gerald Moysa, Jeffery Harris, and Naresh Jha, Laryngoscope, Vol. 113, pp. 897–904, May 2003.

Objective: To report prospectively collected aeromechanical, acoustical, and perceptual speech outcomes, as well as preliminary swallowing data, in patients having reconstruction with radial forearm free flaps after primary resection for oropharyngeal cancer.

Study Design: Prospective cohort study.

Methods: Acoustical, aeromechanical, and perceptual speech data and swallowing data were gathered at three evaluation times (preoperatively and before and after radiation therapy) for patients treated for oropharyngeal cancer by means of primary resection and reconstruction with a radial forearm free flap. Degree of involvement of the soft palate and base of tongue, along with reconstructive techniques, were entered as between-group factors in the analysis.

Results: There were no significant differences in speech intelligibility between the patient groups based on the degree of palate and tongue resected. However, patients with resections of half or more than half of the soft palate had significantly higher nasalance values and larger velopharyngeal orifice areas than individuals who had less than half of the soft palate resected. Significant within-subject differences were revealed across evaluation times for the dependent variables nasalance, velopharyngeal orifice area, and word intelligibility. Ninety-four percent of the patients were able to resume a normal or soft diet. There was a 6% incidence of aspiration in 128 swallows that were analyzed. The amount of base of tongue resected did not significantly affect any of the speech or swallowing parameters.

Conclusions: Radial forearm free flaps are a good reconstructive option after oropharyngeal cancer extirpation. Our acoustic and aeromechanical results indicated that issues related to quality of the speech signal require further study for resections of half or more than half of the soft palate.

Kay’s Nasometer, Model 6200, was used to collect information regarding resonance balance and to obtain nasalance scores.

 

“Speech outcomes in patients rehabilitated with maxillary obturator prostheses after maxillectomy: a prospective study,” Rieger, J., J. Wolfaardt, H. Seikaly, and N. Jha, International Journal of Prosthodontics, Vol. 15(2), pp. 139-44, March/April 2002.

PURPOSE: Speech outcome measurements are valuable in guiding treatment and determining the effectiveness of rehabilitation with a maxillary obturator prosthesis in individuals with palatal resection. Although speech outcome data exist in the literature for such patients, relatively few reports have used clinical tools designed to measure the acoustic, physiologic, and perceptual bases of speech. This investigation reports these measures for individuals rehabilitated with a maxillary obturator.

MATERIALS AND METHODS: Speech measurements were collected prospectively at three clinical visit times (preoperative, postresection without an obturator, and with a definitive obturator) for 12 patients assigned to three groups based on the extent of their resection (< half the hard palate, ≥ half the hard palate, hard and soft palates). Acoustic data were obtained with the Nasometer, aeromechanical data were collected with the PERCI-SARS, and perceptual ratings of speech intelligibility were obtained through listener analysis.

RESULTS: Significant differences existed among the three treatments for all dependent variables and revealed that speech without an obturator is significantly different from the preoperative state, while speech with an obturator does not differ significantly from preoperative function. Individuals with soft palate involvement exhibited significantly poorer nasalance values than individuals with involvement of the hard palate only.

CONCLUSION: Rehabilitation with a maxillary obturator is successful in restoring preoperative speech function. Rehabilitation of individuals with involvement of the soft palate may be more challenging.

In this study, Kay’s Nasometer was used to obtain nasalance values from the acoustic data collected.
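As a rough illustration of what a nasalance score expresses, the sketch below uses the commonly cited definition: nasal acoustic energy as a percentage of combined nasal and oral energy. The function name and input values are hypothetical, and the Nasometer's own filtering and signal processing are not reproduced here.

```python
def nasalance_percent(nasal_energy: float, oral_energy: float) -> float:
    """Nasal energy as a percentage of total (nasal + oral) energy.

    Inputs are assumed to be non-negative energy (or amplitude) values
    measured over the same time window from separate nasal and oral
    microphones.
    """
    total = nasal_energy + oral_energy
    if nasal_energy < 0 or oral_energy < 0 or total <= 0:
        raise ValueError("energies must be non-negative with a positive sum")
    return 100.0 * nasal_energy / total


if __name__ == "__main__":
    # Hypothetical values: one part nasal energy to three parts oral energy.
    print(nasalance_percent(1.0, 3.0))
```

A higher score means proportionally more energy in the nasal channel, which is why the abstract above can compare nasalance across the preoperative, unobturated, and obturated conditions.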

 

“Cepstral Peak Prominence: A more reliable measure of dysphonia,” Heman-Ackah, Yolanda D., Reinhardt J. Heuer, Deirdre D. Michael, Rosemary Ostrowski, Michelle Horman, Margaret M. Baroody, James Hillenbrand, and Robert T. Sataloff, Annals of Otology, Rhinology & Laryngology, Vol. 112, Number 4, pp. 324-333, April 2003.

Quantification of perceptual voice characteristics allows the assessment of voice changes. Acoustic measures of jitter, shimmer, and noise-to-harmonic ratio (NHR) are often unreliable. Measures of cepstral peak prominence (CPP) may be more reliable predictors of dysphonia. Trained listeners analyzed voice samples from 281 patients. The NHR, amplitude perturbation quotient, smoothed pitch perturbation quotient, percent jitter, and CPP were obtained from sustained vowel phonation, and the CPP was obtained from running speech. For the first time, normal and abnormal values of CPP were defined, and they were compared with other acoustic measures used to predict dysphonia. The CPP for running speech is a good predictor and a more reliable measure of dysphonia than are acoustic measures of jitter, shimmer, and NHR.

Kay’s CSL, Model 4300B, was used to digitize the voice samples collected in this study. Acoustic analysis was performed using Kay’s MDVP to obtain measures of jitter, shimmer, and noise-to-harmonic ratio.

 

“Comparison of voice characteristics following three different methods of treatment for laryngeal cancer,” Eksteen, Eddie C., Jana Rieger, Margaret Nesbitt, and Hadi Seikaly, Journal of Otolaryngology, Vol. 32, Number 4, pp. 250-253, March 2003.

Introduction: Laryngeal cancer treatment has become more complex and diversified in past decades. Many different methods of treatment have evolved, and most have been able to restore the patient’s function and maintain some form of functional speech. This study was designed to evaluate the voice and speech characteristics of patients who have undergone different treatments for laryngeal cancer and to compare those characteristics with those of age- and sex-matched normal laryngeal speakers.

Methods: Twenty-two male subjects participated in the study. Five men were treated with radiation therapy, 6 men had supracricoid partial laryngectomy, 6 men had undergone total laryngectomy with tracheoesophageal puncture, and 5 men were normal laryngeal speakers. Acoustic, aeromechanical, and perceptual assessments of speech were collected.

Results: Significant age effects were found for maximum phonation times. As age increased, maximum phonation time decreased (p < .005). Significant differences were found between groups for the following dependent variables: percentage of voiceless phonation, maximum phonation time, laryngeal airway resistance, subglottal pressure, oral flow, and word intelligibility. Trends in the data for differences between groups were noted for the following acoustic variables: noise-to-harmonics ratio, jitter, and shimmer.

Conclusions: All patients developed or maintained a source of voicing after treatment and could use speech functionally, as demonstrated by normal sentence intelligibility. The radiation treatment group had voices that differed the least from the control group, whereas the opposite was true for the surgical groups, especially for those with total laryngectomy.

In this study, Kay’s Multi-Dimensional Voice Program (MDVP) in conjunction with the CSL, Model 4400, was used to perform the acoustic analysis, specifically measures of fundamental frequency, noise-to-harmonic ratio, jitter, shimmer, and percentage of voiceless phonation on speech samples of the sustained vowel /ah/; fundamental frequency was obtained for connected speech samples.

 

“Vocal Dose Measures: Quantifying Accumulated Vibration Exposure in Vocal Fold Tissues,” Titze, Ingo R., Jan G. Svec, and Peter S. Popolo, JSLHR, Vol. 46, Number 4, pp. 919-932, August 2003.

To measure the exposure to self-induced tissue vibration in speech, three vocal doses were defined and described: distance dose, which accumulates the distance that tissue particles of the vocal folds travel in an oscillatory trajectory; energy dissipation dose, which accumulates the total amount of heat dissipated over a unit volume of vocal fold tissues; and time dose, which accumulates the total phonation time. These doses were compared to a previously used vocal dose measure, the vocal loading index, which accumulates the number of vibration cycles of the vocal folds. Empirical rules for viscosity and vocal fold deformation were used to calculate all the doses from the fundamental frequency (F0) and sound pressure level (SPL) values of speech. Six participants were asked to read in normal, monotone, and exaggerated speech and the doses associated with these vocalizations were calculated. The results showed that large F0 and SPL variations in speech affected the dose measures, suggesting that accumulation of phonation time alone is insufficient. The vibration exposure of the vocal folds in normal speech was related to the industrial limits for hand-transmitted vibration, in which the safe distance dose was derived to be about 500 m. This limit was found rather low for vocalization; it was related to a comparable time dose of about 17 min of continuous vocalization, or about 35 min of continuous reading with normal breathing and unvoiced segments. The voicing pauses in normal speech and dialogue effectively prolong the safe time dose. The derived safety limits for vocalization will likely require refinement based on a more detailed knowledge of the differences in hand and vocal fold tissue morphology and their response to vibrational stress, and on the effect of recovery of the vocal fold tissue during voicing pauses.

In this study, Kay’s Computerized Speech Lab (CSL), Model 4400, was used in conjunction with a DAT recorder and direct DAT interface for multi-channel data acquisition (both acoustic and EGG data) and storage.
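The two simplest accumulations named in the abstract above, total phonation time and number of vibration cycles, can be sketched from a frame-by-frame F0 track. This is a minimal illustration, not the authors' implementation: the frame representation (None or 0 for unvoiced frames) and the function names are assumptions, and the distance and energy dissipation doses are omitted because they depend on empirical rules not given in this listing.

```python
from typing import Optional, Sequence


def _is_voiced(f0: Optional[float]) -> bool:
    # Assumption: unvoiced frames are marked with None or a non-positive F0.
    return f0 is not None and f0 > 0


def time_dose(f0_frames: Sequence[Optional[float]], frame_s: float) -> float:
    """Total voiced time in seconds: one frame duration per voiced frame."""
    return sum(frame_s for f0 in f0_frames if _is_voiced(f0))


def cycle_dose(f0_frames: Sequence[Optional[float]], frame_s: float) -> float:
    """Approximate number of vocal fold cycles: F0 (Hz) times frame duration."""
    return sum(f0 * frame_s for f0 in f0_frames if _is_voiced(f0))


if __name__ == "__main__":
    # Hypothetical track: two voiced frames and two unvoiced frames of 0.5 s.
    frames = [200.0, None, 100.0, 0.0]
    print(time_dose(frames, 0.5), cycle_dose(frames, 0.5))
```

Because the cycle count weights each voiced frame by its F0, two recordings with the same time dose can differ in cycle dose, which is consistent with the abstract's point that phonation time alone is insufficient.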

 

“Perceptual Evaluation of Tracheoesophageal Speech by Naive and Experienced Judges Through the Use of Semantic Differential Scales,” Van As, Corina, Florien J. Koopmans-van Beinum, Louis C.W. Pols, and Frans J.M. Hilgers, JSLHR, Vol. 46, Number 4, pp. 947-959, August 2003.

The present study was conducted to investigate voice quality in tracheoesophageal speech by means of perceptual evaluations and to develop a clinically useful subset of perceptual scales sufficient for these perceptual evaluations. The perceptual ratings were obtained from both naïve and trained raters (speech-language pathologists [SLPs]) after listening to a read-aloud text. The perceptual evaluations were performed by means of 19 semantic bipolar 7-point scales for the naïve raters and 20 semantic bipolar 7-point scales for the trained raters. The trained raters were also asked to judge the overall voice quality as good, reasonable, or poor. Both naïve listeners and trained SLPs were able to perform reliable perceptual judgments. Naïve raters judged the tracheoesophageal voice as more deviant than the trained raters did. Naïve raters made judgments based on 2 underlying perceptual dimensions (voice quality and pitch), whereas the trained raters made judgments based on 4 underlying perceptual dimensions (voice quality, tonicity, pitch, and tempo). These perceptual dimensions were further subdivided into a subset of 4 perceptual scales for the naïve raters and a subset of 8 perceptual scales for the trained raters. This appeared to provide a sufficient coverage of the underlying perceptual dimensions used by the listeners.

The authors used Kay’s Computerized Speech Lab (CSL), Model 4300B (in conjunction with a DAT) for recording the tracheoesophageal speech samples which were evaluated perceptually in this study.

“Changes in the Human Vocal Tract Due to Aging and the Acoustic Correlates of Speech Production: A Pilot Study,” An Xue, Steve and Grace Jianping Hao, JSLHR, Vol. 46, Number 3, pp. 689-701, June 2003.

This investigation used a derivation of acoustic reflection (AR) technology to make cross-sectional measurements of changes due to aging in the oral and pharyngeal lumina of male and female speakers. The purpose of the study was to establish preliminary normative data for such changes due to aging in the formant frequencies of selected spoken vowels and their long-term average spectra (LTAS) analysis. Thirty-eight young men and women and 38 elderly men and women were involved in the study. The oral and pharyngeal lumina of the participants were measured with AR technology, and their formant frequencies were analyzed using the Kay Elemetrics Computerized Speech Lab. The findings have delineated specific and similar patterns of aging changes in human vocal tract configurations in speakers of both genders. Namely, the oral cavity length and volume of elderly speakers increased significantly compared to their young cohorts. The total vocal tract volume of elderly speakers also showed a significant increment, whereas the total vocal tract length of elderly speakers did not differ significantly from their young cohorts. Elderly speakers of both genders also showed similar patterns of acoustic changes of speech production, that is, consistent lowering of formant frequencies (especially F1) across selected vowel productions. Although new research models are still needed to succinctly account for the speech acoustic changes of the elderly, especially for their specific patterns of human vocal tract dimensional changes, this study has innovatively applied the noninvasive and cost-effective AR technology to monitor age-related human oral and pharyngeal lumina changes that have direct consequences for speech production.

Kay’s Computerized Speech Lab (CSL), Model 4300B, was used for analysis of formant frequencies of the word tokens produced by the study’s participants.

 

“Voice, Speech, and Swallowing Outcomes in Laser-Treated Laryngeal Cancer,” Jepsen, Matthew C., Deepak Gurushanthaiah, Nelson Roy, Marshall E. Smith, Steven D. Gray, and R. Kim Davis, Laryngoscope, Vol. 113, Number 6, pp. 923-928, June 2003.

Objective: To describe preliminary voice, speech, and swallowing outcomes in patients treated by endoscopic laser excision of laryngeal cancer with or without adjuvant radiation therapy. Study Design: Retrospective review. Methods: Seventeen surgically treated patients (five T2 glottic and 12 clinically staged T2 supraglottic squamous cell carcinomas) participated in the study. Self-ratings of voice (Voice Handicap Index) and swallowing (M.D. Anderson Dysphagia Inventory) were completed, as well as independent auditory-perceptual ratings of voice and speech recordings. Results: Although no significant difference between Voice Handicap Index, M.D. Anderson Dysphagia Inventory, and listener ratings was identified based on tumor site and irradiation status, there was a trend toward poorer outcomes in patients who received adjuvant radiation therapy. Whereas the patients having supraglottic cancer tended to report better voice but poorer swallowing outcomes, the glottic cancer group displayed the opposite pattern. Severity on Voice Handicap Index correlated significantly with listener severity ratings of speech, suggesting that the patients’ perception of their voice handicap was similar to the listeners’ judgments of their speech severity. Conclusions: The results suggest the following trends: 1) Adjuvant radiation therapy was associated with poorer outcomes for voice, speech, and swallowing and may be associated with more impairment than surgery alone and 2) poorer outcomes on voice and swallowing were observed for the glottic and supraglottic cancer groups, respectively. To bolster these preliminary findings, additional outcomes studies in patients treated with conservation therapy are needed.

Kay’s Computerized Speech Lab (CSL) was used for multiple token recordings and subsequent playback for the auditory-perceptual rating of the patients’ speech.

 

“Acoustic Analysis of Pathological Voices Compressed with MPEG System,” Gonzalez, Julio, Teresa Cervera, and M. Jose Llau, Journal of Voice, Vol. 17, Number 2, pp. 126-139, June 2003.

The MPEG-1 Layer 3 compression schema of audio signal, commonly known as mp3, has caused a great impact in recent years as it has reached high compression rates while conserving a high sound quality. Music and speech samples compressed at high bitrates are perceptually indistinguishable from the original samples, but very little was known about how compression acoustically affects the voice signal. A previous work with normal voices showed a high fidelity of high-bitrate compressions both in voice parameters and the amplitude-frequency spectrum. In the present work, dysphonic voices were tested through two studies. In the first study, spectrograms, long-term average spectra (LTAS), and fast Fourier transform (FFT) spectra of compressed and original samples of running speech were compared. In the second study, intensities, formant frequencies, formant bandwidths, and a multidimensional set of voice parameters were tested in a set of sustained phonations. Results showed that compression of high bitrates (96 and 128 kbps) preserved the relevant acoustic properties of the pathological voices. With compressions at lower bitrates, fidelity decreases, introducing some important alterations. Results from both works, Gonzalez and Cervera and this paper, open up the possibility of using MPEG-compression at high bitrates to store or transmit high-quality speech recordings, without altering their acoustic properties.

In this study, voice samples from Kay’s Disordered Voice Database were used; in addition, Kay’s Multi-Dimensional Voice Program was used to compare ten voice parameters (across original versus compressed speech samples) including Fo, Jitter, Shimmer, Relative Average Perturbation, Amplitude Perturbation Quotient, Noise to Harmonic Ratio, Voice Turbulence Index, Soft Phonation Index, Number of Subharmonic Segments, and Number of Unvoiced Segments.

“Vibrato Rate Adjustment,” Dromey, Christopher, Neisha Carter, and Arden Hopkin, Journal of Voice, Vol. 17, Number 2, pp. 168-178, June 2003.

The goal of the present study was to document the acoustic changes that occur as singers attempt to increase or decrease their vibrato rate to match target stimuli. Eight advanced singing students produced vowels with vibrato in three registers, both naturally and while attempting to match faster or slower rate stimuli. Slower rates were associated with lower intensity and less steady vibrato. Faster rates involved increased vibrato extent in the chest register and increased intensity in the head register. Singers whose spontaneous vibrato rates were naturally either slower or faster tended to also be relatively slower or faster when matching target rates. This ability to modify rate may have beneficial effects on the artistic quality of the voice for performance.

In this study, Kay’s Computerized Speech Lab (CSL), Model 4300B, was used to digitize the acoustic signals; Kay’s Motor Speech Profile (MSP) was subsequently used to analyze each singer’s vibrato.

“Pitch Matching Accuracy of Trained Singers, Untrained Subjects with Talented Singing Voices, and Untrained Subjects with Nontalented Singing Voices in Conditions of Varying Feedback,” Watts, Christopher, Jessica Murphy, and Kathryn Barnes-Burroughs, Journal of Voice, Vol. 17, Number 2, pp. 185-194, June 2003.

At a physiological level, the act of singing involves control and coordination of several systems involved in the production of sound, including respiration, phonation, resonance, and afferent systems used to monitor production. The ability to produce a melodious singing voice (e.g., in tune with accurate pitch) is dependent on control over those motor and sensory systems. To test this position, trained singers and untrained subjects with and without expressed singing talent were asked to match pitches of target pure tones. The ability to match pitch reflected the ability to accurately integrate sensory perception with motor planning and execution. Pitch matching accuracy was measured at the onset of phonation (prephonatory set) before external feedback could be utilized to adjust the voiced source, during phonation when external auditory feedback could be utilized and during phonation when external auditory feedback was masked. Results revealed trained singers and untrained subjects with singing talent were no different in their pitch-matching abilities when measured before or after external feedback could be utilized. The untrained subjects with singing talent were also significantly more accurate than the trained singers when external auditory feedback was masked. Both groups were significantly more accurate than the untrained subjects without singing talent.

Kay’s Computerized Speech Lab (CSL) was used for signal generation and signal acquisition in this study. Kay’s Voice Range Profile software was used to measure each participant’s fundamental frequency range.

“Effect of Hydration and Vocal Rest on the Vocal Fatigue in Amateur Karaoke Singers,” Yiu, Edwin M-L and Rainy MM Chan, Journal of Voice, Vol. 17, Number 2, pp. 216-227, June 2003.

Karaoke singing is a very popular entertainment among young people in Asia. It is a leisure singing activity with the singer’s voice amplified with special acoustic effects in the backdrop of music. Music video and song captions are shown on television screen to remind the singers during singing. It is not uncommon to find participants singing continuously for four to five hours each time. As most of the karaoke singers have no formal training in singing, these amateur singers are more vulnerable to developing voice problems under these intensive singing activities. This study reports the performance of 20 young amateur singers (10 males and 10 females, aged between 20-25 years) on a series of phonatory function tasks carried out during continuous karaoke singing. Half of the singers were given water to drink and short duration of vocal rests at regular intervals during singing and the other half sang continuously without taking any water or rest. The subjects who were given hydration and vocal rests sang significantly longer than those who did not take any water or rest. The voice quality, as measured by perceptual and acoustic measures, and vocal function, as measured by phonetogram, did not show any significant changes during singing in the subjects who were given water and rest during the singing. However, subjects who sang continuously without drinking water and taking rests showed significant changes in the jitter measure and the highest pitch they could produce during singing. These results suggest that hydration and vocal rests are useful strategies to preserve voice function and quality during karaoke singing. This information is useful educational information for karaoke singers.

Kay’s Multi-Dimensional Voice Program (MDVP) was used in conjunction with the Computerized Speech Lab (CSL) for voice assessment, specifically to obtain mean fundamental frequency, jitter (percent), shimmer (dB), and noise-to-harmonic ratios.
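For readers unfamiliar with the perturbation measures named above, the sketch below uses common textbook definitions: jitter (percent) as the mean absolute difference between consecutive glottal periods relative to the mean period, and shimmer (dB) as the mean absolute level difference between consecutive cycle peak amplitudes. The function names and inputs are hypothetical; MDVP's own period extraction and exact computation are not reproduced here.

```python
import math
from typing import Sequence


def jitter_percent(periods: Sequence[float]) -> float:
    """Mean absolute cycle-to-cycle period difference, as % of mean period."""
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period


def shimmer_db(amplitudes: Sequence[float]) -> float:
    """Mean absolute cycle-to-cycle amplitude ratio, expressed in dB."""
    if len(amplitudes) < 2:
        raise ValueError("need at least two amplitudes")
    steps = [abs(20.0 * math.log10(b / a))
             for a, b in zip(amplitudes, amplitudes[1:])]
    return sum(steps) / len(steps)


if __name__ == "__main__":
    # Hypothetical glottal periods (seconds) and peak amplitudes (linear).
    print(jitter_percent([0.0100, 0.0102, 0.0099, 0.0101]))
    print(shimmer_db([1.00, 0.97, 1.02, 0.99]))
```

A perfectly periodic signal with constant amplitude gives zero for both measures; larger values indicate more cycle-to-cycle irregularity.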

“Instrumental and Perceptual Evaluations of Two Related Singers,” Buder, Eugene H. and Teresa Wolf, Journal of Voice, Vol. 17, Number 2, pp. 228-244, June 2003.

The primary goal of this study was to characterize a performer’s singing and speaking voice. One woman was not admitted to a premier choral group, but her sister, who was comparable in physical characteristics and background, was admitted and provided a valuable control subject. The perceptual judgment of a vocal coach who conducted the group’s auditions was decisive in discriminating these 2 singers. The singer not admitted to the group described a history of voice pathology, lacked a functional head register, and spoke with a voice characterized by hoarseness. Multiple listener judgments and acoustic and aerodynamic evaluations of both singers provided a more systematic basis for determining: 1) the phonatory basis for this judgment; 2) whether similar judgments would be made by groups of vocal coaches and speech-language pathologists; and 3) whether the type of tasks (e.g., sung vs. spoken) would influence these judgments. Statistically significant differences were observed between the ratings of vocal health provided by two different groups of listeners. Significant interactions were also observed as a function of the types of voice samples heard by these listeners. Instrumental analyses provided evidence that, in comparison to her sister, the rejected singer had a compromised vocal range, glottal insufficiencies as assessed aerodynamically and electroglottographically, and impaired acoustic quality, especially in her speaking voice.

In this study, Kay’s Computerized Speech Lab (CSL), Model 4300B, was used to acquire voice samples; Kay’s Voice Range Profile (VRP), Model 4326; CSL Pitch Program, Model 4331; Multi-Dimensional Voice Program (MDVP), Model 4305; Electroglottograph (EGG), Model 4338; and Aerophone II, Model 6800, were used to noninvasively evaluate various vocal functions.

 

“En Route to the Three-Dimensional Registration and Analysis of Speech Movements: Instrumental Techniques for the Study of Articulatory Kinematics,” Earnest, Margaret M. and Ludo Max, Contemporary Issues in Communication Science and Disorders, Vol. 30, pp. 5-25, Spring 2003.

Increasing our understanding of the physiological processes underlying speech production is crucial for both theoretical and clinical reasons. Such gains in theory and clinical application, however, depend on the availability and use of technologically advanced instruments that can accurately reveal the processes of interest. Consequently, it is essential that not only researchers but also clinicians and students in communication sciences and disorders have access to up-to-date information about the most recent developments in the area of instrumental analyses of speech production. Those professionals involved in research continually need to make important decisions regarding the most appropriate instruments to address their specific research questions, whereas those in clinical practice need at least a basic understanding of the same information to interpret the contemporary literature and to decide when a specific instrument is ready to be used for diagnosis or treatment. Therefore, it is unfortunate that, despite numerous rapid and exciting advances in analog and digital technology, instrumental speech analysis procedures receive relatively little attention in most speech-language pathology students’ academic curriculum. Hence, the purpose of this paper is to provide a summary of this important information by presenting an overview of various instrumental techniques that are being used in several laboratories or that are currently in an advanced stage of development. To limit the scope of this work, only instruments suitable for the transduction of articulatory movements are presented. Specific applications, capabilities, and limitations of each of the systems are discussed.

In this overview of well-established instrumental techniques, Kay’s Palatometer was cited as one of the most commonly used systems for electropalatography.

 

“A Preliminary Study of Speech Rates in Young Australian English-Speaking Children,” Robb, Michael, Harvey Gilbert, Ricki Reed, and Amanda Bisson, Contemporary Issues in Communication Science and Disorders, Vol. 30, pp. 84-91, Spring 2003.

Speaking rate and articulation rate data are provided for a group of 10 Australian children (aged 30-51 months). Articulation rates were significantly faster than overall speaking rates. Speaking rate and articulation rate were positively correlated with utterances up to seven syllables in length. The results are compared to previous data obtained for older Australian children, as well as data obtained for same-aged American English-speaking children. The clinical applicability of the present data is discussed.

Kay’s Computerized Speech Lab (CSL), Model 4300B, was used for the acoustic analysis of each child’s individual utterances in this study.
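The distinction between the two rate measures in the abstract above can be sketched as follows, assuming the usual convention that speaking rate divides syllables by total utterance time including pauses, while articulation rate excludes pause time. The function names and sample values are hypothetical, and the pause criterion used in the study is not stated in this listing.

```python
def speaking_rate(syllables: int, total_s: float) -> float:
    """Syllables per second over the whole utterance, pauses included."""
    if total_s <= 0:
        raise ValueError("total duration must be positive")
    return syllables / total_s


def articulation_rate(syllables: int, total_s: float, pause_s: float) -> float:
    """Syllables per second over speech time only, pauses excluded."""
    speech_s = total_s - pause_s
    if pause_s < 0 or speech_s <= 0:
        raise ValueError("pause time must be non-negative and below total time")
    return syllables / speech_s


if __name__ == "__main__":
    # Hypothetical utterance: 12 syllables in 4.0 s, of which 1.0 s is pausing.
    print(speaking_rate(12, 4.0), articulation_rate(12, 4.0, 1.0))
```

Whenever an utterance contains any pause time, the articulation rate is higher than the speaking rate, which matches the direction of the difference reported above.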

 

“Laryngeal Effects Of Antigen Stimulation Challenge with Perennial Allergen Dermatophagoides Pteronyssinus,” Reidy, Patrick M., James P. Dworkin, and John H. Krouse, Otolaryngology-Head and Neck Surgery, Vol. 128, Number 4, pp. 455-462, April 2003.

OBJECTIVE: We conducted a pilot study to assess the effects of antigen stimulation on the appearance and function of the larynx.

STUDY DESIGN AND SUBJECTS: The prospective, double-blind, randomized study included 9 adult patients with a skin-prick test positive for Dermatophagoides pteronyssinus.

MAIN OUTCOME MEASURES: Subjects were blindly challenged via nebulizer with either an active antigenic suspension or placebo. Baseline and 30-minute evaluations of the larynx were performed. Assessments included subjective voice and videostroboscopic assessments, acoustic analysis of voice, speech aerodynamic testing, and allergy and voice handicap questionnaires.

RESULTS: Although both inflammation and increased mucus were noted, there were no significant differences between the antigen- and placebo-exposed subjects on any of the measures obtained.

CONCLUSIONS: Our preliminary investigation was not successful in demonstrating a direct causal relationship between antigen exposure and physical or functional changes in the larynx. Future studies will involve modifications to our current methodology, including increasing the concentration of antigen, prolonging the exposure time, and observing for late phase responses.

In this study, laryngeal anatomy and physiology were analyzed using Kay’s RLS, Model 9100. Kay’s CSL, Model 4300, was used for acoustic analysis, specifically to measure jitter, shimmer, Fo, and harmonic-to-noise ratio. The Aerophone was used to obtain the aerodynamic measurements of mean transglottic airflow, glottal resistance, and subglottal pressure.

 

“An Alternative Surgical Procedure for the Treatment of Vocal Fold Retention Cyst,” Chang, Hsin-Pin and Shyue-Yih Chang, Otolaryngology-Head and Neck Surgery, Vol. 128, Number 4, pp. 470-477, April 2003.

OBJECTIVE: Enucleation is the standard surgical procedure for vocal fold retention cyst, but it is sometimes technically challenging when the cyst is large. A simple alternative procedure, which evolved from the concept of marsupialization of cystic lesions, is presented, and the treatment results are reported.

METHODS: Termed the “wide-opening method,” this technique involves the removal of the medial half of the cyst with its overlying mucosa, which creates a large opening that prevents content retention. Twenty-one consecutive patients with a cyst with a diameter of greater than 2 mm were treated by this method. Videostroboscopy and voice function assessments were performed preoperatively and postoperatively.  

RESULTS: The procedure was completed on all patients without difficulty. Voice function assessments showed statistically significant improvement post-operatively. Videostroboscopy showed improved vocal fold vibration postoperatively in all patients, although in 9 patients the postoperative wound with mild local stiffness was observable. Recurrence was noted in 1 patient.

CONCLUSIONS: The wide-opening method is simple in technique, especially when the cyst is large. Voice improvement can be achieved postoperatively, and the recurrence rate is low.

In this study, Kay’s CSL was used to perform acoustic analysis including jitter (%), shimmer (dB), and HNR. Kay’s Aerophone II was used for aerodynamic measurements, while the Voice Range Profile (VRP) was used to obtain pitch range and intensity range.

 

“Vocal Outcome After Endoscopic Cordectomies for TIS and T1 Glottic Carcinomas,” Peretti, Giorgio, Cesare Piazza, Giovanna Cantarella, Cristiano Balzanelli, and Piero Nicolai, Annals of Otology, Rhinology & Laryngology, Vol. 112, Number 2, pp. 174-179, February 2003.

A cohort of 101 patients with previously untreated glottic cancer (15 Tis, 66 T1a, and 20 T1b) who underwent endoscopic CO2 laser excision between January 1995 and December 1997 was prospectively analyzed. The depth and extension of the excision were graded according to the European Laryngological Society Classification including 5 types of cordectomy. All patients were subsequently examined every 2 months for a period ranging from 30 to 66 months (mean, 48 months). The rates of 5-year overall survival, disease-free survival, ultimate local control with laser alone, and laryngeal preservation were 85%, 87%, 93%, and 95%, respectively. Sixty-nine patients underwent, at least 1 year after surgery, videolaryngostroboscopy combined with perceptual and objective evaluation of the voice, and spirometry. Acoustic parameters were compared with those obtained in a matched control group by Kruskal-Wallis test. No statistically significant difference was found (p > .05) between patients submitted to subepithelial (type I) and subligamental (type II) cordectomies and controls.

Kay’s 70-degree rigid endoscope and Digital Strobe, Model 9200, were used to perform the videolaryngostroboscopic examinations in this study. The Multi-Dimensional Voice Program (MDVP), in conjunction with the CSL, Model 4300B, was used for voice assessment, specifically to obtain objective measures of fundamental frequency, jitter, shimmer, and noise-to-harmonic ratio.

 

“Mid-Basal-to-Ceiling Versus Mid-Ceiling-to-Basal Elicitation of Maximum Phonational Frequency Range,” Zraick, Richard I., Mary P. Keyes, James C. Montague, and J. Hope Keiser, Journal of Voice, Vol. 16, Number 3, pp. 317-322, September 2002.

The purpose of this study was to investigate if there was an effect of task on the determination of maximum phonational frequency range (MPFR). Two tasks commonly used to elicit MPFR in clinical voice evaluations were compared. Normal adult females (n = 30) were examined. No statistically significant effect of task was found. Both tasks (mid-basal-to-ceiling and mid-ceiling-to-basal) were found to have a high positive correlation (0.89). Implications of the use of one task to determine maximum phonational frequency range are discussed, as is the possibility of a task effect on determination of other voice parameters.

The Voice Quality Assessment module of the Visi-Pitch II was used to obtain the quantitative acoustic measurements (i.e., max Fo, min Fo, MPFR: hertz; MPFR: semitones) reported in this study.
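The two MPFR units mentioned above are related by a standard conversion: the range in hertz is the difference between the highest and lowest F0, and the range in semitones is 12 times the base-2 logarithm of their ratio. The sketch below illustrates that conversion only; the function names and sample values are hypothetical and it does not reproduce the Visi-Pitch II software.

```python
import math


def mpfr_hz(min_f0: float, max_f0: float) -> float:
    """Frequency range in hertz between the lowest and highest F0."""
    if min_f0 <= 0 or max_f0 < min_f0:
        raise ValueError("require 0 < min_f0 <= max_f0")
    return max_f0 - min_f0


def mpfr_semitones(min_f0: float, max_f0: float) -> float:
    """Frequency range in semitones (12 semitones per octave)."""
    if min_f0 <= 0 or max_f0 < min_f0:
        raise ValueError("require 0 < min_f0 <= max_f0")
    return 12.0 * math.log2(max_f0 / min_f0)


if __name__ == "__main__":
    # Hypothetical speaker: lowest F0 220 Hz, highest F0 880 Hz (two octaves).
    print(mpfr_hz(220.0, 880.0), mpfr_semitones(220.0, 880.0))
```

The semitone scale is logarithmic, so it allows ranges to be compared across speakers with different habitual pitch, whereas the same range in hertz depends on where in the frequency scale it falls.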

 

“Spectral Changes Due to Performance Environment in Singers, Nonsingers, and Actors,” Rothman, Howard B., W.S. Brown, Jr., and J. Ronald LaFond, Journal of Voice, Vol. 16, Number 3, pp. 323-332, September 2002.

From postrecording interviews of professional singers, it was hypothesized that recording environments, i.e., sound-treated environment versus an auditorium, may induce different vocal behaviors. To test this hypothesis, three groups consisting of nonsingers, singers, and actors were recorded in two different recording environments: a sound-treated booth (IAC) and an auditorium (AUD). Three recordings were obtained from each participant: recording one (IAC) and two (AUD1) required the participants to read in a normal voice; recording three (AUD2) required participants to pretend that they were “performing” before a full house. Results indicated that only the singers and the actors exhibited significant spectral and/or frequency/duration differences from one recording environment to another, with the most dramatic differences exhibited by the singers. It was concluded that the environment in which we record experimental samples from professional voice users, especially singers, should be considered as a variable that can affect results.

Frequency measures in this study were obtained using the Multi-Dimensional Voice Program (MDVP); duration and spectral measures were obtained with the FFT spectrum analysis option in the CSL.

 

“Poor Voice Quality in Future Elite Vocal Performers and Professional Voice Users,” Timmermans, B., M.S. De Bodt, F.L. Wuyts, A. Boudewijns, G. Clement, A. Peeters, and P.H. Van de Heyning, Journal of Voice, Vol. 16, Number 3, pp. 372-382, September 2002.

The voice quality of 86 occupational voice users, i.e., students of a high school for audiovisual communication, was assessed by means of a multi-dimensional test battery containing: the GRBAS scale, videolaryngostroboscopy, maximum phonation time, jitter, lowest intensity, highest frequency, dysphonia severity index (DSI), and voice handicap index (VHI). In a questionnaire on daily habits the prevalence of smoking, eating habits, and vocal abuse were recorded. A comparison of the voice characteristics of the future occupational voice users with a control group revealed significant differences. The results of the VHI and the DSI of these students revealed significantly worse scores than the score of a control group characterized by no vocal complaints. Moreover, the questionnaire on daily habits showed that the future elite vocal performers and professional voice users take less precaution for the care of their voices. These findings support the importance of a good balanced vocal coaching.

In this study, the Multi-Dimensional Voice Program (MDVP), Model 4305, was used for acoustic analysis, specifically jitter (%); the CSL was used to analyze fundamental frequency.

 

“Nasalance Changes After Functional Endoscopic Sinus Surgery,” Soneghet, Renata, Rodrigo Paula Santos, Mara Behlau, Walter Habermann, Gerhard Friedrich, and Heinz Stammberger, Journal of Voice, Vol. 16, Number 3, pp. 392-397, September 2002.

Forty adult patients diagnosed with chronic rhinosinusitis who underwent functional endoscopic sinus surgery (FESS) were analyzed with respect to postoperative resonatory voice changes. For evaluation the patients were asked about their subjective impression of voice changes using a questionnaire. An objective assessment was performed by determining the so-called nasalance using the Nasometer (Kay Elemetrics) preoperatively, at the immediate postoperative follow-up (2 days after surgery), and approximately 1 month after surgery. The mean nasalance values increased significantly one month after FESS whereas the immediate postoperative control (2 days after surgery) showed a decrease of nasalance. Although FESS is a minimally invasive procedure, it can change the acoustic characteristics of the vocal tract in the long term and produce a significant increase in nasality. The authors strongly recommend that clinicians inform all patients, in particular voice professionals, about the possible effects of endonasal sinus surgery on voice quality.

The Nasometer was used to assess nasalance at three different stages of this study: preoperatively, 2 days postoperatively, and 1 month postoperatively.

 

“Voice and Treatment Outcome from Phonosurgical Management of Early Glottic Cancer,” Zeitels, Steven M., Robert E. Hillman, Ramon A. Franco, and Glenn W. Bunting, Annals of Otology, Rhinology, and Laryngology, Vol. 111, Number 12, Supplement 190, pp. 3-20, December 2002.

Phonosurgical management of early glottic cancer has evolved considerably, but objective vocal outcome data are sparse. A prospective clinical trial was done on 32 patients with unilateral cancer (T1a in 28 and T2a in 4) who underwent ultranarrow-margin resection; 15 had resection superficial to the vocal ligament, and 17 deep to it. The subepithelial infusion technique facilitated selection of these patients for the appropriate procedure. All are cancer-free without radiotherapy or open surgery. Involvement of the anterior commissure (22/32) or the vocal process (15/32) of the arytenoid cartilage did not influence local control. Nine of 17 patients had resection of paraglottic musculature, and all underwent medialization reconstruction by lipoinjection and/or Gore-Tex laryngoplasty. Eight of the 17 had resections deep to the vocal ligament, but without vocalis muscle, and 1 of the 8 underwent medialization. Posttreatment vocal function measures were obtained for all patients. A clear majority of the patients displayed normal values for average fundamental frequency (72%) during connected speech, and normal noise-to-harmonic ratio (75%) and average glottal airflow (91%) measures during sustained vowels. Smaller majorities of patients displayed normal values for average sound pressure level (SPL: 59%) during connected speech and for maximum ranges for fundamental frequency (56%) and SPL (59%). Fewer than half of the patients displayed normal values for sustained vowel measures of jitter (45%), shimmer (22%), and maximum phonation time (34%). Almost all patients had elevated subglottal pressures and reduced values for the ratio of SPL to subglottal pressure (vocal efficiency). There were significant improvements in a majority of patients for most vocal function measures after medialization reconstruction. 
Normal or near-normal conversation-level voices were achieved in most cases, regardless of the disease depth, by utilization of a spectrum of resection and reconstruction options. These favorable results are based on establishing aerodynamic glottal competency and preserving the layered microstructure of noncancerous glottal tissue.

In this study, CSL was used for acoustic analysis, Aerophone II for aerodynamic measurements, and the Computer-Integrated RLS Stroboscopy System for laryngeal imaging and functional assessment.

 

“Phonomicrosurgery in Singers and Performing Artists: Treatment Outcomes, Management Theories, and Future Directions,” Zeitels, Steven M., Robert E. Hillman, Rosemary Desloge, Marcello Mauri, and Patricia B. Doyle, Annals of Otology, Rhinology, and Laryngology, Vol. 111, Number 12, Supplement 190, pp. 21-40, December 2002.

Phonomicrosurgery in performing artists has historically been approached with great trepidation, and vocal outcome data are sparse. The vocal liability of surgically disturbing the superficial lamina propria (SLP) and epithelium must be balanced with the inherent detrimental vocal effect of the lesion(s). A prospective investigation was performed on 185 performing artists who underwent phonomicrosurgical resection of 365 lesions: 201 nodules, 71 polyps, 66 varices and ectasias, 13 cysts, 8 keratotic lesions, 2 granulomas, 2 Reinke’s edema, and 2 papillomas. Nearly all patients with SLP lesions reported improvement in their postsurgical vocal function. This subjective result was supported by objective acoustic analysis and aerodynamic measures. All postsurgical objective vocal function measures fell within normal limits, including a few that displayed presurgical abnormalities. However, given the relative insensitivity of standard objective measures to assess higher-level vocal performance-related factors, it is even more noteworthy that 8 of 24 objective measures displayed statistically significant postsurgical improvements in vocal function. Such changes in objective measures mostly reflect overall enhancement in the efficiency of voice production. Phonomicrosurgical resection of vocal fold lesions in performing artists is enjoying an expanding role because of a variety of improvements in diagnostic assessment, surgical instrumentation and techniques, and specialized rehabilitation. Most of the lesions are the result of phonotrauma and arise within the SLP. Successful management depends on prudent patient selection and counseling, ultraprecise technique, and vigorous vocal rehabilitation. Furthermore, an understanding of the vocal function and dysfunction of this high-performance population provides all otolaryngologists who manage laryngeal problems with valuable information that they can extrapolate for use in their practices.

The authors used CSL, Electroglottograph, Aerophone II, and Computer-Integrated RLS Stroboscopy System for objective instrumental assessment pre- and post-surgery in this study. 

 

“Electroglottographic Evaluation of Gender and Vowel Effects During Modal and Vocal Fry Phonation,” Chen, Yang, Michael P. Robb, and Harvey R. Gilbert, Journal of Speech, Language, and Hearing Research, Vol. 45, pp. 821-829, October 2002.

Two unique characteristics of vocal fry register are the occurrence of multiple opening and closing phases occurring within one vibratory cycle and a similar vocal fundamental frequency (Fo) between women and men. The present study tested the hypothesis that significant differences in glottal symmetry exist between women and men during modal phonation, with no significant differences during vocal fry phonation. Consistent with previous studies of modal phonation, it was also hypothesized that a vowel effect would be apparent during vocal fry phonation. Five women and 5 men sustained modal and vocal fry phonations in four vowel contexts (a, ae, u, i). Vocal Fo, duration of opening and closing phase, and contact symmetry (speed quotient) were derived from the electroglottographic (EGG) waveforms. Both female and male speakers demonstrated significantly higher SQ values in vocal fry register than in their modal register, indicating a longer opening phase duration per glottal cycle. Women demonstrated a significantly greater increase in SQ during vocal fry phonations than men, indicating greater asymmetry between opening and closing durations. The results confirmed that gender differences in vocal fold contact behavior not only exist during modal register but also during vocal fry register. No vowel effects on vocal fold contact behavior as inferred using the SQ measure were found for either modal or vocal fry registers. Possible contributing factors to multiple opening and closing phases occurring within a vibratory cycle are discussed.

The Computerized Speech Lab (CSL), Model 4300B, was used for data acquisition and measurement in this study; the Laryngograph was used for obtaining the electroglottographic (EGG) waveforms.

 

“The Influence of Pitch and Loudness Changes on the Acoustics of Vocal Tremor,” Dromey, Christopher, Paul Warrick, and Jonathan Irish, Journal of Speech, Language, and Hearing Research, Vol. 45, pp. 879-890, October 2002.

The effect of tremor on phonation is to modulate an otherwise steady sound source in its amplitude, fundamental frequency, or both. The severity of untreated vocal tremor has been reported to change under certain conditions that may be related to muscle tension. In order to better understand the phenomenon of vocal tremor, its acoustic properties were examined as individuals volitionally altered their pitch and loudness. These voice conditions were anticipated to alter the tension of the intrinsic laryngeal muscles. The voices of 10 individuals with a diagnosis of voice tremor were recorded before participating in a longitudinal treatment study. They produced vowels at low and high pitch and loudness levels as well as in a comfortable voice condition. Acoustic analysis quantified the amplitude and frequency modulations of the speakers’ voices across the various conditions. Individual speakers varied in the way the pitch and loudness changes affected their tremor, but the following statistically significant effects for the speakers as a group were observed: Higher pitch phonation was associated with a more rapid rate for both amplitude and frequency modulations. Amplitude modulation became faster for louder phonation. Low-pitched phonation led to decreases in the extent of amplitude tremor. Varying pitch led to dramatic changes in the phase relationship between amplitude and frequency modulation in some of the speakers, whereas this effect was not apparent in other speakers.

The Computerized Speech Lab (CSL), Model 4300B, was used for data acquisition; the Multi-Dimensional Voice Program (MDVP) was used for tremor analysis.

 

“Changes in Speech Production in a Child With a Cochlear Implant: Acoustic and Kinematic Evidence,” Goffman, Lisa, David J. Ertmer, and Christa Erdle, Journal of Speech, Language, and Hearing Research, Vol. 45, pp. 891-901, October 2002.

A method is presented for examining change in motor patterns used to produce linguistic contrasts. In this case study, the method is applied to a child receiving new auditory input following cochlear implantation. This child experienced hearing loss at age 3 years and received a multi-channel cochlear implant at age 7 years. Data collection points occurred both pre- and post-implant and included acoustic and kinematic analysis. Overall, this child’s speech output was transcribed as accurate across the pre- and post-implant periods. Post-implant, with the onset of new auditory experience, acoustic durations showed a predictable maturational change, usually decreasing in duration. Conversely, the spatiotemporal stability of speech movements initially became more variable post-implantation. The auditory perturbations experienced by this child during development led to changes in the physiological underpinnings of speech production, even when speech output was perceived as accurate.

The Computerized Speech Lab (CSL), Model 4300B, was used for acquisition of speech samples and acoustic analysis performed in this study. Acoustic analysis included wideband spectrograms, amplitude and waveform displays to measure durations of initial consonants, consonant-vowel transitions, and vowels. 

 

“Voice Amplification Versus Vocal Hygiene Instruction for Teachers With Voice Disorders: A Treatment Outcomes Study,” Roy, Nelson, Barbara Weinrich, Steven D. Gray, Kristine Tanner, Sue Walker Toledo, Heather Dove, Kim Corbin-Lewis, and Joseph C. Stemple, Journal of Speech, Language, and Hearing Research, Vol. 45, pp. 625-638, August 2002.

Voice problems are common among schoolteachers. This prospective, randomized clinical trial used patient-based treatment outcomes measures combined with acoustic analysis to evaluate the effectiveness of two treatment programs. Forty-four voice-disordered teachers were randomly assigned to one of three groups: voice amplification using ChatterVox portable amplifier (VA, n=15), vocal hygiene (VH, n=15), and a non-treatment control group (n=14). Before and after a 6-week treatment phase, all teachers completed: (a) the Voice Handicap Index (VHI), an instrument designed to appraise the self-perceived psychological consequences of voice disorders; (b) a voice severity self-rating scale; and (c) an audio recording for later acoustic analysis. Based on pre- and post-treatment comparisons, only the amplification group experienced significant reductions on mean VHI scores (p=.045), voice severity self-ratings (p=.012), and the acoustic measures of percent jitter (p=.031) and shimmer (p=.008). The non-treatment control group reported a significant increase in level of vocal handicap as assessed by the VHI (p=.012). Although most pre- to post-treatment changes were in the desired direction, no significant improvements were observed within the VH group on any of the dependent measures.

Between-group comparisons involving the three possible pairings of the groups revealed a pattern of results to suggest that: (a) compared to the control group, both treatment groups (i.e., VA and VH) experienced significantly more improvement on specific outcomes measures and (b) there were no significant differences between the VA and VH groups to indicate superiority of one treatment over another. Results, however, from a post-treatment questionnaire regarding the perceived benefits of treatment revealed that, compared to the VH group, the VA group reported more clarity of their speaking and singing voice (p=.061), greater ease of voice production (p=.001), and greater compliance with the treatment program (p=.045). These findings clearly support the clinical utility of voice amplification as an alternative for the treatment of voice problems in teachers.

In this study, acoustic samples were digitized using the Computerized Speech Lab (CSL), Model 4300B. Quantitative acoustic analysis (jitter, shimmer and noise-to-harmonic ratio) was performed with the Multi-Dimensional Voice Program (MDVP).
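Percent jitter and percent shimmer, the perturbation measures reported above, are commonly defined as the mean absolute difference between consecutive cycles (period for jitter, peak amplitude for shimmer) divided by the mean value, times 100. The sketch below implements that textbook definition only. It assumes the cycle-by-cycle periods and amplitudes have already been extracted, the function name is hypothetical, and it is not a description of MDVP's own algorithm.

```python
def perturbation_percent(cycle_values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Pass glottal periods (e.g. in ms) for percent jitter, or per-cycle
    peak amplitudes for percent shimmer. Textbook definition; an
    assumption, not MDVP's documented implementation.
    """
    n = len(cycle_values)
    if n < 2:
        raise ValueError("need at least two cycles")
    mean_value = sum(cycle_values) / n
    if mean_value == 0:
        raise ValueError("mean of cycle values must be nonzero")
    diffs = [abs(b - a) for a, b in zip(cycle_values, cycle_values[1:])]
    mean_abs_diff = sum(diffs) / (n - 1)
    return 100.0 * mean_abs_diff / mean_value


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    periods_ms = [5.00, 5.04, 4.98, 5.02, 5.01]
    amplitudes = [0.80, 0.78, 0.82, 0.79, 0.81]
    print("jitter %:", perturbation_percent(periods_ms))
    print("shimmer %:", perturbation_percent(amplitudes))
```

A perfectly periodic signal gives 0% on both measures; larger values indicate more cycle-to-cycle irregularity.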

 

“Coarticulation and Formant Transition Rate in Young Children Who Stutter,” Chang, Soo-Eun, Ralph N. Ohde, and Edward G. Conture, Journal of Speech, Language, and Hearing Research, Vol. 45, pp. 676-688, August 2002.

The purpose of this study was to assess anticipatory coarticulation and second formant (F2) transition rate (FTR) of speech production in young children who stutter (CWS) and who do not stutter (CWNS). A total of 14 CWS and 14 age- and gender-matched CWNS in three age groups (3-, 4-, and 5-year-olds) participated in a picture-naming task that elicited single-word utterances. The initial consonant-vowel (CV) syllables of these utterances, comprising either bilabial [b m] or alveolar [d n s z] consonants and a number of vowels [i I e E Ï u o O AI AU], were used for acoustic analysis. To assess coarticulation and speech movement velocity, the F2 onset frequency and F2 vowel target frequency (for coarticulation) and FTR (for speech movement velocity) were computed for each CV syllable and for each participant. Based on these measures, locus equation statistics of slope, y-intercept, and standard error of estimate as well as the FTR were analyzed. Findings revealed a significant main effect for place of articulation and a significantly larger difference in FTR between the two places of articulation for CWNS than for CWS. Findings suggest that the organization of the FTR production for place of articulation may not be as contrastive or refined in CWS as in CWNS, a subtle difficulty in the speed of speech-language production, which may contribute to the disruption of their speech fluency.

The Computerized Speech Lab (CSL), Model 4300B, was used for acoustic analysis performed in this study.
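The locus equation statistics named in the abstract (slope, y-intercept, and standard error of estimate) come from a straight-line fit of F2 onset frequency against F2 vowel target frequency across the CV syllables for one place of articulation. A minimal sketch of that fit, assuming ordinary least squares and formant values already measured in hertz, is given below. The function name and sample values are hypothetical and are not taken from the study.

```python
import math


def locus_equation(f2_vowel_hz, f2_onset_hz):
    """Fit F2 onset = intercept + slope * F2 vowel by ordinary least squares.

    Returns (slope, intercept, standard_error_of_estimate), where the
    standard error uses n - 2 degrees of freedom.
    """
    n = len(f2_vowel_hz)
    if n != len(f2_onset_hz) or n < 3:
        raise ValueError("need at least three paired measurements")
    mean_x = sum(f2_vowel_hz) / n
    mean_y = sum(f2_onset_hz) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel_hz)
    if sxx == 0:
        raise ValueError("F2 vowel values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(f2_vowel_hz, f2_onset_hz))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(f2_vowel_hz, f2_onset_hz))
    see = math.sqrt(sse / (n - 2))
    return slope, intercept, see


if __name__ == "__main__":
    # Illustrative values only (Hz), not data from the study.
    vowel = [2300.0, 1900.0, 1500.0, 1100.0, 900.0]
    onset = [2100.0, 1850.0, 1600.0, 1300.0, 1200.0]
    slope, intercept, see = locus_equation(vowel, onset)
    print(slope, intercept, see)
```

In the locus equation literature, a slope near 1 is usually read as strong anticipatory coarticulation (the onset follows the vowel), and a slope near 0 as little coarticulation (the onset stays near a fixed locus).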

“Laryngeal Collagen Injection as an Adjunct to Medialization Laryngoplasty,” Hoffman, Henry, Daniel McCabe, Timothy McCulloch, Sung M. Jin, and Michael Karnell, Laryngoscope, Vol. 112, pp. 1407-1413, August 2002.

OBJECTIVES/HYPOTHESIS: Dysphonia associated with laryngeal paralysis may be identified in the short term postoperatively or may develop years after successful medialization laryngoplasty. In selected cases, laryngeal collagen injection permits further medialization of one or both vocal folds by small increments to improve phonation after medialization thyroplasty. The study seeks to determine whether collagen injections result in measurable improvements in voice quality and vocal function when offered to select patients who have received medialization thyroplasty.

STUDY DESIGN: Retrospective review of patient charts and voice database.

METHODS: Seven patients were treated with Zyderm II collagen using indirect mirror laryngoscopy with a curved injection apparatus. Changes in voice quality and function were assessed by comparing measures obtained before treatment (mean period, 5.6 d), shortly after treatment (mean period, 38.1 d), and in the long term after treatment (mean period, 226 d).

RESULTS: Mean self-ratings of the patient, clinician’s ratings, and objective measures demonstrated measurable improvement in vocal function after collagen injection.

CONCLUSIONS: The office-based procedure offers a simple, efficient adjunct to open techniques of medialization laryngoplasty. Techniques of anesthesia, injection, and patient selection are discussed.

The authors used Kay’s stroboscopy system for vocal fold imaging and the Multi-Dimensional Voice Program (MDVP) for quantitative acoustic measurements reported in the study.

 

“Acoustic Signature of the Normal Swallow: Characterization by Age, Gender, and Bolus Volume,” Cichero, Julie A. Y. and Bruce E. Murdoch, Annals of Otology, Rhinology, and Laryngology, Vol. 111 (7), pp. 623-632, July 2002.

Despite growing clinical use, cervical auscultation suffers from a lack of research-based data. One of the strongest criticisms of cervical auscultation is that there has been little research to demonstrate how dysphagic swallowing sounds are different from normal swallowing sounds. In order to answer this question, however, one first needs to document the acoustic characteristics of “normal,” non-dysphagic swallowing sounds. This article provides the first normative database of normal swallowing sounds for the adult population. The current investigation documents the acoustic characteristics of normal swallowing sounds for individuals from 18 to more than 60 years of age over a range of thin liquid volumes. Previous research has shown the normal swallow to be a dynamic event. The normal swallow is sensitive to aging of the oropharyngeal system, and also to the volume of bolus swallowed. The current investigation found that the acoustic signals generated during swallowing were sensitive to an individual’s age and to the volume of the bolus swallowed. There were also some gender-specific differences in the acoustic profile of the swallowing sound. It is anticipated that the results will provide a catalyst for further research into cervical auscultation.

The Computerized Speech Lab (CSL), Model 4300B, was used for digitizing and analyzing the swallowing sounds in this study.

 

“A Preliminary Report on Micronized AlloDerm Injection Laryngoplasty,” Pearl, Adam W., Peak Woo, Rosemary Ostrowski, Jackie Mojica, David Mandell, and Peter Costantino, Laryngoscope, Vol. 112, pp. 990-996, June 2002.

OBJECTIVES: To report the preliminary data of voice and quality-of-life improvement after micronized AlloDerm injection laryngoplasty in patients with unilateral vocal cord paralysis.

STUDY DESIGN: A prospective study was conducted in patients with unilateral vocal cord paralysis who underwent injection laryngoplasty with micronized AlloDerm.

METHODS: Preoperative and postoperative patient evaluation consisted of videostrobolaryngoscopy, computer voice analysis, airflow, and voice handicap index (VHI) assessment. All injections were conducted with the patient under general anesthesia using the Storz injector system and a 22-gauge spinal needle.

RESULTS: Fourteen patients received injection with an average amount of 0.641 mL. Twelve patients were available for evaluation. Initial results at 4 weeks (n=12) showed significant increase in habitual phonation time from 3.84 to 6.72 seconds (P<.01) and a decrease in airflow from 0.616 to 0.295 L/s (P<.01). The VHI rating improved from 62.8 to 37.5 (P<.01). Jitter and shimmer also improved significantly (P<.05). Stroboscopic findings showed complete closure of glottic gap in 10 patients with excellent return of mucosal wave on the injected side. The mucosal wave return after injection was rapid with little evidence of tissue reaction. Postoperative follow-up at 3 months (n=8) demonstrated slight resorption of the material, but sustained excellent voice was noted in 87.5%. Minimal morbidity and tissue reaction were noted.

CONCLUSIONS: Micronized AlloDerm appears to be a safe new material that is suitable for injection laryngoplasty. Longer results are pending.

The authors used Kay instrumentation for objective assessment of patient performance before and after surgery. The Digital Strobe was used for stroboscopic imaging. The Computerized Speech Lab (CSL) and the Multi-Dimensional Voice Program (MDVP) were used for acoustic analysis. Real-Time Pitch (CSL software module) was used for maximum phonation time assessment. Aerophone II was used for aerodynamic analysis.

 

“Fundamental Frequency Onset and Offset Behavior: A Comparative Study of Children and Adults,” Robb, Michael P. and Allan B. Smith, Journal of Speech, Language, and Hearing Research, Vol. 45, pp. 446-456, June 2002.

Short-term changes in vowel fundamental frequency (Fo) immediately preceding (Fo offset) and following (Fo onset) production of voiceless obstruents were examined in groups of 4-year-olds, 8-year-olds, and 21-year-olds. Definitive patterns of laryngeal behavior were observed for each measure. Fo was found to lower significantly at vowel offset across age groups, with no significant differences noted between groups, suggesting that Fo offset is simply an acoustic consequence of producing a voiceless obstruent preceded by a vowel. The Fo at vowel onset was high and significantly decreased thereafter. Age-related differences were identified for Fo onset with 4-year-olds in that their Fo rose to a lesser degree than that of adults. However, adult females demonstrated a greater change in both Fo onset and Fo offset behavior than adult males and children, suggesting that age-related differences in Fo behavior are likely to be influenced by sex. The results are discussed with regard to the physiologic constraints of Fo surrounding voiceless obstruent production in children and adults.

The Computerized Speech Lab (CSL), Model 4300B, was used for data digitization and analysis done in this study.

 

“Speaking Slowly: Effects of Four Self-Guided Training Approaches on Adults’ Speech Rate and Naturalness,” Logan, Kenneth J., Rosalyn R. Roberts, Aneesha P. Pretto, and Megan J. Morey, American Journal of Speech-Language Pathology, Vol. 11, pp. 163-174, May 2002.

Speech-language pathologists often ask parents of children who stutter to reduce their conversational pace when talking with their children. Little is known, however, about how best to help parents accomplish this task. Two experiments were conducted to examine this issue. In Experiment 1, adult females altered speech rate via one of four self-guided methods. Post-training speech rates for all four experimental groups (n=8) were significantly slower than those of speakers in a control group. The extent of rate reduction varied significantly across groups, and speakers rated their resultant speech as unnatural. In Experiment 2, 39 female listeners rated the naturalness of sentences from the five groups in Experiment 1. Naturalness ratings were higher for the Control group than for a group using a self-devised rate-reduction method (SDM). In turn, SDM ratings were higher than those for groups trained to alter articulation rate and intra-sentence pauses. Across groups, the slower a speaker’s post-training speech rate, the less natural listeners judged the speech to sound (r=.95). Results suggest that although none of the methods were clearly superior, adults can readily produce moderately slower, relatively natural-sounding speech using self-devised methods. Speakers’ and listeners’ perceptions of speech naturalness may differ considerably, however, and this must be considered during training.

Speech samples in this study were digitized and analyzed using the CSL, Model 4300B. Specifically, spectrographic and waveform displays were used for making the numerous precise durational measurements (e.g., pause time, syllabic rate, word duration, etc.) related to overall speech rate for the various rate reduction techniques.

 

“Nasalance, Nasality, Voice, and Articulation After Uvulopalatopharyngoplasty,” Van Lierde, Kristiane, John Van Borsel, and Mieke Moerman, Laryngoscope, Vol. 112, pp. 873-878, May 2002.

OBJECTIVES/HYPOTHESIS: The main purpose of the study was to determine the impact of uvulopalatopharyngoplasty (UPPP) on nasalance and nasality. It was hypothesized that nasalance would change from the presurgical to the postsurgical condition because the surgical protocol involves removal of palatal tissue. An additional objective of the study was to provide objective and subjective data about changes in voice and articulation after UPPP. Because the surgical procedure of UPPP does not involve laryngeal tissue, it was hypothesized that the voice characteristics remain relatively stable. Because of removal of effective velar length, articulation problems of the uvular /R/ can occur in the Dutch language.

STUDY DESIGN: Prospective study in which 26 men were studied before (1 week before UPPP) and after surgery.

METHODS: The Nasometer was used to obtain nasalance scores. The mirror-fogging test, a perceptual evaluation of each subject’s readings, and the Gutzmann and the Bzoch hypernasality tests were used for the assessment of nasality. For the assessment of articulation, a phonetic analysis was performed. Voice assessment included a perceptual rating of the voice and a determination of fundamental frequency.

RESULTS: No significant differences were found between the conditions before and after surgery regarding nasalance (except for the vowel /i/), nasality, and voice. Regarding articulation, only 1 patient showed a derhotacized /R/.

CONCLUSIONS: The findings of the study indicate that UPPP does not have an impact on nasality, voice, and articulation. Regarding nasalance, no significant nasalance change occurred after UPPP, except for the high vowel /i/.

The Nasometer, Model 6200, was used to assess nasalance pre- and post-surgically in this study. Fundamental frequency measurements were made using the Multi-Dimensional Voice Program (MDVP), a software option for the Computerized Speech Lab (CSL).

 

“Acoustic Cues to the Voicing Feature in Tracheoesophageal Speech,” Searl, Jeffrey P. and Mary A. Carpenter, Journal of Speech, Language, and Hearing Research, Vol. 45, pp. 282-294, April 2002.

Tracheoesophageal (TE) speakers often have difficulty producing the voiced/voiceless distinction. This limitation has been attributed to use of the pharyngoesophageal segment as the phonatory source. The nature of this tissue may preclude precise control of voicing onset, a contributing cue to a phoneme’s voicing feature, at least in laryngeal speech. The purpose of this study was to determine whether voiced and voiceless consonants produced by the TE speakers could be differentiated from those produced by laryngeal speakers using four acoustic measures associated with the voicing characteristic of consonants in fricative cognate pairs embedded in a carrier phrase. Three of the four acoustic measures contributed significantly to the discriminant models that differentiated accurately perceived TE and laryngeal samples. The three variables were consonant sound pressure level, consonant duration, and preceding vowel duration. In general, values for each measure were higher/longer for the TE group. The discriminant functions were interpreted as a reflection of the TE speaker attempts at overarticulation.

The CSL, Model 4300B, was used to make the voice onset time (VOT), preceding vowel duration, consonant duration, and consonant dB SPL measurements used in this study.

 

“Computer-Assisted Voice Analysis: Establishing a Pediatric Database,” Campisi, Paolo, Ted L. Tewfik, John J. Manoukian, Melvin D. Schloss, Elaine Pelland-Glais, and Nader Sadeghi, Archives of Otolaryngology - Head and Neck Surgery, Vol. 128, pp. 156-160, February 2002.

OBJECTIVES: To establish and characterize the first pediatric normative database for the Multi-Dimensional Voice Program, a computerized voice analysis system, and to compare the normative data with the vocal profiles of patients with vocal fold nodules.

DESIGNS: A cross-sectional, observational design was used to establish the normative database. The comparative study was completed using a case-control design.

SETTING: University-based outpatient pediatric otolaryngology clinic.

PARTICIPANTS: One hundred control subjects (50 boys and 50 girls) aged 4 to 18 years contributed to the normative database. The voices of 26 patients (19 boys and 7 girls) with bilateral vocal fold nodules were also analyzed.

MAIN OUTCOME MEASURES: Demographic data, including sex, age, height, weight, body mass index, and cigarette smoke exposure, were obtained. The Multi-Dimensional Voice Program extracted up to 33 acoustic variables from each voice analysis.

RESULTS: The mean (SEM) values of each of the acoustic variables are presented. At age 12, boys experience a dramatic decrease in fundamental frequency measurements. The voices of patients with vocal fold nodules had significantly elevated frequency perturbation measurements compared with control subjects (p<.001).

CONCLUSIONS: The vocal profile of children is uniform across all girls and prepubescent boys. Patients with vocal fold nodules demonstrated a consistent acoustic profile characterized by an elevation in frequency perturbation measurements. Normal acoustic reference ranges may be used to detect vocal fold pathologic abnormalities and to monitor the effects of voice therapy.

The Multi-Dimensional Voice Program (MDVP), in conjunction with a CSL, Model 4300B, was used for this pediatric normative data study.

 

“An Acoustic Analysis of Excellent Female Esophageal, Tracheoesophageal, and Laryngeal Speakers,” Bellandese, Mary H., Jay W. Lerman, and Harvey R. Gilbert, Journal of Speech, Language, and Hearing Research, Vol. 44, pp. 1315-1320, December 2001.

Acoustic data for female esophageal speakers is sparse, particularly with regard to characteristics of female tracheoesophageal speakers. This study quantified and compared six acoustic characteristics of excellent female tracheoesophageal (TE), standard esophageal (SE), and laryngeal (LA) speakers. Results indicated there were no significant differences between TE and SE speakers with regard to mean Fo of sustained /a/, mean Fo (reading), signal-to-noise ratio, total duration of passage read, number of pauses, and syllables per minute. Significant differences were found between LA speakers and both alaryngeal groups for all variables, with the exception of mean Fo (reading). 

The Computerized Speech Lab (CSL, Model 4300B) was used for data acquisition of speech samples in this study.

 

“Laryngoscopic, Acoustic, and Environmental Characteristics of High-Risk Vocal Performers,” Hoffman-Ruddy, Bari, Jeffrey Lehman, Carl Crandell, David Ingram, and Christine Sapienza, Journal of Voice, Vol. 15, Number 4, pp. 543-552, December 2001.

Vocal performance often requires excessively high vocal demand. In particular, “high risk” performers, a group of individuals who use their voices at their maximum effort level, are often exposed to unique vocal abuse characteristics which include high environmental and performance demands and inconsistencies of cast performance. Three categories of high-risk performers were studied: musical theater, choral ensemble, and street theater. Musical theater performers produce a Broadway, West End “belting” style voice. Street theater performers use a high-energy pitch-varying dialogue in order to imitate a desired character voice. Choral ensemble performance requires group cohesion and blending of four-part harmony. The melodies require sustained vocal durations within each of the respective registers. For each of these studied groups, vocal tasks of sustained production of /i/ and /a/ were subjected to analysis. Acoustic measures included fundamental frequency, standard deviation of fundamental frequency, jitter percent, shimmer percent, and noise-to-harmonic ratio. Laryngostroboscopic parameters were assessed during sustained /i/. Environmental acoustic sound field measurements were made using an A weighting and linear weighting sound pressure level. These weightings were used to describe noise levels and vocal output, respectively, within performance environments. Results of the analysis suggest that high-risk performers are a unique performance type defined by distinctive acoustic, laryngostroboscopic, and environmental characteristics.

The Computerized Speech Lab (Model 4300B) and Multi-Dimensional Voice Program (MDVP) were used for acoustic analysis in this study. For videostroboscopic analysis of the singers, Kay’s Computer-Integrated Stroboscopy system was used.

 

“A Comparison of Radiation-Induced and Presbylaryngeal Dysphonia,” Behrman, Allison, Allan L. Abramson, and David Myssiorek, Otolaryngology-Head and Neck Surgery, Vol. 125, Number 3, pp. 193-200, September 2001.

OBJECTIVE: The goal of this study was to assess voice after radiotherapy compared with patients with presbylaryngeal dysphonia.

STUDY DESIGN AND SETTING: Prospective assessment of 20 patients aged 60+ years who remained free of disease longer than 1 year after radiotherapy for T1 squamous cell carcinoma and retrospective review of 46 patients aged 60+ with presbylaryngeal dysphonia, conducted at a tertiary care, academic hospital. Assessment data included videostroboscopy, spectrography, voice range profile, and Voice Handicap Index.

RESULTS: Eighty percent of the radiotherapy patients reported a voice disorder. Acoustic data and functional measures reflected similar limitations and abnormalities for both groups. A high incidence of glottal gap in all patients may have been associated with increased mucosal stiffness in the radiotherapy group and vocal fold atrophy in the presbylaryngeal group.

CONCLUSION: Patient perception and functional outcome of voice were similar for both groups, despite differences in etiology of abnormal vocal fold vibratory behavior.

SIGNIFICANCE: Radiotherapy in older individuals may yield dysphonia that is no greater than that caused by normal aging.

Kay's RLS Stroboscopy System and Computerized Speech Lab (CSL) were used in this study. Specifically, spectrography and the Voice Range Profile (software option for CSL) were used for acoustic analysis.

 

“A Description of Phonetic, Acoustic, and Physiological Changes Associated with Improved Intelligibility in a Speaker with Spastic Dysarthria,” Roy, Nelson, Herbert A. Leeper, Michael Blomgren, and Rosalea M. Cameron, American Journal of Speech-Language Pathology, Vol. 10, pp. 274-290, August 2001.

Spastic dysarthria is a motor speech disorder produced by bilateral damage to the direct (pyramidal) and indirect (extrapyramidal) activation pathways of the central nervous system. This case report describes the recovery of an individual with severe spastic dysarthria and illustrates the close relationship between intelligibility measures and acoustic and physiological parameters. Detailed phonetic feature analyses combined with acoustic and physiological information helped to clarify (a) the loci of the intelligibility deficit, (b) the features of deviant speech whose improvement would lead to the greatest gains with treatment, and (c) the changes contributing to improvement in intelligibility observed over a 30-month treatment/recovery period. Though auditory-perceptual analysis remains the foundation of day-to-day dysarthria assessment, this case illustrates the potential for instrumental assessment to (a) supplement perceptual assessment techniques, (b) parse speech subsystem deficits, and (c) track the effects of interventions.

The Computerized Speech Lab (CSL, Model 4300B) was used for spectrographic and LPC analysis for inspection of defined phonetic contrasts, vowel space, and second formant transitions. Additionally, the Nasometer was used for quantification of nasalance using standardized passages.

 

“Emergence of a Vowel System in a Young Cochlear Implant Recipient,” Ertmer, David J., Journal of Speech, Language, and Hearing Research, Vol. 44, pp. 803-813, August 2001.

This report chronicles changes in vowel production by a congenitally deaf child, Hannah, who received a multichannel cochlear implant at 19 months. The emergence of Hannah's vowel system was monitored by transcribing vocalic segments from spontaneous utterances produced during two 30-minute recording sessions before implant surgery and 12 monthly recording sessions after her implant was activated. Vowel types were included in her inventory whenever transcribers independently agreed that a vocalization contained an allophone of a given vowel type. Hannah exhibited three vowel types before implantation. A total of nine different vowel types were observed during her first year of implant experience, and a full range of place and height categories was represented. Acoustic analyses revealed that Hannah's vowel space was near normal in size and the formant structures of /i/ and /u/ were distinctive from other point vowels. Formant regions for /æ/ and /a/ showed some overlap. Taken together with a previous report of her vocal development (D.J. Ertmer & J.A. Mellon, 2001), Hannah appears to have made substantial progress in speech development during her first year of implant use.

The Computerized Speech Lab (CSL, Model 4300B) was used for acoustic analysis of vowels in this study. Spectrographic analysis with overlaid LPC formant traces was used to quantify the subject's vowel space.

 

“Voice Activity and Participation Profile: Assessing the Impact of Voice Disorders on Daily Activities,” Ma, Estella P-M. and Edwin M-L. Yiu, Journal of Speech, Language, and Hearing Research, Vol. 44, pp. 511-524, June 2001.

Traditional clinical voice evaluation focuses primarily on the severity of voice impairment, with little emphasis on the impact of voice disorders on the individual's quality of life. This study reports the development of a 28-item assessment tool that evaluates the perception of voice problems, activity limitation, and participation restriction using the International Classification of Impairments, Disabilities and Handicaps-2 Beta-1 concept (World Health Organization, 1997). The questionnaire was administered to 40 subjects with dysphonia and 40 control subjects with normal voices. Results showed that the dysphonic group reported significantly more severe voice problems, limitation in daily voice activities, and restricted participation in these activities than the control group. The study also showed that the perception of a voice problem by the dysphonic subjects correlated positively with the perception of limitation in voice activities and restricted participation. However, the self-perceived voice problem had little correlation with the degree of voice-quality impairment measured acoustically and perceptually by speech pathologists. The data also showed that the aggregate scores of activity limitation and participation restriction were positively correlated, and the extent of activity limitation and participation restriction was similar in all except the job area. These findings highlight the importance of identifying and quantifying the impact of dysphonia on the individual's quality of life in the clinical management of voice disorders.

The Multi-Dimensional Voice Program (MDVP), in conjunction with a CSL 4300B, was used for the quantitative acoustic analysis done in this study.

 

“Failed Medialization Laryngoplasty: Management by Revision Surgery,” Woo, Peak, Adam W. Pearl, Ming-Wang Hsiung, and Peter Som, Otolaryngology–Head and Neck Surgery, Vol. 124, Number 6, June 2001.

OBJECTIVE: The purpose of this study was to evaluate the cause of immediate and late medialization laryngoplasty failures and to describe their management.

METHODS: A retrospective analysis was performed in 20 patients who underwent revision surgery after failed medialization laryngoplasty. Analysis was based on preoperative spiral CT scan, preoperative and postoperative videostrobolaryngoscopy, and phonatory function measures.

RESULTS: Three major types of failures were identified. The most common problem was arytenoid rotation with a persistent posterior glottic gap (11 of 20). Malposition or wrong size of the implants resulted in a lateralized vocal fold or false vocal fold medialization (6 of 20). Three patients had implants that were extruding. Late atrophy and bowing resulted in a glottal gap (2 of 20). One patient had fibrosis around the implant requiring removal. Spiral CT scan of the larynx located the implant precisely and showed the degree of arytenoid rotation. Patients with arytenoid rotation and posterior gap had revision medialization with arytenoid adduction. Revision medialization was performed in 11 patients, arytenoid adduction in 12 patients, lipoinjection in 2 patients, and 4 implants were removed. The voice was improved in 15 patients. Improved voice was correlated with improved phonation time and reduced phonatory airflow rates.

CONCLUSION: Immediate and late failures of medialization laryngoplasty are due to several possible causes. Revision surgery is feasible and highly successful. To select between the surgical alternatives, work up should include preoperative analysis of vocal function, videostrobolaryngoscopic analysis, and spiral CT of the larynx.

The authors used Kay's Digital Stroboscopy System, CSL, MDVP, Real-Time Pitch, and Aerophone II for analysis of patients' vocal performance pre- and post-revision surgery.

 

“An Investigation of a Modal-Falsetto Register Transition Hypothesis Using Helox Gas,” Spencer, Martin L. and Ingo R. Titze, Journal of Voice, Vol. 15, Number 1, pp. 15-24, March 2001.

This study concerned the effect of the first subglottal formant on the modal-falsetto register transition in males and females. Phonations using air and a helium-oxygen mixture (helox) were used in a comparative study to tease apart possible acoustic and myoelastic contributions to involuntary register transitions. Recordings of the first subglottal formant and its accompanying bandwidths, and the lower and upper shift point marking the outer boundaries of abrupt register transitions, were obtained via a neck-mounted accelerometer, and analyzed using spectrograms and power spectra on a Kay-5500 Sona-Graph. The four subjects had their hearing masked bilaterally with speech level noise to increase the likelihood of involuntary register transition via minimized auditory feedback. In three of the four test subjects, registration was surmised to be primarily a laryngeal event, as evidenced by the similar frequency dependency of voice breaks in both air and helox. It may be hypothesized that subglottal resonance influenced register transition in the fourth subject, as voice breaks rose with helox-induced phonation; however, this result did not reach statistical significance. Therefore, in this experiment subglottal resonance was not found to have a significant influence on register transition as originally hypothesized. 

The DSP Sona-Graph, Model 5500 was used to record and analyze (waveform, spectrogram, power spectra, and intensity contours) the accelerometer signals in this study.

 

“Objective Voice Analysis after Autologous Fat Injection for Unilateral Vocal Fold Paralysis,” Hartel, Dana M., et al., Annals of Otology, Rhinology, and Laryngology, Vol. 110 (3), March 2001.

This study was designed to objectively compare a patient's voice after onset of unilateral vocal fold paralysis (UVFP) to his or her own normal voice, and to compare the results after treatment by intrafold injection of autologous fat. Acoustic recordings were obtained for 2 male patients before thoracic surgery and after the onset of iatrogenic left UVFP. Vocal fold augmentation was performed 10 days after UVFP. The acoustic recordings were repeated within 3 days and at 1 month. The phonation quotient, pitch perturbation quotient, amplitude perturbation quotient, harmonics-to-noise ratio, cepstral peak prominence, and long-term average spectrum were analyzed. All parameters improved after treatment, with a return to preparalytic values for most. During the first month, some deterioration was noted. This is the first study comparing a subject's own normal voice to his or her voice after vocal fold augmentation. We recommend overinjection of fat if vocal fold atrophy is expected.

The core CSL software was used for spectrographic, cepstral, and long-term average spectrum analysis. MDVP was used for additional acoustic measures (PPQ, APQ and N/H ratio) to quantify the patients' status over time.

 

“Acoustic Comparison of Vowel Articulation in Normal and Reverse Phonation,” Robb, Michael, et al., JSLHR, Vol. 44, pp. 118-127, February 2001.

Acoustic characteristics of the vowels /i, u, a/ produced by adult females and males during normal phonation were compared with the same vowels produced on deliberate ingressive airflow (i.e., “reverse” phonation). Results of the analysis revealed the average fundamental frequency (Fo) of reverse phonation to be significantly higher than the corresponding normal phonations. There were no significant differences noted in the vocal tract resonance (F1 and F2 frequency) values for /i/ during normal and reverse phonation. However, the F1 values for /a/ were significantly lower, and the F2 values for /u/ significantly higher, during reverse phonation. The results are discussed with regard to differences in the articulatory control of the speech mechanism during reverse phonation as compared to normal expiratory phonation. Also discussed are the implications of using reverse phonation as a voice management technique.

CSL was used for acoustic analysis of the normal and reverse phonation vowel utterances. Fundamental frequency, spectrographic, and linear predictive coding (LPC) analysis within CSL software were used. 

 

“Beginning to Talk at 20 Months: Early Vocal Development in a Young Cochlear Implant Recipient,” Ertmer, David J. and Jennifer A. Mellon, JSLHR, Vol. 44, pp. 192-206, February 2001.

Early vocal development, consonant production, and spoken vocabulary were examined in a deaf toddler whose multichannel cochlear implant was activated at 20 months. Parent-child interactions were recorded before implantation and at monthly intervals during the first year of implant use. The child’s utterances were classified according to developmental levels from the Stark Assessment of Early Vocal Development. The emergence of consonant types and consonant features was documented through listener transcription. Parent reports were used to monitor oral vocabulary growth. A large increase in canonical and postcanonical utterances was observed after 5 months of implant use, and these advanced prelinguistic forms were dominant in all subsequent recording sessions. Increases in the diversity of consonant types and features suggested that auditory information was used to increase phonetic diversity. It was reported that the child understood almost 240 words and spoke approximately 90 words after one year of implant experience. The combination of cochlear implantation at a young age, family support, and regular intervention appeared to facilitate efficient early vocal development and gains in spoken vocabulary.

CSL was used for coding the developmental level of the infant's utterances based on acoustic analysis. Specifically, spectrographic, amplitude, and waveform displays were used to assist with utterance coding.

“Acoustic, Aerodynamic, and Videostroboscopic Features of Bilateral Vocal Fold Lesions,” Rosen, Clark A., Lori E. Lombard, and Thomas Murray, Annals of Otology, Rhinology, and Laryngology, Vol. 109 (9), September 2000.

Successful treatment of bilateral vocal fold lesions depends on the accuracy of the diagnosis. For example, the preferred treatment for vocal fold nodules is voice therapy; in contrast, treatment for a unilateral vocal fold lesion with a contralateral reactive vocal fold lesion (UVFL/RL) usually involves phonosurgery and voice therapy. Differentiation between vocal fold nodules and a UVFL/RL is often challenging. The purpose of this study was to facilitate diagnostic accuracy and improve treatment for patients with bilateral vocal fold lesions by attempting to identify distinct features of patients with either vocal fold nodules or a UVFL/RL with acoustic, aerodynamic, stroboscopic, and patient self-perception measures. The objective voice analysis, Voice Handicap Index, and laryngovideostroboscopic examinations of 85 patients with bilateral vocal fold lesions were reviewed. The results indicated that the patients with a UVFL/RL presented a diagnostic profile that was significantly different from that of patients with vocal fold nodules. Statistically significant differences were found for 1) symmetry of vocal fold vibration, 2) amplitude perturbations, 3) estimated subglottic pressure, and 4) Voice Handicap Index. These results suggest that a composite assessment of acoustic, aerodynamic, and videostroboscopic phonatory features facilitates differentiation between patients with vocal fold nodules and those with a UVFL/RL. The improved diagnostic accuracy afforded by multiparametric assessment provides a comprehensive framework for the treatment of these two distinct vocal fold disorders.

Instruments used in this study were the CSL (MDVP software) for acoustic analysis, Aerophone II for aerodynamic measurements, and the Computer Integrated Stroboscopy System for vocal fold imaging. 

“The Dysphonia Severity Index: An Objective Measure of Vocal Quality Based on a Multiparameter Approach,” Wuyts, Floris L., Marc S. De Bodt, et al., Journal of Speech, Language, and Hearing Research, Vol. 43, pp. 796-809, June 2000.

The vocal quality of a patient is modeled by means of a Dysphonia Severity Index (DSI), which is designed to establish an objective and quantitative correlate of the perceived vocal quality. The DSI is based on the weighted combination of the following selected set of voice measurements: highest frequency (Fo-High in Hz), lowest intensity (I-Low in dB), maximum phonation time (MPT in sec), and jitter (%). The DSI is derived from a multivariate analysis of 387 subjects with the goal of describing, purely based on objective measures, the perceived voice quality. It is constructed as DSI = 0.13 x MPT + 0.0053 x Fo-High - 0.26 x I-Low - 1.18 x Jitter (%) + 12.4. The DSI for perceptually normal voices equals +5 and for severely dysphonic voices –5. The more negative the patient's index, the worse is his or her vocal quality. As such, the DSI is especially useful to evaluate therapeutic evolution of dysphonic patients. Additionally, there is a high correlation between the DSI and the Voice Handicap Index score.
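As an illustration only, the weighted combination quoted in the abstract above can be written as a small function. This is a minimal sketch: the function name, parameter names, and the example input values are hypothetical and are not taken from the study or from any KayPENTAX software; only the coefficients come from the abstract.

```python
def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_pct):
    """Return the DSI from the four measurements named in the abstract.

    mpt_s       maximum phonation time, in seconds
    f0_high_hz  highest fundamental frequency, in Hz
    i_low_db    lowest intensity, in dB
    jitter_pct  jitter, in percent
    """
    # Coefficients and constant as quoted in the abstract.
    return (0.13 * mpt_s
            + 0.0053 * f0_high_hz
            - 0.26 * i_low_db
            - 1.18 * jitter_pct
            + 12.4)


# Hypothetical input values, chosen only to show the call.
example_dsi = dysphonia_severity_index(
    mpt_s=20.0, f0_high_hz=800.0, i_low_db=50.0, jitter_pct=0.5
)
print(round(example_dsi, 2))
```

With the signs as given, a longer phonation time and a higher top frequency raise the index, while a louder softest phonation and more jitter lower it, which matches the abstract's statement that more negative values indicate worse vocal quality.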

CSL (MDVP software) was used for acoustic analysis in this study. 

 

“Reliability of the Multi-Dimensional Voice Program for the Analysis of Voice Samples of Subjects with Dysarthria,” Kent, Ray D., Houri K. Vorperian, and Joseph D. Duffy, American Journal of Speech-Language Pathology, Vol. 8, pp. 129-136, May 1999.

Computer-based analysis systems are increasingly available for the clinical assessment of speech and voice functions. These systems have the potential to provide immediate quantitative information to assist clinical assessment and treatment. The Multi-Dimensional Voice Program (MDVP) is a computer program that can calculate as many as 33 acoustic parameters from a voice sample. The MDVP appears to have potential for rapid quantitative assessment of voice in both research and clinical applications. This report evaluates the robustness and reliability of MDVP for vocal analyses of 32 individuals with dysarthria of various etiologies. It is concluded that the reliability is generally very good and that MDVP has potential as a tool for the semi-automatic analysis of voice samples in dysarthria. Some parameters appear to hold particular value in the description of voice qualities in these speech disorders.

MDVP is a software option for the CSL and Multi-Speech. A subset of MDVP parameters is also provided in Visi-Pitch III and Sona-Speech.  

 

“Voice Tremor and Psychological Stress,” Mendoza, Elvira and Gloria Carballo, Journal of Voice, Vol. 13, Number 1, pp. 105-112, March 1999.

This study examines vocal tremor and its decrease in high versus low experimentally-induced stress situations. We have analyzed the Amplitude Tremor Index (ATRI) and Frequency Tremor Intensity Index (FTRI) from the prolongation of vowel /a/ for approximately 5 seconds, under baseline conditions and under 3 different test conditions (reading of tongue twister, reading of tongue twister with delayed auditory feedback [DAF], and spelling of alphabet in reverse order), in a 2-test series, with and without demanding experimental instructions (Experiments 1 and 2, respectively). Inclusion of experimental instructions was considered as making the first test situation more stressful than the second one. Our results show a significant decrease in ATRI variable when reading a tongue twister with DAF in relation to the baseline for the first test but not in the second, which suggests a suppression or significant reduction of amplitude tremor only in high-stress situations.

The Multi-Dimensional Voice Program (MDVP), in conjunction with a CSL 4300B, was used to calculate the Amplitude Tremor Index (ATRI) and Frequency Tremor Intensity Index (FTRI) referred to in the study.

 

 

Aerodynamics

“Marsupialization of Vocal Fold Retention Cysts: Voice Assessment and Surgical Outcomes,” Hsu, Cheng-Ming, Gian Luca Armas, and Chih-Ying Su, Annals of Otology, Rhinology & Laryngology, Vol. 118 (4), pp. 270-275, April 2009.

Objectives: Although total excision remains the standard treatment for vocal fold retention cysts, postoperative deficits and damage to the vocal folds still occur. Marsupialization is a more conservative technique and can prevent these complications.

Methods: In this prospective clinical series, 25 patients underwent the marsupialization procedure. Under a direct laryngomicroscope, the cystic wall margin was retracted medially with microforceps. An incision was made with microscissors encircling the equator of the cyst. The cyst contents drained from the cystic cavity when the capsule was sectioned. For 7 patients with concomitant marked vocal fold atrophy, strap muscle transposition laryngoplasty was simultaneously performed.

Results: All patients had complete preoperative and postoperative voice parameter analyses. A subjective improvement in voice quality was reported by 23 of the 25 patients (92%). A small recurrent vocal fold cyst was detected in 1 patient. Small vocal fold deficits and sulcus vocalis were detected in 2 and 4 patients, respectively. Only 1 patient described a worse voice after operation. No other complications were noted.

Conclusions: Marsupialization of vocal fold retention cysts is a simple, relatively safe, and effective surgical treatment. Voice improvement, a low incidence of recurrence, and minimal vocal fold deficits demonstrate the validity of this technique. Marked preoperative vocal fold atrophy or postoperative glottal gap can be managed with medialization laryngoplasty.

In this study, the KayPENTAX RLS system was used to perform laryngostroboscopy, the KayPENTAX CSL, Model 4300B, to perform acoustic analysis, and the KayPENTAX Aerophone II, Model 6800, to measure aerodynamic parameters.

 

“Baseline laryngeal effects among individuals with dust mite allergy,” Krouse, John H., James P. Dworkin, Michael A. Carron, and Robert J. Stachler, Otolaryngology-Head and Neck Surgery, Vol. 139, No. 1, pp. 149-151, July 2008.

Objective: To examine baseline effects of perennial allergy on laryngeal appearance, laryngeal function, and perceived vocal handicap among individuals without current allergy or voice symptoms.

Data Sources: This pilot study included 47 adults: 21 with positive and 26 with negative skin test responses for the dust mite, Dermatophagoides pteronyssinus.

Methods: Subjects were tested for sensitivity to dust mite antigen by prick testing. Laryngeal appearance and function were studied with laryngovideostroboscopy, acoustic and speech aerodynamic analysis, and voice sampling. These parameters were blindly analyzed by three trained examiners. Subjects also completed the Voice Handicap Index (VHI) as a measure of vocal handicap.

Results: Subjects allergic to dust mites perceived significantly greater vocal handicap on the VHI than did nonallergic subjects. No significant differences were noted between groups in laryngeal appearance or function.

Conclusion: These pilot data suggest that, at baseline, allergic individuals perceived greater vocal handicap than their nonallergic counterparts (P = 0.04), even in the absence of current allergy symptoms or observable physical or functional abnormalities. These preliminary observations can serve as an impetus for further research into this important area, including the potential interrelationship between acid reflux disease and allergic laryngeal inflammation.

In this study, laryngeal anatomy and physiology were analyzed using the KayPENTAX Digital Strobe system with a 70-degree rigid endoscope. Acoustic and speech aerodynamic parameters and subglottal pressure were assessed using the KayPENTAX Computerized Speech Lab (CSL) and the KayPENTAX Aerophone.

 

“Functional Significance of Arytenoid Adduction with the Suture Attaching to Cricoid Cartilage versus to Thyroid Cartilage for Unilateral Paralytic Dysphonia,” Su, Chih-Ying, Shang-Shyue Tsai, Hui-Ching Chuang, and Jeng-Fen Chiu, Laryngoscope, Vol. 115, pp. 1752-1759, October 2005.

Objective: In the treatment of unilateral paralytic dysphonia, traditional arytenoid adduction is designed to place suture through the muscular process of the arytenoid attaching anteriorly to the thyroid ala. In contrast with the suture direction of this technique, a new paramedian approach to arytenoid adduction anchors anteroinferiorly to the cricoid cartilage, mimicking the force action of the lateral cricoarytenoid muscle (the major adductor of the larynx). This study investigated the influence of these changes in suture direction on the vocal fold level as well as the vocal outcomes in these two techniques of arytenoid adduction.

 

Study Design: A prospective clinical series.

 

Methods: Thirty patients with unilateral paralytic dysphonia underwent medialization laryngoplasty with arytenoid adduction and strap muscle transposition. Under local anesthesia, the thyroid lamina on the involved side was paramedially separated. The inner perichondrium was carefully elevated away from the overlying thyroid cartilage, carrying the dissection posteriorly to the level of the superior and inferior cornua. The lamina was retracted laterally, the inner perichondrium was opened near the midpoint, and the lateral cricoarytenoid muscle identified. Tracing the muscle fibers posterosuperiorly, the muscular process of the arytenoid was identified. A 2-0 Prolene suture was placed through the muscular process and temporarily tied to the anterolateral aspect of the thyroid ala (AA-thyroid suture). Intraoperative acoustic and perceptual assessments were performed. After releasing the tie, the suture was anchored to the cricoid cartilage at the origin of the lateral cricoarytenoid muscle (AA-cricoid suture). Voice assessments were repeated, and the outcomes of the two tests were compared. The choice of the type of arytenoid adduction suture was made intraoperatively according to which condition provided better vocal performance. After securing the suture, a bipedicled strap muscle flap was transposed into the space between the lamina and inner perichondrium and the thyroid cartilages sutured back into place.

 

Results: The intraoperative acoustic and perceptual assessments revealed the vocal performance was significantly better with AA-cricoid suture than the AA-thyroid suture in this series. No major complications occurred in the study.

 

Conclusion: This study suggests that arytenoid adduction with suture attachment along the longitudinal axis of the lateral cricoarytenoid muscle to the cricoid cartilage is more physiologic and effective than attaching the suture to the thyroid ala. A paramedian approach to arytenoid adduction with or without strap muscle transposition is a safe and effective method for treatment of unilateral paralytic dysphonia.

In this study, the pre- and post-surgical videolaryngostroboscopy was performed using a KayPENTAX RLS, Model 9100B. For acoustic analysis, the KayPENTAX CSL, Model 4300B, was used. Aerodynamic parameters were determined using the KayPENTAX Aerophone II, Model 6800.

 

“Physiological Features of Dysarthria in Friedreich’s Ataxia,” Cahill, Louise, M., Deborah G. Theodoros, Bruce E. Murdoch, and John MacMillan, Asia Pacific Journal of Speech, Language and Hearing, Vol. 8, No. 3, pp. 221-228, 2003.

Abstract: Due to the paucity of literature concerning the motor speech impairment in persons with Friedreich’s ataxia (FA), the aim of the study was to investigate the perceptual and physiological features of dysarthria in a 30-year-old male with FA, 22 years post diagnosis. The four speech subsystems were comprehensively evaluated using physiological measures of respiratory (Respitrace), laryngeal (Laryngograph, Aerophone II), velopharyngeal (Nasometer) and articulatory (lip and tongue pressure transduction systems) function. Perceptual speech evaluations included the Frenchay Dysarthria Assessment, the Assessment of Intelligibility of Dysarthric Speech and a perceptual analysis of a speech sample. The findings were compared with those of non-neurologically impaired controls, matched for age and sex. Results revealed marked impairment in respiratory, velopharyngeal and articulatory function, and mild laryngeal dysfunction. Based on these results the subject was rated as displaying a moderate mixed dysarthria (flaccid/ataxic), with a mild to moderate decrease in overall intelligibility. The results of the assessments will be discussed in relation to the possible effects of FA on motor speech function.

Researchers in this study used Kay’s Aerophone II, Model 6800 for aerodynamic evaluation; velopharyngeal function was assessed using Kay’s Nasometer, Model 6400.

 

“The Effects of HearFones on Speaking and Singing Voice Quality,” Laukkanen, Anne-Maria, Nils Peter Mickelson, Marja Laitala, Tiina Syrjä, Arla Salo, and Marketta Sihvo, Journal of Voice, Vol. 18, No. 4, pp. 475-487, December 2004.

HearFones (HF) have been designed to enhance auditory feedback during phonation. This study investigated the effect of HF (1) on sound perceivable by the subject, (2) on voice quality in reading and singing, and (3) on voice production in speech and singing at the same pitch and sound level.

Test 1: Text reading was recorded with two identical microphones in the ears of a subject. One ear was covered with HF, and the other was free. Four subjects attended this test. Tests 2 and 3: A reading sample was recorded from 13 subjects and a song from 12 subjects without and with HF on. Test 4: Six females repeated [pa:p:a] in speaking and singing modes without and with HF on at the same pitch and sound level.

Long-term average spectra were made (Tests 1-3), and formant frequencies, fundamental frequency, and sound level were measured (Tests 2 and 3). Subglottic pressure was estimated from oral pressure in [p], and simultaneously electroglottography (EGG) was registered during voicing on [a:] (Test 4). Voice quality in speech and singing was evaluated by three professional voice trainers (Tests 2-4).

HF seemed to enhance sound perceivable at the whole range studied (0-8 kHz), with the greatest enhancement (up to ca 25 dB) being at 1-3 kHz and at 4-7 kHz. The subjects tended to decrease loudness with HF (when sound level was not being monitored). In more than half of the cases, voice quality was evaluated “less strained” and “better controlled” with HF. When pitch and loudness were constant, no clear differences were heard but closed quotient of the EGG signal was higher and the signal more skewed, suggesting a better glottal closure and/or diminished activity of the thyroarytenoid muscle.

Researchers in this study used Kay’s Aerophone II, Model 6800, to measure sound level and oral pressure.

 

“Medialization laryngoplasty with strap muscle transposition for vocal fold atrophy with or without sulcus vocalis,” Su, Chih-Ying, Shang-Shyue Tsai, Jeng-Fen Chiu, and Chu-An Cheng, Laryngoscope, Vol. 114, pp. 1106-1112, June 2004.

Objective: Vocal fold atrophy with or without sulcus vocalis may result in a spindle-shaped glottal incompetence (SGI). Because of varying drawbacks with all existing materials (e.g., Silastic block, Teflon, fat, etc.) used for medialization or augmentation of the atrophic vocal folds, there is a need to supplant these materials with a more stable, autologous tissue to correct the SGI.

Study Design: Thirty-two patients with vocal fold atrophy underwent medialization laryngoplasty with strap muscle transposition.

Methods: Under local or general anesthesia, the thyroid lamina on the more affected side was vertically incised 5 mm off the midline. The inner perichondrium was carefully elevated from the overlying thyroid ala. Care was taken not to enter the laryngeal lumen. After dividing the thyrohyoid and cricothyroid membranes, the lamina was retracted laterally. To accommodate the muscle flap more easily, the caudal edge of the lamina was trimmed using a small burr. A bipedicled strap muscle flap was then transposed into the space between the lamina and the paraglottic soft tissue. The thyroid cartilages were carefully sutured back in place. All patients underwent pre- and post-operative voice evaluations including laryngostroboscopy, perceptual assessment, and acoustic and aerodynamic analyses. Patients who had been followed up for more than 3 months were enrolled in the study.

Results: A total of 27 of the 32 patients with complete pre- and postoperative voice function measurements were included in the analysis. Vocal improvement was demonstrated in 26 of these 27 (97%) patients. No dyspnea or other major complications were noted in any patients.

Conclusions: The results indicate that medialization laryngoplasty with strap muscle transposition is a prosthesis-free, safe, and effective technique for correcting SGI caused by vocal fold atrophy.

Videolaryngostroboscopy was performed pre- and post-operatively using Kay’s RLS, Model 9100, stroboscopic light source. Kay’s CSL, Model 4300B, and Aerophone II, Model 6800, were also used in this study to perform the acoustic and aerodynamic analyses, respectively.

 

“The Effect of Auditory Feedback on Phonation Threshold Pressure Measurement,” Morgan, Michael D., Miguel A. Triana, and Thomas J. Milroy, Journal of Voice, Vol. 18, No. 1, pp. 46-55, March 2004.

The effect of auditory feedback on phonation threshold pressure (Pth) measurement was investigated in 14 females with normal, untrained voices. Two measurement systems (Glottal Enterprises MS 100, a circumferentially vented mask, and Kay Elemetrics Aerophone II, a non-circumferentially vented mask) were examined under three conditions: (1) masked, (2) no mask, and (3) masked with enhanced auditory feedback (acoustic signal placed at the ears through headphones). Masked with enhanced auditory feedback, in addition to subject training, significantly lowered Pth values regardless of mask design. The amount of auditory feedback provided by different mask designs was investigated and revealed a significant difference. Clinical significance of different auditory feedback levels provided by the two mask designs was investigated. Direct comparison of the mean values between systems was not possible because of each system’s design and calibration. Comparisons were accomplished by subtracting means of select-paired conditions (masked/no mask; masked/masked plus masked with enhanced auditory feedback) within each system and then comparing these difference scores from the same paired conditions between each system. No clinical significance in difference scores was revealed because of varying amounts of auditory feedback provided by the masks. Results support the use of enhanced auditory feedback, in addition to subject training, when measuring Pth.

In this study, Kay’s CSL, Model 4300B, and Multi-Dimensional Voice Program (MDVP) were used to ensure normal limits for Fo, jitter, and shimmer; Kay’s Real-Time Pitch Program was used to obtain each subject’s habitual pitch; Kay’s Voice Range Profile (VRP) program was used to obtain the nearest semitone to a subject’s habitual pitch; and Kay’s Aerophone II, Model 6800, was used to acquire oral pressures and airflow measures.