Do you see what I’m saying? Optimal Visual Enhancement of Speech Recognition in Noisy Environments
Poster Presentation
Lars A. Ross
Program in Cognitive Neuroscience, Department of Psychology, City College of the City University of New York, and Nathan Kline Institute for Psychiatric Research
Dave Saint-Amour
Nathan Kline Institute for Psychiatric Research
Victoria Leavitt
Program in Neuropsychology, Department of Psychology, Queens College of the City University of New York, and Nathan Kline Institute for Psychiatric Research
Daniel C. Javitt
Nathan Kline Institute for Psychiatric Research
John J. Foxe
Program in Cognitive Neuroscience, Department of Psychology, City College of the City University of New York, and Nathan Kline Institute for Psychiatric Research
Abstract ID Number: 154
Full text: Not available
Last modified: March 21, 2005
Abstract
Viewing a speaker’s articulatory movements substantially improves a listener’s ability to understand spoken words, especially under noisy environmental conditions. It has been claimed that this gain is most pronounced when auditory input is weakest, an effect that has been related to a well-known principle of multisensory integration: inverse effectiveness. In contrast, we show that this principle does not apply to audio-visual speech perception. Rather, the gain from viewing visual articulations is maximal at intermediate signal-to-noise ratios (SNRs), well above the lowest auditory SNR at which the recognition of whole words is significantly different from zero. The multisensory speech system appears to be optimally tuned for SNRs between the extremes at which it relies on either the visual (speech-reading) or the auditory modality alone, forming a window of maximal integration at intermediate SNR levels. At these intermediate levels, the multisensory enhancement of speech recognition is considerable, amounting to more than a threefold performance improvement relative to the auditory-alone condition. Additional data from patients with schizophrenia show that the visual gain in speech recognition in noise is less pronounced than in controls, while unimodal auditory speech recognition remains intact.
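The abstract does not define its enhancement metric. A minimal sketch, assuming gain is computed as the relative improvement of audio-visual over auditory-alone word recognition (the symbols Gain, P_AV, and P_A are illustrative and not taken from the study):

\[
\mathrm{Gain}(\mathrm{SNR}) = \frac{P_{AV}(\mathrm{SNR}) - P_{A}(\mathrm{SNR})}{P_{A}(\mathrm{SNR})}
\]

where \(P_{AV}\) and \(P_{A}\) are the proportions of whole words correctly recognized in the audio-visual and auditory-alone conditions at a given SNR. Under this reading, "more than a threefold performance improvement" corresponds to \(P_{AV} > 3\,P_{A}\), i.e. \(\mathrm{Gain} > 2\). Inverse effectiveness would predict that Gain rises monotonically as SNR falls; the finding reported here is instead a peak in Gain at intermediate SNRs.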