Article

Neurophysiological Indices of Audiovisual Speech Processing Reveal a Hierarchy of Multisensory Integration Effects

Journal

Journal of Neuroscience
Volume 41, Issue 23, Pages 4991-5003

Publisher

Society for Neuroscience
DOI: 10.1523/JNEUROSCI.0906-20.2021

Keywords

CCA; EEG; hierarchical processing; multisensory integration; speech in noise; speech in quiet

Funding

  1. Science Foundation Ireland Career Development Award [15/CDA/3316]
  2. National Institutes of Health National Institute on Deafness and Other Communication Disorders [R01 DC016297]

Abstract

A speaker's facial movements benefit speech comprehension by providing temporal cues to auditory cortex and by aiding recognition of specific linguistic units. EEG responses show that multisensory integration occurs at both spectrotemporal and phonetic stages of speech processing, and exploratory analyses suggest that the integration effects may change with listening conditions; future work using a within-subject design is needed to confirm this effect.
Seeing a speaker's face benefits speech comprehension, especially in challenging listening conditions. This perceptual benefit is thought to stem from the neural integration of visual and auditory speech at multiple stages of processing, whereby movement of a speaker's face provides temporal cues to auditory cortex, and articulatory information from the speaker's mouth can aid the recognition of specific linguistic units (e.g., phonemes, syllables). However, it remains unclear how the integration of these cues varies as a function of listening conditions. Here, we sought to provide insight into this question by examining EEG responses in humans (males and females) to natural audiovisual (AV), audio, and visual speech in quiet and in noise. We represented our speech stimuli in terms of their spectrograms and their phonetic features and then quantified the strength of the encoding of those features in the EEG using canonical correlation analysis (CCA). The encoding of both spectrotemporal and phonetic features was more robust in AV speech responses than would be expected from the summation of the audio and visual speech responses, suggesting that multisensory integration occurs at both spectrotemporal and phonetic stages of speech processing. We also found evidence that the integration effects may change with listening conditions; however, this was an exploratory analysis, and future work will be required to examine this effect using a within-subject design. These findings demonstrate that integration of audio and visual speech occurs at multiple stages along the speech-processing hierarchy.
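
The encoding-strength analysis described above can be illustrated with a small sketch. The following is a minimal, hypothetical Python example (not the authors' pipeline) of relating a stimulus feature representation to multichannel EEG with canonical correlation analysis, here using scikit-learn's CCA on random placeholder data. The variable names, dimensions, and lagging scheme are assumptions for illustration; a real analysis would use the actual spectrogram or phonetic-feature matrices, proper time-lag handling, and cross-validation.

    # Minimal sketch: quantify how strongly stimulus features are encoded in EEG via CCA.
    # All data here are random placeholders; shapes and names are hypothetical.
    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(0)

    n_samples = 5000      # EEG samples (e.g., ~40 s at 128 Hz)
    n_channels = 64       # EEG channels
    n_features = 16       # stimulus features (spectrogram bands or phonetic features)
    n_lags = 32           # time lags (~250 ms at 128 Hz) to capture delayed neural responses

    eeg = rng.standard_normal((n_samples, n_channels))
    stim = rng.standard_normal((n_samples, n_features))

    def lag_matrix(x, n_lags):
        """Stack time-lagged copies of the stimulus so CCA can model delayed responses.
        Uses np.roll for brevity; wrap-around edge effects are ignored in this sketch."""
        lagged = [np.roll(x, lag, axis=0) for lag in range(n_lags)]
        return np.concatenate(lagged, axis=1)

    X = lag_matrix(stim, n_lags)   # shape: (n_samples, n_features * n_lags)
    Y = eeg                        # shape: (n_samples, n_channels)

    # Fit CCA and compute the correlation between each pair of canonical variates;
    # the first few canonical correlations serve as an index of encoding strength.
    cca = CCA(n_components=3, max_iter=1000)
    Xc, Yc = cca.fit_transform(X, Y)
    canon_corrs = [np.corrcoef(Xc[:, k], Yc[:, k])[0, 1] for k in range(Xc.shape[1])]
    print("Canonical correlations:", np.round(canon_corrs, 3))

In this kind of framework, encoding strength for AV responses could be compared against the summed audio-only and visual-only responses by running the same procedure on each response type and contrasting the resulting canonical correlations.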

