Multi-Level Audio-Visual Interactions in Speech and Language Perception

Ariane Rhone

That we perceive our environment as a unified scene rather than individual streams of auditory, visual, and other sensory information has recently provided motivation to move past the long-held tradition of studying these systems separately. Although they are each unique in their transduction organs, neural pathways, and primary cortical areas, the senses are ultimately merged in a meaningful way that allows us to navigate the multisensory world. Investigating how the senses are merged has become an increasingly broad field of research in recent decades, with the introduction and increased availability of neuroimaging techniques. Areas of study range from multisensory object perception to cross-modal attention, multisensory interactions, and integration. This thesis focuses on audio-visual speech perception, with particular attention to facilitatory effects of visual information on auditory processing. When visual information is concordant with auditory information, it provides an advantage that is measurable in behavioral response times and evoked auditory fields (Chapter 3) and in increased entrainment to multisensory periodic stimuli, reflected by steady-state responses (Chapter 4). When the audio-visual information is incongruent, the two signals can often, but not always, combine to form a third percept that is not physically present (known as the McGurk effect). This effect is investigated (Chapter 5) using real-word stimuli. McGurk percepts were not robustly elicited for a majority of stimulus types, but patterns of responses suggest that the physical and lexical properties of the auditory and visual stimuli may affect the likelihood of obtaining the illusion. Together, these experiments add to the growing body of knowledge suggesting that audio-visual interactions occur at multiple stages of processing.