The Role of Speech Production System in Audiovisual Speech Perception

Iiro P Jääskeläinen*
Department of Biomedical Engineering and Computational Science, Aalto University, Espoo, Finland


© Iiro P. Jääskeläinen; Licensee Bentham Open

open-access license: This is an open access article licensed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted, non-commercial use, distribution and reproduction in any medium, provided the work is properly cited.

* Address correspondence to this author at the Department of Biomedical Engineering and Computational Science, Aalto University, P.O. Box 12200, FIN-00076 Aalto, Finland; Tel: +358 9 47001; Fax: +358 9 470 24833; E-mail:


Seeing the articulatory gestures of the speaker significantly enhances speech perception. Findings from recent neuroimaging studies suggest that activation of the speech motor system during lipreading enhances speech perception by tuning, in a top-down fashion, speech-sound processing in the superior aspects of the posterior temporal lobe. Anatomically, the superior-posterior temporal lobe areas receive connections from the auditory, visual, and speech motor cortical areas. Thus, it is possible that neuronal receptive fields are shaped during development to respond to speech-sound features that coincide with visual and motor speech cues, in contrast with the anterior/lateral temporal lobe areas, which might process speech sounds predominantly on the basis of acoustic cues. The superior-posterior temporal lobe areas have also been consistently associated with auditory spatial processing. Thus, the involvement of these areas in audiovisual speech perception might be explained in part by the spatial processing required when associating sounds, seen articulations, and one's own motor movements. Tentatively, the anterior "what" and posterior "where/how" auditory cortical processing pathways may form an interacting network whose instantaneous state determines what one ultimately perceives, as potentially reflected in the dynamics of oscillatory activity.

Keywords: Audiovisual speech perception, speech motor theory, functional MRI, magnetoencephalography, electroencephalography.