Near-Infrared Spectroscopy and Cortical Responses to Speech Production

The Open Neuroimaging Journal 03 Apr 2009 RESEARCH ARTICLE DOI: 10.2174/1874440000903010026


This research demonstrates near-infrared spectroscopy (NIRS) as a flexible methodology for measuring cortical activity during overt speech production while avoiding some limitations of traditional imaging technologies. Specifically, language production research has been limited in the number of participants and the types of paradigms that can be reasonably investigated using functional magnetic resonance imaging (fMRI), where sensitivity to motion has encouraged covert (i.e., nonvocalized) production paradigms, and positron emission tomography (PET), which allows a greater range of motion but introduces practical and ethical limitations to the populations that can be studied. Moreover, for these traditional technologies, the equipment is expensive and not portable, effectively limiting most studies to small, local samples in relatively few labs. In contrast, NIRS is a relatively inexpensive, portable, noninvasive alternative that is robust to motion artifacts associated with overt speech production. The current study shows that NIRS data are consistent with behavioral and traditional imaging data on cortical activation associated with overt speech production. Specifically, the NIRS data show robust activation in the left temporal region and no significant change in activation in the analogous right hemisphere region in a sample of native, English-speaking adults in a picture-naming task. These findings illustrate the utility of NIRS as a valid method for tracking cortical activity and advance it as a powerful alternative when traditional imaging techniques are not a viable option for researchers investigating the neural substrates supporting speech production.

Keywords: Brain mapping, speech production, language, lateralization.


Imaging data have allowed remarkable advances in our knowledge of language and the brain, but important gaps remain because of certain methodological and practical limitations to traditional technologies. Although PET is relatively robust to motion, the use of an intravenously administered radioactive isotope sets obvious ethical limits on the populations and sample sizes that can be studied, and on the number of trials a given participant can undergo. Populations and paradigms have also been limited in fMRI studies because of extreme sensitivity to motion artifacts, which interfere with image acquisition. Finally, both imaging methods require equipment that is quite expensive to obtain as well as to maintain, and is thus beyond the reach of the majority of researchers interested in language and the brain.

Researchers have tried to circumvent the problem of motion artifacts in fMRI studies of language production by using covert paradigms, where individuals mouth words silently or think words to themselves [1, 2]. Studies of covert production have been useful in suggesting a complex auditory-motor network that has both overlapping and unique areas of neural activation in frontal and posterior areas during speech production and perception [3-5]. In particular, activity in the most posterior and medial part of the superior temporal gyrus (STG) at the temporo-parietal junction has been associated with speech production rather than perception [6]. In addition, a specific role for phonological processing has been indicated in the left inferior frontal cortex [7-9] and left STG [5, 10-12].

While these results are undeniably important, it is nonetheless a concern that the majority of language production studies using fMRI have involved covert tasks. It is known that overt articulation activates numerous motor and cognitive processes, themselves supported by specialized neural substrates [13, 14], and there is some evidence for a covert-overt distinction across imaging studies [2]. Thus, it is possible that covert methods may not induce the full range of neural activity normally elicited during natural speech production. Emphasis on covert responding also imposes methodological limitations that make it difficult to study specific populations of interest, such as young children, sign-language users, or patients with movement disorders. Covert paradigms further pose problems for research topics that require overt motion, such as writing or responding to language stimuli (e.g., during dyadic communication). In summary, there is a clear need for techniques that can more flexibly measure neural responses during overt speech production in a range of populations. Moreover, it is critical to the cumulative advancement of the literature to utilize valid brain investigation methods that are accessible to a range of researchers beyond the relatively few with access to fMRI and PET technologies.

Near-Infrared Spectroscopy (NIRS)

One method with the potential to avoid many limitations of traditional imaging techniques is NIRS, a safe, non-invasive optical imaging method that measures changes in cerebral blood volume and hemoglobin oxygenation. As with fMRI, the assumption is that stimulus-induced neural activation causes increased energy demands in the activated areas; that is, cerebral blood flow (CBF) to activated brain areas increases to meet increased demands for oxygen and glucose. Increases in CBF are parsed into local concentrations of oxyhemoglobin (oxygenated blood), which typically increases during cortical activation, and deoxyhemoglobin (deoxygenated blood), which typically decreases [15]. Total hemoglobin (HbT) is computed by summing changes in oxyhemoglobin (HbO2) and deoxyhemoglobin (HbR).

Also like fMRI, NIRS uses modulations in CBF during stimulus presentation relative to CBF during a baseline event during which no stimulus is presented to provide important information about the hemodynamic response to stimulus-induced cortical activation. Specifically, NIRS assesses hemodynamic responses by projecting near-infrared light at wavelengths of 690 and 830 nm through the scalp and skull and into the cortex, then recording intensity modulations in the reflected light from each wavelength. Because these two wavelengths are differentially absorbed by oxygenated blood (830 nm light is more sensitive to HbO2) and deoxygenated blood (690 nm light is more sensitive to HbR) [16, 17], the relative concentrations of HbO2 and HbR (and localized HbT) can be computed and used to provide an index of cortical activation [18].
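
The two-wavelength computation described above can be sketched with the modified Beer-Lambert law, which treats the change in optical density at each wavelength as a weighted sum of the HbO2 and HbR concentration changes. The following Python sketch is illustrative only and is not the software used in this study: the extinction coefficients are approximate tabulated values, the differential pathlength factor (`dpf`) is an assumed value, optical density is assumed to be expressed in base-10 units, and the function name `mbll` is hypothetical. Only the 3 cm emitter-detector distance comes from the present setup.

```python
# Approximate molar extinction coefficients in 1/(M*cm).
# Illustrative values only; they are NOT taken from the article.
EXTINCTION = {
    690: {"HbO2": 276.0, "HbR": 2052.0},
    830: {"HbO2": 974.0, "HbR": 693.0},
}


def mbll(delta_od_690, delta_od_830, distance_cm=3.0, dpf=6.0):
    """Solve the 2x2 modified Beer-Lambert system.

    Returns (dHbO2, dHbR, dHbT) in mol/L. `dpf` is an assumed
    differential pathlength factor, not a value from the article.
    """
    path = distance_cm * dpf  # effective optical path length in cm
    a = EXTINCTION[690]["HbO2"] * path
    b = EXTINCTION[690]["HbR"] * path
    c = EXTINCTION[830]["HbO2"] * path
    d = EXTINCTION[830]["HbR"] * path
    det = a * d - b * c
    # Cramer's rule for [[a, b], [c, d]] @ [dHbO2, dHbR] = [dOD690, dOD830]
    d_hbo2 = (d * delta_od_690 - b * delta_od_830) / det
    d_hbr = (a * delta_od_830 - c * delta_od_690) / det
    # Total hemoglobin is the sum of the two chromophore changes
    return d_hbo2, d_hbr, d_hbo2 + d_hbr


# Hypothetical optical density changes for one channel
hbo2, hbr, hbt = mbll(delta_od_690=-0.004, delta_od_830=0.012)
print(hbo2 * 1e6, hbr * 1e6, hbt * 1e6)  # concentration changes in micromolar
```

Because 690 nm light is absorbed more strongly by HbR and 830 nm light more strongly by HbO2, the two equations are well conditioned and can be inverted directly, as the sketch does.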

One caveat is that, unlike fMRI, NIRS measurements are restricted to cortical activity, making NIRS inappropriate for investigations of subcortical structures. Another limitation is that spatial resolution with NIRS is not as precise as with fMRI, although the temporal resolution of NIRS is superior to that of fMRI.

Nonetheless, taken together with the known relationship between hemodynamics and neural activity [19], several recent studies indicate that NIRS can provide a reliable and useful estimate of cortical activity associated with language processing. For instance, researchers have used NIRS to track changes in infant cerebral hemodynamics to dissociate cortical activity coupled to visual versus linguistic stimulus processing [20] and during other forms of visual processing [21, 22]. Other studies have begun to validate NIRS against fMRI [23]. For example, NIRS has been used to show that 2- to 5-day-old infants exhibit left hemisphere dominance, and left temporal lobe activation in particular, for processing forward, infant-directed speech relative to the same speech played backwards [24], thus replicating outcomes of a similar fMRI study with 3-month-old infants [25]. With adult speech production, NIRS has been used to examine the left inferior frontal region (i.e., Broca’s area) during overt sentence translation in bilingual adults [14], and lateralization of prefrontal areas during a language comprehension task [26]; the results of both NIRS studies were consistent with outcomes from analogous fMRI studies [20, 27] (and see [4, 28] for reviews).

Present Research

Given the need for more flexible techniques to measure cortical activity during overt speech production, the present research was designed to evaluate NIRS as a method for assessing region-specific processing in adults in response to overt speech production in order to provide evidence that NIRS is a reasonable alternative to traditional imaging techniques for this type of research. The present NIRS study focused on changes in CBF in the temporal cortices of healthy adults during an overt speech production paradigm. We chose the bilateral temporal sites as our focal region of interest (ROI) for this initial study primarily based on previous NIRS data from our lab, which revealed robust and reliable patterns of left temporal activation in monolingual infants during a language perception task [20]. In addition, the ROI was justified by outcomes from a PET study that found activation in predominantly left hemisphere regions of the temporal lobe during an overt language production task [29], and suggestions from fMRI that the left STG should be preferentially involved in phonological processing and speech production [19].

Our primary prediction was that we would observe increased activity in the left STG of monolingual adults during overt speech production, with little or no increase in activity in the homologous region of the right hemisphere. These results would extend the application of NIRS as a tool for measuring overt speech production in adults beyond the frontal regions examined in previous studies [14, 26].



Ten healthy, monolingual English-speaking adults (9 men, 1 woman) between the ages of 18 and 22 underwent NIRS imaging at Texas A&M University, College Station.


A soundproofed testing room was equipped with a chair, a cushioned chin rest anchored to a table, and a mounted 53-cm flat panel computer monitor on which visual stimuli were displayed. An adjustable headband was connected to the NIRS instrument by two fiber optic cables (each 1 mm in diameter and 15 m long), which ran from the instrument into the testing room through a sound- and light-dampening curtain. The cables were bundled into a single strand secured on the wall just over the participant’s right shoulder.

The NIRS instrument consisted of three main components: a) two emitter fibers consisting of fiber optic cables that delivered near-infrared light to the scalp of the participant; b) four detector fibers consisting of fiber optic cables that detected the diffusely reflected light at the scalp and transmitted it to the receiver; and c) an electronic control box that served as both the source of the near-infrared light and the receiver of the reflected light. The fibers were grouped into two emitter/detector sets (i.e., optical probes), each consisting of two detector fibers placed at 3 cm distance on either side of a central emitter fiber. The signals received by the electronic control box were processed and relayed to a DELL Inspiron 7000 laptop computer. A custom computer program recorded and analyzed the signal.

The NIRS instrument emitted light at two wavelengths, 690 and 830 nm, with two laser-emitting diodes. Laser power emitted from the end of each fiber was 4 mW, and the light was square wave modulated at audio frequencies of approximately 4 to 12 kHz. Each laser had a unique frequency so that synchronous detection could uniquely identify each laser source from the photodetector signal. No detector saturation occurred during the experiment.
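
The principle behind synchronous detection can be illustrated with a small simulation: when each source is switched on and off at its own frequency, multiplying the detector signal by a reference square wave at that frequency and averaging isolates that source's contribution and rejects the other source and any constant ambient light. The Python sketch below is a toy illustration under stated assumptions, not a description of the instrument's electronics. The sampling rate (96 kHz), the two modulation frequencies (4 kHz and 6 kHz, chosen within the 4 to 12 kHz range reported above), the amplitudes, and all function names are hypothetical.

```python
def square_wave(n, period):
    """Return +1 for the first half of each period and -1 for the second half."""
    return 1.0 if (n % period) < period // 2 else -1.0


def simulate_detector(amplitudes, periods, ambient, n_samples):
    """Detector signal: on/off modulated sources plus constant ambient light."""
    signal = []
    for n in range(n_samples):
        total = ambient
        for amp, period in zip(amplitudes, periods):
            # On/off keying: the source contributes `amp` when on, 0 when off
            total += amp * (square_wave(n, period) + 1.0) / 2.0
        signal.append(total)
    return signal


def demodulate(signal, period):
    """Synchronous detection: multiply by the reference, then average."""
    mixed = [s * square_wave(n, period) for n, s in enumerate(signal)]
    # Factor 2 compensates for the source being on only half of the time
    return 2.0 * sum(mixed) / len(mixed)


# Assumed 96 kHz sampling: 4 kHz -> 24 samples/period, 6 kHz -> 16 samples/period.
# 4800 samples is a whole number of periods for both sources.
detector = simulate_detector([0.8, 0.3], [24, 16], ambient=0.5, n_samples=4800)
print(demodulate(detector, 24), demodulate(detector, 16))
```

In this toy case the two recovered values match the simulated source amplitudes, because the two reference waves are orthogonal over their common period.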

One optical probe (i.e., emitter/detector set) was used to deliver and collect near-infrared light on the left temporal region at approximately T3 according to the International 10-20 system, and the other delivered and collected light on the right temporal region at approximately position T4.


Each participant was seated at approximately 76 cm distance from the computer monitor (28.1° visual angle at participants’ viewing distance based on a 36 cm wide screen). The chinrest was adjusted so the participant could sit comfortably with chin in chinrest and forearms resting on the chinrest table. In an effort to reduce unwanted muscle movement (and thus standardize individual variations in extraneous motor activation), participants were instructed to keep their chins in the chinrest and their teeth together throughout the experiment. The International 10-20 system for electrode placement was used to determine the locations for T3 and T4, which were then marked on the participant’s scalp. The hair around the target locations was parted and fastened to avoid interference with the light emission and detection. The two emitter/receiver optodes were centered over the target locations on the participant’s scalp and secured with Velcro straps. One emitter was positioned directly above and slightly in front of the left ear (T3), a location that targets primary auditory cortex and lateral STG. The second emitter was positioned in the analogous location in the right hemisphere (T4).

Stimuli and Design

The stimuli consisted of 41 line drawings (1 practice trial and 40 target trials) taken from Snodgrass and Vanderwart [30] and the Philadelphia Naming Test [31] materials. Stimuli were one-syllable items closely matched in visual complexity and lexical frequency. Each trial began with a fixation cross, which appeared for 1700 msec, followed by a 300 msec stimulus event (i.e., line drawing) on a white background, followed by 8 sec of rest during which no stimuli were presented, for a total epoch of 10 sec. Participants were to say aloud the name of each line drawing as quickly as possible, thus requiring them to map conceptual representations onto phonological representations and produce them overtly. Accuracy was 100% for all participants. Near-infrared light that was differentially reflected from the temporal region of each hemisphere as a result of changes in CBF during the overt picture naming task was collected and analyzed.


NIRS Data Acquisition

The two detector fibers within each optical probe recorded raw optical signals, which were digitized at 200 Hz and converted to optical density units. The optical density units were then low-pass-filtered at 1 Hz and high-pass-filtered at 0.02 Hz for noise reduction and decimated to 20 samples per second. The control computer then converted the filtered optical density units for each of the two wavelengths to relative concentrations of HbO2 and HbR using the modified Beer-Lambert law [32], which computes the relationship between light absorbance and concentration of particles within a medium. Concentration changes in HbO2 and HbR, as well as changes in total hemoglobin (HbT), were averaged across the 10 sec epochs and plotted by channels per region. Participant motion and systemic physiology (e.g., heartbeat and respiration) were spatially filtered using a principal components analysis (PCA) of the signals across the four channels [33]. Artifacts from extreme movement (e.g., sneezing) were operationalized as a signal change greater than 5% in a tenth of a second and were eliminated, resulting in the removal of 1.25% of the data (5 trials out of 400). The filtered data were then summed across the two channels per region and grand averaged across participants.
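
The movement-artifact criterion above (a signal change greater than 5% within a tenth of a second) can be expressed as a simple per-trial check. The Python sketch below is one possible reading of that rule, not the authors' implementation. It assumes the check is applied to the decimated 20 samples-per-second signal, so that a tenth of a second corresponds to two samples, and that the change is measured as a fraction of the earlier sample; the function names and the example traces are hypothetical.

```python
def has_motion_artifact(samples, rate_hz=20, window_s=0.1, threshold=0.05):
    """Flag a trial whose signal changes by more than `threshold`
    (as a fraction of the earlier sample) within `window_s` seconds."""
    lag = max(1, round(window_s * rate_hz))  # 2 samples at 20 Hz
    for earlier, later in zip(samples, samples[lag:]):
        if earlier != 0 and abs(later - earlier) / abs(earlier) > threshold:
            return True
    return False


def reject_artifacts(trials, **kwargs):
    """Keep only the trials that pass the artifact check."""
    return [trial for trial in trials if not has_motion_artifact(trial, **kwargs)]


# Hypothetical trial traces (arbitrary units)
clean = [1.00, 1.01, 1.02, 1.01, 1.00]
spike = [1.00, 1.00, 1.20, 1.00, 1.00]  # 20% jump within 0.1 s
kept = reject_artifacts([clean, spike])
print(len(kept))

# Proportion of trials removed in the study: 5 of 400
print(5 / 400)
```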

Hemodynamic Response Functions and Analyses

NIRS data were analyzed first by channel within each cortical region (where each emitter/detector pair within each optical probe constituted one channel, so each optical probe contained two channels per cortical region). Data from the two channels per optical probe were then averaged and responses were compared across probes (i.e., cortical regions). Fig. (1) illustrates the grand averaged hemodynamic response function for the left temporal region. The time of stimulus presentation (fixation cross + picture) is indicated by the solid red bar. Following stimulus offset, the left temporal region shows a marked increase in HbO2. There is a corresponding decrease in HbR in this region, though the effect is much less pronounced. Fig. (2) illustrates the hemodynamic response function for the right temporal region. Although there is an initial increase in HbO2 in this region at the time of stimulus onset, this is very small and reverses before the offset of the stimulus, going well below baseline. HbO2 concentration continues to decrease over the course of the post-stimulus rest period, before beginning to return to baseline prior to the end of the epoch.

Fig. (1). Hemodynamic Response Function: Left temporal area (T3).

Grand average hemodynamic (HbO2, HbR, and HbT) response curve across 10-sec epochs for the left temporal region. Stimulus presentation (1700 msec fixation cross + 300 msec picture) is indicated by the solid red bar (onset at time 0 s), followed by a no-stimulus rest period (8 sec). The y-axis indicates relative changes in concentration (micromolar) of the different chromophores.

Fig. (2). Hemodynamic Response Function: Right temporal area (T4).

Corresponding grand average hemodynamic (HbO2, HbR, and HbT) response curve across 10-sec epochs for the right temporal region.

We calculated concentration changes in response to stimulus presentation to allow for statistical comparison of the average response to test stimuli in each cortical region (left and right temporal). The analysis was performed only on changes in concentration of HbO2 within each region, as this chromophore provides the most robust contrast-to-noise ratio [20]. Because changes in concentration began manifesting shortly after stimulus onset and showed signs of abating by the end of the 10 sec epoch, concentration changes were calculated based on the average relative HbO2 concentration during time 3-5 sec for each epoch relative to HbO2 concentration at time -1 to 0 sec prior to trial onset (baseline). This revealed an average HbO2 concentration change in response to the production task (relative to baseline) in the left temporal region of .15 µM (SE = .07), while the average change in the homologous right temporal region was -.05 µM (SE = .08). A paired-samples t-test showed that the difference in HbO2 concentration changes by hemispheric region was significant, t(9) = 2.83, p < .05. This finding is consistent with prior research, demonstrating a dissociation of homologous regions of the temporal cortex during overt word production in monolingual adults. Together, our results provide evidence that NIRS can be successfully used to track localized changes in cerebral hemodynamics during an overt language production task.
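
The analysis steps described above (a baseline-corrected HbO2 change per epoch, followed by a paired comparison across hemispheres) can be sketched as follows. This Python example is a minimal illustration rather than the analysis code used in the study. It assumes each epoch is stored as a list of samples at 20 samples per second beginning 1 sec before trial onset; the function names and all data values are hypothetical and are not the study's data.

```python
import math
import statistics


def hbo2_change(epoch, rate_hz=20, epoch_start_s=-1.0):
    """Mean HbO2 during 3-5 s minus mean HbO2 during the -1 to 0 s baseline.

    `epoch` is assumed to start at `epoch_start_s` relative to trial onset.
    """
    def window(t0, t1):
        i0 = round((t0 - epoch_start_s) * rate_hz)
        i1 = round((t1 - epoch_start_s) * rate_hz)
        return epoch[i0:i1]

    return statistics.mean(window(3.0, 5.0)) - statistics.mean(window(-1.0, 0.0))


def paired_t(left, right):
    """Paired-samples t statistic and its degrees of freedom (n - 1)."""
    diffs = [a - b for a, b in zip(left, right)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical per-participant HbO2 changes (micromolar), NOT the study data
left_changes = [0.3, 0.1, 0.2, 0.4]
right_changes = [0.1, 0.0, 0.0, 0.1]
t_value, df = paired_t(left_changes, right_changes)
print(t_value, df)
```

With one left-right pair per participant, the degrees of freedom equal the number of participants minus one, so a sample of 10 participants yields 9 degrees of freedom.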


NIRS is a non-invasive brain mapping technique that is relatively robust to motion artifacts and provides reasonably precise spatial and temporal information about cortical activation. The focus of the present study was to document the utility of NIRS as a flexible tool for mapping cortical activation in response to overt speech production, thereby laying the groundwork for future studies aimed at extending our understanding of these neural substrates in a range of populations. In particular, we used NIRS to determine the relative participation of left and right temporal regions during overt speech production in healthy, English-speaking monolingual adults. The outcomes showed a significant increase in activation, as indicated by increases in HbO2, in the left relative to the right temporal region. This outcome is consistent with data from other measures and thus demonstrates the utility of NIRS for investigating the neural markers associated with overt speech production.

Given the flexibility of the NIRS technique, an important extension will be to apply NIRS in populations where movement is a concern or where it is not practical to request covert responding. For instance, the use of NIRS with infant populations in perceptual paradigms [20-22] could be extended to investigations of early speech production in infants and young children. In addition, because it is not invasive and therefore amenable to repeated use, NIRS can be used in longitudinal developmental studies, and it can support larger sample sizes than those typically supported in PET and fMRI studies. Another ideal NIRS application would be to track cortical activation during language production in different modalities (spoken versus signed), across language backgrounds (monolingual versus bilingual), and during different levels of speech production, from single words to full discourse. Such applications would go far in answering questions that remain concerning the development of brain organization for language.

One proposal in need of further investigation involves the notion of a network of auditory-motor integration formed from the overlapping and unique areas of neural activation in frontal and posterior regions during speech production and perception [3, 5, 20, 29]. Based on fMRI data, Hickok and his colleagues [3] suggest that this network guides speech development and supports speech production. Establishing the existence of this network will require using more extensive NIRS probe geometries to cover wider-ranging areas of cortex, and testing participants at various stages of language development. Given the difficulties inherent in testing children with fMRI, NIRS can serve as a useful tool to investigate this hypothesis.

Another area of research where NIRS will be helpful concerns whether the development of language processing networks is differentially influenced by the acquisition of one versus two (or more) acoustic-phonetic systems. Based on a meta-analysis of the large behavioral bilingual laterality literature, Hull and Vaid [34] have proposed that the functional organization of language(s) depends on language exposure during early development, and that these early patterns anchor the organization of any subsequently learned languages. Specifically, these researchers found that, across behavioral paradigms and across languages, monolinguals and late bilinguals showed reliable activation of predominantly left hemisphere areas (consistent with the present results with monolinguals) regardless of second language proficiency, whereas early bilinguals showed reliable bilateral activation (see also [35]). At least one fMRI study using covert word production in multilingual adults [36] and one clinical study with bilingual epilepsy patients [37] have produced results consistent with this view. Specifically, the fMRI study found that late multilinguals showed consistent left hemisphere dominance for all languages whereas the single early multilingual they tested showed consistent bilateral activation across languages. The clinical study found that late bilinguals with epileptiform discharges focused in the left temporal region demonstrated particular post-seizure dysfunction for producing speech in their second language, which was also their most proficient language. These outcomes support the proposition that NIRS could be a useful tool for extending investigations of the cortical underpinnings of overt speech production during early development and adulthood in both healthy and patient populations, particularly if more extensive probe geometries are utilized to cover more spatial locations in the cortex.

Contemporary research has demonstrated that the brain’s organization of language is more complex than previously assumed. It is also clear that more work is needed to identify the full range of neural correlates for language production in various populations in more natural settings. We suggest that NIRS represents a uniquely flexible neuroimaging tool that is sensitive enough to provide useful information about cortical activation associated with language production as a reasonable alternative to more expensive imaging methods, and that NIRS will allow study designs to extend to a wider range of overt production tasks in more natural environments. It is thus expected that NIRS will make a valuable contribution to investigations of the variety of neural consequences that may arise from different language experience, different developmental levels, and different modes of speech production.


[1] Hickok G, Erhard P, Kassubek J, et al. A functional magnetic resonance imaging study of the role of left posterior superior temporal gyrus in speech production: implications for the explanation of conduction aphasia Neurosci Lett 2000; 287: 156-60.
[2] Indefrey P, Levelt W. The spatial and temporal signatures of word production components Cognition 2004; 92: 101-44.
[3] Hickok G, Buchsbaum B, Humphries C, et al. Auditory-motor interaction revealed by fMRI: speech, music, and working memory in area Spt J Cogn Neurosci 2003; 15: 673-82.
[4] Hickok G, Poeppel D. Toward a functional neuroanatomy of speech perception Trends Cogn Sci 2000; 4: 131-8.
[5] Parker GJM, Luzzi S, Alexander DC, et al. Lateralization of ventral and dorsal auditory-language pathways in the human brain Neuroimage 2005; 24: 656-6.
[6] Wise RJ, Scott SK, Blank SC, et al. Separate neural subsystems within Wernicke's area Brain 2001; 124: 83-95.
[7] Devlin J, Matthews P, Rushworth M. Semantic processing in the left inferior prefrontal cortex: a combined functional magnetic resonance imaging and transcranial magnetic stimulation study J Cogn Neurosci 2003; 15: 71-84.
[8] Gabrieli JDE, Poldrack RA, Desmond JE. The role of the left inferior frontal cortex in language and memory Proc Natl Acad Sci USA 1998; 95: 906-13.
[9] Klein D, Milner B, Zatorre R, et al. The neural substrates underlying word generation: a bilingual functional-imaging study Proc Natl Acad Sci USA 1995; 92: 2899-903.
[10] Hickok G. Functional anatomy of speech perception and speech production: psycholinguistic implications J Psycholinguist Res 2001; 30: 225-35.
[11] Hugdahl K, Kolbjorn B, Kyllingsbaek S, et al. Brain activation during dichotic presentations of consonant-vowel and musical instrument stimuli: a 15O-PET study Neuropsychologia 1999; 37: 431-0.
[12] Karbe H, Wurker M, Herholz K, et al. Planum temporale and Brodmann’s area 22: magnetic resonance imaging and high-resolution positron emission tomography demonstrate functional left-right asymmetry Arch Neurol Chicago 1995; 52: 869-74.
[13] Munhall K. Functional imaging during speech production Acta Psychol 2001; 107: 95-117.
[14] Quaresima V, Ferrari M, van der Sluijs M, et al. Lateral frontal cortex oxygenation changes during translation and language switching revealed by non-invasive near-infrared multi-point measurements Brain Res Bull 2002; 59: 235-43.
[15] Bartocci M, Winberg J, Ruggiero C, et al. Activation of olfactory cortex in newborn infants after odor stimulation: a functional near-infrared spectroscopy study Pediatr Res 2000; 48: 18-23.
[16] Gratton G, Sarno A, Maclin E, et al. Noninvasive 3-D imaging of the time course of cortical activity: investigation of the depth of the event-related optical signal Neuroimage 2000; 11: 491-504.
[17] Villringer A, Chance B. Non-invasive optical spectroscopy and imaging of human brain function Trends Neurosci 1997; 20: 435-2.
[18] Meek J. Basic principles of optical imaging and application to the study of infant development Dev Sci 2002; 5: 371-80.
[19] Gratton G, Goodman-Wood MR, Fabiani M. Comparison of neuronal and hemodynamic measures of the brain response to visual stimulation: an optical imaging study Hum Brain Mapp 2001; 13: 13-25.
[20] Bortfeld H, Wruck E, Boas DA. Assessing infants' cortical response to speech using near-infrared spectroscopy Neuroimage 2007; 34: 407-15.
[21] Wilcox T, Bortfeld H, Woods R, et al. Hemodynamic response to featural changes in the occipital and inferior temporal cortex in infants: a preliminary methodological exploration Dev Sci 2008; 11(3): 361-70.
[22] Wilcox T, Bortfeld H, Woods R, et al. Using near-infrared spectroscopy to assess neural activation during object processing in infants J Biomed Opt 2005; 10(1): 11010.
[23] Kleinschmidt A, Obrig H, Requardt M, et al. Simultaneous recording of cerebral blood oxygenation changes during human brain activation by magnetic resonance imaging and near-infrared spectroscopy J Cereb Blood Flow Metab 1996; 16: 817-26.
[24] Peña M, Maki A, Kovacic D, et al. Sounds and silence: an optical topography study of language recognition at birth Proc Natl Acad Sci USA 2003; 100(20): 11702-5.
[25] Dehaene-Lambertz G, Dehaene S, Hertz-Pannier L. Functional neuroimaging of speech perception in infants Science 2002; 298: 2013-15.
[26] Kennan RP, Kim D, Maki A, et al. Non-invasive assessment of language lateralization by transcranial near infrared optical topography and functional MRI Hum Brain Mapp 2002; 16: 183-9.
[27] Chee M, Tan E, Thiel V. Mandarin and English single word processing studied with functional magnetic resonance imaging J Neurosci 1999; 19: 3050-6.
[28] Vaid J, Hull R. Re-envisioning the bilingual brain using functional neuroimaging: Methodological and interpretive issues In: Fabbro F, Ed. Advances in the neurolinguistics of bilingualism: a festschrift for Michel Paradis Udine Italy Forum 2002; 315-55.
[29] Braun AR, Guillemin A, Hosey L, et al. The neural organization of discourse: an H2O-PET study of narrative production in English and American sign language Brain 2001; 124: 2028-44.
[30] Snodgrass JG, Vanderwart M. A standardized set of 260 pictures: norms for name agreement, image agreement, familiarity, and visual complexity J Exp Psychol [Hum Learn] 1980; 6: 174-215.
[31] Roach A, Schwartz MF, Martin N, et al. The Philadelphia naming test: scoring and rationale Clin Aphasiol 1996; 24: 121-33.
[32] Strangman G, Boas DA, Sutton JP. Non-invasive neuroimaging using near-infrared light Biol Psychiatry 2002; 52: 679-93.
[33] Zhang Y, Brooks DH, Franceschini MA, et al. Eigenvector-based spatial filtering for reduction of physiological interference in diffuse optical imaging J Biomed Opt 2005; 10: 1-11.
[34] Hull R, Vaid J. Bilingual language lateralization: a meta-analytic tale of two hemispheres Neuropsychologia 2007; 45: 1987-2008.
[35] Hull R, Vaid J. Laterality and language experience Laterality 2006; 11: 436-64.
[36] Briellmann RS, Saling MM, Connell AB, et al. A high-field functional MRI study of quadri-lingual subjects Brain Lang 2004; 89: 531-42.
[37] Aladdin Y, Snyder TJ, Ahmed SN. Pearls & Oy-sters: selective postictal aphasia: cerebral language organization in bilingual patients Neurology 2008; 71: 14-7.