Volume 34, Issue 11 pp. 1817-1822

Sensorimotor integration for speech motor learning involves the inferior parietal cortex

Mamie Shum

Neuroscience Major Program, McGill University, Montreal, Quebec, Canada

Centre for Research on Brain, Language & Music, 3640 de la Montagne, Montreal, Quebec H3G 2A8, Canada

Douglas M. Shiller

Centre for Research on Brain, Language & Music, 3640 de la Montagne, Montreal, Quebec H3G 2A8, Canada

CHU Sainte-Justine Research Centre, Montreal, Quebec, Canada

École d’orthophonie et d’audiologie, Université de Montréal, Quebec, Canada

Shari R. Baum

School of Communication Sciences and Disorders, McGill University, Montreal, Quebec, Canada

Centre for Research on Brain, Language & Music, 3640 de la Montagne, Montreal, Quebec H3G 2A8, Canada

Vincent L. Gracco

School of Communication Sciences and Disorders, McGill University, Montreal, Quebec, Canada

Centre for Research on Brain, Language & Music, 3640 de la Montagne, Montreal, Quebec H3G 2A8, Canada

Haskins Laboratories, New Haven, CT, USA

First published: 18 November 2011
Dr V. L. Gracco, as above.
E-mail: [email protected]

Abstract

Sensorimotor integration is important for motor learning. The inferior parietal lobe, through its connections with the frontal lobe and cerebellum, has been associated with multisensory integration and sensorimotor adaptation for motor behaviors other than speech. In the present study, the contribution of the inferior parietal cortex to speech motor learning was evaluated using repetitive transcranial magnetic stimulation (rTMS) prior to a speech motor adaptation task. Subjects’ auditory feedback was altered in a manner consistent with the auditory consequences of an unintended change in tongue position during speech production, and adaptation performance was used to evaluate sensorimotor plasticity and short-term learning. Prior to the feedback alteration, rTMS or sham stimulation was applied over the left supramarginal gyrus (SMG). Subjects who underwent the sham stimulation exhibited a robust adaptive response to the feedback alteration whereas subjects who underwent rTMS exhibited a diminished adaptive response. The results suggest that the inferior parietal region, in and around SMG, plays a role in sensorimotor adaptation for speech. The interconnections of the inferior parietal cortex with inferior frontal cortex, cerebellum and primary sensory areas suggest that this region may be an important component in learning and adapting sensorimotor patterns for speech.

Introduction

A significant factor in the development and modification of motor behavior across the lifespan is sensorimotor integration as the basis for sensorimotor learning. Empirical studies of imposed sensory-based (feedback) manipulations in humans and non-humans have demonstrated the plasticity of sensorimotor systems and the obligatory coupling between self-movement and the re-afferent consequences. Nowhere is this more apparent than in studies of prism adaptation, in which subjects adjust motor output taking into account an imposed change in the sensory environment (Held & Hein, 1958; Held, 1965). Similar to prism adaptation, speech produced under constant auditory or somatosensory alteration creates the illusion of a mismatch between the intended and actual movement, resulting in an adaptive motor adjustment compensating for the altered sensory cues (Houde & Jordan, 1998, 2002; Tremblay et al., 2003; Jones & Munhall, 2005; Purcell & Munhall, 2006; Nasir & Ostry, 2009; Shiller et al., 2009). As with other forms of procedural learning, removing the feedback alteration results in an after-effect or persistence of the adaptation that is used as the principal indication that learning has occurred.

Studies with non-human primates (Baizer et al., 1999; Kurata & Hoshi, 1999), human lesion studies (Martin et al., 1996; Newport & Jackson, 2006) and more recently neuroimaging studies (Clower et al., 1996; Girgenrath et al., 2008; Luauté et al., 2009) have revealed strong cerebellar, parietal and premotor contributions to sensorimotor adaptation. During prism adaptation, at least two mechanisms are responsible for producing accurate movements – a strategic control mechanism and a spatial realignment mechanism (Redding et al., 2005). The parietal cortex has been suggested as a crucial component in the first mechanism, involving error correction and recalibration, which leads to spatial realignment, a mechanism that involves the cerebellum and other regions in parietal cortex (Pisella et al., 2005; Newport & Jackson, 2006; Chapman et al., 2010). For speech production, no studies have directly assessed the neural substrate underlying sensorimotor adaptation. Limited data on novel word learning and verbal working memory implicate the inferior parietal cortex (Smith et al., 1998; Clark & Wagner, 2003; Cornelissen et al., 2004; Veroude et al., 2010) as potentially contributing to sensorimotor adaptation and speech motor learning. Moreover, inferior parietal cortex has been implicated in linking actions with perception (Rizzolatti et al., 2006), a process that is critical for sensorimotor learning. In the present study, we test the prediction that the inferior parietal cortex in general, and the supramarginal gyrus in particular, contributes directly to sensorimotor adaptation for speech by reducing the excitability of this brain area using 1-Hz repetitive transcranial magnetic stimulation (rTMS).

Materials and methods

Subjects

Twenty adult speakers (six males; mean age = 23.95 years, SD ± 3.69) gave informed written consent with ten experimental subjects undergoing an rTMS procedure and ten control subjects undergoing sham TMS. All subjects reported no history of neurological, speech or hearing disorder and were screened for contraindications to TMS. All subjects were pre-screened for handedness based on preference for a number of unimanual tasks (handwriting, throwing, teeth-brushing, utensil use and hair grooming). The experimental procedure was approved by the Institutional Review Board of the Faculty of Medicine, McGill University. The experiments were undertaken with the understanding and written consent of each subject, and the study conformed with The Code of Ethics of the World Medical Association (Declaration of Helsinki).

Experimental procedures

Subjects were seated approximately 1 m in front of a 21-inch LCD display. The subjects’ task was to produce individual words into a microphone located 20 cm from the mouth. Speech was cued by presentation of the target word on the monitor. Each stimulus was presented for 2 s, with a 2-s inter-stimulus interval. Subjects listened to their speech through headphones mixed with speech-weighted masking noise. The levels of noise and speech signals were established during pilot tests and were determined to be sufficient to allow for clear perception of the speech signal. A VU (volume unit) meter visible to the subject was used to maintain a comparable speech sound level between subjects and throughout the testing sequence.

Prior to the rTMS procedure, 15 tokens of the words ‘hid’, ‘head’ and ‘had’ were acquired as baseline acoustic measures. Immediately following the rTMS sequence, subjects repeated the word ‘head’ under the following auditory feedback conditions: (i) unaltered feedback (30 trials, baseline phase), (ii) ramp up to maximum shift (40 trials, ramp phase), (iii) maintained at maximum shift (100 trials, hold phase) and (iv) return to unaltered feedback (30 trials, after-effect phase).
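The four-phase trial sequence described above can be sketched as a per-trial schedule giving the fraction of the maximum feedback shift on each trial. This is an illustrative sketch only: the function name `shift_schedule` is hypothetical, and the linear shape of the ramp is an assumption, since the text gives only the number of trials in each phase.

```python
def shift_schedule(n_baseline=30, n_ramp=40, n_hold=100, n_after=30):
    """Return the fraction of the maximum feedback shift for each trial.

    Phase lengths follow the text; the linear ramp is an assumption.
    """
    schedule = [0.0] * n_baseline                            # (i) unaltered feedback
    schedule += [(i + 1) / n_ramp for i in range(n_ramp)]    # (ii) ramp (assumed linear)
    schedule += [1.0] * n_hold                               # (iii) hold at maximum shift
    schedule += [0.0] * n_after                              # (iv) return to unaltered
    return schedule


schedule = shift_schedule()
print(len(schedule))                    # total number of adaptation trials
print(schedule[69], schedule[170])      # last ramp trial, first after-effect trial
```

Under these assumptions the sequence contains 200 trials, and the first after-effect trial is trial 171 when counting from 1.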

Audio recording and processing

Microphone input was amplified and passively split into two identical channels – a raw (unprocessed) signal for offline acoustic analysis and a real-time digitally modified signal presented back to the subject through headphones. Both the raw and processed audio signals were recorded on a laptop computer.

The participant’s feedback was manipulated using a commercial digital signal processor capable of altering the resonance properties of the speech signal without a corresponding change in the voice fundamental frequency (VoiceOne; T.C. Electronic, Risskov, Denmark). The feedback shift was restricted to the first major resonance of the vocal tract for the production of vowels, the first formant (F1), which is inversely related to the height of the tongue within the vocal tract. The microphone signal was split into non-overlapping low- and high-frequency components (filter cut-off of 1350 Hz for females, and 1100 Hz for males) with the low-frequency component shifted by 30%. Pilot tests confirmed that the procedure successfully increased only the F1 frequency without changing the fundamental frequency or the second formant (F2), which is related to the front–back position of the tongue. The change in F1 had the desired effect of changing the vowel’s perceived phoneme category from /ε/ (‘head’) to /æ/ (‘had’). The total signal processing delay was less than 15 ms.

Acoustic analysis

For the 215 productions of the target word ‘head’ (15 prior to the rTMS procedure and 200 during the adaptation procedure following rTMS), a 30-ms segment centered about the vowel midpoint was used to extract mean F1 and F2 frequencies using linear predictive coding analysis in Praat [version 5.1.3, (Boersma & Weenink, 2010)], which implements the Burg algorithm for determining linear prediction coefficients (Childers, 1978). Changes in vowel production during the adaptation procedure were computed as the proportion change in formant frequency relative to the mean for the initial baseline trials. The resulting normalized values of F1 and F2 allowed for the direct comparison of subjects with different vocal tract lengths (e.g. males and females). Using the normalized formant values, three measures were obtained: (i) the change in formant frequency at the end of the ramp phase (averaged over the last ten trials), (ii) the change in formant frequency at the end of the hold phase (averaged over the last ten trials) and (iii) the persistence of the change in formant frequency immediately following the removal of the feedback manipulation (averaged over the first ten trials for the after-effect phase).
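The normalization and the three summary measures can be illustrated with a short sketch. The names (`adaptation_measures`, `example`) and the F1 values are hypothetical, and the sketch assumes that the baseline mean is taken over all 30 baseline-phase trials of the adaptation procedure; the text says only "the initial baseline trials".

```python
from statistics import mean

# Trial index ranges (0-based, end-exclusive) for the 200 adaptation trials.
BASELINE = (0, 30)
RAMP = (30, 70)
HOLD = (70, 170)
AFTER = (170, 200)


def adaptation_measures(f1_hz, window=10):
    """Proportion change in F1 relative to baseline, averaged over three windows."""
    if len(f1_hz) != AFTER[1]:
        raise ValueError("expected 200 adaptation trials")
    # Assumption: baseline mean over all 30 baseline-phase trials.
    base = mean(f1_hz[BASELINE[0]:BASELINE[1]])
    change = [(f - base) / base for f in f1_hz]
    return {
        "ramp_end": mean(change[RAMP[1] - window:RAMP[1]]),        # last ten ramp trials
        "hold_end": mean(change[HOLD[1] - window:HOLD[1]]),        # last ten hold trials
        "after_effect": mean(change[AFTER[0]:AFTER[0] + window]),  # first ten after-effect trials
    }


# Hypothetical, idealized F1 track in Hz (not data from the study).
example = [600.0] * 30 + [570.0] * 40 + [540.0] * 100 + [552.0] * 30
measures = adaptation_measures(example)
print(measures)
```

Because each value is divided by the speaker's own baseline mean, a negative proportion indicates a lowering of F1 regardless of the speaker's absolute formant frequencies.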

To verify that the rTMS did not introduce changes prior to the adaptation procedure, a comparison was made between the 15 productions of the vowel /ε/ prior to the stimulation sequence and the first 15 trials under normal feedback conditions (during the baseline phase, trials 1–15) immediately following the stimulation. In addition to F1 and F2 frequency, an examination of fundamental frequency (in Hz) and amplitude (in dB) was carried out between the two sets of utterances.

rTMS procedure

Targeting of rTMS stimulation was carried out on the basis of a high-resolution anatomical T1-weighted magnetic resonance imaging (MRI) scan previously obtained for each subject. Coil placement was guided using BrainSight 2 software (Rogue Research, Montreal, Canada), following an MRI-to-head co-registration procedure. Once co-registered, infrared tracking (Polaris, Northern Digital, Waterloo, Canada) was used to monitor the position of the coil relative to the subject’s brain.

All TMS stimulation was carried out using an air-cooled 70-mm figure-of-eight coil with a MagStim RAPID 1400 stimulator. Resting motor threshold (RMT) was obtained by delivering single TMS pulses to the left motor cortex hand area. RMT was defined as the stimulation intensity at which motor-evoked potentials (> 50 μV) were observed from surface muscle recordings from the first dorsal interosseous muscle of the right hand in approximately half of the trials. The RMT for the TMS group ranged from 45 to 70% of maximum stimulator output (mean ± SD = 58 ± 6.75%).

The target region for the rTMS stimulation was the left inferior parietal lobe (IPL) (supramarginal gyrus, SMG), located using MNI coordinates (x = −52, y = −40, z = 36). Subjects in the experimental group received 600 pulses of 1-Hz rTMS at 110% of their RMT. rTMS at this frequency and duration causes a decrease in brain excitability over the stimulated area that lasts for at least 10–15 min (Boroojerdi et al., 2000; Gerschlager et al., 2001). The entire experiment, post TMS, lasted approximately 14 min. Pulse delivery was computer-controlled. Control subjects received sham stimulation in which the coil was placed over the same scalp region but with the stimulator output set to 1% (producing an audible click).
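The timing and intensity of the stimulation protocol follow from simple arithmetic, sketched here for the range of resting motor thresholds reported for the TMS group. The function name `rtms_summary` is hypothetical, and the check against 100% of stimulator output is an added sanity check rather than part of the reported procedure.

```python
def rtms_summary(rmt_percent, n_pulses=600, rate_hz=1.0, factor=1.10):
    """Return (train duration in minutes, intensity in % of maximum stimulator output)."""
    train_min = n_pulses / rate_hz / 60.0      # 600 pulses at 1 Hz
    intensity = rmt_percent * factor           # 110% of resting motor threshold
    if intensity > 100.0:
        raise ValueError("intensity exceeds maximum stimulator output")
    return train_min, intensity


# RMT values from the text: lowest, group mean and highest (% of maximum output).
for rmt in (45, 58, 70):
    minutes, intensity = rtms_summary(rmt)
    print(f"RMT {rmt}%: {minutes:.0f} min train at {intensity:.1f}% output")
```

The stimulation train therefore lasts 10 min, and the adaptation task of approximately 14 min falls within or close to the 10–15 min window of reduced excitability cited above.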

Results

Baseline comparison

Group mean values for fundamental frequency, amplitude, F1 and F2 during the two baseline phases (pre- and post-TMS) are shown in Fig. 1. Differences between the two groups (TMS vs. control) and between conditions (pre-TMS and post-TMS) were examined by carrying out a two-way mixed-factorial analysis of variance (anova) (with GROUP as the between-subjects factor and CONDITION as the within-subjects factor) for each acoustic measure. With a single exception, the main effects of GROUP and CONDITION were not statistically reliable (P > 0.05). The single reliable result was a main effect of CONDITION for F1 frequency (F(1,18) = 4.57, P < 0.05), although the effect was nearly identical for the two groups (reduction of F1 by 0.29 and 0.33% in the TMS and control groups, respectively) and the GROUP × CONDITION interaction effect for F1 was not reliable (F(1,18) = 0.038, P = 0.85); thus while a small difference in F1 was observed between the pre-TMS and post-TMS conditions, it was not related to the application of the rTMS stimulation. Overall, no effect of the rTMS was evident in any of the baseline measures.
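For readers unfamiliar with the mixed-factorial design, the F ratios for one between-subjects and one within-subjects factor can be computed from sums of squares as in the following sketch. It is not the authors' analysis code (the software used is not stated): the function name `mixed_anova`, the data layout and the F1 values are hypothetical, and the sketch assumes equal group sizes.

```python
from statistics import mean


def mixed_anova(groups):
    """F ratios for a balanced design with one between and one within factor.

    groups maps a group label to a list of subjects; each subject is a list
    with one value per within-subject condition.
    Returns {effect: (F, df_effect, df_error)}.
    """
    labels = list(groups)
    n = len(groups[labels[0]])        # subjects per group
    k = len(groups[labels[0]][0])     # within-subject conditions
    if any(len(groups[g]) != n for g in labels):
        raise ValueError("this sketch assumes equal group sizes")

    grand = mean(v for g in labels for s in groups[g] for v in s)
    g_mean = {g: mean(v for s in groups[g] for v in s) for g in labels}
    c_mean = [mean(s[c] for g in labels for s in groups[g]) for c in range(k)]
    cell = {g: [mean(s[c] for s in groups[g]) for c in range(k)] for g in labels}

    ss_group = n * k * sum((g_mean[g] - grand) ** 2 for g in labels)
    ss_subj = k * sum((mean(s) - g_mean[g]) ** 2
                      for g in labels for s in groups[g])
    ss_cond = n * len(labels) * sum((m - grand) ** 2 for m in c_mean)
    ss_inter = n * sum((cell[g][c] - g_mean[g] - c_mean[c] + grand) ** 2
                       for g in labels for c in range(k))
    ss_resid = sum((s[c] - mean(s) - cell[g][c] + g_mean[g]) ** 2
                   for g in labels for s in groups[g] for c in range(k))

    df_group = len(labels) - 1
    df_subj = len(labels) * (n - 1)   # error term for the between factor
    df_cond = k - 1
    df_inter = df_group * df_cond
    df_resid = df_subj * df_cond      # error term for the within effects

    ms_subj = ss_subj / df_subj
    ms_resid = ss_resid / df_resid
    return {
        "GROUP": (ss_group / df_group / ms_subj, df_group, df_subj),
        "CONDITION": (ss_cond / df_cond / ms_resid, df_cond, df_resid),
        "GROUP x CONDITION": (ss_inter / df_inter / ms_resid, df_inter, df_resid),
    }


# Hypothetical F1 values in Hz as [pre-TMS, post-TMS] per subject (not study data).
f1_hz = {
    "TMS": [[580.0, 577.0], [612.0, 611.0], [645.0, 640.0]],
    "control": [[590.0, 589.0], [620.0, 616.0], [655.0, 654.0]],
}
for effect, (f_ratio, df1, df2) in mixed_anova(f1_hz).items():
    print(f"{effect}: F({df1},{df2}) = {f_ratio:.3f}")
```

With ten subjects per group and two conditions, the error degrees of freedom in this scheme are 2 × (10 − 1) = 18, which matches the F(1,18) values reported for the baseline comparison. P values would additionally require the F distribution, for example from a statistics library.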

Fig. 1. Group means for fundamental frequency (F0), RMS amplitude, F1 and F2 during baseline word production prior to and following the TMS procedure (real or sham). Error bars show one standard error of the mean.

Adaptation comparison

Example data for one control subject during the feedback adaptation procedure are shown in Fig. 2. Following the onset of the auditory feedback manipulation (shown schematically as a solid black line), the subject showed a compensatory change in F1 frequency in the direction opposite that of the feedback shift. The change in F1 persists following the sudden restoration of normal feedback (beginning at trial 171), indicating that the adaptation involved a change in feed-forward motor planning.

Fig. 2. Example of the acoustic measurement scheme (A) and the F1 shift and resulting adaptive response to auditory feedback manipulation (B). As shown in A, a 30-ms time window is extracted from the mid-portion of the vowel and the first two vowel formants are derived. The adaptive response in B reflects the percentage change in F1 relative to the baseline.

No reliable change in F2 frequency (the non-modified formant) was found for either group (averaging < 1% change relative to baseline for all three phases – end of the ramp phase, end of the hold phase and immediately after restoring feedback to normal), and a two-way anova revealed no reliable effect of GROUP (F(1,8) = 0.024, P > 0.05) or PHASE for F2 (F(1,8) = 1.36, P > 0.05). The mean change in F1 (relative to baseline) for the three phases is shown in Fig. 3. Overall, the subjects in the control group exhibited a robust compensatory response in F1 frequency at the end of the ramp phase (mean change = −6.52%) that reached its peak at the end of the hold phase (mean change = −10.55%). The effect was maintained immediately following the removal of the feedback manipulation (after-effect mean change = −8.32%). In contrast, the subjects in the TMS group exhibited a much smaller change in F1 at the end of the ramp phase (mean change = 3.43%), which remained small throughout the hold phase (mean change = −3.93%) and after-effect phase (mean change = −3.45%).

Fig. 3. Percentage change in F1 during production of ‘head’ during the ramp, hold and after-effect phases for sham and TMS groups.

The difference in F1 between groups and between experimental phases was evaluated using a two-way mixed-factorial anova, with GROUP (TMS vs. control) as the between-subject factor and PHASE (hold vs. after-effect) as the within-subject factor. A highly significant main effect of GROUP was observed (F(1,8) = 31.99, P = 0.0005), as well as a significant effect of PHASE (F(1,8) = 4.28, P = 0.032), and no reliable interaction (F(1,8) = 2.57, P = 0.107), confirming the impact of the rTMS procedure on the observed degree of adaptation. The change in F1 for the two groups across the successive blocks of ten trials is presented in Fig. 4. It can be seen that although the two groups start off similarly, by the end of the ramp phase the two groups have diverged significantly (as shown by the significant GROUP effect) with the control and experimental groups diverging at block 10 (100 trials). The control group continues to adapt throughout the hold phase while the experimental group does not. Both groups display a significant after-effect.

Fig. 4. Group trends showing the proportion of change in F1 throughout the adaptation procedure for both the sham and TMS groups. Error bars show one standard error of the mean.

Discussion

Sensorimotor adaptation occurs when a constant and predictable change in self-generated sensory input results in a compensatory change in motor output. The importance of movement-produced feedback to sensorimotor control, learning and plasticity was elegantly detailed in studies of distorted, displaced and delayed sensory feedback conducted by Held and colleagues (Held & Hein, 1958; Held & Freedman, 1963; Held, 1965). Although the IPL is known to play a major role in visuomotor adaptation (e.g. Clower et al., 1996; Ghilardi et al., 2000; Newport & Jackson, 2006; Girgenrath et al., 2008; Chapman et al., 2010), this is the first study that has addressed the potential role of this brain area in sensorimotor adaptation for speech. Speakers were exposed to a constant alteration in auditory feedback, consistent with a perceived lowering of the tongue within the oral cavity and they adjusted their speech motor output in a direction that is compatible with increasing tongue height. Prior to the feedback manipulation, one group of participants was exposed to inhibitory 1-Hz rTMS over the left SMG which resulted in a reduction in the adaptive response.

For studies involving prism adaptation, the distortion in the visual field occurs instantaneously and error correction is induced rapidly. In the present study, the feedback manipulation was introduced gradually over approximately 2.5 min (40 trials) and then maintained at a maximal level of alteration for approximately 7 min. Using a similar approach, it has been shown that the onset of the adaptive response depends on the feedback manipulation exceeding a certain threshold, or magnitude of change, before a motor adjustment is initiated (Purcell & Munhall, 2006). Consistent with previous studies, both groups exhibited an initial response to the feedback at approximately the same relative magnitude. The similarity in the two groups during the ramp phase of the feedback alteration suggests that the response to the error was not substantially affected by the stimulation. In contrast, the stimulation had a more significant effect once the feedback manipulation was no longer changing (the hold phase). One of the major differences is that during the hold phase, the error becomes more predictable because it is no longer changing. It is during this phase that the two groups differ substantially. The lack of change in the TMS group in the face of incomplete compensation suggests either an inability to detect the induced error, even though the error was still present, or a reduction in the ability to correct further for the induced error. Given that both groups were able to modify their speech motor output during the ramp phase, it appears that error detection was not substantially affected. Rather, it appears more likely that the IPL disruption contributed to reducing the effectiveness of the motor adjustment to the induced error.

The contribution of the IPL to speech motor adaptation is noteworthy on two accounts. In computational and theoretical models of speech production, auditory feedback errors are associated with the posterior superior temporal gyrus (STG). The posterior STG and portions of the planum temporale (PT) have been identified as sites of auditory error detection which become activated when there is a mismatch between predicted sensory consequences of a speech motor action (forward modeling) and the resulting feedback (Guenther, 2006; Tourville et al., 2008). Neuroimaging studies have implicated these brain areas as supporting multisensory (auditory–somatosensory) convergence (Foxe et al., 2002; Dhanjal et al., 2008). These same regions are activated for both speech production and speech perception (Hickok & Poeppel, 2007; Dhanjal et al., 2008; Hickok et al., 2009, 2011; Zheng et al., 2009) and hypothesized to contain both sensory and motor cells used in coordinate transformation (from sensory to motor) for speech (Hickok et al., 2009). In this regard, the lack of effect of the IPL stimulation on the unaltered speech production and the minimal influence of the IPL disruption on the initial phase of the feedback alteration suggest that these auditory cortical areas were not significantly affected by the stimulation. Hence, the IPL contribution to speech production is in contrast to that associated with auditory cortical areas.

One possible difference between the contribution of auditory cortical areas and the IPL to speech production is that of speech motor learning. The parietal cortex has been identified as a site for motor learning and is an integral component for the formation of forward and inverse internal models of motor control (Wolpert & Kawato, 1998; Desmurget & Grafton, 2000; Della-Maggiore et al., 2004). The IPL receives projections from the dorsal auditory and visual streams as well as projections from somatosensory cortex (predominantly feedback) and feed-forward projections from premotor and inferior frontal cortices (for overviews see Rauschecker & Scott, 2009; Rauschecker, 2010). The IPL is the target of output from the cerebellum (Clower et al., 2001) and works in concert with the cerebellum to facilitate sensorimotor prediction and induce sensorimotor plasticity (Blakemore & Sirigu, 2003). Overall, the IPL receives multisensory input and is ideally situated as an important network component for imitation and subsequent learning (Rizzolatti et al., 2006; Molenberghs et al., 2009). For speech, internal models have been suggested for speech motor learning, and inaccurate or impaired sensorimotor predictions have been associated with developmental speech motor problems (see Max et al., 2004; Guenther, 2006). Recently, the IPL, and particularly the SMG, has been shown to be directly involved in modality-independent phonological (sound) processing, a necessary condition for speech motor development (Hartwigsen et al., 2010).

One possibility is that for normal speech production, the auditory cortex is part of a sensorimotor control network executing well-learned, automated speech motor routines and subsequently monitoring real-time performance through auditory feedback. However, in the face of errors requiring internal model adjustment (adaptation/learning), the inferior parietal region may be engaged. Linking on-line or real-time sensorimotor control, including feedback monitoring, to the posterior superior temporal lobe and sensorimotor learning to the inferior parietal cortex is consistent with a number of observations. When unpredictable auditory feedback perturbations are introduced (Tourville et al., 2008), when auditory feedback is masked (Christoffels et al., 2007) or when auditory verbal feedback is distorted (Fu et al., 2006), increased activation in posterior STG and PT, but not IPL, is observed, compensating for impaired self-monitoring. In contrast, when predictable feedback alterations are introduced, such as delayed auditory feedback (Hashimoto & Sakai, 2003) or the current manipulation, recruitment of the IPL and dorsal SMG is observed, apparently adapting motor commands to the new sensorimotor conditions. Relatedly, producing well-learned speech (Wise et al., 1999; Blank et al., 2002; Soros et al., 2006) does not normally activate the IPL while producing novel speech or oromotor sequences does (Bohland & Guenther, 2006; Dhanjal et al., 2008).

We suggest that the area in and around the dorsal portion of the SMG in the IPL is an important component in the network for sensorimotor integration associated with speech motor learning and subsequently sensorimotor plasticity (see also Rauschecker, 2010). In contrast, the region in and around the PT and posterior STG is a region for real-time sensorimotor interactions and verbal self-monitoring. These two brain areas subserve different yet related functions. For speech the IPL may be involved in comparing multisensory information and feedback-based learning and, through connections with motor-related regions, update inverse models to establish new speech motor patterns. Undoubtedly, other cortical and subcortical areas contribute to sensorimotor adaptation for speech and speech motor learning. Additional neuroimaging data are needed to map out the underlying neural substrate.

Acknowledgements

This work was supported by grants from the Natural Sciences and Engineering Research Council of Canada, the Canadian Institutes of Health Research and the National Institutes of Health (NIDCD–R01DC00763).

Abbreviations

anova, analysis of variance; MRI, magnetic resonance imaging; PT, planum temporale; RMT, resting motor threshold; rTMS, repetitive transcranial magnetic stimulation; SMG, supramarginal gyrus; STG, superior temporal gyrus.