A week ago, I “discovered” a fascinating toy on the web: the Toobaloo. It is a simple plastic tube with two bent, widened ends. It looks a bit like a telephone handset and, according to the vendors, “magnifies the voice, making it easier for students to hear the sounds that make up words (phonemes) as they learn to read, spell, or process language. In speech therapy, the Toobaloo’s auditory feedback helps improve articulation and phonology.” Therapists use it in the treatment of auditory processing disorder (APD), dyslexia, and stuttering (see this video).
If the Toobaloo is used successfully in the treatment of stuttering (I hope this will be evaluated soon), then this can be taken as evidence that (i) enhanced attention to the auditory feedback of speech reduces stuttering and that (ii) the opposite assumption – that stutterers rely too much on auditory feedback, and that fluency-inducing conditions like chorus reading or delayed auditory feedback (DAF) work by distraction from auditory feedback – is false.
Here are two videos about the Toobaloo:
By the way: Lind et al. (2014) demonstrated how much normally fluent speakers rely on auditory feedback. The experimenters covertly manipulated their participants’ auditory feedback in real time so that they said one thing but heard themselves saying something else. In 85% of all cases in which the exchange went undetected, the inserted words were experienced as self-produced. The results suggest that when we are speaking, we do have an idea of the message we want to convey, but it is the auditory feedback that tells us what exactly we have said. I will discuss this study in detail in the next post.
Today, I want to discuss the results of a study recently published by Alexander Whillier, Sina Hommel, Nicole Neef, Alexander Wolff von Gudenberg, and Martin Sommer in the journal PLoS One, titled: “Adults who stutter lack the specialised pre-speech facilitation found in non-stutterers.”
The study focused on speech preparation, particularly on the excitability of neurons in the tongue representation of the primary motor cortex prior to speech onset. Transcranial magnetic stimulation of the motor tongue area was applied to elicit motor-evoked potentials, which were recorded from the tongue. Recordings were made using a special mouthpiece that allowed participants to move the tongue while speaking.
The researchers found reduced facilitation in the motor tongue representation in adults who stutter, compared to controls, in all three experimental conditions. The conclusion they derive from their findings is very interesting. They propose the existence of a speech-specialized program of facilitation in normally fluent speakers and its absence in stutterers. They write:
“Specifically, under normal everyday speech conditions, adults who do not stutter have a specialization for speech preparation that does not generalize to non-speech mouth movement actions; conversely, adults who stutter do not have this specialization for speech coordination, and must rely on their general movement preparation sequences for all forms of speech.” (see 4.5.0 in the study)
I totally agree with this conclusion, but now the question arises: What kind of specialization for speech preparation is it? How, and by what, might the start of the sensorimotor program of a word or syllable be facilitated? I think it is a kind of ‘sensory frame’ which is activated prior to a voluntary motor action: The motor cortex needs to be coupled with the sensory cortical areas of the modalities in which the sensory feedback of the motor action is expected.
A sensorimotor program, as its name says, has not only a motor but also a sensory component. When I reach for a cup, I expect (anticipate) to feel the handle between my fingers. I would be very surprised or terrified if this perception did not result from my hand movement. I don’t believe in efference copies: First, there is no copier in the brain, and second, it is not plausible to assume that sensory expectation results from motor action – sensory anticipation might rather be the trigger of voluntary motor action. I think it is the anticipated sensory goal of a voluntary motor action that facilitates the activation of just that motor program (or learned sequence of programs) whose execution leads to the anticipated sensory perception.
When I flex one of my fingers, I expect to feel the finger flexed – it’s very simple. Every voluntary, internally initiated motor action has a goal, and this goal is primarily a sensory one, even if there are also goals on higher, more abstract levels. I reach for the cup because I wish to drink coffee or tea in order to make my fatigue vanish and do a better job and be successful, etc. – but the primary goal of the movement is to feel the handle of the cup between my fingers. It is not plausible to believe that a voluntary motor action is initiated without the corresponding sensory expectation being activated at least at the same time, if not earlier. And it may be just this specific sensory expectation – or an interaction, an excitatory loop between sensory expectation and motor program – that facilitates the start of the movement.
Originally, jaws, tongue, and lips did not develop for speaking, but for ingestion, for biting and chewing, the sensory frame of which is some sort of tactile, kinesthetic, and taste perception – it is important not to eat spoiled food and not to bite one’s own tongue when chewing. The auditory modality is rather unimportant in this sensory frame. In speaking, by contrast, the auditory modality is crucial for both language acquisition and self-monitoring.
The initiation of speech is a voluntary act, and as in other voluntary motor actions, we may have goals on a higher level, e.g., telling a story or expressing a thought or emotion. But the primary and simplest sensory expectation associated with speaking is to hear one’s own voice. This, and perhaps the expectation of hearing the initial phoneme(s), might be the appropriate sensory frame facilitating the start of the articulatory program of a word.
The simplest reason for saying a word is for it to be heard. This is clearest when shouting “hello!” or “help!” (and hardly any stutterer will be disfluent then). In everyday talking, however, we usually focus on the ‘higher level goals’ of speaking: We want to convey content. My personal experience is: The more I focus on the intended content of my speech – the more complicated or emotionally exciting this content is – the more likely I am to stutter. When I, by contrast, direct some attention to the sound of my voice and of my words, then I’m fluent.
Therefore, my hypothesis about the nature of the special facilitation mechanism in speech proposed by Whillier and colleagues is this: It is the coupling of articulatory motor programs with their expected sensory results, mainly in the auditory modality. Put simply, one should expect to hear something when starting to speak. This is, at the same time, an (automatic) allocation of attention, i.e., of perceptual and processing capacity, such that the auditory feedback of speech is sufficiently processed.
Insufficient processing of auditory feedback is, according to my theory, the main cause of stuttering. Insufficient processing of the sensory feedback of breathing can be an additional factor, as the interplay between respiration and the movement of jaws, tongue, and lips in speaking is very different from that in eating: The start of the expiratory movement must be coordinated with the onset of an utterance, which requires sufficient processing of the sensory feedback of breathing (see Section 2.2 in the main text). Thus I think it is the appropriate sensory frame for speaking which facilitates both the start of articulatory programs and the proper processing of their expected sensory feedback. So much for my hypothesis; below, I discuss some empirical results in more detail to see whether the hypothesis is consistent with them.
Before going back to the study that occasioned this post, let us have a look at two earlier studies conducted by Martin Sommer, Nicole Neef and their co-workers, in which the same method – transcranial magnetic stimulation (TMS) of the tongue representation in the primary motor cortex and recording of motor-evoked potentials (MEPs) from the tongue – was applied:
Neef et al. (2011) measured MEPs at rest after TMS pulses, and they found the intracortical excitability of the primary motor tongue representation to be less modulated by TMS in adults who stutter compared to fluent speakers. These were results at resting state, and the question arises: Did the reduced excitability modulation result from an intrinsic property of the motor cortex or from interactions with other brain regions, perhaps from the state of the whole brain at rest?
Resting state may differ between individuals, ranging from being alert, sensorily extroverted and ready to react, on one side, to being introverted, focused on one’s own thoughts or emotions, on the other side. The first state may be associated with greater, the latter with lesser excitability in the motor cortex. Sensory extroversion or introversion is a person’s ‘default mode’ of attention allocation, a general sensory frame of behavior. I do not claim that all stutterers are introverted, but the embedding of voluntary motor behavior in the sensory system seems to be weaker on average, compared to normally fluent speakers. This is suggested by some fMRI studies, in which resting state functional connectivity (RSFC) of brain areas in stutterers and non-stutterers was investigated.
The results show group differences in connectivity between the motor and premotor system, on one hand, and sensory, especially auditory areas, on the other hand. Yang et al. (2012) found decreased RSFC “between the posterior language area involved in the perception and decoding of sensory information and anterior brain area involved in the initiation of speech motor function”, and Yang et al. (2016) identified alterations of RSFC within basal ganglia-thalamocortical networks as “the reduced connectivity of the putamen to the superior temporal gyrus and inferior parietal lobules in adults who stutter” (both cf. the Abstracts).
Chang and Zhu (2013) found children who stutter to “exhibit attenuated [resting state] functional and structural connectivity within the auditory-motor network.” (p. 3709), and Chang et al. (2018) found aberrant, mostly decreased RSFC within and between intrinsic connectivity networks in the brain in children who stutter, mainly regarding the default mode network and the dorsal and ventral attention network, but even the visual network.
These results suggest that the state of the brain at rest tends to differ between stutterers and normally fluent speakers, and the aberrant excitability of the motor tongue representation at rest found by Neef and colleagues may be part of that general difference: It may result from a weaker default connectivity between the motor and sensory systems in stutterers. I think this is broadly in line with the authors’ suggestion “that the reduced intracortical facilitation might be mediated by disturbed interaction between cortical and subcortical networks modulating inhibitory and facilitatory intracortical circuits” (p. 1810), although I doubt whether the interaction is really disturbed or simply different, perhaps somewhat anomalous.
To elucidate whether the reduced modulation of excitability found in resting state is also present during speech production, Neef et al. (2015) applied TMS pulses during speaking. Verbal stimuli consisted of German verbs always starting with the prefix ‘auf’, e.g., ‘aufbleiben’ (to stay up), ‘aufstehen’ (to stand up), etc. In each trial, a verb was presented visually without the prefix, and participants had to read the verb silently and remember it. After an interval of 3 seconds, a plus sign was displayed, instructing the participant to speak the prefix ‘auf’ and to prolong the fricative [f]. After 1500 ms, a question mark appeared, prompting the articulation of the verb previously read (see p. 715 and Fig. 1A in the study).
The result was a left-lateralized facilitation of the tongue representation in the primary motor cortex in fluent speakers, but the absence of such facilitation in adults who stutter. Moreover, the magnitude of facilitation was negatively correlated with the frequency of stuttering.
The authors interpret these results as follows: “The fact that adults who stutter did not generate a facilitation of left motor cortex is likely related to weaker structural connectivity and altered interplay between left hemisphere speech-related brain regions” (p. 722), and I totally agree so far. But then they suppose that the transfer of selected sensorimotor programs from left BA44/BA6 to the orofacial motor cortex was impeded, with the consequence that motor planning could exert a smaller influence on motor cortex excitability. However, I think it is mainly the sensory component – the sensory expectation or the appropriate sensory frame – which was not sufficiently provided.
What would happen if the motor program of a word was not, or only insufficiently, transferred to the motor cortex? The speaker (or his brain) wouldn’t know how to move the muscles to articulate a word or phoneme. The result should be either a fuzzy or chaotic movement or a sudden lack of impetus. Neither matches the experience of stuttering: symptoms are distinct, and speech impetus is strong. Further, if the transfer of motor plans was the crucial problem in stuttering, how then to explain the effect of altered auditory feedback? Therefore, it may rather be the insufficient sensory component of sensorimotor plans in stutterers that caused the group difference in motor excitability facilitation.
An interesting detail is the finding that the increase of excitability during the prolongation of the fricative [f] was left-lateralized in stutterers, contrary to a trend towards a right-lateralized increase in fluent speakers. That was surprising because a comparable increase of excitability in both hemispheres had been expected (p. 723). A possible explanation is that stutterers, familiar with (involuntarily) prolonged speech sounds, took the “ffff” as a phoneme, which supported processing in the left auditory cortex, whereas non-stutterers tended to take “ffff” as non-linguistic noise, which supported processing in the right auditory cortex. This speculation implies that the activation of the auditory cortex modulates the excitability in the orofacial motor area.
The latter, however, is also suggested by findings of a negative correlation between activation in the auditory cortex and stuttering severity (Braun et al., 1997; Fox et al., 2000; Ingham et al., 2004). If stuttering severity is negatively correlated with auditory activation (A) as well as with the magnitude of excitability facilitation in the motor tongue area (B), then A and B can hardly be independent of each other. But which is the chicken, and which is the egg? At least we know that an acoustic factor – altered auditory feedback of speech – can cause stuttering to disappear.
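How strongly two correlations with stuttering severity (S) constrain the relationship between A and B can be checked with a small calculation. The sketch below uses the standard bounds that follow from a correlation matrix having to be positive semidefinite; the function name and the example values are purely illustrative assumptions and are not taken from the cited studies.

```python
import math

def correlation_bounds(r_sa: float, r_sb: float) -> tuple[float, float]:
    """Range of r(A, B) that is consistent with given r(S, A) and r(S, B).

    Follows from the requirement that the 3x3 correlation matrix of
    S, A and B is positive semidefinite.
    """
    centre = r_sa * r_sb
    half_width = math.sqrt((1 - r_sa**2) * (1 - r_sb**2))
    return centre - half_width, centre + half_width

# Hypothetical correlation values (NOT from the cited studies)
moderate = correlation_bounds(-0.5, -0.5)  # lower bound is below zero
strong = correlation_bounds(-0.8, -0.8)    # lower bound is above zero

print("moderate:", moderate)
print("strong:  ", strong)
```

With moderate correlations the admissible range still includes zero, so the inference that A and B are related is only plausible; once the squared correlations sum to more than 1, a positive correlation between A and B is forced.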
Further, there are several studies in which a relationship was found between enhanced speech fluency in certain situations (e.g., in fluency-inducing conditions, or before and after therapy) and greater activation in the auditory cortex (see Table 1 in the main text for an overview). Is it plausible to assume that such changes of fluency occur without change of the excitability in the orofacial motor area? I think it is the auditory activation indicating the presence of an adequate sensory frame of speaking which facilitates orofacial motor excitability in such conditions.
The new study, Whillier et al. (2018), is an extended replication of Neef et al. (2015). The authors adopted the experimental setup but designed two new procedures, Experiments 1 and 2, whereas Experiment 3 replicated the 2015 study. Here is a short description of what participants had to do in the three experiments:
Experiment 1: Participants spoke the prefix “auf” and the verb immediately, as soon as the verb appeared on the screen. A full stop character signaled “readiness” directly before the verb was presented.
Experiment 2: The second experiment increased the preparation time in order to distribute the cognitive load over time. Participants silently memorized the displayed verb prior to the speech signal, which appeared ca. 5 seconds after verb presentation.
Experiment 3: This experiment replicated the design applied in the 2015 study. It began with silent memorization of the verb as in Exp. 2. About 3 seconds after verb presentation, a visual signal prompted participants to pronounce the prefix “auf” for 1500 ms by prolonging the fricative (“auffffff”), before transitioning into the remembered verb when the speech signal appeared.
As mentioned above, excitability facilitation of the left orofacial motor area was reduced in the stuttering group in all three experiments. However, the between-group difference was smallest in Experiment 3, where stutterers approached non-stutterers’ levels, and greatest in Experiment 1. The authors explain this by assuming that the additional time provided by drawing out the utterance reduces complexity (see 4.5.0 in the study). But they concede that Experiment 3 possibly placed additional demands on working memory (keeping the verb in memory concurrently with the “ffff” prolongation).
In my view, an alternative explanation could be as follows: The “ffff” prolongation provided an acoustic stimulus, which supported the activation of a speech-adequate sensory frame by drawing a sufficient portion of the speaker’s attention to the auditory channel. In Experiments 1 and 2, by contrast, nothing supported the activation of the adequate sensory frame, as stimuli and task-relevant signals were presented visually.
My hypothesis is that the group difference in excitability facilitation in the motor tongue representation found by Whillier and colleagues arose because the control participants activated an appropriate sensory frame (with the main focus on auditory input) before they started speaking the verb, whereas the stuttering participants did not (or did so to a lesser degree). This should be testable in the following ways:
(1) In a variation of Experiment 1, stimuli (verbs) and signals are presented acoustically. This is not simply a word repetition or ‘shadowing’ condition, as the prefix “auf” must be produced prior to the verb. A reduced group difference in MEP facilitation in this experiment as compared with the original Experiment 1 can be taken as evidence that the excitability in the motor tongue area prior to speech is modulated by the auditory cortex.
(2) In a variation of Experiment 2, verbs and task-related signals are presented acoustically, and participants are instructed to imagine the word (prefix+verb) acoustically when memorizing it (imagine what the word sounds like). In a recreation of the original Experiment 2, participants can be instructed to imagine the word (prefix+verb) visually, in its written or printed shape, when memorizing it. If the group difference in MEP facilitation in the acoustic variant is smaller than in the visual one, then this can be taken as evidence that the appropriate sensory frame for speaking facilitates the excitability in the motor tongue area prior to speech onset.
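Both proposed tests come down to the same comparison: is the gap between the groups smaller in the acoustic variant than in the visual one? A minimal sketch of that comparison follows. It assumes that each participant contributes one facilitation value (here taken to be the MEP amplitude before speech onset divided by the baseline amplitude); the function names and all numbers are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean

def group_gap(controls, stutterers):
    """Mean MEP facilitation of controls minus that of stutterers."""
    return mean(controls) - mean(stutterers)

def gap_reduction(visual, acoustic):
    """How much smaller the group gap is in the acoustic variant.

    A positive value means the groups are closer together when stimuli
    and signals are presented acoustically.
    """
    return group_gap(**visual) - group_gap(**acoustic)

# Hypothetical facilitation values per participant (NOT real data)
visual = {"controls": [1.4, 1.5, 1.6], "stutterers": [1.0, 1.1, 1.2]}
acoustic = {"controls": [1.4, 1.5, 1.6], "stutterers": [1.3, 1.4, 1.5]}

print("gap, visual variant:  ", round(group_gap(**visual), 2))
print("gap, acoustic variant:", round(group_gap(**acoustic), 2))
print("reduction of the gap: ", round(gap_reduction(visual, acoustic), 2))
```

This is only a descriptive comparison. In a real experiment, the question would be whether the group-by-condition interaction is statistically reliable, which requires an appropriate inferential test on the individual measurements.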