Features  |   March 01, 2003
Prosody: The Music of Language and Speech
The ASHA Leader, March 2003, Vol. 8, 6-8. doi:10.1044/leader.FTR1.08042003.6
Language has traditionally been considered a function so advanced that it sets the human race apart from our animal counterparts. But prosody—the melody of language—does not lag far behind in making us unique as well.
Prosody is a tool of human expression that is conveyed acoustically by way of durational, intensity, and frequency cues. To these conventional cues, one could add linearity (e.g., abrupt vs. smooth changes in pitch, loudness, or duration) as a possible fourth dimension.
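As a rough illustration of the three conventional cues named above, the following Python sketch measures duration, intensity, and fundamental frequency on two synthetic tones that stand in for a stressed and an unstressed syllable. Everything in it is an assumption made for illustration: the function names, the 8000 Hz sample rate, the tone parameters, and the simple autocorrelation pitch estimate. Real speech would call for more robust methods.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

def synth_tone(f0_hz, dur_s, amplitude):
    """Return a pure sine tone as a list of samples (a crude stand-in for a vowel)."""
    n = int(round(dur_s * SAMPLE_RATE))
    return [amplitude * math.sin(2 * math.pi * f0_hz * i / SAMPLE_RATE) for i in range(n)]

def duration_seconds(samples):
    """Durational cue: length of the signal in seconds."""
    return len(samples) / SAMPLE_RATE

def rms_level_db(samples):
    """Intensity cue: root-mean-square level in dB relative to amplitude 1.0."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

def estimate_f0(samples, fmin=75.0, fmax=400.0):
    """Frequency cue: pick the lag with the largest autocorrelation in a plausible range."""
    min_lag = int(SAMPLE_RATE / fmax)
    max_lag = int(SAMPLE_RATE / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag

# Hypothetical "stressed" syllable: longer, louder, higher in pitch.
stressed = synth_tone(f0_hz=200, dur_s=0.25, amplitude=0.8)
# Hypothetical "unstressed" syllable: shorter, softer, lower in pitch.
unstressed = synth_tone(f0_hz=160, dur_s=0.125, amplitude=0.4)

for label, tone in (("stressed", stressed), ("unstressed", unstressed)):
    print(label, duration_seconds(tone), round(rms_level_db(tone), 1), round(estimate_f0(tone)))
```

In this toy case all three cues move together, which is only one of many possible combinations a speaker might use.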
Like language, prosody takes many years to learn, and there is both a productive and a receptive side to it. Some people acquire this function very well; for others, it may involve a lifelong struggle, either because they are unable to adequately perceive prosody or because the pattern that they produce attracts unwanted attention.
Broadly speaking, prosody serves to aid the transmission of linguistic and paralinguistic (emotional and attitudinal) information in a manner that is efficient and appropriate in a given language community. Although generally considered a part of phonology, linguistic prosody interacts in a language-specific manner with semantic, syntactic, morphologic, and pragmatic domains of language processing as well.
For example, in stress-timed languages such as English, common prosodic processes include phrasal stress, in which certain words receive relative prominence. Since phrasing and syntax are also particular to a language, it follows that the marking of boundaries by way of pausing, pitch resetting, and syllable lengthening is also language specific.
Much less tied to language is what linguists term the paralinguistic or command aspects of prosody. Although not always subject to control, paralinguistic prosody can be deliberate and the outcome may not always be congruent with the content of the linguistic code. Examples of this kind of prosody would include the deliberate signaling of affect, sarcasm, empathy, or the relation one holds to the person or audience being addressed.
Outside the realm of linguistics, prosody has been viewed as an emergent and somewhat ill-defined property, frequently rated as abnormal in motor speech disorders. Recent advances in neuro-imaging along with speech analysis in the physical domain (acoustics, kinematics, force-displacement, feedback) have been instrumental in revealing the sensorimotor nature of prosody and some of its neurologic underpinnings.
Perhaps this has given impetus to increased consideration of prosody as a primary rather than a secondary process in normal and disordered speech. Indeed, prosody doesn’t just come about. Rather, like force, range, and duration of limb movements, prosody is planned and programmed in speech, favoring the assembly of smaller into larger units (e.g., onsets and codas into syllables, syllables into words, etc.).
Intrinsic and Extrinsic Prosody
The terms intrinsic and extrinsic prosody describe motor speech functions that are motivated by the linguistic code on the one hand, and use of the code for pragmatic effect on the other. Intrinsic prosody taps into left lateralized cortical circuits (including the anterior insula) that are part of a distributed loop involving the cerebellum and the basal ganglia (BG). In this regard, data suggest that the left hemisphere, more so than the right, is dedicated to faster speech activities, including voice onset time (VOT) and the production of consonants (transients).
Interestingly, both VOT and the production of consonants involve re-phasing of structures coordinated in speech (programming) that is language specific, sequence sensitive, and stress and rate dependent. For example, VOT is shorter when a stop plosive is preceded by a sibilant than when it is produced by itself as the onset of a syllable, and shorter if the syllable is rapidly produced or does not receive stress than when it is produced slowly or with stress. Likewise, the reassignment of /p/ as a syllable-arresting consonant in /kamp/ to a syllable releasing consonant in /kampout/ is an example of re-phasing. This reassignment is more likely during fast speech rates but less likely if /kamp/ receives stress, speech rate allowing.
In summary, intrinsic prosody organizes the grouping of speech gestures within the short time domain and gives rise to what is best characterized as a discontinuous rhythmicity. To this effect, the left hemisphere-BG-cerebellum complex promotes motor strategies (such as the above) that bind syllables and syllable sequences together, thereby enabling faster and smoother speech. This push toward faster speech can to some degree be accommodated by left hemispheric cortical neurons that are biased toward driving supra-laryngeal (lips/tongue) rather than laryngeal articulation. This is owing to their processing speed advantage over right hemispheric neurons that show a reverse bias.
Overall, however, it is apparent that for the cerebellar-BG clutch function to work properly and to avoid “bottleneck” situations, phase relations among the processing stages of planning, programming, and feedback need to be properly coordinated. Phasing errors can lead to difficulties with initiating utterances and transitioning to syllables, which can in turn lead to the establishment of inappropriate syllabic boundaries. Avoidance of impending bottleneck situations may well take the shape of indiscriminate re-scaling (downward or upward with overall increased monotony) and re-phasing (as may be reflected in VOT, pausing, re-syllabification).
Extrinsic prosody refers to more deliberate modulations that yield meaningful yet subtle differences in the way words are spoken. Unlike intrinsic prosody, it is probably driven by right hemisphere mechanisms, particularly if it involves intonation or is to effect longer perceptual groupings. I believe this aspect of prosody should be differentiated from the more obvious overarching prosodic variations that occur in autonomically—involuntarily—driven (i.e., limbic) emotional speech.
Apart from its preferred reliance on modulation of fundamental frequency (F0, perceived as pitch), extrinsic prosody differs from intrinsic prosody in some other important respects as well. Unlike intrinsic prosody, which typically negotiates the rapid planning and programming of micro-structured sound sequences at subconscious levels of processing, extrinsic (non-limbic) prosody capitalizes on our ability to more consciously (volitionally, deliberately) manipulate speech planning, programming, and/or monitoring for pragmatic effect.
Good lawyers are quite skilled at this function, as they raise rhetorical questions at a deliberately slower than normal speed followed by a longer than normal pause to impress the jury (You think—, this man— seated next to me— is— guilty—?). Clearly, this mode of speech contrasts with automatic stereotypical speech that is thought to rely on left cortical if not basal ganglia-cerebellar processing.
Cognitive Speech Production
For both modes of speech to be balanced in most everyday speaking situations requires working memory, attention, and resource allocation. This task is daunting because it involves organizing motor gestures into short and long assemblies, simultaneously recruiting resources in the left and right hemisphere.
Just this aspect is often problematic in patients with Alzheimer’s and Parkinson’s disease and traumatic brain injury. Faced with overload, they may frequently rely on default speech modes, which I believe are recruited in the left hemisphere. While reaching for linguistic tools, they render their communication efforts pragmatically inept because their speech is frequently too fast, un-intonated, and uncaptivating. This cognitive aspect of speech production is often overlooked and may even play a part in communication disorders associated with autism and schizophrenia, or occur with “press of speech” (incessant continuation of speech) in people with aphasia.
Within the typical speaking population, this balancing act, as well as the decision-making involved in talking in tune not only with oneself but also with an audience, takes years of growing up and listening, and depends on social cognitive maturation as well as linguistic and sensorimotor development. As children develop, they are bombarded with auditory input. Interestingly, although they may establish syllabic gesture control while imitating themselves as much as others, it is not until they become linguistically active that they adopt language-specific scaling patterns such as the English preference for a trochaic stress pattern (a stressed syllable followed by an unstressed syllable, e.g., “round about the cauldron go”). Likewise, although infants are capable of producing a wide range of VOTs, language-specific control of VOT is mastered only when children reach about 30 months of age.
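The trochaic pattern mentioned above can be made concrete with a small Python sketch. It groups a stress sequence into two-syllable feet and checks that each foot is strong-weak. The syllable division and stress marks for the example line are my own annotation (a conventional scansion, with a final foot that lacks its weak syllable), and the function names are hypothetical.

```python
def split_into_feet(stresses, foot_size=2):
    """Group a string of stress marks ('S' strong, 'w' weak) into feet."""
    return [stresses[i:i + foot_size] for i in range(0, len(stresses), foot_size)]

def is_trochaic(stresses):
    """True if every foot is strong-weak; a lone final strong syllable is allowed."""
    feet = split_into_feet(stresses)
    # A one-syllable foot can only occur at the end, given how feet are sliced.
    return bool(feet) and all(foot in ("Sw", "S") for foot in feet)

# Assumed scansion of "round about the cauldron go".
line = [("Round", "S"), ("a", "w"), ("bout", "S"), ("the", "w"),
        ("caul", "S"), ("dron", "w"), ("go", "S")]
pattern = "".join(stress for _, stress in line)

print(pattern, split_into_feet(pattern), is_trochaic(pattern))
```

An iambic sequence such as "wSwSwS" fails the same check, which is the contrast the trochaic preference describes.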
Mastering the Music of Language
Learning the music of one’s language is difficult for at least three reasons. First, children may need to avoid spectral and durational (rate) differences between their own and adult productions as they are learning to produce segmented speech that is consistent with the ambient linguistic code (i.e., the language of the caregivers). In fact, recently discovered mirror neurons (neurons whose activity correlates with observed action including speech) may help this kind of linguistic speech production as they signal information that can be employed for the programming and planning of “motor” segments. However, their contribution to dynamic aspects such as the phasing and scaling of speech movements to effect loudness and rate modulations is yet to be revealed. From this perspective, it may in fact be more difficult to motorically discover and/or generate the melody of the language than it is to produce the segments of a language.
Another reason why this task is challenging is that prosodic cues are not independent and in fact systematically co-vary to some degree (e.g., loudness and pitch may both be increased in stressed syllables). Intra- and inter-speaker differences suggest that cues may be flexibly combined to achieve desired prosodic goals. To the extent that males have a smaller range, at least where pitch is concerned, and may be biased to employ left-hemisphere prosodic strategies, they may in comparison to females be less inclined or less able to adopt listener-oriented expressive speech strategies. In this regard, males are indeed at a disadvantage in learning “motherese” (an overinflected and sometimes neologistic mode of verbal communication used by mothers to stimulate communication with their infants or toddlers).
Finally, balancing demands for intrinsic and extrinsic prosody can be quite tricky. I believe that, as children develop, they alternate between periods in which they seem to work on intrinsic prosody and periods in which they work on extrinsic prosody. As an illustration of this, I remember my son Michael’s attempts to sing when he was about 4 years old. At that time, his renditions contained quite a bit of melody even though the song had few words to it. Yet, when he turned 5, he impressed me with fast renditions of fairly inarticulate songs that had lots of words in them.
Reading the Music of Language
The music of language and speech is intricate, and finding the key or reading the “notes” in which it is written remains a challenge. But taking on this quest is intriguing, particularly because it can be pursued from many vantage points that in themselves represent fresh and promising lines of inquiry.
We have yet to begin to unravel the implications of prosody development for the theory of mind that unfolds from childhood into adolescence. Likewise, since the melody of speech and language is at the crossroads of many domains (cognitive motor, emotional motor, attentional motor, to name just a few), close scrutiny of the manner in which it is disordered will likely be instrumental in the diagnosis and treatment of speech and language disorders.
Frank Boutsen is a former Mayo Fellow and has published several articles and chapters in the area of motor speech disorders. His current research interests include prosody in motor speech disorders. He is affiliated with the University of Oklahoma Health Sciences Center where he teaches undergraduate, graduate, and doctoral courses in the neural bases and neuropathologies of speech.
Suggested Readings
Boutsen, F., & Christman, S. (2001). Aprosodia: Whether, where and why? In Maassen, B., Hulstijn, W., Kent, R., & van Lieshout, P.H.M.M. (Eds.), Speech motor control in normal and disordered speech (pp. 232–236). Nijmegen, the Netherlands: Vantilt.
Buccino, C., Binkofski, G. R., Fink, G. R., Fadiga, L., Fogassi, L., Gallese, R., et al. (2000). Action observation activates premotor and parietal areas in a somatotopic manner: An fMRI study. European Journal of Neuroscience, 13, 400–404.
Cadalbert, A., Landis, T., Regard, M., & Graves, R. E. (1994). Singing with and without words: Hemispheric asymmetries in motor control. Journal of Clinical and Experimental Neuropsychology, 16(5), 664–670.
Mayer, M., Dogil, G., Ackermann, H., Erb, M., Riecker, A., Wildgruber, D., & Grodd, W. (2000). Prosody in speech function: A paradigm for functional imaging and first result. In Proceedings of the fifth seminar on speech production: Models and data (pp. 281–284). Munich: Universität München.
Monrad-Krohn, G. H. (1947). Dysprosody or altered “melody of language.” Brain, 70, 405–415.
Monrad-Krohn, G. H. (1963). The third element of speech: Prosody and its disorders. In Halpern, L. (Ed.), Problems of dynamic neurology (pp. 101–117). Jerusalem: Hebrew University Press.
Riecker, A., Ackermann, H., Wildgruber, D., Dogil, G., & Grodd, W. (2000). Opposite hemispheric lateralization effects during speaking and singing at motor cortex, insula, and cerebellum. NeuroReport, 11(9), 1997–2000.
Studdert-Kennedy, M. (2000). Imitation and the emergence of segments. Phonetica, 57, 276–283.
FROM THIS ISSUE
March 2003
Volume 8, Issue 4