Features  |   June 01, 2006
Spoken Language Processing: A Convergent Approach to Conceptualizing (Central) Auditory Processing
Author Notes
  • Larry Medwetsky is currently Vice President of Audiology at the Rochester Hearing and Speech Center. He has served individuals with spoken language processing difficulties in various settings. Medwetsky has published and presented on many topics, with a special focus on the processes underlying speech processing and on processing deficits in populations with and without hearing loss. Contact him by e-mail at lmedwetsky@rhsc.org.
The ASHA Leader, June 2006, Vol. 11, 6-33. doi:10.1044/leader.FTR2.11082006.6
More than 50 years have passed since it was determined that lesions in the central auditory nervous system could significantly impact the processing of auditorily presented information (Bocca et al., 1954). Initially, testing focused on determining sites of lesion, but eventually researchers applied many of these tests to children who were exhibiting processing difficulties in the academic setting. Over time, researchers have tried to develop a better understanding of the processes underlying these difficulties and of how best to assess and determine the nature of these deficits. Yet there has been much debate about what underlies central auditory processing and whether, in fact, it actually exists. Even as far back as 1973, Rees questioned whether central auditory processing disorder (CAPD) was a meaningful concept. That is, was CAPD an actual disorder or merely a reflection of a language disorder that was manifested when stimuli were presented auditorily?
Documents Related to CAPD
In 1995, ASHA convened a task force with the goal of developing a consensus statement (ASHA Task Force on Central Auditory Processing Consensus Development, 1996). The task force defined central auditory processes as the auditory system mechanisms and functions responsible for a number of behavioral phenomena, including: sound localization and lateralization, auditory discrimination, auditory pattern recognition, temporal aspects of audition, and auditory performance with competing or degraded acoustic signals. CAPD was defined as an observed deficiency in one or more of the listed behaviors. The consensus group also recognized that there were many neurocognitive mechanisms and processes enlisted in the processing of acoustic signals.
In 2000, a gathering of 14 scientists and clinicians proposed a name change from central auditory processing disorders to auditory processing disorders (APD) to reflect that difficulty in processing auditorily presented information can occur not only in the central auditory nervous system but in peripheral structures as well (Jerger & Musiek, 2000). The group broadly defined APD as a deficit in the processing of information specific to the auditory modality, that is, devoid of any external influences, such as memory, attention, and so forth.
ASHA reconvened a task force in 2005 to address the role of the audiologist in (C)APD. The task force developed a technical report and put forth a position statement (ASHA, 2005) that reaffirmed that (C)APD refers to difficulties in the processing of auditory information in the central nervous system, as demonstrated by poor performance in one or more of the behavioral phenomena listed in the 1996 consensus statement. However, the working group also concluded that any definition of CAPD that requires complete modality specificity (i.e., demonstration of a deficit in the neural processing of auditory stimuli due solely to a deficit in the auditory modality) as a diagnostic criterion is neurophysiologically untenable. For example, it has been shown that attention can affect auditory evoked potential responses as early as 20-50 milliseconds post onset (Woldorff et al., 1993).
A Broader Perspective
One problem with all of the above documents is that a determination of only those aspects related to auditory perception may, at best, address only some of the "listening" difficulties confronting people in everyday life. A second problem is that there is currently no definition of CAP that provides a conceptual framework for how the various processes are engaged in the transduction and ultimate processing of acoustic stimuli (especially speech).
This broader perspective moves us beyond auditory processing toward an attempt to understand how individuals perceive and process spoken language, in what I term “spoken language processing.” In this context, CAP is considered a component of spoken language processing.
Spoken Language Processing
What follows is a simplified conceptualization of the major stages/transformations in which spoken language information is processed. For more detailed information, the reader is referred to Medwetsky (2002).
Initially, the acoustic signal is transformed into its neuroelectrical representations. These undergo a number of refinements within the central auditory nervous system whereby frequency, intensity, and phase information is derived. It should be noted that Kraus and her colleagues at Northwestern University (Johnson, Nicol & Kraus, 2005) have discovered that approximately one-third of the individuals with learning disabilities (LD) they evaluated revealed abnormal brain stem response patterns when presented with the speech stimulus /da/ (selected because of the short duration of the consonant, rapid formant transitions, and steady state vowel). This finding suggests that a significant proportion of individuals with LD exhibit difficulty with neural transmission at the very earliest stages of processing (i.e., within the central auditory nervous system).
The derived neuroelectrical patterns are ultimately relayed to the cortices of both hemispheres. For most individuals, the right hemisphere is involved in analyzing the more global aspects of the acoustic signal, that is, the slower changing acoustic aspects that convey suprasegmental information (such as amplitude and fundamental frequency contours), while the left hemisphere is involved in the analysis of the faster changing, frequency, and temporal features such as those that underlie formant transitions and voice onset time.
The extracted neuronal information is then conveyed to the linguistic centers and, in turn, activates the phonological/lexical representations in long-term memory that best match the incoming neuroelectrical patterns (although where and at what point in processing the suprasegmental information is integrated with the linguistic information remains unclear). The process whereby the words corresponding to the acoustic input are derived is called lexical decoding. The speed with which words are decoded, termed "lexical decoding speed," depends on a number of factors, including: (1) initial stimulus strength (i.e., more intense signals are more likely to activate a stored percept than signals whose intensity is near threshold); (2) the amount of attention allocated to the processing of the stimulus; (3) the activation thresholds of the stored representations in long-term memory (e.g., words that are in common usage or carry high emotional content, such as one's name, have low activation thresholds and are more easily activated); (4) neuronal organization and connectivity (e.g., commonly occurring tasks or familiar information have highly organized pathways, while unfamiliar tasks/information likely have weak, disorganized connections); and (5) phonological, semantic, and syntactic representation/organization, as well as world knowledge of the topic being presented. It is this combination of factors that allows individuals to hear the sentence "My mother baked a c…" and identify the missing word as cake without hearing the whole word. It is important to view this result as a complex interaction among the different types of information received via the sensory modalities and the information already stored in the brain.
When an activation threshold is sufficiently exceeded, the result is conscious perception (i.e., the information becomes accessible to the person and is said to reside in short-term memory). However, neurons can only fire for approximately one to two seconds before they return to their resting state. If individuals are to retain newly presented information, such as a phone number, neuronal firing must continue. This is accomplished through the active cognitive process known as attention (discussed earlier as part of the initial decoding stage).
Attention, as discussed here, is the process whereby cognitive resources are allocated to a specific source (or sources), while competing information that the listener considers irrelevant is ignored. There is general consensus that attentional direction is under the domain of the pre-frontal cortex (a collection of neuronal structures in the pre-frontal cortical region of the brain). Attention is thought to involve pre-frontal cortical neurons creating heightened excitation states in the neurons processing the "target" information, possibly with concomitant neuronal inhibition for stimuli deemed irrelevant. This continued neuronal excitation (through cognitive strategies, such as rehearsal and visualization, that maintain neuronal activation) allows the neurons to refire and maintain information in short-term memory. Ceasing attention prematurely will result in the rapid forgetting of that information. Thus, an inability to allocate attentional resources effectively, as in children with attention deficit hyperactivity disorder (ADHD), is often associated with weak working memory.
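The idea that a short-term memory trace decays unless attention refreshes it can be sketched as a simple decay-and-rehearsal loop. The decay rate, time steps, and "forgotten" cutoff below are arbitrary illustrations, not measured values:

```python
# Toy decay-and-rehearsal model: an STM trace decays each time step unless
# attention (rehearsal) restores it. All parameters are illustrative only.

DECAY_PER_STEP = 0.5   # trace strength halves each step without rehearsal
FORGOTTEN_BELOW = 0.1  # below this strength the item is effectively lost

def trace_strength(steps, rehearse):
    """Strength of a memory trace after `steps` time steps."""
    strength = 1.0
    for _ in range(steps):
        if rehearse:
            strength = 1.0              # attention refires the neurons
        else:
            strength *= DECAY_PER_STEP  # passive decay toward resting state
    return strength

# Without rehearsal, the trace drops below the "forgotten" cutoff within a
# few steps; with rehearsal, it stays at full strength indefinitely.
print(trace_strength(4, rehearse=False))  # 0.0625 -- effectively forgotten
print(trace_strength(4, rehearse=True))   # 1.0    -- maintained
```

The contrast between the two calls is the point: the same information survives or vanishes depending solely on whether attentional resources keep refreshing it.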
A key aspect of short-term memory is its limited duration and capacity. Only a limited amount of information can be retained in short-term memory at any one point in time, a limit commonly known as "short-term memory span." In adults, this is limited to the "magic number" 7 ± 2 units/chunks; that is, for familiar stimuli, from 5 to 9 units can typically be held and recalled. A unit can be a number, letter, word, grammatical clause, direction, concept, etc. It has been theorized that 7 ± 2 units represent the maximum number of neuronal regions whose firing can be maintained/attended to at any one point in time.
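Because the 7 ± 2 limit applies to units rather than raw items, regrouping items into familiar chunks effectively increases how much can be held. A minimal illustration (the span value and the phone-number digits are hypothetical examples):

```python
# Illustration of the 7 +/- 2 span limit: recall succeeds only if the number
# of units to be held does not exceed span. Chunking regroups raw items into
# fewer, larger units. The span value and example digits are illustrative.

SPAN = 7  # typical adult short-term memory span, in units/chunks

def fits_in_span(units):
    """True if the list of units can be held in short-term memory at once."""
    return len(units) <= SPAN

digits = list("5855461234")        # 10 separate digits: exceeds span
chunks = ["585", "546", "1234"]    # the same digits as 3 familiar chunks

print(fits_in_span(digits))  # False -- too many units as raw digits
print(fits_in_span(chunks))  # True  -- well within span once chunked
```

This is why phone numbers are conventionally written in grouped segments: the grouping turns ten units into three.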
To be useful in linguistic processing, information must not only be retained in short-term memory but also in the same order as presented. For segmental information, this likely occurs as part of the initial processing of the stimuli. However, for longer segments, such as a sequence of directions, the ability to maintain the correct order appears to require pre-frontal cortical control.
Another important processing stage, though not always appreciated, involves the integration of suprasegmental and linguistic information. This must occur in real time and relies on bi-directional inter-hemispheric information transfer across the corpus callosum. It is this process that allows one to differentiate the intended meanings of: (1) "That's an airplane?" (rising intonation contour) versus "That's an airplane." (falling intonation contour); (2) PERmit (a license; stress on the first syllable) versus perMIT (to allow; stress on the second syllable); and (3) "Out standing in his field" (a farmer) versus "Outstanding in his field" (a Nobel prize winner). Integration difficulties can make it harder to derive intended meanings. For example, individuals with autism or Asperger's syndrome have much greater difficulty integrating and using the information conveyed by suprasegmentals and, as a result, tend to be very literal in their interpretation of what has been said. These difficulties also occur in language users who find it difficult to interpret jokes, puns, and sarcasm because they cannot recognize and interpret the subtle suprasegmental changes used in these forms of discourse.
The above describes the processes that occur in quiet listening situations, but what about situations in which competing noise is present? Processing in noise can present significantly different challenges to the listener: what happens when competing noise is present, and which processes are engaged? One component involves frequency resolution/tuning curves for initial speech-to-noise enhancement; that is, the sharper the tuning curves, the more easily extraneous frequency components can be rejected from impinging on speech perception. However, in individuals with hearing loss (and likely in some individuals with normal hearing), the tuning curves tend to be wider and allow more extraneous stimuli to be processed. A second component involves auditory coherence, the perceptual ability to segregate competing stimuli into separate acoustic streams, a crucial component in the ability to selectively attend to one stream and ignore "irrelevant" streams. Another factor is the degree of separation between the "target" talker and the competing source, with research showing that the degree of separation is correlated with an individual's performance in competing noise; this is likely due to the ease with which neuronal attentional direction can be implemented. Lastly, the strength/fidelity of the neurological firing patterns affects the strength of the ensuing memory traces. Weak and/or distorted acoustic/neurological transmission results in weak memory traces that are easily susceptible to the effects of forward/backward masking and, in turn, can result in a reduced ability to perceive signals in noise.
A Better Conceptualization
The conceptualization of processing described above allows us to: (1) have a frame of reference for how normal spoken language processing occurs and a basis for understanding the impact/implication of deficits in specific processes; (2) have a foundation for developing a test battery that can determine the underlying nature and severity of any deficits present; and (3) derive management strategies that can best address these deficits, if present.
Assessment Goal
What is our goal in assessing an individual who is having difficulty processing spoken language, and, in turn, experiencing academic- or work-related difficulties? If we recognize that the processing of spoken language involves an intertwining of auditory processing, cognition, and language, ultimately we can construct an assessment battery that allows us to better understand the overall nature of an individual’s difficulties. This can be accomplished only through an interdisciplinary approach to testing.
Certain aspects of processing are best assessed by audiologists. These include electrophysiologic measures (such as brain stem responses to speech stimuli and the P300) as well as processing skills that are best examined through audiometric behavioral test measures (such as temporal resolution, lexical decoding speed, binaural separation and integration, and divided auditory attention; see below). By using stimuli such as pure tones, digits, single-syllable words, and sentences of low-level linguistic content, audiologists can minimize the contribution of basic language components to testing (while still recognizing the impact of higher order linguistic influences, such as phonological representations and lexical representations/organization, on processing).
Speech-language pathologists can also greatly add to the understanding of the nature of the spoken language processing deficit(s) that may be present. Depending on the presenting symptomatology, this can involve assessing phonological awareness/phonics, lexical and semantic knowledge/organization, syntactical abilities, sequencing, figurative language, pragmatics, etc. Other professionals also may contribute to the assessment process. See the sidebar for a suggested assessment battery.
Management Recommendations and Implementation
Once the results have been analyzed and their implications understood, recommendations are provided and management strategies are implemented using the Tripod Approach advocated by Ferre (2002). This consists of cataloguing management approaches into three main categories:
  • Compensatory approaches: strategies that address processing difficulties regardless of the underlying deficit, such as extended time, insertion of pauses into discourse, pre-teaching of material before it is covered in class, etc. Other approaches, such as a spelling variance (whereby students are judged by the content of their work and are not penalized for any spelling errors present), are designed to address a specific deficit.

  • Environmental modifications: acoustical treatment, use of overhead projectors/PowerPoint presentations, preferential seating, quiet testing room, earplugs during study time, assistive listening systems.

  • Specific therapeutic approaches: address specific deficits, such as phonological awareness, lexical decoding speed, auditory-linguistic integration (such as improving inter-hemispheric transfer of auditory-linguistic information to reduce the effects of ear dominance), fading-memory (i.e., earlier presented information fades rapidly from short-term memory), and so forth.

An Interdisciplinary Approach
Returning to the question posed by Rees at the beginning of this article, "Is CAPD an actual disorder or merely a reflection of a language disorder that is manifested when stimuli are presented auditorily?," the answer appears to be "Yes" and "Yes." The work by Kraus provides objective evidence that, for a certain segment of the population with processing issues, the difficulties are due at least in part to inefficient transmission within the central auditory nervous system. However, for others, the difficulties in processing speech may be due to language processing issues (such as disorganized lexical representations/associations) or ineffective cognitive skills such as those involving sequencing or attentional allocation.
Thus, spoken language processing clearly involves the intertwining of auditory, cognitive, and language processing mechanisms. Deficits in any of the processing stages will impact an individual's ability to process incoming speech effectively. Although delineation of specific central auditory processing deficits is a worthwhile endeavor and will add significantly to our test regimen, we have the ability to go beyond this and assess an individual's overall spoken language processing abilities. This necessitates an interdisciplinary assessment approach. Some processing skills will be best assessed by audiologists, while others will require the expertise of an SLP (and, in some cases, related professionals). The combined information can determine if and why individuals are having processing issues that affect them in everyday life and, in turn, yield recommendations/management strategies that can assist clients directly and/or guide other professionals in implementing specific therapeutic approaches.
Summary of Processes Engaged in Spoken Language Processing
  • Incoming stimuli are converted to neurological representations that are compared to patterns stored in long-term memory (LTM).

  • If there is a match and sufficient attention is paid to the signal, the LTM representation is activated (this activated state is called short-term memory or STM). This matching and activation process, known as decoding, must be done quickly and accurately.

  • Information can reside in STM for a very short period of time, unless attention is devoted to maintaining this information (e.g., rehearsal or active processing-thinking about what has been presented).

  • For most individuals, linguistic information is processed in the left hemisphere, while suprasegmental information is processed in the right hemisphere; these are somehow integrated “on the fly.”

  • The processed information must be retained in the same order as presented.

  • Individuals often must listen in the presence of competing noise; in order to attend to the “target” stimuli in noise, the brain directs attention to those neurons corresponding to the stimuli of interest, while ignoring/inhibiting the neurons corresponding to “competing” stimuli.

  • A separate process, occurring over time, involves the establishment of individual sound families (phonemes) and their symbolic representations.

Assessing Spoken Language Processes

The following is a summary of the processes that I examine using the Rochester Hearing and Speech Center approach:

  • Temporal resolution: the ability to detect rapid changes in the speech signal

  • Lexical decoding speed: the ability to process the words of speech quickly and accurately

  • Short-term/working memory: the degree and patterns in which information is maintained in conscious memory (e.g., comparison of earlier to later presented information)

  • Short-term/working memory span: the amount of information (# units) that can be retained in short-term/working memory

  • Sequencing: the ability to maintain speech sounds, words, or directions in correct order

  • Auditory-linguistic integration: the ability to integrate information across different auditory/language processing regions

  • Prosodic perception: the ability to perceive/replicate rhythmic patterns

  • Selective auditory attention: the ability to focus and recall target stimuli while blocking out competing stimuli. This can be evaluated by (a) figure-ground tests (i.e., speech embedded in noise) and (b) binaural separation (whereby competing stimuli are presented dichotically)

  • Divided auditory attention: the ability to recall both competing stimuli presented

  • Sustained auditory attention: the ability to maintain attention to verbally presented information over a period of time without a break

I also examine higher order phonological skills, including:

  • Phonemic synthesis: the ability to blend individually presented speech sounds and derive the target whole word

  • Sound-symbol associations (i.e., phonics): the ability to discriminate, sequence, and represent speech sounds through the use of symbols


References
American Speech-Language-Hearing Association. Task Force on Central Auditory Processing Consensus Development. (1996). Central auditory processing: Current status of research and implications for clinical practice. American Journal of Audiology, 5, 41–54.
American Speech-Language-Hearing Association. (2005). (Central) auditory processing disorders (Technical report). Rockville, MD: Author. Available on the ASHA Web site.
Bocca, E., Calearo, C., & Cassinari, V. (1954). A new method for testing hearing in temporal lobe tumors. Acta Oto-Laryngologica, 44, 219–221.
Ferre, J. M. (2002). Behavioral therapeutic approaches for central auditory problems. In Katz, J. (Ed.), Handbook of clinical audiology (5th ed., pp. 525–531). Philadelphia: Lippincott Williams and Wilkins.
Jerger, J., & Musiek, F. (2000). Report on the consensus conference on the diagnosis of auditory processing disorders in school-aged children. Journal of the American Academy of Audiology, 11(9), 467–474.
Rees, N. S. (1973). Auditory processing factors in language disorders: The view from Procrustes' bed. Journal of Speech and Hearing Disorders, 38, 304–315.
Woldorff, M. G., Gallen, C. C., Hampson, S. A., Hillyard, S. A., Pantev, C., Sobel, D., & Bloom, F. E. (1993). Modulation of early sensory processing in human auditory cortex during auditory selective attention. Proceedings of the National Academy of Sciences of the United States of America, 90, 8722–8726.