1 Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA, 2 Cognitive Brain Research Unit, Department of Psychology, University of Helsinki and Helsinki Brain Research Centre, Finland, 3 Department of Psychology, University of Iowa, Iowa City, IA, USA, 4 Laboratory of Neuropsychology, NIMH, Bethesda, MD, USA, 5 F.M. Kirby Research Center for Functional Brain Imaging, Kennedy Krieger Institute, Baltimore, MD, USA
Abstract
Key Words: auditory system, functional magnetic resonance imaging, nonspatial memory, prefrontal cortex, spatial memory
Introduction
Single cells in these cortical areas that receive input from auditory areas respond best to different features of the auditory stimulus. Neurons selectively responsive to vocalizations were found in the ventral prefrontal cortex (Romanski and Goldman-Rakic, 2002). In contrast, neurons responsive to spatial features of auditory stimulation were recorded in the dorsal prefrontal cortex (Azuma and Suzuki, 1984; Vaadia et al., 1986). Discharge of auditory responsive neurons in the temporo-parietal association cortex was dependent on the spatial source of the sound, and most of the auditory responses were elicited by natural sounds (Leinonen et al., 1980). The lateral intraparietal area has also been shown to contain neurons with spatially tuned auditory responses (Mazzoni et al., 1996). The responsiveness of auditory neurons in both the prefrontal and parietal cortices depends on the behavioral significance of the stimulus; that is, the neurons respond more strongly during active localization or memory tasks than during detection or simple fixation tasks (Vaadia et al., 1986; Grunewald et al., 1999; Linden et al., 1999).
As in the monkey, there is some evidence that the human auditory system also contains functionally dissociable pathways for processing spatial and nonspatial information. Some investigators obtained support for this dissociation in temporal auditory cortex (Baumgart et al., 1999; Belin et al., 2000, 2002; Shah et al., 2001), parietal cortex (Weeks et al., 1999), and prefrontal cortex (Alain et al., 2001), although others have found no clear evidence for such domain specificity in the auditory system (Bushara et al., 1999; Maeder et al., 2001; Zatorre et al., 1999, 2002). The nature of the dissociation, as well as the functional neuroanatomy of this possible auditory domain specificity, is therefore not yet clear, and none of these studies provides evidence regarding which cognitive operations required by the tasks were actually responsible for the observed dissociations.
In the present work, we used functional magnetic resonance imaging to study working memory for the location and identity of human voices in an attempt to determine whether the neural system for auditory working memory in humans, like the one for visual working memory (e.g. Courtney et al., 1996, 1998; Sala et al., 2003), exhibits a functional dissociation for spatial and nonspatial information. We used voices because prefrontal neurons in monkeys were shown to respond better to natural sounds or monkey vocalizations than to pure tones (Azuma and Suzuki, 1984; Romanski and Goldman-Rakic, 2002), and also because the anterior part of the superior temporal sulcus (STS) in humans has been shown to exhibit selective activation for voices, leading to the suggestion that this region may be analogous to the face-sensitive area in the fusiform gyrus, which is a part of the ventral visual pathway (Belin et al., 2000, 2002; Shah et al., 2001). The subjects performed a delayed recognition task for human voices and voice locations and a sensory-motor control task. To find out whether a functional dissociation between voice and location recognition might occur during specific phases of working memory, we performed separate analyses of task-related activations evoked during the sample, delay, and test periods of the two memory tasks.
Materials and Methods
Fourteen right-handed subjects (10 females) between the ages of 18 and 27 years (mean 22 years) participated in the study. The subjects were native English speakers and were screened for mental and physical health. They had no history of head injury, or of drug or alcohol abuse, and no current use of medications that affect central nervous system or cardiovascular function. The subjects gave written informed consent, and were paid 50 USD for participating in the experiment. The experimental protocol was approved by the Review Board on the Use of Human Subjects of the Johns Hopkins University and by the Joint Committee on Clinical Investigations of the Johns Hopkins Medical Institutions.
Stimuli
Voice samples consisted of pairs of words. The first word was a two-syllable adjective and the second word a five-syllable noun. The samples were recorded in a sound-proof room using CSL software (Sensimetrics Corporation, Somerville, MA, USA). The sampling rate was 44.1 kHz. The targeted words were embedded in a sentence ("John says that [further consideration] is important") to encourage natural speech, and the speakers were instructed to read the sentences in a neutral tone. The pair of targeted words was always situated in the same position within a sentence. The speakers read each sentence twice during the recording. Ten pairs of words were recorded, and three of them ("further consideration", "simple inauguration", and "constant unreality"), excerpted from the recorded sentences, were chosen for use in the study. Eight female voices were recorded. The speakers were native English speakers. The mean durations of the three word pairs were 1316 ms (SD 106 ms), 1378 ms (SD 59 ms), and 1333 ms (SD 62 ms), respectively. There were no significant differences in duration between the three pairs [F(2,21) = 1.33, P = 0.29]. The energy levels (dB) of the voice samples were normalized using CSL/ASPP software.
The voice samples were transformed with head-related transfer functions (HRTFs) to create localizable stereo stimuli for presentation through headphones (TDT-PD1 system; Tucker-Davis Technologies). Stimuli were presented at eight possible locations around the head. The coordinates of the sound locations, from the center of the head at nose level, were the following (azimuth/elevation in degrees from straight ahead): 0/40, 30/30, 40/0, 30/-30, 0/-40, 30/-30, 40/0 and 30/30. We created individualized HRTFs for seven different head sizes and measured the dimensions of each subject's head to find the best HRTF for each subject. In the magnet, the stimuli were presented through air conduction headphones.
For the control task, the auditory stimuli were phase-scrambled in the Fourier domain, maintaining frequency information and stimulus amplitude envelopes equal to those in the memory tasks. The phase-scrambled voices were presented simultaneously from four randomly selected locations, and thus were neither identifiable nor localizable.
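The phase-scrambling step can be illustrated with a minimal NumPy sketch. It is not the procedure used in the study: the function name `phase_scramble` and the toy two-tone signal are assumptions for illustration, and the sketch preserves only the magnitude spectrum. Matching the amplitude envelope, as described above, would need an additional step that is not shown.

```python
import numpy as np

def phase_scramble(signal, rng):
    """Randomize Fourier phases while keeping the magnitude spectrum (illustrative)."""
    spectrum = np.fft.rfft(signal)
    magnitudes = np.abs(spectrum)
    random_phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    scrambled = magnitudes * np.exp(1j * random_phases)
    # The DC bin (and the Nyquist bin for even lengths) must stay real-valued,
    # so they are copied from the original spectrum unchanged.
    scrambled[0] = spectrum[0]
    if signal.size % 2 == 0:
        scrambled[-1] = spectrum[-1]
    return np.fft.irfft(scrambled, n=signal.size)

# Hypothetical stand-in for a voice sample: 0.1 s of two tones at 44.1 kHz.
rng = np.random.default_rng(0)
t = np.arange(4410) / 44100.0
voice_like = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)
scrambled = phase_scramble(voice_like, rng)
```

The output has the same frequency content as the input but a different waveform, which is the property the control stimuli rely on.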
Each location and voice was presented 24-26 times during the experiment. Before the experiment, the subjects heard each location and voice once to gain familiarity with the stimuli, and once or twice more during the memory task training.
Visual stimuli (e.g. trial instructions and fixation cross) were presented using an LCD projector, located outside of the scanning room, connected to a Power Macintosh G3 computer running SuperLab software. The stimuli were projected on a rear projection screen mounted inside the bore of the magnet, behind the subject's head. Subjects viewed the stimuli through a mirror mounted at the top of the head coil.
Tasks
Two working memory tasks and a control task (Fig. 1) were presented in a delayed recognition paradigm in which subjects were instructed to remember either locations or voices, or neither. One second before each trial, the subjects were presented with an instruction image (for 0.5 s) consisting of the word "place" (for the location task), "voice" (for the voice task), or "none" (for the control task), indicating which task was to be performed. In the location task, the subjects were to memorize the auditory location independent of speaker or words spoken, and in the voice task, the speaker of the sample independent of auditory location or words spoken. The sample was presented for 1.5 s, followed by a memory delay of 4.5 s during which the subjects saw a blank screen with a fixation cross. Then, a test stimulus was presented for 1.5 s, during which time the subject indicated with a left or right button press whether or not the test stimulus was the same as the one in the sample period. Each subject was allowed to choose whether the right or left hand would correspond to the match response. The other hand was used for the no-match response. Responses were made with left or right thumb presses of hand-held button boxes that were connected via a fiber optic cable to a Cedrus RB-6 x 0 Response Box. The recorded words presented during the test period never matched the words presented during the sample period. Also, for the voice task, the auditory location presented during the test period never matched the location presented during the sample period. Similarly, for the location task, the voice presented during the test period never matched the voice presented during the sample period. Following each trial there was an intertrial interval of 3.0 s. Subjects also performed a sensorimotor control task with no mnemonic demand. For this task, the scrambled stimuli were presented with the same timing as in the memory tasks, but the subjects were instructed that they need not remember the locations or voices and should simply press both buttons when the test stimulus was played.
fMR Imaging and Data Analysis
MR images were acquired with a 1.5 Tesla Philips Gyroscan ACS-NT MR scanner (Philips Medical Systems). A T1-weighted structural image (70 axial slices, 2.5 mm, no gap, TR = 20 ms, TE = 4.6 ms, flip angle = 30°, matrix 256 × 256, FOV = 230 mm) was obtained before the functional scanning. During the performance of the tasks, subjects underwent T2*-weighted interleaved gradient-echo, echo-planar imaging (21 axial slices, 5 mm thickness, no gap, TR = 1500 ms, TE = 40 ms, flip angle = 70°, matrix 64 × 64, FOV = 230 mm). The images were phase-shifted using Fourier transformation to correct for slice acquisition time, then motion-corrected using automatic image registration (AIR) software (Woods et al., 1998), and analyzed separately for each subject using multiple regression (Friston et al., 1995; Ward, 2001) with Analysis of Functional NeuroImages (AFNI) software (Cox, 1996). Changes in neural activity were modeled as square-wave functions matching the time course of events of the experimental tasks. These square waves were convolved with a gamma function model of the hemodynamic response, using the following values: 2.0 s for lag, 3.0 s for rise time, and 5.0 s for fall time, to create the regressors of interest in the multiple regression analysis. Additional regressors were included to model sources of variance not related to the experimental manipulations (mean intensity between and linear drift within time series). Both memory task conditions (location and voice) were separately contrasted to the control task, and to each other, for each of the three main events of the tasks (sample, delay, and test). Each of these contrasts resulted in a Z-map for each subject.
Z-maps were registered into the Talairach coordinate system (Talairach and Tournoux, 1988) and resampled to 1 mm³. Average Z-maps were computed by dividing the sum of Z-values by the square root of the sample size using AFNI software (Cox, 1996). All tests of voxelwise significance were held to a Z threshold of 2.33, corresponding to P < 0.01, and corrected for multiple comparisons (experiment-wise P < 0.05) using a measure of probability that combines the individual voxel Z score threshold and the number of contiguous significant voxels. Based upon a Monte Carlo simulation run via AFNI (Ward, 2000), it was estimated that a 387 mm³ contiguous volume (six voxels, each measuring 3.59 mm × 3.59 mm × 5 mm) for the volume of the entire brain would meet the P < 0.05 threshold. For the direct comparison between memory tasks, the analysis was restricted to only those voxels showing significantly greater activity for any of the memory tasks versus control. Within this restricted number of voxels, a 258 mm³ cluster size (four voxels) satisfied a 0.05 experiment-wise probability. Activations were anatomically localized in the averaged maps using T1-weighted images.
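The arithmetic in this paragraph can be checked with a short sketch. The function names are hypothetical and the code only restates the formulas given in the text: the group Z is the sum of subject Z-values divided by the square root of the sample size, the cluster volume is the voxel count times the voxel volume, and the Z threshold is related to P through a one-tailed normal tail probability (the one-tailed reading is an assumption).

```python
import math
import numpy as np

def average_z(z_maps):
    """Combine per-subject Z-maps: sum over subjects divided by sqrt(N)."""
    z_maps = np.asarray(z_maps, dtype=float)
    return z_maps.sum(axis=0) / np.sqrt(z_maps.shape[0])

def cluster_volume_mm3(n_voxels, voxel_dims_mm=(3.59, 3.59, 5.0)):
    """Volume of a cluster of acquisition-size voxels, in cubic millimetres."""
    return n_voxels * float(np.prod(voxel_dims_mm))

def one_tailed_p(z):
    """Upper-tail probability of a standard normal variate."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Toy example: 14 subjects, each with Z = 1 at a single voxel.
group_z = average_z(np.ones((14, 1)))

whole_brain_cluster = cluster_volume_mm3(6)   # about 387 mm^3
restricted_cluster = cluster_volume_mm3(4)    # about 258 mm^3
p_at_threshold = one_tailed_p(2.33)           # just under 0.01
```

Dividing by the square root of N keeps the combined statistic on a unit-variance scale under the null hypothesis, so the same Z threshold can be applied to the group map.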
Frontal Cortex Region of Interest (ROI) Analysis
Based on anatomical hypotheses derived from previous studies of spatial and nonspatial working memory for visual stimuli (Courtney et al., 1998; Sala et al., 2003), ROIs encompassing the anterior inferior frontal gyrus and anterior insula (IFG/Insula), middle and posterior IFG (IFG), anterior middle frontal gyrus (MFG), and superior frontal sulcus/precentral gyrus (SFS/PreCG) were drawn in both hemispheres of a Talairach-transformed brain according to Brodmann areas (BAs) and anatomical landmarks of the Talairach (Talairach and Tournoux, 1988) and Damasio (1995) brain atlases. The IFG/Insula ROI included BAs 45 and 47 of the IFG (z = 5.0 mm to 16.0 mm). The posterior border of the IFG/Insula ROI was the anterior bank of the sylvian fissure (z = 5.0 mm to 12.0 mm) and the anterior bank of the precentral sulcus (PreCS) (z = 12.0 mm to 16.0 mm). The anterior border of the IFG/Insula ROI was the inferior frontal sulcus (IFS). The IFG ROI included BAs 44 and 45 of the IFG (z = 17.0 mm to 34.0 mm). The posterior border of the IFG ROI was the anterior bank of PreCS, and the anterior border was the posterior bank of IFS. The anterior MFG ROI included BAs 46 and 10 of the MFG (z = 5.0 mm to 23.0 mm). The posterior border of the MFG ROI was the anterior bank of IFS, and the anterior border was the posterior bank of the superior frontal sulcus (SFS). The SFS/PreCG ROI included the SFG within 6 mm of either side of the SFS (z = 35.0 mm to 63.0 mm) and BA 6 of the PreCG (z = 44.0 mm to 63.0 mm). The posterior border of the SFS/PreCG ROI was the anterior bank of the central sulcus (CS), and the anterior border was the posterior bank of PreCS.
For each ROI, the number of voxels significantly activated (not corrected for multiple comparisons) in each of the three main periods of the memory tasks relative to the corresponding period in the control task was computed for each subject. The number of significantly activated voxels was then normalized by dividing by the total number of voxels in each ROI. In addition, for each ROI, the signal intensities (β-coefficients) of the significantly activated voxels (corrected for multiple comparisons), determined as described above, were computed for each subject. Analysis of variance for repeated measures with subject as a random factor (BMDP2v, BMDP Statistical Software, Inc., Release 7.1) was used to test the main effects and interactions of task, event, hemisphere, and brain region on both the number of suprathreshold voxels and β-coefficients. A pairwise t-test was then used to test the effect of task on the number of activated voxels and β-coefficients separately for each ROI.
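The two per-subject ROI measures can be sketched as below. The function names and the toy arrays are hypothetical. The uncorrected count uses a plain Z threshold, while the mask of voxels surviving cluster correction is passed in as an input, because the correction step itself is not reproduced here.

```python
import numpy as np

def activated_fraction(z_map, roi_mask, z_threshold=2.33):
    """Suprathreshold voxels in the ROI divided by all voxels in the ROI."""
    active = roi_mask & (z_map > z_threshold)
    return active.sum() / roi_mask.sum()

def mean_beta(beta_map, roi_mask, significant_mask):
    """Mean beta-coefficient over ROI voxels flagged as significant."""
    selected = roi_mask & significant_mask
    if not selected.any():
        return float("nan")
    return float(beta_map[selected].mean())

# Toy one-dimensional "volume" of six voxels; the first four belong to the ROI.
z_map = np.array([3.0, 1.0, 2.5, 0.5, 2.4, 0.0])
beta_map = np.array([0.006, 0.001, 0.004, 0.000, 0.005, 0.000])
roi_mask = np.array([True, True, True, True, False, False])
corrected_mask = np.array([True, False, True, False, True, False])  # assumed given

fraction = activated_fraction(z_map, roi_mask)
intensity = mean_beta(beta_map, roi_mask, corrected_mask)
```

Normalizing by ROI size makes the voxel counts comparable across regions of very different volume, which matters for the region-by-task interaction tests.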
Response Topography Correlation Analysis
In the group average Z-maps, clusters of voxels that were activated in any of the planned contrasts (location versus control, voice versus control, or both tasks combined versus control) were assigned to six broadly defined anatomical regions: right and left lateral frontal cortex, right and left posterior parietal cortex, and right and left anterior parietal/temporal cortex. These regions are shown in Figure 7. Within each of these regions, for each subject, voxels were identified that were significantly positively activated for that subject individually in any of the same contrasts. Within each subject, within each of the three regions (frontal, parietal, and temporal), these voxels were ordered hierarchically, first from ventral to dorsal, then from anterior to posterior, then from left to right, to create a voxel index. The beta weights as a function of voxel index for each WM task thus became a single metric for the response topography within each region. The multiple regression was re-run with separate regressors for either odd- or even-numbered blocks of each task. Correlations were calculated between the response topography on odd blocks and the response topography on even blocks of the same WM task, as a measure of within-task consistency of the topography. Correlations were also calculated between the response topography on odd (even) blocks of one WM task and the response topography on odd (even) blocks of the other WM task, as a measure of between-task consistency of the topography. Correlation coefficients were converted to Z scores, and t-tests were performed to test whether the response topography within each region was more highly correlated within task than between tasks.
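The steps of this analysis can be sketched for one subject and one region. Several details are assumptions: coordinates are taken to be Talairach (x increasing to the right, y increasing anteriorly, z increasing dorsally), the r-to-Z conversion is taken to be Fisher's transform, and the two within-task (and the two between-task) values are averaged after transformation. All function names are hypothetical.

```python
import numpy as np

def voxel_index(coords):
    """Order voxels ventral-to-dorsal, then anterior-to-posterior, then left-to-right."""
    coords = np.asarray(coords, dtype=float)
    x, y, z = coords[:, 0], coords[:, 1], coords[:, 2]
    # np.lexsort treats the LAST key as the primary sort key.
    return np.lexsort((x, -y, z))

def topography_r(betas_a, betas_b, order):
    """Pearson correlation between two beta patterns along the voxel index."""
    return np.corrcoef(betas_a[order], betas_b[order])[0, 1]

def within_between_z(loc_odd, loc_even, voice_odd, voice_even, order):
    """Fisher-transformed within-task and between-task correlations."""
    within = [topography_r(loc_odd, loc_even, order),
              topography_r(voice_odd, voice_even, order)]
    between = [topography_r(loc_odd, voice_odd, order),
               topography_r(loc_even, voice_even, order)]
    return np.mean(np.arctanh(within)), np.mean(np.arctanh(between))

def paired_t(a, b):
    """Paired t statistic across subjects (degrees of freedom: n - 1)."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(d.size))

# Toy region of 50 voxels stacked along z, with opposite gradients per task.
rng = np.random.default_rng(1)
coords = np.column_stack([np.zeros(50), np.zeros(50), np.arange(50)])
order = voxel_index(coords)
loc_pattern, voice_pattern = np.linspace(0, 1, 50), np.linspace(1, 0, 50)
noise = lambda: rng.normal(0.0, 0.01, 50)
z_within, z_between = within_between_z(loc_pattern + noise(), loc_pattern + noise(),
                                       voice_pattern + noise(), voice_pattern + noise(),
                                       order)
```

In the full analysis, `paired_t` would be applied to the within-task and between-task values collected across subjects. Note that a Pearson correlation does not change when both patterns are reordered in the same way, so the voxel index matters for plotting and slope estimates rather than for the correlation itself.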
Results
The subjects were equally accurate in both memory task conditions. The percentage of correct responses was 83% for the location task and 84% for the voice task. The reaction times were significantly faster for the location (1869 ms) than for the voice (1956 ms) task (P < 0.05). The subjects rated both tasks as equally difficult to perform. Subjects reported having used several different memory strategies: visual, verbal, and auditory imagery to remember the locations, and mainly auditory imagery but also verbal strategies to remember the voices.
fMRI Results
Voxelwise Multiple Regression
Location and Voice Task Activations Relative to Control.
Sample Period (Table 1 and Fig. 2). For the location samples, activation was detected in the left superior temporal sulcus/gyrus (STS/STG) and in the left inferior parietal lobe/postcentral gyrus (IPL/PostCG). For the voice samples, there was bilateral activation of STS/STG and STG/Insula.
Delay Period (Table 2 and Fig. 3). Several temporal, parietal, and frontal regions were activated during the delay period of the tasks. In the temporal lobe, the right STG and bilateral STS/middle temporal gyrus (MTG) were activated during voice delays. In the parietal lobe, the right IPL and bilateral superior parietal lobe (SPL) were activated only during location delays, whereas the left IPL was activated during both delays. Finally, in the frontal lobe, the anterior middle frontal gyrus (MFG) was activated during location delays, while the inferior frontal gyrus/Insula (IFG/Insula), IFG, superior frontal sulcus/precentral gyrus (SFS/PreCG), and the medial part of the superior frontal gyrus (SFGm) were activated during both delays.
Test Period (Table 3 and Fig. 4). Several temporal, parietal, and frontal regions were also activated during the test period of the tasks. In the temporal lobe, the STS/STG was bilaterally activated by both tasks. In the parietal and frontal cortices, the IPL, SPL, IFG/Insula, IFG, anterior MFG, and SFGm were activated by both tasks.
Direct Voxelwise Comparisons: Location > Voice and Voice > Location (Table 4 and Fig. 5). Direct voxelwise comparisons between the two tasks revealed no significant differences during the sample period. During the delay period, the left SFS/PreCG and the right SPL were activated more for the location task than for the voice task, but there was no region exhibiting greater activation for voice than for location delays (when corrected for multiple comparisons). During the test period, however, whereas the right SPL was again activated more for locations than for voices, bilateral IFG/Insula was activated more for voices than for locations.
ROI Analysis in the Frontal Cortex
The voxelwise regression analysis suggested a dissociation between dorsal (parietal and SFS/PreCG) and ventral (IFG/Insula) cortical areas for spatial versus nonspatial working memory, but this analysis did not show a convincing double dissociation. Such a result does not prove the absence of a functional dissociation, however, and so the data were further analyzed using two other methods. First, because we had an a priori hypothesis regarding specific anatomical criteria for defining regions of interest in the frontal cortex, but not in other areas, we performed an ROI analysis only within the frontal cortex on both the number of activated voxels and the signal intensity (β-coefficients).
Sample Period. The ROI analysis during the sample period of the tasks demonstrated that there was a significant main effect of brain region on the number of significantly activated voxels [F(3,39) = 3.97, P < 0.05] and on signal intensities [F(3,39) = 11.56, P < 0.001] but no main effect of task nor interaction between the task and brain region.
Delay Period. During the delay period, there was a significant main effect of brain region [F(3,39) = 8.97, P < 0.005] and an interaction between task and brain region [F(3,39) = 6.20, P < 0.005] on the number of suprathreshold voxels. This number was significantly greater for voice than for location delays in the left IFG/Insula (0.034 versus 0.022 [number of activated voxels divided by the total number of voxels in the ROI], respectively, P < 0.01) and the left IFG (0.071 versus 0.053, P < 0.05). The interaction between task and brain region during the delay period was also significant for signal intensities (β-coefficients) of activated voxels [F(3,39) = 4.49, P < 0.05]. Signal intensity was significantly greater for voice than for location delays in the left IFG/Insula (0.0056 versus 0.0040, P < 0.05). Conversely, signal intensity of activated voxels was significantly greater for location than for voice delays in the right SFS/PreCG (0.0060 versus 0.0033, P < 0.05) (Fig. 6).
Comparisons across Sample, Delay, and Test Events. The results obtained from the multiple regression and ROI analyses suggest that the nature and magnitude of the spatial/nonspatial dissociation may differ at different times during the performance of a working memory task. Therefore, we also performed a four-way ANOVA to test main effects and interactions of task, event within the task, hemisphere, and brain region. The results showed a significant task × event interaction for the number of activated voxels [F(2,26) = 5.23, P < 0.05], although the interaction for signal intensity was not significant.
Functional Topography Correlation Analysis
To test the robustness of these findings in the frontal cortex and to further test for functional topographies in other brain regions where we did not have such specific, anatomically based hypotheses, we performed a correlation analysis on the pattern of activation magnitude within activated clusters in frontal, parietal, and temporal cortices. The analysis is described in detail in the Materials and Methods section, and the results for the delay period in two individual subjects are shown in Figure 7.
Sample Period. Functional topographies were not statistically more similar within task (across odd and even blocks) than between task (within odd or even blocks) during the sample period for any of the activated regions, although there was a trend toward this effect in the frontal cortex (r = 0.36 versus 0.32, P = 0.08). We also calculated the slope of the regression line for the beta coefficients as a function of voxel index for each subject. This is only a rough indicator of the functional topography, because, as can be seen in Figure 7, the plots are highly nonlinear. Nevertheless, for the ventral to dorsal voxel index order, the slopes were significantly different for the spatial and the identity tasks in the frontal cortex (5.1 x 106 and 2.9 x 106, respectively, P < 0.05), indicating that the amount of activation for the spatial task increases from ventral to dorsal frontal cortex while the amount of activation for the identity task decreases.
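The slope measure mentioned here is an ordinary least-squares line fitted to the beta coefficients as a function of voxel index. A minimal sketch follows; the function name and the synthetic betas are assumptions for illustration and carry no information about the study's actual values.

```python
import numpy as np

def topography_slope(betas_in_index_order):
    """Slope of the least-squares line through beta versus voxel index."""
    betas = np.asarray(betas_in_index_order, dtype=float)
    index = np.arange(betas.size)
    slope, _intercept = np.polyfit(index, betas, 1)
    return slope

# Synthetic example: betas that rise linearly with voxel index.
index = np.arange(200)
rising_betas = 0.001 + 2e-6 * index
falling_betas = 0.003 - 2e-6 * index

rising_slope = topography_slope(rising_betas)
falling_slope = topography_slope(falling_betas)
```

A positive slope under the ventral-to-dorsal ordering means activation grows toward dorsal voxels, and a negative slope means it grows toward ventral voxels. As the text notes, this is only a rough summary when the underlying profile is strongly nonlinear.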
Delay Period. During the delay period, functional topographies were significantly more similar within task than between tasks for left frontal (r = 0.69 and 0.56, respectively, P < 0.005), right frontal (r = 0.76 and 0.68, respectively, P < 0.05) and right parietal (r = 0.74 and 0.55, respectively, P < 0.05) cortices. As illustrated in Figure 7, frontal cortex activation for the location task increased whereas that for the voice identity task decreased with increasing voxel index, indicating a dorsal/ventral spatial/nonspatial functional topography (slopes = 1.86 x 106 versus 1.62 x 106 respectively, P < 0.05). Although the spatial task tended to produce greater activation than did the voice identity task across all activated portions of parietal cortex, the dissociation between the tasks was greatest in the most superior portion of this region (slopes = 1.30 x 104 versus 4.12 x 105 respectively, P < 0.05). The results of the correlation analysis are independent of the particular ordering chosen to define the voxel index. If the voxels are ordered first from posterior to anterior instead of from ventral to dorsal, and similar plots are prepared for the activations in the temporal region of the same subjects illustrated in Figure 7, there appears to be greater activation for the identity task in the anterior portion of the temporal region, consistent with previously reported results (for review see Rauschecker and Tian, 2000). However, neither left nor right temporal cortex showed a consistent functional topography across subjects with this analysis (r within/between = 0.70/0.72, P = 0.3, and 0.76/0.73, P = 0.3, respectively).
Test Period. During the test period, functional topographies were not statistically more similar within task than between task for any of the activated regions. However, the slopes were significantly different for the spatial and the identity tasks in the right frontal cortex (3.5 x 106 and 9.6 x 106 respectively, P < 0.05), indicating again that the amount of activation for the spatial task increases from ventral to dorsal frontal cortex while the amount of activation for the identity task decreases.
Discussion
Previous neuroimaging studies on spatial and nonspatial auditory processing have not provided evidence regarding which cognitive operations required by the tasks were responsible for the observed spatial/nonspatial dissociations (e.g. Weeks et al., 1999; Alain et al., 2001; Maeder et al., 2001). The current study suggests that the magnitude of the dissociation is greatest during maintenance in working memory (i.e. delay period), less during recognition or retrieval (test period), and least during encoding (sample period). The reason for this result is not entirely clear. Examination of the data suggests that the variance in the beta coefficient estimates was greater for the sample and test periods than during the delay, possibly because of intersubject variability in the hemodynamic lag. It also appears that the spatial extent of the proposed spatial/nonspatial functional topography may be smaller in parietal and temporal areas than in prefrontal cortex. Therefore, intersubject anatomical variability would interfere more with our ability to detect such a functional topography in the former areas. In addition, it may be that during stimulus presentation, both spatial and nonspatial information are processed, but because only the task-relevant information is actively maintained during the delay, the difference between the tasks becomes more pronounced during this time. This would also help explain why the dissociation was most robust in frontal cortex rather than in posterior areas. Posterior areas would be expected to show a greater dissociation if the differences in activation pattern reflected attentional modulation during stimulus presentation rather than working memory maintenance.
Unlike previous studies of spatial versus nonspatial auditory working memory, the present study used the identity of human voices as the nonspatial information to be remembered. Vocalization and natural sounds have been shown to elicit strong neuronal responses throughout the auditory system, including the temporal, parietal, and frontal cortices (Leinonen et al., 1980; Azuma and Suzuki, 1984; Romanski and Goldman-Rakic, 2002). As with human faces, a human voice contains information about the identity of a person and, thus, it can be considered as an "auditory face" (Belin et al., 2000, 2002). In monkeys, it has been shown that neurons sensitive to monkey vocalization were located in the ventral prefrontal cortex (Romanski and Goldman-Rakic, 2002). In the visual system, a clear ventral/dorsal dissociation in the prefrontal cortex was demonstrated earlier using faces and locations of faces as memoranda (Courtney et al., 1996, 1998; Sala et al., 2003). Working memory maintenance of face identity preferentially activated the inferior and middle frontal gyri, whereas maintenance of face locations preferentially activated the superior frontal sulcus (Courtney et al., 1996, 1998; Sala et al., 2003). This dorsal/ventral dissociation for visual locations and objects may be greater for faces than for other objects, but it is not specific to faces, as other objects show the same dissociation (Sala et al., 2003). Therefore, it is reasonable to presume that the dissociation observed in the current study is a general spatial versus nonspatial distinction and is not specific to voices. Indeed, the same dissociation was observed by Alain et al. (2001) and Arnott et al. (2002) using synthesized noise bursts. Ventral prefrontal regions have also been shown to be recruited by other types of nonspatial auditory tasks such as melodic, phonemic and pitch discrimination (Zatorre et al., 1992, 1994; Hsieh et al., 2001).
Previous research regarding spatial and nonspatial auditory perception, attention, and working memory has yielded seemingly contradictory results regarding whether there are dissociable neural systems for the different information domains. In one study, the right auditory cortex was shown to exhibit greater activity for moving than for stationary sounds (Baumgart et al., 1999). However, other studies have not found differential activity in the auditory cortex during active localization of sounds relative to passive listening (Bushara et al., 1999; Weeks et al., 1999). Recently, it was shown that the posterior superior temporal gyrus (STG) was activated by simultaneously presented spatially and spectrotemporally variable sounds but not by sequentially presented sounds, suggesting that the posterior STG is sensitive to both spatial and spectrotemporal features of sounds (Zatorre et al., 2002). In the present study, although there were no significant differences between the location and voice tasks during the sample period in the voxelwise multiple regression analysis, inspection of the patterns of activation for each memory task versus control suggests that perhaps the voice activation extends further anteriorly than the location activation. Results from individual subjects in the correlation analysis (Fig. 7) also suggest that anterior temporal cortex responds more during the voice identity task than the location task. Such a pattern would be consistent with the organization of auditory cortex that has been found in monkeys, with the AL and ML regions more selective for nonspatial auditory features and the CL region being more selective for auditory locations (for review see Rauschecker and Tian, 2000). Similarly, although the differences were not significant in direct comparisons between the location and voice tasks, the inferior parietal cortex was activated by location samples relative to control samples, but not by voice relative to control samples, which is in line with previous studies showing that the parietal cortex is involved in discrimination and memorizing of audiospatial information (e.g. Bushara et al., 1999; Weeks et al., 1999; Martinkauppi et al., 2000; Zatorre et al., 2002). Therefore, although the current results do not provide direct evidence for a spatial/nonspatial organization in auditory association areas during encoding, they are not inconsistent with this idea.
Only a few neuroimaging studies have compared spatial and nonspatial auditory processing directly (e.g. Weeks et al., 1999; Zatorre et al., 1999; Alain et al., 2001; Maeder et al., 2001). In one study, auditory attention to locations in space and to sound frequencies was shown to activate similar regions in temporal, parietal, and frontal cortex (Zatorre et al., 1999). On the other hand, three other studies showed anatomically dissociable patterns of activation during sound identification and localization tasks (Weeks et al., 1999; Alain et al., 2001; Maeder et al., 2001). In the study by Weeks et al. (1999), the subjects performed frequency and location discrimination tasks, and the right inferior parietal cortex was shown to be predominantly activated by localization, whereas the left inferior parietal cortex was predominantly activated by identification of sounds. Although the primary dissociation in the current study was between right dorsal frontal cortex for the spatial task and left ventral frontal cortex for the nonspatial task, there were no hemispheric laterality differences within parietal cortex, or within ventral or dorsal prefrontal cortex. Alain et al. (2001) asked their subjects to perform a delayed comparison task with a 1 s delay for locations and frequencies of synthesized sounds. The results showed that the right inferior frontal gyrus was activated more by the pitch than by the location task, whereas the right superior frontal sulcus was activated more by the location than by the pitch task. In the study by Maeder et al. (2001), the subjects were also asked to perform a delayed comparison task for locations of noise bursts. In their nonspatial task, the subjects were asked to detect certain environmental sounds (animal cries) among others (e.g. street, beach, railway station). This study did not reveal as clear a ventral/dorsal dissociation in the frontal cortex as did the study by Alain et al. (2001). There were slight differences in the locations of peak activity for direct comparisons between the tasks, but ventral and dorsal prefrontal regions were activated for both comparisons. The results of the current study are more similar to those of Alain et al. (2001).
The overall dorsal/ventral, spatial/nonspatial functional topography of the frontal cortex appears to be highly similar for auditory and visual working memory (e.g. Levy and Goldman-Rakic, 2000; Sala et al., 2003). Evidence from the monkey suggests that there is an auditory processing domain, separate from the visual processing domain, in the ventral prefrontal cortex. Auditory neurons were located more anteriorly and laterally than were visually responsive neurons (Romanski and Goldman-Rakic, 2002). In humans, however, within the spatial resolution of fMRI, working memory maintenance of faces and of voices appears to activate the ventral frontal cortex similarly (Rämä et al., 2001). Further research is needed to ascertain whether there are two distinct systems for maintenance of visual and auditory information in frontal cortex, both of which show a dorsal/ventral, spatial/nonspatial functional topography, or whether there is a single system for information maintenance independent of stimulus modality.
![]() |
Notes |
---|
Address correspondence to Pia Rämä, Cognitive Brain Research Unit, Department of Psychology, PO Box 9, 00014 University of Helsinki, Finland. Email: prama{at}cc.helsinki.fi.
![]() |
References |
---|
Arnott SR, Alain C, Hevenor S, Graham S, Dade LA, Grady C (2002) What, where, and how in the human prefrontal cortex. Program No. 181.1. 2002 Abstract Viewer/Itinerary Planner. Washington, DC: Society for Neuroscience, 2002. Online.
Azuma M, Suzuki H (1984) Properties and distribution of auditory neurons in the dorsolateral prefrontal cortex of the alert monkey. Brain Res 298: 343–346.
Baumgart F, Gaschler-Markefski B, Woldorff MG, Heinze HJ, Scheich H (1999) A movement-sensitive area in auditory cortex. Nature 400: 724–726.
Belin P, Zatorre RJ, Lafaille P, Ahad P, Pike B (2000) Voice-selective areas in human auditory cortex. Nature 403: 309–312.
Belin P, Zatorre RJ, Ahad P (2002) Human temporal-lobe response to vocal sounds. Brain Res Cogn Brain Res 13: 17–26.
Bushara KO, Weeks RA, Ishii K, Catalan M, Tian B, Rauschecker JP, Hallett M (1999) Modality-specific frontal and parietal areas for auditory and visual spatial localization in humans. Nat Neurosci 8: 759–766.
Courtney SM, Ungerleider LG, Keil K, Haxby JV (1996) Object and spatial visual working memory activate separate neural systems in human cortex. Cereb Cortex 6: 39–49.
Courtney SM, Petit L, Maisog JM, Ungerleider LG, Haxby JV (1998) An area specialized for spatial working memory in human frontal cortex. Science 279: 1347–1351.
Cox RW (1996) AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput Biomed Res 29: 162–173.
Damasio H (1995) Human brain anatomy in computerized images. New York: Oxford University Press.
Friston KJ, Holmes AP, Poline JB, Grasby PJ, Williams CR, Frackowiak RSJ (1995) Analysis of fMRI time-series revisited. Neuroimage 2: 45–53.
Grunewald A, Linden JF, Andersen RA (1999) Responses to auditory stimuli in macaque lateral intraparietal area. I. Effects of training. J Neurophysiol 82: 330–342.
Hsieh L, Gandour J, Wong D, Hutchins GD (2001) Functional heterogeneity of inferior frontal gyrus is shaped by linguistic experience. Brain Lang 76: 227–252.
Leinonen L, Hyvärinen J, Sovijärvi ARA (1980) Functional properties of neurons in the temporo-parietal association cortex of awake monkey. Exp Brain Res 39: 203–215.
Levy R, Goldman-Rakic PS (2000) Segregation of working memory functions within the dorsolateral prefrontal cortex. Exp Brain Res 133: 23–32.
Linden JF, Grunewald A, Andersen RA (1999) Responses to auditory stimuli in macaque lateral intraparietal area. II. Behavioral modulation. J Neurophysiol 82: 343–358.
Maeder PP, Meuli RA, Adriani M, Bellmann A, Fornari E, Thiran JP, Pittet A, Clarke S (2001) Distinct pathways involved in sound recognition and localization: a human fMRI study. Neuroimage 14: 802–816.
Martinkauppi S, Rämä P, Korvenoja A, Aronen H, Carlson S (2000) Working memory of auditory localization. Cereb Cortex 10: 889–898.
Mazzoni P, Bracewell RM, Barash S, Andersen RA (1996) Spatially tuned auditory responses in area LIP of macaques performing memory saccades to acoustic targets. J Neurophysiol 75: 1233–1241.
Rämä P, Falconero L, Courtney SM (2001) Working memory for faces and voices. Soc Neurosci Abstr 25:81.5.
Rauschecker JP, Tian B (2000) Mechanisms and streams for processing of what and where in auditory cortex. Proc Natl Acad Sci USA 97: 11800–11806.
Romanski LM, Tian B, Fritz J, Mishkin M, Goldman-Rakic PS, Rauschecker JP (1999) Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nat Neurosci 12: 1131–1136.
Romanski LM, Goldman-Rakic PS (2002) An auditory domain in primate prefrontal cortex. Nat Neurosci 5: 15–16.
Sala JB, Rämä P, Courtney SM (2003) Functional topography of a distributed neural system for spatial and nonspatial information maintenance in working memory. Neuropsychologia 41: 341–356.
Shah NJ, Marshall JC, Zafiris O, Schwab A, Zilles K, Markowitsch HJ, Fink GR (2001) The neural correlates of person familiarity. A functional magnetic resonance imaging study with clinical implications. Brain 124: 804–815.
Talairach J, Tournoux P (1988) Co-planar stereotaxic atlas of the human brain. New York: Thieme.
Tian B, Reser D, Durham A, Kustov A, Rauschecker JP (2001) Functional specialization in rhesus monkey auditory cortex. Science 292: 290–293.
Ungerleider LG, Mishkin M (1982) Two cortical visual systems. In: Analysis of visual behavior (Ingle DJ, Goodale MA, Mansfield RJW, eds). Cambridge: MIT Press.
Ungerleider LG, Haxby JV (1994) What and where in the human brain. Curr Opin Neurobiol 4: 157–165.
Vaadia E, Benson DA, Hienz RD, Goldstein MH Jr (1986) Unit study of monkey frontal cortex: active localization of auditory and of visual stimuli. J Neurophysiol 56: 934–952.
Ward BD (2000) Simultaneous inference for fMRI data. http://afni.nimh.nih.gov/afni/docpdf/AlphaSim.pdf
Ward BD (2001) Deconvolution analysis of fMRI time series data. http://afni.nimh.nih.gov/afni/docpdf/3dDeconvolve.pdf
Weeks RA, Aziz-Sultan A, Bushara KO, Tian B, Wessinger CM, Dang N, Rauschecker JP, Hallett M (1999) A PET study of human auditory processing. Neurosci Lett 12: 155–158.
Wilson FA, Scalaidhe SP, Goldman-Rakic PS (1993) Dissociation of object and spatial processing domains in primate prefrontal cortex. Science 260(5116): 1955–1958.
Woods RP, Grafton S, Holmes CJ, Cherry SR, Mazziotta JC (1998) Automated image registration: I. General methods and intrasubject, intramodality validation. J Comput Assist Tomogr 22: 139–152.
Zatorre RJ, Evans AC, Meyer E, Gjedde A (1992) Lateralization of phonetic and pitch discrimination in speech processing. Science 256: 846–849.
Zatorre RJ, Evans AC, Meyer E (1994) Neural mechanisms underlying melodic perception and memory for pitch. J Neurosci 14: 1908–1919.
Zatorre RJ, Mondor TA, Evans AC (1999) Auditory attention to space and frequency activates similar cerebral systems. Neuroimage 10: 544–554.
Zatorre RJ, Bouffard M, Ahad P, Belin P (2002) Where is where in the human auditory cortex? Nat Neurosci 5: 905–909.