INTRODUCTION
In a reverberant environment, sound waves propagate in many directions and reflect from nearby surfaces, presenting each ear with multiple echoes, arriving from different directions and with various time delays relative to the original source. Thus there is competition for localization between the first sound (leading source) and its echoes (lagging sources). A phenomenon that is thought to resolve this competition by suppressing directional information from the lagging sources is known as the precedence effect (PE). Not all information from echoes is suppressed, for they contribute to the overall percept of the leading source. With a few exceptions (Ebata et al. 1968; McCall et al. 1995), most studies on the PE do not emulate "real world" situations but rather try to simplify phenomena that occur in the real world by simulating only one lagging source (Blauert 1983; Wallach et al. 1949; Zurek 1980).
Studies on the PE can be carried out either in free field, where the leading and lagging sounds occur at different locations, or under headphones, where each location is simulated with different interaural time differences (ITDs). The onset of the leading and lagging sources is separated by a variable interstimulus delay (ISD). For ISD = 0, the sounds from the lead and lag sources are perceptually fused and the "phantom" image is located midway between the two speakers. As ISD is increased up to ~1 ms, the location of the phantom source gradually moves toward the leading speaker; this is the range of ISDs for what is termed "summing localization," whereby both the lead and lag contribute to the perceived direction of a single fused image. The PE is active for longer delays (typically in the range of 1-5 ms or more, depending on the stimulus waveform), where the perceived image location is dominated by the location of the leading source and echo suppression occurs (Blauert 1983; Litovsky and Macmillan 1994; Perrott et al. 1989; Shinn-Cunningham et al. 1993; Zurek 1980). Finally, for further increases in ISD, the limit of the PE, known as the echo threshold, is reached; the leading and lagging sources are both heard and localized at their respective positions. Echo thresholds vary with testing situation (Blauert 1983) as well as stimulus characteristics (Blauert 1983). For example, echo suppression increases for noise compared with clicks (Blauert 1983) and is stronger with lower overall level (Shinn-Cunningham et al. 1993) and increased stimulus duration (Blauert 1983).
Studies on the PE described thus far usually involve stimuli located along the horizontal axis. Recent studies have shown that it also operates in the elevation, or median sagittal plane, where interaural difference cues are virtually absent (Litovsky et al. 1997a, b; Rakerd and Hartmann 1994).
Several behavioral studies have suggested that the PE also is experienced by other animals such as cats (Cranford 1982; Cranford and Oberholtzer 1976; Cranford et al. 1971; Populin and Yin 1998; Whitfield et al. 1972), rats (Kelly 1974), and crickets (Wyttenbach and Hoy 1993). Cranford (1982) showed that at delays of 2-10 ms (for which humans experience the PE), cats localize the two-click stimulus at the position of the leading source, making this species a good model for studying neural correlates of the PE. Recent physiological studies in the central nucleus of the inferior colliculus (ICC) have suggested that single neurons exhibit a correlate of the PE. Yin (1994) and Litovsky (1998) reported that, in anesthetized cats and kittens, respectively, most ICC neurons show a suppressed response to the lagging source, with recovery to 50% of the nonsuppressed response varying from ISDs of 1-100 ms. Furthermore, the discharge rate of at least some neurons reflected the perceived location of sound sources when ISDs in the summing localization range were used (Yin 1994). In the ICC and superior olive of the unanesthetized rabbit, Fitzpatrick et al. (1995) recently reported similar suppressive effects, although the recovery times were somewhat shorter than those in the anesthetized cat preparation.
This paper and its companion (Litovsky and Yin 1998) are concerned with single-neuron responses in the ICC of anesthetized cats to stimuli known to evoke the PE. This paper focuses on responses to changes in stimulus conditions that have been used in psychophysical studies and that an organism might encounter in a natural listening environment, such as when source level and duration are varied. In addition, comparisons of neural echo suppression were made for stimuli in the azimuthal and median planes, paralleling recent studies on this topic in psychophysics. In the following paper, we discuss neural responses to stimulus conditions that do not occur in the real world, which may be informative concerning mechanisms of the suppression.
MATERIALS AND METHODS
Surgery
Adult domestic cats (weight 2-4 kg) with clean external ear canals and no sign of middle ear infection were used. Anesthesia was induced either with pentobarbital sodium given intraperitoneally (35 mg/kg) or ketamine given intramuscularly (20 mg/kg). In most cases, atropine methyl nitrate was given subcutaneously (0.1 mg/kg) to reduce mucous secretions. A venous cannula was inserted into the femoral vein to administer additional doses of pentobarbital as needed throughout the experiment. Body temperature was maintained at 37°C, and a tracheal cannula was inserted. The dorsal surface of the IC was exposed with a craniotomy. The overlying cortex was aspirated, and in some animals, the tentorium was removed partially to allow access to the posterior IC.
Extracellular recordings were made from the IC using commercial tungsten microelectrodes (Microprobe) with tip exposures of 8-12 µm. A motor-driven hydraulic microdrive was used to move the electrode remotely from outside the room. Single spikes were discriminated with either a level detector or a peak-detector circuit. During the experiment, physiological criteria were used to locate cells within the ICC (Carney and Yin 1989). At the end of the experiment, the cats were overdosed and the brains immersed in 10% formalin in saline. After fixation the brain stem was removed and cut in the coronal plane with frozen sections at 50-µm thickness and stained with cresyl violet. Confirmation of the penetrations through the ICC was made from anatomic examination of these sections. With the exception of 13 neurons from one cat (94-004) the neurons of which were localized in the external nucleus of the IC, all penetrations were found to be in the ICC. The responses of the cells in 94-004 were in the main indistinguishable from the rest of the ICC cells with regard to responses to the PE stimuli, so they have been included in our analysis.
Free-field experiments
Experiments were conducted in a sound-insulated room (IAC) (2.25 × 2.15 and × m) (Fig. 1A). To reduce acoustic reflections, all surfaces were lined with 4-in reticulated wedged foam (Sonex). The skin overlying the skull was dissected, and a stainless steel rod was secured to the skull on the side opposite to the recording site with screws and dental acrylic. The rod was attached to the animal holder to hold the head for the duration of the experiment. After exposure of the IC, the skin was sutured back so that the ears assumed a natural position while still allowing access to the brain. The electrode manipulator also was attached to the head holder to increase stability during recording. The animal holder was anchored to the floor of the room such that the head was positioned in the center of a circle of 90-cm radius defined by the loudspeaker array. One set of 13 loudspeakers (Realistic 3-in midrange tweeter) was positioned along the horizontal axis in the frontal hemifield at 0° elevation with loudspeakers every 15°. Nine loudspeakers, also separated by 15°, were placed in elevation along the midsagittal axis at 0° azimuth. The seventh speaker in elevation was the same as the middle speaker in azimuth (0°, 0°) (Fig. 1A). Positive angles refer to sounds in the contralateral hemifield or above the head. The loudspeakers were chosen from a large set and carefully matched for frequency response to be within 2 dB by monitoring their outputs to clicks and to tone bursts delivered from 0.1 to 25 kHz in steps of 100 Hz.

FIG. 1. Stimulus configuration in free field (A) and under dichotic conditions (B). A: cat is placed in an anechoic chamber in the center of 2 semicircular arrays of loudspeakers, positioned every 15°: one along the azimuth from +90° (right and contralateral to recording site) to −90° (left), and the other in elevation along the midsagittal plane from +90° (top) to −30° (bottom). Head support system and micromanipulator are not shown in the schematic diagram. B: 2 pairs of clicks are delivered dichotically, separated by an interstimulus delay (ISD); each click pair contains an interaural time delay, e.g., −200 and +200 µs as diagrammed.
The characteristic frequency (CF) of each cell was defined as the frequency with the lowest threshold to tonal stimuli at +45° on the horizontal axis. In the small number of cells that preferred the ipsilateral hemifield, CF was obtained at −45°. Thresholds for clicks and noise also were obtained at +45°. Finally, azimuth and elevation response area curves were obtained by presenting 50 repetitions of either clicks or noise from each loudspeaker at 5-20 dB above threshold. The PE was simulated by presenting two sounds from different loudspeakers with one sound delayed relative to the other. Either clicks or noise stimuli were used, usually with 50 repetitions at an interval of 300 ms.
Dichotic experiments
The animal was placed in a double-walled, sound-insulated chamber (IAC). Both pinnae were dissected away, and the external ear canals were transected to reveal the tympanic membrane. Acoustic stimuli were delivered independently to each ear through hollow ear pieces, which were connected to the earphones (Telex 140) and tightly sealed into the ear canals with Audalin ear impression compound. A small hole was drilled in each bulla, and polyethylene tubes (0.9 mm ID and 30 cm in length) were inserted into the hole to equalize middle ear pressure during the experiment. Acoustic stimuli were generated by a digital stimulus system, which was calibrated from 0.1 to 42 kHz in 50-Hz steps in amplitude and phase for each animal using a 0.5-in Bruel and Kjaer condenser microphone, which was coupled to a probe tube positioned inside the ear canal 1-2 mm away from the tympanic membrane. The CF of each cell was defined as the frequency with the lowest threshold for the contralateral ear. By convention positive ITDs refer to the contralateral ear leading. As in the free-field experiments, thresholds were estimated for clicks and noise, but they were obtained either for the contralateral ear alone or for binaural stimuli with an ITD of 0 if stimulation to the contralateral ear alone was not effective. Stimuli with a PE configuration were simulated by presenting two dichotic pairs of clicks or noise separated by an ISD of several milliseconds; ITDs were imposed separately for each stimulus pair (Fig. 1B). The ISD was defined as the time difference between the onset of the two stimuli delivered to the contralateral ear.
Auditory stimuli
Auditory stimuli were digitally stored in a general waveform buffer for delivery by our digital stimulus system. Click stimuli were usually 100-µs duration (although 20 and 500 µs were sometimes used as well). Noise stimuli were 1-40 ms in duration, depending on the experimental manipulation, with rise-fall times of 0.5 ms. Wideband noise spectra extended from 0.1 to 30 kHz. The levels of noise and click stimuli in dichotic experiments were computed as the effective sound pressure level (SPL); the total energy in the waveform was summed by convolving the spectrum of the signal with the transfer characteristics of the earphone.

FIG. 2. Effect of varying the location of single clicks (A and C) or noise (B and D) along the horizontal axis in free field for 2 representative cells (left: CF = 5.6 kHz; right: CF = 2.6 kHz). Each curve represents the responses at a different level, normalized by the number of stimulus repetitions.
Normalizing the lagging responses in recovery curves
For each neuron, we counted the number of spikes in response to the leading and lagging stimuli at each ISD during discrete time windows, chosen to accommodate the latency and duration of each neuron's response. To measure the degree of suppression as a function of ISD, we plotted recovery curves by computing and normalizing the response so that recovery to 1.0 represented a lagging response that was equal to that obtained in the absence of a leading stimulus. In cases where the leading and lagging stimuli were identical, such as under dichotic conditions with the same ITDs imposed, the lagging responses were divided by the leading responses at each ISD. In cases where the stimuli were different, e.g., all free-field conditions (because the leading and lagging stimuli never occurred at the same location), lagging stimuli were normalized either by the response to the lagging stimulus presented in isolation or at a large enough ISD that there was no apparent effect of the leading stimulus. When the analysis time windows for the leading and lagging response overlapped, such as when ISD was very short (e.g., Fig. 4, A and C), we assumed that there was no variation in the response to the leading stimulus as a function of ISD; the lagging response then was computed by subtracting the mean leading responses at the five longest ISDs from the total number of spikes at the ISDs with overlap.
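The normalization just described can be illustrated with a minimal sketch. It is not the analysis code used in the study; the function names, the list-based data layout, and the assumption that leading counts are ordered by increasing ISD are all hypothetical choices made for illustration.

```python
def normalized_lag_response(lag_counts, reference):
    """Divide lagging spike counts (one per ISD) by a reference count.

    `reference` is either a single number (response to the lagging
    stimulus in isolation, or at a very long ISD) or a list of leading
    responses, one per ISD (identical lead and lag stimuli).
    A value of 1.0 means full recovery.
    """
    if isinstance(reference, (int, float)):
        reference = [reference] * len(lag_counts)
    # Assumption: a zero reference count is reported as NaN rather than raising.
    return [lag / ref if ref > 0 else float("nan")
            for lag, ref in zip(lag_counts, reference)]


def lag_counts_with_overlap(total_counts, lead_counts_by_isd, n_longest=5):
    """Estimate lagging counts when lead and lag windows overlap.

    The mean leading response at the `n_longest` longest ISDs is
    subtracted from the total spike count at each overlapping ISD.
    Assumes `lead_counts_by_isd` is ordered by increasing ISD.
    """
    tail = lead_counts_by_isd[-n_longest:]
    mean_lead = sum(tail) / len(tail)
    return [total - mean_lead for total in total_counts]
```

For example, `normalized_lag_response([10, 20, 40], 40)` expresses three lagging counts as fractions of an isolated-lag count of 40 spikes.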

FIG. 4. Echo suppression for 2 neurons, 1 studied under free-field (left; CF = 2.9 kHz) and 1 under dichotic conditions (right; CF = 0.8 kHz). A: dot rasters showing the responses to the leading and lagging stimuli, with ISDs ranging from 1 (bottom) to 101 (top) ms. Leading and lagging stimuli were placed at locations along the horizontal plane that elicited robust responses by the neuron, +30° and 0°, respectively. C: dot rasters showing the responses studied under dichotic conditions, with ISDs ranging from 1 (bottom) to 51 (top) ms. Interaural time differences (ITDs) for the leading and lagging click pairs were both set to +150 µs. B and D: recovery curves for the dot rasters shown in A and C. Normalized (see METHODS) response to the lag is plotted as a function of ISD. Horizontal lines represent the neurons' half-maximal ISD.
RESULTS
Responses of 178 neurons in the ICC of 17 cats were studied; of these, 66 neurons were studied dichotically and 112 in the free field. In addition, we encountered 54 cells that were not studied with precedence stimuli because they were only driven by pure tones or long-duration (>50 ms) noise but not short-duration (1-10 ms) noise or clicks. On encountering a neuron, it was studied initially with click stimuli and subsequently with short-duration noise; neurons that were unresponsive to clicks were studied with short-duration noise only. Recordings include neurons with CFs ranging from 0.08 to 30 kHz; 64% (114/178) were <3 kHz, 31% (55/178) had CFs between 3 and 30 kHz; and 5% (9/178) had unknown CFs.
Threshold was not estimated using identical stimulus parameters for all neurons because not all neurons responded maximally at the same location. Hence, for neurons that responded maximally to stimuli in the contralateral hemifield, threshold was estimated by obtaining rate-level functions at +45°. For other neurons, we first had to obtain azimuth functions at relatively low levels and subsequently measure rate-level functions at the preferred location.
Sensitivity to stimulus location in free field
We always began the free-field experiments by measuring the cell's sensitivity to clicks and noise varying in azimuth and elevation. Those data will be discussed in detail elsewhere. In short, we found that most neurons were sensitive to locations along the horizontal dimension with maximal responses to stimuli presented in the contralateral hemifield (clicks: 60%, 54/89; noise: 66%, 25/38) and with similar azimuthal or elevational sensitivity using clicks or noise. For each neuron we calculated the preferred horizontal region (PHR), defined as the area within which responses were >75% of the maximum spike rate. For 81% of neurons, the PHR fell within the contralateral hemifield: 5% spanning 90°, 16% spanning 75°, 18% spanning 60°, 31% spanning 30-45°, and 11% spanning <30°. For 13% of neurons, the PHR fell within the ipsilateral hemifield and spanned either 15-30° (8%), 45° (2.5%), or 60° (2.5%). Finally 5% of neurons had their PHR in both hemifields, spanning 75°.
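As a rough illustration of the PHR criterion, the sketch below finds the speaker positions whose response exceeds 75% of the maximum and reports their extent. The function name, the data layout, and the choice to measure the span between the outermost qualifying speakers are assumptions made for this example; the text does not specify how the span was computed.

```python
def preferred_horizontal_region(azimuths_deg, spike_rates, criterion=0.75):
    """Return (min_az, max_az, span_deg) for positions whose response
    exceeds `criterion` times the maximum spike rate, or None if the
    neuron did not respond at any position.

    Assumption: the region is treated as the range between the
    outermost qualifying speakers, even if it is not contiguous.
    """
    peak = max(spike_rates)
    if peak <= 0:
        return None
    inside = [az for az, rate in zip(azimuths_deg, spike_rates)
              if rate > criterion * peak]
    return min(inside), max(inside), max(inside) - min(inside)


# Hypothetical azimuth function sampled at the 13 speakers (every 15 deg).
azimuths = list(range(-90, 91, 15))
rates = [0, 0, 0, 0, 1, 2, 4, 8, 9, 10, 9, 7, 3]
phr = preferred_horizontal_region(azimuths, rates)
```

With these made-up rates the region lies entirely at positive azimuths, i.e., in the contralateral hemifield under the sign convention used here.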
In Fig. 2, we plot the azimuthal response area for two neurons, each studied with both clicks and noise and representing cells with strong or weak azimuthal sensitivity. For the neuron shown in Fig. 2, A and B, both clicks and noise elicited maximal responses when stimuli were presented in the contralateral hemifield, and responses at the lower level were restricted to fewer azimuthal locations. For the neuron shown in Fig. 2, C and D, both clicks and noise produced responses at most azimuthal locations with little effect of level on selectivity to azimuthal locations and an increase in discharge rate at higher levels.
In free field, 38 neurons also were studied with the stimuli in the midsagittal plane at different elevations. Directional selectivity of one neuron is shown in the azimuth and elevation in Fig. 3, A and B, for noise and clicks, respectively. For this neuron, as well as most neurons in the ICC, directional selectivity is weaker in the elevation than along the azimuth.

FIG. 3. Comparison of response areas along azimuth and elevation for the same neuron, using either clicks (top) or noise (bottom). Each function represents the number of spikes per stimulus presentation at every speaker position.
Half-maximal ISD in response to clicks
Results presented in the following section describe responses of ICC neurons to stimuli that are known to evoke the PE. Figure 4 shows the effects of varying the ISD for two different neurons under free-field (Fig. 4, A and B) and dichotic (Fig. 4, C and D) conditions using click stimuli. The positions of the leading and lagging stimuli were chosen because they each elicited strong responses when presented in isolation. When the ISDs were long, both neurons responded vigorously to both the leading and the lagging stimuli. As the ISDs were decreased, the delayed response was diminished, almost disappearing completely by 21 ms in Fig. 4A. In contrast, the lagging response was not completely suppressed even at 1 ms for the neuron in Fig. 4C. These two neurons were chosen to reflect the variability in the strength of echo suppression among IC neurons. Figure 4, B and D, shows recovery functions for the lagging responses of the neurons in Fig. 4, A and C, respectively. For each neuron, we calculated the half-maximal ISD (delay at which recovery function reaches 50% of the unsuppressed response); these values are 33 and 12 ms for Fig. 4, B and D, respectively.
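The half-maximal ISD can be read off a sampled recovery function in several ways. The sketch below uses linear interpolation between the two sampled ISDs that bracket the first upward crossing of 0.5; this interpolation rule, the function name, and the example values are assumptions for illustration and are not stated in the text.

```python
def half_maximal_isd(isds_ms, recovery, criterion=0.5):
    """Estimate the ISD at which a recovery function first reaches
    `criterion` (0.5 = 50% of the unsuppressed response).

    `isds_ms` must be in increasing order and `recovery` holds the
    normalized lagging response at each ISD. Returns None when the
    function never reaches the criterion in the tested range.
    Assumption: linear interpolation between neighboring samples.
    """
    points = list(zip(isds_ms, recovery))
    if points[0][1] >= criterion:
        # Already recovered at the shortest ISD tested.
        return float(points[0][0])
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


# Hypothetical recovery function sampled every 10 ms.
estimate = half_maximal_isd([1, 11, 21, 31], [0.0, 0.25, 0.75, 1.0])
```

A neuron whose lagging response never reaches the criterion within the tested ISDs returns `None`, which a population analysis would need to handle explicitly (for example by assigning it to the longest-ISD bin).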
The variability in suppression demonstrated in Fig. 4 is a general feature of the population of ICC neurons. This can be seen more clearly in Fig. 5, which shows representative samples of neurons studied in free field (A) and dichotically (C), using click stimuli presented 10-15 dB above threshold. In free field, the leading and lagging stimuli were selected for each neuron so that each produced strong responses; thus the two stimuli were sometimes in close proximity. Dichotically, ITD1 and ITD2 were chosen to be equal and at an ITD that produced a maximal response. Although this latter stimulus configuration simulates the improbable situation where the leading and lagging stimuli originate from the same location, it has the advantage of demonstrating the maximal suppression that can be obtained because suppression is greatest when the leading sound is at a location that exerts a strong response when presented in isolation (Litovsky and Yin 1998). Whereas the half-maximal ISDs of some neurons are <5 ms, others are >100 ms. In Fig. 5, right, we compare histogram distributions for half-maximal ISDs of 94 neurons, 49 studied in free field (B) and 45 dichotically (D); the overall means ± SD were 35 ± 29 and 38 ± 36 ms, respectively. For the entire population, 20% (19/94) had half-maximal ISDs <10 ms, which approximates the range for perceptual echo thresholds. Finally, Fig. 6 displays a scatter plot of half-maximal ISDs as a function of CF for the data presented in Fig. 5. There is no apparent relationship between these variables (r = 0.06).

FIG. 5. Examples of recovery functions (left) and population histograms (right) of neural half-maximal ISDs for free-field (top) and dichotic (bottom) conditions. Recovery functions (left) represent the normalized lagging responses as a function of ISD. Although the recovery functions only include 22 sample neurons from each of the free field and dichotic populations, the histograms on the right are based on the entire populations of neurons studied (n = 49 for free field and n = 45 for dichotic). The bins at 100 ms include half-maximal ISDs of ≥100 ms.
Half-maximal ISD in response to noise stimuli
Because psychophysical studies have demonstrated the PE using a variety of different stimuli, including clicks, noise, speech, and music, we measured the neural responses to wideband noise stimuli with different configurations of noise tokens used for the leading and lagging stimuli. For 21 neurons, we compared half-maximal ISDs when lead and lag stimuli were either both identical tokens of noise or identical click stimuli (10- or 100-µs duration). Figure 7, A-C, shows the responses of three neurons to noise and clicks measured dichotically. For 43% of neurons (9/21), half-maximal ISDs were ≥5 ms longer with noise stimuli than clicks (e.g., Fig. 7A); for this subset of neurons, the average of noise half-maximal ISD minus click half-maximal ISDs was 22.5 ms. In 28.5% (6/21) of the neurons, half-maximal ISDs were ≥5 ms longer for clicks than for noise (e.g., Fig. 7B); for this subset, the average of click half-maximal ISD minus noise half-maximal ISDs was 36.8 ms. Finally in 28.5% (6/21) of neurons, the difference in half-maximal ISDs was <5 ms (e.g., Fig. 7C), favoring clicks in four cases and noise in two cases. In Fig. 7D, we show a population correlation plot of half-maximal ISDs using clicks and noise for 21 neurons, indicating that the two measures are positively related (r = 0.505).

FIG. 7. Comparison of recovery curves for clicks and noise. Each panel represents responses from a single cell. A: noise duration is 10 ms, wideband; suppression was stronger with noise than with clicks (CF = 0.8 kHz). B: 2 independent tokens of wideband noise bursts, each 5 ms in duration, 200-Hz bandwidth centered at 23 kHz (CF = 23 kHz); suppression was stronger with clicks. C: noise duration of 5 ms (CF = 1.7 kHz); there was little difference in suppression with noise or clicks. D: population correlation plot of 50% suppression for 21 neurons studied with clicks (x axis) or noise (y axis). Dotted line has a slope of 1.0 for reference.
To determine whether it is necessary to use identical stimuli for the lead and lag (as has been used in all previous figures), uncorrelated noise signals, which we will call noise A and noise B, were used for the lead and lag. The neuron shown in Fig. 8 had response thresholds that were 5 dB higher to noise B than to noise A. We varied the noise signal in the lead and lag to include four possible combinations (lead-lag as A-A, B-A, B-B, or A-B), with all stimuli at 55 dB. As in all other recovery curves, the lagging responses were normalized by the response to the lagging stimulus in isolation, i.e., for conditions A-A and B-A, the lagging responses were normalized by the response to noise A in isolation. Figure 8 shows, first, that suppression also is observed when the leading and lagging noise bursts are uncorrelated; this finding was true for all neurons studied under these conditions (n = 9). Second, for all nine neurons studied with these manipulations, the noise sample with the lower threshold exerted stronger suppression, e.g., for the neuron in Fig. 8, noise A exerted stronger suppression than noise B.

FIG. 8. Recovery functions for 4 permutations of wideband, independent noise bursts, each with a duration of 10 ms. Leading and lagging stimuli were either the same (AA, BB) or different (AB, BA; CF = 0.8 kHz).
Effects of varying overall level
So far, we have only considered stimuli at relatively low levels, even though the PE has been observed psychophysically over a wide range of sound levels. Although we expect leading stimuli of increasing level to generate more suppression, it is not obvious what to expect when the lagging stimulus level is raised concomitantly. Plotted in Fig. 9 are results from four neurons, two studied in the free field with clicks (Fig. 9, A and C), one in free field with noise (Fig. 9B), and one studied dichotically with noise (Fig. 9D). To assess the effect of overall stimulus level, for each neuron we estimated the level threshold change index (lTCI) or the change in half-maximal ISD in milliseconds that occurs for every decibel change in stimulus level. For example, an increase of 5 dB resulting in an increase of 5 ms in half-maximal ISD would correspond to a lTCI of 1; a negative ratio implies that an increase in level results in a decrease in half-maximal ISD. In Fig. 9A, the lTCIs were −9.8 and −1.1 for level changes from 40-45 and 45-50 dB, respectively. In Fig. 9B, the lTCIs were −7 and −3, for level changes from 25-30 and 30-35 dB, respectively. In both cases, increased level resulted in decreased suppression, and the largest change occurred between the two lower levels. A similar result was obtained under dichotic conditions, as seen in Fig. 9D with lTCIs of 0 and −3.4 for level changes from 61-66 dB and 66-71 dB, respectively. The neuron shown in Fig. 9C is an example of a neuron with a positive lTCI (+4) for changes between 40 and 45 dB. Figure 9E plots the half-maximal ISD as a function of overall SPL for a sample of 40 neurons in which we had measured recovery curves at two or more SPLs. The overall trend in the population is a negative slope, indicating negative lTCI values.

FIG. 9. Each panel represents data from 1 neuron. Effects of overall level on click (A and C) and noise (B and D) recovery curves studied in free field (A-C) or dichotically (D). Noise stimuli were as follows: B, 10-ms duration, 200-Hz bandwidth centered at 6 kHz (CF = 5.5 kHz); D, 5-ms duration, wideband. Stimulus levels of both leading and lagging sounds were always equal and covaried between conditions. In each plot, half-maximal ISDs are marked by the ISD at which the functions meet the 0.5 discharge rate line. CFs were 14, 5.5, 3.1, and 6.0 kHz for cells A-D, respectively. E: summary of changes in half-maximal ISDs as a function of overall level for the population (n = 40). Each line represents responses of 1 neuron at 2 levels. F: population histogram showing the number of units at each threshold change index (TCI) value.

FIG. 10. Effect of stimulus duration on noise recovery curves. Examples are shown for neurons tested under free-field (A; CF = 10 kHz) or dichotic (B; CF = 0.7 kHz) conditions. Stimuli were: A, narrowband noise burst with 200-Hz bandwidth centered at 8.5 kHz and durations ranging from 5 to 20 ms; B, wideband noise burst and durations varying from 1 to 10 ms. C: summary plot of changes in half-maximal ISDs as a function of duration for the population of neurons studied. Each line represents responses of one neuron tested at 2 duration values. D: population histogram showing the number of units at each TCI value.
For each neuron we calculated the lTCI for every adjacent pair of levels, and the mean of those values represents the average lTCI. The majority of neurons (29/40; 72.5%) had negative average lTCI values, though many were near 0. Plotted in Fig. 9F is a population histogram of average lTCI; 16/40 (40%) neurons had lTCI less than −1.0, 4/40 (10%) had lTCI greater than 1.0, and 20/40 (50%) had −1 < lTCI < 1. The correlation between average lTCI values and average levels used was low (r = 0.013, not shown), suggesting that the changes in half-maximal ISD with level do not depend on the absolute value of SPL.
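The threshold change index is a simple slope between adjacent conditions, and the following sketch shows one way to compute it and its per-neuron average. The function names are hypothetical, and the half-maximal ISDs in the example are invented values chosen only so that the slopes match the lTCIs quoted for Fig. 9A; they are not measured data.

```python
def threshold_change_indices(values, half_max_isds_ms):
    """TCI for every adjacent pair of conditions.

    `values` are the stimulus parameter settings (dB for level) and
    `half_max_isds_ms` the half-maximal ISD measured at each setting.
    Each TCI is the change in half-maximal ISD (ms) per unit change
    in the parameter; negative values mean suppression shortened.
    """
    pairs = sorted(zip(values, half_max_isds_ms))
    return [(h1 - h0) / (v1 - v0)
            for (v0, h0), (v1, h1) in zip(pairs, pairs[1:])]


def average_tci(values, half_max_isds_ms):
    """Mean of the adjacent-pair TCIs for one neuron."""
    tcis = threshold_change_indices(values, half_max_isds_ms)
    return sum(tcis) / len(tcis)


# Hypothetical half-maximal ISDs at 40, 45, and 50 dB.
levels_db = [40, 45, 50]
isds_ms = [60.0, 11.0, 5.5]
ltcis = threshold_change_indices(levels_db, isds_ms)
mean_ltci = average_tci(levels_db, isds_ms)
```

Averaging adjacent-pair slopes weights each level step equally, regardless of how large the step is. The same ratio applies to any covaried stimulus parameter if its units replace decibels.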
Effects of overall stimulus duration
The effect of stimulus duration was tested using short noise bursts rather than clicks; this allows changes in duration that do not compromise the frequency bandwidth of the stimulus. Experiments were conducted under both dichotic and free-field conditions with the duration of the lead and lag being covaried. Again, we expect greater suppression with longer duration leading stimuli, but the counteracting effect of longer duration lagging stimuli is difficult to predict. Figure 10 shows data from two neurons, tested in free field (Fig. 10A) and dichotically (Fig. 10B). For both neurons, suppression seems to be stronger at longer durations. To quantify this effect, for each neuron we estimated the duration threshold change index (dTCI) or the change in half-maximal ISD in milliseconds that occurs for every millisecond change in stimulus duration. The effect of duration varies with individual neurons. For example, in Fig. 10A, half-maximal ISDs increased by 13.2 ms for a duration increase from 5 to 10 ms (dTCI of 2.64). In contrast, for the neuron in Fig. 10B, there was no effect of duration between 5 and 10 ms and a dTCI of 1.6 between 1 and 5 ms. In Fig. 10C, we plot neural half-maximal ISDs as a function of stimulus duration for 18 neurons. For each neuron we calculated the dTCI for every adjacent pair of durations as described in the previous section for level. The mean of those values represents the average dTCI. Plotted in Fig. 10D is a population histogram of dTCI values; the majority of neurons (12/16; 75%) had positive average dTCI values (≥1), a small number (3/16; 19%) had negative average dTCI values (less than −1), and one neuron (6%) had −1 < dTCI < 1. There was no correlation between the average dTCI values and the average durations that were used to obtain each neuron's dTCI (r = 0.131, not shown).

FIG. 11. Physiological evidence for precedence in elevation for 1 neuron in free field. A: azimuth and elevation rate functions for noise bursts (200-Hz bandwidth centered near the neuron's CF at 2.4 kHz, 5-ms duration) show a preference for the contralateral field and little sensitivity in elevation. B: recovery functions for the responses in C and D; CF = 2.5 kHz. C and D: dot rasters showing the responses of the same neuron to the same noise bursts for ISDs ranging from 1 (bottom) to 71 (top) ms along the azimuthal (left) and elevational (right) planes. In both cases, the lagging stimulus was at location 0°, which is common to both axes. Leading stimuli were placed at +90° on both the azimuth and elevation, which elicited similar discharge rates for single noise bursts (A).
Echo suppression in elevation
In the experiments described thus far, PE stimuli were placed in free field along the horizontal plane where stimuli contain natural interaural differences in time, level, and spectrum or were presented dichotically with varying ITDs. In the following experiment, we tested whether interaural differences are necessary for the PE to be observed. To explore this question in the free field, we used the array of loudspeakers at varying elevations in the midsagittal plane at 0° azimuth (Fig. 1A). To compare the amount of echo suppression observed in the azimuth and elevation, the delayed sound was placed at (0°,0°), a position common to the two planes. To minimize any differential effect due to the strength of the response to the leading stimulus, we chose the leading stimuli such that they elicited similar discharge rates to stimuli presented in isolation. For example, for the neuron in Fig. 11A, responses to stimuli at +90° azimuth and +90° elevation had similar discharge rates, thus we compared the effect of a leading stimulus at these two positions on a lagging stimulus at (0°, 0°). The ISDs were varied from 1 to 71 ms. In both cases, the half-maximal ISD was ~10 ms as shown by the recovery curves in Fig. 11B, and the overall shapes of the recovery curves for the leading stimulus in the azimuth or elevation were very similar.
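One way the half-maximal ISD might be read off a recovery curve is by linear interpolation between the tested ISDs. The sketch below assumes that the lagging response has already been normalized to the response to the same stimulus presented alone, so that 0.5 marks half recovery. The function name and the sample values are hypothetical and are not the data of Fig. 11.

```python
def half_maximal_isd(isds_ms, normalized_lag_response, criterion=0.5):
    """Return the first ISD (ms) at which the recovery curve reaches `criterion`.

    Linear interpolation is used between adjacent tested ISDs. Returns
    None when the curve never reaches the criterion in the tested range.
    """
    if len(isds_ms) != len(normalized_lag_response) or not isds_ms:
        raise ValueError("need matched, non-empty ISD and response lists")
    points = sorted(zip(isds_ms, normalized_lag_response))
    # Already recovered at the shortest tested ISD.
    if points[0][1] >= criterion:
        return points[0][0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


# Hypothetical recovery curve: strong suppression at short ISDs,
# full recovery by 20 ms.
example_isd = half_maximal_isd([1, 5, 10, 20], [0.0, 0.2, 0.6, 1.0])
```

Interpolation is only one possible convention; fitting a smooth function to the recovery curve would be an equally reasonable choice.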
Two important points can be made from Fig. 11. First, echo suppression was evident in both azimuth and elevation. Second, the amount of echo suppression was almost identical on the two axes. In fact, the recovery curves and half-maximal ISDs for the lagging responses were very similar in azimuth and elevation (Fig. 11B). This similarity in recovery curves was also evident across the population as shown in Fig. 12, which shows azimuthal (Fig. 12A) and elevational (Fig. 12B) recovery curves for a sample of 11 cells. Recovery curves for the same cell are shown in the two panels by matching symbols. The recovery curves in Fig. 12 show the same large variability in half-maximal ISDs as in our larger population (Fig. 5), ranging from a few milliseconds to 80 ms. Furthermore, by comparing recovery curves with the same symbol, it is evident that cells with long half-maximal ISDs for azimuthal stimuli also have long half-maximal ISDs for elevational stimuli. Finally, a direct comparison of the half-maximal ISDs in the azimuth and elevation for the entire population of 38 neurons that were studied on both axes shows a high correlation of 0.87 (Fig. 12C). When the leading and lagging stimuli were both presented from locations in the elevation that elicited a robust response, echo suppression was seen in 95% (36/38) of neurons. In the remaining 5% (2/38) of cells, echo suppression in elevation was not seen or was too weak to be measured, and the phenomenon was also absent on the horizontal axis. Thus it appears that interaural differences are not necessary for echo suppression in ICC cells, and the amount of suppression observed is very similar in azimuth and elevation, provided the leading stimuli elicit similar responses.
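The azimuth versus elevation comparison in Fig. 12C amounts to a correlation across neurons. The following is a minimal sketch, assuming a Pearson coefficient computed on paired half-maximal ISDs (the text does not state which coefficient was used). The function name and the sample values are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


# Hypothetical half-maximal ISDs (ms) for five neurons on each axis.
azimuth_isds = [4.0, 10.0, 22.0, 35.0, 80.0]
elevation_isds = [6.0, 9.0, 25.0, 30.0, 72.0]
r_example = pearson_r(azimuth_isds, elevation_isds)
```

Points that fall along the unity-slope reference line of Fig. 12C would give a coefficient near 1.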

FIG. 12.
Recovery functions along the azimuthal (horizontal) and elevational (median) planes in free field. Each plot shows samples of 10 neurons with the lagging stimulus always positioned at 0°; for each neuron, the leading stimulus was placed at positions along the 2 planes that elicited similar discharge rates. Responses of each neuron are plotted using the same symbols for the azimuth (top; A) and elevation (bottom; B). C: correlation plot of half-maximal ISDs along the azimuth and elevation in free field. Each point represents 1 neuron's responses along the 2 planes. Line is drawn with a slope of 1.0 for reference (reprinted with permission from Litovsky et al. 1997 ).
Do ICC cells show the buildup phenomenon?
Recent psychophysical experiments have suggested that echo suppression is a dynamic process that accumulates or "builds up" with ongoing stimulation (Clifton and Freyman 1989, 1997; Freyman et al. 1991; Yang and Grantham 1997). When human listeners hear a continuous train of PE stimuli that are near echo threshold, both sounds are heard at their respective locations at the onset of the train, but as the train continues, the delayed sound fades away and only the leading sound is heard. Freyman et al. (1991) discovered that the number of clicks in the train is one of the most important elements of the "buildup" effect. We asked whether IC cells also show this effect by examining each cell's responses at an ISD that was near the neuron's half-maximal ISD. In Fig. 13, left, we plotted latency dot rasters from two sample neurons, both at the ISDs that were nearest those neurons' half-maximal ISDs, 20 ms in Fig. 13A and 40 ms in Fig. 13B. In both cases, the neurons responded to the leading stimulus on every one of the 50 trials and only responded to the delayed sound on about half of the trials. Had buildup been present in the responses of these neurons, we would have expected to see a response to the delayed sound during the first few trials with no response during the later trials. However, the suppression appears to be distributed randomly throughout the 50 trials, with no distinct suppression during the latter portion of the trial series.

FIG. 13.
No evidence for "buildup" in the IC. A: dot rasters showing 1 neuron's responses to 50 repetitions, commencing at the bottom and culminating at the top, with a constant ISD of 20 ms (CF = 3 kHz). B: different neuron presented with 50 repetitions of the same stimulus at a constant ISD of 40 ms (CF = 2.8 kHz). C: summary population data. Data for each neuron were divided into 3 separate clusters by trial number. Top: each neuron's responses were divided into the 1st 5 trials (1-5) and the next 45 trials (6-50). Shown in that panel are the proportion of trials on which neurons fired (i.e., a spike occurred during each cluster, divided by the number of trials in the cluster). Middle: clusters compared for trials 1-10 and 11-50. Bottom: comparison of trials 1-25 and 26-50.
We conducted a statistical analysis of this phenomenon for 47 neurons. Each neuron had been stimulated at positions (or ITDs) that produced strong suppression with 50 repetitions at each delay. The probability of discharge per stimulus during the first N trials was compared with the probability during the next 50 − N trials for three cases (N = 5, 10, and 15). The proportions are plotted in Fig. 13, right. Dependent t-tests revealed no significant difference in proportions for any of the comparisons (P > 0.05).
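The early versus late comparison can be sketched in Python as follows. The example assumes that each neuron's lagging response is coded as one boolean per trial in presentation order, and it computes only the paired (dependent) t statistic and its degrees of freedom. The function names and the sample data are hypothetical; this is not the authors' analysis code.

```python
from math import sqrt
from statistics import mean, stdev


def split_proportions(lag_fired, n_first):
    """Proportion of trials with a lagging response, early vs. late.

    lag_fired: one boolean per trial, in presentation order.
    n_first:   number of trials assigned to the early cluster (N).
    """
    early, late = lag_fired[:n_first], lag_fired[n_first:]
    if not early or not late:
        raise ValueError("both clusters must contain at least one trial")
    return sum(early) / len(early), sum(late) / len(late)


def paired_t(early_props, late_props):
    """Paired t statistic and degrees of freedom across neurons."""
    if len(early_props) != len(late_props) or len(early_props) < 2:
        raise ValueError("need at least two paired neurons")
    diffs = [e - l for e, l in zip(early_props, late_props)]
    spread = stdev(diffs)
    if spread == 0:
        raise ValueError("t is undefined when all differences are equal")
    return mean(diffs) / (spread / sqrt(len(diffs))), len(diffs) - 1


# Hypothetical neuron that responds to the lag on every other trial.
trials = [True, False] * 25
early_p, late_p = split_proportions(trials, 5)

# Hypothetical early and late proportions for four neurons.
t_stat, dof = paired_t([0.6, 0.4, 0.7, 0.5], [0.5, 0.5, 0.5, 0.5])
```

The resulting t statistic would then be compared with the critical value for the given degrees of freedom; a buildup of suppression would appear as consistently higher early than late proportions.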
DISCUSSION
These data provide physiological evidence that most cells in the ICC of anesthetized cats exhibit a correlate of the PE in the form of echo suppression. We used stimuli that generally correspond to sounds in a natural acoustic environment and that perceptually result in the PE, although for some of our manipulations the correlate psychophysical data are sparse. In the ensuing discussion, we will concentrate on relating our results to known physiological and psychophysical findings with most of the emphasis placed on functional implications.
Echo suppression
The general finding that most cells in the ICC show a suppressed response to the lagging stimulus is consistent with previous reports using stimuli known to evoke the PE (Fitzpatrick et al. 1995; Keller and Takahashi 1996; Yin 1994). Previous studies in the ICC also have described a suppressive effect after an excitatory event on the time course of tens of milliseconds (Carney and Yin 1989; Nelson and Erulkar 1963). Most of our results also show that many perceptual phenomena related to the PE find correlates in single-neuron ICC responses. Nonetheless, we must be cautious in directly relating these physiological findings to perceptual echo thresholds in humans. In our sample of 178 neurons, there was significant variability in half-maximal ISDs, ranging from 1.5 to >100 ms for clicks, which corresponds to previous results using PE stimuli in the ICC of anesthetized cats (Yin 1994). However, psychophysically, the PE is thought to be strongest for clicks at delays that are <10 ms (Blauert 1983; Freyman et al. 1991; Wallach et al. 1949; Zurek 1987). Our results are consistent with the evidence for other stimulus dimensions that perceptual thresholds are generated by those neurons with the lowest thresholds rather than by mean behavior of the population (Elliot et al. 1960; Liberman 1978; Yin 1994). Examining neurons that were studied with both lead and lag at locations that elicited strong excitation, we found 20% (19/94; see Fig. 5) had half-maximal ISDs that fell within the general range of the PE for clicks, i.e., <10 ms.
A recent study by Fitzpatrick et al. (1995) found that the median delay at which 50% recovery from echo suppression occurred in the brain stem of the awake rabbit was ~6 ms, although the overall population responses did extend to 69 ms. The significant differences between the mean of their population and that of ours may be due to species differences or to the effects of anesthesia. Kuwada et al. (1989) have shown through recordings of responses to non-PE stimuli from ICC neurons with and without barbiturate anesthesia that anesthesia results in increased inhibition. In addition, barbiturate anesthesia is known to potentiate γ-aminobutyric acid-mediated inhibition (Barker and Ransome 1978; Richter and Holtman 1982), which might result in stronger suppression of the lagging response. To better understand this effect, one ideally should study the PE with and without anesthetic with an approach similar to that of Kuwada et al. (1989).
Yin (1994) has suggested that ICC cells also are involved in the PE and summing localization because the discharge of many cells is correlated with the perceived location of the sound source if we assume that cats experience the PE and summing localization in a manner similar to that of human subjects. Recent behavioral data suggest that cats do experience summing localization similarly to humans (Populin and Yin 1998). Cats trained to look at sound sources made eye movements that were consonant with the perceived locations reported by human subjects listening to the same stimuli: the perceived locations were between the two speakers from which single clicks with ISDs were emitted, biased systematically toward the leading speaker, near the midline when the clicks were delivered simultaneously, and fully lateralized for ISDs ≥300 µs (as compared with 800-1,000 µs in humans) (Blauert 1983), which is consistent with the smaller head size of the cat.
Covariation in the lead and lag
In a natural listening environment, many features of sounds undergo frequent fluctuations, depending on the spectrum, distance between the source and listener, and physical characteristics of the environment. These fluctuations will tend to covary in the original source and its reflections. We studied neuronal responses for two of the most obvious conditions that might change in this manner: stimulus level and duration. Whereas most neurons showed stronger suppression for lower stimulus levels (when the stimulus produced fewer neural responses), the opposite effect was observed for stimulus duration, where longer durations (when the excitatory effect of the stimulus is increased) resulted in stronger suppression. Similar results have been reported in human psychophysics for both level (Shinn-Cunningham et al. 1993) and duration (Damaske 1971).
As mentioned earlier, the effect of increased suppression with lower stimulus levels is not one that necessarily would be expected, either psychophysically or physiologically, because both the leading and lagging stimuli are covaried. Our data suggest that for the majority of neurons, these changes reflect the increased suppressibility of the lag at lower levels rather than an increased effectiveness of the lead in producing suppression. That the physiology is consistent with psychophysical results is encouraging regarding the potential connection between the two.
Physiological correlates of the PE in elevation
Traditionally, both behavioral and physiological studies of the precedence effect have been conducted with stimuli on the horizontal axis (Blauert 1983). Several studies have shown that the PE also occurs in the median sagittal plane. Blauert (1971) originally showed that summing localization occurs in elevation and hence was not necessarily dependent on binaural information. More recent evidence suggests that echo suppression and localization dominance occur almost equally in azimuth and elevation (Litovsky et al. 1997a, b; Rakerd and Hartmann 1994).
In this study, we have compared the echo suppression on the horizontal axis with that obtained on the elevational axis by examining the suppression of the speaker at (0°,0°), which is at the intersection of the two planes, by leading sounds along each axis. In our population, any neuron that showed suppression did so in both planes with a high correlation in the half-maximal ISDs for the two (r = 0.8). In the companion paper (Litovsky and Yin 1998), we show that most phenomena related to echo suppression that are found on the horizontal plane also can be demonstrated in elevation. In a recent collaborative effort, we showed that the human behavioral data and our physiological data in cats are highly correlated and likely to reflect processing by the same mechanisms (Litovsky et al. 1997b).
The nature of such mechanisms remains to be explored further, although it is clear that models of the PE need not require binaural disparities. In elevation, binaural disparities are virtually absent, and listeners must base any decisions concerning the locations of sounds on spectral information that arises from directional properties of the head, pinna, and body. Each location in space is associated with characteristic peaks and valleys in the spectrum of sounds, and listeners have been shown to use this information effectively in sound localization (Hebrank and Wright 1974; Searle et al. 1976). Many models of the PE primarily take into account interaural difference information (Colburn and Ibrahim 1993; Lindemann 1986; Shinn-Cunningham et al. 1993), and because the PE appears to occur in elevation, where interaural cues are minimal, these models should be reevaluated to include cases in which interaural cues are absent.
PE with uncorrelated stimuli
An important question is the degree to which a lagging source can be different from a leading source but still be suppressed. In most natural environments, the echo will be a filtered version of the leading source; the degree of correlation of the lead and lag will depend on the acoustic properties of the reflecting surfaces, distance to the subject, etc. We began our physiological examination of this question by using uncorrelated noises for the lead and lag. Our results show that neural suppression for uncorrelated lead-lag noise samples is similar to that for correlated samples (Fig. 8), which is in accord with psychophysical data. Several studies (Blauert and Divenyi 1988; Zurek 1980) have shown that a broadband noise suppresses a lagging source that is different or uncorrelated. Furthermore, the PE still operates for narrowband sound with different center frequencies (Blauert and Divenyi 1988; Divenyi 1992; Divenyi and Blauert 1987; Shinn-Cunningham et al. 1995), although the magnitude of suppression is related inversely to the amount of overlap in the power spectrum of the leading and lagging sources (Blauert and Divenyi 1988; Divenyi and Blauert 1987). Together with the psychophysical data, these findings suggest that the PE is a generalized mechanism that serves to suppress information from delayed signals, which are not necessarily exact replicas of the original sound source (Zurek 1987).
Further considerations
We did not find physiological evidence for the psychophysical effect known as buildup, which refers to the finding that echo suppression depends on, and changes dynamically during, ongoing auditory stimulation (e.g., Clifton and Freyman 1989; Freyman et al. 1991). It has been suggested that listeners "size up" the acoustic properties of a room during the course of several stimulus repetitions and adjust their suppression to the appropriate range of ISDs for that room (Clifton and Freyman 1997; Clifton et al. 1994). To the extent that the buildup effect reflects an active cognitive process related to the PE, it might be expected to be cortical in origin and therefore not manifested in the responses in the IC. Our results provide evidence compatible with the view that buildup is a "higher-level" process, distinct from other phenomena that were observed and discussed in detail in preceding sections.
It should be noted that our data are insufficient for demonstrating that ICC neurons form the major physiological substrate for the PE. However, to the extent that the neural responses we present here mirror known perceptual phenomena related to the PE, we can suggest that the ICC may be one of the sites involved in echo suppression. Yin (1994) has shown that single-neuron responses in the ICC may reflect the perceived location of paired clicks in the "summing localization" range (ISDs of ±1 ms). To establish a more direct link between the physiology and psychophysics, one would need to record from neurons in animals that were simultaneously perceiving the effect. We expect, however, that an organism's ability to initiate a localization behavior in a PE task may reflect processing at levels beyond the IC, such as the cortex. Several lines of evidence exist to support this idea. First, unilateral ablation of the auditory cortex severely disrupts the PE while leaving intact the ability to localize single-source sounds. That is, before being operated on, these animals localized a click pair to the leading source, but postoperatively this ability was impaired when the leading source was contralateral to the ablation site (Cranford and Oberholtzer 1976; Whitfield et al. 1972). A second source of evidence comes from studies with human infants. Newborn infants, in whom cortical development lags behind that of more peripheral auditory structures, exhibit the same behavior as that of operated cats (Clifton 1985; Clifton et al. 1981; Litovsky and Ashmead 1997). The PE, however, does appear during the first year of life when the cortex is undergoing rapid maturation (Morrongiello et al. 1984; Muir et al. 1989). Interestingly, recordings made in the IC of young kittens suggest that a physiological correlate of the PE is present as early as 8 days postnatal and occurs at similar ISDs to those recorded in adult cats (Litovsky 1998). Suppression in kitten neurons varies with stimulus level, duration, and azimuthal position in a similar manner to that seen in the present paper in adult neurons. Because the age at which correlates of the PE can be found in the kitten precedes the age at which kittens can localize sound sources effectively, Litovsky (1998) suggested that neural mechanisms that might be involved in the first stages of processing PE stimuli may be in place well before the behavioral correlate develops. Finally, human patients with neuropathologies including temporal lobe epilepsy and multiple sclerosis show deficits in performing on tasks involving PE sounds but not single-source stimuli (Cranford et al. 1990; Hochster and Kelly 1981).
Free-field location selectivity
The azimuthal and elevational response areas, particularly with clicks, also deserve some attention. In general, ICC neurons are directionally sensitive to click stimuli with preference for the contralateral hemifield. These data are consistent with previous reports in the ICC using noise and tones, which suggest that a significant proportion of neurons respond maximally in the contralateral hemifield, with increase in the response area curves at higher levels (Aitkin and Martin 1987; Aitkin et al. 1984, 1985). Although the exact proportion of highly directional neurons is difficult to compare across studies, our findings in the ICC are consistent with those reported in area AI of the cortex by Imig et al. (1990), whose criteria we used to define location sensitivity: the proportion of highly directional neurons was 69% in the ICC and 66% in the cortex.
A novel aspect of the present results comes from the click data, which to our knowledge have not previously been described in the ICC in free field. Our results suggest that ICC neurons respond in free field in a similar fashion to low-level clicks and noise along the azimuth and elevation. This is in accordance with the observations made under dichotic conditions that the ITD functions obtained in the ICC for clicks and noise show similar sensitivity to ITDs (Carney and Yin 1989), though the spatial sensitivity of ICC neurons is governed by cues other than ITDs as well (Delgutte et al. 1995). One similarity between our free-field results and those obtained under dichotic conditions by Carney and Yin (1989) is the tendency for ICC neurons to saturate with increasing level, resulting in a small dynamic range. Hence, at relatively high levels when the neurons' responses have saturated, there is neither ITD sensitivity dichotically nor spatial sensitivity in free field. In contrast, noise responses are less liable to saturate and thus both ITD and azimuth sensitivity are retained for relatively large changes in noise level.
The response areas obtained in elevation with clicks are also the first measurements of click responses in the ICC along that dimension. In general, when click stimuli were used, ICC neurons were less sensitive to direction in elevation than in azimuth. In elevation, spectral cues due to transformation of the sound by the head and ears provide the major localization cues (Hebrank and Wright 1974, 1976; Searle et al. 1975). Results using virtual space stimuli to simulate sounds in free field and to manipulate the localization cues independently (Delgutte et al. 1995) suggest that spectral cues are also less potent than ITDs and ILDs (which provide for sensitivity along the azimuthal plane) in determining directional sensitivity in the ICC. In addition, psychophysical data in humans (Middlebrooks and Green 1991; Wightman and Kistler 1989) and in cats (Populin and Yin 1998) have suggested that sound localization precision in elevation is not as acute as that in azimuth. If ICC neurons are involved in sound localization, as has been suggested (Yin 1994), then their decreased sensitivity in elevation may mirror a similar perceptual effect.