Desynchronizing Responses to Correlated Noise: A Mechanism for Binaural Masking Level Differences at the Inferior Colliculus

Alan R. Palmer, Dan Jiang, and David McAlpine

Medical Research Council, Institute of Hearing Research, University of Nottingham, Nottingham NG7 2RD, United Kingdom


    ABSTRACT

Palmer, Alan R., Dan Jiang, and David McAlpine. Desynchronizing responses to correlated noise: a mechanism for binaural masking level differences at the inferior colliculus. We examined the adequacy of decorrelation of the responses to dichotic noise as an explanation for the binaural masking level difference (BMLD). The responses of 48 low-frequency neurons in the inferior colliculus of anesthetized guinea pigs were recorded to binaurally presented noise with various degrees of interaural correlation and to interaurally correlated noise in the presence of 500-Hz tones in either zero or pi interaural phase. In response to fully correlated noise, neurons' responses were modulated with interaural delay, showing quasiperiodic noise delay functions (NDFs) with a central peak and side peaks, separated by intervals roughly equivalent to the period of the neuron's best frequency. For noise with zero interaural correlation (independent noises presented to each ear), neurons were insensitive to the interaural delay. Their NDFs were unmodulated, with the majority showing a level of activity approximately equal to the mean of the peaks and troughs of the NDF obtained with fully correlated noise. Partial decorrelation of the noise resulted in NDFs that were, in general, intermediate between the fully correlated and fully decorrelated noise. Presenting 500-Hz tones simultaneously with fully correlated noise also had the effect of demodulating the NDFs. In the case of tones with zero interaural phase, this demodulation appeared to be a saturation process, raising the discharge at all noise delays to that at the largest peak in the NDF. In the majority of neurons, presenting the tones in pi phase had a similar effect on the NDFs to decorrelating the noise; the response was demodulated toward the mean of the peaks and troughs of the NDF.
Thus the effect of added tones on the responses of delay-sensitive inferior colliculus neurons to noise could be accounted for by a desynchronizing effect. This result is entirely consistent with cross-correlation models of the BMLD. However, in some neurons, the effects of an added tone on the NDF appeared more extreme than the effect of decorrelating the noise, suggesting the possibility of additional inhibitory influences.


    INTRODUCTION

The binaural masking level difference (BMLD) is a much-investigated psychophysical phenomenon in which signals presented to both ears, masked by a noise to both ears, are made more audible by changes in the interaural phase of the masker or signal (Hirsh 1948; Licklider 1948; see Colburn 1996 and Durlach and Colburn 1978 for comprehensive reviews). A specific, well-documented case is the masking of a 500-Hz tone that is identical at the two ears (So) by a noise that is identical at the two ears (No). The 500-Hz tone can be made 12-15 dB more audible by inverting either the tone (Spi) or the noise (Npi) at one ear but not both.

An account of the mechanisms of the BMLD can be given in terms of a network of neurons that are sensitive to the interaural delay of the signal (Colburn 1977). Such a network, consisting of coincidence detectors fed by a series of delay lines, was first proposed by Jeffress (1948) as an explanation of low-frequency binaural hearing. This model can also be applied specifically to account for binaural unmasking. In the case of the 500-Hz BMLD task described previously, the model proposes that neurons perform a cross-correlation of the activity in a single-frequency channel. Within that channel, centered at the tone frequency, filtered noise components are identical. Adding identical tones to the noise at the two ears produces the same within-channel phase shifts, leaving the responses to the noises still synchronized with each other. At most, the extra energy within the channel results in a slightly higher discharge rate at the coincidence detector output. However, when pi-phase tones are added, the within-channel noise components from each ear are subject to unequal phase shifts, and the result is a desynchronization of the inputs to the coincidence detector and hence a lower output. The asymmetry between the extra coincident spikes caused by addition of identical tones and the reduction in coincident spikes caused by pi-phase tones underlies their improved detectability and hence the BMLD.
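The decorrelating effect of an out-of-phase tone can be illustrated numerically. The following Python sketch is not the model of Colburn (1977) and is not the analysis used in this study; it only computes the zero-lag interaural correlation of identical Gaussian noise (No) plus a 500-Hz tone that is either identical (So) or inverted (Spi) at one ear. The function names, the noise and tone levels, the sampling parameters, and the omission of any peripheral filtering are assumptions made for illustration.

```python
import math
import random


def interaural_correlation(left, right):
    """Normalized zero-lag correlation between two equal-length waveforms."""
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    cov = sum((a - mean_l) * (b - mean_r) for a, b in zip(left, right))
    var_l = sum((a - mean_l) ** 2 for a in left)
    var_r = sum((b - mean_r) ** 2 for b in right)
    return cov / math.sqrt(var_l * var_r)


def noise_plus_tone(tone_rms, phase_right, fs=50_000, dur=0.2, freq=500.0, seed=1):
    """Return (left, right) waveforms: identical unit-variance noise plus a tone.

    phase_right = 0 gives NoSo; phase_right = pi gives NoSpi.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = int(fs * dur)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]  # same noise at both ears
    amp = tone_rms * math.sqrt(2.0)
    left = [noise[i] + amp * math.sin(2 * math.pi * freq * i / fs)
            for i in range(n)]
    right = [noise[i] + amp * math.sin(2 * math.pi * freq * i / fs + phase_right)
             for i in range(n)]
    return left, right


if __name__ == "__main__":
    for label, phase in (("NoSo", 0.0), ("NoSpi", math.pi)):
        l, r = noise_plus_tone(tone_rms=1.0, phase_right=phase)
        print(label, round(interaural_correlation(l, r), 3))
```

For independent noise and tone with powers Pn and Ps, the expected NoSpi correlation is (Pn - Ps)/(Pn + Ps), whereas NoSo leaves the correlation at 1. In a narrow frequency channel the noise power is smaller than in this broadband sketch, so a given tone level would decorrelate the inputs more strongly.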

In support of the general cross-correlation model, there is good physiological evidence to indicate that medial superior olivary (MSO) neurons act as interaural cross-correlators (Spitzer and Semple 1995; Yin and Chan 1990), characteristics also evident in the projection target of the MSO, the inferior colliculus (IC) (Kuwada and Yin 1983; Yin and Kuwada 1983a,b). Responses of many low-frequency IC neurons are modulated with the interaural delay of binaural signals, showing maximum discharge rates (peaks) at certain delays and minimum discharge rates (troughs) at others. For tonal stimuli, the effect is to produce a periodic delay function, the period being equal to that of the stimulus waveform. For noise stimuli, the effect is to produce a damped oscillatory (or quasiperiodic) noise delay function (NDF), which commonly contains a central peak and attenuated side peaks separated from the central peak by the period of the neuronal best frequency (BF) (Yin et al. 1986, 1987). Reducing the interaural correlation of dichotically presented noises demodulates the NDF (Yin et al. 1987); irrespective of the interaural delay, neurons show a level of activity approximately equal to the mean of the peaks and troughs of the NDF obtained with fully correlated noise. This provides additional support for the cross-correlation model of low-frequency binaural hearing.

In investigations into the neural basis of the BMLD, we found that inverting the phase of a masked 500-Hz tone at one ear made it detectable at a lower sound level in a majority of IC neurons (Jiang et al. 1997a,b). Inversion of the noise in one ear made So tones more detectable (unpublished data). Further, we found that the most sensitive indicator of the presence of the Spi tone in No noise was a decrease in the level of activity caused by the masking noise. For any specific IC neuron, the response to Spi tones was predictable from its sensitivity to the interaural delay of tones and noises and could consist of either an increase or a decrease in the discharge rate. These empiric physiological data are in good agreement with cross-correlation models of the BMLD.

Here we investigate the adequacy of the desynchronization model to account for the BMLD at the neuronal level. Specifically, we examine whether desynchronization provides a sufficient explanation for the reduced levels of activity that we found in neurons of the IC when a 500-Hz tone signal was added to masking noise. We compared the activity of neurons to noises with different interaural correlation (producing desynchronization of the noise) to that when an identical binaural noise is presented simultaneously with a tonal signal. In the majority of instances, it appears that the reduced activity caused by addition of the 500-Hz signal could be accounted for by desynchronization of the activity due to the No noise. In some instances, however, the reduction in activity caused by the 500-Hz signal exceeded that predicted simply by desynchronization and could be a result of inhibition.


    METHODS

Anesthesia and surgical preparation

We recorded from 48 neurons in the ICs of 23 pigmented guinea pigs weighing between 300 and 430 g, all of which were used in experiments examining other aspects of low-frequency binaural processing. The small numbers of neurons analyzed in any one animal were not different qualitatively or quantitatively from the larger sample analyzed for other purposes. The animals were premedicated with atropine sulfate (0.06 mg sc) and anesthetized with urethan (1.3 g/kg in 20% solution ip). Further analgesia was obtained with phenoperidine (1 mg/kg im). Supplementary doses of phenoperidine (0.5-1 mg/kg im) were given on indication provided by the pedal withdrawal reflex. All animals were tracheotomized, and core temperature was maintained at 37°C with a heating blanket. In some cases, the animal was artificially ventilated with 95% oxygen-5% CO2, and end-tidal CO2 was monitored. The animal was placed inside a sound-attenuating room in a stereotaxic frame with hollow plastic specula replacing the ear bars. Pressure equalization within the middle ear was achieved by a narrow polythene tube (0.5-mm external diam) sealed into a small hole in the bulla on each side. The cochlear condition was assessed by monitoring the cochlear action potential (CAP) in the left ear at intervals throughout the experiment by using a silver wire electrode on the round window. The threshold of the filtered and amplified CAP to a series of short-tone bursts was measured automatically (Palmer et al. 1986) at selected frequencies (0.5, 1, 2, 4, 5, 7, 10, 15, 20, and 30 kHz). The acoustic cross-talk between the two ears of our closed-field acoustic system was previously reported; at frequencies from 0.5 to 10 kHz the cross-talk was >50 dB down and was >45 dB at all frequencies (Palmer et al. 1990). These values are similar to those reported in other studies in the guinea pig (Popelar et al. 1988; Teas and Nielsen 1975).

A craniotomy was performed on the right side, extending 2-3 mm rostral and caudal of the interaural axis and 3-4 mm lateral from midline. After removal of the dura, the exposed brain was covered with 1.5% agar. Recordings were made with stereotaxically placed tungsten-in-glass microelectrodes (Bullock et al. 1988) advanced by a piezoelectric motor (Burleigh Inchworm, IW-711-00) into the IC through the intact cortex.

Stimulus presentation

The stimuli were delivered through sealed acoustic systems that consisted of a 12.7-mm condenser earphone (Brüel and Kjaer 4134), coupled to a damped 4-mm diam probe tube that fitted into the speculum. The outputs were calibrated a few millimeters from the tympanic membrane with a Brüel and Kjaer 4134 microphone fitted with a calibrated 1-mm probe tube. The sound system response on each side was flat to within ±5 dB from 0.1 to 10 kHz, and the left and right systems were matched to within ±2 dB over this range.

Stimuli

The stimuli used in this study were tones and noises presented to the two ears. The noises used were digitally synthesized "frozen" noise with a bandwidth of 0.05-5 kHz and output at a sampling rate of 50 kHz via a digital-analog converter (TDT QDA2) and a waveform reconstruction filter (Kemo VBF33, cutoff 5 kHz, slope 135 dB/octave). The noises were gated on and off with a 5-ms ramp, and interaural delays were introduced during synthesis. Two independent noise samples were synthesized with the method of Klatt (1980) in which each sample is the sum of 16 values from a random number generator. The noises were digitally filtered with a pass-band of 0.05-5 kHz. The noise to the left ear was always the same frozen noise sample, whereas that to the right was the sum of the two independently synthesized noises. The proportions of the independent noises added together to produce the right ear noise were selected to give specific values of interaural correlation as described by Yin et al. (1986). The values required for the noise scaling factors were first computed theoretically to give interaural correlation values ranging from 0.0 (completely independent noises) to 1.0 (the same noise applied to both ears) in increments of 0.1 by using the following equation
$$x = \frac{r^{2} - r\sqrt{1 - r^{2}}}{2r^{2} - 1} \tag{1}$$
Thus a correlation r is obtained when the right noise is generated by adding x multiplied by the left noise to (1 - x) multiplied by an independent noise. The correlations obtained were checked by capturing 1-s samples of the left and right output waveforms and performing a cross-correlation. In the first few experiments, 50-ms noise bursts were used. However, the theoretical correlation value is the long-term average, and over short time epochs the correlation may be very different. Combined with the relatively narrow filtering properties of IC neurons, this may lead to higher interaural correlations than otherwise predicted. To avoid this problem in later experiments we took two steps. The first was to use relatively long stimulus durations (320 ms), and the second was to use a different sample of noise on the right ear on every sweep. Using the factors obtained from Eq. 1 led to a small imbalance in the sound levels of the noise at the ears. This was 0 dB for correlations of 1.0 and 0.0 and maximally 3.0 dB for correlations of 0.8. Because sensitivity to interaural delay is relatively insensitive to the interaural level difference over quite wide ranges (Kuwada and Yin 1983; Peña et al. 1996), this small imbalance was not considered to be important in the interpretation of the data.
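The mixing rule of Eq. 1 can be checked numerically. The sketch below is a minimal illustration, not the synthesis software used in the experiments. It assumes that the two independent noises have equal power, and that the "random number generator" values summed in groups of 16 are uniformly distributed; the function names and sample counts are hypothetical.

```python
import math
import random


def mixing_factor(r):
    """Eq. 1: weight x applied to the left-ear noise for a target correlation r."""
    denom = 2.0 * r * r - 1.0
    if abs(denom) < 1e-12:
        return 0.5  # Eq. 1 is 0/0 at r = 1/sqrt(2); its limit there is 0.5
    return (r * r - r * math.sqrt(1.0 - r * r)) / denom


def summed_uniform_noise(n, rng):
    """n samples, each the sum of 16 uniform random values, centered on zero.

    The uniform distribution is an assumption; the text only specifies a sum
    of 16 values from a random number generator.
    """
    return [sum(rng.random() for _ in range(16)) - 8.0 for _ in range(n)]


def right_ear_noise(left, independent, r):
    """Right-ear noise: x times the left noise plus (1 - x) times an independent noise."""
    x = mixing_factor(r)
    return [x * a + (1.0 - x) * b for a, b in zip(left, independent)]


def pearson(u, v):
    """Zero-lag correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def level_imbalance_db(r):
    """Right-ear level relative to the left ear (dB), assuming equal-power noises."""
    x = mixing_factor(r)
    return 10.0 * math.log10(x * x + (1.0 - x) ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    left = summed_uniform_noise(20_000, rng)
    independent = summed_uniform_noise(20_000, rng)
    for r in (1.0, 0.8, 0.5, 0.2, 0.0):
        right = right_ear_noise(left, independent, r)
        print(r, round(pearson(left, right), 3), round(level_imbalance_db(r), 2))
```

Under these assumptions the measured correlation tracks the target value, and the computed level difference at a correlation of 0.8 is close to 3 dB, in line with the imbalance described above.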

The 500-Hz tones used in conjunction with the noise were generated by a Hewlett Packard 3325A waveform synthesizer and were presented to the ears via a digital delay line set either to produce So (i.e., synchronous gating) or Spi (the tone to the right ear was delayed by one half period of the tone frequency). The two-channel digital delay line enabled the interaural delays of the tones to be set by computer or manually with a resolution of 1 µs. The tone waveform was not phase locked to the frozen noises. The tones were gated on and off simultaneously with the noise stimuli.

Other tonal stimuli were digitally synthesized and output from a digital analog converter (TDT QDA2) and a waveform reconstruction filter (Kemo VBF33, cutoff 5 kHz, slope 135 dB/octave).

Data collection and analysis

Single neurons were isolated with 50-ms tone and/or noise bursts as search stimuli. The extracellularly recorded neural action potentials were amplified (Axoprobe 1A: ×100, and a further ×10 by using an in-house amplifier), filtered (155-1,800 Hz), converted to logic pulses by an amplitude discriminator and timed with 10-µs resolution (CED 1401 plus). The lowest binaural threshold to interaurally in-phase tones and the frequency at which it was obtained (BF) were determined audiovisually. The spontaneous rate was routinely measured over a 10-s period in the absence of controlled acoustic stimulation.

The following analyses were carried out in this study.

FREQUENCY-RESPONSE AREAS. These were obtained by presenting 50-ms tone bursts in a pseudorandom order at a rate of 5/s covering a frequency range from two octaves above to four octaves below the unit's BF in steps of 0.12 octaves and over a sound level range of 100 dB in 5-dB steps. The number of spikes elicited by a single frequency and level combination was counted and, after smoothing, displayed as graded blocks with densities proportional to the spike count at the appropriate frequency and level position. This analysis was run binaurally with a fixed interaural time delay (ITD) of zero across all frequencies.

INTERAURAL PHASE DIFFERENCE (IPD) HISTOGRAMS. IPD histograms were measured with binaural beat stimuli (Kuwada et al. 1979). Digitally synthesized tones that differed by 1 Hz to the two ears were used, resulting in a linear change in the binaural phase disparity at a rate of 360°/s. The frequency of the signal delivered to the left ear (contralateral to the recording site) was always 1 Hz greater than that delivered to the right (ipsilateral) ear. The duration of the stimulus was 3,000 ms, which included three complete cycles of the entire range of possible IPDs. The stimulus was repeated 10 times with an interstimulus interval of 6.5 s. The best IPD, the corresponding mean best delay for the signal (SBD), and the vector strength for each frequency were computed from the combined middle two cycles (over the middle 2 s of the stimulus duration) by using the method described by Goldberg and Brown (1969) and Yin and Kuwada (1983a). All neurons were tested with BF tones and with 500-Hz tones. Because we were concerned with the most common BMLD condition, we then analyzed only units that responded to 500 Hz.
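The vector-strength computation referred to above can be sketched as follows. This is a generic implementation of the standard resultant-vector calculation, not the authors' analysis code. It assumes that the two tones start in phase, so that the IPD in cycles at time t equals t multiplied by the beat frequency; the function names and the sign convention for the best delay are hypothetical.

```python
import cmath
import math


def middle_cycle_phases(spike_times_s, beat_hz=1.0, start_s=0.5, stop_s=2.5):
    """IPD (in cycles, 0 to 1) of each spike within the middle 2 s of a 3-s beat."""
    return [(t * beat_hz) % 1.0 for t in spike_times_s if start_s <= t < stop_s]


def vector_strength(phases_cycles):
    """Return (vector strength, mean phase in cycles) of a list of spike phases."""
    if not phases_cycles:
        raise ValueError("no spikes in the analysis window")
    resultant = sum(cmath.exp(2j * math.pi * p) for p in phases_cycles)
    resultant /= len(phases_cycles)
    mean_phase = (cmath.phase(resultant) / (2.0 * math.pi)) % 1.0
    return abs(resultant), mean_phase


def best_delay_us(best_ipd_cycles, tone_hz):
    """Convert a best IPD to a delay in microseconds, wrapped to +/- half a period."""
    ipd = best_ipd_cycles if best_ipd_cycles <= 0.5 else best_ipd_cycles - 1.0
    return ipd / tone_hz * 1e6


if __name__ == "__main__":
    spikes = [0.72, 0.76, 0.80, 1.73, 1.78, 1.81]  # made-up spike times (s)
    strength, best_ipd = vector_strength(middle_cycle_phases(spikes))
    print(round(strength, 3), round(best_ipd, 3),
          round(best_delay_us(best_ipd, 500.0)))
```

A vector strength near 1 indicates that spikes cluster at one IPD, and a value near 0 indicates no IPD preference. At 500 Hz, a best IPD of a quarter cycle corresponds to a delay of 500 µs.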

NDFS. NDFs were measured by presenting frozen noises with interaural time disparities over a range equal to 3 times the period of the neuron's BF, in 52 equal delay steps, starting from ipsilateral leading. The duration of the stimulus was 320 ms with three repetitions presented at one per second. NDFs were measured for a range of interaural correlations of the noise and also for noise identical at the two ears (No) in the presence of synchronously gated 500-Hz tones at a range of tone levels.

MASKED RATE-LEVEL FUNCTIONS (MRLFS). MRLFs were obtained by running tone rate-level functions in the presence of a noise masker at a fixed level. The masker noise was either No or Npi (inverted at 1 ear). Tone rate-level functions were generated by presenting tones (50-ms duration, rise-fall time 1 ms) and noise (5 kHz bandwidth) simultaneously and varying the level of the tone pseudorandomly over a maximum range of 100 dB in 1-dB steps. The fixed noise level was arbitrarily chosen to be 7-15 dB above the No noise threshold, a level at which a reasonable No-driven response and a well-tuned NDF were obtained. Similar levels were used for the Npi noise. Possible order effects were minimized by ensuring that each stimulus was never >50 dB weaker than the one preceding it. The number of spikes elicited by each tone was counted, and the average MRLF was computed from 10 presentations at each level. The frequency of the tone used was 500 Hz either interaurally in phase (So) or out of phase (Spi). The stimulus presentation consisted of alternating the signal-plus-noise and the noise alone at a rate of 5/s. Only the discharges in the signal-plus-noise interval were used to construct the MRLFs.
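One way to produce a pseudorandom level order that respects the 50-dB constraint is sketched below. The text does not describe how the constraint was enforced, so the procedure (random sequential selection with restarts), the function name, and all parameters are assumptions made for illustration only.

```python
import random


def constrained_level_order(levels, max_drop_db=50, seed=0, max_tries=1000):
    """Return the levels in random order such that no level is more than
    max_drop_db below the level presented immediately before it.

    Hypothetical procedure: draw each next level at random from those that
    satisfy the constraint, and start again if no allowed level remains.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        remaining = list(levels)
        order = [remaining.pop(rng.randrange(len(remaining)))]
        while remaining:
            allowed = [lv for lv in remaining if order[-1] - lv <= max_drop_db]
            if not allowed:
                break  # dead end: restart with a fresh random sequence
            choice = rng.choice(allowed)
            remaining.remove(choice)
            order.append(choice)
        if not remaining:
            return order
    raise RuntimeError("no valid order found within max_tries")


if __name__ == "__main__":
    order = constrained_level_order(range(0, 101))  # 100-dB range, 1-dB steps
    print(order[:10])
```

This selection rule is not uniform over all valid orders, because higher levels are always eligible and therefore tend to be drawn earlier; it only guarantees that the stated constraint holds.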


    RESULTS

First, to provide the context for the desynchronization study, we provide examples of the discharge rate reductions in the IC that result from adding tones to noises.

Responses of IC neurons to signals evoking BMLDs

NoSo VERSUS NoSpi. The MRLFs of two single neurons in the IC to 500-Hz So and Spi tones in the presence of No noise are illustrated in Fig. 1. Depending on the delay sensitivity to tones and noise, the No noise is more or less effective in evoking a steady level of neural activity, and the 500-Hz tones may cause either increases or decreases in the noise-evoked activity. For the units shown in Fig. 1, the So tone simply increases the discharge rate. However, although the Spi tone causes an increase in the discharge rate over and above that caused by the noise for the unit shown in Fig. 1A, it causes a reduction in the discharge rate of the unit in Fig. 1B. The relationship of such discharge rate changes to the neuron's delay sensitivity was described in detail elsewhere (Jiang et al. 1997a,b). The arrows in Fig. 1 indicate the masked thresholds for 500-Hz tones determined by a method based on signal detection theory (Jiang et al. 1997a), and it can be seen that Spi tones were often detectable at a lower sound level than So tones, even in individual units. Thus the units in Fig. 1 exhibit BMLDs in the same direction as the psychophysical data.



 
Fig. 1. Discharge rate versus level functions of 2 different inferior colliculus (IC) neurons in response to 500-Hz dichotic tones presented either in phase (So) or out of phase (Spi) in the presence of noise identical at the 2 ears (No). A: best frequency (BF) = 0.22 kHz, threshold = 35 dB SPL, SR = 5.9 sp/s. Noise spectral level at threshold = 11 dB SPL, noise level at 10 dB above threshold. B: BF = 0.644 kHz, threshold = 23 dB SPL, SR = 6.4 sp/s. Noise spectral level at threshold = -6 dB SPL, noise level at 10 dB above threshold. The filled circles in each plot show the responses to the So tones, and the unfilled circles show the responses to the Spi tones. The labeled arrows show the masked thresholds for the 500-Hz tones determined by signal detection methods.

NoSo VERSUS NpiSo. The two units in Fig. 2 show the effects of increasing the level of So tones in the presence of No and Npi noises. The majority of units have a peak in their NDF near zero delay, and the No noise is more effective in driving the unit than the Npi noise (Fig. 2A). As the So tone is steadily increased in the presence of either No or Npi noise the discharge rate is increased in Fig. 2A. However, for those units whose delay function is characterized by a trough rather than a peak near zero interaural delay (see Palmer et al. 1990; Yin et al. 1986), Npi noise is more effective than No noise (Fig. 2B). Thus the unit shown in Fig. 2B is better activated by Npi noise, and the So tone (also located at or near the trough region) produces decreases in discharge rate in the presence of either the No or Npi noise. For the unit shown in Fig. 2B the high level of activity caused by the Npi noise is considerably reduced by the increasing level of the So tone. The No noise is relatively ineffective in driving the unit, but the So tone still produces a further decrease in discharge rate. The So tones produced increased response at a relatively high level in the presence of both No and Npi noise (possibly because the output is dominated by monaural coincidences) (see Han and Colburn 1993). The vertical arrows in Fig. 2 indicate the masked thresholds for the 500-Hz tones computed with signal detection theory. In the examples shown the threshold of the So signal in Npi noise is lower for one of the units (Fig. 2A) but higher for the other. Over our complete sample of data measured in noise at 10 dB above threshold the So signal was detectable at a lower level in Npi noise than No noise in 33 of 62 neurons (unpublished data).



 
Fig. 2. Discharge rate vs. level functions of 2 different IC neurons in response to 500-Hz dichotic tones presented in phase (So) in the presence of noise either identical at the 2 ears (No) or inverted at 1 ear (Npi). A: BF = 0.451 kHz, threshold = 40 dB SPL, SR = 0.8 sp/s. Noise spectral level at threshold = 13 dB SPL, noise level at 4 dB above threshold. B: BF = 0.241 kHz, threshold = 42 dB SPL, SR = 0.0 sp/s. Noise spectral level at threshold = 20 dB SPL, noise level at 10 dB above threshold. The filled circles in each plot show the responses to the So tones in No noise, and the unfilled circles show the responses to So tones in Npi noise. The labeled arrows show the masked thresholds for the 500-Hz tones determined by signal detection methods.

Responses to noise as a function of the interaural cross-correlation

When the noise at the two ears is fully correlated (identical), the NDF (Fig. 3, solid squares) has pronounced peaks and troughs separated by intervals equal to the period of the unit BF (see McAlpine et al. 1996; Palmer et al. 1990; and Yin et al. 1986). In all instances, the NDF became progressively less modulated as the interaural correlation of the noise was reduced. At zero interaural correlation, all but 1 of the 48 neurons showed NDFs that were not modulated with interaural time delay. The one neuron that did not show a flat NDF with uncorrelated noise was one of seven neurons from earlier experiments in which short-duration (50 ms) noise bursts were used. This neuron (Fig. 3D) still showed some structure to its NDF, although the variation in discharge rate with ITD was substantially attenuated. However, the effect of decorrelating the noise was not entirely stereotypical in that, although in Fig. 3A there was a progressive demodulation of the NDF toward the mean value of the function for a correlation of 1.0 (dotted horizontal line), in Fig. 3B an interaural correlation of 0.8 led to a small increase in the discharge across all delays. Also, the unit in Fig. 3D showed a decline in discharge rate at all delays when the correlation was reduced from 1.0 to 0.8, and subsequent decreases produced demodulation toward the mean of the 0.8 rather than that of the 1.0 function. These effects cannot be explained in terms of changes in responsiveness over time. Demodulation of the NDF toward the mean value by decorrelating the noise at the two ears was previously reported by Yin et al. (1987), and we only used the full range of interaural correlations for these four units and one other that responded in a similar manner. Subsequently, to save recording time, we used only interaural correlations of 1.0 and 0.0.



 
Fig. 3. Spike count functions vs. interaural delay for 4 different IC neurons as a function of the interaural correlation of the dichotic noise. A: BF = 0.34 kHz, threshold = 39 dB, SR = 0.0 sp/s, noise spectral level at threshold = 18 dB SPL, noise best delay = 540 µs, noise at 10 dB above threshold. B: BF = 1.3 kHz, threshold = 22 dB SPL, SR = 0.7 sp/s, noise spectral level at threshold = -4 dB SPL, noise best delay = 120 µs, noise at 10 dB above threshold. C: BF = 0.669 kHz, threshold = 27 dB SPL, SR = 1.6 sp/s, noise spectral level at threshold = 16 dB SPL, noise best delay = 240 µs, noise at 10 dB above threshold. D: BF = 0.25 kHz, threshold = 38 dB SPL, SR = 0.0 sp/s, noise spectral level at threshold = 21 dB SPL, noise best delay = 480 µs, noise at 10 dB above threshold. Filled squares, correlation = 1.0; open circles, correlation = 0.8; [symbol missing], correlation = 0.5; open triangles, correlation = 0.2; filled triangles, correlation = 0.0. In Fig. 3D the open triangles indicate a correlation of 0.3.

Of the 47 neurons for which NDFs were flat for zero correlation (i.e., discharge rate was not dependent on interaural time delay), more than one-half showed a discharge rate approximately equal to the mean value of the function at 1.0 interaural correlation (25/47). For units for which the major feature of the NDF is a trough not a peak, decorrelation led to a flat function at the level of the maximum of the fully correlated NDF (6/47). The discharge rate of the uncorrelated NDF for the remaining 16 units was below the mean of the correlated function, and in 11 of these the uncorrelated discharge rates approximated the minimum of the correlated NDF.

For comparison with the effects of 500-Hz tones on the responses to No noise the discharge rate changes that occur at zero ITD on the NDFs are most relevant. These discharge-rate changes are shown in Fig. 4 as a function of the noise correlation for five units (Fig. 4, A and B) and as a difference between 0 and 1.0 correlation for the whole sample (Fig. 4, C and D). For some units, decorrelation produced virtually no change in the discharge rate, as can be seen from the closed circles in Fig. 4, A and B; these data were from the unit also shown in Fig. 3A. From Fig. 3A it can be seen that the demodulation that occurs as a result of decorrelation collapses the functions to the discharge rate measured at zero ITD. This would result in a decrease in the discharge rate at the noise best delay and an increase in the discharge rate at the noise worst delay. This unit produced a very small BMLD for NoSo versus NoSpi signals. Other units did produce quite large changes in the discharge rate as the noise was decorrelated as can be seen from the solid triangles in Fig. 4, A and B, which is the same unit as shown in Fig. 3D. Figure 4, C and D, shows the relationship between the best delay to the noise and the change in discharge rate caused by decorrelation. The largest decreases in discharge rate when the noise is decorrelated occurred in units with best delays close to zero, as one would expect, and the increases in discharge were generally associated with units with long best delays. Five of the eight units showing increases had troughs in their delay functions near zero ITD, and hence their noise best delays were displaced by large fractions of the BF period (Fig. 4D).



 
Fig. 4. Change in discharge rate in response to dichotic noise at 0 interaural time delay as the correlation at the ears is reduced. A: as a function of the correlation for all 5 units for which a range of correlations were employed. B: same data as in A normalized to percentage of the difference between the maximum peak and minimum trough in the fully correlated noise delay function (NDF). C: discharge rate changes for the whole sample of units as the noise is decorrelated from 1.0 to 0.0. The data were normalized to overcome differences in raw discharge rates by expressing them as a percentage of the difference between the maximum peak and the minimum trough in the fully correlated NDF. D: same data as in C but plotted against the delay expressed in terms of the period of the BF.
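The normalization described in the legend of Fig. 4 amounts to a simple ratio. The sketch below is illustrative only; the function name, the argument layout, and the sign convention (decorrelated rate minus correlated rate, so that a decrease is negative) are assumptions.

```python
def normalized_rate_change(ndf_correlated, rate_correlated, rate_decorrelated):
    """Change in discharge rate on decorrelation, as a percentage of the
    peak-to-trough range of the fully correlated NDF.

    ndf_correlated    : spike counts of the fully correlated NDF, one per delay
    rate_correlated   : rate at the delay of interest for a correlation of 1.0
    rate_decorrelated : rate at the same delay for the reduced correlation
    """
    peak = max(ndf_correlated)
    trough = min(ndf_correlated)
    if peak == trough:
        raise ValueError("fully correlated NDF is unmodulated")
    return 100.0 * (rate_decorrelated - rate_correlated) / (peak - trough)


if __name__ == "__main__":
    ndf = [10, 30, 50, 30, 10]  # made-up spike counts across delay
    # Rate at zero ITD falls from the peak (50) to the midpoint (30).
    print(normalized_rate_change(ndf, rate_correlated=50, rate_decorrelated=30))
```

With these made-up values the change is half of the peak-to-trough range, which is the value expected when a peak at zero ITD collapses to the mean of the peaks and troughs.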

In response to noise, the delay function is dominated by signal components close to the unit BF such that the NDF is periodic at the BF period (McAlpine et al. 1996; Palmer et al. 1990; Yin et al. 1987). Thus the responses on the NDFs at ± one-half a period of the BF tone either side of zero indicate the effect of decorrelation on the response to Npi noise. For the individual neurons (Fig. 5, A and B) there is a general tendency for the discharge rate to increase as the noise is decorrelated. Decorrelation produces an increase in discharge rate for most neurons with best delays near zero and a decrease for those with best delays close to one-half a period of the BF away from zero (Fig. 5, C and D).



 
Fig. 5. Change in discharge rate in response to dichotic noise at an interaural time delay corresponding to Npi as the correlation at the ears is reduced. There are 2 points on the noise delay curves located at one-half a period of the BF away from 0. In general, the position located at ipsilateral leads produced larger changes, but here we plotted the mean of the 2 values. Format as in Fig. 4.

Comparison of the effects of interaural decorrelation with those of adding 500-Hz tones

Figure 6 shows an extensive data set obtained from one IC neuron with a BF of ~250 Hz (Fig. 6A). On decorrelating the noise, the NDF was demodulated and became flat at a discharge rate approximately equal to the mean of the fully correlated NDF (Fig. 6B). In the presence of No noise, increasing levels of the 500-Hz So tones caused a progressive increase in the discharge rate, and the Spi tones caused a progressive decrease (Fig. 6, C and D). The increase for So and the decrease for Spi is consistent with the responses to binaural beats (Fig. 6E) in which So was close to the response maximum and Spi was at the response minimum (marked by arrows).



 
Fig. 6. Response profile of a single unit in the IC with a BF of 300 Hz. A: frequency/intensity response area determined with tones presented to both ears with 0 interaural delay. The vertical line shows the position of 500 Hz. B: response to noise as a function of the interaural time delay for interaural correlations of 1.0 and 0.0. C and D: masked rate-level functions (MRLFs) to 500-Hz tones in the presence of identical noise at a spectral level of 31 dB SPL at the 2 ears either with 0 (C, So) or pi (D, Spi) interaural phase difference. The thin vertical lines in C and D show the tone levels at which NDFs of F and G were measured. E: response to binaural beats at 500 Hz. F: responses to noise as a function of the interaural time delay as the level of a 500-Hz So tone is progressively increased. G: responses to noise as a function of the interaural time delay as the level of a 500-Hz Spi tone is progressively increased. In each panel of F and G the fully correlated NDF is shown for comparison. The dashed horizontal line is the average value of the fully uncorrelated noise function. The MRLFs were computed with short tones and noise (50 ms), and the NDFs were computed with 320-ms tones producing many more spikes per stimulus.

Figure 6F shows a series of NDFs measured in the presence of 500-Hz So tones at the levels marked on the MRLFs of Fig. 6C by thin vertical lines. The NoSo masked threshold for this unit was 62 dB SPL. At 46 dB SPL, the So tone produced a small increase in the discharge to the noise at all ITDs. This increase became larger as the level of the So tone was raised, but even at 66 dB SPL clear peaks and troughs in the NDF were evident; by 76 dB SPL, however, the function was completely flat. The discharge rate at the best delay saturated even at the lowest So level, and as the So tone was increased in level the discharge rate at all other ITDs also reached this saturated rate. The NoSpi masked threshold of this unit was 51 dB SPL, and a quite different pattern of changes took place as the level of the Spi tone was increased (Fig. 6G). Very little effect was seen at 46 dB SPL, but at higher levels the NDF was progressively demodulated, reaching a discharge rate approximately equal to that of the decorrelated NDF (horizontal dashed line). It appears that for this unit, whereas So tones contribute extra coincident inputs to the coincidence detectors, Spi tones produce fewer coincident inputs in a manner qualitatively similar to the effect of decorrelating the noise. Full decorrelation of the responses to the noise required an Spi tone at 76 dB SPL, a level 30 dB above that producing a decrease in discharge rate. Responses that we considered equivalent to decorrelating the noise were found in 32 of 48 neurons.

In contrast, the data shown in Fig. 7 depict a unit with a BF of 700 Hz for which the responses to Spi tones do not appear to mirror the effects of decorrelation. Figure 7B indicates that the effect of decorrelating the noise is virtually indistinguishable from that in Fig. 6; decorrelation demodulates the NDF to the mean. The MRLFs (Fig. 7, C and D) also show increases in discharge for So and decreases for Spi, and the effects of 500-Hz So tones (Fig. 7F) on the NDF are similar to those in Fig. 6F. However, raising the level of the Spi tones over a range of only 20 dB reduced the discharge rate evoked by the No noise to the minimum of the NDF (Fig. 7G), a more profound reduction than that caused by decorrelating the noise (dashed line). Effects such as those in Fig. 7 were found in 13 of 48 neurons.



Fig. 7. Response profile of a second IC unit with BF = 700 Hz in the same format as Fig. 6 with a noise spectral level of 4 dB SPL.

In some instances (among the 13 described previously), we found mixed responses (not shown) that appear to mirror the effects of decorrelation (demodulation) for low levels of Spi but produce more profound reductions in discharge at higher Spi levels (reduction to the minimum of the NDF). In these neurons it appears that, after first decorrelating the responses to the noise, the high levels of tone dominate the response and the output reduces to that characteristic of the Spi tone alone.

A radical departure from the simple desynchronization hypothesis is shown by the final example (the only neuron showing this pattern) in Fig. 8. Here, in a unit with BF at 500 Hz, decorrelation of the noise causes demodulation of the NDF toward the mean of the correlated function (Fig. 8B). Note that the lowest levels of the added tones [51 and 41 dB SPL (Fig. 8, F and G)] were chosen to fall in the low-level region of the response area (which is shown in relative tone levels: 0 dB at 500 Hz was 100 dB SPL). However, increasing the level of both So and Spi tones reduced the discharge rate caused by the No noise (Fig. 8, C and D). One reason that this occurs is shown by the binaural beat response in Fig. 8E, which reveals that neither So nor Spi tones occur near the peak of the unit's delay sensitivity; the Spi tone is at the minimum of the response and the So tone falls on the medial edge of the response peak. Figure 8, F and G, shows the effects on the NDF of increasing the levels of So and Spi tones. Unlike in previous examples, both the So and Spi tones demodulate the NDF toward the baseline. In its simplest form, the prediction of the desynchronization model is that only one of the tones should cause a decrease in the noise-evoked activity and the other should cause an increase.



Fig. 8. Response profile of a third IC unit with BF = 500 Hz in the same format as Fig. 6 with a noise spectral level of 18 dB SPL.

We cannot make a simple comparison of the effect of the added tones with the effect of decorrelation because we did not determine a priori what level of the tone should be compared with the fully decorrelated noise (see DISCUSSION). Nevertheless, we can compare the effects more indirectly. In Fig. 9 we plot the change in the discharge rate at zero ITD caused by full decorrelation (the value at zero ITD in the fully correlated NDF minus the value at zero ITD in the uncorrelated NDF) against the change in discharge rate caused by the added tones (the rate to the tones minus the rate to the No noise). We computed these values for the curves obtained when the tones were 10, 20, and 30 dB above the level at which there was a noticeable change in firing rate as indicated by the MRLF (because the masked threshold was computed only after the experiment, the levels of the tones are not exactly 10, 20, and 30 dB above the masked threshold). The lines on each plot are the regression fits. Even at 10 dB above threshold (Fig. 9B) there is a tendency for the Spi discharge rates to covary with the desynchronization (R² = 0.228), and there is no such tendency for the So data (Fig. 9A; R² = 0.015). These trends are also apparent in the plots for 20 dB (Fig. 9, C and D) and 30 dB (Fig. 9, E and F) above threshold, where the covariation of the Spi data with the decorrelation data becomes progressively more pronounced (R² increasing to 0.49 and then to 0.643). The So data remain essentially uncorrelated at all tone levels.
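The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis code: the function names and argument names are hypothetical, an ordinary least-squares line is assumed for the regression fit, and the choice of which quantity goes on which axis is a guess (for a simple linear fit R² is the same either way).

```python
import numpy as np

def rate_changes(corr_ndf_at_zero, uncorr_ndf_at_zero,
                 rate_tone_plus_noise, rate_noise_alone):
    """Per-neuron rate changes as defined in the text (hypothetical helper).

    decorrelation change = correlated NDF at zero ITD - uncorrelated NDF at zero ITD
    tone change          = rate with tone added - rate to the No noise alone
    """
    decorr = np.asarray(corr_ndf_at_zero, float) - np.asarray(uncorr_ndf_at_zero, float)
    tone = np.asarray(rate_tone_plus_noise, float) - np.asarray(rate_noise_alone, float)
    return decorr, tone

def linear_fit_r2(x, y):
    """Ordinary least-squares line through (x, y); returns slope, intercept, R^2."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    ss_res = np.sum(residual ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)    # total variation about the mean
    return slope, intercept, 1.0 - ss_res / ss_tot
```

In use, one pair of arrays would be built per tone condition (So or Spi) and per tone level, and `linear_fit_r2(tone, decorr)` called on each pair. With these sign conventions, an Spi tone that lowers the rate gives a negative tone change, so a covariation with decorrelation would appear as a negative slope.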



Fig. 9. Comparison of the effects of decorrelating the noise at the 2 ears with the effect of adding So (A, C, and E) and Spi tones (B, D, and F). Tone levels re threshold are shown to the right. The equations for the regression fits and the correlation coefficients are shown.


    DISCUSSION

Although the BMLD is undoubtedly an indication of the use of two ears to improve the detection of signals in noise, it does not directly reflect the use of spatial positional cues in grouping and segregating sounds (e.g., Jeffress and McFadden 1971). The BMLD is measured in a threshold detection task and as such the tone signal is detected as a change in the ongoing activity caused by the masking noise rather than as a separate perceptual entity. Because at the output of an auditory filter at 500 Hz the noise is essentially tonal with random variations in amplitude and phase, a useful way of visualizing this process is in the form of vector addition as suggested by Webster (1951) and as was used in subsequent descriptions of the BMLD (e.g., Jeffress and McFadden 1971; Jeffress et al. 1952, 1956, 1962). We use this form of representation in Fig. 10 to illustrate the effects of adding So and Spi tones to identical noises at the ears and to emphasise the equivalence with uncorrelated noise signals. In Fig. 10A the identical noises at the right and left ears at a single time instant are indicated as the horizontal vectors. The phase of the noise signal is essentially random, and thus the resultant vector will also vary randomly in phase and amplitude. The addition of So signals produces an equal shift in the noise phases at both ears and thus has no effect on the interaural cues. The amplitude of the resultant is increased by the addition of the signal. By contrast, when Spi tones are added to identical noise signals, the phase shifts induced in the noise signals are in opposite directions, as shown in Fig. 10B. In this case, the addition of the signal generates an interaural phase and level difference (see Jeffress and McFadden 1971 for detailed discussion) between the noise signals reaching each ear. Thus at the coincidence detector the inputs would not arrive at the same time and can be considered to be desynchronized.
Again the phase of the noise will vary randomly, but the effects of adding the Spi tones will always be equal and opposite. Finally, Fig. 10C shows the equivalent vector representation for noise signals that are independent (uncorrelated). One way of generating noises with different degrees of correlation is to add different amounts of independent noises to a noise that is common at the two ears. The vector diagram was drawn in this way to illustrate the similarities with the NoSpi condition in Fig. 10B. (Although we produced the decorrelated noises in this paper by the addition of 2 independent noise sources, this is mathematically the same as decomposing the noise into a common component and 2 independent components as shown.) The independent noises at any instant generate interaural level and time differences. Because the common noise source and the added noises have independently random phases, the effect will be random; at some instants the signals at the two ears will be identical, but for most of the time there will be interaural differences that will be larger the more the noises are decorrelated. When the common noise component is absent, this represents uncorrelated noise at the two ears, and the interaural cues will also be random. Thus adding independent noise sources to a common source has the effect of desynchronizing the inputs to the coincidence detectors and is therefore analogous to adding Spi to No.
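The vector-addition argument for the tone conditions can be made concrete with complex phasors. The following Python sketch is illustrative only: the function names, amplitudes, and phases are hypothetical, and it treats the filtered noise at one instant as a single phasor, as in the vector description above.

```python
import cmath
import math

def ear_phasors(noise_amp, noise_phase, tone_amp, tone_phase, tone_interaural_phase):
    """Left and right resultants for identical noise (No) plus a tone.

    tone_interaural_phase is 0 for So and pi for Spi (radians).
    """
    noise = cmath.rect(noise_amp, noise_phase)            # same noise phasor at both ears
    tone_left = cmath.rect(tone_amp, tone_phase)
    tone_right = cmath.rect(tone_amp, tone_phase + tone_interaural_phase)
    return noise + tone_left, noise + tone_right

def interaural_differences(left, right):
    """Interaural phase difference (radians) and level difference (dB).

    Assumes neither resultant has zero magnitude.
    """
    ipd = cmath.phase(left * right.conjugate())           # wrapped to [-pi, pi]
    ild_db = 20.0 * math.log10(abs(left) / abs(right))
    return ipd, ild_db

# Hypothetical instant: unit-amplitude noise at phase 0, tone at half that amplitude.
for label, tone_ipd in (("NoSo", 0.0), ("NoSpi", math.pi)):
    left, right = ear_phasors(1.0, 0.0, 0.5, math.pi / 2, tone_ipd)
    print(label, interaural_differences(left, right))
```

For the So case the two resultants are identical, so both differences are zero. For the Spi case the resultant at one ear leads the noise by the same angle that the other lags it, which is the equal and opposite shift described above. Other tone phases relative to the noise also produce an interaural level difference.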



Fig. 10. Vector representation of the desynchronizing effects of adding an Spi tone compared with uncorrelated noise at the ears.

For some units at high tone levels the tone appeared to dominate the response and reduce the discharge rate to that characteristic of the Spi tone alone. However, for the majority even the highest levels of tone that we presented only resulted in a demodulation of the NDF to the mean level and thus appeared to be simply desynchronization rather than a result of domination by the tone.

The range of discharge rates to uncorrelated noise may represent different processes underlying the generation of the correlated noise delay curves. For example, the curves for which the decorrelated rates reached the correlated minima represent units for which the peaks but not the troughs are generated by coincident activity from each ear. For units characterized by a trough near zero delay it is the trough that is generated by coincident activity, and the peaks are due to noncoincident input (Yin and Kuwada 1983b). For the units in which the decorrelated discharge rate coincided with the average of the correlated, both the peaks and the troughs in the delay functions must represent the response to specific phase relations of the noise inputs from each side (after cochlear filtering). This is the type of unit that was modeled by Colburn et al. (1990), where it was shown that timed inhibition was not a necessary requirement for producing discharge rates lower than to either ear alone. Indeed timed inhibition seems implausible because different timing would be required to match the one-half period at different frequencies.

In some instances, the effect of adding Spi tones appeared to be more profound than decorrelating the noise. Here it appeared that the presence of an Spi tone at sufficient level significantly reduced the neuronal output in response to noise with any interaural delay. It is tempting to suggest that the Spi tones in this instance produce inhibition of the activity. If the inhibition does not derive from the lower brain stem it might be more local to the IC. This seems possible in view of the inhibitory inputs to the IC that derive from a range of lower nuclei (Faingold et al. 1991; Oliver and Shneiderman 1991) and the fact that there is good evidence for convergence onto delay-sensitive IC neurons (Kuwada et al. 1987, 1996; McAlpine et al. 1998). One plausible scenario for responses such as those shown in Fig. 7 would be for input from both a direct MSO input to an IC neuron and from a trough unit via an inhibitory interneuron (within IC or possibly from DNLL) (e.g., see Cai et al. 1998). The trough characteristic derives, in the brain stem, from inhibition from one ear and excitation from the other fed to a coincidence detector via paths of different length. Troughs in such units tend to be close to zero interaural delay and thus will result in little activity generated by So tones. Spi tones in contrast are generally not located near the trough and thus evoke activity. At the higher level this means that there would be strong inhibitory inflow from an interneuron fed by a trough unit for Spi stimuli and little or none for So stimuli. Thus for the unit in Fig. 7, although So tones gave responses such as those in Fig. 6, Spi tones did not demodulate the noise curves to the mean but rather at the higher levels produced suppression at all noise delays.

The unit shown in Fig. 8 requires an alternative explanation. Both the So and Spi tones appeared, at higher sound levels, to suppress progressively the output at all noise delays. This type of response we attributed (previously) to inhibitory inputs. Because both So and Spi had a similar suppressive effect, the inhibition would need to be delay independent and indeed might well be a result of a monaural inhibitory input to the neuron (this possibility was not tested at the time). More difficult to explain is the desynchronizing effect of the low-level So tones (Fig. 8F at 51 dB SPL). Because the noise delay curve has a peak near zero interaural delay, So tones at BF (500 Hz) should also produce spikes at the coincidence detector and thus augment the noise delay curve at all delays, as in Figs. 6F and 7F, rather than demodulating the noise responses. Clearly, the So tones here act more like the Spi tones in Fig. 7G, and the desynchronizing effect is superseded at higher sound levels by the delay-independent inhibitory effect of the 500-Hz tones. We have no plausible explanation for the initially desynchronizing action of the low-level So tones.

The most appropriate comparison between the effects of decorrelating the noises at the two ears and adding extra tones would be to use that level of tone that achieves the same degree of decorrelation. This can be computed within each frequency channel (Durlach et al. 1986), and to achieve full decorrelation of the noise appears to require a within-channel signal-to-noise ratio of 0 dB (van de Par and Kohlrausch 1995). Unfortunately, for physiological expediency while collecting data, we used fixed levels of the added tones. As a result, the within-channel signal-to-noise ratio varied, because the sample included BFs from 0.161 to 1.8 kHz and the noise levels were chosen specifically to give a good NDF (7-15 dB above binaural noise threshold). We therefore resorted to the analysis shown in Fig. 9, where we show that the higher levels of tone (30 dB above the levels at which they first affect the noise response) produce effects that covary with decorrelation.
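The link between the within-channel signal-to-noise ratio and interaural correlation can be checked numerically. The sketch below is not taken from the cited papers; it assumes the tone and the noise are uncorrelated within the channel, so that with left = N + S and right = N - S the normalized correlation is (P_N - P_S)/(P_N + P_S). The function names, sampling rate, and sample count are hypothetical, and broadband Gaussian noise stands in for the filtered noise.

```python
import numpy as np

def nospi_correlation(snr_db):
    """Interaural correlation for No noise plus an Spi tone.

    Assumes tone and noise are uncorrelated, so with left = N + S and
    right = N - S: rho = (1 - snr) / (1 + snr), snr being the power ratio.
    """
    snr = 10.0 ** (snr_db / 10.0)
    return (1.0 - snr) / (1.0 + snr)

def simulated_nospi_correlation(snr_db, n=200_000, fs=20_000.0, f_tone=500.0, seed=0):
    """Monte Carlo estimate of the same quantity (all parameters hypothetical)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)                      # unit-power noise, identical at both ears
    snr = 10.0 ** (snr_db / 10.0)
    t = np.arange(n) / fs
    tone = np.sqrt(2.0 * snr) * np.sin(2.0 * np.pi * f_tone * t)  # tone power = snr
    left = noise + tone
    right = noise - tone                                # Spi: tone inverted at one ear
    return np.corrcoef(left, right)[0, 1]

for level in (-20.0, -10.0, 0.0):
    print(level, nospi_correlation(level), simulated_nospi_correlation(level))
```

Under this assumption the correlation falls to zero when the tone and noise powers are equal, which agrees with the 0-dB figure quoted above, and it becomes negative at higher tone levels.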

These data are consistent with the hypothesis that, in the majority of neurons, the effect of adding Spi tonal signals to noise is equivalent to the effect of decorrelating the noise at the two ears. This is because the synchronized input to the brain stem coincidence detectors caused by identical noises at the two ears will be disrupted or desynchronized by the addition of the Spi tone signals. However, in some neurons there were effects of adding extra tones that could not be fully described in terms of such a desynchronization process.


    ACKNOWLEDGMENTS

We thank P. Moorjani for technical assistance, D. Marshall for computing the factors for producing decorrelated noise, and T. Shackleton and M. Akeroyd for providing valuable comments on early drafts of this paper.


    FOOTNOTES

Address for reprint requests: A. Palmer, MRC Institute of Hearing Research, University of Nottingham, University Park, Nottingham NG7 2RD, United Kingdom.

Received 11 May 1998; accepted in final form 23 October 1998.


    REFERENCES

0022-3077/99 $5.00 Copyright © 1999 The American Physiological Society