Center for Neural Science, New York University, New York, New York 10003
ABSTRACT
Krishna, B. Suresh and Malcolm N. Semple. Auditory Temporal Processing: Responses to Sinusoidally Amplitude-Modulated Tones in the Inferior Colliculus. J. Neurophysiol. 84: 255-273, 2000. Time-varying envelopes are a common feature of acoustic communication signals like human speech and induce a variety of percepts in human listeners. We studied the responses of 109 single neurons in the inferior colliculus (IC) of the anesthetized Mongolian gerbil to contralaterally presented sinusoidally amplitude-modulated (SAM) tones with a wide range of parameters. Modulation transfer functions (MTFs) based on average spike rate (rMTFs) showed regions of enhancement and suppression, where spike rates increased or decreased respectively as stimulus modulation depth increased. Specifically, almost all IC rMTFs could be described by some combination of a primary and a secondary region of enhancement and an intervening region of suppression, with these regions present to varying degrees in individual rMTFs. rMTF characteristics of most neurons were dependent on sound pressure level (SPL). rMTFs in most neurons with "onset" or "onset-sustained" peri-stimulus time histograms (PSTHs) in response to brief pure tones showed only a peaked primary region of enhancement. The region of suppression tended to occur in neurons with "sustained" or "pauser" PSTHs, and usually emerged at higher SPLs. The secondary region of enhancement was only found in eight neurons. The lowest modulation frequency at which the spike rate reached a clear peak ("best modulation frequency" or BMF) was measured. All but two mean BMFs lay between 0 and 100 Hz. Fifty percent of the 49 neurons tested over at least a 20-dB range of SPLs showed a BMF variation larger than 66% of their mean BMF. 
MTFs based on vector strength (tMTFs) showed a variety of patterns; although mostly similar to those reported from the cochlear nucleus, tMTFs of IC neurons showed higher maximum values, smaller dynamic range with depth, and a lower high-frequency limit for significant phase locking. Systematic and large increases in phase-lead commonly occurred as SPL increased. rMTFs measured at multiple carrier frequencies (Fcs) showed that the suppressive region was not the result of sideband inhibition. There was no systematic relationship between BMF and Fc of stimulation in the cells studied, even at low carrier frequencies. The results suggest various possible mechanisms that could create IC MTFs, and strongly support the idea that inhibitory inputs shape the rMTF by sharpening regions of enhancement and creating a suppressive region. The paucity of BMFs above 100 Hz argues against simple rate-coding schemes for pitch. Finally, any labeled line or topographic representation of modulation frequency is unlikely to be independent of SPL.
INTRODUCTION
Sinusoidally amplitude-modulated (SAM) tones
induce in human listeners, over different modulation frequency ranges,
a variety of percepts such as fluctuation, roughness, and pitch. SAM
signals have also been used in psychophysical experiments within a
linear systems framework to study the temporal filtering properties of the auditory system, as distinguished (but not separate) from its
spectral filtering properties (e.g., Dau et al. 1997; Salvi et al. 1982; Viemeister 1979). It
is known that human speech and animal communication sounds contain many
amplitude-modulated features (e.g., Rosen 1992). In
particular, the envelopes found in natural speech have been shown to
contain sufficient information below 50 Hz to allow a high degree of
speech recognition performance, even in the relative absence of
spectral cues (Shannon et al. 1995). Further, it has
been reported that training language-learning impaired children with
modified speech that included enhancement of envelope components in a
similar frequency range (3-30 Hz) led to significant improvements in
performance on various speech and language tests (Tallal et al. 1996). Understanding the processing of modulated signals is
also of relevance to the design of stimulation strategies for cochlear
implants, where one attempts to convey information about speech signals
with a temporally varying stimulation strength at a limited number of
points on the cochlea.
There now exist considerable data on physiological
responses to SAM tones in the auditory nerve (e.g., Joris and Yin 1992) and cochlear nucleus (e.g., Frisina et al. 1990; Moller 1974; Rhode 1994; Rhode and Greenberg 1994). Neurons in these lower levels
of the ascending auditory pathway convey information about the envelope
by "phase locking" to the modulation frequency of the SAM tone.
Spike rate variations with modulation frequency are generally poorly
tuned or nonexistent in the auditory nerve and for most neuronal types
in the cochlear nucleus. In contrast, neurons at higher levels of the
auditory system, including the inferior colliculus (IC), have been
shown to exhibit large variations in average spike rate as the
modulation frequency of the SAM tone is varied (e.g., Langner and Schreiner 1988; Rees and Moller 1983; Schreiner and Urbas 1988). It has been suggested that
this represents a transformation from a "temporal code" for
modulation frequency in the auditory nerve and cochlear nucleus to a
"rate code" at higher levels, and that this transformation is
essentially complete at the level of the IC (Langner and Schreiner 1988). However, many details of the parametric
dependence of the responses to SAM tones of single neurons at locations
other than the auditory nerve and cochlear nucleus remain unclear. As a
result, existing models of the processing of amplitude-modulation (AM)
at higher levels of the auditory system are poorly constrained by
experimental data and hence difficult to evaluate.
The IC is an obligatory relay in the primary lemniscal pathway from the
extensively interconnected auditory midbrain network to the cortex. It
receives ascending projections from various ipsilateral and
contralateral sites in the auditory periphery and descending
projections from the cortex (Oliver and Huerta 1991). As
a result of this extensive set of excitatory and inhibitory inputs and
because it is the primary source of projections to the thalamus and
cortex, it is often considered to perform an "integrative" role in
auditory processing. In particular, in contrast to auditory nerve
fibers and most neurons in the cochlear nucleus, prior studies (e.g.,
Heil et al. 1995; Langner and Schreiner 1988) of the responses of IC neurons to SAM tones with
different modulation frequencies (and all other parameters fixed) have
reported that most neurons in the IC show systematic variations of
average spike rate with modulation frequency, with neurons often
showing a clear maximum response at a best modulation frequency (BMF).
However, studies (e.g., Rees and Moller 1983; Rees and Palmer 1989) that have varied other parameters
of the stimulus have indicated that response patterns to SAM tones can
be strongly dependent on stimulus parameters other than modulation
frequency, like the sound pressure level (SPL) of stimulation.
Nonetheless, a systematic description of the spike rate and
synchronization properties of single IC neurons to SAM tones with a
wide range of parameters is still lacking. Such a description would
help clarify the implications of the emergent rate tuning in the
auditory midbrain, both for neural processing mechanisms and for the
representation of modulation frequency in the auditory system.
With this in mind, we present here the results of a detailed characterization of extracellularly recorded responses from single neurons in the physiologically defined central nucleus of the IC of the Mongolian gerbil (Meriones unguiculatus) to SAM tones with a range of modulation frequencies presented via a closed system to the contralateral ear at multiple modulation depths, SPLs, and carrier frequencies (Fcs).
METHODS
Surgical and recording procedures
Adult Mongolian gerbils with clean external ears and no sign of
middle ear infections were initially anesthetized with an intraperitoneal injection of pentobarbital sodium (60 mg/kg) and subsequently maintained in an areflexive state with supplemental intramuscular injections of ketamine hydrochloride (approximately 30 mg · kg⁻¹ · h⁻¹). A
heating pad was used to maintain a constant rectal temperature of
37°C. The pinnae were removed, and craniotomy of the interparietal bone was performed to expose the cerebellum. The animal was then transferred to a double-walled sound-attenuated room (IAC), where sound
delivery speculae were sealed to the temporal bone and muscle around
the opening of the external auditory meatus bilaterally.
Activity of single units was recorded with platinum-plated,
glass-insulated tungsten microelectrodes advanced into the IC through
the intact cerebellum with a stepping motor microdrive (CalTech).
Electrodes typically had exposed tip lengths of 5-10 µm and an
impedance of 2-5 MΩ at 1 kHz. Electrical signals recorded from the
brain were amplified (variable gain) and filtered (typically 0.25-10
kHz). Neural signals were displayed on an oscilloscope and fed to an
audio monitor. Single units were identified and isolated using the
criteria of waveform similarity and separation of successive spikes by
an absolute refractory period; only very well isolated single units
were studied. An event timer (MALab, Kaiser Instruments) logged the
occurrence of discriminated action potentials together with stimulus
zero-crossings to a resolution of 1 µs. Event times were stored in a
FIFO buffer from which they were retrieved by the host computer.
Acoustic stimuli were presented as the electrode was advanced to
isolate responsive units. As the electrode passed through the dorsal
mantle of the IC, responses that were broadly tuned, driven better by
broadband noise than tones, and often habituating to repeated tonal
stimulation were found. Occasionally, a coarse descending tonotopic
sequence was encountered. Entry into the physiologically defined
central nucleus was always marked by an increase in spontaneous
background activity and the beginning of a clear ascending tonotopic
progression, with narrowly tuned tone response areas that showed no
signs of habituation to repeated stimulation. Within the central
nucleus, both modulated and unmodulated tonal stimuli were effective in revealing responsive units.
Stimulus generation and data acquisition
Stimulus waveforms were generated by two digital synthesizers
controlled by a microprocessor and custom hardware for timing, data
logging, and waveform control (MALab, Kaiser Instruments). The
dedicated microprocessor communicated with a host computer via an
IEEE-488 interface. Stimuli were digitally attenuated and transduced by
electrostatic earphones (Stax Lambda, housing designed by G. Sokolich)
coupled to the ear pieces. For each individual animal, SPL (expressed
in dB re 20 µPa) near the tympanic membrane was calibrated for both
ears from 40 Hz to 40 kHz under computer control using a previously
calibrated probe tube and a condenser microphone (Bruel and Kjaer, type
4134). Phase was calibrated for both ears from 40 Hz to 5 kHz by
measuring the disparity in phase between a reference sinusoid generated
electrically and the recorded acoustic signal. Note that all the data
in this paper result from monaural contralateral presentation of sound.
The magnitude transfer function of the speaker was usually smooth, with
a slow rolloff until 25 kHz, the maximum frequency used; two or three
local resonances, never deviating by more than 10 dB from the baseline,
were also present. Deviations of the phase response from linearity were
small, and not usually present. Appropriate compensations were made to
the carrier prior to modulation. All acoustic stimuli were shaped
digitally with a cosine-squared ramp of 5-50 ms rise time to reduce
spectral splatter at onset and offset. The SAM tone is represented by
the formula $A \sin(\omega_c t)[1 + m \sin(\omega_m t)]$, where
$\omega_c$ is the angular frequency of the carrier,
$\omega_m$ that of the modulator, $A$ the
amplitude of the carrier, $t$ the time after signal onset, and
$m$ the modulation depth (0-100%). Such a stimulus, if
sufficiently long, has a simple three-component spectrum centered at
Fc, as shown in Fig. 1. The SPL of the SAM tone is defined as
the SPL of the carrier that is being modulated. The modulation
frequency is the frequency corresponding to $\omega_m$.
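The three-component spectrum mentioned above can be checked numerically. The following is a minimal sketch, not the stimulus-generation code used in the study: the sampling rate, carrier, modulation frequency, and depth are illustrative values chosen here, the helper names are hypothetical, and the cosine-squared onset/offset ramp is omitted.

```python
import math

def sam_tone(t, fc, fm, m, amplitude=1.0):
    """SAM waveform A*sin(w_c t)*[1 + m*sin(w_m t)] at time t (seconds)."""
    wc = 2.0 * math.pi * fc
    wm = 2.0 * math.pi * fm
    return amplitude * math.sin(wc * t) * (1.0 + m * math.sin(wm * t))

def component_amplitude(samples, fs, freq):
    """Amplitude of the sinusoidal component at `freq`, from a single-bin DFT."""
    n = len(samples)
    re = sum(x * math.cos(2.0 * math.pi * freq * k / fs) for k, x in enumerate(samples))
    im = sum(x * math.sin(2.0 * math.pi * freq * k / fs) for k, x in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

# Illustrative parameters (assumptions, not values from the paper).
fs = 8000.0                       # sampling rate, Hz
fc, fm, m = 1000.0, 100.0, 0.5    # carrier, modulation frequency, depth (50%)
samples = [sam_tone(k / fs, fc, fm, m) for k in range(int(fs))]  # 1 s of signal

carrier = component_amplitude(samples, fs, fc)         # component at Fc
lower = component_amplitude(samples, fs, fc - fm)      # lower sideband
upper = component_amplitude(samples, fs, fc + fm)      # upper sideband
elsewhere = component_amplitude(samples, fs, fc + 2 * fm)
```

With unit carrier amplitude, the carrier component has amplitude A and each sideband at Fc ± modulation frequency has amplitude A·m/2; no energy appears at other frequencies.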
Frequency response areas, either at a single SPL or at multiple SPLs,
were recorded using pure tones (usually 100 ms, with a rise time of 10 ms, and a repetition rate <2.5 Hz) and the best frequency (BF) was
noted. Following this, a complete spike rate versus SPL function
[rate-level function (RLF)] was usually measured (at BF). Neurons
were then studied using pure tones (at multiple SPLs) and SAM tones
while monitoring the responses (spike rate and vector strength) on-line
to gain an initial qualitative characterization of the neuron's
response properties. This initial exploration helped confine the formal
presentation to the ranges of levels, depths, carrier frequencies,
durations, and modulation frequencies where clear variations in
response were present, thus maximizing the use of the limited recording
time available. Modulation frequencies were varied within the range of
0.1 Hz to a value just less than the
Fc used. Stimulus durations varied
from 1 to 10 s (this was determined partly by the period of the
lowest modulation frequency used, so that at least 1 period was
presented), while rise times varied from 5 to 50 ms. Unless it was
itself the parameter being varied, Fc
was set to the BF of the neuron (as measured using pure tones) at the
SPL being tested. If responses were being recorded at multiple SPLs,
Fc was kept constant and usually equal
to the BF at an SPL in the middle of the spanned SPL range. Because it was not uncommon for the BF to vary with SPL (e.g., Kuwada et al. 1984), Fc sometimes
differed slightly from the BF at some SPLs. In any case, the BF can
also depend on the kind of stimulation (SAM tone vs. pure tone), the
duration of stimulation (100 ms vs. 1-10 s) and the modulation
frequency of stimulation (see RESULTS).
Data analysis
Three descriptive measures of the pure-tone response are used in
this paper. The first is the mean spike rate averaged over the duration
of stimulation. The second is the minimum (across all SPLs tested) mean
(averaged across stimulus presentations) latency at the BF of the
neuron or close to it (if the BF varied with SPL). Finally, neurons
were classified into one of four classes (onset, onset-sustained,
pauser, and sustained) on the basis of their peri-stimulus time
histograms (PSTHs) (as in Le Beau et al. 1996). However,
because PSTHs can change with SPL, we also used some additional
criteria: neurons were classified as pausers if they showed a pauser
pattern at any SPL, and neurons were classified as sustained or onset
types only if they showed this pattern at all SPLs. The few broad onset
neurons found were included in the onset-sustained category, and since
the identification of choppers requires more tone presentations than we
used, these were not separately identified.
The response to AM is also characterized by three measures: mean spike
rate (averaged over the duration of stimulation), vector strength at
the modulation frequency¹ (Goldberg and Brown 1969) and the mean phase-lead of the
response relative to the modulating sinusoid. The functions describing the variation of each of these measures with modulation frequency will
be referred to as modulation transfer functions (MTFs). MTFs plotted
using the spike rate, vector strength, and mean phase-leads are called
rate modulation transfer functions (rMTFs), temporal modulation
transfer functions (tMTFs), and phase modulation transfer functions
(pMTFs), respectively.
The vector strength is a measure of the synchrony to the modulating
waveform and is equal to $F_1/F_0$, where $F_1$ is the spectral magnitude of
the response at $\omega_m$ and $F_0$ the average spike rate. It varies
from a minimum of zero to a maximum of one, and as a reference, the
vector strength of the sinusoidal modulating waveform of a 100% depth
SAM tone is 0.5, while that of a half-wave rectified sinusoid is 0.785 (π/4). A response with all spikes at precisely the same unique phase has a
vector strength of 1 ("perfect" phase locking). The significance of
the vector strength was assessed using the Rayleigh statistic (Stephens 1969) at the 1% significance level.
Additionally, to minimize the contribution of onset spikes, at least
six spikes per stimulus presentation were required for significance.
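The vector strength measure described above can be sketched as follows. This is an illustrative implementation, not the authors' analysis code: the function names are hypothetical, the significance test uses the large-sample approximation P ≈ exp(−n·VS²) for the Rayleigh test (an assumption about how the 1% criterion would be applied), and the minimum spike-count criterion is not implemented. The last helper treats a nonnegative waveform as a period histogram to check the two reference values quoted in the text.

```python
import math

def vector_strength(spike_times, fm):
    """Return (vector strength, mean direction in degrees, spike count)."""
    n = len(spike_times)
    if n == 0:
        return 0.0, 0.0, 0
    x = sum(math.cos(2.0 * math.pi * fm * t) for t in spike_times)
    y = sum(math.sin(2.0 * math.pi * fm * t) for t in spike_times)
    vs = math.hypot(x, y) / n
    direction = math.degrees(math.atan2(y, x)) % 360.0
    return vs, direction, n

def rayleigh_significant(vs, n, alpha=0.01):
    """Large-sample Rayleigh test: P is approximately exp(-n * vs**2)."""
    return n > 0 and math.exp(-n * vs * vs) < alpha

def waveform_vector_strength(weight, bins=3600):
    """Vector strength of a nonnegative period histogram weight(phase)."""
    phases = [2.0 * math.pi * (k + 0.5) / bins for k in range(bins)]
    w = [weight(p) for p in phases]
    x = sum(wi * math.cos(p) for wi, p in zip(w, phases))
    y = sum(wi * math.sin(p) for wi, p in zip(w, phases))
    return math.hypot(x, y) / sum(w)

# Reference waveforms from the text.
vs_sam = waveform_vector_strength(lambda p: 1.0 + math.sin(p))      # 100% depth modulator
vs_hwr = waveform_vector_strength(lambda p: max(0.0, math.sin(p)))  # half-wave rectified

# Hypothetical spike train: one spike per 100-Hz cycle, always at a phase of 90 degrees.
spikes = [0.0025 + 0.01 * k for k in range(20)]
vs, direction, n = vector_strength(spikes, 100.0)
significant = rayleigh_significant(vs, n)
```

The modulator of a 100% depth SAM tone yields 0.5 and the half-wave rectified sinusoid yields π/4; the perfectly locked spike train yields a vector strength of 1 with a mean direction of 90°.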
The mean phase-lead was computed as $(90^\circ - \theta)$, where $\theta$ is the
direction of the mean vector in the vector strength calculation. The
mean phase-lead is equal to the phase of the spectral component of the
response at $\omega_m$, relative to that of the
modulating waveform. (The direction of the mean vector of the
modulating waveform is 90°, since it produces a unimodal nonnegative
period histogram symmetric about its peak at 90°.) The mean
phase-lead was decremented by an appropriate multiple of 360°, i.e.,
unwrapped, whenever it went from a response close to 360° to one
closer to 0°. pMTFs were truncated when the sampling resolution
became poor enough that the phase might have skipped more than one
cycle; this was not usually required, and usually only affected values
above 300 Hz. Only significant vector strengths and their associated
mean phase-leads are shown in the plots in this paper. Consistent with the fact that phase locking to pure tones is common only below 600 Hz
(Kuwada et al. 1984), we observed significant vector
strength at the carrier frequency only in the few neurons in our sample with BFs in that frequency range. Synchrony to the carrier will not be
discussed further in this report.
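The phase-lead computation and the unwrapping step can be sketched as below, writing θ for the direction of the mean vector. This is a hedged illustration: the function names are hypothetical, and the rule used to detect a wrap (a drop of more than 180° between successive modulation frequencies) is an assumption made here, since the text specifies only that the phase-lead was decremented when the response moved from close to 360° to closer to 0°.

```python
def mean_phase_lead(direction_deg):
    """Phase-lead relative to the modulator, whose mean direction is 90 degrees."""
    return 90.0 - direction_deg

def unwrap_phase_leads(directions_deg, jump=180.0):
    """Unwrap a pMTF sampled at increasing modulation frequencies.

    Each time the response direction drops by more than `jump` degrees
    (i.e., wraps from near 360 to near 0), every later phase-lead is
    decremented by a further 360 degrees.
    """
    leads = []
    turns = 0
    previous = None
    for d in directions_deg:
        if previous is not None and previous - d > jump:
            turns += 1
        leads.append(mean_phase_lead(d) - 360.0 * turns)
        previous = d
    return leads

# Hypothetical mean directions at five increasing modulation frequencies.
leads = unwrap_phase_leads([80.0, 200.0, 340.0, 30.0, 170.0])
```

In this example the direction wraps once (from 340° to 30°), so the last two phase-leads are shifted down by 360° and the unwrapped pMTF decreases monotonically.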
Most MTFs did not vary much when different 1-s time windows after the beginning of the stimulus were chosen for analysis, even though most neurons showed an adaptation of their mean firing rate during the stimulation period. All measures were therefore calculated over the entire stimulation period.
A BMF was defined for each rMTF as follows: first, the modulation frequency that elicited the maximum spike rate was identified. If there were two distinct maxima (e.g., Fig. 3A), the one at the lower modulation frequency (primary peak) was taken. Following this, a range of modulation frequencies (b1 to b2, Fig. 1E) where the response was >90% of this response maximum was extracted. The BMF was chosen as the mean of b1 and b2. This procedure essentially corrected for skewed or irregular peaks and was almost always close to the BMF as chosen by eye. BMFs were only measured if the spike rate dropped by at least 70% on both sides of the BMF. If a 70% drop was only present on the high-frequency side, then the modulation frequency (higher than the BMF) that elicited 90% of the maximum response was chosen as the corner frequency of the rMTF. A cutoff frequency was also measured; this was the frequency at which the response fell to the minimum spike rate plus 10% of the difference between the maximum and minimum spike rates, on the high-frequency side of the primary peak (w1 in Fig. 1E). Finally, a worst modulation frequency (WMF) was extracted in rMTFs with a clear suppressive region, as evidenced by a lower spike rate in comparison with that to one or more lower depth stimuli. The method was similar to that used for the BMF: the mean of a 10% range (w1 to w2, Fig. 1E) was taken as the WMF. In all cases, linear interpolation was used if required.
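The BMF procedure above can be sketched for a single-peaked rMTF. This is a minimal illustration under stated assumptions: the function names are hypothetical, interpolation is done on a linear frequency axis (the text does not say whether a linear or logarithmic axis was used), the global maximum is taken as the primary peak, and the 70% drop criterion, corner frequency, cutoff frequency, and WMF are not implemented.

```python
def _crossing(f_a, r_a, f_b, r_b, level):
    """Frequency between two samples where the rate equals `level` (linear interpolation)."""
    if r_a == r_b:
        return f_a
    return f_a + (level - r_a) * (f_b - f_a) / (r_b - r_a)

def best_modulation_frequency(freqs, rates, fraction=0.9):
    """Return (b1, b2, BMF) for the peak of an rMTF.

    freqs must be ascending; rates are the mean spike rates at those frequencies.
    """
    peak = rates.index(max(rates))   # list.index picks the lowest-frequency maximum on ties
    level = fraction * rates[peak]

    # Walk down in frequency while the response stays above the 90% level.
    i = peak
    while i > 0 and rates[i - 1] > level:
        i -= 1
    b1 = freqs[0] if i == 0 else _crossing(freqs[i - 1], rates[i - 1], freqs[i], rates[i], level)

    # Walk up in frequency while the response stays above the 90% level.
    j = peak
    while j < len(rates) - 1 and rates[j + 1] > level:
        j += 1
    if j == len(rates) - 1:
        b2 = freqs[-1]
    else:
        b2 = _crossing(freqs[j], rates[j], freqs[j + 1], rates[j + 1], level)

    return b1, b2, 0.5 * (b1 + b2)

# Hypothetical rMTF: modulation frequencies (Hz) and spike rates (spikes/s).
b1, b2, bmf = best_modulation_frequency([10.0, 20.0, 40.0, 80.0, 160.0],
                                        [20.0, 60.0, 100.0, 50.0, 10.0])
```

For this skewed peak the 90% range runs from 35 to 48 Hz, giving a BMF of 41.5 Hz rather than the 40 Hz of the sampled maximum, which illustrates the correction for skewed peaks.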
Kendall's $\tau$ (Press et al. 1993) is used often in this
paper as a nonparametric measure of correlation between two or three variables.
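For readers unfamiliar with the statistic, a minimal sketch of the two-variable Kendall's τ follows. It counts concordant and discordant pairs and handles ties in the manner of the tau-b variant; the function name and the sample data are illustrative, and the computation of the associated P value is omitted.

```python
import math

def kendall_tau(x, y):
    """Kendall's tau for paired samples, with tau-b style handling of ties."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue              # tied in both variables: counted nowhere
            if dx == 0:
                ties_x += 1           # tied in x only
            elif dy == 0:
                ties_y += 1           # tied in y only
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + ties_y) *
                      (concordant + discordant + ties_x))
    return (concordant - discordant) / denom if denom else 0.0

# Hypothetical data: 8 concordant and 2 discordant pairs out of 10.
tau = kendall_tau([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
tau_reversed = kendall_tau([1, 2, 3, 4], [4, 3, 2, 1])
```

Because the statistic depends only on the rank order of the values, it makes no assumption about the form of the relationship between the variables.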
RESULTS
We measured responses from 109 single neurons in the
physiologically characterized central nucleus of the IC in 34 Mongolian gerbils. Three onset neurons were unresponsive to AM across the entire
parameter space, a proportion similar to that reported earlier
(Rees and Palmer 1989).
Varying modulation depth
Varying SAM tone modulation depth revealed that IC rMTFs were composed of 1 or 2 regions of enhancement (modulation frequency ranges where the spike rate increases with increasing stimulus modulation depth) and/or 1 region of suppression (a range of modulation frequencies where the spike rate decreases as stimulus modulation depth increases). Figures 2 and 3 show illustrative examples of rMTFs and tMTFs measured from individual neurons (with a range of BFs) using SAM tones with varying modulation depths and constant SPL. In all cases (except Fig. 2D, see legend), the Fc was equal to the BF of the neuron at that SPL. The rMTFs in Fig. 2 show examples of neurons with rMTFs that predominantly show either an increase (A-C) or decrease (D) in spike rate with increasing modulation depth at all modulation frequencies. The magnitude of change depends on the modulation frequency. These changes create band-pass (A-C) or band-suppressive (D) rMTFs that are characterized by their single prominent region of enhancement or suppression, respectively. In contrast, Fig. 3 illustrates examples of neurons whose rMTFs show regions of both enhancement and suppression. In other words, both the direction (increase or decrease) and the magnitude of change (in spike rate with modulation depth) depend on the modulation frequency. The rMTFs in Fig. 3, A, C, and D, show secondary peaks at higher modulation frequencies; these were only found in eight neurons (see DISCUSSION). However, in contrast to responses at the primary peak (BMF), responses at the secondary peak possessed low vector strength; i.e., they were not synchronized to the modulation frequency. We formally recorded the effects of systematic variation of modulation depth from 31 neurons; in most cases, the spike rate at any given modulation frequency increased or decreased almost monotonically with increasing modulation depth.
tMTFs ranged from low-pass to band-pass (more peaked) shapes; this is
related to the SPL of stimulation, as will be shown in a subsequent
section. Notice that the vector strength remains high (and often
reaches its peak) as the spike rate reaches its minimum (Figs.
2D and 3). Also, low modulation depths can elicit responses
with high vector strengths, i.e., the dynamic range of the vector
strength measure is small as modulation depth is varied. For example,
the neuron in Fig. 2C showed a vector strength >0.8 in
response to a 10% stimulus modulation (an increase of 0.82 dB and a
decrease of 0.91 dB from the mean SPL) at 30 Hz. In contrast to the
rapid saturation of the vector strength of the response (VSr), the
vector strength of the modulating waveform (VSs; equal to half the
modulation depth) increases linearly with depth from 0 to 0.5. As a
result, an alternative measure of synchrony, modulation gain [20
log(VSr/VSs)] (e.g., Frisina et al. 1990) decreases
almost monotonically as the depth increases. No consistent trend was
found in the behavior of pMTFs at different modulation depths, and in
most cases, the magnitude of variation was small.
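The arithmetic behind the 10% depth example and the modulation gain measure can be reproduced with a short sketch. The function names are hypothetical; the vector strength of 0.8 is the lower bound quoted above for the neuron in Fig. 2C, so the resulting gain is likewise a lower bound.

```python
import math

def envelope_excursions_db(m):
    """Peak and trough of the envelope 1 + m*sin(.), in dB re the unmodulated carrier.

    Valid for 0 <= m < 1 (at m = 1 the trough is minus infinity dB).
    """
    return 20.0 * math.log10(1.0 + m), 20.0 * math.log10(1.0 - m)

def modulation_gain_db(vs_response, m):
    """Modulation gain 20*log10(VSr / VSs), with VSs equal to half the depth."""
    return 20.0 * math.log10(vs_response / (m / 2.0))

up_db, down_db = envelope_excursions_db(0.10)   # 10% modulation depth
gain_db = modulation_gain_db(0.8, 0.10)         # VSr = 0.8 at 10% depth
```

A 10% depth gives excursions of about +0.83 dB and −0.92 dB, and a response vector strength of 0.8 at that depth corresponds to a modulation gain of about 24 dB. Since VSr saturates while VSs keeps growing with depth, the gain falls as depth increases.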
Varying sound pressure level
The rMTFs for each of the neurons in Figs. 2 and 3 were recorded at a fixed SPL and Fc. Therefore it is of interest to know whether the observed rMTFs are invariant characteristics of the neuron, independent of the particular parameters (SPL and Fc) chosen for stimulation. Figure 4 partially answers this question by demonstrating that it is possible for a region of enhancement in the rMTF from a single neuron to be transformed systematically into a region of suppression as the SPL is increased. Note that these rMTFs were measured with the carrier at 1,500 Hz, while the effects of using a 1,200-Hz carrier are shown in Fig. 2D for the same neuron. The position of the rate minimum is not affected by this change in Fc, suggesting that spectral effects (like the positioning of the AM sidebands in putatively inhibitory regions of the frequency response area) are not important in its generation (see subsequent section on effects of varying carrier frequency). The tMTFs at multiple SPLs and 100% depth show a systematic "low-pass to band-pass shift," as a result of the decrease in vector strength at low modulation frequencies with increasing SPL (Fig. 4F).
To explore the emergence of suppression at higher SPLs further, we
systematically varied the SPL while keeping the depth at 100% and the
carrier at or close to the BF in 56 neurons. Illustrative examples
spanning the range of behaviors seen are shown in Figs. 5 and 6.
Figure 5 shows examples of MTFs at multiple SPLs from four neurons with
a range of best frequencies and with rMTFs that remain band pass (but
with varying bandwidths) at all tested SPLs. The neurons depicted in
A and B responded at all modulation frequencies tested, but suffered a loss of tuning at higher SPLs (possibly due to
saturation of the spike rate). A qualitatively similar pattern has been
reported previously (Rees and Palmer 1989). The neurons
in C and D were "transient responders," with
a poor or absent response to long-duration unmodulated tones. Neurons
of this kind maintain their tuning over a broad range of SPLs. The spike rate at a given modulation frequency varied with SPL in either a
monotonic or a nonmonotonic fashion, in a manner that could depend on
modulation frequency (A and D). The BMF showed shifts of variable direction and magnitude as the SPL increased. As in
Fig. 4, the tMTFs show variable degrees of a "low-pass to band-pass
shift," similar to that seen in the cochlear nucleus (Frisina et al. 1990; Rhode 1994; Rhode and Greenberg 1994).
Four examples of neurons with a suppressive region in their rMTFs are shown in Fig. 6. The suppressive region either emerges at higher SPLs (Fig. 6, A-C) as in the example of Fig. 4 or is present at all tested SPLs (Fig. 6D). The tMTFs show the same low-pass to band-pass shift seen in Fig. 5. Again, responses at modulation frequencies within the suppressive region retain considerable synchrony (i.e., show high vector strengths).
A correlation with some other neuronal response property would clearly be useful in delineating the possible mechanisms underlying the variety of rMTF changes seen with increases in SPL. No systematic relationship seems to exist between the presence of a suppressive region and the shape of the RLF (i.e., whether it is monotonic or nonmonotonic). However, it was found that the presence of a suppressive region in the rMTF at 1 or more SPLs was significantly correlated with the PSTH pattern in response to pure tones (Table 1). An rMTF region was defined as suppressive if the spike rate at modulation frequencies within that region showed a clear decrease as the modulation depth was increased (see METHODS). Neurons classified as pausers or sustained types on the basis of their PSTHs were more likely to possess rMTFs with a suppressive region, while rMTFs from neurons with onset or onset-sustained PSTHs usually only had a single region of enhancement (Fig. 5). However, it should be noted that using our aforementioned criterion, suppression can only be demonstrated clearly for neurons that respond well to unmodulated tones; for example, the rMTFs in Fig. 2B probably possess a suppressive region of very small magnitude between 20 and 100 Hz. Eleven of the 18 onset or onset-sustained neurons, and 2 of the 5 pauser neurons without a suppressive region in their rMTFs did not respond well to long-duration unmodulated tones. It therefore remains possible that a putative suppressive mechanism (possibly inhibition: see DISCUSSION) sharpens the high-frequency slope in rMTFs like those in Fig. 5, C and D, without actually resulting in a visible suppressive region. This is consistent with the fact that spike rates are often nonmonotonic with SPL over broad ranges of modulation frequency, even in rMTFs without suppression (e.g., Fig. 5D).
The final measure used to characterize MTFs was the mean phase of the response. Figure 7 shows the pMTFs for three neurons, whose rMTFs and tMTFs have been shown in earlier figures (Figs. 5, A and D, and 6D). A systematic increase in the phase lead (phase-advance) with increasing SPL, as illustrated in Fig. 7, A and B, was observed in the pMTFs from 30 of 56 neurons. Other neurons showed both increasing and decreasing leads over different modulation frequency ranges in their pMTFs. A decreasing phase lead (phase-delay) was sometimes seen in the pMTF in frequency regions corresponding to the decreasing high-frequency slope of the associated rMTF (especially in neurons with sustained PSTHs). This behavior was usually less systematic, but a fairly clear example is shown in Fig. 7C. The phase-advance is not always accompanied by an increase in spike rate; for example, the neuron in Fig. 7A shows a phase-advance even though its rMTFs are nonmonotonic with SPL (Fig. 5D). The phase-advance in Fig. 7B seems to be at least partly due to the marked adaptation seen in the response that shifts the peak of the response toward the beginning of the cycle (see the period histograms). The phase-advance in Fig. 7A could be due to a phasic response that occurs earlier in the cycle as the SPL increases; this could possibly also be viewed as a very rapid adaptation. There was a consistent tendency for the phase-advance to be maximum at intermediate modulation frequencies, as exemplified by the data in Fig. 7B. This implies that it is inaccurate to view the phase-advance as simply reflecting a decreased time delay at higher SPLs that mirrors the known decrease in latency with SPL for pure tones, because that explanation predicts, at least in its simplest form, a linear relationship between phase-advance and modulation frequency (i.e., the phase-advance ought then to be maximum at the highest modulation frequency).
On the contrary, the phase-advance is often minimal at the highest modulation frequencies. Finally, the modulation frequency at which the maximum phase-advance in a given neuron was found was not necessarily identical to that at which its rMTF peaked.
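The prediction of the pure time-delay explanation can be written out explicitly. As a sketch, assume that raising the SPL shortens the response delay by the same amount $\Delta t$ at every modulation frequency $f_m$; the value of 2 ms used below is purely illustrative and is not taken from the data.

```latex
\Delta\phi(f_m) = 360^\circ \, f_m \, \Delta t
\quad\Rightarrow\quad
\Delta t = 2\ \text{ms}:\;
\Delta\phi(10\ \text{Hz}) = 7.2^\circ,\;
\Delta\phi(100\ \text{Hz}) = 72^\circ,\;
\Delta\phi(300\ \text{Hz}) = 216^\circ
```

Under this assumption the phase-advance grows in proportion to modulation frequency, which is the opposite of the observed pattern of a maximum at intermediate modulation frequencies.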
The properties of the only offset neuron observed in this study are
shown in Fig. 8. The rMTF showed a
band-pass shape, while the vector strength remained high throughout the
range of modulation frequencies that elicited a response. The pMTF
showed a systematic phase-delay with SPL at low modulation frequencies.
The period histogram suggests that this is the result of a phasic
response occurring later in the cycle as SPL increases. Consistent with its "offset" nature, the neuron fires during the falling phase of
the amplitude envelope (as indicated by the negative value of the low
modulation frequency asymptote). These properties are similar to the
recently described properties of offset neurons from periolivary
regions (Kuwada and Batra 1999).
MTF characteristics across the population
Of the 106 neurons responsive to SAM tones, 96 were studied by
stimulating at 100% depth and with a
Fc that was at or close to the BF of
the neuron at all the SPLs studied. Figure
9 is a representation of the BMFs at
100% depth and different SPLs for all 96 neurons. The mean BMF
(averaged across the different SPLs tested and including corner
frequencies; see METHODS) did not show a significant
correlation (Kendall's $\tau$ = 0.0708; P = 0.1533) with the BF. Some interesting properties are evident in the cumulative probability distribution of the mean BMF, shown in Fig.
10A. The maximum mean BMF
encountered was 140 Hz, and about 50% of the mean BMFs lay below 25 Hz. The range of variation of the BMF for an individual neuron (both
absolute and relative to the mean) observed at different SPLs is also
shown on the same graph. Fifty percent of the 49 neurons tested over at
least a 20-dB range of SPLs showed a BMF variation larger than 66% of
their mean BMF; in absolute terms, 50% of the neurons showed a range
larger than 10.9 Hz. No systematic pattern was found in the variation
of BMFs with SPL for individual neurons (Fig. 10B).
At least one rMTF with a suppressive region was observed in 43 of the 96 neurons. The mean WMF (averaged across SPL; see METHODS) lay between 0 and 200 Hz in most cases, with a mode near 100 Hz (Fig. 10C). No systematic pattern was found in the variation of WMFs with SPL for individual neurons (data not shown).
The minimum mean latency in response to pure tones at BF is significantly correlated (Kendall's τ = 0.3276, P = 0.0001) with the mean BMF (averaged across SPL) of the neuron in response to SAM tones at 100% depth and with a carrier at or close to BF (Fig. 11). Neither measure showed a significant correlation with the BF [τ(BF-latency) = 0.0288 (P = 0.366) and τ(BF-mean BMF) = 0.0329 (P = 0.3479)]. Accordingly, a three-way test resulted in a τ of 0.3269, only slightly different from that for the BMF-latency relationship, suggesting that the BF of the neuron was not a confounding variable.
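The three-way test is not specified further in this section. As an illustration only, and assuming that it corresponds to Kendall's partial rank correlation, the coefficient can be recomputed from the three pairwise values reported above. The function name is hypothetical and the inputs are the magnitudes as printed.

```python
import math


def partial_kendall_tau(tau_xy, tau_xz, tau_yz):
    """Kendall's partial rank correlation of x and y with z held constant.

    ASSUMPTION: the text only says "three-way test"; this standard formula is
    used here as a plausible reading, not as the confirmed method.
    """
    numerator = tau_xy - tau_xz * tau_yz
    denominator = math.sqrt((1.0 - tau_xz ** 2) * (1.0 - tau_yz ** 2))
    return numerator / denominator


# x = latency, y = mean BMF, z = BF (pairwise values as reported in the text).
tau_partial = partial_kendall_tau(tau_xy=0.3276, tau_xz=0.0288, tau_yz=0.0329)
print(round(tau_partial, 3))
```

Because both correlations with BF are close to zero, the partial coefficient stays within about 0.001 of the pairwise BMF-latency value, in line with the reported 0.3269.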
Figure 12A reveals that cutoff frequencies (see METHODS) are markedly lower in rMTFs that possess a suppressive region, when compared with those that do not. This could be interpreted as an effect of the suppressive mechanism sharpening the high-frequency slope of the rMTF, thus leading to a lower rMTF cutoff frequency.
The tMTF cutoff frequency (defined here as the maximum modulation
frequency at which neurons retain synchrony to the modulation frequency) can be regarded as an index of the low-pass filtering and
internal noise in the system. The measure shows no significant correlation with the Fc used
(Kendall's τ = 0.003, P = 0.4829). Over 85% of
neurons lack significant synchrony above 300 Hz (Fig. 12B).
Cutoff frequencies in the IC are thus substantially lower than those
found at lower levels such as the cochlear nucleus and the lateral
superior olive (Joris and Yin 1998; Rhode and Greenberg 1994; both in the cat).
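For readers who wish to reproduce a synchrony cutoff of this kind, the following is a minimal sketch of the standard vector strength computation together with a Rayleigh criterion for significance. The criterion value (2nR² > 13.8, corresponding to P < 0.001), the phase reference (time zero taken as the start of a modulation cycle), and all function names are assumptions made for illustration; the criterion actually used is the one given in METHODS.

```python
import math


def vector_strength(spike_times_s, mod_freq_hz):
    """Return (vector strength, mean phase in cycles) re: the modulation cycle."""
    n = len(spike_times_s)
    if n == 0:
        return 0.0, float("nan")
    phases = [2.0 * math.pi * mod_freq_hz * t for t in spike_times_s]
    c = sum(math.cos(p) for p in phases) / n
    s = sum(math.sin(p) for p in phases) / n
    r = math.hypot(c, s)  # length of the mean resultant vector (0 to 1)
    mean_phase = (math.atan2(s, c) / (2.0 * math.pi)) % 1.0
    return r, mean_phase


def is_significant(r, n_spikes, criterion=13.8):
    """Rayleigh statistic 2*n*r^2 compared with an ASSUMED criterion (P < 0.001)."""
    return 2.0 * n_spikes * r * r > criterion


def synchrony_cutoff(results):
    """Highest modulation frequency with significant synchrony, or None.

    `results` is a list of (mod_freq_hz, vector_strength, n_spikes) tuples.
    """
    significant = [fm for fm, r, n in results if is_significant(r, n)]
    return max(significant) if significant else None
```

Note that a response with few spikes can fail the criterion even when its vector strength is high, which is one reason why rate tuning can limit the measured cutoff.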
Some other aspects of the vector strength transformation between the
cochlear nucleus and the IC are shown in Fig.
13. As shown in Fig. 13A,
the maximum synchrony found in responses from almost all IC neurons
(0.907 ± 0.009, mean ± SE, n = 96) is
clearly higher than the means reported for any neuron type in the
cochlear nucleus, with the possible exception of
OI units (Rhode 1994; Rhode and Greenberg 1994). Also, responses from neurons that possess
BMFs (i.e., the rMTF shows a clear peak, see METHODS) at
all SPLs studied are almost maximally synchronized at the BMF; i.e.,
the mean difference between the vector strength at BMF and the maximum
vector strength is close to zero (e.g., MTFs in Figs. 2,
A-C, and 5). In contrast, neurons that possess corner
frequencies show larger mean differences between the vector strength at
their rMTF peak (BMF or corner) and the maximal vector strength in the
associated tMTF (for example, see MTFs in Figs. 3, 4, and 6).
The examples in Figs. 2 and 3 suggest that IC neurons show considerable
phase locking (i.e., high vector strength) at relatively low depths of
modulation. This is confirmed in Fig. 13B, where the lowest
modulation depth at which a significant vector strength was found is
plotted against the value of the vector strength for the 31 neurons
whose MTFs were recorded at multiple depths. Comparing this to
modulation depth-vector strength functions recorded from the cat
cochlear nucleus population (Rhode 1994: Fig.
13) confirms that vector strengths of
IC neurons at low depths are higher than those for almost all cochlear
nucleus neuron types (onset-choppers being the exception).
One pMTF each from 95 of the 96 neurons is plotted in Fig.
14A to display the
population characteristics of the pMTFs. The positive low modulation
frequency asymptotes indicate a tendency for most neurons to fire on
the rising portion of the sinusoidal envelope. A straight line was
often a good fit to the high-frequency (greater than or equal to 100 Hz; in our observations, responses in this range usually showed small
or absent SPL-dependent phase shifts) phase responses; the maximum and
minimum values of the slope (see legend) across all neurons were 15.6 and 6.09 ms. These may be interpreted as time delays (e.g.,
Anderson et al. 1971; but also see Ruggero 1980) that contain possible contributions from the fixed delays
and filtering properties of the system.
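The interpretation of the slope as a time delay can be sketched as follows. If phase lag is expressed in cycles and modulation frequency in Hz, the slope of a straight-line fit has units of seconds. The least-squares routine, the function name, and the synthetic data below are illustrative assumptions; the fitting procedure used for Fig. 14A is the one described in its legend. Phases must be unwrapped before fitting.

```python
def fit_delay_ms(mod_freqs_hz, phase_lag_cycles):
    """Least-squares slope of phase lag (cycles) vs. modulation frequency (Hz).

    The slope is in cycles per Hz, i.e. seconds; it is returned in milliseconds.
    """
    n = len(mod_freqs_hz)
    mean_f = sum(mod_freqs_hz) / n
    mean_p = sum(phase_lag_cycles) / n
    sxx = sum((f - mean_f) ** 2 for f in mod_freqs_hz)
    sxy = sum((f - mean_f) * (p - mean_p)
              for f, p in zip(mod_freqs_hz, phase_lag_cycles))
    return 1000.0 * sxy / sxx


# Synthetic example: a hypothetical 10-ms delay plus a constant phase offset.
freqs = [100.0, 150.0, 200.0, 300.0]
lags = [0.3 + 0.010 * f for f in freqs]
print(fit_delay_ms(freqs, lags))
```

The constant offset does not affect the slope, which is why the fit separates a frequency-independent phase term from the delay term.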
Thirty of the 56 neurons studied at multiple SPLs showed a systematic phase-advance as the SPL increased. This is shown in Fig. 14B, where phase leads (at the modulation frequency at which the largest phase-advance was found) are plotted as a function of SPL for each of the 30 neurons. The distribution of the maximum phase-advance (i.e., the range of each of the 30 functions in Fig. 14B) per 10-dB rise in SPL is shown as a histogram in Fig. 14C. The mean maximum increase was 23.81° per 10 dB (SE, 1.78). This may be larger than similar increases reported from responses at lower levels in the auditory pathway (see DISCUSSION).
Varying carrier frequency
We also investigated in 34 neurons the effects of stimulating different parts of the frequency response area by varying the Fc of the SAM tone. Illustrative examples are shown in Figs. 15 and 16. The examples in Fig. 15 show that band-pass rMTFs derived from an individual neuron can have different shapes; i.e., they are not scaled versions of each other. The BMFs can show shifts of varying magnitude and direction. The spike rate varies with carrier frequency in a manner that is roughly consistent with the frequency response area. Finally, as discussed earlier in the context of Fig. 4, one might imagine that suppressive regions could result from the positioning of the AM sidebands over inhibitory regions of the frequency response area. This explanation might also predict that as Fc is varied, suppressive regions would show shifts of the same magnitude as the shift in Fc. This does not seem to be the case (Fig. 16, A-C).
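The sideband argument can be made concrete with a small sketch. A SAM tone with carrier Fc, modulation frequency fm, and depth m has three spectral components: the carrier at Fc and two sidebands at Fc - fm and Fc + fm, each with amplitude m/2 relative to the carrier. If suppression arose from a sideband entering an inhibitory region at a fixed frequency, the modulation frequency producing suppression would shift by the same amount as Fc. The function names and the example frequencies are hypothetical.

```python
def sam_components(fc_hz, fm_hz, depth=1.0):
    """Spectral components of a SAM tone as (frequency, amplitude re: carrier)."""
    return [
        (fc_hz - fm_hz, depth / 2.0),  # lower sideband
        (fc_hz, 1.0),                  # carrier
        (fc_hz + fm_hz, depth / 2.0),  # upper sideband
    ]


def predicted_suppressive_fm(fc_hz, inhibitory_freq_hz):
    """Modulation frequency at which a sideband lands on a fixed inhibitory
    frequency (ASSUMED sideband-inhibition account, for illustration only)."""
    return abs(inhibitory_freq_hz - fc_hz)


# Hypothetical inhibitory region at 4200 Hz, probed with two carriers.
for fc in (4000, 4100):
    print(fc, "Hz carrier ->", predicted_suppressive_fm(fc, 4200), "Hz")
```

In this example a 100-Hz change in Fc moves the predicted suppressive modulation frequency by 100 Hz, which is the one-to-one shift that the data in Fig. 16, A-C, do not appear to show.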
Figure 16D shows data from a neuron whose rMTF seems to lack suppressive regions when stimulated well away from its BF. Because suppressive regions can emerge at higher SPLs (see Fig. 6A for MTFs at multiple SPLs from the same neuron), this raises the issue of whether the effects of Fc variation can be explained in part on the basis of the SPL relative to threshold at each Fc. We do not yet have sufficient data to address this question.
The tMTF variation with Fc is roughly
similar to the variation observed in the same neuron when SPL is
varied. Thus when Fc is varied at a
particular SPL, tMTFs appear to vary over the same range of values seen
when stimulated between threshold and that particular SPL at BF. For
example, the neuron in Fig. 15C showed almost invariant
tMTFs as the SPL was varied (data not shown, but similar in this
respect to Fig. 5, C and D); this was mirrored in
the tMTF invariance with Fc.
Similarly, the neuron in Fig. 16A showed a large drop in
vector strength at low modulation frequencies as SPL increased (Fig.
4); the tMTF in Fig. 16A shows the same property as
Fc became closer to BF. However, the
details of the variation within this range did not show any clear
pattern, especially at higher levels; for example, the vector strength
did not show a consistent relationship either with
Fc or with spike rate. These results
are similar to those reported from the cochlear nucleus (Rhode
1994). Finally, no systematic changes in the pMTFs were found
when Fc was varied.
One scheme for the generation of spike rate tuning to modulation
frequency in the midbrain (Langner 1981) for low carrier frequencies predicts a linear relation between 1/BMF and
1/Fc (and therefore a monotonic
relationship between BMF and Fc).
Figure 17, A-E, shows data
from the five cells (in this study) with peaked rMTFs that were studied
at multiple low carrier frequencies (<5 kHz) within their frequency
response area. No systematic pattern was found for BMF shifts with
Fc. Plotting these shifts as functions of 1/BMF versus 1/Fc (Fig.
17F) confirms the lack of any systematic linearity in our
data.
DISCUSSION |
---|
The present study provides detailed descriptions of the variations in spike rate, synchrony, and phase of spike discharges of single neurons in the IC in response to variations in modulation depth, SPL, and Fc of SAM tone stimuli at a range of modulation frequencies. The systematic (and often dramatic) changes seen in all these response measures when stimulus parameters are varied have been described in RESULTS. We now discuss the implications of these findings both for the neural mechanisms generating AM responses in the IC and for performance in psychophysical tasks involving auditory temporal processing.
Excitation and inhibition together create the IC rMTF
Most neuron types in the cochlear nucleus predominantly show
poorly tuned or nonexistent variations in spike rate (i.e., low-pass or
flat rMTFs) with modulation frequency (Frisina et al.
1990; Rhode and Greenberg 1994). Information
about the modulation frequency is instead present in the response
component locked to the modulation frequency (i.e., a "temporal
code" rather than a "rate code"). In contrast, neurons in the IC
show large variations of spike rate with modulation frequency. These
variations have previously been reported to result in rMTFs of varied
complex shapes, with band-pass rMTFs that possess a BMF being the most
common variety (Heil et al. 1995; Langner and Schreiner 1988; Rees and Palmer 1989). The data
in this paper show that rMTFs are composed of regions of enhancement
and suppression, where the spike rate increases or decreases,
respectively, with an increase in the modulation depth of the SAM tone
stimulus. In particular, almost all IC rMTFs could be described by some
combination of a primary and a secondary region of enhancement and an
intervening region of suppression (Fig.
18), with these regions present to
varying degrees in individual rMTFs. The regions of enhancement have
band-pass shapes, with the low-frequency region of enhancement (E1)
forming the primary peak (BMF or corner frequency), and the much less
commonly found high-frequency region (E2) the secondary peak. The
region of suppression creates the band-suppressive shape, with its
trough forming the WMF.
In the spirit of most current models (see Mechanisms generating
the secondary region of enhancement), we speculate that
the primary region of enhancement is primarily the result of a
transformation of excitatory inputs to the IC. This seems plausible
because at least some rMTFs in the mustache bat IC remain band-pass
after iontophoretic application of various inhibitory blockers
(Burger and Pollak 1998). A similar result has been
reported for onset neurons in the mustache bat dorsal nucleus of the
lateral lemniscus (DNLL) (Yang and Pollak 1997),
supporting the notion that purely excitatory mechanisms are capable of
creating band-pass rMTFs. However, it is possible that both in the IC
and the DNLL, the inputs themselves show band-pass spike rate tuning
(see next section); this could also account for the minimal effect of
inhibitory blockers on the band-pass tuning of IC rMTFs. On the other
hand, it seems very likely that inhibitory inputs shape the rMTF by
creating the region of suppression. In addition, inhibition may sharpen the high-frequency rolloff of the primary region of enhancement without
resulting in a visible region of suppression (as a result of the low
firing rate in response to unmodulated tones: see discussion of Table 1
in RESULTS). The properties of the secondary region of
enhancement remain unclear; potential mechanisms that could generate it
are discussed in a later section.
In the next few sections, various issues related to the observed MTFs are discussed in greater detail. However, it must be emphasized that in the absence of much pertinent data about intrinsic properties and input patterns of IC neurons, as well as the AM response properties of inputs to the IC, much of this discussion must remain speculative.
Is rate tuning created de novo in the IC?
With contralateral sound presentation, the excitatory afferents to
the IC that are likely to be active include those from the
contralateral cochlear nucleus (in particular, stellate cells in the
ventral cochlear nucleus, and fusiform cells in the dorsal cochlear
nucleus), contralateral lateral superior olive (LSO), ipsilateral
medial superior olive (MSO), and possibly, the contralateral IC
(Moore et al. 1998; Oliver and Huerta 1991). Most neurons in the cochlear nucleus do not show much
rate tuning. Although little is known about the response properties of
MSO neurons to SAM tones, it appears from preliminary data that neurons
in the LSO possess, in addition to tuned tMTFs, rMTFs that show
systematic changes of spike rate with modulation frequency (Thornton SK
and Semple MN, unpublished observations). LSO rMTFs can show peaks
(BMFs); the majority of BMFs seem to lie below 400 Hz. Thus rMTFs of IC neurons may at least in part reflect the rate tuning present in their
rate-tuned inputs (e.g., from the LSO); low-pass filtering of the
inputs (which are also phase locked to the modulation frequency) may
potentially account for the lower BMFs in the IC. However, the extent
to which different inputs overlap in their projections onto single IC
cells is not clear. It therefore remains plausible that at least some
of the rate tuning seen in the IC emerges as a result of collicular processing.
Coincidence detection mechanisms may create the primary region of enhancement
Mechanisms clearly exist in the auditory midbrain to create tuned
rMTFs from inputs (e.g., from the cochlear nucleus) that show varying
amounts of synchrony to the modulation frequency, but little or no
spike rate changes with modulation frequency. One candidate scheme
suggests that band-pass rMTFs (regions of enhancement) seen in IC
neurons result from coincidence detection of synchronized excitatory
inputs (Hewitt and Meddis 1994). In this model, the IC
neuron is considered to be a coincidence detector that fires maximally
when its inputs (from the cochlear nucleus) are maximally synchronized,
thus converting the peak in the input tMTFs to a rMTF peak in the IC.
Such a model can reproduce some aspects of the previously reported
data, including the flattening of some rMTFs at high SPLs (Rees
and Palmer 1989) (Fig. 5A in present study: because
the coincidences generated by the high input spike rates at high SPLs
fire the coincidence detector independent of the synchrony in the
inputs). However, in comparison to the tMTF peaks in the inputs from
the cochlear nucleus (80-520 Hz: Frisina et al. 1990, gerbil;
mean = 330 Hz: Rhode and Greenberg 1994, cat) and the lateral
superior olive (200-600 Hz: Joris and Yin 1998, cat), IC BMFs appear to span a much lower range (0-100 Hz: Fig. 10A). In other words, rMTF peaks in the IC do not
seem to be equal to tMTF peaks in their inputs, as specified in the model. Other mechanisms (possibly including some combination of a stage
of low-pass filtering, inhibitory inputs causing a sharper high-frequency rolloff, or intrinsic cellular properties) may need to
be included to account for the data. However, it seems prima facie
possible that such a model (with the addition of inhibitory inputs to
create the region of suppression: see next section) might serve
as a good first attempt to reproduce various other aspects of AM
responses reported in this paper. An extensive modeling study would
also offer insight into input patterns and cellular properties that
could generate the diversity of MTF characteristics seen in the IC.
An alternate scheme that has been proposed to explain peaked rMTFs also
treats the IC neuron as a coincidence detector. However, the structure
of this model is very different from that above: it is posited that a
cross-correlation analysis is performed by neurons that detect
coincidences between spike trains synchronized to the modulation
frequency and carrier frequency, respectively, and delayed by different
small time periods (Langner 1981). For various reasons,
such a scheme seems less attractive than the model discussed above.
First, the model only works at low carrier frequencies (<5 kHz), where
IC inputs retain synchrony to the carrier; thus requiring that a
different mechanism be invoked to explain the rMTFs in neurons at high
carrier frequencies (which seem, at least qualitatively, similar to
those at low carrier frequencies). Second, there is little evidence
that inputs of the kind required by the model actually exist. Finally,
the evidence supporting one of the key predictions of the model (that
there is a linear relationship between 1/BMF and
1/Fc) is unclear; a linear
relationship with positive slope has been reported from one multiple
unit in the cat IC (Langner and Schreiner 1988), while a
negative slope has been reported to exist for about 10% of neurons in
the guinea fowl midbrain nucleus (Langner 1983). No
systematic linear relationship was found for the five neurons examined
using low carrier frequencies in the present study.
Inhibition (tonic or phasic) creates the region of suppression
The IC receives multiple inhibitory inputs that are expected to be
driven by monaural input to the contralateral ear; some of the
prominent ones include the DNLL bilaterally, the ipsilateral ventral
nucleus of the lateral lemniscus, the opposite IC, and inhibitory
interneurons within the IC (Moore et al. 1998;
Oliver and Huerta 1991). It therefore seems plausible to
assume (in the absence of more direct evidence) that the suppressive
region is a result of inhibition. The magnitude of the suppressive
effect is dependent on the modulation frequency, with maximum
suppression occurring at the WMF. In one simple scheme, this dependence
on modulation frequency may reflect the fact that the net inhibitory input may show rate tuning and thus possess a BMF. Assuming that the
inhibitory effect is simply proportional to the mean spike rate of the
inhibitory inputs (i.e., the inhibitory inputs act in a tonic manner),
the maximal inhibitory effect would then be exerted at the BMF for the
inhibitory inputs. The WMF in the rMTF of an IC neuron would thus
correspond to the BMF of its inhibitory inputs. Preliminary results
(Thornton SK and Semple MN, unpublished observations) suggest that
neurons in the gerbil DNLL can indeed show peaked rMTFs with peaks in
the same frequency range as the IC WMFs (0-200 Hz; Fig.
10C). The inputs from the opposite IC and the intra-IC
inhibitory input will have lower BMFs; these could result in the WMFs
in the lower end of the range.
In an alternative scheme, even if the mean rate of the inhibitory
inputs is independent of modulation frequency, time delays between
excitatory and inhibitory inputs (both of which are synchronized to the
modulation frequency) can lead to suppression whose magnitude depends
on modulation frequency. For example, if the inhibitory input is
slightly delayed (by a constant amount, independent of modulation
frequency) with respect to excitation, then when the modulation period
is equal to the delay period, the inhibitory input from a preceding
cycle will overlap maximally with the excitatory input from a
succeeding cycle, thus leading to a minimum response from the output
cell (see Grothe 1994 for such a model proposed for the
MSO of the mustached bat). At higher frequencies, the spike rate may
recover back to a higher value because the inhibitory input could then
become unsynchronized to the modulation frequency. The precise rMTF
shapes predicted by such a scheme are likely to depend on a
quantitative specification of the model.
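The condition stated above for the delayed-inhibition scheme can be written compactly. The symbols below are assumed notation introduced only for this illustration, and the delay values are hypothetical rather than measured.

```latex
% Assumed notation: \Delta = fixed delay of inhibition relative to excitation,
% T_m = 1/f_m = modulation period, f_{\mathrm{WMF}} = worst modulation frequency.
\[
  T_m = \Delta
  \quad\Longrightarrow\quad
  f_{\mathrm{WMF}} \approx \frac{1}{\Delta},
  \qquad\text{e.g. } \Delta = 10~\mathrm{ms} \;\Rightarrow\; f_{\mathrm{WMF}} \approx 100~\mathrm{Hz}.
\]
```

On this reading, WMFs in the 0-200 Hz range would correspond to effective delays of 5 ms or longer, although, as noted, the actual rMTF shape would depend on the quantitative details of the model.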
Most of the inhibitory inputs to the IC remain synchronized in the frequency range where IC WMFs lie (0-200 Hz). However, the combination of the multiple inputs that sum at potentially random phase (with respect to each other), the longer time scales associated with inhibitory postsynaptic potentials, and pre- and postsynaptic filtering could result in a tonic effect. It is thus not possible to distinguish between the two (and other more complex) postulated schemes for the mechanisms of inhibitory effect at this point.
Since there is no direct evidence implicating inhibitory input as the
suppressive mechanism, it seems worthwhile to consider mechanisms that
could potentially explain the drop in response below that to the
unmodulated tone without invoking inhibition. When the input spike rate
is high, and a coincidence detector has a short refractory period,
allowing it to fire multiple times during a single cycle, synchronized
inputs may actually lead to a lower spike rate than inputs that occupy
more of the cycle period, thus allowing the detector to fire more than
once in each cycle (see Reed and Durbeck 1995). However,
this mechanism seems inconsistent with the almost complete suppression
of the response in many cases (Figs. 3 and 6). Further, it is not clear
whether such a scheme (or for that matter, other schemes based purely
on excitatory inputs) could account for the fact that the magnitude of
suppression depends on modulation frequency.
Mechanisms generating the secondary region of enhancement
The secondary region of enhancement is of some interest, if only because it generates a secondary peak at a high modulation frequency in the rMTFs of eight neurons in our population. In six of these neurons, this was demonstrably not a result of the sidebands moving into local peaks in the frequency response area, as it is seen in response to stimulation with a wide range of carrier frequencies (and SPLs). Since responses at the secondary peak were not synchronized to the modulation frequency, it seems unlikely that it could be created by coincidence detection mechanisms. There are many other ways in which such peaks could be generated. A tuned inhibitory input may create a suppressive region in the middle of an otherwise low-pass or band-pass rMTF, which would create a secondary peak. Alternatively, these could reflect high-frequency peaks in the input rMTFs. Yet another possibility is that they are a result of spectral interactions above the cochlear nucleus. For example, if the dynamic range of individual inputs to the IC is narrow, there will be a stronger net input at higher modulation frequencies when the sidebands excite different inputs as a result of their becoming more spectrally separated. If this were the case, these secondary peaks would be invariant to changes in relative phase between the three components of the SAM stimulus. Finally, at high modulation frequencies (where the lower and upper sideband are well separated from each other), it is possible that the rebounds in response actually reflect differences in the speaker transfer function at the carrier and the two sideband frequencies (as noted in METHODS, the compensation was performed prior to modulation with the Fc as reference). MTF shapes resulting from this are likely to change as the Fc is varied. Differentiating between these various possibilities awaits further experimentation.
tMTFs
The low-pass to band-pass shift with increasing SPL seen in IC
tMTFs is similar (at least qualitatively) to that observed in the
cochlear nucleus (Frisina et al. 1990; Rhode 1994; Rhode and Greenberg 1994). This is
probably because at low modulation frequencies (where the stimulus
changes minimally over a time-scale equivalent to the "integration
time" of the neuron), the response in neurons that do not respond in
a phasic manner can be roughly viewed as tracking the instantaneous SPL
of the stimulus (e.g., Cooper et al. 1993). As the SPL
increases, the stimulus remains suprathreshold for longer durations
during each cycle, thereby eliciting a broader response (at least at
low modulation frequencies). The period histograms in Fig.
7B suggest that this indeed seems to be the case, as has
been pointed out by previous investigators (e.g., Hewitt and
Meddis 1994; Rees and Palmer 1989). Neurons tend
to maintain their vector strength at the peak of the tMTF at all SPLs,
thus generating the band-pass shape at high SPLs. The vector strength
changes appear to be independent of the associated spike rate changes
in the response, even at low modulation frequencies (Figs. 5 and 6).
The considerable variability in the tMTF variations with SPL and
Fc seen in the IC (present study) is
probably, at least in part, a reflection of the similar variety of
changes seen in the different neuron types of the cochlear nucleus
(which, directly or indirectly, are the source of the inputs to the IC).
Compared with neurons in the cochlear nucleus, IC neurons also show
enhanced synchrony (higher vector strengths: Fig. 13) and lower cutoff
frequencies for synchronization (Fig. 12). The enhancement is
consistent with previous results that indicate a progressive increase
in the maximum vector strength values achieved at higher levels
(lateral superior olive > cochlear nucleus > auditory
nerve) in the auditory pathway (Joris and Yin 1998). The
increased synchrony (relative to that in cochlear nucleus neurons)
found in IC neurons (present study: Fig. 13) is consistent with the
predictions of the hypothesis that the IC neuron acts as a coincidence
detector (Burkitt and Clark 1999; see also Wang and Sachs 1995). However, other possible explanations include
1) inhibition raising the threshold for spiking (and thus
decreasing the range of phases over which the neuron discharges),
2) successive adaptation making the period histogram
narrower by speeding up the response decay within each cycle, and
3) a decreased spontaneous rate at higher levels of the
auditory pathway (possibly as a result of anesthesia). The reasons
behind the lower cutoff frequencies measured in the IC are not entirely
clear; but factors similar to those indicated in a preceding section
for the loss of synchrony in inhibitory inputs may play a role.
However, the presence of rate tuning in the IC also means that a lack
of sufficient response at high modulation frequencies often limits the
detection of any possible synchrony. This, combined with the fact that
high modulation frequencies were not sampled with an aim to extract the
cutoff precisely, means that the measured values are likely to be an
underestimate of the true cutoff values.
pMTFs
A systematic phase-advance was observed in the pMTFs of 30 of 56 neurons as the SPL increased. Phase-advances with SPL have also been
documented for the auditory nerve, spherical and globular bushy cells
in the cochlear nucleus, the medial nucleus of the trapezoid body, and
the lateral superior olive (all in the cat) where the mean phase
advances observed are 4.07, 4.1, 6.0, 7.3, and 9.97°/10 dB,
respectively (Joris and Yin 1998). All of these values
are substantially lower than the mean maximum increase of 23.81°/10
dB for IC neurons in the present study. However, the values for nuclei
other than the IC result from measurements made at a predetermined
modulation frequency (100 Hz) that was not necessarily the frequency at
which the maximum advance would be seen. Nevertheless, the seemingly
systematic increase in the phase advance observed at succeeding levels
in the auditory pathway may be the result of an incremental advance at
each stage of processing. The phase-advance seems to be at least partly
due to a marked adaptation seen in the response during each cycle. This
also seems to be the basis for this phenomenon in the auditory nerve
(Joris and Yin 1992; for a similar explanation for
phase-advance in V1 neurons in response to visual stimulation at
increasing contrasts, see Chance et al. 1998). Both
synaptic adaptation (as in models of the phase advance in auditory
nerve and V1) and intrinsic cellular properties could contribute to the
observed phase-advance.
As previously noted (Rees and Moller 1983), IC responses
to SAM tones are commonly nonsinusoidal in nature. A variety of
nonlinearities that resemble rectification, peak clipping,
phase-locking, asymmetry (possibly due to adaptation), and bimodality
are seen in the period histograms over different modulation frequency
ranges (see Fig. 7). The intervals between clearly visible peaks in
bimodal histograms varied from 4 to 20 ms (data not shown), and
probably reflect a wide variety of underlying mechanisms.
If the nonmonotonicity in pure tone RLFs is the result of a decrease in
excitation, one might expect 180° phase-flips (with increasing SPL at
a given modulation frequency) at low modulation frequencies (where the
stimulus changes very slowly: see DISCUSSION of tMTFs
above) and depths sufficiently low that the amplitude of the SAM tone
traverses a SPL range that lies in the negative-slope limb of the RLF
(because the response should decrease as the instantaneous SPL
increases). Such 180° reversals were never seen, suggesting that the
nonmonotonicity in RLFs is instead the result of an increase in
inhibitory input strength (with the mean phase of the response to SAM
tones reflecting the phase of the excitatory input); this conclusion is
consistent with that reached from experiments using iontophoretic
injection of inhibitory blockers (e.g., Fuzessery and Hall
1996).
Range of BMFs
The range of BMFs encountered in this study spans a somewhat lower
range (0-140 Hz) than those reported in some of the previous studies
that have measured this for different species from the rMTFs
(Heil et al. 1995; Langner and Schreiner 1988; Muller-Preuss et al. 1994). In these
studies, BMFs were derived from both single and multiple units;
however, only one study (Langner and Schreiner 1988)
differentiated the responses in the two sets. About 80% of single-unit
BMFs in that study were reported to be <100 Hz, with most of the
remaining <300 Hz. The slightly higher range may reflect a variety of
differences between the two studies in the sampling of neurons, the
species studied (cat vs. gerbil) and in the choice of BMF at 60 dB (re:
threshold) versus the mean BMF (averaged across SPL and including
corner frequencies) measure used in the present study. Finally, the
small proportion (approximately 2%) of very high BMFs (>300 Hz)
reported by Langner and Schreiner may correspond to those classified as
secondary peaks in the present study.
Langner and Schreiner also observed that BMFs >300 Hz were
preferentially obtained from multiple-unit recordings and suggested that this could result either from morphological differences between high BMF and low BMF neurons, or from the fact that multiple-unit responses include responses from input fibers and therefore reflect their tendency toward higher BMFs. The other two studies (Heil et al. 1995; Muller-Preuss et al. 1994
in the gerbil and squirrel monkey, respectively) reported that most BMFs lay
at (and below) 160 and 128 Hz. The differences mentioned above (in the
context of the Langner and Schreiner study) as well as the inclusion of multiple units may bias the range of BMFs toward a slightly higher range of values. On the other hand, a BMF distribution similar to that
in the present study has been reported from the rat IC (Rees and
Moller 1983); however, these investigators used a different measure of response strength, viz., the peak height of the period histogram.
Almost all BMFs in the thalamus and cortex lie below 100 Hz
(Bieser and Muller-Preuss 1996; Preuss and Muller-Preuss 1990; Schreiner and Urbas 1988).
The peaks in vector strength also fall within the same range. This is
at least roughly consistent with the distribution of BMFs observed in
the IC in the present study. Thus even though it may be potentially
advantageous to create nonsynchronized responses that show BMFs at high
modulation frequencies in the IC (because these responses will not be
lost due to low-pass filtering at higher stages), such BMFs are rarely
seen. It is, however, also possible that thalamic and cortical neurons
act as band-pass filters and are therefore not very responsive to unmodulated inputs at high BMFs from IC neurons.
AM response properties across carrier frequency
Consistent with a previous study in the squirrel monkey
(Muller-Preuss et al. 1994), the results of our study do
not reveal any correlation of SAM response properties with the BF of
the neuron in the IC. The BMF-BF relationship reported in another previous study (Langner and Schreiner 1988) is weak and
is predominantly shaped by multiple-unit BMFs above 200 Hz. The
increase of tMTF cutoff frequency with BF (as a result of the increase
in bandwidth of the basilar membrane band-pass filter with BF) seen in
auditory nerve fibers (Joris and Yin 1992) appears to be
absent at the level of the IC (Fig. 12B), probably as a
result of successive low-pass filtering at intervening auditory nuclei.
Latency-BMF relationship
Previous studies (Heil et al. 1995; Langner et al. 1987) have shown that the onset latency to tones at a
single SPL above threshold is inversely correlated with BMF (in the
range of 0-200 Hz). This finding is confirmed by our data using a
slightly different analysis procedure (Fig. 11). The distribution of
latencies in the present study is also similar to that in the previous
two studies. The inverse correlation could be a result of the common effect of successive synaptic low-pass filtering on both measures. Alternatively, it could reflect the presence of inhibitory inputs that
increase the latency and sharpen the high-frequency slope of the rMTF
(thus decreasing the BMF).
Topography
A topographic map of BMFs in the IC of the cat has been reported
on the basis of many closely spaced multiple-unit recordings made from
two different iso-frequency laminae (Schreiner and Langner 1988). The present study was designed to allow for an extensive parametric characterization (requiring long recording times) of well-isolated single units with a range of BFs. It therefore does not
provide any direct data on this issue. However, the dependence of the
rMTFs on SPL suggests that a topographic map is likely to be dependent
on SPL. The discrepancy between the range of BMFs observed in single
and multiple units has been discussed in a preceding section. As
observed by the authors of the earlier study (Schreiner and Langner 1988), because multiple-unit recordings may include
responses from input fibers, it remains unclear whether the observed
map reflects an organization based on the input pattern to the IC or a
topographic organization of the response properties of IC neurons.
Implications for perception
Psychophysical measurements in humans and chinchillas
(Salvi et al. 1982; Viemeister 1979)
indicate that detection of modulation of SAM noise is possible up to
modulation frequencies above 2,000 Hz. However, MTF peaks are rare
above 100 Hz in both the IC (present study) and the cortex
(Schreiner and Urbas 1988). MTFs measured with noise
carriers from the cochlear nucleus and cortex appear to be roughly
similar to those measured with tones (Eggermont 1998; Rhode 1994). The limited range of BMFs observed will
therefore hamper any scheme (e.g., Langner and Schreiner 1988) based on single neurons acting as labeled lines for
particular modulation frequencies by means of the peaks of their rMTFs.
Further, at least in the simplest versions, such a scheme will depend
on the subpopulation of neurons that remain tuned at high SPLs (because performance in modulation detection tasks remains good at high SPLs).
It would also probably require them to possess a relatively invariant
BMF across SPL. Thus because some neurons (at least in the IC) continue
to respond at modulation frequencies above 100 Hz, information about
high modulation frequencies is probably carried in some as yet
unspecified manner by these response spike trains across the population
of IC neurons. The secondary peaks sometimes noticed may be relevant.
Also, it has recently been suggested that modulation frequencies in
this high-frequency range are actually represented in the responses of
neurons with frequency response areas well outside the spectrum of the
presented SAM tone (Schulze and Langner 1997).
It is worth noting that most IC neurons respond with significant synchrony to modulation frequencies above 100 Hz (and below 300 Hz: Fig. 12B); tMTF cutoffs below 100 Hz almost always arise because the spike rate falls to very low values. Thus considerable temporal information remains in the response in the range of modulation frequencies where most BMFs lie. Therefore the emergence of rate tuning does not necessarily preclude the possibility that information about modulation frequency is also present in a "temporal code" (for example, one based on vector strength or interspike intervals).
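To make the distinction between the rate-based and temporal measures concrete, the following is a minimal sketch of how spike rate, vector strength, and interspike intervals could be computed from a list of spike times. It uses the standard textbook definitions rather than the exact analysis procedure of the present study; the function names, the use of NumPy, and the convention of returning a vector strength of 0 for an empty spike train are all assumptions made for illustration.

```python
import numpy as np

def spike_rate(spike_times, duration):
    """Average spike rate (spikes/s) over an analysis window of `duration` seconds."""
    return len(spike_times) / duration

def vector_strength(spike_times, fm):
    """Vector strength of the spikes relative to modulation frequency fm (Hz).

    Each spike is mapped to a unit vector at its phase within the modulation
    cycle; the result is the length of the mean vector (0 = no locking,
    1 = every spike at the same modulation phase).
    """
    t = np.asarray(spike_times, dtype=float)
    if t.size == 0:
        return 0.0  # assumed convention: no spikes, no measurable synchrony
    phases = 2.0 * np.pi * fm * t
    return float(np.abs(np.mean(np.exp(1j * phases))))

def interspike_intervals(spike_times):
    """First-order interspike intervals (s) of a spike train."""
    return np.diff(np.sort(np.asarray(spike_times, dtype=float)))

# Hypothetical spike train: one spike per cycle of a 50-Hz modulation.
locked = np.arange(20) / 50.0
print(spike_rate(locked, 0.4), vector_strength(locked, 50.0))
```

In this sketch the rate and the vector strength are computed independently, so a neuron can be sharply rate-tuned and still show high synchrony at the same modulation frequency, which is the situation described in the preceding paragraph.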
In human listeners, SAM tones evoke percepts of "fluctuation" below
20 Hz, "roughness" from about 20 to 300 Hz with a peak at around 70 Hz (Fastl 1990), and of pitch from about 20 to 1,000 Hz.
While it is possible that the first two percepts could be encoded by
schemes based either on the BMF or possibly by the phase-locked firing
of neurons in the cortex (Eggermont 1998; Schulze and Langner 1997), such schemes for the third (pitch) percept
run into the same problems identified while discussing the
high-frequency range for the AM detection task. Further, the pitch of
complex sounds is not the result of a simple determination of
modulation frequency (see Cariani and Delgutte 1996 for a detailed discussion of this issue). The neural code underlying this
percept at higher levels remains unclear. It is interesting in this
context that there appears to be sufficient information below 50 Hz in the envelope to allow for a high degree of performance on speech recognition tasks (Shannon et al. 1995).
Human listeners can perform modulation depth discrimination tasks at
all depths above the detection threshold (Lee and Bacon 1997; Wakefield and Viemeister 1990). Because
vector strengths reach close to maximum levels at very low depths in IC
neurons (i.e., the measure has a limited dynamic range), some other
measure like the spike rate may be a better single neuron correlate of task performance.
It is of interest to note that a recent model of the auditory
processing of AM (Dau et al. 1997) is based on the
presence of channels for modulation frequencies below 1,000 Hz in the
auditory system. This represents a marked departure from the low-pass
filter models that have been employed for some time (Viemeister 1979). Some evidence for modulation frequency specific
adaptation has also been presented previously (Tansley and Suffield 1983). It would be interesting to see if the
properties of perceptual channels and neuronal MTFs in the same species
show any similarities.
Finally, we do not know of any current psychophysical models where the suppressive regions in the rMTF would clearly play a useful role. One might speculate that the possible sharpening of the high-frequency slopes of rMTFs might have functional implications for the selectivity of the putative psychophysical channels for modulation frequency.
ACKNOWLEDGMENTS |
---|
We thank S. Thornton and B. Malone for participation in some of the experiments. The manuscript benefited from insightful comments by B. Malone, B. Scott, T. Lewis, R. Shapley, D. Sanes, and especially D. Tranchina and an anonymous reviewer.
This research was supported by National Institute on Deafness and Other Communication Disorders Grant DC-01767.
FOOTNOTES |
---|
Address for reprint requests: M. N. Semple (E-mail: mal{at}cns.nyu.edu).
1
The measures traditionally used to characterize
responses to SAM tones have been spike rate and vector strength. These
are measures of the spectral magnitude of the response spike train at
zero frequency and at the modulation frequency (Fm)
respectively. As a result of the nonlinear processes leading to spike
generation in the auditory nerve (including the nonlinear rectification
and low-pass filtering in the inner hair cell that underlies the
demodulation of the SAM tone), single auditory nerve fibers respond to
SAM tones with a spike train that has considerable power at these two
frequencies, with the power at Fm decreasing at high
frequencies (above about 1500 Hz in fibers with high characteristic
frequencies: see Joris and Yin 1992). However, in addition, the
auditory nerve fiber response may also contain power at additional
frequency components, like the carrier frequency (Fc),
Fc ± Fm, Fc ± 2Fm, and multiples of Fm (Khanna and Teich 1989). In certain situations (e.g., Fig. 15D), it is possible that spectral magnitude at these additional frequencies may be as large as or even larger than that at
Fm itself (e.g., Fc-Fm, when it is
much lower than Fm). Further, it is possible that these
components (at frequencies other than zero and Fm) may
carry stimulus-related information, in addition to that carried by the
components at zero frequency (spike rate) and at Fm. A
complete characterization of the temporal information
contained in the IC cell's response would require a more detailed
analysis of the response spike trains. However, such an analysis is
beyond the scope of this paper.
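The spectral interpretation given in this footnote can be written out explicitly. The block below is a sketch using standard definitions, not notation taken from the paper: the modulation depth symbol $m$, the spike times $t_k$, the spike count $N$, the analysis window $T$, and the sine-phase convention for the stimulus are assumptions made for illustration.

```latex
% SAM tone with carrier F_c, modulation frequency F_m and depth m:
% its spectrum has energy only at F_c and F_c \pm F_m, none at F_m itself.
s(t) = \bigl[1 + m \sin(2\pi F_m t)\bigr]\sin(2\pi F_c t)
     = \sin(2\pi F_c t)
       + \frac{m}{2}\cos\bigl(2\pi (F_c - F_m) t\bigr)
       - \frac{m}{2}\cos\bigl(2\pi (F_c + F_m) t\bigr)

% Spike train with N spikes at times t_k, and its Fourier transform:
r(t) = \sum_{k=1}^{N} \delta(t - t_k), \qquad
R(f) = \sum_{k=1}^{N} e^{-i 2\pi f t_k}

% Spike rate is the zero-frequency magnitude per unit time;
% vector strength is the magnitude at F_m normalized by that at zero frequency.
\text{rate} = \frac{|R(0)|}{T} = \frac{N}{T}, \qquad
\text{VS} = \frac{|R(F_m)|}{|R(0)|}
          = \frac{1}{N}\left|\sum_{k=1}^{N} e^{-i 2\pi F_m t_k}\right|
```

Because the stimulus itself has no component at Fm, any response power at Fm must arise from the nonlinear demodulation described above, and the same expression for R(f) evaluated at Fc, Fc ± Fm, or multiples of Fm gives the additional components mentioned in this footnote.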
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received 13 July 1999; accepted in final form 28 March 2000.
REFERENCES |
---|