Effects of Auditory Stimulus Context on the Representation of Frequency in the Gerbil Inferior Colliculus

B. J. Malone and M. N. Semple

Center for Neural Science, New York University, New York, New York 10003


    ABSTRACT

Malone, B. J. and M. N. Semple. Effects of Auditory Stimulus Context on the Representation of Frequency in the Gerbil Inferior Colliculus. J. Neurophysiol. 86: 1113-1130, 2001. Prior studies of dynamic conditioning have focused on modulation of binaural localization cues, revealing that the responses of inferior colliculus (IC) neurons to particular values of interaural phase and level disparities depend critically on the context in which they occur. Here we show that monaural frequency transitions, which do not simulate azimuthal motion, also condition the responses of IC neurons. We characterized single-unit responses to two frequency transition stimuli: a glide stimulus comprising two tones linked by a linear frequency sweep (origin-sweep-target) and a step stimulus consisting of one tone followed immediately by another (origin-target). Using sets of glide and step stimuli converging on a common target, we constructed conditioned response functions (RFs) depicting the variability in the response to an identical stimulus as a function of the preceding origin frequency. For nearly all cells, the response to the target depended on the origin frequency, even for origins outside the excitatory frequency response area of the cell. Results from conditioned RFs based on long (2-4 s) and short (200 ms) duration step stimuli indicate that conditioning effects can be induced in the absence of the dynamic sweep, and by stimuli of relatively short duration. Because IC neurons are tuned to frequency, changes in the origin frequency often change the "effective" stimulus duty cycle. In many cases, the enhancement of the target response appeared related to the decrease in the "effective" stimulus duty cycle rather than to the prior presentation of a particular origin frequency. 
Although this implies that nonselective adaptive mechanisms are responsible for conditioned responses, slightly more than half of IC neurons in each paradigm responded significantly differently to targets following origins that elicited statistically indistinguishable responses. The prevailing influence of stimulus context when discharge history is controlled demonstrates that not all the mechanisms governing conditioning depend on the discharge history of the recorded neuron. Selective adaptation among the neuron's variously tuned afferents may help engender stimulus-specific conditioning. The demonstration that conditioning effects reflect sensitivity to spectral as well as spatial stimulus contrast has broad implications for the processing of a wide range of dynamic acoustic signals and sound sequences.


    INTRODUCTION

Dynamic binaural stimuli have dominated the environments in which mammalian auditory systems evolved, and numerous studies (Ahissar et al. 1992; Poirier et al. 1997; Sanes et al. 1998; Semple 1999; Sovijärvi and Hyvärinen 1974; Spitzer and Semple 1991, 1993, 1998; Takahashi and Keller 1992; Wagner and Takahashi 1990) have demonstrated that auditory motion can dramatically influence the responses of auditory neurons. Although one might expect that virtual azimuthal-motion stimuli created by the modulation of interaural disparities of level (ILD) or phase (IPD) would elicit instantaneous discharge rates that reflect the absolute disparity at each point in the sweep, this has proven not to be true for the typical mammalian inferior colliculus (IC) neuron (Sanes et al. 1998; Semple 1999; Spitzer and Semple 1991, 1993, 1998). Instead, the response of the neuron at a particular value of ILD or IPD may be dramatically enhanced or suppressed depending on the context in which that value occurs. Thus the responses of auditory neurons appear to contain information about the trajectory, rather than simply the instantaneous position, of the stimulus in a space defined by the binaural parameters (e.g., ILD or IPD) to which they are sensitive.

Previous demonstrations of dynamic conditioning have been confined to stimuli that involve binaural cues for the localization of sound in the azimuthal plane. At present we do not know whether conditioning phenomena are specific to stimuli that appear to move in space or whether they reflect a general sensitivity to stimuli that change in time. We also do not know whether conditioning phenomena are specific to binaural rather than monaural auditory processing. Conditioned enhancement and suppression obtained with "virtual" motion stimuli may reflect a more fundamental sensitivity of IC neurons to signals that "move" along any of the response gradients that collectively define the neuron's response area. Determining whether conditioning phenomena are specific to auditory motion would clarify their functional role and constrain hypotheses concerning the mechanisms that generate them. A major goal of this paper is to investigate the generality of conditioning phenomena by explicitly testing the hypothesis that conditioning reflects a special sensitivity to auditory motion.

We measured the responses of IC neurons to monaural stimuli involving frequency transitions. Frequency modulation (FM) is a prominent feature of human speech, the behaviorally relevant vocalizations of multiple species, and numerous environmental sounds. Because pinnae-derived spectral filtering cues to sound source location are weak for narrowband stimuli, monaural FM stimuli would not be expected to produce salient percepts of azimuthal auditory motion. On the other hand, because the cochlea contains a place map along the basilar membrane, auditory FM may be more directly analogous to motion in other sensory modalities, where moving stimuli elicit activity that traverses the receptor surface (e.g., the retina in vision or the skin in somatosensation).

Prior experiments on dynamic conditioning and interpretations of those results have focused on the importance of the dynamic component of the stimulus. In the case of IPD, responses measured during the dynamic component were compared with statically derived tuning curves. It has been implicitly assumed that the dynamic portion of the stimulus was necessary for the induction of the physiological effects. In the present study, we attempt to refine our understanding of "stimulus context" by testing frequency step stimuli that do not include FM components.

If conditioning is, as our results suggest, a general property of auditory processing, then it is possible that conditioning itself is a reflection of a general property of neural processing: response adaptation. For example, it has been proposed that apparent motion sensitivity can be explained in terms of the history of a neuron's response to the stimulus rather than to the history of the stimulus itself (McAlpine et al. 2000). In the present study, we explicitly test the hypothesis that conditioned responses to the target tone can be explained by the history of action potentials fired in response to the preceding origin tone.


    METHODS

Surgical and recording procedures

Adult Mongolian gerbils (Meriones unguiculatus; n = 12) with clean external and middle ears were anesthetized with pentobarbital sodium (60 mg/kg ip). The pinnae were removed, and the cerebellum was exposed by a craniotomy of the interparietal bone. The animal was transferred to a double-walled sound-attenuating chamber (Industrial Acoustics), and sound-delivery specula were sealed to the openings of the external auditory meatuses. An electric heating blanket maintained a constant rectal temperature of 38°C, and the trachea was cannulated to prevent respiratory distress. Supplemental injections of ketamine hydrochloride (30 mg·kg⁻¹·h⁻¹ im) maintained the animal in an areflexive state throughout the experiment. This anesthetic protocol allows for direct comparison with prior work on conditioning. The potential confounding effects of anesthesia have been addressed in a separate study in which conditioning elicited by an IPD stimulus was demonstrated in an unanesthetized primate (Malone and Semple 2000).

Single-unit activity was recorded with platinum-plated, glass-insulated tungsten microelectrodes (5- to 10-µm tip exposures; Ainsworth) advanced into the IC with a stepping motor microdrive (CalTech). Electrical signals from the brain were amplified (variable gain), filtered (typically from 0.25 to 10 kHz), and passed to an oscilloscope, audio speaker, and event timer (MALab, Kaiser Instruments). The occurrence of discriminated action potentials and stimulus zero-crossings was logged with a resolution of 1 µs. Event times were then retrieved from a FIFO buffer and stored by the host computer for analysis and display. Entry into the IC central nucleus was demarcated physiologically by an abrupt transition from the diffuse, broadly tuned, and habituating responses characteristic of the dorsal mantle to increased spontaneous background activity, robust and selective responses to pure tones, and the beginnings of a clear ascending tonotopic progression. All procedures associated with this report were reviewed and approved by the New York University Institutional Animal Care and Use Committee.

Stimulus generation and data acquisition

Stimulus waveforms were generated by digital synthesizers controlled by a microprocessor and custom hardware (MALab, Kaiser Instruments). Stimulus characteristics were specified in software running on the host computer, which communicated with the dedicated microprocessor via an IEEE-488 interface. Stimuli were generated and attenuated digitally and transduced by electrostatic earphones (STAX Lambda) in custom housings (Custom Sound Systems) fitted to ear inserts. Before each experiment, the sound pressure level (SPL) expressed in decibels (dB, re: 20 µPa) at each ear was calibrated under computer control for level and phase from 40 Hz to 40 kHz, using a previously calibrated probe tube and condenser microphone (Brüel and Kjær, 4134).

The frequency glide stimuli depicted in Fig. 1 begin with a steady-state period (1 s) at an initial frequency (origin). The frequency is then linearly modulated at 8 kHz/s to a new frequency (target) and presented for a variable interval determined by the depth (and thus the duration) of the sweep. Because auditory neurons are sensitive to the rate of FM (e.g., Felsheim and Ostwald 1996; Mendelson and Cynader 1985), a constant rate of modulation rather than a constant duration of modulation was employed for glides of varying depths. The overall duration of the typical stimulus used in these experiments was 4 s, with an interstimulus interval of 1 s separating repeated presentations (e.g., Fig. 1B). The rate of modulation (8 kHz/s) was chosen to allow for the testing of a sufficient range of depths at sweep durations roughly equivalent to those used in prior experiments with trapezoidally modulated ILD (Sanes et al. 1998) and IPD (Miko et al. 1999) stimuli. The general effectiveness of the particular choice of modulation rate was confirmed in early experiments with glides.
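The timing of a glide follows directly from the parameters given above (1-s origin, 8 kHz/s linear sweep, 4-s total duration). The following Python sketch works through that arithmetic; the function and parameter names are illustrative assumptions and are not taken from the MALab software used in the study.

```python
def glide_timing(origin_khz, target_khz, rate_khz_per_s=8.0,
                 origin_dur_s=1.0, total_dur_s=4.0):
    """Return (sweep_dur_s, target_onset_s, target_dur_s) for one glide.

    Defaults follow the typical stimulus described in the text; the
    function itself is a hypothetical helper, not the authors' code.
    """
    depth_khz = abs(target_khz - origin_khz)
    sweep_dur_s = depth_khz / rate_khz_per_s       # constant rate, variable duration
    target_onset_s = origin_dur_s + sweep_dur_s    # target begins when the sweep ends
    target_dur_s = total_dur_s - target_onset_s    # remainder of the fixed-length stimulus
    if target_dur_s <= 0:
        raise ValueError("sweep too deep for the total stimulus duration")
    return sweep_dur_s, target_onset_s, target_dur_s


def instantaneous_frequency(t_s, origin_khz, target_khz,
                            rate_khz_per_s=8.0, origin_dur_s=1.0):
    """Frequency (kHz) of the glide at time t_s after stimulus onset."""
    sweep_dur_s = abs(target_khz - origin_khz) / rate_khz_per_s
    if t_s <= origin_dur_s:
        return origin_khz                          # steady-state origin tone
    if t_s >= origin_dur_s + sweep_dur_s:
        return target_khz                          # steady-state target tone
    direction = 1.0 if target_khz > origin_khz else -1.0
    return origin_khz + direction * rate_khz_per_s * (t_s - origin_dur_s)
```

For the 4-to-7 kHz glide of Fig. 1A, `glide_timing(4.0, 7.0)` gives a 0.375-s sweep, a target onset at 1.375 s, and a 2.625-s target tone. An unmodulated control is the special case in which origin and target are equal.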



Fig. 1. Stimulus context dramatically influences the responses of inferior colliculus (IC) neurons. A: conditioned enhancement. The filled histogram (black) shows spike counts in response to repeated presentations of a glide stimulus (see icons above responses) modulated (8 kHz/s) from an origin of 4 kHz to a target of 7 kHz. The overlaid unfilled histogram in gray shows the response to the unmodulated control stimulus at the target frequency. The response was significantly enhanced (Wilcoxon, P < 0.0001) relative to both the control-at-target and -origin references. B: conditioned suppression. The filled histogram (black) shows spike counts in response to a glide stimulus modulated from 7 to 2 kHz. The overlaid unfilled histogram in gray shows the response to the unmodulated control stimulus at the target frequency (2 kHz). The response was significantly suppressed (Wilcoxon, P < 0.0001) relative to both the control-at-target and -origin references. In this and subsequent figures depicting peristimulus time histograms, the ordinate axis has been removed for clarity. Where a single number and tick mark is retained (e.g., 15 in A and B), it indicates the spike count per bin.

To determine whether the dynamic (sweep) component of the glide was necessary for eliciting conditioning in our sample, a frequency step stimulus typically composed of two 2-s steady-state tones was also employed. In a few cases, longer-duration glide and step stimuli were employed to characterize the kinetics of conditioned responses more fully. The relatively long duration of our typical stimuli is comparable to stimulus durations in prior studies of conditioning (Sanes et al. 1998; Thornton et al. 1999). To facilitate comparison of our results with those of studies involving much shorter tone pips (i.e., durations of milliseconds, rather than seconds), we also employed a "quickstep" stimulus composed of two 200-ms tone pips and an interstimulus interval of 100 ms. Glide and step stimuli were typically repeated five times, and quickstep stimuli were repeated 10 times. To reduce spectral splatter, each of the tones in the step paradigms was shaped with a cosine-squared ramp (10 ms) at onset and offset. The glide stimulus, which is continuous, was gated with a similar ramp at the onset of the origin tone and the offset of the target tone.
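The onset and offset gating described above can be illustrated with a short Python sketch. It assumes a sample rate of 100 kHz and a sine-phase carrier purely for illustration; neither value is stated in the text, and the function names are hypothetical.

```python
import math


def cos2_envelope(n_samples, ramp_samples):
    """Unit-amplitude envelope with cosine-squared onset and offset ramps."""
    if 2 * ramp_samples > n_samples:
        raise ValueError("ramps are longer than the tone")
    env = [1.0] * n_samples
    for i in range(ramp_samples):
        # sin^2 rises smoothly from 0 toward 1; it is the cosine-squared
        # ramp reversed in time, and the offset ramp mirrors the onset ramp.
        gain = math.sin(0.5 * math.pi * i / ramp_samples) ** 2
        env[i] = gain
        env[n_samples - 1 - i] = gain
    return env


def tone_pip(freq_hz, dur_s, fs_hz=100_000, ramp_s=0.010):
    """Gated pure tone; fs_hz is an assumed sample rate, not from the source."""
    n = round(dur_s * fs_hz)
    env = cos2_envelope(n, round(ramp_s * fs_hz))
    return [env[k] * math.sin(2 * math.pi * freq_hz * k / fs_hz)
            for k in range(n)]
```

A quickstep stimulus would then be two such 200-ms pips (`tone_pip(f_origin, 0.2)` followed by `tone_pip(f_target, 0.2)`), whereas a glide would apply the ramps only at the onset of the origin and the offset of the target.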

Because the glide stimuli in these experiments involved FM, it is important to consider the calibration procedure carefully. A concern arises when the maximum output at the origin and the target frequencies are different because the attenuation for the entire glide stimulus is set by the choice of origin frequency. To address this concern, a five-band parametric equalizer (Symetrix 551e) was used to equalize the magnitude transfer function of the speakers. The frequency range of glide stimuli was confined to the most effectively equalized portion of the calibration curve in each experiment (generally 0.5-10 kHz), and the selection of data points was guided by the calibration curve during the experiment itself. Additionally, all origin-target pairs where a discrepancy in attenuator values greater than 3 dB predicted the observed change in discharge were eliminated from quantitative analysis. Results from earlier experiments where adequately controlled calibration curves were unavailable were also eliminated from detailed quantitative analysis. Results obtained in the step paradigms, where the attenuator values are set independently for each tone, are not impacted by the foregoing concerns and served as an additional check on the authenticity of conditioning effects obtained with glides in the same cells.

In addition to the conditioning stimuli described in the preceding text, the best frequency (BF), best SPL, and minimum latency for each cell were determined on the basis of responses to 200-ms tone pips. This initial characterization aided the search for parameter combinations (primarily origin and target frequency and SPL) that elicited robust conditioning effects. To gauge the magnitude of conditioning in each cell, we sought parameter values that maximized the effects obtained for that cell. The dimensionality of the search space is high and includes axes for each steady state duration, target and origin frequencies, SPL, and modulation rate (glides) or temporal delay between tones (steps). Previous studies have examined in detail the effects of varying temporal delay for forward masking and forward facilitation (Boettcher et al. 1990; Brosch et al. 1999; Finlayson 1999; Finlayson and Adam 1997; Harris and Dallos 1979; Shore 1995; Smith 1977, 1979) and the rate of FM (Felsheim and Ostwald 1996; Heil et al. 1992a,b; Mendelson and Cynader 1985; Mendelson and Grasse 1992; Mendelson et al. 1993; Møller 1969, 1971; Nelken and Versnel 2000; Poon and Chen 1992; Poon et al. 1991; Rees and Møller 1983; Ricketts et al. 1998; Sinex and Geisler 1981) throughout the auditory pathway. By contrast, we focused on manipulating origin and target frequencies and SPLs and constructed conditioned response functions (RFs), which depict the variability in the response to an identical stimulus (the target) as a function of the preceding origin frequency (see Fig. 4). Typically, the BF of the cell was chosen as the target, and the SPL was chosen to be slightly lower (10-15 dB) than the level that elicited the strongest response to 200-ms tone pips (best SPL). The origin frequency was then varied in 0.5- to 1-kHz steps that spanned the range of frequencies where the calibration concerns described above were negligible for glides. 
Conditioned RFs based on the step paradigms cover a range of origins similar but not identical to those used for glides. This reflects the fact that the choices of origin-target pairs are not constrained by the concerns about calibration described in the preceding text. The quickstep conditioning stimuli match those of the step paradigm exactly except that the stimulus duration was an order of magnitude shorter.

Data analysis and statistical verification of results

Conditioning is operationally defined as a deviation from the firing rate associated with a particular stimulus due to a change in stimulus context. Thus the analysis of conditioning effects requires a reference firing rate. Sanes et al. (1998) compared the firing rate immediately subsequent to ILD modulation with that obtained for the equivalent ILD presented statically, at an equivalent point in time relative to the stimulus onset. For example, for a 2-kHz depth glide stimulus, the firing rate at the onset of the target tone at 1.25 s (1 s at the origin, plus the 0.25-s sweep) would be compared with the firing rate 1.25 s into the static control stimulus. We refer to this as the control-at-target reference (Fig. 1). It could be argued that stimuli whose origin frequencies lie well outside the response area of the cell have "effective" onset times defined by the point at which the frequency sweep enters the response area, rather than the onset of the origin tone. Accordingly, the response to the target tone was also compared with the response beginning at the onset of the control stimulus, which we refer to as the control-at-origin reference (Fig. 1).

Following a thorough search of the glide stimulus space for each cell, the best potential instance of a conditioning effect of each type (enhancement, n = 31; suppression, n = 11) was identified for use in the analysis of the time course and general prevalence of conditioning effects in the IC. The spike counts in each 50-ms bin of the histogram of the response to the target tone of the glide stimulus were compared with the spike counts of the corresponding bins for the unmodulated control stimulus (Wilcoxon signed ranks). The duration of the stimulus was constant (e.g., 4 s), but because the duration of the sweep component varied with the depth of modulation, the duration of the control-at-target reference decreased as the depth of modulation increased, typically ranging from 2.25 to 2.9375 s.
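As a rough illustration of the binning step, the Python sketch below accumulates spike times from repeated presentations into 50-ms bins over a comparison interval and forms the bin-by-bin differences between the glide and control responses. The data layout (one list of spike times per trial, in seconds) and the function names are assumptions made for this example.

```python
def psth_counts(trials, start_s, stop_s, bin_s=0.05):
    """Spike counts per bin, summed over trials, for start_s <= t < stop_s.

    trials: list of per-trial spike-time lists (seconds from stimulus onset).
    """
    n_bins = int(round((stop_s - start_s) / bin_s))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            idx = int((t - start_s) // bin_s)
            if 0 <= idx < n_bins:      # ignore spikes outside the interval
                counts[idx] += 1
    return counts


def bin_differences(glide_counts, control_counts):
    """Paired differences (glide minus control) for corresponding bins."""
    return [g - c for g, c in zip(glide_counts, control_counts)]
```

The paired counts from the two histograms are the kind of input a signed-rank test operates on; the test itself is omitted here to keep the sketch dependency-free.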

Statistical evaluation of the conditioned RFs, each of which involves multiple comparisons against the reference value, was performed differently to avoid compromising the statistical criterion. Dunnett's test (which is analogous to a t-test corrected for multiple comparisons against a single control value) was used to compare firing rates averaged from 0 to 500 and 500 to 1,000 ms relative to target onset to control responses during the equivalent interval. Responses to the two-tone stimuli were averaged and analyzed similarly with respect to the onset of the second "probe" tone, except that the responses to the quickstep stimuli were averaged in 100- rather than 500-ms blocks. Each control stimulus was generally presented thrice in the sequence of origin-target pairs (beginning, middle, and end), providing an estimate of the variance of the control value maximally sensitive to changes in overall responsiveness during data collection. This procedure makes our analysis conservative with respect to the statistical verification of conditioning effects. The data available for most cells permitted the construction of multiple conditioned RFs for varying SPLs or target frequencies. For each cell, a single conditioned RF containing the most robust and representative conditioning effect of each type (enhancement and/or suppression) was selected for statistical analysis.


    RESULTS

The primary goal of this study was to provide strong evidence for the existence of frequency-based conditioning in the responses of comprehensively examined single units in the IC. The quality of evidence for enhancement or suppression of a neural response relative to some reference value is inherently limited by the trial-to-trial variability of the responses themselves. For this reason, we confined quantitative analysis of conditioned responses to 41 cells whose responses remained very stable throughout the relatively long recording times these experiments required. Cells in the larger sample whose responses could not unambiguously be attributed to conditioning effects, rather than changes in responsiveness or concerns related to calibration (see METHODS), were excluded. Because nearly all cells showed evidence of conditioning (both those included and excluded), the restriction of our sample did not compromise our estimate of the incidence of the conditioning effects.

The BFs of this population ranged from 1.4 to 10 kHz, with a median BF of 4 kHz. The predominance of low-BF cells reflects a deliberate restriction of the sample (see METHODS). The distribution of best levels at the BF was bimodal, with a primary peak (the mode) at 50 dB and a secondary peak at 80 dB, which, with few exceptions, was the maximum SPL tested. Nonmonotonicity was clearly common in the population: about half (21/41) of the sample had best SPLs below 50 dB. The SPL at which the largest conditioning effect was obtained (for cells tested with both glides and steps, the glide value was used) was typically lower (by a mean of 11 dB) than the best SPL. The best conditioning SPL was less than the best SPL in 75% of cases. Minimum latencies in the sample spanned a broad range, from 6 to 86 ms, with a median of 12 ms. Minimum latency exceeded 25 ms in 25% of cases. Conditioned enhancement and suppression were each observed for neurons spanning the range in latency. The likelihood of observing a particular category of response (enhancement, suppression, or no significant effect) was not statistically related to any of these simple characterizations of the cells in the population (Kruskal-Wallis, P > 0.05).

Prevalence of conditioned responses

The stimulus configuration that elicited the most robust conditioned enhancement and/or suppression from each cell was identified during recording. Of cells tested comprehensively with glides (n = 31), all exhibited significant conditioned enhancement or suppression for at least one identified stimulus configuration. Examples of conditioned enhancement and suppression are shown in Fig. 1, A and B, respectively. Relative to the control-at-target reference, 28 cells showed enhancement, 8 showed suppression, and 5 exhibited both effects (P < 0.01). Most (29/36) instances of conditioning were significant by a substantially more stringent statistical criterion (P < 0.0001). Relative to the control-at-origin reference, 19 cells showed enhancement, 9 suppression, and 5 both effects.

The glide stimulus was chosen to allow for a fairly direct comparison with responses to a periodic trapezoidally modulated ILD stimulus (Sanes et al. 1998). An example of responses to a periodic trapezoidal FM stimulus is shown in Fig. 2B. The half-trapezoidal stimuli presented in this study were chosen to simplify the analysis of data and interpretation of results. We confirmed in a few cells that responses were consistent across these two stimulus paradigms as is evident by comparing the responses to the half-trapezoidal stimulus, shown in Fig. 2A, to the responses to the periodic FM stimulus wrapped by the modulation period in Fig. 2C.



Fig. 2. Conditioning effects obtained with the glide stimulus mirrored those obtained with repeated cycles of trapezoidal FM. A: response to repeated presentations of a glide stimulus modulating from 6 to 4 kHz. Unfilled histograms (gray) indicate the response to the 3-s control tone at 4 kHz. B: response to a periodic stimulus with steady-state durations of 1 s, modulating repetitively from 6 to 4 kHz. The response to a 12-s unmodulated pure tone at 4 kHz is shown in gray. C: response to the periodic stimulus in B wrapped by the modulation period. The similarly wrapped response to the unmodulated 4-kHz control is shown in gray. Responses to glides in A and periodic FM in B and C are quite similar. The response to the 6 kHz origin (0-1 s) for the glide in A is not significantly different (Wilcoxon, P = 0.8602) from the control-at-origin reference nor is the response to the repeating 6 kHz origin (0-1 s in the modulation period) in C different from the response to the 4-kHz unmodulated control (Wilcoxon, P = 0.2233) over the same interval. Thus the powerful resuscitation of the response following the modulation to 4 kHz in both paradigms is not due to recovery from adaptation during the presentation of the origin frequency.

Time course and magnitude of conditioned responses

One aim of this analysis was to determine whether the time course of conditioning obtained with monaural frequency sweeps was similar to that observed when modulating ILD (Sanes et al. 1998). To allow for direct comparison with effects obtained with ILD modulation, splines were fitted to the binned spike counts of the difference (or the negative of the difference in the case of suppression) of the conditioned and control responses throughout the duration of the comparison interval. The time corresponding to the intersection of the fit with zero was considered to mark the end of the effect. The typical length of the entire glide stimulus was 4 s, and most (23/34; 2 cases were excluded because the trial length was insufficiently long for this analysis) conditioned responses persisted throughout the duration of the comparison interval, which lasted slightly more than 2.5 s on average. The difference between the conditioned and control responses did fall below zero in 11 cases, with a mean effect duration of 1.64 s. The fact that most effects in our sample have a duration that exceeds 2.5 s when calculated by the same procedure utilized by Sanes et al. (1998) confirms that conditioning effects obtained in both paradigms persist for a number of seconds after the modulation has ceased.

To facilitate comparisons to the broader literature, single exponentials were fit to the binned differences in spike counts for conditioned responses. We also fit the temporal profile of the control response at the onset of the origin tone with a single exponential that included an additional parameter (the "pedestal") for the steady-state firing rate. In 14 cells, the time courses of the response at the onset of the control stimulus and the decay of the conditioned effect magnitude could both be well fit by decaying exponential functions with median time constants of 257 and 864 ms, respectively. The time constants obtained from each of these fits in each cell were not significantly correlated (Spearman-rho, P > 0.05). The temporal profile of the response to the control tone at onset could only rarely be fit in those cells whose responses showed significant suppression for glides. Of the eight cells exhibiting conditioned suppression (6 of these also exhibited conditioned enhancement for a different glide stimulus), five had buildup responses (the fit converged on a rising exponential) to the control. The remaining three had nearly constant firing rates that were not well described by a decaying exponential. By contrast, most cases of enhancement (17/28) were associated with exponentially decaying responses to the onset of the control stimulus. This relationship between conditioning effect class (enhancement or suppression) and response profile (decaying, buildup, poorly fit) was significant (χ², P = 0.0008). In the instances where the magnitude of conditioned suppression over time could be fit (5/8), the time constants were qualitatively quite similar to those obtained for conditioned enhancement.
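The two fitted forms described above can be written out explicitly. The symbols below are notation introduced here for illustration only (the source does not name its parameters): A and B are amplitudes, P is the pedestal, and the two time constants correspond to the reported medians of 257 ms (control onset) and 864 ms (conditioned effect).

```latex
% Control response at the onset of the origin tone:
% single exponential plus a steady-state pedestal P
r_{\mathrm{ctrl}}(t) = A\, e^{-t/\tau_{\mathrm{ctrl}}} + P

% Magnitude of the conditioned effect (conditioned minus control),
% modeled as a single exponential without a pedestal
\Delta(t) = r_{\mathrm{cond}}(t) - r_{\mathrm{ctrl}}(t) \approx B\, e^{-t/\tau_{\mathrm{eff}}}
```

Under this notation, a decaying control response has A > 0, whereas a fit that rises toward the pedestal (A < 0) would correspond to the buildup responses mentioned in the text.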

Figure 3, B and D, shows the typical temporal evolution of conditioned enhancement and suppression respectively. Firing rates were normalized by taking the ratio of the response in each 50-ms bin to the average response for the duration of the control-at-target reference. These ratios were then averaged over all cells, and splines fit to these ratios were plotted. The control responses, in gray, were calculated and plotted similarly. All the responses that were analyzed quantitatively are included, regardless of significance. In the first second subsequent to the sweep, enhanced responses typically decay from 3.8 to 1.6 times the response to the control average, whereas suppressed responses rise from 0.32 to 0.59 times that average.
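The normalization described above can be sketched in a few lines of Python. The function names and the list-based data layout are assumptions for illustration; the spline fitting applied to the averaged ratios is not reproduced here.

```python
def normalize_to_control(conditioned_counts, control_counts):
    """Express per-bin counts as ratios to the mean control count.

    Both inputs are per-bin spike counts over the control-at-target
    reference interval. By construction, the normalized control
    averages to 1.
    """
    control_mean = sum(control_counts) / len(control_counts)
    if control_mean == 0:
        raise ValueError("control response is zero; ratio is undefined")
    return ([c / control_mean for c in conditioned_counts],
            [c / control_mean for c in control_counts])


def average_across_cells(ratio_lists):
    """Bin-by-bin mean of normalized responses over cells.

    Truncates to the shortest response, since the comparison interval
    varies with modulation depth.
    """
    n_bins = min(len(r) for r in ratio_lists)
    return [sum(r[i] for r in ratio_lists) / len(ratio_lists)
            for i in range(n_bins)]
```

Because every cell is scaled by its own control average, cells with very different absolute firing rates contribute equally to the population curves.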



Fig. 3. Average time course of conditioned enhancement (B) and suppression (D) to glide targets, including the history of the response to the preceding enhancing (A) or suppressive (C) origin frequency (note change in time scale). Data are based on the responses to the glide stimulus producing the most robust conditioning effect of each type for each cell. Although an instance of conditioned enhancement was drawn from each cell (n = 31) tested with glides, regardless of significance, instances of suppression were less common (n = 11). Spike counts (50-ms bins) were normalized by the average spike count of the control response during the control-at-target reference. Thus the control curves (gray) in B and D average to unity. All curves shown represent spline fits to normalized histograms. A: responses to origin frequencies that enhanced the response to the targets in B. Comparison of the black curves in A and B shows that the responses to origins associated with enhancement were typically less robust than the sustained response to the target. In most cases, there was no response to the enhancing origin tone, as indicated by the origin median (dashed line) at 0. The profile of the control response during the 1st second (A, gray) shows that cells that exhibit enhancement typically show firing rate adaptation to the presentation of the unmodulated control stimulus. B: time course of conditioned enhancement. Comparison of the target curve shows that enhanced responses exceed both the control-at-target (gray, B) and control-at-origin (gray, A) references. C: responses to origins that induced conditioned suppression in D. Despite the fact that the origin median (dashed line) is 0, the origin mean (black) is substantially elevated relative to the sustained response to the target. 
This apparent discrepancy exists because the response histories of instances of suppression fall into 2 discrete classes (see RESULTS): suppression following a robust response and suppression following no response to the origin. D: time course of conditioned suppression.

Figure 3, A and C, shows the average response that preceded the conditioned and control responses shown in B and D. The gray curves in A and C, representing the responses during the first second of the control tone, indicate that firing rates decay substantially during the presentation of the unmodulated control stimulus for cells exhibiting enhancement (A), but not for cells exhibiting suppression (C). This is consistent with the fact that the latter cases were poorly fit with decaying exponentials as described in the preceding text. The curves in black (A and C) representing the average response to the origin tone that preceded the common target tone indicate that conditioned enhancement tends to follow a relatively weak response to the origin tone, but conditioned suppression occurs after a stronger response. The medians (dashed lines) of response history for conditioned enhancement (A) and suppression (C), however, are both zero throughout the presentation of the control tone because most origin tones did not elicit action potentials.

Instances of conditioned suppression can be divided neatly into two categories on the basis of discharge history. Cases where the response to the origin tone exceeds the response to the target tone because the glide stimulus modulates away from the cell's BF were less common (4/11) because we generally chose the BF of the cell as the target. The second, somewhat larger (7/11) category includes cases of conditioned suppression in the absence of a significant response to the origin tone.

Effects of varying the origin frequency

For glide stimuli that modulate to a common target, it is possible to construct a conditioned RF that represents the response to an identical stimulus (the target) as a function of the frequency of the preceding origin tone (Fig. 4). Responses averaged over 500-ms intervals following the onset of the origin and target tones were used to construct frequency and conditioned RFs, respectively. The frequency RF is both a measurement of the frequency tuning of the cell and a record of the history of the cell's response prior to the presentation of the target tone. Examples of such functions for two cells exhibiting conditioned enhancement and suppression are shown in Figs. 5 and 6. The dashed lines indicate the response to the target predicted by the control-at-origin (gray) and control-at-target (black) references. The distance between them indexes the adaptation (or buildup) of the response to the unmodulated control. If IC neurons were insensitive to the context in which the target tone appeared, then the conditioned RFs in black should be flat lines (the d's in Fig. 4), subject to the intrinsic variability of the measured response. It is clear from Figs. 5 and 6 that the presentation of the origin tone and origin-to-target sweep can profoundly affect the response to the target even when the origin tone lies well outside the excitatory frequency response area.
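The construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names, the dictionary layout of the trials, and the toy spike times are all assumptions made for the example. It averages firing rates over a 500-ms window at origin onset (frequency RF) and at target onset (conditioned RF), indexing both by origin frequency.

```python
from statistics import mean


def mean_rate(spike_times, start, width=0.5):
    """Firing rate in spikes/s within the window [start, start + width)."""
    count = sum(1 for t in spike_times if start <= t < start + width)
    return count / width


def response_functions(trials, target_onsets, width=0.5):
    """Build the frequency RF and conditioned RF from glide or step trials.

    trials: dict mapping origin frequency (kHz) to a list of spike-time
        lists (seconds from stimulus onset), one list per repetition.
    target_onsets: dict mapping origin frequency (kHz) to the time (s) at
        which the common target tone begins for that stimulus.
    Both layouts are hypothetical conveniences for this sketch.
    """
    frequency_rf, conditioned_rf = {}, {}
    for origin, repetitions in trials.items():
        # Response to the origin tone: window begins at stimulus onset.
        frequency_rf[origin] = mean(mean_rate(s, 0.0, width) for s in repetitions)
        # Response to the common target: window begins at target onset.
        conditioned_rf[origin] = mean(
            mean_rate(s, target_onsets[origin], width) for s in repetitions
        )
    return frequency_rf, conditioned_rf


# Toy data (invented for illustration): two origins converging on one target.
toy_trials = {
    5.0: [[0.1, 1.3, 1.4, 1.6], [1.3, 1.5]],
    7.0: [[0.05, 0.2, 0.3, 1.3], [0.1, 0.4, 1.6]],
}
toy_onsets = {5.0: 1.25, 7.0: 1.25}
freq_rf, cond_rf = response_functions(toy_trials, toy_onsets)
print(freq_rf)  # response to each origin tone
print(cond_rf)  # response to the target, indexed by preceding origin
```

A flat `cond_rf` would correspond to a neuron that is insensitive to context; in the toy data the target response differs between the two origins.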



Fig. 4. Construction of conditioned response functions (RFs). A: glide stimuli spanning a range of origin frequencies (e.g., 5 and 9 kHz) modulate to a common target (7 kHz). The unmodulated control stimulus is indicated by the dashed gray line. Firing rates averaged over the 1st 500 ms (i.e., 0-0.5 s) of the presentation of the origin tones (a-c) were used to generate the frequency RF shown in B (gray). Firing rates averaged over a 500-ms interval beginning at the onset of the target frequency (d: 1.25-1.75 s for a modulation depth of 2 kHz) were used to generate the conditioned RF shown in B (black). Conditioning is assessed by comparing the responses to the target (d, black) to the control-at-target (d, gray) and control-at-origin (b) references. B: cartoon of the conditioned RF: responses to the target (e.g., 7 kHz) are plotted as a function of the frequency of the origin that preceded it. If stimulus context did not affect the response to the target, the conditioned RF would be flat as shown. The equivalence of b and d (gray) at the target frequency (7 kHz) indicates that the firing rate remained constant during the presentation of the unmodulated control stimulus. Responses were also averaged over a 2nd set of 500-ms comparison intervals beginning at the end of the first set (e.g., 0.5-1 s, and 1.75-2.25 s) to examine the persistence of the conditioned responses.



Fig. 5. Conditioned RF exhibiting conditioned enhancement for a range of origin frequencies in the glide paradigm. Graphing conventions described here also apply to a number of subsequent figures. A: the frequency RF based on the 1st 500 ms of the response to the origin is shown in gray, and the conditioned RF based on the 1st 500 ms of the response to the target (7 kHz) is shown in black. Vertical bars on each curve indicate standard error. Dashed lines intersect the conditioned RF (black) and frequency RF (gray) at the target frequency and indicate the values of the control-at-target and -origin references, respectively. The brackets to the right indicate the ranges of the conditioned RF (black) and frequency RF (gray). Histograms illustrating the time course of conditioning for individual glide stimuli are shown in B-D. The intervals during which responses were averaged to calculate the conditioned RF are filled in black.



Fig. 6. Conditioned RF exhibiting conditioned suppression for a range of origin frequencies in the glide paradigm (see Fig. 5, legend). Note that in this case the target frequency (5 kHz) was not the BF of the cell (4 kHz). Suppression of the response to the target occurs following the presentation of origin frequencies that do (4 kHz) and do not (e.g., 2.5, 8 kHz) drive the cell.

It is evident from Figs. 1-3 that conditioned responses converge on the steady-state response associated with a particular choice of target. Figures 7 and 8 illustrate how entire conditioned RFs converge on the steady-state response predicted by the control-at-target reference for enhancement and suppression respectively. Each curve is based on an average of the responses in a 750-ms window, and each successive curve is based on an interval beginning 250 ms later. The histograms show the time course of the response to the target preceded by the origin frequency indicated by the asterisk on each figure. The bars under the stimulus icon in each histogram indicate the intervals where responses were averaged to create the series of conditioned RFs below. Although neither curve flattens completely in the time shown, it is nevertheless clear that the conditioning effects that create the variability in the conditioned RFs diminish with time. Consideration of the curve in Fig. 8 reveals that conditioned suppression for an origin frequency proximal to the peak of the frequency RF (1.5 kHz) appears to decay less rapidly than does suppression of the response induced by the presentation of the 8-kHz origin tone. Variation in the kinetics of conditioned responses elicited by different stimuli, but delivered to the same cell, suggests that the mechanisms responsible for conditioning phenomena are either diverse or operate differently at different synapses of the recorded neuron.
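The staggered averaging windows described above (750 ms wide, each beginning 250 ms after the previous one) can be generated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the number of windows is a free parameter here because the text does not state how many were used.

```python
def sliding_windows(first_start, width=0.75, step=0.25, count=4):
    """Return (start, end) pairs in seconds for overlapping averaging windows.

    count is an assumed parameter; the source does not specify it.
    """
    return [
        (first_start + i * step, first_start + i * step + width)
        for i in range(count)
    ]


def windowed_rates(spike_times, windows):
    """Firing rate (spikes/s) of one spike train in each window [start, end)."""
    return [
        sum(1 for t in spike_times if start <= t < end) / (end - start)
        for start, end in windows
    ]


# Toy example: windows beginning at an assumed target onset of 1.0 s.
windows = sliding_windows(1.0, count=3)
rates = windowed_rates([1.1, 1.3, 1.6, 2.1], windows)
print(windows)
print(rates)
```

Applying `windowed_rates` to every origin frequency in turn would give one conditioned RF per window, which is the series of curves plotted in Figs. 7 and 8.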



Fig. 7. Time course of conditioned enhancement across a range of origin frequencies in the glide paradigm. The histogram, top, illustrates the response of the neuron to a glide stimulus modulated from 8 to 7 kHz. The corresponding point on the conditioned RF is indicated (*). The series of bars shown above the histogram indicate the intervals during which responses were averaged to create the series of curves shown (750 ms, staggered by 250 ms); the curves illustrate the relaxation of enhanced responses to the sustained firing rate normally associated with the target. Dashed lines indicate the responses to the control stimulus for a 750-ms window beginning at the onset of the control stimulus (gray) and 1 s thereafter (black). These are effectively equivalent to the control-at-origin and control-at-target references, respectively. This cell did not respond to any origin frequencies other than 7 kHz.



Fig. 8. Time course of conditioned suppression across a range of origin frequencies in the glide paradigm. There is moderate (P < 0.05) but long-lasting suppression following the presentation of the 1.5-kHz origin, which drove the cell only weakly (4.67 Hz). More powerful but more transient suppression resulted from the prior presentation of frequencies well outside the cell's response area (e.g., 8 kHz, shown in the histogram and indicated by an asterisk on the conditioned RF plot). Dashed lines indicate the response to the control stimulus (2 kHz, the cell's BF) during a 750-ms window beginning at control onset (black) and 1 s thereafter (gray).

Is continuous FM required to elicit "dynamic" conditioning?

The term "dynamic conditioning" (Sanes et al. 1998) implies that a dynamic stimulus component is necessary to elicit the phenomenon. By presenting sweep-target stimuli that omitted the origin, we obtained anecdotal evidence that the sweep alone was sufficient to elicit conditioned responses. Elimination of the sweep, on the other hand, allows us to test directly the hypothesis that the dynamic component of the stimulus is necessary for conditioning to occur. Using sets of tone pairs ("probe" and "masker" tones with no delay) and varying the frequency of the origin tone, we created conditioned RFs for step stimuli analogous to those described previously for glides. Examples of conditioned RFs exhibiting both conditioned enhancement and suppression are shown in Figs. 9 and 10. In Fig. 9, the response to the 5-kHz target is suppressed by the presentation of the 6-kHz origin tone (Fig. 9C) which, although proximal to the BF, does not by itself elicit a response. In Fig. 10, conditioned suppression relative to the control-at-origin reference (gray dashed line) occurs for origin frequencies near the 7-kHz target, but not for origin frequencies at or just below the 4-kHz BF (Fig. 10B). Clearly, the elimination of the modulated component did not eliminate evidence of conditioning. Moreover, the conditioned RFs for these cells are not simply the inverse of the frequency RFs.



Fig. 9. Conditioned RF exhibiting both conditioned enhancement and suppression in the step paradigm. Although nearly all origin frequencies that do not drive the cell are followed by enhanced responses (e.g., 10 kHz in D), the prior presentation of 6 kHz (C) suppresses the response to the 5-kHz target more powerfully than does the prior presentation of the best frequency (BF, 5 kHz; B).



Fig. 10. Conditioned RF based on the step paradigm illustrating conditioned enhancement and suppression for a target (7 kHz) well away from the BF (4 kHz). Remote origin tones were associated with enhancement of the response (e.g., 10 kHz in D). The response to the 7-kHz target, when preceded by an origin tone at the BF (4 kHz, in B), was much larger than when preceded by an origin tone of 7 kHz (C) despite the much larger response to the BF stimulus.

We examined 28 conditioned RFs based on glides and 28 conditioned RFs based on steps for evidence of significantly conditioned responses. Because these stimuli are much longer than those typically used in studies of forward masking and sequence selectivity (Brosch et al. 1999; Calford and Semple 1995), we collected an additional 14 conditioned RFs based on 200-ms tone pips and interstimulus intervals of 100 ms. Conditioned RFs based on these "quickstep" stimuli were exactly matched in frequency and SPL to step conditioned RFs obtained in the same cells, and they test the hypothesis that conditioning effects require long-duration tones for their induction. Significance of conditioned responses was assessed for each choice of reference (control-at-target and control-at-origin) and for each block of data (firing rates for quicksteps were averaged over 100- rather than 500-ms comparison intervals). Results of this analysis appear in Table 1. A cell was considered to exhibit conditioning if at least one response was significantly enhanced or suppressed relative to the reference. The sum of enhanced points and suppressed points can exceed the number of conditioned RFs if there are cells that exhibit both enhancement and suppression in the same conditioned RF.


                              
Table 1. Population summary of effects for conditioned response functions

Averaged across the three stimulus paradigms, 93% of conditioned RFs included at least one significantly conditioned response relative to the control-at-target reference for the first comparison interval. Most (73%) of these effects persisted into the second comparison interval. Using the alternative control-at-origin reference, 70% of conditioned RFs showed evidence of conditioning during the first comparison interval.

It should also be stressed that although we explicitly searched for stimuli that elicited robust conditioning effects, several different stimuli were effective in eliciting conditioned responses in most cells. Although the total number of curves was the same (28), many more points were tested in the step paradigm, and a higher proportion (58 vs. 36%) showed enhancement. Although differences in the duration of the origin tone between the glide (1 s) and step (2 s) paradigms could explain the discrepancy in effect prevalence, quickstep stimuli also readily and robustly conditioned the responses of IC neurons with origin tones lasting only 200 ms. Since the modulated component was evidently not necessary to produce these effects, the frequency sweep probably diminishes the likelihood of detecting conditioned responses during the steady states because the contextual effects exerted by the origin tone wane during the course of the sweep. The similarity of effect prevalence across time scale indicates that the induction of conditioned responses is relatively rapid and occurs on a time scale of a few hundred milliseconds as well as seconds. The observation that conditioning effects appear to decay more rapidly in the quickstep than step and glide paradigms suggests that the duration of the eliciting stimulus affects the duration of the conditioned effect.

Can the conditioned RFs be explained by response history?

No particular origin or target frequencies or modulation depths (expressed in either kilohertz or octaves) were systematically related to either the incidence or magnitude of conditioned effects in our sample. Although the structure of the conditioned RFs was not conserved across cells, conditioned RFs were consistent within cells, across both stimulus type and time scale. For all cells tested with matched origin-target pairs across different stimulus paradigms, we computed the ratio of each point in the conditioned RF to the control-at-target/-origin reference. The correlations of these ratios were significant across both the stimulus type (r = 0.329/0.208; P < 0.0001) and time scale (r = 0.149/0.383; P < 0.0001).

While the foregoing analysis can establish that the structure of the conditioned RF is anchored to the response area of each cell, it cannot specify whether the conditioned responses reflect the particular origin stimulus that preceded the target, or the response to that stimulus. For a neuron very sharply tuned to 4 kHz, the "effective" onset of a glide stimulus with an origin of 8 kHz might not occur until very near the end of the sweep to the 4-kHz target, when the cell begins to respond. For a 4-s stimulus that includes a 1-s interstimulus interval, the duty cycle of the control stimulus over repeated presentations would be 4/5. For the remote-origin glide stimulus, however, the "effective" duty cycle of the stimulus (the fraction of time that the stimulus spends at frequencies that drive the cell to fire action potentials) would be ~2/5. From the standpoint of response adaptation, the cell would be recovering from adaptation not only during the interstimulus interval but also during the presentation of the origin tone and much of the sweep. In addition to recovering from adaptation of excitation during the presentation of the origin stimulus, cells with inhibitory sidebands may also be actively inhibited. For the purposes of our analysis here, however, inhibitory events and their consequences are treated as aspects of stimulus history because by the history of the cell's response we mean only what we can measure extracellularly (i.e., the discharge history).
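The duty-cycle arithmetic in the preceding paragraph can be made explicit with a small worked example. The sketch below is illustrative only: the function name and the segment representation are hypothetical, and the split of the glide into 2 s that do not drive the cell followed by 2 s that do is an assumption chosen to reproduce the approximate value of 2/5 given in the text.

```python
from fractions import Fraction


def effective_duty_cycle(segments, interstimulus_interval):
    """Fraction of each presentation cycle spent at frequencies that drive the cell.

    segments: list of (duration_s, drives_cell) pairs describing one stimulus.
    interstimulus_interval: silent gap (s) between repeated presentations.
    """
    period = sum(duration for duration, _ in segments) + interstimulus_interval
    driven = sum(duration for duration, drives in segments if drives)
    return Fraction(driven) / Fraction(period)


# Control: 4 s at a frequency that drives the cell, then 1 s of silence.
control = [(4, True)]
# Remote-origin glide (assumed split): origin and most of the sweep do not
# drive a sharply tuned cell; only about the last 2 s do.
remote_glide = [(2, False), (2, True)]

print(effective_duty_cycle(control, 1))       # 4/5
print(effective_duty_cycle(remote_glide, 1))  # 2/5
```

The same function covers step stimuli: two 5-s tones with a 2-s interval give 10/12 when both tones drive the cell and 5/12 when only the target does.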

Our results indicate that enhanced responses typically exceed the response even at the onset of the control tone (i.e., the control-at-origin reference). Figures 11 and 12 show two examples of conditioned responses (Figs. 11, B and C, and 12B) that clearly exceed the responses at the onset of the control stimuli (Figs. 11A and 12A) for very-long-duration tones. In both instances, the delay between the tones was varied, and even a temporal separation of 2 s (Fig. 11C; data not shown for the cell in Fig. 12) did not produce any change in the magnitude or character of the response to the target. For tone durations of 5 s and an interstimulus interval of 2 s, the "effective" duty cycle of the control stimulus in Fig. 11 is 10/12, while that of the stimuli shown in B and C is 5/12. The insensitivity of the conditioned response to the introduction of long delays (1-3 s) between the origin and target tones is evidence that the enhanced response to the target results from a change in "effective" duty cycle rather than the presentation of a particular (e.g., 12 kHz) stimulus in the immediate past.



Fig. 11. Introduction of a delay between the origin and target tones did not abolish response enhancement. A: histogram showing the response to the step control stimulus. B: presentation of an origin tone whose frequency is well outside the response area of the cell results in profound enhancement of the response. C: introduction of a 1- to 3-s delay (2-s delay shown) between origin offset and target onset does not substantially alter the strength of the response when averaged over the duration of the target tone (Kruskal-Wallis, P = 0.2208).



Fig. 12. Conditioned enhancement often cannot be attributed to "recovery from adaptation" during the presentation of the origin tone. A: histogram depicting the response to a very long (10 s/tone) step control stimulus. The magnitude of the response to the target was not affected by the introduction of delays as long as 3 s (data not shown). Neither the 2-kHz origin tone nor the 7-kHz target drove the cell particularly strongly, relative to the BF (4 kHz; see Fig. 10). B: the response to the 7-kHz target is significantly enhanced relative to the control response to either the origin or target tone (Wilcoxon, P < 0.0001). The response to the 2-kHz origin tone in B, however, is not significantly (Wilcoxon, P = 0.0740) different from the response to the target tone in A.

If a nonspecific (and in our case, monaural) mechanism of response adaptation accounts for our results, then it should be possible to demonstrate that in instances where the responses to different origin tones are equivalent, the responses to the ensuing target tone are likewise equivalent. The neuron whose responses are illustrated in Fig. 12 shows robust enhancement of the response to the 7-kHz target when preceded by a 2-kHz origin tone. This cell cannot be said to have recovered from adaptation during the presentation of the 2-kHz origin tone any more than it recovers during the presentation of the target tone during the control stimulus: the neuron responds slightly but not significantly more strongly to the 2-kHz origin tone (Fig. 12B) than it does to the presentation of the target tone for the control stimulus (Fig. 12A). It could be said, however, that the cell's afferents tuned to 7 kHz recover from adaptation during the presentation of the 2-kHz origin tone and that the "effective" duty cycle for these afferents differs substantially across the control and conditioning stimuli. A similar argument applies to the responses to the 6- and 4-kHz origin tones in Fig. 2A.

Figure 13B is a cartoon of a conditioned RF based purely on adaptation of excitation. The response to the target presented alone (i.e., the origin tone is replaced with silence) is indicated (*). For every spike fired in response to the origin tone, there is a proportional reduction in the response to the target so that as the origin tone moves out of the cell's excitatory response area, the conditioned RF asymptotes to the response to the target tone alone (e.g., <3 and >10 kHz). Figure 14 depicts a conditioned RF obtained with very-short-duration tone pairs (20 ms) that conforms to this simple model. The second excitatory peak of the frequency RF (8-9 kHz) is matched by a second trough in the conditioned RF. The data derived from the conventional stimulus paradigms, however, did not typically conform to the predictions of the adaptation of excitation hypothesis, which predicts that the peak of the frequency RF should align with the trough of the conditioned RF. Instances like that schematized in Fig. 13C, where ineffective origin tones suppress the response to the subsequent target, were not uncommon. In Fig. 15, for example, the minimum of the conditioned RF occurs at 4 kHz, where the origin response was no different from that of a number of origins that preceded robust target responses. In addition, the control target response is near the maximum of the conditioned RF despite the prior presentation of the BF at 4.5 kHz.
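The adaptation-of-excitation hypothesis can be written as a simple toy model for comparison against measured conditioned RFs. The sketch below is an illustration under stated assumptions, not a model fitted in the study: the proportionality constant `k`, the function names, and all firing rates are hypothetical. It predicts the target response as the response to the target alone minus a term proportional to the origin response, then checks whether the frequency RF peak coincides with the conditioned RF trough.

```python
def predicted_conditioned_rf(frequency_rf, target_alone, k=1.0):
    """Target response predicted from discharge history alone.

    frequency_rf: dict mapping origin frequency (kHz) to origin response (Hz).
    target_alone: response (Hz) to the target preceded by silence.
    k: assumed reduction in target response per unit of origin response.
    """
    return {
        origin: max(0.0, target_alone - k * rate)
        for origin, rate in frequency_rf.items()
    }


def peak_and_trough_align(frequency_rf, conditioned_rf):
    """True if the frequency RF peak and conditioned RF trough share an origin."""
    peak = max(frequency_rf, key=frequency_rf.get)
    trough = min(conditioned_rf, key=conditioned_rf.get)
    return peak == trough


# Invented firing rates for illustration only.
frequency_rf = {3: 0, 5: 10, 6: 30, 7: 10, 10: 0}
model_rf = predicted_conditioned_rf(frequency_rf, target_alone=40)
observed_rf = {3: 40, 5: 5, 6: 20, 7: 30, 10: 40}  # trough shifted to 5 kHz

print(peak_and_trough_align(frequency_rf, model_rf))     # True
print(peak_and_trough_align(frequency_rf, observed_rf))  # False
```

With `k = 1` the sum of the two functions is constant wherever the prediction is not clipped at zero, which is the pattern described for Fig. 14; a misaligned trough, as in `observed_rf`, is the pattern schematized in Fig. 13C.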



Fig. 13. Schematics depicting the way the history of the cell's discharges to the origin stimulus (the frequency RF) may (B) or may not (C and D) predict the structure of the associated conditioned RF. A: no effect of context. A flat conditioned RF at the level of the frequency RF peak at the target (6 kHz) indicates that the response to 6 kHz is constant under all conditions (no firing rate adaptation, or sensitivity to "effective" duty cycle). B: discharge history completely determines the conditioned RF. Reduction of the "effective" duty cycle for origin frequencies outside the cell's response area enhances responses to the target relative to the frequency RF peak. Enhancement asymptotes to the level predicted by the response to the target when presented alone (*). Within the response area, the response to the origin tone causes a proportional reduction in the response to the target, indicated by the alignment of the trough of the conditioned RF at the frequency RF peak. C: stimulus specific suppression. The misalignment of the frequency RF peak (6 kHz) and conditioned RF trough (5 kHz) indicates that the discharge history does not fully determine the subsequent response to the target. D: stimulus specific enhancement. The response to the target after particular origins exceeds the responses obtained at more remote origins, and is not explicable in terms of discharge history (e.g., both 9 and 12 kHz elicit the same lack of response). Such phenomena may reflect a postinhibitory rebound in the cell or adaptation of broadly tuned inhibitory afferents during presentation of the origin.



Fig. 14. Discharge history (the frequency RF) predicts the form of the conditioned RF quite well for some cells, particularly for very short tone pairs. Data shown here are based on 50 repetitions of a step stimulus with individual tone durations of 20 ms. The conditioned RF (black) and frequency RF (gray) vary inversely, such that their sum is nearly constant.



Fig. 15. Misalignment of the frequency RF peak and conditioned RF trough indicates suppression of the target response in the absence of an extracellularly observable response to the origin. Data shown here are based on the quicksteps paradigm.

Careful consideration of the conditioned RFs in earlier figures reveals numerous examples of the failure of the discharge history of the cell, embodied in the frequency RF, to account for the response to the subsequent target tone. As has been noted, misalignment of the frequency RF peak and conditioned RF valley is evident in Figs. 9 and 10. In the latter case, the response to the 7-kHz target is not substantially diminished by the robust response to the preceding 4-kHz origin tone, while the response to the target is powerfully suppressed by the relatively weak response to the preceding origin tone of the same frequency. Again these results are consistent with frequency-specific adaptation occurring in the afferents to the recorded IC neuron. As Figs. 9 and 15 demonstrate, decrements in the response to the target tone can occur in the absence of a significant extracellularly observable response to the origin tone. Although response adaptation subsequent to the robust response to the BF of 4 kHz in Fig. 6 clearly accounts for the trough of the conditioned RF there, the suppression of the responses to targets which follow origin tones that do not elicit responses above the spontaneous rate (e.g., 2.5, 8 kHz) cannot be so explained.

Finally, we also encountered instances of stimulus-specific conditioned enhancement like that schematized in Fig. 13D. The response following a particular origin stimulus is enhanced relative to the response to origin frequencies more remote from the BF, where the "effective" duty cycle of the stimulus would be minimized. The tuned peak of the conditioned RF at 7 kHz in Fig. 16 is incompatible with a nonspecific adaptation of excitation mechanism because the discharge history (and thus the "effective" duty cycle) for the origin tones from 7 to 10 kHz is exactly the same: there was no response prior to the onset of the 4-kHz target. Similarly, the cells considered in Figs. 5 and 9 effectively respond only to 7 and 5 kHz, respectively, but the target responses following other equivalently ineffective origin tones differ significantly from one another.



Fig. 16. Differential enhancement over a range of origin frequencies that elicit statistically indistinguishable responses (P > 0.1) cannot be explained by discharge history. Despite the fact that neither 7- nor 9-kHz origin tones drive the cell, the subsequent responses to the 4-kHz target tone differ significantly (Tukey-Kramer HSD, P < 0.001). Data shown here are based on the step paradigm with 2-s tones.

Evidence for sensitivity to stimulus history in the conditioned RFs

We assessed the prevalence of sensitivity to stimulus rather than discharge history by restricting each of the measured conditioned RFs to origin frequencies that elicited statistically indistinguishable responses. Because the range of tested origin frequencies greatly exceeded the excitatory response area of each cell, there were generally several origin-target pairs that could be equated in terms of discharge history and "effective" duty cycle: none of those origin tones elicited a response when presented (e.g., 4-6 and 8-10 kHz in Fig. 5). In a few cases, target responses that followed relatively robust but equivalent responses to different origin tones were also retained and analyzed separately. The equivalence of the responses to the remaining origin tones was verified by obtaining a nonsignificant result in an ANOVA of repeated trials by origin frequency (P > 0.10). Firing rates <2 Hz, where 2 Hz represented <10% of the response to the BF, were considered not to be significantly different from zero regardless of the outcome of the test. After eliminating differences in discharge history from the conditioned RFs in this way, an ANOVA by origin frequency was then performed on the responses to the target tones. A significant (P < 0.01) result was taken as evidence for a cell being specifically sensitive to stimulus history. Overall, 57% of the conditioned RFs in each paradigm responded significantly differently to the same target tone when it was preceded by origin tones that elicited statistically indistinguishable responses.

Context sensitivity attributable solely to stimulus history was estimated by examining the effect of controlling for discharge history on the ranges of the conditioned RFs. The range of firing rates spanned by the conditioned RF indicates the variability in the response to an identical stimulus wrought by changing the context in which that stimulus appeared. This range has the advantage of being independent of the choice of control reference used elsewhere to estimate the incidence and magnitude of conditioned effects. To normalize for differences in overall firing rate across cells, we divided the range of each conditioned RF by the range of the associated frequency RF. These ranges are indicated by the brackets to the right of conditioned RFs on earlier figures. A value of unity for this "context sensitivity index" indicates that response variability to the same target occurring in different contexts is as great as the variability in the responses to origin tones spanning the excitatory response area of the cell. Context sensitivity indices did not differ overall across stimulus type, or time scale, but individual cells tested in more than one paradigm yielded values that were consistent across both stimulus type (n = 20; ρ = 0.552; P = 0.0116) and time scale (n = 14; ρ = 0.641; P = 0.0135).

Stimulus-specific context sensitivity was then calculated from the ranges of conditioned RFs restricted to those origin-target pairs where the history of the response to the origin tone was statistically indistinguishable as described in the preceding text. The distribution of these indices also did not differ across paradigm. The graph in Fig. 17A shows context sensitivity with response history controlled plotted against the context sensitivity index for the full conditioned RF. The vertical distance from the diagonal indicates the reduction in the range of the conditioned RF when response history is controlled. In a number of conditioned RFs, most conditioning effects were induced by origin tones outside the excitatory frequency response area of the cell and thus cannot be explained by nonspecific adaptation of excitation. The error index in Fig. 17B estimates the degree of apparent "context sensitivity" attributable to variability over repeated trials. It was computed by normalizing the mean standard error of each conditioned RF by the range of the frequency RF. The median values of the context sensitivity indices when discharge history was (0.43) or was not controlled (0.81) far exceeded the median value of the error index (0.074).
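The three indices defined above are simple ratios and can be sketched as follows. The code is a hedged illustration: the function names, the dictionary representation of each RF, and every firing rate and standard error are invented for the example, and the set of "matched" origins is supplied by hand rather than derived from the ANOVA described in the text.

```python
from statistics import mean


def rf_range(rf):
    """Range (max minus min) of a response function stored as a dict."""
    return max(rf.values()) - min(rf.values())


def context_sensitivity_index(conditioned_rf, frequency_rf, origins=None):
    """Range of the conditioned RF divided by the range of the frequency RF.

    origins: optional subset of origin frequencies to which the conditioned
        RF is restricted (e.g., origins with indistinguishable responses).
    """
    if origins is not None:
        conditioned_rf = {f: conditioned_rf[f] for f in origins}
    return rf_range(conditioned_rf) / rf_range(frequency_rf)


def error_index(conditioned_se, frequency_rf):
    """Mean standard error of the conditioned RF over the frequency RF range."""
    return mean(conditioned_se.values()) / rf_range(frequency_rf)


# Invented values (Hz) for a cell that responds only to 7 kHz.
frequency_rf = {4: 0, 5: 0, 7: 20, 9: 0, 10: 0}
conditioned_rf = {4: 30, 5: 24, 7: 14, 9: 26, 10: 22}
conditioned_se = {4: 1, 5: 2, 7: 1, 9: 2, 10: 2}
silent_origins = [4, 5, 9, 10]  # assumed to elicit indistinguishable responses

full = context_sensitivity_index(conditioned_rf, frequency_rf)
controlled = context_sensitivity_index(conditioned_rf, frequency_rf, silent_origins)
noise = error_index(conditioned_se, frequency_rf)
print(full, controlled, noise)
```

In this toy case the index falls from 0.8 to 0.4 when discharge history is controlled, and both values remain well above the error index, mirroring the comparison of medians reported in the text.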



Fig. 17. Controlling for discharge history reduces the context sensitivity only slightly for many neurons. Context sensitivity is expressed as the ratio of the range of the conditioned RF to the frequency RF. A: context sensitivity indices calculated for conditioned RFs limited to origin frequencies that elicit similar responses (see RESULTS) are plotted against the context sensitivity indices for the entire conditioned RF. A single cell with conditioned RFs generated in each paradigm would contribute 3 points to this graph. Most cells were tested in ≥2 paradigms. Points lie along the diagonal for conditioned RFs whose maxima and minima occur for origin tones that elicit statistically indistinguishable responses. The vertical distance of a point from the diagonal indicates the reduction in the range of the conditioned RF when response history is controlled. B: error indices were calculated as the ratio of the mean standard error to the range of the frequency RF and indicate the degree of "context sensitivity" predicted solely by the typical response variability for repeated trials. A single outlier (glide) was eliminated from this analysis because its atypically low firing rates resulted in abnormally large error (0.92) and context sensitivity (2.92) indices. If stimulus context did not impact the responses of IC neurons, then all context sensitivity indices along the abscissa would fall below 0.25 or within the hatched box in the lower left corner in A; if all effects of stimulus context were attributable to discharge history, then all points along the ordinate in A would lie below the top of the hatched box.

It should be noted that the foregoing analysis is conservative with respect to the estimate of stimulus-specific conditioning effects because it effectively assumes that the variability in the responses to target tones preceded by origin tones within the excitatory response area is entirely due to discharge history.

Adaptation and conditioning in the IC

If nonspecific adaptation of excitation were the dominant force in conditioning responses to the common target tone, the firing rates elicited by the origin and target tones should be negatively correlated. Although the correlations were sometimes negative, none attained significance for any paradigm. Because the inclusion of origin frequencies that did not elicit significant conditioning effects could obscure the underlying relationship between adaptation and conditioning phenomena, we restricted the analysis to origin-target pairs that showed significant conditioning relative to the control-at-target reference. Although data from glides and steps failed to show a significant correlation, the response to the origin tone does predict the decrement in the response to the target for the quicksteps (rho = -0.287; P = 0.0133), suggesting that nonspecific adaptation of excitation increasingly dominates context effects for short-duration stimuli (cf. Fig. 14).
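The reported rho suggests a rank correlation, and the kind of test involved can be sketched as follows. This is an illustration only: it assumes paired lists of origin and target firing rates, uses average ranks for ties, and does not compute the P value. The function names and the example rates are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical firing rates (spikes/s) for four origin-target pairs.
origin_rates = [5.0, 12.0, 30.0, 44.0]
target_rates = [38.0, 35.0, 22.0, 15.0]
print(spearman_rho(origin_rates, target_rates))
```

A negative rho in this sketch would correspond to stronger origin responses being followed by weaker target responses, which is the pattern expected under nonspecific adaptation of excitation.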

It is possible that cells with particular adaptation characteristics have particular conditioning characteristics. To investigate the relation of monaural response adaptation to conditioning magnitudes obtained with monaural frequency glides and steps, we took the ratio of the responses to the control stimuli 500-1,000 and 0-500 ms after onset (when analyzed in 500-ms blocks, the responses in our sample reach a statistically "steady" firing rate by the 2nd block). This ratio was extremely consistent across repeated presentations of the control stimulus for each cell included in the analysis, and the average of these ratios was considered the adaptation ratio for the cell. This ratio was then compared with the context sensitivity index described in the preceding text (Fig. 17A). The lack of a significant correlation between these two measures suggests that the factors that control firing rate adaptation for constant stimuli are not necessarily the factors that regulate conditioning effects induced by frequency transitions.
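As a rough illustration of the adaptation ratio, the sketch below assumes that each presentation of the control stimulus is stored as a list of spike times in milliseconds relative to stimulus onset. The function names, the default windows expressed as arguments, and the example spike times are hypothetical.

```python
def rate_in_window(spike_times_ms, start_ms, end_ms):
    """Firing rate (spikes/s) in the half-open window [start_ms, end_ms)."""
    count = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)
    return count / ((end_ms - start_ms) / 1000.0)


def window_ratio(spike_times_ms, early=(0.0, 500.0), late=(500.0, 1000.0)):
    """Ratio of the late-window rate to the early-window rate for one trial."""
    early_rate = rate_in_window(spike_times_ms, *early)
    if early_rate == 0:
        raise ValueError("no spikes in the early window; ratio is undefined")
    return rate_in_window(spike_times_ms, *late) / early_rate


def adaptation_ratio(trials, early=(0.0, 500.0), late=(500.0, 1000.0)):
    """Average of the per-presentation ratios, taken as the ratio for the cell."""
    ratios = [window_ratio(trial, early, late) for trial in trials]
    return sum(ratios) / len(ratios)


# Hypothetical spike times (ms after onset) for two control presentations.
trials = [
    [10, 100, 250, 400, 600, 800, 950],
    [50, 300, 450, 480, 700, 900, 990],
]
print(adaptation_ratio(trials))
```

Values below 1 indicate a decline in firing rate from the first to the second block. Passing different windows, for example `early=(0.0, 100.0)` and `late=(100.0, 200.0)`, would adapt the same calculation to shorter stimuli.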

We measured the decay in firing rate to the target tones by taking the ratio of firing rates measured 500-1,000 and 0-500 ms after target onset (for quicksteps, 100-200 and 0-100 ms). Median values of this ratio for the glide, step, and quickstep stimuli were 0.83, 0.80, and 0.81, respectively. We extracted the contributions of conditioning effects by analyzing the distributions of this ratio for each paradigm by response category (enhancement, suppression, and no effect). The firing rates decrease more from one interval to the next for enhanced points, and increase more for suppressed points, relative to origin-target pairs that did not exhibit conditioning, for all paradigms (Kruskal-Wallis, P < 0.0001). Thus it appears that conditioning effects are superimposed on, rather than determined by, the adaptation profile of each cell.
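The comparison across response categories can be sketched with a hand-written Kruskal-Wallis H statistic. This is a minimal illustration that assumes one list of decay ratios per category; it applies the standard tie correction and, for exactly three groups, uses the chi-square approximation with 2 degrees of freedom, for which the upper-tail probability is exp(-H/2). The function names and the example ratios are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of observations."""
    pooled = [v for group in groups for v in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)
    # Tie correction; undefined (division by zero) if every value is identical.
    tie_term = sum(c ** 3 - c for c in Counter(pooled).values())
    return h / (1.0 - tie_term / (n_total ** 3 - n_total))


def approx_p_three_groups(h):
    """Chi-square upper tail with 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2.0)


# Hypothetical decay ratios for each response category; not data from the study.
enhanced = [0.60, 0.65, 0.70]
no_effect = [0.78, 0.80, 0.83]
suppressed = [0.95, 1.02, 1.10]
h = kruskal_wallis_h([enhanced, no_effect, suppressed])
print(h, approx_p_three_groups(h))
```

With samples this small the chi-square approximation is crude; the sketch is meant only to show the structure of the test, not to reproduce the reported P values.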

Effects of varying level on conditioning

Our exploration of the effects of level was generally limited to tests during the initial search for stimuli that elicited robust conditioning. Once an effective pair of origin and target frequencies had been identified, we typically varied SPL over a 30-dB range in an attempt to maximize the magnitude of the conditioned effect. The level that proved best was then set, and the additional points necessary to construct a conditioned RF were selected. In several instances, particularly in later experiments, we obtained entire conditioned RFs at multiple SPLs. The primary value of these data was to verify the reliability of origin-specific conditioning effects because the shapes of conditioned RFs generally changed gradually as the SPL was varied. Qualitatively, the nature of these changes appeared to be as specific to individual cells as the conditioned RFs themselves. For example, although it was often clear that the magnitude of conditioned enhancement could be limited by response saturation to the target tone, regardless of context, the high incidence of nonmonotonicity in our sample complicates generalizations about the SPL range that saturated the control response. It was possible for the sign of the conditioned effect associated with a particular origin-target pair to change as the SPL was varied over a sufficiently large range, as shown in Fig. 18. Nevertheless, as we have noted, SPLs 10-15 dB below the best SPL for the cell were most likely to be associated with large conditioning effects.



Fig. 18. The sign of conditioned effects (enhancement or suppression) can vary even for a single origin and target pair, as a function of SPL. Shown here are the responses to a glide stimulus modulated from 8 to 2 kHz (filled black histograms) at 70 and 40 dB. Control responses are shown in gray unfilled histograms. Although at moderate levels (40 dB) the response to the 2-kHz target is suppressed when preceded by the 8-kHz origin, the response to the target is enhanced when the same glide stimulus is presented at higher (70 dB) levels.


    DISCUSSION

Is conditioning specific to binaural processing?

Sanes et al. (1998) noted the possibility that time-varying IPD and ILD stimuli might be two instances of a more general class of dynamic stimuli that condition the responsiveness of central auditory neurons. The primary finding of the current study is that this class includes monaural stimuli that do not induce a strong percept of auditory motion and stimuli that involve a discrete rather than continuous change of frequency. Because modulations of IPD and ILD induce the perception of auditory motion, the potential significance of dynamic conditioning has been considered primarily within the context of sound localization and auditory motion sensitivity (McAlpine et al. 2000; Spitzer and Semple 1991, 1993). Our current results suggest that although conditioning has important implications for the processing of dynamic IPD and ILD stimuli, the phenomenon constitutes evidence not of a special sensitivity to auditory motion but rather of a more general sensitivity to acoustic contrast.

The demonstration that monaural frequency sweeps and steps condition the responses of auditory neurons reveals that the phenomenon can be induced without the engagement of the ascending binaural auditory pathway. These results cannot show that Spitzer and Semple's (1998) model, which involves an inhibitory projection from the contralateral dorsal nucleus of the lateral lemniscus, is insufficient for explaining the processing of time-varying IPD in the IC. The fact that conditioning effects elicited by dynamic interaural phase modulation can be observed following the transection of the commissure of the IC and the commissure of Probst (Miko et al. 1999), however, does support such a conclusion. The finding that direct application of inhibitory neurotransmitters by pressure injection could mimic the conditioning effects observed in EI-type IC neurons for modulations of ILD (Sanes et al. 1998) suggests that the mechanisms that subserve conditioning are at least partially intrinsic to the IC and that the engagement of the binaural system is not necessary for conditioning to occur.

The apparent generality of conditioning effects requires explanation. Sanes et al. (1998) hypothesized that any auditory stimulus or experimental manipulation that dynamically alters the balance of excitation and inhibition could elicit conditioning. On the other hand, it was recently argued that the apparent sensitivity of IC neurons to stimulus history might be an epiphenomenon of a nonspecific sensitivity to the history of the neuron's own response to the stimulus (McAlpine et al. 2000). This interpretation implies that the generality of conditioning could be explained by the ubiquity of adaptation in neural systems. The following discussion will focus on the roles that adaptation and synaptic inhibition play in our data and on how sensitivity to stimulus history, in addition to discharge history, impacts our understanding of context sensitivity in the IC.

Mechanisms of conditioning

We propose the following simple conceptual framework for interpreting the structure of conditioned RFs for IC neurons. Cells vary widely in terms of their sensitivity to their own discharge history. This sensitivity determines both the magnitude of the nonselective decrease in the response to a subsequent tone following robust discharges to origins at or near the BF and the limit on apparent response enhancement for the minimum "effective" duty cycle, which occurs whenever the origin frequency is completely outside the cell's excitatory response area. For a neuron with a frequency RF symmetric about the BF, the conditioned RF predicted solely on the basis of response history would be something like the "winged-V" shape depicted in Fig. 13B. Additional effects that are specific to the stimulus, rather than discharge history, shift the responses about this effective duty cycle-determined rate and deform the winged-V structure of the conditioned RF. As our analysis has shown, these effects are quite common on and just outside the borders of the excitatory response area, where they can be readily distinguished. For origins that do elicit spikes from the cell, stimulus-specific effects may be overwhelmed by nonspecific adaptation of excitation in cells that are very sensitive to their own discharge history. When the conditioned RFs show peakedness in a particular region (Fig. 13D), which is fairly common, the position of this peak may indicate a point in the response area where the balance of inhibition and excitation has shifted most strongly in favor of inhibition, as would occur in a strong inhibitory sideband.
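The response-history component of this framework can be made concrete with a toy sketch. Everything in it is an assumption introduced for illustration: the Gaussian frequency RF on a log-frequency axis, the linear scaling of the target response by the normalized origin response, and all parameter values and names. It is not a model fitted to the data.

```python
import math


def frequency_rf(freq_khz, bf_khz=4.0, bandwidth_oct=0.5, peak_rate=50.0):
    """Hypothetical frequency RF: Gaussian in octaves, symmetric about the BF."""
    octaves_from_bf = math.log2(freq_khz / bf_khz)
    return peak_rate * math.exp(-0.5 * (octaves_from_bf / bandwidth_oct) ** 2)


def predicted_target_rate(origin_khz, unadapted_target_rate=40.0,
                          history_sensitivity=0.6, bf_khz=4.0,
                          bandwidth_oct=0.5, peak_rate=50.0):
    """Target response predicted from discharge history alone.

    The target response is reduced in proportion to the normalized response
    to the origin tone; history_sensitivity (0 to 1) is a hypothetical
    parameter for the cell's sensitivity to its own discharge history.
    """
    origin_drive = frequency_rf(origin_khz, bf_khz, bandwidth_oct, peak_rate) / peak_rate
    return unadapted_target_rate * (1.0 - history_sensitivity * origin_drive)


# Conditioned RF across origin frequencies spaced in half-octave steps.
origins_khz = [0.5 * 2 ** (k / 2) for k in range(13)]  # 0.5 to 32 kHz
for origin in origins_khz:
    print(f"{origin:6.2f} kHz -> {predicted_target_rate(origin):5.1f} spikes/s")
```

Under these assumptions the predicted conditioned RF has a trough at the BF and flat, elevated "wings" for origins outside the excitatory response area. Stimulus-specific effects, which the sketch deliberately omits, would appear as departures from this shape.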

Our conceptual model of conditioning phenomena could be considered an "adaptation" model if it is understood that the concept of adaptation embraces adaptation of synaptic excitation in the absence of action potentials and the adaptation of inhibition as well. With our data, we cannot distinguish a postinhibitory rebound based on synaptic effects acting on the membrane potential of the recorded neuron from adaptation occurring in inhibitory afferents to the cell. In fact, there is no reason to believe that the two processes are mutually exclusive. The frequency specificity of many suppressive effects suggests that adaptation specific to the activity of differently tuned afferents, rather than the cell's own discharge history, accounts for the sensitivity to stimulus history in the IC. Finally, instances of profound suppression of the response to targets preceded by origins that elicit little or no response from the cell indicate that the history of the cell's "response" most relevant to conditioning is the record of its synaptic rather than spike-generating activity.

It is our belief that adaptation is no more a mechanism that accounts for conditioning phenomena than conditioning is a mechanism that accounts for context sensitivity in the auditory system. The terms adaptation and conditioning describe how neuronal activity varies with time and context, respectively, and as such are mechanistically ambiguous. The phenomenon of firing rate adaptation in the IC is the result of a number of biophysical mechanisms governing synaptic transmission at multiple stages of the ascending auditory pathway. It has been suggested that some of the contributing mechanisms, acting at both excitatory and inhibitory synapses, include purinergic transmission, activation of metabotropic receptors, modulation of pre- and postsynaptic ion currents, and the regulation of vesicle release through presynaptic protein regulation (Finlayson and Adam 1997). Thus if any of the mechanisms responsible for firing rate adaptation do not depend directly on action potential generation, controlling for the discharge history elicited by a stimulus would not necessarily preclude the involvement of adaptive mechanisms.

The relative prominence of conditioning effects that are not explicable in terms of discharge history reflects deliberate stimulus choices. By choosing a range of origin frequencies that extended beyond the extracellularly defined excitatory response area, but converged on a common target, we were better able to dissociate stimulus and response history than was possible in prior studies (Ingham et al. 2001; McAlpine et al. 2000; Sanes et al. 1998; Spitzer and Semple 1998). Similarly, suppression attributable to adaptation of excitation was limited by the fact that we rarely chose a BF origin and an off-BF target (e.g., Figs. 6 and 10).

Conversely, large-depth, continuously modulated stimuli would be more likely to recruit nonselective adaptive mechanisms. When the depth excursion is sufficiently large, the stimulus will sweep through the peak of the relevant tuning function during the modulation. Thus nonselective adaptive mechanisms, recruited by excitation elicited by the stimulus sweeping through the peak of the tuning function, will dominate the cell's response. Because the stimulus is continually varying, the time during which stimulus-specific effects could accrue due to the presence of stimulus energy at a particular stimulus value is also minimized.

The weight of various mechanisms of conditioning can be altered by the selection of the stimulus; just as changing the stimulus alters the relative dominance of nonspecific and stimulus-specific conditioning in any one cell, any one stimulus will elicit varying degrees of nonspecific and specific effects in the responding population. If the mechanisms underlying sensitivity to stimulus and response history are at least partially nonoverlapping, different cells could be differentially subject to their synaptic and action potential history. The fact that we observed cells whose misaligned frequency RF peak and conditioned RF trough indicate a bias toward stimulus-specific adaptation, as well as cells with elevated but very flat conditioned RFs outside the range of the frequency RF peak, argues for the partial independence of such mechanisms.

How does inhibition contribute to conditioning in IC neurons?

Spitzer and Semple (1993) and Sanes et al. (1998) suggested that conditioning phenomena elicited by dynamic stimuli are generated by changes in the balance of excitation and inhibition. The clearest evidence for the direct involvement of inhibition in conditioning phenomena comes from studies involving the trapezoidal modulation of ILD conducted in the IC of adult (Sanes et al. 1998) and young (Thornton et al. 1999) gerbils. The developmental study revealed that the magnitude and duration of conditioning effects are limited in very young animals and increase as the animals continue to develop (Thornton et al. 1999). The apparent magnitude of conditioning was related to the strength of inhibition evident in the ILD functions of adult animals (Sanes et al. 1998).

The most direct evidence for the sufficiency of inhibition in eliciting conditioning comes from the demonstration that enhancement obtained with the acoustic modulation of ILDs could also be mimicked by the direct application of inhibitory neurotransmitters by pressure injection (Sanes et al. 1998). The contralaterally evoked discharge rate increases dramatically immediately following the offset of a pulse of GABA and/or glycine from the recording electrode. In spontaneously active EI neurons, the response immediately subsequent to a decrease in the SPL of the ipsilateral ("inhibitory") stimulus could be enhanced even in the absence of a contralateral ("excitatory") stimulus, which also suggests the involvement of a postinhibitory rebound. Because the cells typically do not fire while the level at the ipsilateral "inhibitory" ear is high (relative to the unmodulated control value) during the trapezoidal ILD stimulus, the elimination of action potentials, rather than inhibition per se, could result in the conditioned enhancement observed following ILD modulation. Nevertheless, qualitative observations indicated that the magnitude of conditioning remained correlated with the ipsilateral level (i.e., the magnitude of inhibition) over a range of ipsilateral levels that all completely suppressed responses (Sanes et al. 1998). Others have likewise argued that motion-direction sensitivity for dynamic interaural intensity difference stimuli in the auditory cortex of horseshoe bats depends on inhibitory binaural interactions (Firzlaff and Schuller 2001).

The contribution of postinhibitory rebounds and synaptic inhibition to conditioning effects obtained with IPD and FM stimuli is less clear. Modulation toward the peak of the static IPD tuning function may sometimes elicit robust responses at IPDs that are associated with a complete lack of response when presented statically (Spitzer and Semple 1991, 1993, 1998). Recent computational modeling of dynamic IPD processing (Borisyuk et al. 2000) suggests that the inadequacy of a purely adaptive model to account for such responses could be redressed by the addition of a postinhibitory rebound. There is substantial evidence for the role of inhibition in delimiting the frequency response areas of IC neurons (see review by Caspary et al. 1997). In the current study, we often saw evidence of inhibitory sidebands in spontaneously active cells, and the high incidence of nonmonotonic rate level functions overall is also indicative of prominent inhibition. This is not surprising since GABA levels in the IC are among the highest in the brain, and several physiological and functional pharmacological studies implicate inhibition in the shaping of response properties of IC neurons (Faingold et al. 1989; Fuzessery and Hall 1996; Le Beau et al. 1996; Pollak and Park 1993). Instances of conditioned RFs that peak at origin frequencies on or just outside the borders of the response area were consistent with a postinhibitory rebound superimposed on the shift in effective duty cycle for origin frequencies beyond the excitatory response area.

If it's not "just adaptation," then what is conditioning?

In studies of the mechanisms subserving contrast adaptation in visual cortex, Sanchez-Vives et al. (2000) showed that postadaptation suppression is positively correlated with hyperpolarization of the membrane potential and an increase in apparent membrane conductance. They further showed that action potentials are not necessary to generate either the postadaptation hyperpolarization or contrast adaptation in general. Thus discharge history does not fully capture the operation of at least one possible mechanism of contrast adaptation. If the membrane potential can be regulated on the time scale relevant to our experiments, this finding might also explain how remote origin frequencies that do not elicit spikes can nevertheless significantly impact the response to a subsequent tone.

The mechanisms responsible for membrane hyperpolarization must be partially intrinsic to cortical neurons because the mimicry of visually induced contrast adaptation via injection of sinusoidal currents resulted in both a decrement in the response to a high-intensity stimulus and a prolonged period of reduced responsiveness after the stimulus ceased (Sanchez-Vives et al. 2000). Nevertheless, Sanchez-Vives et al. (2000) also encountered a number of cells that showed substantial postadaptation reductions of firing rate but relatively little adaptation of their responses to the injection of high-intensity current. Cells in our sample could also show robust conditioning effects despite little or no adaptation of their responses to either the origin or target tones. The fact that relatively small changes in the resting membrane potential significantly impact responses to visual stimuli (Sanchez-Vives et al. 2000) may help explain the substantial structure of conditioned RFs over ranges of origin frequency that elicit very similar responses. Finally, the authors note that additional presynaptic or network properties may be required to explain the discrepancy between the magnitude of shifts in the contrast response function obtained with an adapting visual stimulus and sinusoidal current injection (Sanchez-Vives et al. 2000). Perhaps analogously, we have suggested that adaptation confined to differently tuned afferent populations is consistent with many aspects of our data. It is possible that some mechanism (e.g., synaptic depression) that can selectively target the synapses utilized by a particular afferent population may be required to account for our results.

Studies in the subcortical auditory system provide evidence of changes in membrane polarization that could contribute to the contextual modulation of IC responsiveness (Rhode and Smith 1986; Smith and Rhode 1985). There is also evidence of subcortical sensitivity to stimulus context not explicable by response history alone, such as facilitation and delay-tuned responses (Finlayson 1999), or probe tone responses better correlated with masker level than the response to the masker (Boettcher et al. 1990; Shore 1995). Other poststimulatory changes in lower auditory nuclei that diverge from the typical recovery from adaptation functions in the auditory nerve and primary-like units of the cochlear nucleus include delayed maximal suppression (Boettcher et al. 1990; Finlayson and Adam 1997) and delayed facilitation (Finlayson and Adam 1997). It has been argued that many of these subcortical deviations from the response-historical model of adaptation of excitation involve inhibition (Palombi et al. 1994; Shore 1995). Perhaps the most direct evidence of inhibition-derived poststimulus changes below the IC is the demonstration that the responses of neurons in the lateral superior olive to binaural BF probe tones were maximal immediately subsequent to the presentation of a 200-ms BF "inhibitory adapting" stimulus to the contralateral ear (Finlayson and Adam 1997).

Implications of conditioning for auditory processing

Our concept of the effective duty cycle for repeating stimuli extends McAlpine et al.'s (2000) notion of "recovery time" for cyclically varying IPD stimuli and is applicable to a very broad range of stimulus paradigms, including those used in studies of FM processing. The effective duty cycle is commonly covaried when sensitivity to the modulation rate of linear FM is analyzed in blocked trials (Mendelson and Grasse 1992; Mendelson et al. 1993; Ricketts et al. 1998; Tian and Rauschecker 1994), confounding interpretation in terms of sensitivity to modulation rate per se. If conditioning effects "establish a new range of acoustic cues to which the neuron responds best" (Sanes et al. 1998), such shifts in sensitivity would clearly impact the processing of sound sequences. McKenna et al. (1989) demonstrated in cat auditory cortical neurons that the responses to a tone occurring within a larger sequence of tones may not be predictable from the responses to the tone presented in isolation and that responses to identical tones can also vary with the serial position of the tone in the sequence. The evidence for sensitivity to stimulus context presented here clearly indicates that response properties relevant to selectivity for sound sequences exist subcortically.

The frequency specificity of conditioning effects has important consequences for the kinds of sequence selectivity that could arise in the IC. Adaptive effects based purely on discharge history imply that the response of a neuron to a given stimulus would be the same when it follows any of the stimuli that trace an isoresponse contour in some parameter space (e.g., frequency and SPL). Sensitivity to stimulus history, however, would allow the neuron to discriminate among such sequences and thus respond in a manner specific to the history of the stimulus per se. Sanchez-Vives et al. (2000) have postulated that the continual regulation of membrane potential, and thus responsiveness, by the history of spike and synaptic potential activity "is likely to have dramatic effects on the spatial and temporal properties of receptive fields in cortical neurons" (p. 4284). Analogously, conditioning effects provide insights into the dynamic properties of the receptive fields of auditory neurons in the IC. The persistence of conditioning effects, particularly for longer stimuli, is evidence that information about stimulus trajectory along some parameter axis (e.g., frequency or ILD) is present for some time in the responses of IC neurons.

The large variability in the responses of auditory neurons to parametrically identical stimuli occurring in different contexts can be taken either as evidence of a neural processing strategy that transforms the representations of those stimuli in the ascending auditory pathway, or as a fundamental constraint on rate-based codes of acoustic stimulus parameters (Spitzer and Semple 1998). The question of what conditioning is for shares much with questions concerning the function of adaptation. The answers depend on the distinction between stimulus and discharge history and between stimulus-specific and nonspecific adaptation. Many of the presumed improvements in the efficiency of the encoding of sensory signals accrue only to stimulus-specific adaptation (Barlow 1990; Muller et al. 1999). A sustained robust response to an unchanging stimulus would convey little information at a relatively high metabolic cost. Rapid stimulus-specific adaptation, however, would allow for the reduction in the response to the ongoing stimulus without compromising the ability of the cell to respond to a novel stimulus, conserving energy, and improving the ability of the cell to signal differences between stimuli with larger changes in firing rate (Muller et al. 1999).

The distinction between stimulus and discharge history is rather artificial for stimuli that reliably elicit a particular response because the response context is derived from the stimulus context. For example, a substantial portion of the negative correlation of the response to the origin and target tones for origin frequencies near the BF, which appears to indicate sensitivity to response history, could reflect the correlation between the stimulus and discharge history because the engagement of the biophysical mechanisms that subserve the subsequent suppression occurs in parallel with, rather than as a consequence of, the generation of action potentials. Clear evidence that conditioning effects depend on discharge history per se would require the demonstration that any stimulus that elicits an equivalent response from the neuron could be substituted for any other without affecting the nature of the subsequent response. Recent studies that have attempted to explain apparent auditory motion sensitivity in terms of discharge history (Ingham et al. 2001) have not disentangled the specific contributions of stimulus and discharge history with such a test. From a perceptual standpoint, however, it is important to emphasize that context is relevant to the processing of all stimuli. Our findings suggest that for the auditory system the "stimulus" is not bracketed by silence, but rather the pip, the sweep, the silence, and their spacing collectively constitute the stimuli differentiated by the responses of auditory neurons.


    ACKNOWLEDGMENTS

We thank I. Miko, B. Scott, J. Rinzel, and A. Borisyuk for helpful comments on an earlier version of the manuscript.


    FOOTNOTES

Address for reprint requests: M. N. Semple, Center for Neural Science, NYU, 4 Washington Place, New York, NY 10003 (E-mail: mal{at}cns.nyu.edu).

Received 28 February 2001; accepted in final form 7 May 2001.


    REFERENCES

0022-3077/01 $5.00 Copyright © 2001 The American Physiological Society