Temporal Representation of Iterated Rippled Noise as a Function of Delay and Sound Level in the Ventral Cochlear Nucleus

Lutz Wiegrebe and Ian M. Winter

Department of Physiology, Centre for the Neural Basis of Hearing, Cambridge CB2 3EG, United Kingdom


    ABSTRACT

Wiegrebe, Lutz and Ian M. Winter. Temporal Representation of Iterated Rippled Noise as a Function of Delay and Sound Level in the Ventral Cochlear Nucleus. J. Neurophysiol. 85: 1206-1219, 2001. The discharge patterns of single units in the ventral cochlear nucleus (VCN) of anesthetized guinea pigs were examined in response to iterated rippled noise (IRN) as a function of the IRN delay (which determines the IRN pitch) and the IRN sound level. Delays were varied over five octaves in half-octave steps, and sound levels were varied over a 30- or 50-dB range in steps of 5 dB. Neural responses were analyzed in terms of first-order and all-order inter-spike intervals (ISIs). The IRN quasi-periodicity was preserved in the all-order ISIs for most units independent of unit type or best frequency (BF). A deterioration of the temporal all-order code was found, however, when the neural response was influenced by inhibition. The IRN quasi-periodicity was also preserved in first-order ISIs for a limited range of IRN delays and levels. Sustained Chopper units (CS) in the VCN responded with very regular ISIs when the IRN delay corresponded to the unit's chopping period; i.e., the unit showed an increased proportion of intervals corresponding to the IRN delay (interval enhancement) relative to an equal-level, white-noise stimulation. This interval enhancement has a band-pass characteristic with a peak corresponding to the chopping period. Moreover, for CS units in rate saturation, the chopping period, and thus the interval enhancement to the IRN, did not vary with level. Units classified as onset-chopper also show a band-pass interval enhancement to the IRN stimuli; however, they show more level-dependent changes than CS units. Primary-like (PL) units also show level-dependent changes in their ability to code the IRN pitch in first-order intervals. The range of delays where PL units showed interval enhancement was broader and extended to shorter delays. 
Based on these findings, it is suggested that CS units may play an important role in pitch processing in that they transform a higher-order interval code into a first-order interval place code. Their limited dynamic range together with the preservation of the temporal stimulus features in saturation may serve as a physiological basis for the perceived level independence of pitch.


    INTRODUCTION

Psychophysically, the pitch perception of complex sounds is relatively independent of presentation level. When the level of a complex harmonic sound is varied from 50 to 80 dB SL, its pitch changes by, at most, 1% of the fundamental frequency (Zwicker and Fastl 1990). This perceptual level independence presents a challenge for physiological studies looking for temporal or spectral correlates of pitch in the mammalian auditory system: spectral models of pitch perception rely on the spectral magnitude information as it is preserved in a rate-place code along the tonotopic neural maps of the auditory system. Temporal models rely on the preservation of the temporal stimulus characteristics in the timing of spikes. Both these approaches have been investigated as to the extent to which they are vulnerable to changes in level (see Langner 1992 for a review).

Spectral encoding in the periphery

The majority of auditory-nerve fibers have a dynamic range of about 40 dB, above which the fiber response saturates (Evans and Palmer 1980; Sachs and Abbas 1974; Winter and Palmer 1990). Across a tonotopic map, saturation decreases the contrast in the harmonic spectrum due to "clipping" of the spectral peaks. Sachs and Young (1979) investigated the spectral representation of vowel formants across a large number of auditory-nerve fibers with BFs covering the whole vowel spectrum. They found that, while fibers with high spontaneous rate were in saturation, fibers with spontaneous rates below 20 spikes per second preserved the spectral vowel envelope. They did not investigate, however, the extent to which the harmonic spectral ripple itself is preserved in a rate-place code; i.e., they investigated a neural correlate of the vowel type but not of its fundamental frequency. In further studies, Miller and Sachs (1984) investigated the encoding of the fundamental frequency of speechlike sounds, and they demonstrated a robust code in a temporal-place representation; however, they did not investigate a rate-place code. It is therefore possible that a rate-place code may exist for the fundamental frequency, particularly if one takes into account the responses of high-threshold, low spontaneous rate auditory-nerve fibers. Recent models based on recordings from the cochlear nerve of the cat and transferred to a human cochlear place map indicate that as many as the first five harmonics are represented in a rate-place code even at reasonably high stimulus levels (Delgutte 1996). Thus it is possible for the fundamental frequency to be signaled to the brain in either a rate-place or temporal-place code, and therefore the responses of cells in the cochlear nucleus (an obligatory synapse for the auditory nerve) are of considerable importance. 
Some neurons in the cochlear nucleus are characterized by a dynamic range much greater than individual auditory nerve fibers; the range can be as much as 80-90 dB. These neurons were classified as onset-choppers (OC) (Rhode and Smith 1986; Winter and Palmer 1995). However, OC units have a mean bandwidth of typically 3 octaves and may even be as wide as 6 octaves (Jiang et al. 1996; Palmer et al. 1996). Thus, although these neurons may be able to represent the level range within which pitch perception is rather stable without saturation, OC neurons appear not to provide sufficient frequency selectivity to represent the harmonic spectrum.

Temporal encoding in the periphery

The temporal coding of the pitch of steady-state vowel sounds, in terms of the synchronization index, appears robust as a function of stimulus level (Miller and Sachs 1984), but it is unlikely that such a code underlies pitch perception as it cannot explain the phenomenon of pitch shift (Schouten 1940). Instead, it has been suggested that inter-spike intervals (ISIs) may provide a better code for the representation of the pitch of complex tones (e.g., Evans 1978; Greenberg 1986; Rhode 1995). However, the monotonically rising input-output functions of auditory-nerve fibers represent a further problem. An increase in discharge rate must inevitably lead to shorter ISIs, thus interfering with an analysis of these intervals in terms of quasi-periodicity associated with the pitch sensation. One possible way to overcome this problem is the processing of higher-order ISIs, an operation equivalent to an autocorrelation of the spike train (Cariani and Delgutte 1996a,b; Shofner 1991, 1999). A stimulus periodicity encoded in first-order ISIs at low stimulus levels may be preserved in higher-order ISIs for higher sound levels. This was confirmed experimentally by Cariani and Delgutte (1996a,b), who found that a neural correlate of pitch in the cat auditory nerve is well preserved in an all-order ISI analysis, whereas a first-order analysis was susceptible to changes in sound level. Whereas this is certainly true for the temporal encoding of pitch in the auditory nerve, this conclusion cannot readily be extended to the cochlear nucleus. In the cochlear nucleus, the neural information provided by the auditory nerve is subjected to different types of temporal and, through the interaction of units with different "best frequencies" (BFs), spectral processing. Concerning temporal processing, onset units were found to accentuate the degree of AM of an acoustic stimulus (Frisina et al. 1990a,b; Kim and Leonard 1988; Kim et al. 1986; Rhode and Greenberg 1994; Wang and Sachs 1994). 
Chopper neurons also accentuate AM, but they do so only for a limited range of modulation frequencies that are in the vicinity of their chopping frequency. Thus chopper neurons have temporal modulation transfer functions (a measure of temporal discharge synchrony as a function of modulation frequency) with a band-pass characteristic (Kim et al. 1990). This distinguishes chopper neurons from primary-like (PL) units or auditory-nerve fibers that typically show modulation transfer functions with a low-pass characteristic and a relatively small degree of temporal synchronization to a sinusoidal modulator.

Most previous studies on temporal neural correlates of pitch focused on the correlation between periodic envelope modulation, as seen in the stimulus waveform, and the periodicity of neural discharge. However, some stimuli do not show a pronounced envelope modulation in the stimulus waveform but nonetheless elicit a clear pitch perception. Examples of such stimuli are random-phase or Schroeder-phase harmonic complexes (the latter are designed to minimize waveform modulation) and iterated rippled noise (IRN) (Wiegrebe and Patterson 1999; Yost 1996a,b).

Rippled noise is generated by adding a delayed copy of a sample of white noise back to itself. Iterating this delay-and-add process generates IRN. In spectral terms, IRN is characterized by a ripple spectrum that, with increasing number of iterations, approximates the spectrum of a harmonic complex. In temporal terms, the iterated delay-and-add process introduces some degree of temporal periodicity that is reflected by a peak in the normalized autocorrelation function of the stimulus at a correlation lag equal to the IRN delay. This peak grows with increasing number of iterations. The height of the peak is a measure of the degree of periodicity of the stimulus. As this peak never reaches a value of one (perfect periodicity) with increasing number of iterations, we use the term quasi-periodicity when describing IRN stimuli. Perceptually, IRN produces a pitch sensation with a pitch corresponding to the reciprocal of the delay and a pitch strength that grows with increasing number of iterations. An example of an IRN waveform with a delay of 8 ms and 16 iterations is shown in the top panel of Fig. 1 and is compared with an example of a white-noise waveform (bottom).
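The delay-and-add process described above can be sketched in a few lines. This is a minimal illustration, not the stimulus-generation code used in the study: the function names, the zero-padded delay line, the Gaussian noise source, and the 20-kHz sampling rate are assumptions made for the example.

```python
import numpy as np

def iterated_rippled_noise(noise, delay_samples, iterations, gain=1.0):
    """Iterate a delay-and-add stage; each stage's output feeds the next stage.

    Assumes delay_samples >= 1. The delayed copy is zero-padded at the start.
    """
    y = np.asarray(noise, dtype=float).copy()
    for _ in range(iterations):
        delayed = np.zeros_like(y)
        delayed[delay_samples:] = y[:-delay_samples]  # shift by the delay
        y = y + gain * delayed
    return y

def normalized_autocorrelation(x, lag):
    """Autocorrelation at `lag` divided by the autocorrelation at lag 0."""
    x = np.asarray(x, dtype=float)
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

fs = 20_000                      # Hz; assumed sampling rate for this sketch
delay = int(round(0.008 * fs))   # 8-ms delay expressed in samples
rng = np.random.default_rng(0)
white = rng.standard_normal(fs)  # 1 s of Gaussian white noise
irn = iterated_rippled_noise(white, delay, iterations=16)
peak = normalized_autocorrelation(irn, delay)  # stays below 1 (quasi-periodic)
```

For a gain of 1 and n iterations, the binomial weights of the cascaded stages give an expected peak height of n/(n+1) for an ideal white-noise input, i.e., 16/17 (about 0.94) for 16 iterations. The estimate from a finite noise sample will scatter around that value.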



Fig. 1. Example of an iterated rippled noise (IRN) waveform generated with a delay of 8 ms and 16 iterations (top) compared with an example of a white-noise waveform (bottom). Although the IRN stimulus produces a strong pitch sensation, its waveform does not reveal a strong periodic envelope modulation.

In a physiological study using rippled noise, Shofner (1991) showed in the chinchilla cochlear nucleus that the rippled-noise delay is well represented in the all-order ISI statistics of PL units. Shofner (1999) confirmed these findings using infinitely iterated rippled noise, i.e., a stimulus type with a much stronger pitch than rippled noise. However, Rhode (1995), using a variety of pitch-producing stimuli, showed that cat cochlear nucleus units represent pitch in their first-order ISI characteristics.

To investigate the representation of the pitch of IRN, we have examined the responses of single units in the guinea pig cochlear nucleus in terms of all-order and first-order ISIs (Winter et al. 1999). Sustained-chopper neurons as well as OC neurons revealed a band-pass temporal tuning based on first-order ISI statistics. This was not only the case for pitch stimuli with highly modulated envelopes such as cosine-phase harmonic complexes but also for pitch stimuli with relatively flat envelopes like random-phase complexes and IRN. While the strong temporal representation of the stimulus envelope by OC units has been well documented by several groups (Kim and Leonard 1988; Kim et al. 1986; Palmer and Winter 1992, 1993; Rhode 1994, 1995; Rhode and Greenberg 1994), to our knowledge the band-pass periodicity selectivity of the OC units had not been previously reported. In addition the OC units showed a range of periodicity preferences to fundamental periods as long as 16 ms (Winter et al. 1999).

In this study, the influence of presentation level is investigated systematically. As argued above, a temporal pitch code based on first-order ISIs could be highly sensitive to changes in presentation level because an increasing level may lead to an increase in discharge rate and thus to decreasing ISIs. This may be more the case for a stimulus with a relatively flat envelope like IRN (cf. Fig. 1, top) than for a stimulus with pronounced modulation like modulated pure tones or a cosine-phase harmonic complex because, due to the flat envelope, firing times can be more evenly distributed.

Given these stimulus features, the encoding of the quasi-periodicity of IRN stimuli over an appreciable range of delays and presentation levels represents a significant challenge for a mechanism of temporal-pitch extraction based on first-order ISIs in the cochlear nucleus (Rhode 1995). Nevertheless, the current results suggest that first-order ISIs of some units in the cochlear nucleus can provide a reliable representation of the pitch of IRN over a wide range of levels. Specifically, sigmoidally saturating, sustained-chopper units show a band-pass temporal tuning that is relatively stable over a 30- to 50-dB level range. The results suggest that a complex, higher-order temporal code of stimulus quasi-periodicity, as it is preserved in the auditory nerve and in PL cochlear nucleus neurons (Cariani and Delgutte 1996a,b; Shofner 1999), may be converted into a first-order temporal code by chopper neurons of the cochlear nucleus. This could be achieved through a facilitatory interaction of the complex periodicity at the neuron's input with the neuron's intrinsic oscillation. In the range of stimulus levels where the rate response saturates, the first-order temporal code provides a reliable, level-independent estimate of the stimulus pitch.


    METHODS

Preparation

The data reported in this paper were recorded from 14 pigmented guinea pigs weighing between 319 and 471 g. Animals were anesthetized with urethan (1.5 g/kg ip), and supplementary analgesia was provided by either operidine (1 mg/kg im) or fentanyl (1 mg/kg im). All animals were given atropine sulfate (0.06 mg/kg sc) as a premedication. Additional doses of urethan and the analgesic were given when required.

The surgical preparation and stimulus presentation took place in a sound-attenuating chamber (IAC). All animals were tracheotomized and core temperature maintained at 38°C with a heating blanket. Following placement in the stereotaxic apparatus, a midline incision of the scalp was made and the skin retracted laterally. The temporalis muscle on the left-hand side of the skull was removed and the bulla exposed. The method of stereotaxic positioning follows that previously reported (Winter and Palmer 1990). The stereotaxic coordinates were identical to those used in previous studies in the ventral and anteroventral cochlear nucleus (Winter and Palmer 1990, 1995), and electrode tracks sometimes coursed their way through the dorsal cochlear nucleus before entering the ventral division. This transition was marked by a change in best frequency of single units. Although data were recorded from units in the dorsal cochlear nucleus, as judged by their stereotaxic position and physiological response type, we have excluded them from the present data set. No histological verification of recording position was undertaken, but the above observations indicate that all the units reported in this paper were recorded from the ventral division of the cochlear nucleus.

The compound action potential (CAP) was monitored with the use of a silver-coated wire placed on the round window of the cochlea. The signal was filtered and amplified (×10,000). The CAP threshold was determined visually (10-ms tone-pip, 1-ms rise/fall time, 10 s⁻¹) at selected frequencies at intervals during the experiment. If thresholds had deteriorated by more than 10 dB and were not recoverable (for example, by removal of fluid from the bulla), the animal was killed by an anesthetic overdose of pentobarbital sodium (ip).

Recording technique

Recordings were made using tungsten-in-glass microelectrodes (Merrill and Ainsworth 1972). Electrodes were advanced by an electronic microdrive (Kopf 650W) through the intact cerebellum in the sagittal plane at an angle of 45°. A wideband noise stimulus was used to locate the surface of the cochlear nucleus and to search for single units.

Stimuli

For the isolation of single units, we used 50-ms bursts of broadband noise at a repetition rate of 4 Hz. For the classification of single units, stimuli were 50-ms pure tones at the unit BF. They were generated at a sampling rate of 200 kHz and low-pass filtered at 40 kHz. The complex stimuli consisted of white noise or IRN with 11 different delays ranging from 1 to 32 ms in half-octave steps, a gain of 1 and 16 iterations. IRN was generated by delaying and adding white noise. When this process was iterated, the output waveform of one delay-and-add stage served as the input to the next stage ["add same" configuration in Yost (1996a,b)]. The IRN stimuli were generated digitally at a sampling rate of 20 kHz using the Tucker Davis System II DSP Board and software delay lines. The stimulus duration was 409.6 ms including 10-ms cosine-squared ramps. Due to the nondeterministic nature of white noise and IRN, 25 samples were generated off-line for each delay and stored to hard disk. The digital stimulus energy was kept constant. To assess the effect of stimulus level, white noise and IRNs with 1 of 11 delays were presented at 7 different levels in 5-dB steps that amounted to a total of 84 different stimulus conditions. In a few examples, the effect of level was assessed over a 50-dB range corresponding to a total of 132 conditions. Each of these conditions was presented 25 times using the 25 different waveforms. For the data collection, the 84 or 132 conditions were presented in a randomized order that, however, was the same for each of the 25 repetitions. The stimuli were presented at a repetition rate of 1 per s, which resulted in an overall measurement duration of 35 or 55 min for the 30-dB range or the 50-dB range, respectively. After D/A conversion, the complex stimuli were low-pass filtered at the Nyquist frequency (10 kHz, TDT FT6) and attenuated (TDT PA4). 
The stimuli were equalized (Phonic PEQ 3600 Equalizer) to compensate for the frequency response of the speaker and coupler before being fed into a Rotel RB971 power amplifier and a custom-made programmable end attenuator (0-75 dB in 5-dB steps). The different stimulus attenuations were set on the PA4; the minimal attenuation required for a specific measurement was set on the end attenuator to optimize the signal-to-noise ratio. The signal was presented over a Radio Shack speaker mounted in a coupler designed for the guinea pig ear (Mike Ravicz, MIT). The stimuli were acoustically monitored with a B&K 4134 microphone attached to a calibrated 1-mm probe tube. For an attenuation of zero, a maximum sound-pressure level of 122 dB SPL for the broadband stimuli was possible; however, this level was never used, and the vast majority of recordings were limited to maximum levels of 102 dB SPL. The frequency response of the set-up was flat within ±3 dB between 100 Hz and 10 kHz. The 30- or 50-dB range of attenuations presented to a specific unit was usually set so that the highest stimulus attenuation (lowest level) was just above the unit's threshold.
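The stimulus grid and measurement durations quoted above follow from simple arithmetic, which the sketch below reproduces. The function and variable names are hypothetical. The delays are computed as exact half-octave steps; the delays actually realized at the 20-kHz sampling rate may have been rounded (RESULTS quotes, e.g., 5.6 ms where the exact half-octave value is about 5.66 ms).

```python
import numpy as np

# 11 delays from 1 to 32 ms in half-octave steps (factor sqrt(2) per step)
delays_ms = 2.0 ** (np.arange(11) / 2.0)

def session_plan(level_range_db, step_db=5, n_waveforms=25, rate_per_s=1.0):
    """Return (number of levels, number of conditions, duration in minutes)."""
    n_levels = level_range_db // step_db + 1   # both end points are included
    n_stimuli = len(delays_ms) + 1             # 11 IRN delays plus white noise
    n_conditions = n_levels * n_stimuli
    duration_min = n_conditions * n_waveforms / rate_per_s / 60.0
    return n_levels, n_conditions, duration_min

plan_30 = session_plan(30)   # 7 levels, 84 conditions, 35 min
plan_50 = session_plan(50)   # 11 levels, 132 conditions, 55 min
```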

Analyses

UNIT ISOLATION AND CLASSIFICATION. On isolation of a unit, the BF was determined audio-visually using pure-tone stimulation of variable frequency and level. Units were classified based on their BF peri-stimulus time histograms (PSTH), the first-order ISI histogram (ISIH) and a regularity analysis (Young et al. 1988). PSTHs were obtained for 250 presentations of the 50-ms BF tones (including 1-ms raised-cosine ramps) at a repetition rate of 4 Hz presented at levels of 20 and 50 dB above unit threshold. Spontaneous rate was determined in a 10-s time window before the PSTHs were measured. Chopper units were subdivided into sustained choppers [CS, coefficient of variation (CV) ≤ 0.35] or transient choppers (CT, CV > 0.35) (see Young et al. 1988).
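The CV criterion for separating sustained from transient choppers can be illustrated as follows. This is a simplified sketch: it pools all ISIs into a single CV (standard deviation divided by mean), whereas the regularity analysis of Young et al. (1988) evaluates interval statistics as a function of time after stimulus onset. The function names and the choice of the sample standard deviation are assumptions for the example.

```python
import numpy as np

def coefficient_of_variation(isis):
    """CV of a set of inter-spike intervals: standard deviation / mean."""
    isis = np.asarray(isis, dtype=float)
    return float(np.std(isis, ddof=1) / np.mean(isis))

def classify_chopper(cv, criterion=0.35):
    """Sustained chopper (CS) if CV <= criterion, else transient chopper (CT)."""
    return "CS" if cv <= criterion else "CT"

# Hypothetical ISIs (ms) of a very regular unit and of a less regular unit
regular = [2.0, 2.1, 1.9, 2.0, 2.05]
irregular = [1.0, 3.0, 1.5, 4.0, 2.0]
labels = [classify_chopper(coefficient_of_variation(x)) for x in (regular, irregular)]
```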

RESPONSE TO IRN AND WHITE NOISE. To quantify the extent to which the quasi-periodicity of the IRN stimuli is reflected in the temporal response characteristics of a unit, first-order and all-order ISIHs were calculated. To determine the influence of stimulus quasi-periodicity on the neural response, ISIHs in response to IRN stimuli were compared with the ISIHs in response to white-noise, and a so-called "interval enhancement" value was calculated: the proportion of intervals with a duration equal to the IRN delay was reduced by the proportion of the same-duration intervals in response to white noise. For example, when a unit was stimulated with IRN with a delay of 4 ms, and 25% of all ISIs were 4 ms, and, with white-noise stimulation, only 20% of the intervals were 4 ms, the interval enhancement was 5%.
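The interval-enhancement measure can be sketched as below. The bin width used to decide whether an interval "equals" the IRN delay, the upper limit on all-order intervals, and all function names are assumptions of this example; the text does not specify them.

```python
import numpy as np

def first_order_isis(spike_times):
    """Intervals between successive spikes."""
    return np.diff(np.sort(np.asarray(spike_times, dtype=float)))

def all_order_isis(spike_times, max_interval):
    """Intervals between all spike pairs (not only neighbors) up to max_interval."""
    t = np.sort(np.asarray(spike_times, dtype=float))
    diffs = t[None, :] - t[:, None]
    return diffs[(diffs > 0) & (diffs <= max_interval)]

def interval_proportion(isis, delay, bin_width):
    """Proportion of intervals falling in a bin centered on the IRN delay."""
    isis = np.asarray(isis, dtype=float)
    if isis.size == 0:
        return 0.0
    return float(np.mean(np.abs(isis - delay) <= bin_width / 2.0))

def interval_enhancement(isis_irn, isis_wn, delay, bin_width):
    """IRN proportion at the delay minus the white-noise proportion."""
    return (interval_proportion(isis_irn, delay, bin_width)
            - interval_proportion(isis_wn, delay, bin_width))

# Worked example from the text: 25% of IRN intervals and 20% of
# white-noise intervals are 4 ms long (the other intervals are arbitrary).
isis_irn = np.array([4.0] * 25 + [10.0] * 75)
isis_wn = np.array([4.0] * 20 + [10.0] * 80)
enhancement = interval_enhancement(isis_irn, isis_wn, delay=4.0, bin_width=0.1)
# 0.25 - 0.20, i.e., an interval enhancement of about 5%
```

The same `interval_enhancement` call applies to first-order and all-order analyses; only the interval list passed in differs.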

To compare temporal response properties between white-noise and IRN stimulation, the white-noise, first-order ISIHs, and the IRN interval-enhancement as a function of IRN delay were fitted with a gamma function of the form
$$gt(t) = a\,t^{(n-1)}\,e^{-2\pi b t} \qquad (1)$$
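Because RESULTS compares the peak positions of the fitted functions, it may help to note where the function in Eq. 1 peaks. The short derivation below assumes a > 0, b > 0, and n > 1; the symbol t_peak is introduced here for illustration only.

```latex
\frac{d}{dt}\,gt(t)
  = a\,t^{\,n-2}\,e^{-2\pi b t}\,\bigl[(n-1) - 2\pi b\,t\bigr] = 0
\quad\Longrightarrow\quad
t_{\mathrm{peak}} = \frac{n-1}{2\pi b}
```

Under these assumptions, the peak position depends only on n and b, whereas a scales the height of the fitted function.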


    RESULTS

This study is based on recordings from a total of 145 neurons; from this group at least 1 complete delay-level map was obtained for 54 units.

CS units

First-order ISIHs of the responses of a CS unit with a BF of 1.25 kHz to IRN stimuli with various delays and attenuations are shown in Fig. 2. The figure only contains the data for half of the IRN delays actually presented; responses for the intermediate delays of 1.4, 2.8, 5.6, 11.2, and 22.4 ms were omitted for reasons of clarity. In response to white noise (WN), the first-order ISIH shows the typical chopper characteristic with a peak at an interval corresponding to the chopping period. In response to IRN with a 4-ms delay, this peak is strongly enhanced compared with the white-noise response. Figure 3, A and B, shows the interval enhancement for this unit as a function of IRN delay and level. Figure 3A shows the interval enhancement based on the first-order ISIHs; Fig. 3B shows the interval enhancement based on the all-order ISIHs. Positive values indicate that the ISIH for IRN contains more intervals corresponding to the IRN delay than the white-noise ISIH. Based on first-order statistics only (Fig. 3A), the unit reveals interval enhancement with a band-pass characteristic. Moreover, this band-pass interval enhancement is relatively level-independent with a peak at 4 ms over a range of 40 dB. Based on an all-order ISIH analysis (Fig. 3B), the band-pass characteristic is lost, and the neuron shows interval enhancement for all delays higher than 2.8 ms (long-pass). Figure 3, C and D, shows interval enhancement for another CS unit with a BF of 0.85 kHz. As in the previous example, the unit shows band-pass interval enhancement. The best delay is at 2.8 ms, and it shifts very little over the 30-dB range tested.



Fig. 2. First-order inter-spike interval histograms (ISIHs) for a sustained chopper (CS) unit with a best frequency (BF) of 1.25 kHz, with IRN delay increasing along the abscissa and sound attenuation increasing along the ordinate. Histograms are only plotted for every other IRN delay for reasons of clarity. The white-noise (WN) response is plotted in the leftmost column. The IRN delay is indicated by an arrowhead in all IRN ISIHs. The shallow peak at an interval around 4 ms in the white-noise response is strongly enhanced when the unit is presented with an equal-energy IRN with a delay of 4 ms. The position of the peak in the white-noise response and the enhancement in response to IRN with delays near this peak do not change significantly when the stimulus presentation level is increased.



Fig. 3. Interval enhancement, expressed as the increase in the proportion of intervals corresponding to the IRN delay, as a function of IRN delay and stimulus attenuation for 2 CS units; the top row shows the same unit as in Fig. 2. The interval enhancement based on 1st-order ISIHs shows a band-pass characteristic with a peak at a delay of 4 ms (A) or 2.8 ms (C). In the all-order interval analysis (B and D), the interval enhancement shows a long-pass characteristic, with positive values for all delays higher than the position of the peak in the 1st-order analysis.

CT units

CT units are distinguished from CS units by their higher CV, which indicates that their temporal discharge is less regular than that of CS units; in addition, the mean ISI in response to pure tones typically rises over the stimulus duration. CT units also differ from CS units in terms of the nonmonotonicity of their input-output functions (Blackburn and Sachs 1990; Winter and Palmer 1990). Whereas CS units show a steep rise over a range of about 20-25 dB and then frequently saturate, the input-output functions of CT units are often nonmonotonic. Interval enhancement as a function of IRN delay and level is plotted in Fig. 4 for a CT unit with a BF of 1.1 kHz (Fig. 4, A and B) and a CT unit with a BF of 1.35 kHz (Fig. 4, C and D). The left and right panels present interval enhancement based on first-order and all-order statistics, respectively. Compared with the CS units shown in the previous figures, the band-pass characteristic of the interval enhancement is less pronounced in these CT units despite their similar BF. Moreover, due to the typically wider dynamic range, interval enhancement based on first-order ISIs is more susceptible to changes in presentation level.



Fig. 4. Interval enhancement as a function of IRN delay and stimulus attenuation based on 1st-order statistics (A and C) and all-order statistics (B and D) for 2 CT units with BFs of 1.1 kHz (top row) and 1.35 kHz (bottom row). The interval enhancement also has a band-pass characteristic, but the tuning is less pronounced and more level dependent than for CS units.

PL units

PL units reveal frequency tuning and temporal discharge patterns similar to those of auditory-nerve fibers (e.g., Bourk 1976; Rhode and Smith 1986; Winter and Palmer 1990). ISIHs for IRN responses of a PL unit with a BF of 0.75 kHz are shown in Fig. 5 in the same format as in Fig. 2. In response to white noise of different levels (1st column), the ISIH for this PL unit shows the typical monotonic decay. In response to IRN stimuli, peaks in the ISIHs corresponding to the IRN delay are found for a broad range of delays between 2 and 8 ms.



Fig. 5. First-order ISIHs of a primary-like (PL) unit (BF = 750 Hz) in response to white noise (WN, leftmost column) and IRNs with various delays and levels plotted in the same format as in Fig. 2. Again, arrowheads indicate the IRN delay. ISIH peaks corresponding to the IRN delay occur for a wider range of delays between 2 and 8 ms. With decreasing attenuation (increasing presentation level) the response to WN and IRN becomes weaker due to inhibitory inputs.

Interval enhancement for first-order ISIs and mean-rate responses as a function of IRN delay and level for two PL units are plotted in Fig. 6. Figure 6, A and B, shows responses of the unit from Fig. 5 with a BF of 0.75 kHz; Fig. 6, C and D, shows responses of a unit with a BF of 2.47 kHz. For both PL units, interval enhancement based on first-order interval statistics (left) does not show such a pronounced band-pass characteristic as seen for the CS units (Fig. 3). Moreover, strong level effects are observed. For the low-BF unit, the interval enhancement decreases with increasing stimulus level for the whole range of delays where the unit reveals some degree of interval enhancement at the lowest stimulus level (attenuation = 60 dB). The loss of interval enhancement with increasing level could be explained by inhibitory input to this PL unit. The presence of inhibition can be visualized by plotting the mean-rate response as a function of IRN delay and sound level in the right panels of Fig. 6. While the 2.47-kHz unit (Fig. 6D) shows a monotonically increasing rate response with increasing level, the 0.75-kHz unit (Fig. 6B) shows a decrease in the mean-rate response with increasing level. Thus it is likely that this unit receives inhibitory as well as excitatory input. The high-BF unit does not show these inhibitory effects.



Fig. 6. First-order interval enhancement (A and C) and mean rate (B and D) as a function of IRN delay and presentation level for 2 PL units with BFs of 0.75 kHz (top row) and 2.47 kHz (bottom row). Both units show interval enhancement over a broad range of delays. Interval enhancement as a function of level changes differently for the 2 units. In the low-BF unit (top row) the interval enhancement decreases with increasing level as opposed to the high-BF unit (bottom row), where interval enhancement increases. The decrease in interval enhancement with increasing stimulus level for the low-BF unit is correlated with a decreasing mean rate (B).

Onset units

The PSTH of onset units is characterized by a high probability of spike discharge at stimulus onset. The tonic response can differ in strength and temporal characteristics. For the current analyses of the temporal discharge characteristics in response to IRN stimuli, an additional inspection of the first-order ISIH of the BF-tone response provides a more reliable means to separate different types of onset units than the PSTH alone. In this analysis, we distinguish between onset-chopper units (OC) and onset units with low-level sustained response (OL) (Godfrey et al. 1975). OC and OL units are typically distinguished based on the BF-tone PSTH. The OC PSTH shows a second (and rarely a 3rd) sharp peak following the onset peak, whereas additional peaks are absent in OL PSTHs (Rhode and Smith 1986; Winter and Palmer 1995). Examples of onset unit responses to pure-tone and IRN stimulation are given in Fig. 7. An OC unit is shown in the left column, and an OL unit is in the right column. The top row shows PSTHs, the middle row shows the corresponding first-order ISIHs, and the bottom row shows interval enhancement as a function of IRN delay and sound attenuation. The PSTH of the OC unit (Fig. 7A) shows a pronounced second peak about 2 ms after the first peak. The PSTH of the OL unit (Fig. 7B) does not show a second peak but a gap following the onset peak in its PSTH. The ISIHs of the units' pure tone responses (Fig. 7, C and D) reveal a chopping of both these neurons in the sustained response and not only the chopping of the OC unit at stimulus onset. The interval enhancement plots (Fig. 7, E and F) reveal a band-pass characteristic not unlike that described for CS units (Fig. 3). As in the CS units, the band-pass characteristic appears to be related to the chopping in the pure-tone, sustained response.



Fig. 7. BF tone PSTHs (top row) and 1st-order ISIHs (middle row) for an onset-chopper (OC) unit (left column) and an onset low-level sustained response (OL) unit (right column) with BFs of 5 and 2.79 kHz, respectively. The OC unit shows the characteristic 2nd peak in the PSTH. This peak is also reflected in the ISIH in a sharp peak at an interval of 2 ms (C). However, both units reveal a chopping in the sustained response (C and D). First-order interval enhancement as a function of IRN delay and level for the 2 onset units is shown in E and F. Both the OC unit (E) and the OL unit (F) show a band-pass characteristic of interval enhancement.

A unit that would be classified as Onset according to the classification scheme used by Winter and Palmer (1995) is shown in Fig. 8. They used the ratio between the magnitude of the onset response and the sustained response, which had to be more than 10 to 1 for a unit to be classified as onset. The PSTH of the unit in Fig. 8A meets this criterion, but the ISIH (Fig. 8B) shows no chopping in the sustained response and instead resembles the ISIH of a PL unit. Thus this unit could also be classified as a primary-like unit with a notch (PN). A clear distinction between OL and PN units can be difficult based on the PSTH (e.g., Rhode 1994). In our sample of units, OL units all show a chopping in the pure-tone sustained response. Thus it is suggested that the ISIH may provide a means to distinguish between OL and PN units.



Fig. 8. BF-tone PSTH (A), 1st-order ISIH (B), and interval enhancement (C) of a unit that would be classified as Onset because of the high ratio of onset to sustained response but that shows ISIH properties suggesting it is not an Onset but a primary-like unit with a notch (PN). In line with the PL features of the ISIH, the unit reveals no level-independent band-pass interval enhancement.

The amount of interval enhancement in response to IRN appears to be related to the shape of the first-order ISIH in response to equal-level white noise. This is illustrated in Fig. 9, where we plotted the position of the interval-enhancement peak as a function of the position of the peak in the white-noise, first-order ISIH. To obtain a reliable estimate of the relation between white-noise and IRN stimulation, a gamma function was fitted to both the white-noise ISIH and the interval-enhancement functions as described in METHODS. Each data point in Fig. 9 is derived from an iso-level recording of the white-noise response and the responses to IRN with 11 different delays and the same level as the white noise. Thus up to seven data points (for 7 different levels) were derived from a single unit. Figure 9 shows that there is a reasonable correlation (R = 0.79) between the peak in the noise ISIH and the position of the interval enhancement peak. This is true despite the fact that, to obtain the interval enhancement, the noise ISIH was subtracted from the IRN ISIHs.
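The subtraction of the noise ISIH from the IRN ISIH can be illustrated with a minimal sketch. The exact definition of interval enhancement is given in METHODS; the version below is only an approximation under stated assumptions: ISIHs are normalized to proportions of intervals, and the enhancement is read out in a small window (the hypothetical parameter `half_width`) around the IRN delay. All function names and the example spike trains are illustrative and are not taken from the recordings.

```python
def first_order_intervals(spike_times):
    """Intervals between consecutive spikes (spike_times sorted, in ms)."""
    return [b - a for a, b in zip(spike_times, spike_times[1:])]


def all_order_intervals(spike_times, max_interval):
    """Intervals between every pair of spikes, up to max_interval (ms)."""
    intervals = []
    for i, t_i in enumerate(spike_times):
        for t_j in spike_times[i + 1:]:
            if t_j - t_i > max_interval:
                break  # spike_times is sorted, so later spikes are further away
            intervals.append(t_j - t_i)
    return intervals


def interval_proportion(intervals, centre, half_width):
    """Fraction of intervals that fall within centre +/- half_width (ms)."""
    if not intervals:
        return 0.0
    hits = sum(1 for iv in intervals if abs(iv - centre) <= half_width)
    return hits / len(intervals)


def interval_enhancement(irn_intervals, noise_intervals, delay, half_width=0.25):
    """Assumed definition: proportion at the delay for IRN minus that for noise."""
    return (interval_proportion(irn_intervals, delay, half_width)
            - interval_proportion(noise_intervals, delay, half_width))


# Toy spike trains (ms): a perfectly regular 4-ms "chopper" response to IRN
# and an irregular response to equal-level noise.
irn_spikes = [0.0, 4.0, 8.0, 12.0, 16.0]
noise_spikes = [0.0, 1.0, 3.0, 6.0, 10.0, 15.0]

enhancement = interval_enhancement(
    first_order_intervals(irn_spikes),
    first_order_intervals(noise_spikes),
    delay=4.0,
)
print(enhancement)
```

Passing the output of `all_order_intervals` instead of `first_order_intervals` gives the corresponding all-order measure with the same read-out.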



Fig. 9. Correlation between the position of the peak in the noise 1st-order ISIH and the interval-enhancement peak. The correlation coefficient is 0.79. Each data point is derived from a comparison of the unit response to white noise with responses to iso-level IRN with all 11 delays. Insets A-D show fits of the gamma function to the white-noise ISIHs (A and C) and to the interval-enhancement functions (B and D) for a CS unit and a Primary-like unit. The insets show that a gamma function is suitable to fit the ISIHs of both unit types. Moreover, the vertical dashed lines in the insets show that there is a good agreement of the noise-ISIH peak and the interval-enhancement peak.

The use of a gamma function to fit the noise ISIH enables us to correlate changes in the fit parameters, n and b, with the unit type from which noise responses were obtained. For this analysis, units with band-pass characteristics as opposed to units with PL, low-pass, characteristics are pooled, i.e., CS, CT, and OC as opposed to PL and PN units. The average values and standard errors for the order parameter, n (Fig. 10A) and the bandwidth parameter, b (Fig. 10B) are plotted for these two coarse classifications in Fig. 10. Again, the number of analyses, N, represents the number of different noise ISIHs obtained (which can be up to 7 for the 7 different levels in 1 unit). Chopping units are characterized by a higher value for n and b. This reflects the occurrence of a mode in the noise ISIH, i.e., a delayed and steep rise followed by a similarly steep drop in the noise ISIH. For units with a PL sustained response, the low values for n and b characterize a noise ISIH with an immediate, steep rise, mostly due to neural refractoriness, and a decay that is well fitted as an exponential with a short time constant, b.
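How n and b shape the fitted function can be sketched as follows. The parameterization used for the fits is defined in METHODS and is not reproduced here; the sketch assumes the common form t^(n-1) exp(-bt), in which b acts as a decay rate, so that a larger b gives a steeper falling slope. Under that assumption the mode lies at (n - 1)/b. The parameter values in the example are purely illustrative and are not fitted values from this study.

```python
import math


def gamma_shape(t, n, b):
    """Unnormalized gamma function t**(n-1) * exp(-b*t); assumes n >= 1, t in ms."""
    if t < 0:
        return 0.0
    return t ** (n - 1) * math.exp(-b * t)


def gamma_peak(n, b):
    """Analytic mode of the assumed gamma shape; 0 for n = 1 (pure decay)."""
    return max(n - 1.0, 0.0) / b


def numeric_peak(n, b, t_max=20.0, step=0.01):
    """Locate the mode on a time grid as a check on the analytic value."""
    times = [i * step for i in range(int(t_max / step) + 1)]
    return max(times, key=lambda t: gamma_shape(t, n, b))


# Illustrative (hypothetical) parameter sets:
# high n and b: delayed, steep rise and steep fall (chopper-like mode)
# low n and b: nearly immediate rise followed by a slow decay (PL-like)
chopper_like = (9.0, 2.0)
pl_like = (1.5, 0.5)

print(gamma_peak(*chopper_like), numeric_peak(*chopper_like))
print(gamma_peak(*pl_like), numeric_peak(*pl_like))
```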



Fig. 10. Average fit parameters, n and b, for a gamma function fitted to the white-noise ISIH. Error bars show means ± SE. Units were coarsely classified into chopper and primary-like units. The order parameter (n, A) as well as the bandwidth parameter (b, B) show significant differences for the 2 unit classes. Both n and b are higher for chopper units. The rising slope of the ISIH is more delayed, but then it rises more steeply with increasing n. The falling slope of the ISIH drops more steeply with increasing b.

To summarize the effect of presentation level, neural responses from OC units are compared with responses of CS units. These two unit types are interesting to compare because both show interval enhancement with a band-pass characteristic suitable to code the IRN pitch in first-order ISIs, but they differ in their dynamic ranges. As the interval-enhancement peak position is correlated with the noise ISIH peak position (cf. Fig. 9), the noise ISIH peak position is plotted as a function of the stimulus attenuation in Fig. 11 for nine CS units (Fig. 11A) and six OC units (Fig. 11B). In general, both unit types show a trend toward an earlier peak position with increasing stimulus level. However, this trend appears to be confined to low presentation levels (attenuations higher than 60 dB corresponding to sound levels lower than 62 dB SPL) for the CS units (Fig. 11A) above which the peak appears level independent. OC units show significant IRN responses only at attenuations lower than 45-40 dB (sound levels higher than about 80 dB SPL). The shift in the peak position with increasing level is more pronounced for OC units (Fig. 11B).



Fig. 11. The position of the noise-ISIH peak as a function of stimulus attenuation for 9 CS units (A) compared with 6 OC units (B). Due to the higher thresholds of OC units, they were typically measured at higher levels (lower attenuations). For OC units, the data curves are tilted to the left, indicating that the noise ISIH peak (and thus the interval enhancement peak, cf. Fig. 7E) shifts toward lower intervals with increasing level. CS units preserve the temporal response characteristics despite being driven into saturation at relatively low levels. This results in rather level-independent interval-enhancement peaks for intermediate and high sound levels.


    DISCUSSION

This study has investigated the temporal representation of IRN in the guinea pig cochlear nucleus as a function of delay and presentation level. Temporal response characteristics of cochlear-nucleus units were analyzed in terms of both first-order and all-order ISIs. Whereas interval enhancement showed a low-pass characteristic for all neuron types in all-order ISIHs, a pronounced band-pass characteristic of interval enhancement was found for both CS and OC units in first-order ISIHs. Moreover, especially for CS units, this interval enhancement appeared to be rather insensitive to changes of presentation level. The peak of the band-pass interval enhancement of the CS units shown in Figs. 2 and 3 changes over the range of levels where the rate response increases. This is the case because an increasing spike rate with increasing level does necessarily lead to shorter ISIs. When driven into saturation, however, the units' temporal firing characteristics still reveal the temporal quasi-periodicity of the IRN stimulus. This results in a stable, first-order temporal pitch code that is level independent in the saturated range. This finding supports the idea that CS units may play an important role in the pitch perception associated with complex harmonic sounds that is not susceptible to level changes.

Responses of CT units to IRN with varying delays and presentation levels differ from those of CS units in that the responses are typically more susceptible to changes in presentation level and that the interval enhancement appears not to have such a pronounced band-pass characteristic. These two differences can be related to response features with pure-tone stimulation, for example, the decrease in discharge regularity as a function of poststimulus time and nonmonotonicity in BF rate-level functions (Blackburn and Sachs 1989; Winter and Palmer 1990).

OC and OL units have a wide dynamic range. As long as a unit's rate response is not yet saturated, the peak of the interval-enhancement functions shifts toward shorter delays with increasing sound level (cf. Fig. 7E). This is the case because first-order ISIs inevitably become shorter when the rate response increases. Assuming a first-order ISI code for pitch in the cochlear nucleus, OC and OL units, like CS units, may provide a conversion of higher-order to first-order intervals, but the wide dynamic range of OC and OL units makes them susceptible to changes in presentation level. Whereas it is believed that CS units project to the inferior colliculus (IC) (Adams 1979; Smith et al. 1993), projection sites of OC units are still unclear, and it may be possible that they act as interneurons in the CN (Joris and Smith 1998).

Comparison with previous studies

Shofner (1991) investigated the temporal representation of rippled noise (RN) in the antero-ventral cochlear nucleus of the chinchilla. He concluded that, while PL units seemed to preserve the RN fine structure, chopper units only code the quasi-periodicity in the stimulus envelope. This was based on the finding that rippled noise that was delayed and added (the gain, g, in the delay-and-add loop equals 1) was coded in the same way as rippled noise that was delayed and subtracted (g = -1). Shofner (1999) investigated the temporal response characteristics of chinchilla cochlear nucleus units in response to infinitely iterated rippled noise (IIRN) with a g of 0.89 or -0.89 in the delay-and-add loop. As for the rippled-noise results, Shofner (1999) concluded that while PL units do preserve the difference between IIRN with positive and negative g, this was not the case for chopper units. IIRN with positive and negative g share the same envelope features but differ in their temporal fine structure (Yost et al. 1998). This argument would be consistent with the idea that chopper units are envelope responders and PL units are driven by fine-structure information. Shofner (1999) showed responses of two PL units with BFs of 0.85 and 4.63 kHz. Whereas the autocorrelation of the low-BF response reflected the stimulus autocorrelation, the high-BF unit's autocorrelation was the same irrespective of the sign of the IIRN gain, as was the case for the 2.43-kHz chopper unit shown in Shofner (1999). Thus it is possible that the encoding of the IIRN temporal properties is more dominated by an effect of BF than by an effect of unit type. This interpretation would also be more in line with the perception of the stimuli. For a fixed delay of 4 ms, the pitch difference between positive and negative g is an octave only when the low harmonics are presented. When the stimuli are high-pass filtered, the pitch difference is much smaller and more on the order of 10% (Wiegrebe, unpublished data), as is observed for rippled noise (Bilsen 1966; Bilsen and Ritsma 1969/1970; Yost et al. 1978). It is therefore possible that chopper units with low BFs may well be capable of preserving differences related to the sign of g in their temporal response properties as far as these are established perceptually. This issue clearly needs to be explored in more detail.
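The role of the sign of g in the delay-and-add loop can be made concrete with a short sketch. It generates rippled noise and IRN from Gaussian noise and compares the normalized waveform autocorrelation at the delay for g = +1 and g = -1. The sketch assumes an "add-same" network, in which the output of each stage is delayed and added to itself; this choice, the function names, and the delay of 40 samples (4 ms at a hypothetical 10-kHz sampling rate) are assumptions for illustration and need not match the stimulus generation used in the studies discussed here.

```python
import random


def iterated_rippled_noise(n_samples, delay_samples, gain, iterations, seed=0):
    """Delay-and-add Gaussian noise `iterations` times (assumed add-same network)."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(iterations):
        # Delay the current signal by padding with zeros at the start.
        delayed = [0.0] * delay_samples + signal[:n_samples - delay_samples]
        signal = [s + gain * d for s, d in zip(signal, delayed)]
    return signal


def normalised_autocorrelation(signal, lag):
    """Autocorrelation at `lag`, normalized by the zero-lag value."""
    mean = sum(signal) / len(signal)
    centred = [s - mean for s in signal]
    energy = sum(c * c for c in centred)
    cross = sum(a * b for a, b in zip(centred, centred[lag:]))
    return cross / energy


delay = 40  # samples; 4 ms at an assumed 10-kHz sampling rate
rn_pos = iterated_rippled_noise(20000, delay, gain=1.0, iterations=1)
rn_neg = iterated_rippled_noise(20000, delay, gain=-1.0, iterations=1)
irn_pos = iterated_rippled_noise(20000, delay, gain=1.0, iterations=4)

r_pos = normalised_autocorrelation(rn_pos, delay)
r_neg = normalised_autocorrelation(rn_neg, delay)
r_irn = normalised_autocorrelation(irn_pos, delay)
print(r_pos, r_neg, r_irn)
```

With g = +1 the waveform autocorrelation has a positive peak at the delay, with g = -1 it has a trough of the same size, and additional iterations strengthen the peak. The sketch operates on the waveform only; it says nothing about the envelope, which is the feature the two gain signs are reported to share.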

Comparison with studies on amplitude-modulated (AM) tones

Frisina et al. (1990a,b) investigated the encoding of AM pure tones in the gerbil cochlear nucleus. They calculated the modulation gain based on the PSTH of the unit response to AM tones of various modulation frequencies and sound levels. The modulation gain, a measure to quantify the synchrony of the firing to the modulator, revealed a low-pass characteristic at low stimulus levels. At higher stimulus levels, chopper units revealed a band-pass characteristic of the modulation gain not unlike the band-pass characteristic of interval enhancement found in the present study with IRN stimulation. In this section, we compare the two measures of synchrony [the PSTH-based measure of Frisina et al. (1990a,b) and the ISIH-based measure used here] and show data from a single CS unit where we obtained responses to both AM tones and IRN. Frisina et al. (1990a,b) determined the modulation gain in the PSTH by dividing the percent modulation of the PSTH by the percent modulation of the stimulus. The percent modulation of the PSTH was calculated by dividing the Fourier component of the PSTH at the modulation frequency by the average of all Fourier components. As the degree of modulation in an IRN stimulus is not known, one cannot calculate the modulation gain for an IRN stimulus. Instead, we directly compare the Fourier component at the modulation frequency of the PSTHs in response to AM tones and IRN. Figure 12A shows the height of the Fourier component of the PSTHs for a CS unit with a BF of 0.99 kHz for AM-tone (- - -) and IRN stimulation (---). The function is low-pass for AM-tone stimulation with a peak at 350 Hz, which indicates a possible transition to a band-pass characteristic if the sound level were increased, as described by Frisina et al. (1990a,b). With IRN stimulation, the height of the Fourier component is very low and shows no systematic dependence on modulation frequency (which equals 1/delay). The same data analyzed in terms of interval enhancement are plotted in Fig. 12B: with this type of analysis, both data sets reveal a clear band-pass characteristic of the temporal tuning properties of the unit. The PSTH-based measure does not work for IRN stimulation because IRN is a nondeterministic stimulus that, in our experiments, was refreshed for each presentation. Thus to the extent that IRN will cause modulation in a frequency channel, the modulation will probably have a different phase with every presentation. Consequently, the PSTH, which is summed over all presentations, does not reveal a possible modulation.
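The phase argument can be illustrated with a toy simulation. A spike generator whose firing probability is modulated at 250 Hz (a 4-ms period) is run over many presentations, once with the same modulation phase on every presentation (as for a deterministic AM tone) and once with a random phase on each presentation (as assumed here for refreshed IRN). The Fourier component of the summed PSTH at the modulation frequency is then compared. The Bernoulli spike model, all parameter values, and the normalization by the total spike count (rather than by the average of all Fourier components) are simplifying assumptions, not the analysis used for Fig. 12.

```python
import cmath
import math
import random


def simulate_psth(mod_freq_hz, n_trials, random_phase, rng,
                  n_bins=1000, bin_ms=0.1, base_prob=0.05, depth=0.8):
    """Sum spikes over trials; each trial's firing probability is modulated."""
    psth = [0] * n_bins
    for _ in range(n_trials):
        # Fixed phase mimics a deterministic stimulus, random phase a refreshed one.
        phase = rng.uniform(0.0, 2.0 * math.pi) if random_phase else 0.0
        for b in range(n_bins):
            t_s = b * bin_ms / 1000.0
            p = base_prob * (1.0 + depth * math.cos(
                2.0 * math.pi * mod_freq_hz * t_s + phase))
            if rng.random() < p:
                psth[b] += 1
    return psth


def relative_fourier_component(psth, mod_freq_hz, bin_ms=0.1):
    """Twice the Fourier magnitude at mod_freq_hz divided by the spike count."""
    total = sum(psth)
    if total == 0:
        return 0.0
    component = sum(
        count * cmath.exp(-2j * math.pi * mod_freq_hz * b * bin_ms / 1000.0)
        for b, count in enumerate(psth))
    return 2.0 * abs(component) / total


rng = random.Random(1)
locked = relative_fourier_component(
    simulate_psth(250.0, 200, random_phase=False, rng=rng), 250.0)
refreshed = relative_fourier_component(
    simulate_psth(250.0, 200, random_phase=True, rng=rng), 250.0)
print(locked, refreshed)
```

In this toy model the single-presentation modulation depth is identical in both conditions; only the summation across presentations differs. An interval-based measure computed within each presentation would not be affected by the phase randomization.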



Fig. 12. A: the height of the Fourier component of the peri-stimulus time histogram (PSTH) plotted as a function of modulation frequency (for AM-tone stimulation, - - -), which equals the delay reciprocal (for IRN stimulation, ---). CS unit, BF = 0.99 kHz. B: the height of the 1st-order ISIH at an interval corresponding to the modulation period for AM tones (- - -) or the delay for IRN (---). The analysis is based on the same recordings as that for A.

First-order ISI characteristics are obviously not confounded with these effects. The comparison of the two analysis types shows that it is not straightforward to compare the data by Frisina et al. (1990a,b) to our data, but the general finding of a band-pass characteristic of the temporal tuning can be found with both AM-tone and IRN stimulation. Frisina et al. (1990a,b) described that the modulation-gain functions for CS units change from low-pass to band-pass with increasing stimulus level. In principle, this can be confirmed with IRN stimulation. First-order interval enhancement for the CS unit in Fig. 3A reveals a broader tuning with a peak shifted toward longer delays at low presentation levels. With increasing level, the band-pass characteristic becomes more pronounced, and the peak becomes fixed at 4 ms.

Possible implications for the modeling of pitch processing in the auditory brain stem

Like PL units, chopper units in the cochlear nucleus preserve the quasi-periodicity information of IRN stimuli with positive gain in the all-order ISIHs. Chopper neurons, however, can be interpreted not only as preserving the quasi-periodicity information but also as the first stage of a temporal processing that renders an autocorrelation, i.e., an all-order ISI analysis, unnecessary. Chopping produces a decrease in the number of intervals that do not correspond to the chopping period and an increase in the number of intervals equal to the chopping period. If this chopping period equals the stimulus quasi-periodicity, the chopper-unit output is strongly locked to the stimulus period even when the stimulus period is not reflected in pronounced periodic envelope oscillations of the stimulus. In an array of chopper units with the same BF but with a range of chopping periods as hypothesized by Frisina et al. (1990a,b) and Kim et al. (1990), the stimulus quasi-periodicity would be represented in an interval place code. Hewitt and Meddis (1994) demonstrated in a computer model of AM sensitivity of single units in the inferior colliculus how such a first-order interval-place code can be converted into a rate-place code in coincidence detector units presumably located in the central part of the inferior colliculus. The current experiments indicate that the suggestions of Hewitt and Meddis (1994) for the coding of AM pure tones could be extended to the case of more complex periodic stimuli where the periodicity is not apparent in the stimulus envelope. In this suggested circuitry, the VCN CS units could provide not only the all-order to first-order conversion but also a level insensitivity of temporal periodicity coding. Due to the relatively high sensitivity and steeply rising input-output functions of CS units with a range of only 20-25 dB (Blackburn and Sachs 1989; Rhode and Smith 1986; Winter and Palmer 1990), their rate response saturates at relatively low levels. Above this saturation level, the first-order interval code is level independent.

It should be recognized that this hypothesis requires a two-dimensional representation of BF and periodicity tuning (Frisina et al. 1990a,b; Kim et al. 1990). However, to date only one study has shown a distribution of best periodicity as a function of BF (see Kim et al. 1990, Fig. 12). They found best periodicities ranging from 100 to 500 Hz in a population of units in the posteroventral and dorsal cochlear nucleus. No single unit type was found to encompass the whole range of best periodicities corresponding to the range of pitches perceived. Interestingly, the range of best periodicities in chopper units varied between 90 and 400 Hz.

While the data reported here do not encompass the range of BFs and temporal periodicities necessary to verify a two-dimensional representation of the two, the current results nevertheless provide physiological evidence in support of Rhode's (1995) suggestion that periodicity pitch in the cochlear nucleus is encoded in first-order ISIs. In contrast to the auditory nerve, this code can be insensitive to changes in presentation level.


    ACKNOWLEDGMENTS

We thank K. Krumbholz and G. Neuweiler for critical comments on earlier versions of this paper. R. Patterson has continually provided fruitful discussions on the perception and encoding of iterated rippled noise.

This study was supported by a research grant from the Deutsche Forschungsgemeinschaft to L. Wiegrebe, the Medical Research Council, and the Wellcome Trust, UK.


    FOOTNOTES

Present address and address for reprint requests: L. Wiegrebe, Zoologisches Institut der Universität München, Luisenstr. 14, 80333 Munich, Germany (E-mail: wiegrebe{at}zi.biologie.uni-muenchen.de).

Received 7 July 2000; accepted in final form 21 November 2000.


    REFERENCES

0022-3077/01 $5.00 Copyright © 2001 The American Physiological Society