Midbrain Combinatorial Code for Temporal and Spectral Information in Concurrent Acoustic Signals

D. A. Bodnar1 and A. H. Bass1,2

 1Section of Neurobiology and Behavior, Cornell University, Ithaca, New York 14853; and  2University of California Bodega Marine Laboratory, Bodega Bay, California 94923


    ABSTRACT

Bodnar, D. A. and A. H. Bass. Midbrain combinatorial code for temporal and spectral information in concurrent acoustic signals. All vocal species, including humans, often encounter simultaneous (concurrent) vocal signals from conspecifics. To segregate concurrent signals, the auditory system must extract information regarding the individual signals from their summed waveforms. During the breeding season, nesting male midshipman fish (Porichthys notatus) congregate in localized regions of the intertidal zone and produce long-duration (>1 min), multi-harmonic signals ("hums") during courtship of females. The hums of neighboring males often overlap, resulting in acoustic beats with amplitude and phase modulations at the difference frequencies (dFs) between their fundamental frequencies (F0s) and harmonic components. Behavioral studies also show that midshipman can localize a single hum-like tone when presented with a choice between two concurrent tones that originate from separate speakers. A previous study of the neural mechanisms underlying the segregation of concurrent signals demonstrated that midbrain neurons temporally encode a beat's dF through spike synchronization; however, spectral information about at least one of the beat's components is also required for signal segregation. Here we examine the encoding of spectral differences in beat signals by midbrain neurons. The results show that, although the spike rate responses of many neurons are sensitive to the spectral composition of a beat, virtually all midbrain units can encode information about differences in the spectral composition of beat stimuli via their interspike intervals (ISIs) with an equal distribution of ISI spectral sensitivity across the behaviorally relevant dFs. Together, temporal encoding in the midbrain of dF information through spike synchronization and of spectral information through ISI could permit the segregation of concurrent vocal signals.


    INTRODUCTION

All species that vocalize face a common problem in auditory scene analysis, namely, the segregation of concurrent signals. When two conspecific senders vocalize at the same time, their signals summate into a single acoustic waveform at a receiver's ear. To discriminate and localize the two signals, a receiver's auditory system must extract information regarding the individual vocalizations from the fused signals. Psychophysical studies in humans show that the segregation of concurrent vowels (multiharmonic signals) is facilitated by small differences in their fundamental frequencies (F0s) (Brokx and Nooteboom 1982; Chalikia and Bregman 1989). However, the actual coding of concurrent signals within the brain and the neural computations used in their segregation remain relatively unexplored. In this study, we examine encoding of the spectral composition of concurrent signals in the midbrain of the plainfin midshipman (Porichthys notatus), a vocal species of teleost fish that routinely encounters concurrent acoustic signals.

Nest-building male midshipman fish produce vocal signals known as "hums" that apparently function to attract females to their nest (Brantley and Bass 1994; McKibben and Bass 1998). Hums are long-duration (>1 min), multiharmonic (2-3 harmonics) signals with an F0 near 100 Hz. Males often congregate in a localized area and vocalize simultaneously; overlapping hums produce acoustic signals with a beat-like temporal structure characterized by amplitude and phase fluctuations at the difference frequency (dF) between the F0s and upper harmonics (Bodnar and Bass 1997a). Within a field population, differences in the F0s of individual hums vary by <= 10 Hz at a given temperature (Bodnar and Bass 1997a). Two-choice phonotaxis experiments show that, when presented with concurrent tones near the F0s of natural hums, midshipman localize and approach an individual tone from a single speaker (McKibben and Bass 1998), indicating they have the neural mechanisms required for signal segregation.

In a previous study in midshipman, we found that neurons in the midbrain's torus semicircularis (homologue of the inferior colliculus) synchronize bursts of spikes to a beat's dF (Bodnar and Bass 1997a). Many midbrain neurons selectively synchronize to specific dFs, and their selectivity overlaps the range of dFs for naturally occurring acoustic beats. These data suggest that the temporal coding of dF information plays a role in the segregation of concurrent signals. However, although dF information alone can indicate the presence of two vocal signals, spectral information about at least one of the beat components is also required for dF information to be utilized in signal segregation.

Neurophysiological studies in mammals demonstrated that the F0s and upper harmonics of concurrent vowels are temporally encoded by the peripheral auditory system via synchronization of afferent spike trains to signal periodicities (Cariani and Delgutte 1996a,b; Palmer 1990). In the ventral cochlear nucleus of cats, the F0s of concurrent vowels are temporally encoded by primary-like and chopper neurons (Keilson et al. 1997). Thus segregation based on temporal coding strategies is supported at these levels of processing. However, to date, no studies have examined the encoding of the spectral composition of concurrent vocal signals in other central auditory structures. In mammals, midbrain studies of the coding of individual acoustic signals such as pure tones or amplitude modulated (AM) signals focused primarily on spatial/spike rate coding of frequency (Ehret and Merzenich 1985, 1988; Langner and Schreiner 1988; Rees and Moller 1983; Rees and Palmer 1989). In fish, temporal coding of pure tones via synchronization was examined and for the most part found to be poor (Crawford 1993; Lu and Fay 1993). Hence it remains unknown whether a temporal code of the spectral composition of concurrent signals is maintained throughout the central auditory system and hence would contribute to signal segregation.

Midshipman auditory afferents, like those in many other vertebrate species, synchronize their spike outputs to the spectral periodicities of an acoustic signal (McKibben 1998; McKibben and Bass 1996). Hence, in the case of pure tone stimuli, a signal's frequency (f) is given directly by an afferent's interspike intervals (ISIs), i.e., f = 1/ISI. Similarly, midshipman afferents synchronize to the individual components of a beat, and thus information about its spectral composition is explicitly encoded by the ISIs of afferents (McKibben 1998; McKibben and Bass 1996). In contrast, auditory midbrain neurons show poor synchronization to pure tone stimuli and the individual components of beats (Bodnar and Bass 1997a) and hence would appear to encode spectral information by some mechanism other than phase locking to the components.

Here, we assess how the spectral composition of beat stimuli may be encoded by auditory midbrain neurons by comparing the spike train responses of units in the torus semicircularis to beat stimuli with the same dF but that differ by one spectral component (Fig. 1A). Thus any differences in spike train responses must reflect differences in the spectral content of the signals. Our results show that, although many units show differences in their spike rates for spectrally different beats, virtually all auditory midbrain neurons show significant differences in their ISI distributions over specific interval ranges. Hence auditory midbrain units could encode the dF of beats via their synchronization to dF and the spectral composition of beats via their spike rates and/or ISIs. This combinatorial code of both dF and spectral information would be sufficient for the segregation of concurrent signals.



Fig. 1. Beat stimuli and spike train analysis. A: schematic of the power spectrum of negative and positive beat dFs (±6 Hz) used in this study. For all stimuli, 1 component was always held constant at 90 Hz, and the other component was varied from 80 Hz (-10 Hz) to 100 Hz (+10 Hz) in 2-Hz increments. B: schematic of spike rate and interspike interval (ISI) measurements used in this study. Spike rate was computed as the average number of spikes during a beat, and ISIs (Delta t) were measured within a beat cycle and not between beats (designated by X between 1st and 2nd beat cycles). Mean spike rates and ISIs were compared between negative and positive dF beats so that the beat stimuli differed in only one spectral component.

Portions of these results appeared in abstract form (Bodnar and Bass 1998).


    METHODS

Animals for physiological experiments were collected from nests (Tomales Bay, CA), housed in either running seawater holding tanks or artificial seawater aquaria at 15-16°C, and maintained on a diet of minnows. Midshipman have two male reproductive "morphs" (Bass 1996). Type I males build and guard nests and have the most extensive vocal repertoires. Type I males produce hums during courtship (see INTRODUCTION) and wider bandwidth, shorter-duration "grunts" (ms scale) during agonistic encounters (Brantley and Bass 1994). Type II males do not build nests or acoustically court females; instead they sneak spawn and, like females, only produce low-amplitude grunts infrequently. In this study, we used 38 type I males for neurophysiological recordings; future studies will assess possible sex differences.

Surgical and recording methods follow those described previously (Bodnar and Bass 1997a). During surgery, animals were anesthetized by immersion in 0.2% ethyl p-amino benzoate (Sigma Chemical, St. Louis, MO) in seawater. After exposure of the midbrain, a plastic dam was attached to the skin surrounding the opening, which later allowed for submersion of the fish below the water surface during recording sessions. Pancuronium bromide (0.5 mg/kg) was used for immobilization, and fentanyl (1 mg/kg) was used for analgesia during electrophysiological recording. Glass micropipettes were used for single unit extracellular recordings, which were amplified and band-pass filtered between 250 Hz and 3 kHz. As in a previous study, recording sites in select cases were verified by iontophoresis of a 5% solution of neurobiotin with subsequent immunohistochemical visualization (Bodnar and Bass 1997a). Custom software (CASSIE, designed by J. Vrieslander at Cornell University) was used for data acquisition; a pattern-matching algorithm within CASSIE was used to extract visually identified single units. Single units were discriminated from multiple ones on the basis of their signal-to-noise ratio and spike shape; single units had distinct large amplitude peaks and fast rise times.

Beat stimuli were synthesized with CASSIE and were composed of two tones (F1 and F2) near the F0s of natural hums. F1 was held constant at 90 Hz, which is close to the characteristic frequency of most auditory midbrain units; F2 differed from F1 by up to ±10 Hz in 2-Hz increments, spanning the range of characteristic frequencies (Bodnar and Bass 1997a). For most units, the intensity level of the beat stimuli was 12 dB above threshold measured for a 90-Hz pure tone; for some units the intensity level was either 6 or 18 dB above threshold. Thresholds for beat and pure tone stimuli generally differed by <= 3-6 dB, with thresholds for beats being lower. Beat stimuli were presented in order of increasing dF, decreasing dF, or random dF. Stimuli were 1 s in duration, and data were collected for 10 repetitions at each dF. A UW30 underwater speaker (Newark Electronics), positioned beneath the fish in a 32-cm diameter tank, was used for delivery of acoustic signals (after Lu and Fay 1993). The speaker's frequency response in water was measured with a Bruel and Kjaer 8103 mini-hydrophone; sound pressure was equalized with CASSIE software. Hydrophone recordings of acoustic stimuli demonstrated that reflections from the tank walls and water surface did not alter the sound pressure waveform of the signals. All experiments were conducted inside a walk-in soundproof chamber (Industrial Acoustics, Bronx, NY).
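The two-tone stimulus set can be illustrated with a minimal Python sketch. This is not the CASSIE implementation: the sample rate, the equal component amplitudes, the zero starting phases, and the names `synthesize_beat` and `beat_envelope` are assumptions made only for illustration.

```python
import math

def synthesize_beat(f1_hz, f2_hz, duration_s=1.0, sample_rate_hz=10000):
    """Sum two equal-amplitude sine tones (assumed amplitudes and phases)."""
    n_samples = int(round(duration_s * sample_rate_hz))
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        samples.append(math.sin(2 * math.pi * f1_hz * t)
                       + math.sin(2 * math.pi * f2_hz * t))
    return samples

def beat_envelope(f1_hz, f2_hz, t):
    """Envelope |2 cos(pi * dF * t)| given by the sum-to-product identity."""
    return abs(2 * math.cos(math.pi * (f2_hz - f1_hz) * t))

# F1 is fixed at 90 Hz; F2 is offset by dF = -10 ... +10 Hz in 2-Hz steps.
F1_HZ = 90
beat_dfs = [df for df in range(-10, 11, 2) if df != 0]
stimuli = {df: synthesize_beat(F1_HZ, F1_HZ + df) for df in beat_dfs}
```

Under these assumptions the envelope repeats at |dF| Hz, so the -6-Hz (90 + 84 Hz) and +6-Hz (90 + 96 Hz) stimuli share the same beat rate while differing in one component.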

Spike train analysis

To assess possible spike train coding of spectral differences in concurrent signals, we compared each unit's spike train responses to negative and positive dF beats. A negative dF beat refers to a two-tone stimulus composed of 90 Hz and (90 - dF) Hz, e.g., a -6-Hz beat = 90 + 84 Hz (Fig. 1A). A positive dF beat refers to a two-tone stimulus composed of 90 Hz and (90 + dF) Hz, e.g., a +6-Hz beat = 90 + 96 Hz (Fig. 1A). For each unit, comparisons were made at five ± dF values (2, 4, 6, 8, and 10 Hz). The two beat stimuli in each comparison had the same dF and differed in only one spectral component. Any differences observed in the spike rate or ISI distributions would therefore reflect differences in the spectral composition of the beats. In this study, we designate a unit as spectrally sensitive when it exhibits a differential response in either its spike rate or ISI distributions for positive and negative beat stimuli, i.e., stimuli having different spectral compositions. The specific representation of individual frequency components by a spike train is currently under investigation.

We measured differences in both spike rate and ISI distributions. Spike rate was measured by calculating the average number of spikes per beat, and a Mann-Whitney U test was used to test for significant differences (P < 0.05). First-order ISIs, that is, intervals between consecutive action potentials, were computed for spike bursts within the beat period only (Delta t, Fig. 1B); ISIs between periods were not included (X, Fig. 1B). To compare first-order ISI probability distributions, we utilized the novel approach of examining the inverse cumulative distribution functions, also known as survival functions (computed with StatView 4.5), of the ISIs of spike train responses to positive and negative dF beats. A cumulative probability distribution (F) shows the probability of event occurrences less than or equal to a designated value (e.g., Delta t, Fig. 1B), whereas the inverse cumulative distribution (1 - F) shows the probability of event occurrences greater than the designated value. Raster plots and poststimulus time histograms of a unit's response to ±6-Hz beats are shown in Fig. 2, A and B. The corresponding ISI probability distributions are shown in Fig. 2, C and D, and the inverse cumulative distributions in Fig. 2E. Qualitative differences can be observed in the standard plots of the ISI probability distributions. However, it is difficult to ascertain the range and extent of these differences. In contrast, the points of divergence in the ISIs are easily observed in the inverse cumulative ISI distributions (Fig. 2E). Thus the range over which the ISIs appear to exhibit sensitivity to differences in the spectral composition of a beat can be directly assessed.
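The three measurements described above (spikes per beat, within-cycle first-order ISIs, and the inverse cumulative distribution) can be sketched in Python as follows. This is an illustrative sketch rather than the original StatView analysis: it assumes that beat cycles are aligned to stimulus onset with period 1/|dF|, whereas the original analysis delimited spike bursts within each beat period, and all function names are hypothetical.

```python
import math

def spikes_per_beat(spike_times_s, beat_df_hz, duration_s=1.0):
    """Average number of spikes per beat cycle over one stimulus repetition."""
    n_beats = abs(beat_df_hz) * duration_s
    return len(spike_times_s) / n_beats

def within_cycle_isis(spike_times_s, beat_df_hz):
    """First-order ISIs between consecutive spikes in the same beat cycle.

    Assumption: cycle k spans [k / |dF|, (k + 1) / |dF|) from stimulus onset.
    Intervals that straddle two cycles (X in Fig. 1B) are discarded.
    """
    period = 1.0 / abs(beat_df_hz)
    times = sorted(spike_times_s)
    isis = []
    for earlier, later in zip(times, times[1:]):
        if math.floor(earlier / period) == math.floor(later / period):
            isis.append(later - earlier)
    return isis

def survival_function(isis, t):
    """Empirical 1 - F(t): fraction of ISIs strictly greater than t."""
    if not isis:
        raise ValueError("at least one ISI is required")
    return sum(1 for isi in isis if isi > t) / len(isis)
```

Evaluating `survival_function` on a common grid of interval values for the -dF and +dF responses gives the two curves whose divergence is inspected in Fig. 2E.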



Fig. 2. Spike train responses and analysis of midbrain responses to beat stimuli. A and B: raster plots and poststimulus time histograms of a unit's responses to ±6 Hz beat stimuli. C and D: first-order ISI probability distributions of responses of this unit to ±6-Hz beat stimuli. E: plot of the inverse cumulative distribution functions of a unit, which does not show significant differences in its mean ISIs for a ±6-Hz beat over the entire range of intervals. The arrows mark the beginning and end of where the 2 plots show a divergence. F: plot of the inverse cumulative distribution functions of the same unit over the intervals marked by the arrows in E. Within this limited range of intervals (fISI), there is a significant difference in the mean ISIs of the spike train responses to a ±6-Hz beat. In all cases, a unit was considered to show fISI spectral sensitivity only if >= 50% of its ISIs fell within the spectrally sensitive range.

For many units (55%), significant differences in cumulative ISI distributions were not observed over the entire distribution at any dF (Fig. 2E). However, distinct regions of divergences in the inverse cumulative distribution functions could be observed (designated by arrows in Fig. 2E). When the cumulative distributions of the ISIs only within this region were compared, there was often a significant difference (Fig. 2F). Thus in our analysis, we considered a neuron to have "ISI spectral sensitivity" when there was a significant difference in mean first order ISIs (P < 0.05, Mann-Whitney U test) between positive and negative dF beat responses within a range of intervals that contained >= 50% of its total ISIs. We refer to these units as having "filtered ISI spectral sensitivity" (fISI). The lower and upper ISI cutoffs for the fISI distributions were chosen to maximize the z-value, and consequently minimize the P-value, of the statistical test.
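The cutoff selection can be illustrated with a brute-force search in Python. This is a sketch under stated assumptions, not the procedure actually used: the Mann-Whitney z-value is computed from the normal approximation without a tie correction, candidate cutoffs are restricted to observed ISI values, and the >= 50% criterion is applied to the pooled ISIs of both responses. The names `mann_whitney_z` and `best_isi_window` are hypothetical.

```python
import math

def mann_whitney_z(x, y):
    """Normal-approximation z for the Mann-Whitney U test (no tie correction)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i, n = 0, len(pooled)
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j for tied values
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mu) / sigma

def best_isi_window(neg_isis, pos_isis, min_fraction=0.5):
    """Return (|z|, low cutoff, high cutoff) maximizing |z|, or None.

    Only windows holding at least min_fraction of the pooled ISIs
    (an assumed reading of the 50% criterion) are considered.
    """
    edges = sorted(set(neg_isis) | set(pos_isis))
    total = len(neg_isis) + len(pos_isis)
    best = None
    for a, lo in enumerate(edges):
        for hi in edges[a:]:
            sub_neg = [v for v in neg_isis if lo <= v <= hi]
            sub_pos = [v for v in pos_isis if lo <= v <= hi]
            if not sub_neg or not sub_pos:
                continue
            if len(sub_neg) + len(sub_pos) < min_fraction * total:
                continue
            z = abs(mann_whitney_z(sub_neg, sub_pos))
            if best is None or z > best[0]:
                best = (z, lo, hi)
    return best
```

For example, with ISIs (in ms) of `[5, 6, 7, 30, 31]` for the -dF response and `[9, 10, 11, 30, 31]` for the +dF response, the search selects the 5-11 ms window, where the two responses are fully separated, over the full range, where the shared long intervals dilute the difference.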

The coefficient of variation (CV) of the ISI distribution (SD normalized by the mean) serves as an indicator of the variability in a spike train's ISIs. Previous studies used the CV as an index of the regularity of a neuron's firing (Blackburn and Sachs 1989; Young et al. 1988). These studies used a more detailed analysis in which the CV was reiteratively computed throughout the stimulus duration. To assess the general overall variability in a unit's ISIs, we measured the CV over the entire data set. For spectrally sensitive units, the CV was computed over the spectrally sensitive range of ISIs.
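As a minimal sketch, the CV can be computed in Python as below. The use of the sample SD (rather than the population SD) and the optional `low`/`high` cutoff arguments, which restrict the calculation to a spectrally sensitive range, are assumptions for illustration.

```python
import statistics

def isi_cv(isis, low=None, high=None):
    """Coefficient of variation (SD / mean) of ISIs, optionally within cutoffs.

    Assumption: the sample SD (n - 1 denominator) is used.
    """
    selected = [v for v in isis
                if (low is None or v >= low) and (high is None or v <= high)]
    if len(selected) < 2:
        raise ValueError("at least two ISIs are required")
    return statistics.stdev(selected) / statistics.mean(selected)
```

A perfectly regular spike train gives CV = 0, and restricting the range, e.g., `isi_cv(isis, low=0, high=20)`, excludes long intervals that would otherwise inflate the SD.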

Synchronization of spikes to the beat dF and beat components was quantified by the vector strength of synchronization, which measures, on a scale from 0 to 1, the accuracy of phase locking to a periodic signal (Goldberg and Brown 1969). The vector strength was computed over each 1-s stimulus repetition, and the mean and SD were computed over the 10 repetitions. For each unit, the vector strength of synchronization was computed for the dF as well as each of the constituent components. A Rayleigh Z-test, based on the mean vector strength and mean number of spikes per repetition, was used to test whether synchronization was significant (P < 0.05) (Batschelet 1981).
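The vector strength and Rayleigh statistic can be sketched in Python as follows. This is an illustration, not the original analysis code: it operates on a single spike train rather than on the mean vector strength and mean spike count across repetitions, and it uses the first-order approximation P ≈ exp(-Z), which is an assumption that is adequate only for reasonably large spike counts.

```python
import math

def vector_strength(spike_times_s, freq_hz):
    """Vector strength (0 to 1) of phase locking to freq_hz."""
    n = len(spike_times_s)
    if n == 0:
        return 0.0
    cos_sum = sum(math.cos(2 * math.pi * freq_hz * t) for t in spike_times_s)
    sin_sum = sum(math.sin(2 * math.pi * freq_hz * t) for t in spike_times_s)
    return math.hypot(cos_sum, sin_sum) / n

def rayleigh_test(spike_times_s, freq_hz):
    """Return (Z, approximate P) with Z = n * VS^2 and P ~ exp(-Z)."""
    n = len(spike_times_s)
    vs = vector_strength(spike_times_s, freq_hz)
    z = n * vs * vs
    return z, math.exp(-z)
```

Passing the beat dF as `freq_hz` tests synchronization to the beat; passing 90 Hz or the second component frequency tests synchronization to the individual components.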


    RESULTS

The results presented represent data collected from recordings of 132 neurons in the midbrain's torus semicircularis. For the majority of units (89%), full data sets were obtained for ± beat dF from 2 to 10 Hz; 9% of the units have data with only one dF value missing, and the remaining 2% have data with two dFs missing.

Spike rate spectral sensitivity

Significant differences in the spike rates of auditory midbrain responses to ± beat dFs were observed in 68% of the units at one or more dFs (Fig. 3A). The distribution of spike rate spectral sensitivity at individual dFs is shown in Fig. 3B; note that any one unit may show spectral sensitivity at more than one dF. Ten percent of the population of units exhibited significant spike rate differences between ±2-Hz beats, 20% at ±4-Hz dFs, and ~35-45% at dFs of ±6, ±8, and ±10 Hz.



Fig. 3. Cumulative data for spike rate spectral sensitivity. A: distribution of units that show significant (gray) and no significant (hatched) differences in the spike rates between ± dF beats at >= 1 dFs. B: distribution of the percentage of units that exhibited spike rate spectral sensitivity at each dF; a unit may exhibit spectral sensitivity to >= 1 dF.

First-order ISI spectral sensitivity

A large number of units was observed to show first-order ISI spectral sensitivity over a limited range of intervals (fISI; see METHODS). Across individual units and different dFs, there was variation in the ranges of fISIs that gave rise to spectral sensitivity. The graphs in Fig. 4 show the inverse cumulative distribution functions of representative examples of the three general ranges of fISI spectral sensitivity observed. Thus a unit may show significant differences over its entire ISI distribution (Fig. 4A, left panel) but much greater sensitivity over a more limited range (Fig. 4A, right panel). Figure 4A shows an example of a unit that exhibited its maximum fISI spectral sensitivity over short ISIs. The units in Fig. 4, B and C, show significant fISI spectral sensitivity over intermediate and long ISI ranges, respectively. Across all observations of spectral sensitivity (n = 322), 23% have their maximum spectral sensitivity over short ISI ranges, 31% over intermediate ISI ranges, 31% over long ISI ranges, and 15% over the entire range of ISIs.



Fig. 4. Examples of the different ranges of fISI spectrally sensitive units observed in auditory midbrain units. The graphs on the left show the inverse cumulative distribution functions and P-values over the entire range of ISIs, whereas the graphs on the right show plots and P-values of the spectrally sensitive range of ISIs (fISI). A: example of a unit that showed significant differences in its ISI over the entire range of ISI; the level of significance improved dramatically when only short ISIs were considered (fISI). B: example of a unit that did not show significant differences in its ISI over the entire range of ISI but that showed significant differences in its fISI at intermediate intervals. C: example of a unit that did not show significant differences in its ISI over the entire range of ISI but that showed significant differences in its fISI at long intervals.

The distributions of the lower and upper ISI cutoffs across all fISI spectrally sensitive units and all dFs are shown in Fig. 5. These distributions show the ISI ranges over which spectral sensitivity is observed across the population. The majority of units had low ISI cutoffs at intervals <15 ms, and there was wider variation in the high ISI cutoffs with a mode in the distribution at 40 ms. This suggests that there are no distinct classes of units with specific ISI cutoffs but instead that there is a continuum of fISI spectral sensitivity ranges within a population.



Fig. 5. Distribution of low ISI cutoffs (solid) and high ISI cutoffs (hatched) for spectrally sensitive units.

Across the population, 97% of the units exhibited fISI spectral sensitivity at one or more dFs (Fig. 6A; as with spike rate, a unit may exhibit fISI spectral sensitivity at more than one dF). For any one dF, there was a relatively even distribution of ~50% of the population of units exhibiting fISI spectral sensitivity (Fig. 6B).



Fig. 6. Cumulative data for fISI spectral sensitivity. A: distribution of units that show significant (shaded) and no significant (hatched) differences in the mean fISIs between ± dF beats at >= 1 dFs. B: distribution of the percentage of units that exhibited fISI spectral sensitivity observed at each dF; as in Fig. 3B, a unit may exhibit spectral sensitivity to more than one dF.

Because differences in spike rate give rise to differences in ISI, the ISI spectral sensitivity observed may actually reflect differences in spike rate. Figure 7 presents a regression plot of the difference in mean ISI versus the difference in spike rate across all units and all dFs (for fISI spectrally sensitive units, the mean ISI was computed over the spectrally sensitive range of ISIs). Although there was a significant relationship between Delta ISI and Delta spike rate (P < 0.0001), differences in spike rate account for only 30% of the variance in ISI differences (Fig. 7).
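The "variance accounted for" in a simple linear regression is the squared Pearson correlation coefficient. A minimal Python sketch of that calculation is given below; the function name `r_squared` and the toy values in the usage note are hypothetical and are not data from this study.

```python
def r_squared(x, y):
    """Squared Pearson correlation: fraction of variance in y explained by x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return (sxy * sxy) / (sxx * syy)
```

Here `x` would hold the differences in spike rate and `y` the differences in mean ISI between ± dF beats, one pair per unit and dF. With the toy input `r_squared([1, 2, 3, 4], [1, 3, 2, 4])`, the result is 0.64 up to floating-point error; a value near 0.30 corresponds to the 30% reported above.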



Fig. 7. Comparison of differences in spike rate and fISI between ± dF beats. For each unit, at every dF, the difference in its mean fISI vs. difference in spike rate between ± dF is plotted and fitted with a regression.

Figure 8, A and B, shows the distributions of CV for all fISIs of spectrally sensitive units across all negative and positive dFs within the spectrally sensitive ranges. The CV characterizes the variability in ISIs across several stimulus presentations. A random Poisson distribution is characterized by a CV = 1; previous studies classified neurons with CVs < 0.5 as exhibiting regularity in their firing (Blackburn and Sachs 1989; Young et al. 1988). The CVs of both negative and positive ISIs for spectrally sensitive units are normally distributed with means at 0.547 ± 0.167 for -dF and 0.538 ± 0.150 for +dF beat; virtually all CVs are <= 1. The distributions of the CVs for all nonspectrally sensitive units across all negative and positive dFs are shown in Fig. 8, C and D, with higher means at 0.787 ± 0.154 for -dF and 0.799 ± 0.144 for +dF beat. For both negative and positive dFs, there was a significant difference (P < 0.0001) in the mean CVs between spectrally and nonspectrally sensitive dFs.



Fig. 8. Coefficients of variation (CV) of ISI distributions. A and B: distribution of CVs of fISI distributions within the spectrally sensitive range for negative (A) and positive (B) dF beats for all cases of spectral sensitivity. C and D: distribution of CVs of ISI distributions for negative (C) and positive (D) dF beats for all cases of nonspectral sensitivity.

dF and spectral sensitivity

In response to beat stimuli, midbrain units have been shown to encode the dF of a beat by synchronization to dF, whereas midbrain units exhibit poor synchronization to the individual components (Bodnar and Bass 1997a). In this study, synchronization to the individual beat components was measured in a representative subset of units (n = 66); only 6 of 66 units (4%) showed significant synchronization to either beat component at any dF, in support of our previous findings (Bodnar and Bass 1997a). Significant synchronization to the beat dF at one or more dFs was observed in 90% of the units (n = 120/132) (Fig. 9A). The distribution of units with significant synchronization at each dF is shown in Fig. 9B. Of the dF-sensitive units, 80% also showed fISI spectral sensitivity, and only 56% showed spike rate spectral sensitivity (Fig. 9C). The histogram in Fig. 9D shows the distribution of units that show combined dF sensitivity and fISI spectral sensitivity (solid bars) or combined dF sensitivity and spike rate spectral sensitivity (open bars) across individual dFs. At each dF, a greater percentage of dF sensitive units exhibit fISI spectral sensitivity than spike rate spectral sensitivity.



Fig. 9. Cumulative data for combined dF sensitivity and spike rate or fISI spectral sensitivity. A: distribution of units that show significant (hatched) and no significant (shaded) synchronization to beat dFs at >= 1 dFs. B: distribution of the percentage of units that exhibited significant synchronization to dF at each dF. C: comparison of the percentage of units that show both dF sensitivity and fISI spectral sensitivity (solid bar) and that show both dF sensitivity and spike rate spectral sensitivity (open bar) at >= 1 dFs. D: distribution of the percentage of units that show both dF sensitivity and fISI spectral sensitivity (solid bars) and that show both dF sensitivity and spike rate spectral sensitivity (open bars) at each dF.


    DISCUSSION

A fundamental signal processing task that must be achieved by the auditory system of vocal species is the segregation of concurrent vocal signals. Understanding how the spectral components of concurrent signals are encoded within the central auditory system is essential to determining the computational mechanisms utilized in this task. The data presented in this study in midshipman fish show that, in response to beats with the same dFs but differing in one spectral component, many auditory midbrain neurons (68%) exhibited significant differences in their spike rates, whereas virtually all (97%) showed significant differences in their first-order ISIs within a specific range of intervals (fISI). Furthermore, although spike rate spectral sensitivity was observed primarily at dFs of 6-10 Hz (Fig. 3B), fISI spectral sensitivity was observed equally across all dFs (Fig. 6B), including dFs of 2 and 4 Hz, which are within the same range of the majority of dFs observed in natural habitats (Bodnar and Bass 1997a). Hence both a spike rate and temporal ISI code of a beat's spectral composition appear to be present.

In a previous study, we found that midbrain neurons in midshipman show significant synchronization to dF (Bodnar and Bass 1997a). This study demonstrates that 56% of the units that exhibited significant synchronization to dF also showed spike rate spectral sensitivity, whereas 80% also showed fISI spectral sensitivity. Together these data indicate that midbrain neurons provide a combinatorial code of the dF and spectral composition of concurrent vocal signals.

Spike rate and temporal encoding strategies

The general view of central auditory encoding of acoustic signals is that the spectral composition of a signal is represented spatially via a spike rate code. Within the auditory midbrain (inferior colliculus) of mammals there is a highly ordered tonotopic representation of frequency based on narrow spike rate tuning curves that serve as a bank of band-pass filters (Merzenich and Reid 1974; Schreiner and Langner 1994; Semple and Aitkin 1979). The frequency resolution of collicular spike rate filters is in close correspondence with behavioral measures of frequency resolution (Ehret and Merzenich 1985, 1988). Spike rate frequency tuning curves and tonotopic representations of frequency were also observed in the auditory midbrain (torus semicircularis) of a range of nonmammalian vertebrates, including fish (Bodnar and Bass 1997a; Crawford 1993; Echteler 1985; Lu and Fay 1993).

The segregation of concurrent vocal signals poses a significant challenge to a spatial/spike rate encoding strategy of spectral composition. In the case of concurrent multiharmonic signals such as human vowels or midshipman hums, the F0s and harmonics may differ by only a few Hertz. Thus the spectral components of the individual vocal signals often fall within the same bandwidth of a neuron's frequency tuning curve and hence remain unresolved by a spatial/spike rate encoding strategy. Thus it remains unclear how the spectral components of the two signals could be segregated based on a spatial/spike rate encoding alone.

Here we examined the possibilities of both spike rate and temporal encoding (ISI) of the spectral composition of beats within the midshipman auditory midbrain. Theunissen and Miller (1995) explicitly defined a temporal encoding strategy as one that is independent of spike rate encoding. Although both spike rate and fISI differences are observed in midbrain neurons in midshipman, fISI encoding of spectral composition is largely independent of spike rate, as only 30% of the variance of fISI differences can be accounted for by spike rate differences. This indicates that any specific mean spike rate still allows considerable variance (70%) in the relative timing of spikes. Hence, systematic shifts in ISI distributions could greatly expand the range of stimulus parameters that are reliably encoded. Furthermore, because more units exhibit fISI spectral sensitivity over all behaviorally relevant dFs, the data suggest that a temporal ISI code of spectral composition of concurrent vocal signals could provide more information about the spectral composition of those signals than a spike rate code alone. However, it is important to keep in mind that spike rate and temporal encoding strategies are not mutually exclusive and may in fact complement and enhance each other. The response of a postsynaptic neuron is likely to be dependent on both the number and timing of incoming spikes during the integration window of the cell. Our emphasis here is not that either a spike rate or fISI is used for neural encoding but rather that the use of a fISI code would markedly increase the ability of the auditory system to segregate concurrent signals, whether alone or in conjunction with a spike rate code.

Differences in the distributions of the fISIs of neurons indicate a difference in the relative timing of action potentials contained within a neuron's spike train output. At this point, it remains unknown whether it is differences in the individual fISIs themselves that serve as the code for a signal's spectral composition or whether the differences in mean ISIs reflect changes in more complex patterns of spike intervals. A computational study explored the possibility of pattern recognition based on temporal patterns of spikes (Hopfield 1995), whereas the potential for the discrimination of sound location based on patterns of actual spike train data was demonstrated in the auditory cortex of cats (Middlebrooks et al. 1994). A more detailed comparison of spike patterns of responses to positive and negative dF beats is required to elucidate whether such a coding mechanism is present in the midshipman's auditory system.

The ISI spectral sensitivity we observed for most units was within a limited range of ISIs. Inclusion of intervals outside this range masked the detection of differences in spike trains by statistical tests. ISIs outside the spectrally sensitive range may represent either noise or they may encode other features of the stimulus. In response to beats, the midshipman auditory system apparently transforms a periodicity code of the spectral components of the beat into a combinatorial code of dF (Bodnar and Bass 1997a) and spectral composition (this study). It is possible that, to create a temporal code of dF within the brain, some noise is introduced into the temporal encoding of the spectral composition. Hence postsynaptic cells would need to filter this spike train noise to reliably extract spectral information, i.e., produce responses only to spectrally sensitive ISIs. The low and high ISI cutoffs of midbrain units indicate the ranges of temporal filtering necessary for enhanced discrimination of ±dF beats by postsynaptic cells. The majority of units had low ISI cutoffs at intervals <15 ms, and there was wider variation in the high ISI cutoffs with a peak in the distribution at 40 ms.
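The temporal filtering described above can be illustrated with a minimal Python sketch that keeps only the interspike intervals falling between a low and a high cutoff. The function name `isis_in_window`, the spike times, and the cutoff values (10 and 40 ms) are hypothetical choices for illustration; they are not the cutoffs of any recorded unit, and the sketch is not the analysis procedure used in this study.

```python
def isis_in_window(spike_times_ms, low_ms=10.0, high_ms=40.0):
    """Return the ISIs (ms) between successive spikes that lie in [low_ms, high_ms].

    low_ms and high_ms are hypothetical cutoffs chosen for illustration only.
    """
    # Intervals between each spike and the next one in the train.
    isis = [t1 - t0 for t0, t1 in zip(spike_times_ms, spike_times_ms[1:])]
    # Discard intervals outside the assumed spectrally sensitive range.
    return [isi for isi in isis if low_ms <= isi <= high_ms]


# Hypothetical spike train (ms): successive intervals are 5, 20, 30, and 65 ms.
spikes = [0.0, 5.0, 25.0, 55.0, 120.0]
print(isis_in_window(spikes))  # the 20- and 30-ms intervals are retained
```

A postsynaptic cell that responded only to intervals inside such a window would, in effect, ignore the shortest and longest intervals of the input spike train.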

Several studies have demonstrated roles of synaptic mechanisms (Casseday et al. 1994; Covey et al. 1996), dendritic structure (Carr and Konishi 1990; Carr et al. 1986b; Rose and Call 1993), and voltage-dependent conductances (Brew and Forsythe 1995; Fortune and Rose 1997; Perney and Kaczmarek 1997) in the temporal filtering of spike inputs. Hence neural mechanisms clearly exist that would permit appropriate filtering of spectrally sensitive ISIs.

Alternatively, ISIs outside the spectrally sensitive range may encode other parameters of the stimulus waveform. In this study, we held stimulus depth of modulation, intensity, and duration constant. Increases in spike rates in response to stimulus intensity level have been well documented within the auditory system. In addition, several studies found units within the auditory midbrain that are sensitive to changes in stimulus duration (Casseday et al. 1994; Covey et al. 1996; Ehrlich et al. 1997; Feng et al. 1990; Fuzessery 1994; Gooler and Feng 1995). Thus changes in these parameters may produce changes in the ISIs of midbrain neurons. If midbrain neurons encode different stimulus parameters within different ISI ranges, changes in stimulus depth of modulation, intensity, or duration should not influence ISIs within the spectrally sensitive range, but instead produce changes in the ISIs outside this range. Changes in stimulus intensity and duration can affect the synchronization of midbrain units to dF (Bodnar and Bass 1997b). Studies are currently in progress to examine how these stimulus parameters may influence temporal coding of the spectral composition of a signal.

Although an ISI code would appear to increase the amount of information about the spectral composition of a signal within a spike train, the question remains whether it is actually utilized in concurrent signal segregation. The data presented here make specific predictions about the ability of midshipman to segregate concurrent signals, depending on whether they use spike rate or temporal encoding of spectral composition. Because a greater percentage of midbrain units shows spike rate spectral sensitivity primarily at large dFs, signal segregation based on this code alone predicts that midshipman can reliably segregate only signals with larger dFs. In contrast, because midshipman show similar ISI spectral sensitivity across all dFs, signal segregation that utilizes the ISI temporal code predicts they can segregate concurrent signals equally well at small and large dFs. These predictions can be directly tested in two-choice behavioral phonotaxis studies. Behavioral studies have shown that midshipman can reliably segregate signals with dFs of 10 Hz (McKibben and Bass 1998). Preliminary data from that study suggest they also have the ability to segregate at smaller dFs of 2-5 Hz that are prevalent in a field population (Bodnar and Bass 1997a). A behavioral study to test the reliability of signal segregation at smaller dFs is currently in progress.

Comparisons with other species

In a different sensory modality, namely the electrosensory system, some weakly electric fish produce a quasisinusoidal electric organ discharge (EOD) nearly continuously for use in social communication and electrolocation (Heiligenberg 1991). When two fish with EODs whose F0s differ by a few Hz are near each other, their signals produce a beat waveform analogous to acoustic beats with phase modulations and AM at the dF. To avoid "jamming" of its signal, a fish will adjust its EOD away from that of its conspecific (Bullock et al. 1972; Heiligenberg 1991; Kawasaki 1993). Hence electric fish face a similar problem in segregating two incoming signals, except that one of the signals is typically the fish's own.
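For readers less familiar with beats, the following Python sketch shows why the sum of two tones is amplitude modulated at the dF. It relies only on the identity cos a + cos b = 2 cos[(a + b)/2] cos[(a - b)/2] and assumes two equal-amplitude cosine tones that start in phase, which is a simplification of real EODs or hums. The function name `beat_envelope` and the example frequencies are hypothetical.

```python
import math


def beat_envelope(f1_hz, f2_hz, t_s):
    """Envelope of cos(2*pi*f1*t) + cos(2*pi*f2*t) at time t_s (seconds).

    Assumes equal-amplitude tones in phase at t = 0 (illustrative only).
    """
    df = f2_hz - f1_hz
    # The slow factor 2*cos(pi*dF*t) bounds the fast oscillation;
    # its absolute value repeats once every 1/dF seconds.
    return abs(2.0 * math.cos(math.pi * df * t_s))


# Hypothetical pair of tones: 100 and 104 Hz, so dF = 4 Hz (period 250 ms).
for t in (0.0, 0.125, 0.25):
    print(t, round(beat_envelope(100.0, 104.0, t), 6))
```

The envelope is maximal at 0 and 250 ms and falls to essentially zero at 125 ms, i.e., one full AM cycle per 1/dF.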

In gymnotiforms, the beat waveform is encoded by two types of peripheral receptors that encode either the phase modulations (T-type) or the AM (P-type) of the beat (see Carr 1993; Heiligenberg 1991). Within the midbrain, the general spectral composition of a beat is contained within the intervals between phase-locked spikes of giant cells of lamina 6 of the torus semicircularis (Carr et al. 1986a). The sign of dF (positive or negative) relative to an animal's own signal is determined on the basis of the comparison of differential phase modulations across an animal's body in conjunction with differential AM and dF information derived from parallel pathways (Heiligenberg and Bastian 1984; Heiligenberg and Rose 1985; Rose and Heiligenberg 1986; see also Guo and Kawasaki 1997 for gymnarchid electric fish).

To date, only a few other studies examined the coding of concurrent vocal signals. In the eighth nerve of guinea pigs and cats, fibers temporally encode the F0s and upper harmonics of concurrent vowels via synchronization to the periodicities in the stimulus waveform (Cariani and Delgutte 1996a,b; Palmer 1990). Within the ventral cochlear nucleus of cats, again the F0s of concurrent vowels are temporally encoded by both primary-like and chopper units via phase locking (Keilson et al. 1997). Thus a temporal code of the F0s of concurrent vowels is present in mammals at both primary afferent and central medullary levels.

Thus far no studies examined the coding of concurrent vocal signals within the auditory midbrain. However, a number of studies in a wide variety of species examined the coding of AM signals (Condon et al. 1994, 1996; Gooler and Feng 1995; Langner 1983; Langner and Schreiner 1988; Rees and Moller 1983; Rees and Palmer 1989; Rose and Capranica 1985). Inferior colliculus units exhibit tuning in their spike rates and vector strength of synchronization to the rate of AM (Fmod). Among mammals, strong synchronization to envelope modulations is most prominent at lower AM rates (Langner and Schreiner 1988); changes in the carrier frequency (Fc) of the AM signal, and hence its spectral composition, can produce changes in both a unit's phase coupling to the stimulus envelope and its spike rate at a given Fmod (Langner 1983; Langner and Schreiner 1988). Schreiner and Langner (1988) found a topographical representation of Fmod orthogonal to the tonotopic representation of Fc and hence discussed a combinatorial spatial code for Fmod and Fc. However, the fact that midbrain neurons synchronize to Fmod and exhibit Fc-related changes in spike rate indicates that individual neurons can also provide a combinatorial code of the Fmod and spectral composition of an acoustic signal.
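Vector strength of synchronization is commonly computed as the length of the mean resultant vector of spike phases relative to the modulation cycle. The Python sketch below follows that common definition; whether it matches the exact computation used in each of the studies cited above is an assumption, and the function name `vector_strength` and the spike times are hypothetical.

```python
import math


def vector_strength(spike_times_s, freq_hz):
    """Mean resultant length of spike phases at freq_hz (0 = none, 1 = perfect locking)."""
    if not spike_times_s:
        return 0.0
    # Phase of each spike within the modulation cycle, in radians.
    phases = [2.0 * math.pi * freq_hz * t for t in spike_times_s]
    c = sum(math.cos(p) for p in phases)
    s = sum(math.sin(p) for p in phases)
    return math.hypot(c, s) / len(phases)


# Hypothetical spike trains evaluated at a 4-Hz modulation rate.
locked = [0.0, 0.25, 0.5, 0.75]    # one spike at the same phase of every cycle
opposed = [0.0, 0.125]             # two spikes half a cycle apart
print(vector_strength(locked, 4.0), vector_strength(opposed, 4.0))
```

Spikes that always fall at the same phase give a value near 1, whereas spikes spread evenly around the cycle cancel and give a value near 0.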

The effects of changes in Fc on the ISIs within AM phase-coupled spike bursts were not examined. However, in a recent study in guinea pigs, the responses of many collicular units to pure tones exhibited a high degree of regularity in their ISI distributions as measured by their coefficient of variation (Rees et al. 1997). In addition, data from a preliminary study suggest that there is a correlation between a unit's regularity of firing and its response to AM signals (Sarbaz and Rees 1996). It was therefore proposed that the temporal patterns of regularly firing neurons in the inferior colliculus may encode acoustic information. Thus, as in midshipman, temporal encoding of the spectral composition of concurrent signals may also be present within the central auditory system of other vertebrates.
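The regularity measure mentioned above, the coefficient of variation (CV) of the ISI distribution, is the standard deviation of the ISIs divided by their mean. A minimal Python sketch follows; the function name `isi_cv` and the spike times are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption, not a detail taken from the cited study.

```python
import statistics


def isi_cv(spike_times_ms):
    """Coefficient of variation (SD / mean) of the ISIs of a spike train."""
    isis = [t1 - t0 for t0, t1 in zip(spike_times_ms, spike_times_ms[1:])]
    if len(isis) < 2:
        raise ValueError("need at least three spikes to estimate the CV")
    # Sample standard deviation is assumed here for illustration.
    return statistics.stdev(isis) / statistics.mean(isis)


regular = [0.0, 10.0, 20.0, 30.0]    # identical 10-ms intervals
irregular = [0.0, 5.0, 20.0, 25.0]   # intervals of 5, 15, and 5 ms
print(isi_cv(regular), isi_cv(irregular))
```

A perfectly regular train has a CV of 0, and the CV grows as the intervals become more variable.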

Behavioral studies of concurrent vocal signal segregation

Behavioral studies of animal responses in various psychoacoustic tests can provide essential insights into the neural mechanisms that may underlie auditory processing problems. The segregation of concurrent signals is the ability to hear out two overlapping signals. The fundamental components of this process are the ability to detect the presence of two signals, to discriminate or perceive a difference between the individual signals, and to localize each signal. The ability to identify individual signals is not a necessary part of the basic segregation task, although segregation is an essential prerequisite to signal identification.

Two general types of behavioral tests are used in assessing the auditory perception of nonhuman animals: conditioning and phonotaxis paradigms. In conditioning paradigms, animals are conditioned with one signal and then presented with a test signal; changes in a behavioral or physiological response indicate whether they can discriminate between the conditioning and test signals. These experiments generally assess detection and discrimination processes.

In phonotaxis experiments, animals are presented with signals from loudspeakers at a distance and allowed to approach a speaker at will. A choice is scored when an animal approaches a speaker and indicates the ability to detect and localize a signal. One-choice experiments assess the response specificity of an animal, whereas two-choice experiments assess preferences for particular signal features, e.g., frequency. The presence of response specificity indicates the presence of a signal identification process, i.e., animals are not simply approaching any acoustic signal, but rather certain signal features must be present for it to be attractive. The presence of signal feature preferences suggests signal identification at the level of the feature being tested; i.e., to exhibit a preference for a particular signal feature, an animal must not only discriminate it as different from another signal, but also must be able to identify some particular feature of the signal that makes it more attractive.

Two recent behavioral studies examined the auditory capabilities of fishes in the processing of concurrent signals. Goldfish were recently shown to discriminate between single tones and beats by using a generalization conditioning paradigm where responsiveness is measured by changes in respiratory rate (Fay 1998). The ability to discriminate is dependent on both the frequency of the conditioning tone and the difference frequency of the two-tone test signals. For example, in the case of a low-frequency conditioning tone, 200 Hz, goldfish discriminated a 4-Hz beat (200 + 204 Hz) 60% of the time. Although these experiments indicate that goldfish can discriminate between single tones and beats, it remains unclear whether they can segregate, i.e., hear out and discriminate between the two tones comprising a beat signal.

In midshipman, recent phonotaxis experiments assessed the response specificity and preferences of females for midshipman vocal signals of varying parameters (McKibben and Bass 1998). In one-choice phonotaxis experiments, females routinely approached speakers emitting hum-like signals but did not approach speakers when either white noise or grunt-like signals were broadcast. In one series of two-choice phonotaxis experiments, females were presented with concurrent hum-like signals with difference frequencies of 10 Hz from two separate loudspeakers. In these experiments, females unambiguously went to one speaker or the other, demonstrating the ability to segregate and discriminate between concurrent signals. In addition, females exhibited significant preferences for particular frequencies. In another series of two-choice experiments, midshipman were also shown to be able to discriminate between single, hum-like tones and 5-Hz beats composed of hum-like tones.

Here we have shown that differences in spike rate or the ISIs of auditory midbrain units encode differences in the spectral composition of beats and thus provide evidence of a neural substrate that is likely to contribute to the segregation of concurrent vocal signals in midshipman. The behavioral results described previously suggest that midshipman can segregate concurrent signals, that is, discriminate and localize the individual vocal signals, and also appear to identify individual signals on the basis of frequency. The latter suggests the presence of additional coding and computational mechanisms beyond those assessed here.

Signal identification could arise from two possible mechanisms: dF sign selectivity or frequency recognition. Under a dF sign-selectivity mechanism, one component would be explicitly coded and serve as a reference signal, whereas the sign of the difference frequency would enable the computation of the frequency of the other signal. This mechanism is analogous to that utilized by electric fish in computing differences among their EODs (Heiligenberg 1991). In contrast, a frequency recognition mechanism predicts that each individual frequency component should be represented simultaneously within the auditory system. Determination of the neural mechanisms utilized by midshipman in identifying individual signals requires the comparison of the spike train responses between beats composed of different constituent tones (e.g., 90 Hz ± dF vs. 100 Hz ± dF) and between beat stimuli and their individual frequency components. Spike train analyses of such data are currently underway and will be presented in a forthcoming article.

Model for the combinatorial temporal coding of concurrent vocal signals

The mechanisms that underlie the segregation of concurrent vocal signals can in part be elucidated by examining the coding of concurrent signals and their transformations at different levels within the auditory system. A summary of the neurophysiology known to date about the temporal coding of concurrent vocal signals in the midshipman auditory system is shown in Fig. 10A. In the peripheral auditory system of midshipman, auditory afferents exhibit low synchronization to the dF of a beat stimulus but high synchronization to the individual components, i.e., the F0s, of a beat (McKibben 1998; McKibben and Bass 1996). Hence the F0s of the beat's components are encoded by corresponding periodic patterns of ISIs in afferent spike trains. In contrast, auditory midbrain neurons encode the dF of a beat by synchronizing to modulations in the stimulus waveform (VSdF) (Bodnar and Bass 1997a), whereas the information about spectral composition can be encoded by patterns of ISIs within a beat period (this report). Thus the afferent periodicity code of beat components is apparently transformed into a combinatorial temporal code of dF and spectral composition.



Fig. 10. Models for the temporal coding of concurrent vocal signals. A: schematic summary of the neurophysiological data for the coding of concurrent vocal signals within the midshipman auditory system. Auditory afferents temporally encode the individual components of beat signals via synchronization to their periodicities and thus encode the F0s via ISIs (McKibben 1998). Auditory midbrain neurons synchronize spike bursts to the dF of beats (VSdF) (Bodnar and Bass 1997a) and show differences in their fISIs within a beat period for ± dF beats (this study). Hence midbrain neurons provide a combinatorial temporal code of dF and spectral composition of concurrent signals. B: schematic of a model for the combinatorial coding of F0 and dF information; this model follows that proposed by Langner (1983) for the coding of Fc and Fmod of amplitude modulation signals (see text). Under this model, 2 populations of units in the medulla, one that synchronizes to an F0 of the beat (top left) and the other that synchronizes to dF (bottom left), converge onto a coincidence detector (shaded box), which fires only when spikes from both units arrive simultaneously (black); other spikes (shaded) do not elicit a response. The resulting output is a spike train in midbrain neurons that synchronizes bursts of action potentials to dF, and information about F0 is contained within the ISIs of spikes occurring within a beat period (right).

The midbrain combinatorial code of dF and spectral composition of beat stimuli may arise via a coincidence mechanism similar to the one proposed by Langner (1983) to explain the phase coupling of midbrain responses to AM at different carrier frequencies (Fc). Under this model, responses of midbrain neurons arise from the coincidence of inputs from two parallel pathways, one in which spike activity is phase coupled to modulations in the signal's envelope and the other in which activity is coupled to the carrier frequency, giving rise to the combined coding of Fmod and Fc. In the case of concurrent signals, the parallel pathways would temporally encode the F0s and dFs of beat signals (Fig. 10B). Hence this model predicts that two populations of neurons should be present in the medulla of midshipman, one population that like afferents exhibits strong synchronization to the individual components of a beat and a second population that strongly synchronizes to modulations in the signal's envelope. Midbrain neurons then serve as coincidence detectors that receive convergent inputs from these two populations and elicit action potentials that are both phase locked to dF and contain information about the F0s within their ISIs, as is observed in midshipman fish. Consequently, periodicity coding of the individual components of a beat in the periphery would give rise to combinatorial coding of the F0s and beat modulations (dF) within the brain. These mechanisms for the coding of concurrent acoustic signals are likely to be highly conserved and comparable with those used by other vertebrates, as shown for other auditory coding mechanisms in teleosts (Fay 1993).
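The coincidence scheme outlined above can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not a model fitted to the data: one hypothetical input fires once per F0 cycle, the other fires a short burst once per dF cycle, and the detector emits a spike whenever an F0-locked spike arrives within a fixed window of a dF-locked spike. The function name `coincidence_output`, the window width, and all spike times are invented for the example.

```python
def coincidence_output(f0_spikes_ms, df_spikes_ms, window_ms=1.0):
    """Return the F0-locked spikes that coincide with a dF-locked spike.

    window_ms is a hypothetical coincidence window chosen for illustration.
    """
    return [t for t in f0_spikes_ms
            if any(abs(t - u) <= window_ms for u in df_spikes_ms)]


# Hypothetical inputs: F0 = 100 Hz (one spike every 10 ms for 500 ms) and
# dF = 4 Hz (a 40-ms burst at the start of each 250-ms beat period).
f0_input = [10.0 * k for k in range(50)]
df_input = [250.0 * cycle + 5.0 * j for cycle in range(2) for j in range(9)]

output = coincidence_output(f0_input, df_input)
print(output)
```

In this toy case the output consists of two bursts that begin 250 ms apart (1/dF), and the intervals within each burst are 10 ms (1/F0), so the burst timing carries dF while the ISIs within a beat period carry F0.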


    ACKNOWLEDGMENTS

The authors thank M. Marchaterre for anatomic assistance and logistic support in the field, A. Mason and J. McKibben for helpful comments on early versions of the manuscript, and two anonymous reviewers for many helpful suggestions.

This research was supported by National Institute on Deafness and Other Communication Disorders Grant DC-00092 to A. H. Bass.


    FOOTNOTES

Address for reprint requests: D. A. Bodnar, Section of Neurobiology and Behavior, Cornell University, Ithaca NY 14853.

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received 2 June 1998; accepted in final form 15 October 1998.


    REFERENCES

0022-3077/99 $5.00 Copyright © 1999 The American Physiological Society