1Institute for Systems Research and 2Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742-3311
ABSTRACT
Depireux, Didier A., Jonathan Z. Simon, David J. Klein, and Shihab A. Shamma. Spectro-Temporal Response Field Characterization With Dynamic Ripples in Ferret Primary Auditory Cortex. J. Neurophysiol. 85: 1220-1234, 2001. To understand the neural representation of broadband, dynamic sounds in primary auditory cortex (AI), we characterize responses using the spectro-temporal response field (STRF). The STRF describes, predicts, and fully characterizes the linear dynamics of neurons in response to sounds with rich spectro-temporal envelopes. It is computed from the responses to elementary "ripples," a family of sounds with drifting sinusoidal spectral envelopes. The collection of responses to all elementary ripples is the spectro-temporal transfer function. The complex spectro-temporal envelope of any broadband, dynamic sound can be expressed as the linear sum of individual ripples. Previous experiments using ripples with downward drifting spectra suggested that the transfer function is separable, i.e., it is reducible into a product of purely temporal and purely spectral functions. Here we measure the responses to upward and downward drifting ripples, assuming separability within each direction, to determine if the total bidirectional transfer function is fully separable. In general, the combined transfer function for two directions is not symmetric, and hence units in AI are not, in general, fully separable. Consequently, many AI units have complex response properties such as sensitivity to direction of motion, though most inseparable units are not strongly directionally selective. We show that for most neurons, the lack of full separability stems from differences between the upward and downward spectral cross-sections but not from the temporal cross-sections; this places strong constraints on the neural inputs of these AI units.
INTRODUCTION
Only a few general organizational features are known in primary auditory cortex (AI). They include a spatially ordered tonotopic axis (Evans et al. 1965), bands of alternating binaural response properties (Imig and Adrian 1977; Middlebrooks et al. 1980), and a variety of other response features that change systematically along the isofrequency planes such as thresholds (Heil et al. 1994; Schreiner et al. 1992), bandwidths (Schreiner and Sutter 1992), FM selectivity (Heil et al. 1992; Mendelson et al. 1993; Shamma et al. 1993), and asymmetry of response areas (RAs; the span of frequencies that influence, both through excitation and inhibition, the response of a cell) (Shamma et al. 1993). To derive a functionally coherent picture of these maps, it is necessary to integrate these features within a comprehensive descriptor of the unit responses, one that can be quantitatively derived and employed to predict responses to novel stimuli.
Traditionally measured response areas are inadequate because they rarely include response dynamics and cannot be used to predict responses quantitatively. An alternative is the response field (RF) (Schreiner and Calhoun 1994; Shamma et al. 1995), a static, purely spectral function analogous to the RA except for the use of broadband sounds (but see Nelken et al. 1994; Sutter et al. 1996). A dynamic generalization of the RF is the spectro-temporal response field (STRF), a characteristic function of a neuron obtained using broadband sounds (Aertsen and Johannesma 1981; deCharms et al. 1998; Eggermont 1993 and references therein; Escabi and Schreiner 1999; Kowalski et al. 1996a; Kvale and Schreiner 1995; Theunissen et al. 2000). A schematic of an idealized STRF is illustrated in Fig. 1. Qualitatively, its spectral axis reflects the range of frequencies that influence the response or firing rate of the neuron being characterized, and its temporal axis reflects how this influence changes as a function of time. Positive-valued regions of the STRF describe excitatory influence, and negative regions describe inhibitory influence. The interplay between the spectral and temporal axes can give multiple interpretations to the STRF, e.g., as a time-evolving spectral response field or a family of impulse responses labeled by frequency band.
Over the last few years, we have developed new methods to derive the STRFs and characterize the responses of both single and multiple units in the ferret AI (Kowalski et al. 1996a,b). These methods use "moving ripples": time-varying broadband sounds with sinusoidal spectral envelopes that drift at a constant velocity along the logarithmic frequency axis. Figure 2
illustrates the spectrogram of such a stimulus. Neuronal responses are
vigorous and well phase-locked to these spectral and temporal envelope
modulations over a range of ripple velocities and densities. Measuring
the amplitude and phase of the locked component of the response enables
one to construct transfer functions. A transfer function can
be inverse-Fourier transformed to obtain the STRF that characterizes a
unit's dynamics and selectivity along the tonotopic axis.
In developing these measurement and analysis methods, we use two fundamental assumptions. The first is that the responses are substantially linear with respect to the time-varying spectral envelope of stimuli. In particular, this implies that the response to a spectro-temporally rich stimulus, whose envelope can always be described as the sum of multiple moving ripples, will be the sum of its responses to the individual ripple components. This assumption was confirmed by successfully predicting responses to the superposition of multiple ripples (Kowalski et al. 1996b).
The second important assumption deals with the separability of the
temporal and spectral aspects of the responses. Specifically we have
demonstrated in other reports that temporal and spectral transfer
functions can be measured independently of each other and then combined
with a simple product to compute the total transfer function
(Kowalski et al. 1996a). The importance of this finding stems from its experimental implications for measuring the STRFs and
theoretical consequences for the biophysical and functional models of
the STRFs. On the experimental side, separability makes it possible to
infer responses to all ripple velocities and peak densities based on
only a pair of temporal and spectral transfer functions. Without this
assumption, measuring the two-dimensional transfer function is
difficult because of the extended times needed to collect adequate
spike counts. On the theoretical side, separability suggests that
certain features of the STRF (as we shall discuss in detail in the
following text) are formed by independent (and likely sequential)
spectral and temporal processing stages.
In our earlier study (Kowalski et al. 1996a),
separability was validated for ripples moving only in one direction
(spectral envelope moving downward in frequency), a notion also known
as "quadrant separability." In this report, we compare the
separable functions (spectral and temporal) across upward and downward
quadrants. If the functions are the same across quadrants, the
responses are "fully separable" (i.e., they are separable);
otherwise they are quadrant separable, which is a (specialized) form of inseparability.
Like quadrant separability, full separability has experimental and
theoretical implications. On the experimental side, fully separable
STRFs can be measured with either upward or downward moving ripples.
Theoretically, fully separable responses imply an STRF that is fully
decomposable into the product of a purely temporal impulse response and
a purely spectral response field. It also implies a unit that responds
equally well to upward and downward moving ripples and hence has
necessarily a symmetric transfer function magnitude with respect to
direction (Watson and Ahumada 1985). By contrast, cells
that are only quadrant separable necessarily respond in asymmetric
fashion with respect to direction, i.e., are direction sensitive.
We restrict our presentation in this paper to measurements with singly
presented moving ripples in contrast to simultaneously presented
ripples discussed in Klein et al. (2000).
There are several goals of this paper. We present a method of measuring the complete descriptor of the linear spectro-temporal properties of an auditory cell, the STRF. We describe examples of STRFs measured in AI and summarize the distribution of the STRF and transfer function parameters encountered. We show that there is a directional sensitivity in the response to the upward versus downward moving components of a sound's spectral envelope. This breaks the symmetry of full spectro-temporal separability and produces quadrant separability. We propose measures to quantify quadrant and full separability. Finally, we discuss the significance of the results and their relationship to results from similar auditory and analogous visual experimental paradigms.
METHODS
Surgery and animal preparation
Data were collected from a total of 11 domestic ferrets (Mustela putorius) supplied by Marshall Farms (Rochester, NY). The ferrets were anesthetized with pentobarbital sodium (40 mg/kg) and maintained under deep anesthesia during the surgery. Once the recording session started, a combination of ketamine (8 mg · kg⁻¹ · h⁻¹), xylazine (1.6 mg · kg⁻¹ · h⁻¹), atropine (10 µg · kg⁻¹ · h⁻¹), and dexamethasone (40 µg · kg⁻¹ · h⁻¹) was given throughout the experiment by continuous intravenous infusion, together with dextrose, 5% in Ringer solution, at a rate of 1 ml · kg⁻¹ · h⁻¹ to maintain metabolic stability. The ectosylvian gyrus, which includes the primary auditory cortex, was exposed by craniotomy and the dura was reflected. The contralateral ear canal was exposed and partly resected, and a cone-shaped speculum containing a miniature speaker (Sony MDR-E464) was sutured to the meatal stump. For more details on the surgery, see Shamma et al. (1993).
Recordings
Action potentials from single units were recorded using
glass-insulated tungsten microelectrodes with 5-7 MΩ tip impedance at 1 kHz. Neural signals were fed through a window discriminator, and
the time of spike occurrence relative to stimulus delivery was stored
using a computer. In each animal, electrode penetrations were made
orthogonal to the cortical surface. In each penetration, cells were
typically isolated at depths of 350-600 µm corresponding to cortical
layers III and IV (Shamma et al. 1993). In many
instances, it was difficult to isolate reliably a single unit for
extended recordings, and hence several units were recorded instead.
Such data were labeled "multiunit recordings" and are explicitly
designated as such and separated from the single-unit records in all
data presentations in the paper.
Acoustic stimuli
All stimuli are computer synthesized. For each unit isolated, initial tests are carried out using tonal stimuli to measure the basic frequency response at several intensities to determine the best frequency (BF) and response threshold. All other stimuli used in these experiments have broadband spectra with a sinusoidally modulated (or rippled) envelope. We used the knowledge of the cell's BF to adjust the frequency range of the broadband sound so that the cell's excitatory and inhibitory regions lay well within the frequency range of the sounds.
In practice, it is hard to generate noise and then shape it with
filters to a desired dynamic spectral envelope, so we generate ripples
over a range of five octaves by taking logarithmically spaced pure
tones with random (temporal) phases. The amplitude S(t, x) of each tone is then
$$S(t, x) = L \left[ 1 + \Delta A \cdot \sin\bigl( 2\pi (w t + \Omega x) + \Phi \bigr) \right] \tag{1}$$
A single-ripple stimulus at overall level L dB SPL would
typically be composed of N logarithmically spaced components, each at L − 10 log10(N) ≈ L − 20 dB for N = 101. The overall stimulus level was
chosen on the basis of threshold at BF; typically L was set
10-20 dB above threshold. High levels (L > 70 dB) were avoided to ensure the linearity of our stimulus delivery system. The
amplitude of a single ripple was defined as the maximum percentage or
logarithm change in the component amplitudes. Ripple amplitudes were
either 90% (linear) or 10 dB (logarithmic) modulations.
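The synthesis described above can be sketched in Python as follows. This is a minimal illustration only, assuming the envelope of Eq. 1 has the form 1 + ΔA·sin(2π(wt + Ωx) + Φ) at unit overall level; the function names, the sample rate, and the 250-Hz lower edge are hypothetical choices, not values taken from the experiments.

```python
import numpy as np

def ripple_envelope(t, x, w, omega, depth=0.9, phase=0.0):
    """Linear ripple envelope at times t (s) and tone positions x (octaves).

    w is the ripple velocity (Hz), omega the ripple density (cycles/octave).
    Returns an array of shape (len(x), len(t)).
    """
    tt, xx = np.meshgrid(t, x)
    return 1.0 + depth * np.sin(2 * np.pi * (w * tt + omega * xx) + phase)

def ripple_stimulus(w, omega, dur=1.0, fs=40000, f0=250.0,
                    n_tones=101, n_octaves=5.0, seed=0):
    """Sum of log-spaced tones with random phases, shaped by the envelope."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur * fs)) / fs
    x = np.linspace(0.0, n_octaves, n_tones)        # octaves above f0
    freqs = f0 * 2.0 ** x                            # logarithmically spaced
    carrier_phase = rng.uniform(0, 2 * np.pi, n_tones)
    env = ripple_envelope(t, x, w, omega)
    tones = np.sin(2 * np.pi * freqs[:, None] * t[None, :]
                   + carrier_phase[:, None])
    return t, (env * tones).sum(axis=0) / n_tones

def per_component_level(total_db, n_tones):
    """Level of each tone when n_tones equal tones sum to total_db."""
    return total_db - 10 * np.log10(n_tones)
```

With positive `w` and `omega`, a point of constant envelope phase satisfies x = −(w/Ω)t + const, so the peaks drift toward lower frequencies, matching the downward convention described in the text.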
The ripple velocities w and ripple densities Ω used were determined by the response properties of the neuron, but the typical range was |w| < 25 Hz (with some units requiring up to 100 Hz) and |Ω| < 1.6 cycles/octave (with some units requiring up to 4 cycles/octave). Single ripples were always presented with Φ = 0.
By the convention established in Eq. 1, a ripple whose spectral envelope is moving downward in frequency, as in Fig. 2, has positive w and positive Ω; equivalently, it can be described by a ripple with negative w and negative Ω, and an added phase shift of π, by Eq. 1 and the identity sin(α) = sin(−α + π). A ripple whose spectral peaks are moving upward in frequency has negative w and positive Ω, or by Eq. 1 and the same identity, positive w, negative Ω, and an added phase shift of π.
The stimulus bursts had an 8-ms rise/fall time and duration of 1.0 or
1.7 s, repeated every 3-4 s. All stimuli were gated and fed
through an equalizer into the earphone. Calibration of the sound
delivery system (to obtain a flat frequency response up to 20 kHz) was
performed in situ with the use of a .
Theoretical considerations
DEFINING THE STRF.
The fundamental tool for assessing the linearity and separability of primary cortical cells is their STRF. The STRF is a spectro-temporal function STRF(t, x). The linear response rate y(t) of a cell is related to its STRF(t, x) and the spectro-temporal envelope of the stimulus S(t, x) by y(t) = ∫∫ dt′ dx S(t − t′, x) · STRF(t′, x), i.e., convolution along the time dimension t and integration along the spectral dimension x.
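As a discrete illustration of this relation, the sketch below convolves each frequency channel of a stimulus envelope with the corresponding row of an STRF and sums over channels. The function name and the array layout (frequency by time) are assumptions made for the example, and the stimulus is treated as zero before its first sample.

```python
import numpy as np

def linear_response(strf, stim, dx=1.0, dt=1.0):
    """y(t) = sum over x and t' of stim(t - t', x) * strf(t', x).

    strf: array (n_x, n_lag); stim: array (n_x, n_t).
    dx and dt are the grid spacings that turn the sums into integrals.
    """
    n_x, n_t = stim.shape
    y = np.zeros(n_t)
    for i in range(n_x):
        # convolution in time for one channel, truncated to n_t samples
        y += np.convolve(stim[i], strf[i])[:n_t]
    return y * dx * dt
```

An STRF that is a single unit sample at lag 2 in one channel simply delays that channel's envelope by two samples, which is a quick way to check the indexing.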
![]() |
(2) |
![]() |
(3) |
![]() |
(4) |
DEFINING AND ASSESSING SEPARABILITY.
Separability is an important property of the transfer functions. A fully separable transfer function is one that factorizes into a function of w and a function of Ω over all quadrants: T(w, Ω) = F(w) · G(Ω). This implies that STRF(t, x) is time-spectrum separable: STRF(t, x) = IR(t) · RF(x). In this case, one need only measure the transfer function for all Ω at a convenient w and for all w at a convenient Ω. F(w) and G(Ω) are each complex-conjugate symmetric [F(−w) = F*(w), G(−Ω) = G*(Ω)] because IR(t) and RF(x) are real, so one need only consider the positive values of each. This dramatically decreases the number of measurements needed to characterize the STRF.
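The equivalence between a factorized transfer function and a factorized STRF can be checked numerically. In the sketch below the impulse response and response field are random placeholders, and the singular-value ratio is just one convenient way to test for rank 1; it is an assumption of this example and not necessarily the measure defined in the equations of this section.

```python
import numpy as np

def separability_index(T):
    """Fraction of power in the first singular value (1.0 for rank-1 T)."""
    s = np.linalg.svd(T, compute_uv=False)
    return s[0] ** 2 / np.sum(s ** 2)

rng = np.random.default_rng(0)
ir = rng.standard_normal(16)       # hypothetical temporal impulse response
rf = rng.standard_normal(16)       # hypothetical spectral response field
strf_sep = np.outer(ir, rf)        # STRF(t, x) = IR(t) * RF(x)
T_sep = np.fft.fft2(strf_sep)      # factorizes as F(w) * G(Omega)

F = np.fft.fft(ir)
G = np.fft.fft(rf)
print(np.allclose(T_sep, np.outer(F, G)))   # product form holds
print(separability_index(T_sep))            # close to 1 for a separable STRF
```

Because `ir` and `rf` are real, `F` and `G` are conjugate symmetric, so only their non-negative frequencies carry independent information.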
![]() |
(5) |
![]() |
![]() |
(6) |
![]() |
(7) |
![]() |
(8) |
![]() |
(9) |
![]() |
(10) |
EFFECT OF FINITE SAMPLING.
We measure the transfer function of cells by varying two parameters,
ripple velocity and ripple density. For consistency's sake, we used
the same range of parameters for a majority of cells. However, for some
cells, the transfer function has not decreased significantly at the
"edges" (for instance, in Fig. 9C, the temporal transfer
function is clearly still strong at ±64 Hz and above). This is
equivalent to multiplying the true transfer function by a rectangular
function which is zero everywhere except between −64 and 64 Hz, over
which range it is 1. In the dual Fourier space of the transfer function
space, that is, in the STRF space with coordinates t and
x, this corresponds to convolving along each dimension the
STRF with the Fourier transform of a rectangular pulse, that is,
with sin (x)/x. This leads to spurious
oscillations in the display of the STRF as can be seen in
Fig. 9C and others. These oscillations would disappear if we
had measured the transfer functions all the way to their vanishing values.
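The ringing produced by this rectangular truncation is easy to reproduce. The sketch below uses a hypothetical, strictly positive Gaussian impulse response whose spectrum is still substantial at 64 Hz; the sample rate and widths are illustrative only.

```python
import numpy as np

fs = 1000.0                                      # sample rate in Hz (illustrative)
t = np.arange(512) / fs
ir = np.exp(-0.5 * ((t - 0.05) / 0.004) ** 2)    # narrow, non-negative bump

freqs = np.fft.fftfreq(t.size, d=1 / fs)
T_full = np.fft.fft(ir)
rect = (np.abs(freqs) <= 64.0).astype(float)     # keep only -64..64 Hz
ir_trunc = np.fft.ifft(T_full * rect).real       # truncated reconstruction

# The original is non-negative; the truncated version has negative side lobes.
print(ir.min(), ir_trunc.min())
```

The negative side lobes in `ir_trunc` are not present in `ir`; they come entirely from cutting the spectrum off where it has not yet decayed.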
DEVIATIONS FROM LINEARITY.
Because the STRF is a measure of the linear part of the dynamics of a
cell, we only consider effects that might modify the measurement of the
first component of the Fourier transform of the period histograms. The
most prominent nonlinearities are (approximate) half-wave rectification
and compression. The half-wave rectification is primarily due to the
positivity of spike rates (ordinarily the steady-state response to a
flat spectrum is significantly less than half the peak firing rate of
the unit); the distortion of a sinusoid due to half-wave rectification
does not affect the phase of the response, and its effect on the
amplitude of the first Fourier component is a constant factor,
independent of w and Ω. The distortion due to compression
or saturation, similarly, does not affect the phase of the Fourier
transform components of the response and similarly affects the
amplitude only by an overall constant factor for stimuli of moderate level.
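The claim about half-wave rectification can be verified on an idealized period histogram. The sketch assumes a sinusoidal rate with zero baseline sampled in 16 bins; the helper name and the chosen phase are arbitrary.

```python
import numpy as np

def first_component(h):
    """Complex first Fourier component of one period sampled in len(h) bins."""
    n = len(h)
    k = np.arange(n)
    return 2.0 / n * np.sum(h * np.exp(-2j * np.pi * k / n))

n_bins = 16
phase = 0.7                                        # arbitrary response phase (rad)
k = np.arange(n_bins)
linear = np.cos(2 * np.pi * k / n_bins + phase)    # idealized linear response
rectified = np.maximum(linear, 0.0)                # half-wave rectified rate

c_lin = first_component(linear)
c_rect = first_component(rectified)
print(abs(c_rect) / abs(c_lin))                    # constant factor of one half
print(np.angle(c_rect) - np.angle(c_lin))          # phase is unchanged
```

Under these assumptions the rectified response keeps the phase of the linear one and its first component is scaled by one half, whatever the value of `phase`.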
Data reduction
Many of the data analysis methods described here are similar to, or straightforward extensions of, those developed earlier in
Kowalski et al. (1996a), and those will be only briefly
reviewed here. Figures 4 and
5 illustrate the nature of the responses
to the ripple stimuli and the analysis to extract the spectral (Fig. 4)
and temporal (Fig. 5) transfer functions. In Fig. 4A, the
ripples are presented at 8 Hz for ripple densities from −1.6 to 1.6 cycles/octave in steps of 0.2 cycle/octave. Each stimulus is presented 15 times.
For each ripple density, we compute a 16-bin period histogram based on
the responses starting at 120 ms (to exclude the onset response; Fig.
4B). A 16-point Fourier transform (FFT) is then performed on the period histogram, and the amplitude and phase of the
first component is taken to be the amplitude and phase of the transfer
function. If the modulation of the response was that of a purely linear
system, the higher FFT coefficients would be negligible, but because of
half-wave rectification and compression, they sometimes are
significant. In general T_w(Ω) can be written as
![]() |
(11) |
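The period-histogram step can be sketched as follows. The function names are hypothetical, spike times are assumed to be in seconds relative to stimulus onset, and the sign and normalization of the FFT component are conventions of this example rather than those of Eq. 11.

```python
import numpy as np

def period_histogram(spike_times, w, n_bins=16, t_start=0.120, t_stop=1.0):
    """Fold spike times (s) on the ripple period 1/|w| into n_bins bins."""
    spikes = np.asarray(spike_times, dtype=float)
    spikes = spikes[(spikes >= t_start) & (spikes < t_stop)]  # skip the onset
    cycle_pos = (spikes * abs(w)) % 1.0        # position within a cycle, 0..1
    hist, _ = np.histogram(cycle_pos, bins=n_bins, range=(0.0, 1.0))
    return hist

def transfer_point(hist):
    """Amplitude and phase of the first FFT component of the histogram."""
    c1 = np.fft.fft(hist)[1]
    return np.abs(c1), np.angle(c1)
```

For example, spikes locked to one quarter of the way through every 8-Hz cycle fall into a single bin, and `transfer_point` then returns the spike count as the amplitude and a phase of −π/2 under NumPy's FFT sign convention.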
Analogous steps are followed in measuring the temporal transfer
function as shown in Fig. 5 where ripples are presented at 0.2 cycle/octave for ripple velocities from −24 to 24 Hz in steps of 4 Hz.
Note that in the previous paper (Kowalski et al. 1996a),
we weighted the measurement of the first component of the Fourier transforms of the period histograms by a weighted sum of the higher frequency components of the transform. This, however, is not compatible with the idea of a linear system so that the resultant STRF or equivalently the ripple transfer function T would not be
expected to be the best possible predictor of the response to new
sounds. Therefore in this paper, the values of T correspond
directly to the first component of the Fourier transform.
Once the ripple transfer function has been measured, it can be inverse Fourier transformed to display the STRF. Since the transfer function is typically measured over fewer than 8 points along each dimension in each quadrant, the resulting STRF as computed would look very jagged even if the underlying STRF was smooth. We therefore interpolate to a smooth STRF for display purposes, padding the transfer function with zeros to a size of 64 × 64. All statistics and predictions use the measured unsmoothed STRF.
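Zero-padding the transfer function before inversion amounts to trigonometric interpolation of the STRF. The sketch below assumes a transfer function stored with zero frequency at the centre of an odd-sized array; the function name and the amplitude rescaling are choices made for this example.

```python
import numpy as np

def interpolated_strf(T, size=64):
    """Zero-pad a centred transfer function to size x size and invert it.

    T has zero frequency at index n // 2 on both axes; odd sizes keep the
    padding symmetric so that the result is real.
    """
    n_w, n_om = T.shape
    padded = np.zeros((size, size), dtype=complex)
    r0 = size // 2 - n_w // 2
    c0 = size // 2 - n_om // 2
    padded[r0:r0 + n_w, c0:c0 + n_om] = T
    # move zero frequency to index 0, invert, and rescale to the original units
    strf = np.fft.ifft2(np.fft.ifftshift(padded)).real
    return strf * (size * size) / (n_w * n_om)
```

The interpolated surface passes through the same underlying band-limited function as the coarse STRF, so its mean and its value at the origin are unchanged; only the display grid is finer.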
To construct the two-dimensional transfer function, we assume quadrant separability, measure the transfer function along the cross-sections shown in Fig. 3, and combine these spectral and temporal cross-sections as illustrated in Fig. 6. For each quadrant, the transfer function is the outer product of the cross-sections, divided by the (complex) value of the transfer function at the crossover (×) point. In Fig. 6, the point is (w×1, Ω×1) = (8 Hz, 0.2 cycle/octave) in quadrant 1 and (w×2, Ω×2) = (−8 Hz, 0.2 cycle/octave) in quadrant 2.
![]() |
(12) |
The ratio T1st(w×q, Ω×q)/T2nd(w×q, Ω×q), which should be unity, reflects noise in the system and is used to estimate reliability in the following text.
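The outer-product construction within one quadrant can be written in a few lines. This is a sketch with hypothetical names; it takes a single crossover value as input and does not model how the two repeated crossover measurements are combined.

```python
import numpy as np

def quadrant_transfer(temporal_cs, spectral_cs, crossover):
    """T(w, Om) = T(w, Om_x) * T(w_x, Om) / T(w_x, Om_x) within one quadrant.

    temporal_cs: complex T(w, Om_x) over ripple velocities
    spectral_cs: complex T(w_x, Om) over ripple densities
    crossover:   complex T(w_x, Om_x), where the two cross-sections meet
    """
    return np.outer(temporal_cs, spectral_cs) / crossover
```

If the underlying quadrant really is separable, T = F(w)·G(Ω), then feeding in its two cross-sections and the crossover value reproduces the full outer product of F and G exactly.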
The value of the transfer function along the w = 0 axis
is set to zero because the modulation transfer function is not well defined there, i.e., there is no modulation of firing rate around the
DC (average) rate with a frequency of 0 Hz. The value of the transfer
function along the Ω = 0 axis is not measured directly, so the
value used is the mean of the value inferred from being the boundary of
quadrant 1 and that inferred from being the boundary of quadrant 2.
Once the values of transfer functions for quadrants 1 and 2 and their
boundaries are measured, the values for quadrants 3 and 4 are given by
Eq. 4 (see also Fig. 3). The STRF is then computed by an
inverse Fourier transform (as in Eq. 3) and is illustrated in Fig. 6B (left). This interpolated version of
the STRF (used for display) is obtained by using Eq. 3 on
the transfer function padded with zeros at high |w| and |Ω| (see Fig. 6A).
Deriving STRF parameters from the phase functions
Numerous parameters can be derived from the STRF (or equivalently the transfer function) that are analogous to traditional response measures such as BF, tuning curve bandwidth, and latency. Most of these parameters are best derived from analysis of the phase of the transfer functions (Fig. 7).
We model the phase of the transfer function within each quadrant q, q = 1, 2 (see Eq. 2), as a linear function of w and Ω
![]() |
(13) |
The justification for assuming linear fits of the phase functions has been discussed in detail earlier (Depireux et al. 1998) and is strongly motivated by the data (Kowalski et al. 1996a). Note, however, that the assumption of
phase linearity is used only for parameter estimation and is
not assumed in computing the STRF. The first linear term in Eq. 13 stems from the fact that auditory units differing in their mean
neural delays will exhibit linear phase dependence on w with
different slope depending on delay. Analogous arguments apply for units
that are located at different places along the tonotopic axis: the
response phase of different units (with otherwise identical STRFs) changes linearly with Ω at different rates, depending on the relative
center frequency locations. In both cases, the slopes of the linear
phase function indicate the absolute shift of the STRF relative to the
origin, i.e., the mean time delay
An interpretation of τd, for each quadrant, is that it is the sum of the pure response latency and (roughly) half the temporal width of the STRF. This is in contrast to the STRF's peak delay, τSTRF, defined to be the delay for which the STRF achieves its maximum value, which may lead or lag τd, depending on the constant temporal phase shift, θ, defined in the following text. Similarly, fm for each quadrant may or may not fall on the STRF's best frequency, BFSTRF, defined to be the frequency at which the STRF achieves its maximum value, depending on the constant spectral phase shift, φ, defined in the following text.
A convenient convention for interpreting the constant component of the
phase is to break up the constant phase angle
q into two parts
![]() |
(14) |
In past reports (Kowalski et al. 1996a),
and
could be measured without measuring the transfer function in the upward
moving quadrant 2 by measuring the constant component of the phase in quadrant 1 (
1 =
+
) and along the
w axis, where the constant component of the phase is
expected to be the mean across the quadrants
[(
1
2)/2 =
; note the change in convention of
between the present work and Kowalski et al. (1996a)
].
Because of response variability, we only fit to those points of the
transfer function that have more than half of the response power in the
first component of the Fourier transform. Then the fit is done across
the entire two-dimensional phase plane for each quadrant. Ultimately
our unwrapping method is less than ideal, and estimates of and
especially reflect that (Ghiglia and Pritt 1998).
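A one-dimensional version of the phase fit is sketched below. It is a simplification of the two-dimensional fit described above, and it assumes the convention that a pure delay τ multiplies the transfer function by exp(−2πiwτ), which may differ in sign from the convention of Eq. 13; the function name is hypothetical.

```python
import numpy as np

def fit_linear_phase(w, T):
    """Least-squares line through the unwrapped phase of T(w).

    Returns (slope, intercept) in radians per Hz and radians.
    """
    phase = np.unwrap(np.angle(T))
    slope, intercept = np.polyfit(w, phase, 1)
    return slope, intercept

# Hypothetical unit with a 25-ms delay and a constant phase of 0.3 rad.
w = np.arange(4.0, 28.0, 4.0)
T = np.exp(1j * (-2 * np.pi * w * 0.025 + 0.3))
slope, intercept = fit_linear_phase(w, T)
delay = -slope / (2 * np.pi)       # recovered delay in seconds
```

`np.unwrap` only removes jumps between neighbouring samples, so the recovered intercept is reliable only when the phase advances by less than π from one ripple velocity to the next.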
Estimating response variability: the bootstrap method
Variability in our experiments originates from multiple sources,
including internal neural mechanisms (e.g., Poisson-like distributions
of spike times), extracellular recording/identifying methods, and
equipment noise. Quantitative estimates of the reliability of our measurements are crucial to their analysis and subsequent interpretation. A method of variability estimation that is especially appropriate to these measurements is the bootstrap method (Efron and Tibshirani 1993; Politis 1998).
The essence of this method is to use "resamples," in which N samples of bootstrap data are drawn with replacement from the N original samples of data. Repeating this procedure a large number of times creates a population of bootstrap resamples whose probability distribution is a good estimator of the probability distribution from which the original data were drawn.
To illustrate this procedure, consider measuring the transfer function
at a point (w, Ω). This is done by presenting the same (w, Ω) stimulus N times and constructing a
period histogram based on all N sweeps. The amplitude and
phase of the first Fourier component of the period histogram are
assigned to the amplitude and phase of the transfer function. A single
bootstrap resampling of the responses will have N sweeps,
where, because they are drawn from the original responses with
replacement, some will be duplicated and some will be unused.
Nevertheless a period histogram is constructed, and the bootstrap
estimate of the transfer function is assigned to its first Fourier
component. Performing a large number of bootstrap resamples results in
a population of estimates for the transfer function. This population
has a mean, variance, and higher-order moments. These moments are
estimators of the moments of the original population (of all transfer
functions of all allowable neuronal responses to the stimulus). For
example, the standard deviation of all bootstrap estimates of the
transfer function is an estimator of the standard deviation of
measurements of the transfer function. This allows us to put error bars
on our transfer functions and STRFs.
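The resampling procedure for a single (w, Ω) point can be sketched as follows, assuming each sweep has already been reduced to its own period histogram. The function names and the number of resamples are illustrative.

```python
import numpy as np

def first_component(sweeps):
    """First Fourier component of the period histogram summed over sweeps.

    sweeps: array (n_sweeps, n_bins) of per-sweep period histograms.
    """
    return np.fft.fft(sweeps.sum(axis=0))[1]

def bootstrap_transfer(sweeps, n_resamples=1000, seed=0):
    """Bootstrap estimates of the transfer function at one (w, Omega) point."""
    rng = np.random.default_rng(seed)
    n = sweeps.shape[0]
    est = np.empty(n_resamples, dtype=complex)
    for b in range(n_resamples):
        idx = rng.integers(0, n, size=n)      # draw n sweeps with replacement
        est[b] = first_component(sweeps[idx])
    return est
```

The standard deviation of the returned estimates, `np.std(est)`, then serves as the error bar for that point of the transfer function.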
Effects of crossover point errors
Another significant source of error is the difference between
the responses of repeated measurements at the transfer function crossover points. The ratio of these independent measurements, T1st(w×
![]() |
(15) |
![]() |
(16) |
![]() |
![]() |
(17) |
![]() |
(18) |
is a measure of the average standard deviation in units of the
maximum of the STRF.
is a measure of the variance in units of
power. If noise is additive, then
= P
/(P + P
) = 1/(SNR + 1), with
P = power, P
= noise power, and SNR = signal-to-noise ratio.
should go down
with the number of recordings, assuming the system can be described as
a time-invariant random process.
RESULTS
Data presented here were collected from 22 single-unit and 54 multiunit recordings in 11 ferrets. In the summary histograms, both single units and multiunit are included but are distinguished from each other.
Most units encountered in AI respond well to moving ripples. Responses
are typically phase-locked to the moving envelope of the ripple over a
range of ripple velocities and densities. However, of a total of 172 recordings made, only 76 cases provided adequate quality and quantity
of responses. The reasons for this low yield vary. For example, we have
encountered responses from a few units that were either poorly
phase-locked or were inconsistent from trial to trial; such units were
abandoned since our analysis methods are unsuitable for their
characterization. Also because of extended recording times, typically
over an hour, units were sometimes lost before sufficient data could be
collected to carry out a full analysis. In other cases, the unit or
animal changed state during the recording session, rendering the data
unreliable. The reason for the extended recording time is to present
ripple sounds and other sounds consisting of combinations of ripples,
so we can verify linearity by using the STRFs to predict the response of the cell to new sounds. We found empirically that about 10,000 spikes are typically needed to obtain an STRF with well-defined features in response to single ripples, which with our sound paradigm usually corresponds to a 20-min presentation per cross-section. To
eliminate data corresponding to unreliable cells, as described in the
preceding text, we use units only with values of
0.12 and
0.7 (see METHODS) as the threshold for
rejecting the data. These reliability statistics take into account
most of the preceding sources of error. The values of 0.12 and 0.7 are
somewhat arbitrary, though we found that cells tended to separate
themselves into two populations above and below these thresholds,
respectively, and that the mathematical criteria of reliable versus
noisy cells corresponded well with our intuitive perception based on
visual inspection.
Responses to moving ripples
On average, AI units synchronize their responses to upward and downward moving ripples equally effectively with ripple velocities ranging from 2 to over 100 Hz, and ripple densities up to 4 cycle/octave. Examples of several temporal and spectral transfer function magnitudes are shown in Figs. 8-10, each with its corresponding STRF. In all cases, units respond well only over a specific range of ripple velocities and ripple densities, but the detailed shape and extent of the transfer functions vary from one unit to another. For instance, the unit in Fig. 9A responds well only to ripple velocities of ±4 Hz, whereas the unit in Fig. 9C responds well at least up to ±64 Hz. The unit in Fig. 6 responds well to ripple densities within ±0.4 cycle/octave, whereas the unit in Fig. 10A responds over a wider range of densities but poorly at 0 cycle/octave.
As described in the preceding text, the transfer function at w = 0 is set to 0 since it is not well defined (and so has 0 contribution to the STRF). Additionally, for 12 cells (not shown), the transfer function was measured from ±8 to ±1 Hz in 1-Hz steps, and in all cases, the transfer function was negligible at the slowest ripple velocities (in contrast to the average firing rates, which remained significant).
Units also vary significantly in the asymmetry of their transfer functions with respect to the direction of the moving ripple. For example, responses to the two directions are relatively equal (transfer functions are roughly symmetric) in Figs. 6 and 9A. By comparison, the temporal transfer functions in Fig. 8, A-C, are asymmetric. The unit in Fig. 8B responds better to upward moving ripples; the unit in Fig. 8C responds over a wider range to downward moving ripples. These asymmetries are discussed in depth later in the context of transfer function separability.
The STRFs derived from these transfer functions commonly exhibit alternating significant regions of positive peaks and negative basins, interpreted here as excitatory and inhibitory regions, respectively. The four STRFs illustrated in Figs. 6 and 8 are of units that are tuned between 1 and 2 kHz. However, the shapes of the surrounding inhibitory regions vary considerably reflecting the different temporal and spectral transfer functions (see Fig. 11). For instance, STRFs may be relatively symmetric (Fig. 8A) or asymmetric (Fig. 9C). They can be clearly directional, i.e., tilted one way (Fig. 8B) or the other (Fig. 8C) on the spectro-temporal surface.
STRFs display a wide variety of shapes that are briefly described in the following text. The majority of AI cells exhibit STRFs with a simple excitatory field and varying amounts of inhibitory surround. The first peak of the excitatory portion indicates the BFSTRF of the unit, while its extent reflects its tuning curve at a given level.
In many cases, the inhibitory surround is spectrally asymmetric around the BFSTRF (Fig. 9C); such asymmetry is effectively captured by the asymmetry parameter of Eq. 14. Values of this parameter near zero indicate roughly symmetric STRFs, while values near 90° in magnitude indicate strong inhibition on one side of the BFSTRF: below it for one sign and above it for the other. The distribution of this parameter in our sample is summarized in Fig. 12C. It closely resembles that seen earlier with downward moving and stationary ripples (Kowalski et al. 1996a; Schreiner and Calhoun 1994; Versnel et al. 1995).
STRFs also vary considerably in their temporal dynamics, best seen in the t-x domain. Some are fast, with envelopes that decay relatively rapidly (Figs. 9C and 10A). Others are slow, taking over 150 ms to decay (as in Figs. 9A and 10B). These response dynamics reflect details of the temporal transfer function such as the ripple velocity at which it peaks (characteristic ripple velocity) and its width (ripple velocity bandwidth). STRFs also exhibit an onset delay (or latency) that is captured by the delay parameter derived from the phase function (Eq. 13). The distribution of this delay is well clustered around 25 ms, as seen in Fig. 12B. Finally, unit STRFs can generally be classified as either onset (Figs. 9, A-C, and 10, A and B; most cells) or offset (Fig. 10C), a property that corresponds, respectively, to the negative or positive sign of the parameter whose distribution is shown in Fig. 12C. Onset STRFs are far more common in our sample.
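The idea of reading a delay off the phase of the temporal transfer function can be sketched as a straight-line fit. This is a minimal illustration only: it assumes the phase varies linearly with ripple velocity as phase(w) = theta - 2*pi*w*tau, which is a convention chosen here and not necessarily the form of Eq. 13; the function name, the ripple velocities, and the synthetic values are all hypothetical.

```python
import numpy as np

def fit_delay_and_phase(w_hz, phase_rad):
    """Fit phase(w) = theta - 2*pi*w*tau (assumed convention).

    Returns (tau in seconds, theta in radians). The intercept is only
    determined up to a multiple of 2*pi.
    """
    unwrapped = np.unwrap(phase_rad)                # remove 2*pi jumps
    slope, intercept = np.polyfit(w_hz, unwrapped, 1)
    return -slope / (2.0 * np.pi), intercept

# Synthetic check: a 25-ms delay sampled at six ripple velocities.
w = np.arange(4.0, 28.0, 4.0)                       # 4, 8, ..., 24 Hz
true_tau, true_theta = 0.025, -0.5
phase = np.angle(np.exp(1j * (true_theta - 2 * np.pi * w * true_tau)))

tau, theta = fit_delay_and_phase(w, phase)
```

The fit is only reliable when successive phase samples differ by less than pi, so that unwrapping is unambiguous; with a 4-Hz spacing that holds for delays below 125 ms.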
Finally, STRFs may display very complex dynamics and spectro-temporal selectivity that are not easily captured by simple parameters. Two examples of such STRFs are shown in Fig. 11. One might be tempted to dismiss such STRFs as mere aberration or noise except that they are derived from repeatable responses ( = 0.10 and = 0.49 for Fig. 11A and = 0.03 and = 0.04 for Fig. 11B).
Separability and its relation to STRF shape
Separability is an important property of the transfer functions that has significant experimental and theoretical implications. In this paper, we assume quadrant separability and ask whether responses are fully separable, the degree of inseparability, and the origin of the inseparability. Each of these indicators has a potentially useful interpretation for the shape of the STRF and the underlying structure of processes that give rise to it.
The simplest and most general way to examine full separability is to compute the singular value decomposition (SVD) of the transfer function matrix (Eq. 6). Figure 13 illustrates the distribution of the SVD index (Eq. 7) computed from all the cells used. Values near 0 indicate that only the first singular value is appreciably nonzero and hence that the STRF is fully separable. Increasing values indicate an increasing degree of inseparability. A significant fraction of cells deviate from full separability.
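The logic of an SVD-based separability test can be sketched as follows. The index used here, one minus the fraction of total power carried by the first singular value, is an assumed form chosen for illustration and may differ from Eq. 7; the function name and the small test matrices are hypothetical.

```python
import numpy as np

def svd_inseparability(matrix):
    """Assumed index: 1 - (largest singular value)^2 / sum of all squared values."""
    sv = np.linalg.svd(np.asarray(matrix), compute_uv=False)  # sorted, descending
    power = sv ** 2
    return 1.0 - power[0] / power.sum()

f = np.array([0.2, 1.0, 0.6, 0.1])   # hypothetical temporal profile
g = np.array([0.5, 1.0, 0.3])        # hypothetical spectral profile

separable = np.outer(f, g)           # product form: rank 1
mixed = separable + 0.5 * np.outer([1.0, -1.0, 0.0, 0.0], [0.0, 1.0, -1.0])

index_separable = svd_inseparability(separable)
index_mixed = svd_inseparability(mixed)
```

A matrix that is an exact product of a temporal and a spectral profile has a single nonzero singular value, so its index is 0 up to rounding, while adding a second, differently shaped product term pushes the index well above 0.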
It can be shown that fully separable transfer functions must have magnitudes that are symmetric about the origin of the ripple velocity-ripple density plane. Alone, the SVD index offers no insight into the specific nature of departures from this symmetric, separable case. However, it will be shown that there are three parameters (Eqs. 8-10) that in combination form the SVD index and that each correspond to a specific distortion of a separable transfer function:
1) d, the response directionality, or the imbalance in the overall strength of the responses to the upward and downward moving ripples;
2) t, the asymmetry in the temporal transfer function F(w);
3) s, the asymmetry in the spectral transfer function G, a function of ripple density.
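The three kinds of distortion listed above can be illustrated with simple stand-in measures. These are assumptions made for the sketch, not the definitions of Eqs. 8-10: directionality is taken as a normalized power difference between the two quadrants (with an arbitrary sign convention), and each asymmetry as one minus the normalized overlap between the corresponding cross-sections of the two quadrants. All names and test vectors are hypothetical.

```python
import numpy as np

def directionality(tf_up, tf_down):
    """Assumed form: normalized power imbalance between the two quadrants."""
    p_up = np.sum(np.abs(tf_up) ** 2)
    p_down = np.sum(np.abs(tf_down) ** 2)
    return (p_down - p_up) / (p_down + p_up)   # sign convention is arbitrary

def dissimilarity(a, b):
    """Assumed form: 1 - |<a, b>| / (|a| |b|) for two complex cross-sections."""
    a = np.asarray(a, dtype=complex)
    b = np.asarray(b, dtype=complex)
    overlap = np.abs(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - overlap

profile = np.array([1 + 1j, 2.0, 0.5j])        # hypothetical cross-section

same_shape = dissimilarity(profile, 3.0 * profile)    # scaled copy
orthogonal = dissimilarity([1.0, 0.0], [0.0, 1.0])     # no overlap at all
imbalance = directionality(profile, 2.0 * profile)     # power ratio 1 : 4
```

With these stand-ins, two cross-sections of identical shape give a dissimilarity of 0 regardless of their relative gain, so a pure gain difference between directions shows up only in the directionality measure.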
The distribution of these three parameters is shown in Fig. 14. The directionality parameter d is distributed approximately normally between negative and positive values. This parameter is closely related to the directional selectivity of the STRF. STRFs with large |d| values exhibit obvious directional shapes such as seen in Fig. 14 (top, middle). A significant proportion of units (37%) also have spectral dissimilarity values (s) exceeding 0.3. An STRF with especially large s is shown in Fig. 14 (middle). Note that these STRFs may not necessarily exhibit obvious directionally selective shapes.
A strikingly different finding is the dearth of units (12%) with significant temporal dissimilarity (t > 0.3), as seen in the distribution in Fig. 14 (bottom, left). An STRF with t = 0.30 is displayed in Fig. 14 (bottom, middle): it is difficult to detect simple correlates of the large t values in the shape of the STRF. Note that this is not due to measuring the temporal transfer function at six points and the spectral transfer function at eight points in each quadrant: when the last two points of the spectral cross section are removed, the same results are obtained.
The three inseparability indicators do not appear to be significantly correlated, based on the pairwise scatter plots in Fig. 15, suggesting that independent mechanisms underlie the expression of each factor. By contrast, each factor (as expected) is well correlated with the total SVD index as seen in Fig. 14 (right).
We can define a composite measure of inseparability, the mean of t, s, and |d|. Figure 16 illustrates that this measure is highly correlated with the SVD index and hence is an equally valid measure of inseparability.
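As a small worked example, the composite measure is just the arithmetic mean of the two asymmetry values and the magnitude of the directionality. The function name and the input values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def composite_inseparability(t, s, d):
    """Mean of t, s, and |d|, as described in the text."""
    return (t + s + abs(d)) / 3.0

# Hypothetical unit: small temporal asymmetry, larger spectral asymmetry,
# moderate directionality of negative sign.
example = composite_inseparability(t=0.10, s=0.40, d=-0.25)
# (0.10 + 0.40 + 0.25) / 3 is approximately 0.25
```

Taking the absolute value of d means that units preferring either direction contribute equally to the composite.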
There is no sharp threshold for inseparability. In Fig. 13, for instance, an SVD index of 0.35 clearly corresponds to an inseparable cell. However, because the SVD index takes a continuum of values, there is no obvious cutoff.
DISCUSSION
Summary of results
The emphasis of this work has been on presenting a technique to describe neural response patterns of units in the cortex. More precisely, we use moving ripples to characterize the spectral and temporal properties of responses of auditory cortical neurons, although this is a general method that can be used for any population of neurons for which responses are shown to be substantially linear for broadband stimuli.
We have examined the nature of AI responses to rippled spectra moving in both upward and downward directions and incorporated these responses into the STRF. A summary of the main results follows.
1) We confirm earlier findings (Kowalski et al. 1996a) that AI units respond in a phase-locked fashion to the moving ripples over a range of velocities and directions that depend on the ripple density of the spectrum. In particular, responses are usually tuned around a specific ripple velocity and density. In the ferret, responses are commonly best in the 4- to 16-Hz range and at densities lower than 2 cycles/octave. These findings are roughly consistent with those found in different species using different experimental paradigms: experiments with dynamic spectra (e.g., narrowband such as AM and FM tones or broadband such as modulated noise and click trains) have found similar maximum rates of synchronized responses in AI (Eggermont 1994; Schreiner and Urbas 1988).
2) We demonstrate a similarity between responses to upward and downward moving ripples. Specifically, the response parameter values and distributions to either direction are comparable (even if unequal), and hence reflect general dynamic response properties, not direction specific properties per se.
3) Complete spectro-temporal transfer functions are measured that exhibit a rich variety of shapes and cover a wide range of stimulus parameters. The STRF describes the way AI units integrate stimulus power along the spectro-temporal dimensions.
4) We illustrate a variety of STRFs with a broad range of BFs, bandwidths, asymmetrical inhibition, temporal dynamics, and direction selectivity. We have assessed the prevalence of these features over all sampled units by examining the distribution of specific parameters that reflect each of these features.
5) The degree and origin of inseparability of the unit transfer functions are assessed using two methods. In the first, SVD analysis is applied to the entire transfer function to determine the number and ratio of the resulting singular values. The results indicate that AI units are distributed relatively uniformly from full separability to moderate inseparability. In the second method, we examine the origin of inseparability and find that it is primarily due to two factors: an imbalance in the response power and an asymmetry in the spectral transfer function relative to the direction of ripple motion. Interestingly, we find that temporal (but not spectral) transfer functions are relatively symmetric and hence contribute little to overall transfer function inseparability.
In Kowalski et al. (1996a,b), pentobarbital was used for anesthesia; in the present study, a ketamine/xylazine combination was used. In Kohn et al. (1996), the effect of different anesthetics on the tuning properties of auditory cortical cells as a whole was presented. Under ketamine, a wider variety of responses was found, tuning to ripple density was slightly lower (from 1.05 cycle/octave under pentobarbital to 0.8 cycle/octave under ketamine), and no significant change in temporal tuning was observed. Other properties, though, such as linearity of the STRF for downward moving ripples, were unchanged. These results can be accounted for by assuming that overall, response fields measured with ripples have less inhibition under ketamine than under pentobarbital.
Separability and its implications
An important property of the responses is that for ripples moving in only one direction, the spectral and temporal functions are separable: within each quadrant they can be measured independently of each other. The property of quadrant separability makes it possible to measure the overall spectro-temporal transfer function in reasonable times using only single ripples, since only a few velocity and spectral density combinations need to be measured. We have established (Kowalski et al. 1996a) that all recorded transfer functions in AI exhibit quadrant separability. In the experiments reported here, we assumed quadrant separability (Kowalski et al. 1996a,b) and proceeded to examine whether the resulting two-dimensional transfer functions are fully separable. Our findings indicate that AI responses fall uniformly on a continuum from moderately inseparable to fully separable.
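The measurement saving that quadrant separability allows can be sketched with a toy example. If the transfer function within a quadrant is a product of a temporal and a spectral function, then one temporal cross-section and one spectral cross-section, which share a single grid point, determine every other point in that quadrant. The grid sizes follow the six temporal and eight spectral points per quadrant mentioned in RESULTS; the function name and the random profiles are hypothetical.

```python
import numpy as np

def quadrant_from_cross_sections(temporal, spectral, i_w, i_dens):
    """Rebuild a separable quadrant from two cross-sections.

    temporal: values at every ripple velocity, at the density with index i_dens
    spectral: values at every ripple density, at the velocity with index i_w
    The two cross-sections share the point (i_w, i_dens).
    """
    temporal = np.asarray(temporal, dtype=complex)
    spectral = np.asarray(spectral, dtype=complex)
    shared = 0.5 * (temporal[i_w] + spectral[i_dens])  # same point measured twice
    return np.outer(temporal, spectral) / shared

rng = np.random.default_rng(1)
F = rng.standard_normal(6) + 1j * rng.standard_normal(6)  # 6 ripple velocities
G = rng.standard_normal(8) + 1j * rng.standard_normal(8)  # 8 ripple densities
full = np.outer(F, G)                                     # separable quadrant

rebuilt = quadrant_from_cross_sections(full[:, 3], full[2, :], i_w=2, i_dens=3)
n_measured = len(F) + len(G) - 1      # points on the two cross-sections
n_grid = len(F) * len(G)              # points on the full quadrant grid
```

In this toy case the 13 points on the two cross-sections reproduce all 48 grid points exactly; with real data the reconstruction is only as good as the separability assumption within the quadrant.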
A fully separable cell cannot be directionally selective in its responses. Inseparability is a necessary condition for the formation of more complex STRFs; direction selectivity is one possible consequence of inseparability. A directionally selective STRF usually has a distinctive elongated form along a spectro-temporal direction that matches that of its most sensitive ripple stimulus. For example, the STRF illustrated in Fig. 8B is most responsive to a ripple with a density of 0.4 cycle/octave and w = 8 Hz, whose spectrogram matches well the outline of the STRF spacing and orientation. Direction selectivity implies that a unit is differentially responsive to one direction of ripple movement and hence must have a significantly nonzero directionality index. Therefore direction selectivity necessarily implies an inseparable STRF. The converse is not true: an inseparable STRF might reflect other factors such as asymmetric temporal and/or spectral transfer functions (nonzero t or s), which do not manifest themselves in an obvious elongated form or preferential responses to one direction or another (as shown in Fig. 14, center column, middle and bottom).
Separability also places strong constraints on the underlying biological processes that give rise to the STRF shapes. For example, full separability suggests that the STRF is constituted of independent temporal and spectral processing stages. By contrast, inseparability (or just quadrant separability) implies spectrally and temporally intertwined stages of processing, with the specific form of the model being entirely dependent on the details of the transfer functions. Quadrant separability in particular is a very strong constraint on both the neural inputs and the processing of the unit: almost all neural networks (whether linear or nonlinear) with multiple fully separable STRFs as inputs will in general produce a totally inseparable STRF. In particular, the naive procedure of constructing a directionally sensitive STRF by taking the simple sum of two fully separable STRFs with differing fm and d will produce a totally inseparable STRF that is not quadrant separable. To produce a quadrant separable STRF requires special inputs and/or special processing.
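The claim about summing two fully separable STRFs can be checked numerically on a toy example. The sketch below builds two separable STRFs as outer products of random temporal and spectral profiles (an assumption made for generality; the text's example differs only in specific parameters), sums them, and compares the ratio of the second to the first singular value, both for the STRF itself and for one quadrant of its two-dimensional Fourier transform. Function names and array sizes are hypothetical.

```python
import numpy as np

def rank_ratio(matrix):
    """Second singular value relative to the first (about 0 for a rank-1 matrix)."""
    sv = np.linalg.svd(matrix, compute_uv=False)
    return sv[1] / sv[0]

def quadrant(strf):
    """One quadrant of the 2-D transfer function: strictly positive frequencies
    along both axes (FFT ordering, zero and Nyquist bins excluded)."""
    tf = np.fft.fft2(strf)
    n_t, n_x = strf.shape
    return tf[1:n_t // 2, 1:n_x // 2]

rng = np.random.default_rng(2)
n_t, n_x = 32, 24
strf_a = np.outer(rng.standard_normal(n_t), rng.standard_normal(n_x))
strf_b = np.outer(rng.standard_normal(n_t), rng.standard_normal(n_x))
summed = strf_a + strf_b

single_full = rank_ratio(strf_a)              # fully separable input
single_quad = rank_ratio(quadrant(strf_a))    # each quadrant is separable too
summed_full = rank_ratio(summed)              # sum is no longer rank 1
summed_quad = rank_ratio(quadrant(summed))    # and neither is its quadrant
```

For a single separable input both ratios vanish to rounding error, whereas for the sum both are clearly nonzero: the sum is neither fully separable nor quadrant separable, which is why quadrant separability points to special inputs or special processing.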
It can be shown that a quadrant separable, temporally symmetric (i.e., t ≪ 1) cortical neuron can be easily constructed by taking inputs from (potentially) many units with (potentially) different spectral response fields and even with (potentially) different temporal impulse response properties, as long as the temporal dynamics of the inputs to the cortical cell are fast compared with the temporal dynamics of the cortical cell itself (Simon et al. 2000). Quadrant separability then occurs when the inputs are temporally phase-lagged relative to each other [though not necessarily by 90° as in Saul and Humphrey (1990) and Dong and Atick (1995)].
This is consistent with the input neural connectivity one expects from layer IV cortical neurons, which receive input from the thalamic medial geniculate body (MGB). MGB neurons may have fully separable STRFs [as is the case for typical inferior colliculus central (ICC) neurons (Escabi and Schreiner 1999)] with different spectral response fields (differing in width, extent/location of inhibitory bands, and to a lesser extent, best frequency). MGB temporal cross-sections of transfer functions are essentially constant when low-passed at a cutoff frequency appropriate to cortical behavior (e.g., typically well below 100 Hz) (Yeshurun et al. 1985). Furthermore, some MGB neurons may have a temporal phase lag, as in the "lagged cells" of the visual system's lateral geniculate nucleus (Saul and Humphrey 1990).
Significantly, the property of quadrant separability with temporal symmetry does not allow for any cortical inputs unless those inputs have the same temporal behavior as the neuron studied. If, for instance, all neurons in the same cortical column have similar temporal properties, including similar neural delays, this would be consistent with quadrant separability. Otherwise, cortical inputs would break quadrant separability and create a totally inseparable neuron. Total inseparability would be expected for cortical neurons in layers that receive significant input from other cortical columns or from any other neural source with significantly different temporal processing, including (but not limited to) any significant delays.
It is possible that this extremely constraining result is an anesthesia-induced effect. If not, the result is a fascinating constraint on the neural network providing input to a given cortical cell.
ACKNOWLEDGMENTS
We particularly thank A. Saul. We also thank J. Eggermont, I. Ohzawa, and M. Slaney for very helpful and illuminating discussions.
This work was supported by Office of Naval Research Multidisciplinary University Research Initiative Grant N00014-97-1-0501, National Institute on Deafness and Other Communication Disorders Training Grant T32 DC-00046-01, and National Science Foundation Grant NSFD CD8803012.
FOOTNOTES
Address for reprint requests: J. Z. Simon (E-mail: jzsimon{at}isr.umd.edu).
Received 15 November 1999; accepted in final form 12 October 2000.
REFERENCES