Epstein Laboratory, University of California, San Francisco, California 94143-0526
ABSTRACT
Snyder, Russell L., Maike Vollmer, Charlotte M. Moore, Stephen J. Rebscher, Patricia A. Leake, and Ralph E. Beitel. Responses of Inferior Colliculus Neurons to Amplitude-Modulated Intracochlear Electrical Pulses in Deaf Cats. J. Neurophysiol. 84: 166-183, 2000. Current cochlear prostheses use amplitude-modulated pulse trains to encode acoustic signals. In this study we examined the responses of inferior colliculus (IC) neurons to sinusoidal amplitude-modulated pulses and compared the maximum unmodulated pulse rate (Fmax) to which they responded with the maximum modulation frequency (maxFm) that they followed. Consistent with previous results, responses to unmodulated pulses were all low-pass functions of pulse rate. Mean Fmax to unmodulated pulses was 104 pulses per second (pps) and modal Fmax was 60 pps. Above Fmax, IC neurons ceased responding except for an onset burst at the beginning of the stimulus. However, IC neurons responded to much higher pulse rates when these pulses were amplitude modulated; 74% were relatively insensitive to carrier rate and responded to all modulated carriers including those exceeding 600 pps. In contrast, the responses of most of these neurons (70%) were low-pass functions of modulation frequency, and the remainder (30%) had band-pass functions, with a maxFm of 42 and 34 Hz, respectively. Thus temporal resolution of IC neurons for modulation frequencies is significantly lower than that for unmodulated pulses. These two measures of temporal resolution (Fmax and maxFm) were uncorrelated (r² = 0.101). Several parameters influenced the amplitude and temporal structure of modulation responses, including modulation depth, overall intensity, and modulation-to-carrier rate ratio. We observed distortions in unit responses to amplitude-modulated signals when this ratio was 1/4 to 1/6. Since most current cochlear implant speech processors permit ratios that are significantly greater than this, severe distortion and signal degradation may occur frequently in these devices.
INTRODUCTION
Cochlear prostheses have been in use for
postlingually deaf adults for more than 25 yr and are now widely
accepted as a safe and effective treatment for profound sensorineural
hearing loss (Gates et al. 1995). Many profoundly deaf
cochlear implant (CI) users with contemporary devices are able to
correctly identify 80% of high context sentences without visual cues
(Wilson et al. 1991, 1995), allowing these otherwise
deaf individuals to communicate verbally over the telephone. Such
speech reception has been attributed both to the development of
multichannel intracochlear electrodes and to advancements in sound
processing strategies, in particular, the use of amplitude-modulated
interleaved pulse trains to encode acoustic signals (McDermott et al. 1992; Wilson et al. 1991).
Since these processing strategies appear to provide additional benefit
to some CI users and since temporal resolution has been shown to be
important for the perception of temporal pitch, prosody, and speech
(Freyman et al. 1991; van Tassel et al. 1987), attention has focused on the ability of CI users to
detect amplitude-modulated (AM) signals. For normal hearing listeners, psychoacoustic studies of AM sounds have measured temporal modulation transfer functions (TMTFs), i.e., the minimum change in modulation depth necessary to allow
differentiation of a modulated sound from an unmodulated one in humans
(Bacon and Viemeister 1985; Hanna 1992; Houtgast 1989; Viemeister 1979; Yost et al. 1989), macaque monkeys (Moody 1994), and chinchillas (Salvi et al. 1982). These TMTFs have been used to estimate modulation
sensitivity or temporal resolution across subjects and across species.
These studies demonstrate that TMTFs in normal hearing subjects have
similar properties across several diverse mammalian species. Acoustic
TMTFs are low-pass or broadly band-pass functions with high-frequency
roll-off slopes of 3-4 dB/octave; they have maximum sensitivities at
20-80 Hz with cutoff frequencies of 55-160 Hz.
Psychophysical studies in CI subjects have
demonstrated that modulation detection using intracochlear electrical
stimulation has many of the same general characteristics as that of
normal hearing subjects (Busby et al. 1993; Eddington et al. 1978; Pfingst 1988; Shannon 1983, 1985, 1992; Townsend et al. 1987). Specifically, the frequencies of highest sensitivity and
the cutoff frequency in electrical low-pass TMTFs of CI subjects are
approximately equal to those seen in normal hearing subjects.
The results seen in both normal hearing listeners and CI users suggest
that the processes limiting temporal resolution in the auditory system
are located centrally. This suggestion is supported by physiological
studies of the responses of single neurons using both acoustic and
electrical activation of the auditory nerve (AN) and cochlear nucleus
(CN) neurons. Typically, acoustic TMTF cutoff frequencies for these
neurons are much higher than those reported in the psychophysical
studies. AN cutoff frequencies range between 500 and 1,500 Hz for
fibers with characteristic frequencies at or above 10 kHz, where
modulations are not limited by tuning curve bandwidth (see
Greenwood and Joris 1996; Joris and Yin 1992; Palmer 1982). Likewise, CN cutoff
frequencies are higher than those seen psychoacoustically, ranging from
50 to 1,300 Hz (Frisina et al. 1990; Kim et al. 1990; Møller 1974; Rhode and Greenberg 1994).
In contrast, TMTF cutoff frequencies for IC neurons more closely
approximate those observed in psychoacoustic studies. Typically, IC
single and multi-neuron TMTFs of several mammalian species have an average
cutoff frequency of ~100 Hz, although individual cutoff frequencies
range from 10 to 1,000 Hz (Batra et al. 1989, rabbits; Burger and Pollak 1998, bats; Langner and Schreiner 1988; Schreiner and Langner 1988, cats; Rees and Møller 1983, 1987, rat; Rees and Palmer 1989, guinea pigs). The similarity
in temporal resolution of IC neurons responding to acoustic and
electric signals and that estimated in psychophysical studies of both
normal hearing and CI users suggests that the IC neurons may play an
important role in defining the limits of psychophysical temporal
resolution and modulation sensitivity.
Many studies of AN fiber responses to intracochlear electrical pulses
have commented on their narrow dynamic range (Hartmann and Klinke 1989a,b; Hartmann et al. 1984; Javel et al. 1987; Kiang and Moxon 1972; Parkins 1989; Shepherd and Javel 1997; van den Honert and Stypulkowski 1987).
Previous studies of IC neurons have also demonstrated their narrow
dynamic range (3-6 dB) when activated by low rate pulses (<200 pps)
(Shepherd et al. 1999; Snyder et al. 1991, 1995). In addition, Snyder et al. (1995)
examined the temporal resolution of IC neurons and showed that all
responses were low-pass functions of pulse rate with an average cutoff
rate of ~100 pps. They also reported that maximum following frequency (Fmax) of most IC neurons was not significantly affected by stimulus intensity.
The present study extends the examination of temporal resolution of IC neurons to intracochlear electrical stimulation (ICES) using sinusoidal AM (SAM) of biphasic pulse trains. We have focused on these stimuli not only because they simulate the amplitude-modulated pulsatile carriers used in the processed stimuli presented to many CI subjects but also because they are more similar to the stimuli used in previous psychoacoustic and auditory physiology studies of temporal resolution.
METHODS
Deafening, implantation, and chronic stimulation
Experiments were conducted in five previously normal cats
deafened as adults and in nine adult cats deafened as neonates. The
previously normal cats were deafened 2-4 wk before the physiological experiment. Three of these were implanted and chronically stimulated for
22 wk. The remaining two normal cats were deafened and implanted 2 wk before the physiological experiment and received no chronic stimulation. Nine neonatally deafened cats were implanted at 6-8 wk of
age and chronically stimulated beginning 2 wk after implantation for
22-36 wk. Implantation and stimulation histories of these animals are
summarized in Table 1. All animals were
maintained in the animal care facility at the University of California
at San Francisco. All procedures were approved by the UCSF Committee on
Animal Research and were conducted in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals.
Details of the procedures for neonatal deafening, surgery,
implantation, chronic stimulation, and acute physiological preparations and recordings have been reported previously (Snyder et al. 1990, 1991, 1995; Vollmer et al. 1999). In
brief, kittens were deafened by daily intramuscular injections of
neomycin sulfate (60 mg · kg⁻¹ · d⁻¹) beginning 24 h after birth and
continuing for 16 days. During this developmental period, hearing
thresholds are known to be elevated by 90 dB SPL in normal kittens
and do not reach normal adult thresholds until after postnatal day 21 (Brugge and O'Connor 1984; Walsh and McGee 1986). The ototoxic effect on hearing thresholds was monitored
by recording click-evoked auditory brain stem responses (ABRs) and
500-Hz tone-evoked frequency-following responses (FFRs). These ensemble
responses were recorded differentially using silver wire electrodes. An
electrode through the scalp under the pinna of the stimulated ear
served as the reference, another through the scalp over the vertex
served as the active, and a third through the scalp below the
contralateral pinna served as the ground. Responses were amplified
100,000 times, filtered with a band-pass of 10-3,000 Hz, and averaged
over 500 stimulus presentations. ABRs were evoked by 0.1-ms condensation
clicks presented at a rate of 10/s, and 500-Hz tone bursts with a
rise/fall of 5 ms and duration of 60 ms presented 5/s evoked the FFRs.
ABR and FFR thresholds were measured during the third postnatal week
and again at ~6 wk of age. Normal adult cats were deafened 2-3 wk
prior to implantation by a single subcutaneous injection of kanamycin
(400 mg/kg) followed by infusion of ethacrynic acid (Xu et al. 1993). ABR and FFR thresholds were tracked in these animals
until they were >110 dB SPL (~2-3 h). All animals included in this
study had ABR and FFR thresholds >110 dB SPL.
Prior to all surgical procedures, animals were sedated with an intramuscular injection of ketamine (22 mg/kg). An intravenous catheter was inserted into the cephalic vein and sterile mammalian Ringer solution was infused continuously through the catheter. A surgical level of anesthesia was induced and maintained by infusion of pentobarbital sodium via the intravenous catheter. The infusion rate was adjusted to provide a level of anesthesia sufficient to suppress both corneal blink and forelimb withdrawal reflexes. The animal's scalp was shaved, and the head was mounted in a Kopf mouth-bar head holder.
Intracochlear electrodes were inserted through the round window into
the scala tympani. These electrodes have been described in detail
elsewhere (see Rebscher 1985). In brief, they
consist of four Teflon-coated and Parylene-C-insulated,
platinum-iridium (90%:10%) wires, embedded in a Silastic carrier.
Each wire ended as a 250-300 µm ball contact. The contacts were
arranged as offset radial pairs, an apical pair (1,2) typically located
~11 mm from the cochlear base and a basal pair (3,4) typically
located ~7 mm from the base. Each contact of a pair was separated
from the other by 1 mm.
All stimuli for chronic electrical stimulation were capacitively
coupled and charge-balanced. For most animals, chronic stimuli consisted of biphasic square-wave pulses (0.2 ms/phase), alternating in
polarity and delivered continuously at 300 or 800 pps. These pulses
were amplitude modulated at frequencies from 20 to 60 Hz. These chronic
stimuli were delivered to one intracochlear electrode pair at a maximum
intensity 2-6 dB above electrically evoked auditory brain stem
response (EABR) threshold and presented 4 h/day, 5 days/wk, for periods of
22-36 wk (see Table 1). EABRs were recorded twice monthly using
procedures identical to those used to record ABRs except that biphasic
current pulses (0.2 ms/phase) were substituted for acoustic clicks. In
two animals, the chronic electrical stimuli were electrical analogs of
the ambient acoustic environment. These animals were maintained in an
open wire cage in the laboratory. The spectral content of the ambient
acoustic environment was difficult to characterize and highly variable.
The environment consisted primarily of noises generated by these very
active animals, speech from the workers in the laboratory, and music
and speech from a nearby radio. The analog waveforms were band-pass
filtered between 0.1 and 3 kHz and subjected to logarithmic amplitude
compression. The maximum output of the device was set to 6 dB above the
EABR threshold. The stimulating signals were delivered using constant current stimulators (Vureck et al. 1981). All currents
are specified peak to peak.
Acute electrophysiological experiments
In acute physiological experiments, animals were anesthetized as described above, a cannula was inserted into the trachea via a tracheostomy, and a urinary catheter was inserted into the urethra. After the scalp over the right side of the skull was reflected and the temporalis muscle removed, a craniotomy was performed through the right parietal bone just anterior to the tentorium to give access to the middle cranial fossa. The dura was excised and the right occipital cortex aspirated to expose the tentorium. Using a diamond burr, an opening was made in the tentorium to expose the entire dorsal and dorsolateral surface of the IC.
Neural responses were recorded differentially using two tungsten
microelectrodes with carefully matched impedances between 0.8 and 1.5 MΩ at 1 kHz. Using micromanipulators, the electrodes were positioned
at the surface of the IC. The "reference" electrode remained at the
surface in the cerebrospinal fluid. The "active" electrode was
inserted into the IC along a standardized trajectory in the coronal
plane at an angle of 45° off the parasagittal plane. The active
electrode was advanced using a hydraulic microdrive penetrating the IC
at its dorsolateral margin and advancing toward its ventromedial edge.
We have constructed spatial tuning curves (STC) for all these
penetrations, which allow us to derive criteria that operationally
define the border between the external, ICx, and central nucleus, ICc
(Snyder et al. 1990; Vollmer et al. 1999). The reported data are restricted to recording sites
located
1,700 µm from the surface of the IC along a standardized
penetration trajectory. Since this is
1,000 µm below the most
superficial region of maximum sensitivity to electrical stimulation
(the center of the ICx), we assume that these recordings were made
within the ICc. Neural activity was amplified (10,000-20,000 times)
with a band-pass of 300 Hz to 3 kHz using a battery-powered
preamplifier (WPI DAM50) followed by a second stage amplifier
(Tektronix 5A22N) and monitored on a Tektronix 5110 oscilloscope.
Action potentials from single neurons were isolated from both
background activity and electrical artifact using a spike discriminator
(BAK DIS-1). The time of occurrence of each discriminated spike was
recorded and stored using a PC microcomputer with an accuracy of 10 µs.
Search stimuli consisting of biphasic pulses (0.2 ms/phase) were
delivered at 3/s. Once a neuron had been isolated, its threshold was
determined to the search stimulus. The stimulus levels were controlled
using either Hewlett-Packard (model 350) or Tucker-Davis (PA-1)
attenuators. The intensity was set to 6 dB above threshold, and the
neuron's responses to 20 presentations of a 10-pps pulse train were
recorded and displayed as poststimulus time histograms (PSTHs). The
pulse train was 320 ms in duration with a 1-s interval between pulse
trains. After the responses at this frequency were recorded, the pulse
rate was increased by 10 pps and the process repeated until the neuron
no longer responded in a time-locked manner, i.e., responded with only
an onset burst. The amplitude of the time-locked response was estimated
using the following formula
[Equation not available in the source text]
Responses to 100% sinusoidal AM (SAM) were examined in "modulation
series." Each modulation series consisted of a fixed modulation frequency and fixed peak current while the carrier rate was increased. Initially, carrier rates were increased from 100 to 1,000 pps in steps
of ~100 pps. In subsequent experiments, carrier rates were
incremented in 100-pps steps to 500 pps and then in 200-pps steps to
~1,000 pps. These modulated pulse trains were 500 ms in duration and
were separated by interstimulus intervals of 2 s. The effect of
changing modulation frequency was determined by recording responses to
all carrier rates at modulation frequencies ranging from 8 to as high as
400 Hz. The stimulus intensity was chosen by adjusting the maximum
pulse current to be equal to 4 dB re threshold, i.e., equal to the
intermediate intensity used in the frequency series as described above.
Modulation frequencies were selected in pseudorandom order beginning
with 30 Hz followed by 8, 16, 24, 40, and 60 Hz. If the neuron
continued to respond, modulation frequency was increased in 20-Hz steps
until that neuron ceased to respond in a time-locked manner. The
amplitude of the time-locked modulation response was estimated by using
the same formula used to estimate the response to unmodulated pulses.
However, Nc, the number of modulation
cycles in the train minus 1 (since the first cycle always produces an onset
burst), was substituted for Npls.
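The SAM stimuli described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the starting phase of the envelope, and the normalization of the envelope to the peak current are hypothetical choices, not details given in the text. The 500-ms duration and 100% modulation depth are taken from the description above.

```python
import math

def sam_pulse_amplitudes(carrier_pps, mod_hz, peak_current_ua,
                         duration_s=0.5, depth=1.0):
    """Return (time_s, amplitude_uA) pairs for a SAM pulse train.

    Assumptions (not specified in the text): the envelope starts at its
    minimum, and it is scaled so that its maximum equals peak_current_ua.
    """
    n_pulses = int(round(carrier_pps * duration_s))
    pulses = []
    for k in range(n_pulses):
        t = k / carrier_pps
        # Raised sinusoid in [1 - depth, 1 + depth], normalized to a peak of 1
        envelope = (1.0 + depth * math.sin(2.0 * math.pi * mod_hz * t
                                           - math.pi / 2.0)) / (1.0 + depth)
        pulses.append((t, peak_current_ua * envelope))
    return pulses

# Example: 10-Hz modulation of a 100-pps carrier with a 200-uA peak
train = sam_pulse_amplitudes(100, 10, 200.0)
```

With these choices every pulse amplitude lies between 0 and the fixed peak current, so changing the carrier rate changes only how densely the envelope is sampled.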
RESULTS
As described previously (Snyder et al. 1995), IC
neurons respond vigorously to intracochlear electrical pulses when
pulse rates are low (<20 pps), but as pulse rates are increased, the number of spikes per pulse diminishes monotonically (see Figs. 1A, 3A, and
4A). The decrease in the response occurs as a result of
adaptation. At low pulse rates (10-80 pps), each suprathreshold pulse
evokes a spike, and the number of spikes evoked by each pulse is
roughly equivalent. At moderate pulse rates (90-120 pps), the number
of spikes evoked by pulses near the end of the train is fewer than
those evoked by pulses at the beginning. At higher pulse rates (>130
pps), only the first few pulses evoke a response. Finally, above the
Fmax (160 pps in Fig. 1A), IC neurons respond to sustained
pulse trains with only a brief onset burst evoked by the first pulse of
the train followed by no spikes or randomly timed spikes at a very low
rate.
Effects of stimulus intensity
Virtually every study of the responses of AN fibers to intracochlear electrical stimulation has commented on their relatively narrow dynamic range (e.g., see Hartmann and Klinke 1989a,b; Hartmann et al. 1984; Javel et al. 1987, 1989; Kiang and Moxon 1972; Parkins 1989; Shepherd and Javel 1997; van den Honert and Stypulkowski 1987a,b). In previous studies, Snyder et al. (1991, 1995) determined that the dynamic range of a large sample of IC
neurons was between 8 and 12 dB when activated by 100-Hz electrical
sinusoids and much less than that (3-6 dB) when activated by low rate
(<200 pps) pulses. Moreover they determined that once stimulus
intensity reached 2-3 dB above threshold, the frequency following
capacity (temporal resolution) of most IC neurons was not significantly affected by stimulus intensity.
Nevertheless, to control for intensity effects, the frequency responses
of most neurons in this study were recorded at three stimulus
intensities, 2-3, 4-5, and 6-7 dB above threshold. Examples of a
typical neuron's responses at these three intensities are illustrated
in Fig. 1. In Fig. 1A, PSTHs to pulse trains at increasing pulse rates are illustrated at 2 dB (158 µA), 4 dB (200 µA), and 6 dB (251 µA) above threshold. At rates of 80 pps, there is clear adaptation across the duration of the stimulus and the rate of adaptation increases with pulse rate, until at ~150 pps, the neuron responds with an onset burst followed by no discharges. Increasing pulse rate has a similar monotonic effect on this neuron's responses regardless of stimulus intensity, and Fmax for this neuron is similar
at all three intensities. When Fmax is determined at all three
intensities for 62 neurons, mean Fmax at an intensity 1-2 dB above
threshold was slightly but significantly (P < 0.01, Student's paired t-test) lower than that evoked by the same
stimulus at 3-4 and 5-6 dB above threshold [mean Fmax at 1-2
dB = 72 ± 42 (SD) pps; mean Fmax at 3-4 dB = 85 ± 49 pps and mean Fmax at 5-6 dB = 88 ± 57 pps]. Thus
although stimulus intensity has a small effect on Fmax at levels near
threshold, once stimulus intensity is >2-3 dB above threshold, mean
Fmax is unaffected by stimulus intensity.
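The currents quoted for Fig. 1A (158, 200, and 251 µA at 2, 4, and 6 dB above threshold) are consistent with the usual 20·log10 convention for current ratios. The sketch below makes that arithmetic explicit. The threshold of about 125.9 µA is an inferred assumption, back-calculated from the 2-dB value; it is not stated in the text, and the function names are hypothetical.

```python
import math

def current_at_db_above_threshold(threshold_ua, db_above):
    """Current (uA) at a level db_above threshold, 20*log10 convention."""
    return threshold_ua * 10.0 ** (db_above / 20.0)

def db_re_threshold(current_ua, threshold_ua):
    """Level of a current in dB relative to threshold."""
    return 20.0 * math.log10(current_ua / threshold_ua)

# Assumed threshold, inferred from 158 uA being 2 dB above threshold
threshold_ua = 125.9
levels = {db: current_at_db_above_threshold(threshold_ua, db)
          for db in (2, 4, 6)}
```

Under this assumption each 2-dB step multiplies the current by about 1.26, which matches the progression of the three quoted values.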
If the spike rate is plotted in spikes per second as a function of pulse rate (Fig. 1B, top), the shape of the frequency transfer function varies with intensity. At low intensity (2 dB above threshold or 158 µA), the neuron's response is a relatively sharply tuned, band-pass function with the maximum response at 80 pps. When the intensity is raised to 6 dB above threshold or 251 µA, the neuron's response becomes relatively broad with the maximum response at a slightly lower pulse rate (~70 pps). The band-pass shape of these functions suggests that this neuron responds "best" to rates between 70 and 80 pps and more poorly to rates above and below this range. However, if the response is plotted in spikes per pulse as a function of pulse rate (Fig. 1B, bottom), the shapes of the frequency transfer functions are consistently low-pass, and it is clear when plotted in this manner that the responses to pulses delivered at low rates are much stronger than those to high rate pulses. Thus despite the common practice of reporting spike rates in spikes per second, we have chosen to analyze and report our results in spikes per pulse (spk/p) for frequency transfer functions (FTFs) and numbers of spikes per modulation cycle (spk/c) for the temporal modulation transfer functions (TMTFs). Moreover, we have chosen to examine the TMTFs at intensities (4-5 dB above threshold) where pulse amplitude has little effect on the temporal resolution.
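The difference between the two ways of plotting can be illustrated with a small numerical sketch. The spikes-per-pulse values below are hypothetical, chosen only to resemble a low-pass function; they are not data from this study. Multiplying a low-pass spk/p function by the pulse rate gives a spikes-per-second function that peaks at an intermediate rate.

```python
def spikes_per_second(spikes_per_pulse, pulse_rate_pps):
    """Convert a per-pulse response into a per-second spike rate."""
    return spikes_per_pulse * pulse_rate_pps

# Hypothetical low-pass response in spikes per pulse (illustrative only)
rates_pps = [10, 40, 80, 120, 160]
spk_per_pulse = [1.0, 1.0, 0.9, 0.3, 0.05]

spk_per_s = [spikes_per_second(r, f)
             for r, f in zip(spk_per_pulse, rates_pps)]

# Pulse rate at which the spikes-per-second curve peaks
best_rate = rates_pps[spk_per_s.index(max(spk_per_s))]
```

In this toy example the per-pulse response never increases with rate, yet the per-second curve has an apparent "best" rate, which is the band-pass appearance described for Fig. 1B, top.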
Effects of modulation frequency and carrier rate
The effects of modulation frequency and carrier rate on the responses of IC neurons to SAM pulse trains are complex. One might predict that these effects would interact strongly with the temporal resolution of that neuron (as indicated by its maximum following frequency to unmodulated pulse trains). For example, it might be expected that a given neuron would respond poorly to all modulations of carrier rates when they were above its maximum following frequency. Or one might predict that a neuron might respond strongly to modulation frequencies that were at or below its maximum following frequency regardless of the carrier rates. In this section, we will describe the influence of these three factors (modulation frequency, carrier rates, and Fmax) on the responses of IC neurons to SAM pulse trains.
We estimated Fmax for 207 IC neurons (Fig. 2). The average Fmax for these neurons was 104 pps, and the modal Fmax was 60 pps. We have arbitrarily divided these neurons into three groups: low, medium, and high temporal resolution groups. Low-resolution neurons were those with an Fmax <60 pps. Medium-resolution neurons were those with an Fmax between 60 and 120 pps. High-resolution neurons were those with an Fmax >120 pps. In the next several figures, the responses of low-, medium-, and high-resolution neurons will be illustrated to both unmodulated and SAM pulses. Representative responses are illustrated across a range of modulation frequencies and carrier rates to demonstrate the influence of these parameters.
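The three-way grouping by Fmax can be written as a simple rule. The sketch below uses the boundaries given above (below 60 pps, 60 to 120 pps, above 120 pps); the function name is hypothetical, and treating 60 and 120 pps as belonging to the medium group follows from the "between 60 and 120 pps" wording.

```python
def resolution_group(fmax_pps):
    """Assign a neuron to a temporal resolution group from its Fmax."""
    if fmax_pps < 60:
        return "low"
    if fmax_pps <= 120:
        return "medium"
    return "high"

# Approximate Fmax values of the example neurons in Figs. 3, 4, 7, 8, and 9
examples = {30: resolution_group(30), 40: resolution_group(40),
            70: resolution_group(70), 100: resolution_group(100),
            350: resolution_group(350)}
```

Applied to the example neurons illustrated in this section, the rule returns the same group labels that the text uses for them.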
Figure 3 illustrates the responses of one
low-resolution neuron to both unmodulated pulse trains (Fig.
3A) and SAM pulse trains (Fig. 3B). This neuron
has an Fmax of ~30 pps. Given this Fmax, one might predict that this
neuron would respond to modulation frequencies up to 30 Hz. However,
this low-resolution neuron responded with a strong onset response and a
weak modulation response at the lowest carrier rate of 96 pps (Fig.
3B, top). At higher carrier rates (e.g., 417 pps; Fig.
3B, bottom), there was only an onset response.
The average time-locked response (total response minus the onset
response for all carrier rates) is low (~3 spk/c) at the lowest
modulation frequency (8 Hz) and decreases at higher modulation
frequencies to <1 spk/c (Fig. 3C). Thus Fmax, a measure of
temporal resolution using unmodulated pulse trains, is a poor predictor
of this neuron's sensitivity to modulation frequency. However, Fmax is
a reasonable predictor of this neuron's sensitivity to changes in
carrier rate in the sense that it correctly predicts that it does not
respond to high rate pulses (modulated or not). However, this extreme
sensitivity to carrier rate was observed in only seven (3%) IC
neurons: five were low-resolution neurons, one medium-resolution
neuron, and one high-resolution neuron.
The responses of most IC neurons were relatively insensitive to carrier
rate. In these neurons, Fmax was a relatively good predictor of the
neuron's sensitivity to modulation frequency and a poor predictor of
its sensitivity to carrier rate. Figure 4
illustrates the responses of a second low-resolution neuron in which
the modulated response was insensitive to carrier rate and the Fmax
underestimates the response to modulated signals. Its Fmax is estimated
to be 40 pps (only slightly higher than that of the previous neuron).
However, at identical peak intensity, it responded strongly to all SAM
carriers from 96 to 345 pps (the highest carrier rate tested) and to
modulation frequencies up to 40 Hz (the highest modulation frequency
tested). Carrier rate had little influence on this neuron's modulation
response. Ignoring for the moment the aliasing in this neuron's 40/96
response (i.e., the response to 40-Hz modulation of a 96-pps carrier),
the amplitude of the modulation response in spk/c is similar for all
carrier rates up to 345 pps (vertical columns, Fig. 4B).
Moreover, in Fig. 4C the average spk/c over all SAM carriers
is greater than the amplitude of responses time-locked to
unmodulated pulses at all repetition rates. Thus Fmax is a poor
predictor of this neuron's response to SAM of high rate carriers but a
good predictor of its response to modulation frequencies.
In addition to responding to relatively high rate carriers, this neuron responded to modulation frequencies above its Fmax. Its response to 40-Hz modulations (e.g., its 40/345 response, Fig. 4B) is much stronger than its response to 40 pps (last response, Fig. 4A). Excluding the onset burst, this neuron responded at a rate of <1 spk/p to 40-pps unmodulated pulse trains (Fig. 4C), whereas its average response rate to 40-Hz modulations is 7 spk/c. It might be argued that at this intensity there are many more pulses above threshold in a 40/345 pps stimulus than there are in a 40-pps pulse train of comparable duration, and therefore one might expect that the modulated response would be larger than the unmodulated response. However, the carrier rate has little influence on either the amplitude or temporal dispersion of the modulated response. There are nearly four times as many pulses in each modulation cycle of a 345-pps carrier as in a 96-pps carrier. Yet at all modulation frequencies, the responses at both carrier rates are similar. Moreover, it is clear that this neuron does not resolve the individual pulses in these carriers. The temporal dispersion (vector strength) of responses evoked by modulated pulses at all carrier rates is comparable to that observed in the responses to unmodulated pulses. Thus this neuron is not responding to each suprathreshold pulse in the 40-Hz modulation of the carrier. Rather, it has a stronger response to 40-Hz modulation of a carrier than to pulses presented at 40 pps.
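Vector strength, mentioned above as the measure of temporal dispersion, is conventionally computed as the length of the mean resultant vector of spike phases relative to the stimulus period. The text does not spell out the exact computation used in this study, so the following is a minimal sketch of the conventional definition, with hypothetical function and variable names.

```python
import math

def vector_strength(spike_times_s, freq_hz):
    """Mean resultant length of spike phases at freq_hz (0 = none, 1 = perfect)."""
    n = len(spike_times_s)
    if n == 0:
        return 0.0
    # Sum unit vectors whose angles are the spike phases in the stimulus cycle
    c = sum(math.cos(2.0 * math.pi * freq_hz * t) for t in spike_times_s)
    s = sum(math.sin(2.0 * math.pi * freq_hz * t) for t in spike_times_s)
    return math.hypot(c, s) / n

# Spikes locked to one phase of a 40-Hz cycle: vector strength close to 1
locked = [k / 40.0 for k in range(10)]
vs_locked = vector_strength(locked, 40.0)

# Spikes split evenly between opposite phases: vector strength close to 0
split = [0.0, 0.0125, 0.025, 0.0375]
vs_split = vector_strength(split, 40.0)
```

A value near 1 means the spikes cluster tightly at one phase (low temporal dispersion), and a value near 0 means they are spread across the cycle.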
The characteristics displayed by this neuron are typical of most IC neurons. In 138 (74%) of the 185 neurons in which it could be measured, modulation responses were relatively insensitive to carrier rate, i.e., carrier filter function was low-pass with an upper cutoff frequency >600 pps (Fig. 5). However, like most IC neurons, this neuron is very sensitive to modulation frequency. It did not respond to any carriers modulated at frequencies >40 Hz. Typically the modulation response of an IC neuron was a low-pass function; 70% were low-pass, whereas 30% were band-pass and the mean maxFm was 42.2 Hz with a range of 8-220 Hz (Fig. 6). Thus the responses of the neuron illustrated in Fig. 4 are typical of IC neurons, although atypically its maxFm is higher than its Fmax.
Figure 7 illustrates the responses of a medium-resolution neuron with modulation responses that are not sensitive to carrier rate and similar to the low-resolution neuron illustrated in Fig. 4. The frequency series (Fig. 7A) demonstrates that this neuron has an Fmax of ~70 pps. Yet when it is stimulated with all SAM carriers varying from 96 to 417 pps (Fig. 7B), it responds strongly at modulation frequencies from 8 to 24 Hz. At a modulation frequency of 30 Hz, the modulation response is clearly diminished at lower carrier rates (30/96 and 30/185), whereas at higher carrier rates, the response is relatively strong. At a modulation frequency of 40 Hz, the modulation response is diminished even further and the nature of the responses is dependent on the carrier rate. Its 40/96 response (response to 40-Hz modulation of a 96-pps carrier) displays aliasing (see following text) and consists of spikes time-locked to alternating modulation cycles, so that the primary component of the response corresponds to 20 Hz. At higher carrier rates (40/185 and 40/268), the overall response and its 20-Hz component decrease progressively. At the highest carrier rates (40/345 and 40/417), the overall response recovers somewhat and the primary component corresponds to 40 Hz. In Fig. 7C, the neuron's responses in terms of spk/s and spk/c are illustrated. Both the time-locked (spk/c) modulated and (spk/p) unmodulated responses are low-pass functions with comparable cutoff frequencies. Thus Fmax in this neuron is a good predictor of the maxFm, although it is a poor predictor of the effect of carrier rate. Modulation responses are relatively insensitive to carrier rate but are highly sensitive to modulation frequency.
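One way to see why a 40/96 stimulus differs from one modulation cycle to the next is to count how coarsely the carrier samples the envelope. The sketch below computes the number of pulses per modulation cycle and the largest sampled envelope value in each cycle. It assumes a raised-cosine envelope that starts at its minimum and integer carrier and modulation rates; both are illustrative assumptions, the function names are hypothetical, and the sketch describes only the stimulus, not the neural response.

```python
import math

def pulses_per_cycle(carrier_pps, mod_hz):
    """Average number of carrier pulses in one modulation cycle."""
    return carrier_pps / mod_hz

def per_cycle_peaks(carrier_pps, mod_hz, n_cycles):
    """Largest sampled envelope value (0..1) in each modulation cycle.

    Assumes integer rates so pulse phases can be computed exactly.
    """
    peaks = [0.0] * n_cycles
    k = 0
    while (k * mod_hz) // carrier_pps < n_cycles:
        cycle, rem = divmod(k * mod_hz, carrier_pps)
        phase = rem / carrier_pps  # position of pulse k within its cycle
        envelope = 0.5 * (1.0 - math.cos(2.0 * math.pi * phase))
        peaks[cycle] = max(peaks[cycle], envelope)
        k += 1
    return peaks

low_carrier = per_cycle_peaks(96, 40, 5)    # 2.4 pulses per cycle
high_carrier = per_cycle_peaks(417, 40, 5)  # about 10.4 pulses per cycle
```

Under these assumptions the 96-pps carrier delivers only 2.4 pulses per 40-Hz cycle, and the largest pulse in a cycle varies between 75% and 100% of the peak, whereas with the 417-pps carrier every cycle contains a pulse within a few percent of the peak.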
The responses of the previous two neurons are typical of 74% of
IC neurons. Their modulation responses were insensitive to carrier rate and were a low-pass function of modulation frequency. Figure 8 illustrates the response of a
second medium-resolution neuron that is sensitive to both carrier and
modulation frequency. The frequency series (Fig. 8A)
demonstrates that this neuron responds strongly when stimulated with
unmodulated pulses at repetition rates up to 40 pps. Above that rate, the
time-locked responses diminish progressively but are still measurable
at 100 pps. Thus the Fmax of this neuron is ~100 pps. When stimulated
with a SAM carrier of 96 pps (Fig. 8B, top), this neuron
responds to modulations from 8 to 40 Hz. However, at carrier rates
>100 pps, its response is progressively attenuated, and this
attenuation is modulation rate dependent. For example, modulation of a
185-pps carrier (Fig. 8B, 2nd row) produces a
very weak response at 8-Hz modulation (8/185) but a strong response at
all other modulation frequencies. In contrast, modulation of a 268-pps
carrier produces weak modulation responses at all modulation
frequencies. Thus responses of this neuron are sensitive to both
carrier rate and modulation frequency. The carrier filter is a low-pass
function with a cutoff frequency of ~185 pps, and the modulation
filter is also low-pass with a cutoff of just above 40 Hz. Moreover,
the relationship between Fmax and maxFm is the reverse of that seen in
the previous neuron. Again ignoring 40/96 responses, the 40 Hz
modulation responses (Fig. 8B) are much weaker than those to
40 pps (Fig. 8A), indicating that this neuron can follow
higher repetition rates when the pulses are unmodulated than when they
are modulated. Sensitivity to both modulation and carrier rates was
observed in 26% (44 of 185) of IC neurons.
Figure 9 illustrates the responses of a
high-resolution neuron. This neuron responds strongly to unmodulated
pulses delivered at rates up to 90 pps, responds moderately to pulses at
rates of 100-150 pps, and responds weakly to pulses above 150 pps (Fig.
9A). At a pulse rate of ~350 pps, it no longer responds in
a time-locked fashion; therefore, its Fmax is ~350 pps. When
stimulated with SAM pulse trains (Fig. 9B), this neuron
displays strong phase-locked responses to modulation frequencies from 8 to 30 Hz. It displays modest responses to 40-Hz modulation and weak but
significant responses to modulations from 60 to 100 Hz, especially at
carrier rates >345 pps. Thus this high-resolution neuron can follow
higher modulation frequencies than any of the previous lower resolution
neurons, but its maxFm is still markedly lower than its Fmax. Figure
9C demonstrates that the modulation transfer function
parallels the frequency transfer function, although it rolls off
at a much lower repetition rate. This suggests that there is a tendency for neurons that display high resolution to unmodulated pulses to
display higher resolution to modulated carriers. However, this is just
a tendency. In Fig. 10 maxFm is plotted
as a function of Fmax for 129 neurons for which both values could be
determined. Most points fall below the equal frequency contour
(- - -), indicating, as suggested previously, that the maxFm of most
IC neurons is less than their Fmax. Linear regression analysis indicates
that on average maxFm is approximately half Fmax. The positive slope of
the regression line confirms that there is a tendency for maxFm to
increase as Fmax increases, but the low
r² value (0.101) indicates that this
relationship is relatively weak.
An examination of the modulated responses of the high-resolution neuron in Fig. 9 illustrates two additional important aspects of modulated responses that are characteristic of many IC neurons responding to SAM electrical pulse trains. First, at low modulation frequencies, these high-resolution neurons resolve individual pulses in the carrier. For example, in response to an 8/96 stimulus (Fig. 9B, top left), this neuron produces a series of four spike clusters, each cluster corresponding to a maximum in the 8-Hz envelope of the stimulus. Within each cluster there is a series of peaks, each time locked to the 96-pps carrier. Thus each supra-threshold pulse in the carrier evokes a spike, and the envelope modulates the probability of a spike. In this way, both the modulation frequency and carrier rates are represented in these low-modulation-frequency/low-carrier-rate signals. Second, in response to low carrier rates modulated at high modulation frequencies (low Fc to Fm ratios), there are distortions in the response, i.e., peaks in the response at unexpected intervals. These unexpected distortion intervals arise as a result of under-sampling of the modulation frequency by the carrier rate. For example, when stimulated by a 96-pps carrier modulated at 40 Hz, the response has peaks that are separated by intervals that are multiples of the modulation interval, i.e., this neuron consistently fails to discharge during certain modulation cycles. Similar response interval doubling can be seen in responses described previously (see 30/96 in Fig. 4B; 40/96 and 40/185 in Figs. 4B, 7B, 8B, and 9B). In other cases, distortions take the form of low-frequency modulation of the modulated response. For example, when stimulated by 30/96 or 60/185, the neuron illustrated in Fig. 9B responds with spikes clustered at peaks corresponding to 60-Hz modulations, but these peaks are themselves modulated at a frequency of just over 6 Hz.
The 6-Hz distortions produce three modulations of the 60-Hz responses over the 500-ms recording interval. Similar, but higher, "aliased" modulations can be seen in this neuron's responses, when it is stimulated with 80/185 to 80/493 modulation/carrier combinations (top 4 responses at 80-Hz modulation). When stimulated at higher carrier rates (80/606 and 80/714), the overall amplitude of the response is diminished, but the response is time-locked to 80 Hz. Thus when IC neurons respond to AM pulse trains and carrier rates less than three to six times the modulation frequencies, the major response intervals are unpredictable and severe distortions in the temporal representations of the modulation frequency occur.
An examination of the stimuli that give rise to these distortions
indicates that they arise as a direct result of under-sampling of the
modulation frequency by a pulsatile carrier rate. When an IC neuron is
stimulated with a pulsatile carrier (e.g., 96 pps) that is modulated at a
modulation frequency (e.g., 30 Hz) that is greater than 1/4 the
carrier rate, predictable "under-sampling" of the modulation frequency by the carrier rate can occur. This under-sampling occurs when the peak stimulus current is near the neuron's threshold, as it
often will be given the narrow dynamic range of intracochlear electrical stimulation, and when the depth of modulation is sufficient to reduce the minimum pulse current below the neuron's threshold. The
basic nature of this interaction is illustrated in Fig.
11, a diagram of 15 cycles (500 ms) of
a 96-pps carrier modulated at 30 Hz. The diagram shows the modulation
envelope, and - - - indicates a hypothetical neural threshold at 4 dB
below peak current. By focusing on the supra-threshold pulses, three (6 Hz) modulations of the carrier can be seen clearly above threshold. The
amplitude and timing of these model "supra-threshold" pulses
closely match the discharges produced in the 30/96 response illustrated
in Fig. 9B. Thus under certain conditions, under-sampling
occurs that can lead not only to modulation of the modulation response
but also to variations in the timing of neuronal discharges such that 1) they occur over a range of intervals that are integral
multiples of the carrier interval and 2) whole modulation
cycles are skipped altogether (see for example the 40/96 response in
Fig. 8 and the 40/185 response in Fig. 9). Our data indicate that these
distortions can occur despite the fact that the carrier rate is 2.5 times the modulation frequency, i.e., they occur at modulation
frequencies that are well below the Nyquist frequency for the carrier.
The responses of IC neurons indicate that the pulsatile carrier rates must be between four and six times the modulation frequency to avoid
the production of these undesirable distortions. Thus when carrier
rates are less than three to six times the modulation frequencies, the
major response intervals are unpredictable and severe distortions in
the temporal representations of modulation rates occur.
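The under-sampling described for Fig. 11 can be reproduced with a minimal sketch. It is not the authors' analysis code: the function names are hypothetical, and it assumes a 100% modulated sinusoidal envelope that starts at zero phase, a first pulse at t = 0, and a fixed threshold 4 dB below peak current. Under those assumptions, a 96-pps carrier modulated at 30 Hz beats against the third harmonic of the modulator at 6 Hz, so the pattern of supra-threshold pulses repeats about three times in 500 ms.

```python
import math


def alias_frequency(carrier_pps, mod_hz):
    """Beat (Hz) between the carrier and the nearest harmonic of the modulator."""
    k = round(carrier_pps / mod_hz)
    return abs(carrier_pps - k * mod_hz)


def sam_pulse_amplitudes(carrier_pps, mod_hz, duration_s, depth=1.0):
    """Return (time, amplitude) pairs for a SAM pulse train, peak normalized to 1.

    Assumption: sine-phase envelope with the first pulse at t = 0.
    """
    n_pulses = int(round(duration_s * carrier_pps))
    pulses = []
    for n in range(n_pulses):
        t = n / carrier_pps
        envelope = (1.0 + depth * math.sin(2 * math.pi * mod_hz * t)) / (1.0 + depth)
        pulses.append((t, envelope))
    return pulses


def suprathreshold(pulses, threshold_db_below_peak=4.0):
    """Keep only pulses whose amplitude exceeds a threshold set in dB below peak."""
    threshold = 10 ** (-threshold_db_below_peak / 20.0)
    return [(t, a) for t, a in pulses if a > threshold]


if __name__ == "__main__":
    pulses = sam_pulse_amplitudes(96, 30, 0.5)
    above = suprathreshold(pulses)
    print("beat frequency (Hz):", alias_frequency(96, 30))
    print("pulses above threshold:", len(above), "of", len(pulses))
    for t, a in above:
        print(f"{1000 * t:6.1f} ms  {a:.3f}")
```

Printing the supra-threshold pulse times shows clusters whose size waxes and wanes at the 6-Hz beat rather than at the 30-Hz modulation frequency, which is the pattern the text attributes to the 30/96 response.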
Band-pass modulation responses
In all the examples presented (and most IC neurons examined), the
modulation responses are low-pass, i.e., they are equally strong to
modulation frequencies below some 6-dB cutoff frequency (modF6dB).
Above this cutoff frequency, the responses diminish until at some
maximum frequency they consist of only an onset burst. However, 30% of
IC neurons have band-pass modulation responses. They respond poorly to
low and high modulation frequencies and produce only an onset burst,
but at intermediate modulation frequencies they produce a strong
modulation response. One such neuron is illustrated in Fig.
12. Figure 12A illustrates
that this neuron is a low-resolution neuron with an Fmax of ~60 pps.
When presented with SAM pulses modulated at 8 Hz (Fig. 12B),
this neuron responds with only an onset burst at all carrier rates
tested (96-606 pps). However, as the modulation frequency increases to
16 and 24 Hz, the neuron's modulation response increases such that
there are strong responses at all carrier rates. At still higher
modulation frequencies (>30 Hz), the modulation response decreases,
until at 40-Hz modulation, the response again consists primarily of an
onset response at carriers >268 pps. As illustrated in Fig. 12C, the frequency transfer function (in spk/p) is
low-pass, whereas the TMTF (in spk/c) is band-pass. The modulation
responses of this neuron are typical of IC neurons (30%) with
band-pass modulation responses. The mean maxFm of this neuron was 17.5 Hz. The mean maxFm of all band-pass neurons was 34.2 Hz (Fig.
13).
Effects of modulation depth
Given the differences between responses to modulated and unmodulated pulse trains, it is clear that modulation depth must also be an important factor determining the response of an IC neuron. Most of these neurons do not respond to an unmodulated (0% modulation) pulse train at 200 pps, whereas almost all respond vigorously to that same pulse train when it is 100% amplitude modulated at some appropriately low modulation frequency (e.g., 30 Hz). Figure 14 illustrates the response of a typical IC neuron to unmodulated and modulated signals at various modulation frequencies and depths. In response to unmodulated pulses, this neuron has an Fmax of 50 pps (Fig. 14A).
The modulation response of this neuron also decreases dramatically at modulation frequencies between 30 and 60 Hz (Fig. 14B). At a modulation frequency of 60 Hz, this neuron's modulation response decreases dramatically at all carrier rates. There is only a weak 60-Hz modulation response using carriers of 185 pps and no significant modulation response (i.e., only an onset response) at carriers >345 pps. Thus the frequency transfer function and modulation transfer function of this neuron are similar, and Fmax and maxFm are ~50-60 Hz.
Notice also that there are clear modulation distortions of ~6 Hz in the 30/96, 30/185, and 60/185 modulation responses. Using carrier rates higher than the neuron's Fmax, the frequency and magnitude of these distortions vary dramatically with modulation depth. When this neuron is stimulated with 30-Hz AM of a 96-pps carrier at depths from 0 to 100% (Fig. 14C), its response varies from no time-locked response to a strong time-locked response. As expected from the responses illustrated in the frequency series, 0% modulation of the 96-pps carrier produces only an onset burst. At modulation depths from 5 to 20%, there is a small modulation response consisting primarily of an onset burst with a small number of time-locked spikes. As the modulation depth increases from 20 to 100%, a sustained time-locked modulation response increases progressively. At modulation depths of 30-50%, this neuron's response has peaks at intervals corresponding to three times the modulation interval (or a frequency of 10 Hz). At depths of 60 and 70%, the dominant peaks are at intervals corresponding to twice the modulation interval (15 Hz) with some intervals corresponding to the 30-Hz modulation; at modulation depths of 80% and higher the dominant response interval corresponds to 30 Hz, but there are intervals between response peaks corresponding to 15 Hz. Clearly, distortions in the temporal response patterns of this neuron (and most IC neurons) to 30-Hz modulation of a 96-pps carrier are a function of modulation depth.
These temporal distortions in the form of variations in the response intervals as a function of modulation depth are due to under-sampling (aliasing) of the modulation frequency by the carrier rate, since they disappear when modulation-to-carrier rate ratios decrease. For example, when the carrier rate is increased from 96 to 268 pps (Fig. 14D), this neuron's 30-Hz responses make a smooth transition from the expected onset-only response at 0% modulation to a 30-Hz time-locked modulation response at 100% modulation. There are no distortions in the temporal response patterns aside from adaptation.
One final parameter that influences the magnitude of distortions in the temporal response patterns of modulated responses is overall stimulus intensity. As expected, when the intensity of an AM stimulus increases, the response magnitude also increases. The amplitudes of the response peaks increase usually, but not always, monotonically. However, there is also a systematic decrease in temporal distortion. The dominant intervals in response peaks reduce to the modulation interval. Figure 14E illustrates the response of another IC neuron to 30-Hz modulation of a 97-pps carrier presented at 10 intensities from below threshold to 8 dB above threshold. At near threshold intensities (0 and +1 dB), the dominant intervals are three to four times the modulation interval. At 2 dB above threshold, the dominant response intervals are equal to the modulation interval and two times the modulation interval. Thus at these intensities this neuron systematically fails to respond to certain modulation cycles. At all higher intensities, the dominant response peaks are all at intervals that correspond to the 30-Hz modulation intervals. As discussed in the following text, these changes in the distortions of the temporal response patterns are directly related to the signal itself.
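The dependence of cycle skipping on intensity can be illustrated with a simple threshold model. This is a sketch under stated assumptions, not a fit to the data: the function name is hypothetical, the envelope is a 100% modulated sinusoid starting at zero phase, a 96-pps carrier is used instead of the 97-pps carrier of Fig. 14E, and a modulation cycle counts as "active" if at least one pulse within it exceeds a fixed threshold.

```python
import math


def active_cycles(carrier_pps, mod_hz, peak_db_re_threshold, duration_s=0.5):
    """List the modulation cycles that contain at least one supra-threshold pulse.

    peak_db_re_threshold is the peak pulse current in dB above the (hypothetical)
    neural threshold; pulse amplitudes are normalized so that the peak equals 1.
    """
    threshold = 10 ** (-peak_db_re_threshold / 20.0)
    n_pulses = int(round(duration_s * carrier_pps))
    cycles = set()
    for n in range(n_pulses):
        phase = 2 * math.pi * mod_hz * n / carrier_pps
        amplitude = (1.0 + math.sin(phase)) / 2.0
        if amplitude > threshold:
            # Index of the modulation cycle in which pulse n falls.
            cycles.add(math.floor(n * mod_hz / carrier_pps))
    return sorted(cycles)


if __name__ == "__main__":
    for level_db in (1, 2, 4, 8):
        cycles = active_cycles(96, 30, level_db)
        print(f"+{level_db} dB: {len(cycles)} of 15 cycles contain a supra-threshold pulse")
```

In this model, some modulation cycles contain no supra-threshold pulse when the peak is only 1 dB above threshold, whereas every cycle contains one at 2 dB and above. This is qualitatively in line with the reduction of the dominant intervals to the modulation interval as intensity increases, although real neurons also add adaptation and refractoriness that the sketch ignores.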
Thus the frequencies represented in the temporal response patterns of IC neurons are the result of complex interactions between each neuron's temporal resolution and each signal's modulation frequency, carrier rate, modulation depth, and overall intensity. However, when the carrier rate is more than four to six times the modulation frequency, neuronal response patterns are largely dependent on the temporal resolution of the neuron.
DISCUSSION |
---|
In previous studies, we estimated temporal resolution of IC
neurons by stimulating the auditory system with unmodulated electrical pulse trains (Snyder et al. 1995; Vollmer et al. 1999). The current experiments extend the characterization of
IC neural responses to ICES to include SAM pulse trains, to determine
the effects of modulation frequency and carrier rate, and to estimate the
range of modulation frequencies to which these neurons can synchronize. Carrier rates were varied between 100 and 1,000 pps, and modulation frequencies were varied between 10 and 400 Hz to encompass the range of
signals used in current clinical CI speech processors.
Intersubject variation in age at deafening and experience with electrical stimulation
These studies were conducted in deaf animals to avoid
electrophonic or indirect activation of AN fibers either by
electromechanical movements of the basilar membrane (Mountain
and Hubbard 1994; Xue et al. 1995), producing
depolarization of cochlear inner hair cells or by electrical
depolarization of the hair cells themselves. Restricting the activation
of AN fibers solely to direct electrical stimulation makes
interpretation of these results simpler and more closely models
electrical stimulation in profoundly deaf cochlear implant users.
Subjects were adult cats with varied deafening and stimulation
histories. Five subjects were deafened as adults, and nine were
deafened as neonates. Moreover, most animals (12 of 14) were stimulated
chronically. It should be noted that these differences in experience
have been shown to result in quantitative differences in temporal
resolution of IC neurons (Snyder et al. 1990, 1991, 1995; Vollmer et al. 1999). However, these
differences are relatively small and require statistical analysis of
large numbers of animals and neurons to evaluate. Moreover in the
current study, all of the described response types to electrical SAM
pulses were observed in all animal groups. Thus age at deafening and
experience with ICES do not result in major or qualitative
differences in responses of IC neurons. Since this study is the first
to describe the IC responses to ICES using SAM pulses, the data are
largely categorical and not analytical. The major goals are to
characterize the most common AM response classes of IC neurons and to
correlate AM responses with responses of the same neurons to
unmodulated pulses. It is beyond the scope of this initial
investigation to correlate variations in AM responses to age at
deafening or prior stimulation experience. However, subsequent studies
using these data as a foundation will describe these correlations.
Temporal resolution of IC neurons to modulated and unmodulated intracochlear electrical pulses
In agreement with previous studies (Snyder et al. 1995; Vollmer et al. 1999), we found that the
responses of virtually all IC neurons are low-pass functions of the
unmodulated pulse rate. Although the range of Fmax across the
population (10-350 pps) was somewhat less than that previously
observed (10-710 pps, Snyder et al. 1995), the
population mean Fmax (104 pps) was comparable to that seen previously
(109 pps, Snyder et al. 1995; 117 pps, Vollmer et al. 1999). In contrast, the modulation responses of most (74%)
IC neurons are independent of the carrier rate. Many (30%) are
band-pass functions of modulation frequency, although the remaining
modulation responses (70%) are low-pass functions of modulation
frequency. Moreover the ability of IC neurons to encode modulation
frequencies is lower than that to encode unmodulated pulse rates. The
population mean maximum modulation frequency (mean maxFm) was 1/2 to
1/3 the mean Fmax. Mean maxFm for neurons with low-pass TMTFs is
comparable to that for neurons with band-pass TMTFs, 42.2 versus 34.2 Hz, respectively.
The maxima for temporally synchronized unmodulated (Fmax) and modulated
(maxFm) responses in IC neurons are far below comparable maxima in AN
fibers. AN fibers entrain (respond on a 1-to-1 basis) to pulses at rates up to 800
pps and synchronize (show significant phase locking) to electrical
stimuli up to several kilohertz (Dynes 1996; Hartmann and Klinke 1989a,b; Parkins 1989; van den Honert and Stypulkowski 1984). Thus the temporal resolution of
IC neurons to ICES is one or even two orders of magnitude below that
observed in the auditory nerve and does not reflect limitations on
temporal resolution imposed by these peripheral nerve fibers. Moreover
based on the response of auditory neurons at levels below the inferior
colliculus to modulated and unmodulated acoustic signals (see following
text), we hypothesize that this dramatic difference in temporal
resolution is a product of processing within the IC and does not
reflect limitations in temporal processing imposed by its afferents
from the cochlear nucleus and pons.
The temporal resolution observed in the responses of cochlear and lower
brain stem neurons to acoustic signals is much higher than that seen in
IC neurons and much higher than that seen in psychophysical studies.
The discharges of AN fibers are low-pass functions for tone and
modulation frequencies and can synchronize to relatively high
repetition rates. They show significant phase-locking to acoustic pure
tones at frequencies >3 kHz (Johnson 1980) and synchronize to the modulations of amplitude-modulated tones
up to 1.6-2 kHz (see Greenwood and Joris 1996; Joris and Yin 1992; Kim et al. 1990; Rhode and Greenberg 1994).
Neurons in the CN and superior olivary complex also synchronize to
relatively high pure tone and modulation frequencies. Indeed some of
them appear to be specialized to encode temporal representations of
these frequencies (Joris et al. 1994). Most types of CN
neurons have low-pass frequency- and modulation-transfer functions with population means for maximum following frequency and maximum modulation frequency that are comparable to those seen in the AN (Joris and Yin 1998; Kim et al. 1990; Rhode and Greenberg 1994). In addition, the synchronized responses of
these neurons have higher maximum synchronization coefficients and
lower sensitivity to stimulus intensity than AN fibers (Joris et al. 1994; Rhode and Greenberg 1994).
Neurons in the superior olivary complex (SOC) display similar or
slightly lower maximum following to tones and modulation frequencies to
AM signals, but these values are still significantly higher than those
seen in the IC. Neurons in the medial nucleus of the trapezoid body
(MNTB), for example, have 3-dB cutoff frequencies that are distributed
across a range of modulation frequencies that is identical to that of
AN fibers, whereas most neurons in the lateral superior olive (LSO) are
unable to follow repetition (pure tone and modulation) rates >800 Hz
and have a population mean maximum modulation frequency of 500-600 Hz
(Joris and Yin 1998). An exception to the uniformly high
resolution of SOC neurons is found in the response of neurons in the
medial superior olive (MSO). Although the data are rather limited,
these neurons are able to follow monaural tones to relatively high
(>800 Hz) frequencies (Yin and Chan 1990), though they
appear to follow only relatively low modulation frequencies. The
highest modulation frequency which MSO neurons have been reported to
follow is ~250 Hz, and the mean maximum modulation frequency for the
six reported neurons is ~200 Hz (Joris 1996).
In contrast to the high temporal resolution of AN, CN, and SOC neurons,
the temporal resolution of IC neurons to acoustic stimulation is
relatively low. Discharges of the average IC neuron are able to
synchronize to amplitude modulated CF tones at modulation frequencies
only up to ~100 Hz (Batra et al. 1989; Langner and Schreiner 1988; Rees and Møller 1983; Rees and Palmer 1989; Reimer 1987).
Rees and Møller (1987) estimated the modal maximum
modulation frequency of IC neurons in rats to be 100-120 Hz.
Reimer (1987) reported that in the rufous horseshoe bat
71% of IC neurons had best modulation frequencies or BMFs
(defined as the modulation frequency that produces the maximum
correlation coefficient) ≤100 Hz, and 77% of these neurons had
maximum modulation frequencies <200 Hz. Langner and Schreiner
(1988) estimated BMFs (defined as the modulation frequency
that produces the highest spike rate) and reported that 98% had BMFs
<300 Hz and 23% had BMFs <30 Hz. Although these authors did not
calculate a mean BMF, Langner (1992) has suggested that
the average would have been between 30 and 100 Hz. Batra et al.
(1989) reported the mean BMF for IC neurons in the rabbit to be
87 Hz. Burger and Pollak (1998) found that 73% of IC
neurons in the mustache bat produced synchronized responses only to
modulation frequencies <300 Hz and only 9% of their neurons
phase-locked to modulation frequencies >300 Hz. These results using
acoustic stimuli indicate that the temporal resolution of IC neurons in
normal hearing mammals is comparable to, although somewhat higher than,
that reported here using SAM of electrical pulse trains in deaf cats.
Psychophysical and physiological measures of temporal resolution to electrical pulses
The measures of temporal resolution in IC neurons reported here
for ICES are comparable to psychophysical measures of temporal resolution or rate pitch in human subjects. Several studies have examined rate pitch discrimination in CI users with multi-channel devices (Eddington et al. 1978; Shannon 1983, 1992; Tong et al. 1983; Townshend et al. 1987). Eddington et al. (1978) reported that
rate or periodicity pitch increased with pulse rate and that pitch
saturation occurred at 800 pps. They also reported discriminations as
small as ±25 Hz but only up to a maximum rate of 400 pps. Both Shannon (1983) and Tong et al. (1983) reported pitch saturation at ~300 Hz. Townshend et al.
(1987) reported difference limens of a few pps at pulse rates
<100 pps but a rapid increase to saturation at 150 pps with pulse
rates of 200-1,000 pps. Thus the temporal resolution of the human
auditory system to electrical activation with unmodulated pulses is
limited to rates <300 Hz in most subjects.
These frequency discrimination tasks are difficult to interpret since
changing the rate of electrical stimuli also changes many perceptual
cues, including pitch and loudness (Pfingst 1988; Shannon 1983; Tong et al. 1983).
Shannon (1992) tried to address this question by
measuring the ability of CI users to detect changes in SAM and beats of
high rate carriers, whose percept changes little over the modulation
frequency range used. He studied implant users with three different
multi-channel devices using both pulses and sinusoids at several
stimulus levels. He found that most subjects could detect modulation
depths of <3% and some could detect depths of ~1%. The
TMTFs of most of these implant patients were low-pass with an average 3-dB cutoff rate of 148.8 Hz (100-200 Hz) and a maximum discrimination
rate of 300-500 Hz. Moreover he found that the shape of these
functions and their cutoff frequencies were independent of stimulus
level once the level was >4-5 dB SL. Shannon and Otto
(1990) reported MTFs from eight deaf subjects with electrodes
placed near their cochlear nucleus (auditory brain stem implants). They
also reported low-pass MTFs with modulation detection decreasing at
frequencies >100 Hz. Busby et al. (1993) reported
modulated pulse duration functions in seven subjects. Although
modulation of pulse duration has some aspects in common with AM, these
two are not equivalent. Nevertheless, the MTFs in two of their subjects
were flat out to 250 Hz (the highest frequency tested), whereas MTFs of
the remaining subjects were low-pass functions with 50-100 pps cutoff
frequencies. Thus the results of psychophysical studies of AM in CI
subjects reinforce the notion that the central auditory system has
limited temporal resolution with cutoff frequencies between 100 and 300 Hz.
In normal-hearing subjects comparable temporal resolution studies have
used AM noise to examine nonspectral (periodicity) pitch
discrimination. Burns and Viemeister (1976) found that
the existence region for periodicity pitch was a broadband-pass
function with a maximum that varied among subjects from 400 to nearly
1,000 Hz. Viemeister (1979) and Bacon and
Viemeister (1985) determined sensitivity to modulation depth
rather than the detection of pitch changes for normal-hearing subjects.
They reported that modulation sensitivity was also a low-pass function
with a 3-dB cutoff frequency of 50 Hz and saturation at 500-1,000 Hz.
These values for acoustic stimulation in normal hearing subjects are
slightly higher than those reported for cochlear implant subjects.
However, Shannon (1992) suggests that these differences
in cutoff frequency may be more apparent than real, given the
differences in the dynamic ranges between acoustic and electric
stimuli. In either case, the maximum cutoff frequency (~300 pps)
reported here for IC neurons following electrical stimuli is within the
range of estimates of maximum resolution seen for normal-hearing
subjects and CI users.
Significance of speech processing in cochlear implants
In the last decade, there have been several significant
improvements in speech processing for cochlear implants. New strategies provide increased benefits in speech reception and speech production for some CI users (Brill et al. 1997; McDermott et al. 1992; Wilson et al. 1995; Zierhofer et al. 1997). Typically, these speech
processors divide the speech signal into four to eight frequency bands
and extract the envelope of each band. These envelopes are compressed (usually using a logarithmic compression function) to map the relatively wide dynamic range of normal speech (30-40 dB) into the
relatively narrow dynamic range of electrical hearing (4-12 dB) from
threshold to maximum comfortable loudness. The compressed envelopes are
then used to amplitude modulate a continuously generated, fixed
pulse-rate carrier, which can be as low as 250 pps (McDermott et al. 1992) or higher than 800 pps (Brill et al. 1997; Wilson 1997). The carrier rate is
identical for all processor channels, but carriers may be offset in
time so that the pulses on individual channels are interleaved. Each
modulated carrier is delivered to one intracochlear contact. The
highest frequency speech band is delivered to an electrode located in
the highest frequency cochlear region and successively lower frequency
bands are delivered to electrodes in successively lower frequency regions.
The modeled output of one six-channel CI processor (Wilson et al. 1991) is illustrated in Fig. 15. The input to the processor (Fig.
15A) is the vowel-consonant-vowel utterance /aba/
articulated by an adult male speaker whose voice-pitch frequency, F0,
is ~100 Hz. The glottal pulses and the voicing time-waveform for the
first /a/ vowel, which lasts for ~130 ms, are shown in more
detail in Fig. 15B. The envelope amplitude of the full-wave
rectified, low-pass filtered output of each band-pass filter is shown
in Fig. 15C with the envelope of the lowest frequency
channel 1 (ch.1) at the top and the highest frequency
channel 6 (ch.6) at the bottom. Due to the spectral content
of the utterance and the filter characteristics of the band-pass
filters, the peak envelope amplitudes vary from channel to channel. The
envelope peaks occur at F0, the voice pitch in all six channels. These
peaks are relatively small in channels 1 and 6, whereas those in
channels 2 and 3 are relatively large. In Fig. 15D the
envelopes are logarithmically compressed, mapped into a dynamic range
of 10 dB and used to modulate a constant pulse-rate (833 pps) carrier.
These modulated carriers reproduce the overall output of a six-channel
processor.
The processed information of each speech band is represented in the AM of each pulse train. Several points should be noted regarding these outputs. First, the overall amplitudes of these pulse trains across all channels are very similar. The similarities in amplitude across frequency bands are due, of course, to the logarithmic compression and mapping of the wide dynamic range of the speech-band envelopes (30-40 dB) into the relatively limited dynamic range (in this case 10 dB) of electrical hearing. Second, the modulation depth in all six bands is relatively modest (<30%). This is also due in part to compression and in part to the DC offset produced by rectification. Third, although compressed, modulation depth varies significantly across channels. Channels 1, 2, and 5 are relatively strongly modulated, whereas channels 3 and 6 are nearly unmodulated. Finally, the major modulation frequency corresponds to ~100 Hz (the voice pitch frequency).
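The reduction of modulation depth by compression can be illustrated with a short numerical sketch. It assumes a generic logarithmic map from envelope amplitude to current between a threshold level T and a comfort level C; the function names, the 40-dB input range, and the example envelope are hypothetical choices for illustration and are not the parameters of the processor modeled in Fig. 15.

```python
import math


def log_compress(x, x_min, x_max, t_level, c_level):
    """Map envelope amplitude x logarithmically onto the current range [T, C]."""
    x = min(max(x, x_min), x_max)  # clip to the input dynamic range
    fraction = math.log(x / x_min) / math.log(x_max / x_min)
    return t_level + (c_level - t_level) * fraction


def modulation_depth(peak, trough):
    """Modulation depth defined as (max - min) / (max + min)."""
    return (peak - trough) / (peak + trough)


if __name__ == "__main__":
    x_max = 1.0
    x_min = x_max * 10 ** (-40 / 20)       # assumed 40-dB acoustic input range
    t_level = 1.0                          # threshold current, arbitrary units
    c_level = t_level * 10 ** (10 / 20)    # comfort level 10 dB above threshold

    env_peak, env_trough = 1.0, 0.25       # hypothetical envelope, 12-dB swing
    out_peak = log_compress(env_peak, x_min, x_max, t_level, c_level)
    out_trough = log_compress(env_trough, x_min, x_max, t_level, c_level)

    print("input depth :", round(modulation_depth(env_peak, env_trough), 3))
    print("output depth:", round(modulation_depth(out_peak, out_trough), 3))
```

With these assumed values an envelope that is 60% modulated at the input is only about 11% modulated after compression, which is in keeping with the modest depths (<30%) noted for the processor outputs.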
Given signal outputs of a processor like that illustrated in Fig. 15D and the observation that most IC neurons are insensitive to carrier rate but they are extremely sensitive to modulation depth, at least two predictions can be made regarding the responses of IC neurons to these processed signals. The first prediction is that the neural activity evoked by each output channel will be only weakly related to spectral energy content in each pass-band. This weak relationship arises both from the logarithmic compression used in the processor and from the low cutoff frequency of most IC neurons to unmodulated pulse trains. For example, despite the difference in the overall envelope amplitudes of the signals in channels 1 and 2 (Fig. 15C, top 2 traces), IC neurons should respond strongly and virtually identically to both these signals. However, they should respond only weakly, if at all, to the large signal in channel 3 since it is a virtually unmodulated 833-pps pulse train. Thus the energy in some channels may be dramatically under-represented in the activity of IC neurons, whereas the energy of others will be dramatically over-represented given their lower acoustic energy content.
A second prediction is that the temporal representation of the speech envelope may be strongly distorted in cases in which envelope modulation-frequencies (voice pitch) are 250 Hz or higher, as occurs when some women and most children produce vowels like the /e/ in "heed." These envelope frequencies will be under-sampled (exceed the 1:4 Fm:Fc ratio) and will be distorted even with relatively high rate carriers, those >400 pps but <1,000 pps (Figs. 9B and 14). Therefore during the steady-state portions of these vowels, neurons will respond either to the onset of the unmodulated pulse train or to an under-sampled modulation frequency. In either case, the amplitude and timing of the neuronal responses will have a complex and idiosyncratic relationship to the speech signal.
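The arithmetic behind the 1:4 Fm:Fc criterion can be made explicit with a brief sketch. The helper names are hypothetical, and the criterion is treated as a sharp boundary only for illustration; the text above gives it as a range of four to six times the modulation frequency.

```python
def min_carrier_rate(mod_hz, ratio=4.0):
    """Lowest carrier rate (pps) that satisfies a given Fc:Fm ratio."""
    return ratio * mod_hz


def is_undersampled(carrier_pps, mod_hz, ratio=4.0):
    """True if the carrier rate is below ratio times the modulation frequency."""
    return carrier_pps < min_carrier_rate(mod_hz, ratio)


if __name__ == "__main__":
    for f0 in (100, 250):
        for carrier in (250, 833):
            status = "under-sampled" if is_undersampled(carrier, f0) else "adequately sampled"
            print(f"F0 = {f0} Hz, carrier = {carrier} pps: {status}")
```

For a 250-Hz voice pitch the criterion requires a carrier of at least 1,000 pps at a ratio of 4 (1,500 pps at a ratio of 6), so the 833-pps carrier of Fig. 15 falls short, whereas it is adequate for the ~100-Hz voice pitch of the adult male speaker.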
ACKNOWLEDGMENTS |
---|
We thank Dr. D. G. Sinex and J. H. LaVail for helpful comments on the manuscript and Dr. P. C. Loizou for advice on parameters for CIS processors. We also thank B. Dwan for invaluable assistance with these experiments.
This work was supported by a contract from the National Institute on Deafness and Other Communication Disorders, Fundamental Neuroscience Program Contract N01-DC-7-2105 and Deutsche Forschungsgemeinschaft Grant Vo 640/1-1.
FOOTNOTES |
---|
Address for reprint requests: R. L. Snyder, Epstein Laboratory, University of California, Box 0526, San Francisco, CA 94143-0526 (E-mail: rsnyder{at}itsa.ucsf.edu).
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received 23 July 1999; accepted in final form 24 March 2000.
REFERENCES |
---|