Influence of Head Position on the Spatial Representation of Acoustic Targets

H.H.L.M. Goossens and A. J. van Opstal

Department of Medical Physics and Biophysics, University of Nijmegen, NL-6525 EZ Nijmegen, The Netherlands


    ABSTRACT

Goossens, H.H.L.M. and A. J. van Opstal. Influence of head position on the spatial representation of acoustic targets. Sound localization in humans relies on binaural differences (azimuth cues) and monaural spectral shape information (elevation cues) and is therefore the result of a neural computational process. Despite the fact that these acoustic cues are referenced with respect to the head, accurate eye movements can be generated to sounds in complete darkness. This ability necessitates the use of eye position information. So far, however, sound localization has been investigated mainly with a fixed head position, usually straight ahead. Yet the auditory system may rely on head motor information to maintain a stable and spatially accurate representation of acoustic targets in the presence of head movements. We therefore studied the influence of changes in eye-head position on auditory-guided orienting behavior of human subjects. In the first experiment, we used a visual-auditory double-step paradigm. Subjects made saccadic gaze shifts in total darkness toward brief broadband sounds presented before an intervening eye-head movement that was evoked by an earlier visual target. The data show that the preceding displacements of both eye and head are fully accounted for, resulting in spatially accurate responses. This suggests that auditory target information may be transformed into a spatial (or body-centered) frame of reference. To further investigate this possibility, we exploited the unique property of the auditory system that sound elevation is extracted independently from pinna-related spectral cues. In the absence of such cues, accurate elevation detection is not possible, even when head movements are made. This is shown in a second experiment where pure tones were localized at a fixed elevation that depended on the tone frequency rather than on the actual target elevation, both under head-fixed and -free conditions. To test, in a third experiment, whether the perceived elevation of tones relies on a head- or space-fixed target representation, eye movements were elicited toward pure tones while subjects kept their head in different vertical positions. It appeared that each tone was localized at a fixed, frequency-dependent elevation in space that shifted to a limited extent with changes in head elevation. Hence information about head position is used under static conditions too. Interestingly, the influence of head position also depended on the tone frequency. Thus tone-evoked ocular saccades typically showed a partial compensation for changes in static head position, whereas noise-evoked eye-head saccades fully compensated for intervening changes in eye-head position. We propose that the auditory localization system combines the acoustic input with head-position information to encode targets in a spatial (or body-centered) frame of reference. In this way, accurate orienting responses may be programmed despite intervening eye-head movements. A conceptual model, based on the tonotopic organization of the auditory system, is presented that may account for our findings.


    INTRODUCTION

The generation of a rapid eye movement (saccade) toward a visual target involves not only the use of retinotopic visual input but also extraretinal signals such as changes in eye position. This was demonstrated by Hallett and Lightstone (1976), using the now classical visual double-step paradigm. Their study showed that accurate saccades can be made to the spatial location of a briefly flashed visual target even when the retinal and spatial location of this target are dissociated due to an intervening eye movement. In a subsequent study, Mays and Sparks (1980) showed that the saccadic system also compensates accurately for disturbances in eye position, induced by microstimulation of the monkey superior colliculus, just before a targeting saccade. This compensation in darkness does not rely on proprioception from extraocular muscles but, rather, on an internal representation of the eye movement derived from the oculomotor command (efference copy) (Guthrie et al. 1983).

These experiments seemed nicely in line with the hypothesis (Robinson 1975) that the saccadic system programs eye movements based on a target representation in head-centered coordinates. This model accounts for the remarkable accuracy of saccades to visual targets under open-loop conditions (i.e., without visual feedback) but also to sounds and somatosensory stimuli. Eye-position-dependent tuning properties of visual receptive fields in the primate parietal cortex have been interpreted as supporting evidence for this putative head-centered visuomotor programming stage (Zipser and Andersen 1988). More recently, similar eye-position-dependent response properties have been obtained also in other areas, such as primary visual cortex (Weyand and Malpeli 1993) and superior colliculus (Van Opstal et al. 1995).

An alternative hypothesis holds that the visuomotor system maintains a retinotopic representation of the saccadic goal such that the target coordinates are always relative to the most recent eye position. According to this idea, information about intervening eye displacements (i.e., the change in eye position, rather than eye position per se) is taken into account (Jürgens et al. 1981). This model was supported by recordings from the monkey frontal eye field, demonstrating the presence of all relevant signals (i.e., both retinal error and saccadic displacement) (Goldberg and Bruce 1990).

In the present study, we wondered in what frame of reference auditory targets are encoded when used as a goal for rapid orienting eye-head movements (referred to as gaze saccades; gaze ≡ eye-in-space = eye-in-head + head-in-space). This is not a trivial problem because the acoustic sensory input is represented tonotopically. Thus the positions of auditory stimuli, in contrast to visual stimuli, are not represented by a place code to begin with. Instead, sound localization relies entirely on implicit acoustic cues. Binaural difference cues, such as interaural level and timing differences (ILDs and ITDs), are used to extract sound-source azimuth, whereas elevation detection is based on the direction-dependent acoustic pinna filters (the so-called head-related transfer functions or HRTFs) (Batteau 1967; Hofman et al. 1998; Middlebrooks 1992; Oldfield and Parker 1984; Wightman and Kistler 1989; see also Blauert 1996, for extensive review).

Note, however, that these acoustic localization cues are all referenced with respect to the (in humans) head-fixed ears and thus define a Cartesian head-centered coordinate system. Therefore to program an accurate auditory-evoked eye movement in darkness, the auditory signal must be transformed into an oculocentric motor command. For this, the audiomotor system needs to account for the absolute eye position in the orbit. It now has been shown, for different species, that auditory-evoked saccades are indeed accurate, irrespective of initial eye position [Jay and Sparks 1984 (monkey); Frens and Van Opstal 1994 (human); Hartline et al. 1995 (cat)].
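This craniocentric-to-oculocentric transformation amounts to a vector subtraction (cf. Fig. 3: Th = Eh + Te, so Te = Th - Eh). A minimal sketch in Python (not part of the original study; the numbers are hypothetical) makes the computation explicit:

import numpy as np

def gaze_motor_error(Th, Eh):
    # Te = Th - Eh: the eye displacement needed to foveate a target
    # given in head-centered coordinates (2-D: azimuth, elevation; deg).
    return np.asarray(Th) - np.asarray(Eh)

# Hypothetical example: target 20 deg right of the head, eye already
# 10 deg right in the orbit -> only a 10-deg rightward saccade is needed.
print(gaze_motor_error([20.0, 0.0], [10.0, 0.0]))   # [10.  0.]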

As likewise proposed for the visuomotor system (see preceding text), the audiomotor system could represent targets in a supramodal, e.g., space-fixed, reference frame. However, when all acoustic localization cues are available, it is not possible to dissociate head-centered from spatial coordinates. This problem is reminiscent of saccade control when the visual target is continuously present: it is then impossible to decide from behavioral data whether saccades are programmed on the basis of oculocentric or craniocentric target coordinates.

In the present paper, we have studied this problem in two different ways. First, following the approach of Hallett and Lightstone (1976), we investigated whether the audiomotor system compensates, in complete darkness, for changes in eye and head position in the absence of new acoustic cues (i.e., without acoustic feedback). To that end, we tested the orienting behavior of human subjects in a visual-auditory double-step paradigm, where the saccadic double-step response consisted of two subsequent eye-head movements. The rationale of this paradigm is explained in Fig. 1A.



Fig. 1. Rationale of experiments 1 and 3 in this study. A: eye-head double-step experiment. Subject initially fixates the straight-ahead fixation target, F, with both eye and head. When this spot vanishes, 2 targets are presented briefly in rapid succession at randomly selected locations: a visual target, V, followed by a white-noise auditory target, N. Subject has to generate 2 successive eye-head movements in complete darkness: first to the site where the visual target was flashed, then to the site where the auditory target was presented. If the subject uses only the head-centered acoustic information, Th, to program the 2nd eye-head saccade starting at V, this movement will be incorrect, as indicated by M̃2. To perform accurately, the subject must account for the intervening eye-head movement, M1, to program the correct movement, M2. B: with the head in the straight-ahead position, auditory-evoked eye movements toward a pure tone have accurate azimuth components, but the elevation component of the saccade is unrelated to the real target elevation, due to the absence of adequate spectral cues (Frens and Van Opstal 1995; Middlebrooks 1992). Latter is illustrated for 1 of our subjects (JG; 5,000-Hz tone). Regression line: slope, a = 0.03; bias, b = 9.2°; correlation coefficient, r = 0.19 (n.s.). C: by presenting the tone at different spatial elevations, with the head kept in different vertical positions (eye and head aligned), 2 competing models can be tested: If sounds are represented in head-fixed coordinates (i.e., based only on head-centered acoustic cues), the saccade end points re. head will remain on the same horizontal line (here at elevation 9.2°), independent of static head elevation. If sounds are encoded in a space-fixed reference frame (i.e., when the acoustic input is combined with head position information), the saccade endpoints will compensate for the shift in static head position. In that case, the data will be aligned along a line with a slope of -1.

After looking at a fixation spot (F), the subject makes a gaze saccade, M1, toward a flashed visual target (V). Just before this eye-head movement, a brief auditory noise burst (N) is presented at a different site. The subject is required to look also at the spatial location of the acoustic target by making a second gaze shift. Because of the initial eye-head saccade, there is a dissociation between the spatial and head-centered coordinates of the acoustic target. Therefore, if the subject relies only on the initial, head-centered acoustic input (Th), the second response will be incorrect, as indicated by the dashed arrow, M̃2. However, when the second response is spatially accurate, as indicated by M2, the subject must have combined the acoustic input with both eye- and head-movement signals.
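In vector terms, assuming that eye and head are aligned at F (so that the initial head-centered target location, Thini, equals the required gaze vector from F), the spatially correct second gaze shift is M2 = Thini - M1, whereas a purely head-centered strategy would replay M̃2 = Thini. A short sketch with hypothetical coordinates:

import numpy as np

Th_ini = np.array([-20.0, 15.0])   # noise burst re. head at initial fixation (hypothetical)
M1     = np.array([-25.0,  0.0])   # 1st, visually evoked gaze shift (hypothetical)

M2_correct       = Th_ini - M1     # compensates for the intervening movement
M2_head_centered = Th_ini          # ignores it: lands at the wrong spatial site
print(M2_correct, M2_head_centered)   # [5. 15.] vs. [-20. 15.]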

The successful completion of this double-step task, however, does not necessarily require the use of a head-position signal. Therefore, to determine whether auditory-evoked orienting relies on a head-position signal rather than on a head-displacement signal, we also studied the influence of static head orientation. Here we exploited the unique property of the auditory system that sound elevation is extracted independently from monaural pinna-related spectral cues (see also preceding text).

For spectrally rich acoustic targets (e.g., broadband noise stimuli), localization responses (measured with eye movements) are accurate in azimuth and elevation (Frens and Van Opstal 1995; Hofman and Van Opstal 1998). However, in the absence of spectral elevation cues (e.g., pure tone stimuli), localization is accurate in azimuth but not in elevation (Frens and Van Opstal 1995; Hofman et al. 1998; Middlebrooks 1992). Nevertheless subjects appear to have a consistent spatial percept of pure-tone targets, although the perceived elevation is unrelated to the actual stimulus elevation. This is illustrated in Fig. 1B, where it is observed clearly that the tone stimuli (5 kHz) were localized at a fixed elevation. This behavior appears to be frequency dependent in the sense that tones of different frequencies are perceived at different elevations (Frens and Van Opstal 1995). It has been proposed that this phenomenon may be understood from particular resonances in the pinna transfer functions (e.g., Middlebrooks 1992).

We reasoned that, by presenting pure tones in combination with various head positions, it should be possible to determine whether this clear elevation percept is fixed relative to space or to the head. In the latter case, it is expected that tone-evoked eye movements end at a fixed elevation relative to the head independent of the static head elevation. As illustrated in Fig. 1C, this would result in a horizontal line ("head-fixed") when the eye-in-head elevation of the saccade end points is plotted as a function of the head-in-space elevation. On the other hand, if the acoustic signal is combined with head-position information to obtain a space-fixed target representation, subjects may compensate for variations in static head elevation. As depicted in Fig. 1C, the data then would scatter along a line with a slope of -1 ("space-fixed"). In other words, the tone-evoked ocular saccades would end at a fixed elevation in space.
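The two hypotheses thus reduce to the slope of the regression of eye-in-head end point elevation on static head-in-space elevation: 0 for a head-fixed percept, -1 for a space-fixed percept. A small Python sketch generates the two predictions of Fig. 1C (the 9.2° bias is taken from Fig. 1B; the head elevations are those used in experiment 3; the gain values merely label the two extremes):

import numpy as np

def predicted_endpoints(head_elev, bias=9.2, gain=0.0):
    # Eye-in-head elevation of tone-evoked saccade end points:
    # gain = 0 -> head-fixed percept; gain = -1 -> space-fixed percept.
    return bias + gain * np.asarray(head_elev)

head = np.array([-27.0, -14.0, 0.0, 14.0, 27.0])  # static head elevations (deg)
print(predicted_endpoints(head, gain=0.0))        # horizontal line ("head-fixed")
print(predicted_endpoints(head, gain=-1.0))       # slope -1 line ("space-fixed")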

Apart from static cues, the auditory system also may use dynamic cues that arise from specific changes in the acoustic input when the head moves relative to the sound source or vice versa (Lambert 1974; Zakarouskas and Cynader 1991). Such dynamic cues could be particularly useful if the auditory system were to combine them with head-movement information. We wondered therefore whether head-free gaze shifts toward sounds would be more accurate than head-fixed movements. We reasoned that such an improvement might be most apparent for pure tone stimuli that, under head-fixed conditions, cannot be localized in elevation at all (see Fig. 1B). To test for this possibility, we compared the accuracy of tone-evoked gaze shifts under head-fixed and -free conditions.

Our data indicate that a head-position signal is used for auditory-evoked orienting of eyes and head. This suggests that acoustic targets are represented in a spatial (or body-centered) frame of reference. Although head-movement signals, in principle, also may be used to improve sound localization performance under head-free conditions, we observed no difference in the accuracy of rapid head-free and -fixed orienting movements.


    METHODS

Subjects

Ten human subjects (9 male, 1 female; ages 22-52) participated in the experiments. None of the subjects suffered from an oculomotor, visual, or hearing problem, except JO (one of the authors), whose right eye is amblyopic. All subjects had participated in previous oculomotor and acoustic localization studies, but six of the subjects (JR, VG, AB, BB, PH, and VC) were kept naive as to the purpose of the experiments.

Experimental setup

Experiments were performed in a completely dark, sound-attenuated room (3 × 3 × 3 m), in which the walls, ceiling, floor, and large objects were all covered with black acoustic absorption foam (echoes were absent for frequencies ≥500 Hz). The subject was seated comfortably in the center of the room. Vision was binocular, the head was free to move, and free-field listening was binaural. The ambient background noise level was 30 dB SPL (sound-pressure level; measured with a Brüel and Kjær BK2601 sound amplifier).

Visual stimuli [85 red/green light-emitting diodes (LEDs); intensity 0.3 cd/m²] were mounted on an acoustically transparent thin-wire hemisphere with a radius of 0.85 m, the center of which coincided with the recorded eye. The subject's right eye was aligned with the central LED. The other LEDs were positioned at polar coordinates R ∈ [2, 5, 9, 14, 20, 27, 35] deg, Φ ∈ [0, 30, ···, 330] deg, where Φ = 0° is rightward from center and Φ = 90° is upward. The entire hemisphere was covered with a thin black silk cloth that completely blocked the view of the speaker even under dim lighting conditions.

Auditory stimuli were presented through a broad-range speaker (Philips AD-44725) at an intensity of 60 dB SPL (measured at the position of the subject's head). The speaker was mounted on a two-link robot, which consisted of a base with two nested L-shaped arms, each driven by a stepping motor (Berger-Lahr, VRDM5). The speaker could be moved quickly (within 3 s) to practically any point on a virtual frontal hemisphere, just distal from the LED hemisphere, at a radius of 0.90 m from the subject's eye. In earlier studies from our group, it was verified that the sounds produced by these stepping motors did not provide any consistent localization cues to the subject (Frens and Van Opstal 1995; Goossens and Van Opstal 1997).

Measurements and stimulus presentation

The two-dimensional orientations (referred to as "positions") of both the eye and the head were measured with the magnetic search-coil induction technique (Collewijn et al. 1975). Subjects wore a scleral search-coil on their right eye as well as a lightweight helmet (150 g) with a small head-coil attached to it. The horizontal (30 kHz) and vertical (40 kHz) magnetic fields that are required for this method were generated by two orthogonal pairs of 3 × 3-m square coils that were attached to the room's edges.

Two PCs (80486) controlled the experiment. One PC-486 (the "master") was equipped with the hardware for data acquisition (Metrabyte DAS16), stimulus timing (Data Translation DT2817), and digital control of the LEDs (Philips I2C). Horizontal and vertical components of eye and head position were detected by phase-lock amplifiers (PAR 128A and PAR 120), low-pass filtered (150 Hz), and sampled at 500 Hz per channel.

The other PC-486 ("slave") controlled the movements of the robot and generated the acoustic stimuli. This computer received its commands from the master PC through its parallel port. All sound stimuli [Gaussian white noise (GWN) and pure tones] were digitally synthesized (Matlab, Mathworks), multiplied with a 5-ms sine-squared onset and offset ramp, and stored on disk at a 50-kHz sampling rate. During an experiment, the stimuli were loaded into the computer's RAM, passed through a 12-bit DA-converter (Data Translation DT2821; output sampling rate, 50 kHz), and presented to the speaker via a band-pass filter (Krohn-Hite 3343, 0.2-20 kHz) and an amplifier (Luxman A-331) (see also Goossens and Van Opstal 1997, for further details).

Tone stimuli (see following text) always were measured just before or immediately after each session by a microphone (Brüel and Kjær BK4144) suspended at the position of the subject's head. The microphone signals were amplified (Brüel and Kjær BK2610), band-pass filtered (Krohn-Hite 3343, 0.2-20 kHz), and sampled at 50 kHz (Metrabyte DAS16). Power spectra were computed off-line to verify that these stimuli indeed consisted of only a very narrow spectral peak (width < 1/12 octave) without harmonic distortions. On two separate occasions, we also measured the sound pressure evoked by the tone stimuli inside the ear canal (subjects JG and RV). In these tests, a small probe microphone (Knowles EA 1842) was connected to a flexible silicone tube (diameter 1 mm; length 5 cm) that ended within 1-2 mm from the eardrum.
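For illustration, the stimulus synthesis and the off-line spectral check can be sketched as follows (Python; a simplified stand-in for the Matlab procedures described above, with the 50-kHz sampling rate and 5-ms sine-squared ramps taken from the text):

import numpy as np

FS = 50_000  # Hz; output sampling rate used in the setup

def ramped_tone(freq_hz, dur_s, ramp_s=0.005, fs=FS):
    # Pure tone with sine-squared onset and offset ramps.
    t = np.arange(int(dur_s * fs)) / fs
    sig = np.sin(2 * np.pi * freq_hz * t)
    n = int(ramp_s * fs)
    ramp = np.sin(0.5 * np.pi * np.arange(n) / n) ** 2
    sig[:n] *= ramp
    sig[-n:] *= ramp[::-1]
    return sig

# Off-line check that the stimulus is a single narrow spectral peak:
tone = ramped_tone(5_000, 0.5)
freqs = np.fft.rfftfreq(tone.size, 1 / FS)
peak = freqs[np.argmax(np.abs(np.fft.rfft(tone)))]
print(peak)  # ~5000 Hz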

Experimental paradigms

STANDARD PROTOCOL. An experimental session always started with two calibration experiments in which the subject had to align the eye and head, respectively, with all 85 LED positions (see following text).

In the subsequent control experiment, the subject's default sound localization performance was tested in the head-fixed (straight-ahead) condition (Frens and Van Opstal 1995; Goossens and Van Opstal 1997). In this experiment, the subjects first fixated the central LED, and when this stimulus vanished, an auditory stimulus (broadband white noise; bandwidth, 0.2-20 kHz; duration, 500 ms) was presented at a randomly chosen location (n = 48) within the two-dimensional oculomotor range (≤35° eccentricity in all directions). The subject was instructed to generate a rapid and accurate ocular saccade toward the perceived auditory stimulus location without moving the head.

After the calibration and control experiments were completed, one of three test experiments was performed (see following text). In these experiments, the subjects were always instructed to refixate the target(s) as accurately and as promptly as possible. The subjects received no feedback regarding the accuracy of their responses (i.e., neither visual cues nor verbal feedback) in any of these tasks, either during the experiments or in prior sessions.

EXPERIMENT 1. This was a gaze double-step paradigm to investigate whether changes in eye- and head position are accounted for when generating a saccadic gaze shift toward an auditory target (subjects JG, AB, BB, VC, and PH). The rationale of this experiment is explained in more detail in the INTRODUCTION (Fig. 1A).

At the start of each double-step trial, the subject had to fixate the central LED (random presentation time 800-1,600 ms) with the head directed straight ahead. Then after 50 ms of darkness, two targets were presented briefly at two different locations: a visual stimulus (LED flashed for 50 ms), followed (100 ms later) by an acoustic white-noise burst (GWN; 0.2-20 kHz; 50 ms). The subject was instructed to first generate a horizontal eye-head saccade toward the extinguished visual target, followed by an accurate gaze shift toward the location of the auditory target. Subjects were encouraged to move their head rapidly for both gaze shifts and were required to fixate the position of each target with both the eye and the head.

Figure 2A shows the different target configurations applied (top) together with the timing of the stimulus events (bottom). Note that both targets disappeared before or near the onset of the first eye-head movement so that the subject could not rely on sensory feedback. After the presentation of a visual target on either the right or the left side, the auditory target was presented at one of six locations on the same side. In total, there were 2 × (2 × 6) = 24 different target configurations, which were interleaved randomly. A typical experiment consisted of 144 double-step trials.



Fig. 2. Target configurations applied in the experiments. A: experiment 1. On extinguishing the fixation spot (F), 1 of 4 peripheral visual stimuli (V) on the horizontal meridian (eccentricity, [±14, ±27] deg) was selected randomly and presented for 50 ms. Then after 100 ms of darkness, 1 of 6 possible auditory target positions (N) was chosen randomly within the same hemisphere as the visual target (Azimuth, [0, ±14, ±27] deg; Elevation, ±14°) and an acoustic target (broadband white noise) was presented for 50 ms. Subject made an eye-head movement toward the visual target followed by an eye-head saccade toward the perceived location of the auditory target. Note that both targets were extinguished before or near the onset of the eye-head movements (open-loop condition). B: experiment 3. Eye and head were aligned with 1 of 5 light-emitting diodes (LEDs, F) at elevations [0, ±14, ±27] deg. Then a tone was presented for 500 ms at a randomly selected location (T). Subject made an ocular saccade toward the perceived target position without moving the head. In the 1st version of the paradigm (top left), there were 18 possible target positions in space (Azimuth, [±10, ±20] deg; Elevation, [0, ±10, ±20, ±30, ±40] deg) and the tone frequency was fixed during a recording session. In the 2nd version of the paradigm (top right), pure tones were interleaved randomly with broadband noise. In addition, the array of possible target locations shifted with head position, such that the targets (n = 10) were presented at fixed locations relative to the head (Azimuth, [±20, ±15, ±10] deg; Elevation-re-head, [0, ±15] deg) for each of the 5 eye-head fixation points. For clarity, only the target configurations for the highest and the lowest eye-head fixation point are shown (dashed boxes).

In subsequent single-step trials, the same brief white-noise burst was presented immediately after the central fixation LED extinguished, and the subjects were required to generate an accurate eye-head saccade to this peripheral stimulus. The stimuli were presented randomly at the same 10 locations, and each target was presented two or three times.

EXPERIMENT 2. This was a gaze single-step paradigm to test whether the accuracy of tone-evoked orienting responses benefits from the use of head movements (subjects MF, JG, and NC). None of the subjects was naive as to the purpose of this experiment.

At the beginning of a trial, the subject aligned eye and head with the straight-ahead fixation light. When this LED extinguished after 800-1,600 ms, a pure-tone acoustic stimulus was presented for 800 ms at a randomly selected peripheral location (n = 30 or n = 45) within the two-dimensional oculomotor field. In the head-fixed condition, the subject was required to make an eye movement toward the perceived position of the tone without moving the head. In the head-free condition, the subject was asked to make a rapid eye-head saccade toward the perceived target position. No specific instructions were given about the speed and accuracy of the head movements. In each recording session, three different tone bursts (750, 1,500, and 7,500 Hz) were interleaved randomly. Head-fixed and -free conditions were tested in separate sessions.

EXPERIMENT 3. This was a static head-position paradigm to study the effect of changes in static head position on the elevation percept of pure tones (subjects JG, RV, JO, JR, VG, and AB) and broadband noise (subjects JG, RV, JO, and AB) as a control. The rationale of these experiments is explained in more detail in the INTRODUCTION (Fig. 1, B and C).

In each trial, a red visual fixation spot first was presented at one of five different locations on the vertical meridian, and the subject was instructed to align both eye and head with this LED. Then after a random fixation interval of 2-3 s, the fixation LED disappeared, and an auditory stimulus (either a pure tone or broadband white noise) was presented for 500 ms at a pseudorandomly chosen position. The subject was asked to make an accurate ocular saccade to the perceived target location without moving the head.

In the first version of this paradigm, the tone frequency (500, 1,000, 2,000, 5,000, or 7,500 Hz) was fixed during a recording session (subjects JG, RV, JO, JR, and VG), and broadband noise stimuli (GWN; 0.2-20 kHz) were tested in separate control experiments (subjects JG, RV, and JO). The acoustic targets were presented at 18 different locations in space, yielding 5 × 18 = 90 different fixation/target combinations that were interleaved randomly. Figure 2B depicts these different fixation/target configurations (top left) together with the timing of the stimulus events (bottom). Note that the randomization of stimulus azimuth ensured that the apparent location of pure tone targets was always unpredictable, irrespective of the head position influence. This is easily understood if one recalls that the localization of target azimuth relies on interaural level and timing differences rather than on the spectral properties of the stimulus (see INTRODUCTION).

The results of these experiments indicated that the influence of head position depends on the applied frequency spectrum (see RESULTS). To test this feature also in a different way, a second version of the paradigm was used. In this version, four different tone stimuli (1,000, 2,000, 5,000, and 7,500 Hz) were interleaved randomly with broadband white noise (1/3 of the trials) in each session. In addition, the fixation/target configurations were such that all stimuli were presented at fixed locations (n = 10) relative to the head for each of the five eye-head fixation conditions (yielding 5 × 10 = 50 different locations in space). This is illustrated in Fig. 2B (top right), which depicts the respective target locations for the two most extreme eye-head fixation points. Altogether, there were 6 × (5 × 10) = 300 randomized stimulus/fixation/target conditions. The data were collected either in two subsequent sessions on the same day (using left and right eye; subjects JG and AB) or in a single, relatively long session (~75 min; subjects JG and JO).

Data analysis

CALIBRATION. Eye position in space (gaze) was used to quantify the acoustic localization percept of the subjects. In two dimensions, gaze, Gs, is the vectorial sum of eye position in the head, Eh, and head in space, Hs (see also Fig. 3). Specific details of the calibration procedures are provided in Goossens and Van Opstal (1997). Here, only a brief summary is presented.



Fig. 3. Relevant reference frames in this study. Schematic outline of the relations between the spatial ("space"), craniocentric ("head"), and oculocentric ("eye") frames of reference. Spatial (or body-centered) frame of reference is fixed to the laboratory room. From the scheme, the following vectorial transformations are obtained: Gs = Eh + Hs, Th = Eh + Te, and Ts = Hs + Th = Hs + Eh + Te. Note that eye and head are unaligned in this particular example. If the eye and head were aligned with the straight-ahead fixation point, the origins of the three coordinate systems would coincide. Gs, gaze-in-space or eye-in-space; Eh, eye-in-head; Hs, head-in-space; Ts, target-in-space; Te, target-re-eye or gaze motor error; Th, target-re-head or head motor error.

First, the eye coil was calibrated with the head in a fixed, comfortable straight-ahead position by letting the subject fixate the 85 different LEDs. Then the relationship between azimuth (A), elevation (E), and the horizontal/vertical components of the eye-position signals was determined off-line. This procedure yielded the eye position in space, or gaze, Gs. The accuracy of this calibration method was better than 4% over the entire recording range (±45° in all directions).

Subsequently, the head-coil signals were calibrated by measuring various head positions in space using the results of the eye-coil calibration. To that end, the subject fixated a small spot at the end of a head-fixed lightweight aluminum rod (40 cm; mounted on the subject's helmet) while directing the head at the different LED positions. In this way, the raw head-in-space signals could be mapped onto the calibrated eye-in-space positions after subtraction of a constant offset (Hs = Gs - Gs0). This offset, Gs0, equals the fixed eye position in the head and is measured when the head is straight ahead (i.e., if Hs = 0, then Gs ≡ Eh = Gs0). This procedure yielded head in space, Hs.

In this paper, "eye position" designates the eye-in-head position, whereas head and gaze position both refer to the spatial coordinate frame that is fixed to the laboratory room (see Fig. 3).

SACCADE DETECTION AND SELECTION. Saccades were detected on the basis of the calibrated signals by a computer algorithm that applied separate velocity and mean-acceleration criteria to saccade onset and offset, respectively. Markings were set independently for gaze- and head-in-space signals. All detection markings were visually checked and could be updated interactively by the experimenters to correct saccade recognition failures of the algorithm. To ensure unbiased detection criteria, no stimulus information was provided to the experimenter.
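The following sketch (Python) illustrates the principle of such threshold-based marking on a calibrated one-dimensional position trace. The numerical threshold is illustrative only, and for brevity a velocity criterion is used at both onset and offset, whereas the actual algorithm applied a mean-acceleration criterion at offset:

import numpy as np

def mark_saccade(pos, fs=500.0, v_thresh=40.0):
    # pos: calibrated position samples (deg); fs: sample rate (Hz).
    pos = np.asarray(pos, dtype=float)
    vel = np.gradient(pos) * fs                  # velocity in deg/s
    above = np.flatnonzero(np.abs(vel) > v_thresh)
    if above.size == 0:
        return None                              # no saccade detected
    onset = above[0]
    below = np.flatnonzero(np.abs(vel[onset:]) < v_thresh)
    offset = onset + below[0] if below.size else pos.size - 1
    return onset, offset                         # sample indices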

Saccades associated with blinks or with anomalous, multipeaked velocity profiles were discarded from the analysis. Also, responses with first-saccade onset latencies <80 ms or >400 ms were excluded from the experiment 2 and 3 data sets (mean latencies typically between 180 and 240 ms). For responses obtained in experiment 1 sessions, markings were set at the beginning and end of the first and second gaze shifts, each of which could consist of more than one saccade (see RESULTS). Responses with latencies (relative to the onset of the visual target) <150 ms were discarded.

STATISTICS. The least-squares criterion was applied to determine the optimal fit parameters in all fit procedures (see RESULTS). Confidence limits of fit parameters were estimated by the bootstrap method (Press et al. 1992; Van Opstal et al. 1995).
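A minimal sketch of such a bootstrap estimate (Python; resampling data pairs with replacement and refitting; the parameters are illustrative):

import numpy as np

def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    # Bootstrap confidence interval for a least-squares slope.
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x), np.asarray(y)
    slopes = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, x.size, x.size)    # resample trials
        slopes[i] = np.polyfit(x[idx], y[idx], 1)[0]
    return np.quantile(slopes, [alpha / 2, 1 - alpha / 2])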

Extracted parameters

EXPERIMENT 1. The initial (o) and final (e) positions of gaze- and head-in-space were determined for each of the two gaze shifts. From these, the gaze and head displacement vectors (ΔG = Gse - Gso and ΔH = Hse - Hso) were calculated as well as the eye-in-head positions (Eh = Gs - Hs) at onset and offset of each gaze shift. We also computed the gaze and head end errors (GE and HE) with respect to the auditory target. These errors were defined as the difference between the target-in-space position, Ts, and the position of the eye- and head-in-space, respectively, at the end of the second gaze shift (GE = Gse,2 - Ts and HE = Hse,2 - Ts). Gaze and head motor errors (GM and HM) were calculated for the second response toward the auditory target. These motor errors were defined as the difference between the target-in-space position and the position of the eye- and head-in-space, respectively, at the onset of the second gaze shift (GM = Ts - Gso,2 ≡ Teo,2 and HM = Ts - Hso,2 ≡ Tho,2). Note that these vectors indicate the gaze/head movement that is needed to realign the eye/head with the target position. Finally, the initial target-re-head position of the auditory stimulus was calculated from the head-in-space position measured before the first movement and the known target-in-space location (Thini ≡ Ts - Hso,1). See, for illustration, Fig. 7.
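In code form, these definitions read as follows (Python; a direct transcription of the preceding text, all quantities being 2-D azimuth/elevation vectors in deg, given as NumPy arrays):

import numpy as np

def experiment1_measures(Gs_o2, Gs_e2, Hs_o1, Hs_o2, Hs_e2, Ts):
    dG2    = Gs_e2 - Gs_o2     # 2nd gaze displacement
    GE     = Gs_e2 - Ts        # gaze end error
    HE     = Hs_e2 - Ts        # head end error
    GM     = Ts - Gs_o2        # gaze motor error at 2nd-movement onset
    HM     = Ts - Hs_o2        # head motor error at 2nd-movement onset
    Eh_o2  = Gs_o2 - Hs_o2     # eye-in-head at 2nd-movement onset
    Th_ini = Ts - Hs_o1        # initial target-re-head position
    return dG2, GE, HE, GM, HM, Eh_o2, Th_ini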

EXPERIMENT 2. Spatial end points of primary gaze and head saccade vectors were determined as well as the final gaze positions. The latter measure included possible secondary saccades occurring after the head movement.

EXPERIMENT 3. Spatial end points of primary ocular saccades were determined as well as the actual static head position. From these, the end positions of the eye-re-head (Eh = Gs - Hs) were computed. The target-re-head positions were computed from the measured head-in-space positions and the known target-in-space locations (Th = Ts - Hs). Note that for broadband noise stimuli, it is physically impossible to generate an accurate ocular saccade when the target is presented outside the oculomotor range. Therefore, trials in which the actual target eccentricity relative to the head exceeded 40° were excluded from the analysis.


    RESULTS

Experiment 1: visual-auditory double-steps

In this experiment, it was tested whether the audiomotor system uses eye-head movement information to maintain spatial accuracy.

Figure 4 shows two typical eye-head movement responses in the double-step paradigm together with the applied target configurations and associated stimulus timings. As explained before (see INTRODUCTION and Fig. 1), the subject (JG) had to orient eyes and head in turn to the two points in space where a visual target (V) and an auditory noise target (N) had been presented successively in total darkness. The first gaze shift toward the extinguished visual target results in a displacement of both the eye and the head, from the starting point (F) where the acoustic target was last heard at position N to a new point in space (V). If the subject was using only the head-centered acoustic input, the second response would end at a wrong spatial location, as indicated by the dashed square. Note, however, that the second, auditory-evoked gaze shift appears to be quite accurate in both examples, even though all movements were clearly executed under open-loop conditions. One also may observe that each gaze shift could consist of more than one saccade (see Fig. 4D) and that gaze is stabilized in space at the end of each saccade, even when the head continues to move toward the target location. The latter is due to the action of the vestibulo-ocular reflex, which causes a counterrotation of the eye in the orbit.



Fig. 4. Two typical eye-head double-step responses from experiment 1. A and B: 2-dimensional trajectories of gaze (thin traces) and head (thick traces) movements in space. F, V, and N indicate positions of the fixation spot and the visual and auditory targets, respectively. The auditory target consisted of broadband white noise. Note that the 2nd, auditory-evoked gaze and head movements are directed toward the real, spatial location of the acoustic stimulus. If the 2nd eye-head responses were programmed purely on the basis of the head-centered acoustic input, the movements would be expected to end near the dashed squares. C and D: gaze (thin traces) and head (thick traces) position in space as a function of time. Timing of the different stimulus events is also shown. Note that the auditory stimulus is extinguished before or near the onset of the 1st movement. Subject JG.

To document the accuracy of the eye-head movements in this paradigm, Fig. 5 shows the spatial trajectories of gaze double-step responses to six different visual-auditory target combinations for another subject (PH). One may observe readily that all second responses were aimed at the spatial location of the auditory target rather than to the shifted head-centered position (dashed squares). In the two double-step responses (Fig. 5, A and D, top), for example, the head-centered target position shifts (approximately) to a spatial location where auditory stimuli were presented during other trials (as in Fig. 5, B and E). Nevertheless the subject's second eye-head movement is clearly directed toward the site where the noise burst actually had been presented before the first movement.



Fig. 5. Maintenance of spatial accuracy in the double-step paradigm. A-C: spatial trajectories of gaze (thin traces) and head (thick traces) double-step responses in the left hemisphere. Following a leftward response to the visual target (V), the subsequent auditory-evoked movements ended near the auditory noise target (N), presented at 3 different locations in space. D-F: eye-head double-step responses in the right hemisphere. In all cases the 2nd, auditory-evoked movements ended closest to the sites where the acoustic stimuli were presented. Without compensation for the intervening eye-head saccades, the second movements would have ended at the wrong, head-centered target locations (dashed squares). Subject PH.

The impression gained from the data in Figs. 4 and 5 is that the intervening eye-head movement elicited by the visual target is taken into account in programming the second gaze shift toward the auditory stimulus. To further quantify this behavior, we analyzed the horizontal components of head and gaze movements in more detail (see Fig. 7 for a schematic outline). Figure 6 shows the results of this analysis for one of our subjects (subject BB; data pooled for all target configurations). One may clearly observe (Fig. 6A) that the second, auditory-evoked gaze displacements in double-step trials (○) correlate well with the actual gaze motor errors (i.e., the required gaze displacement to end on target), indicating that these gaze shifts were all goal-directed. A similar observation can be made for the concomitant head-displacement components, although in this case the relation to head motor error is less tight (Fig. 6B).



Fig. 6. Scatter plots of auditory-evoked head and gaze responses in experiment 1. A and B: horizontal gaze and head displacement components plotted as a function of their respective motor-error components. Both single-step control responses (●) and 2nd responses of the double-step (○) are indicated. Note the high correlation between gaze displacement and gaze motor error, indicating that the second gaze shifts were goal-directed (see also Table 1). The following correlation coefficients (r), slopes (a), and biases (b) were obtained: Gaze: single-steps: r = 0.98, a = 0.97, b = 2.6°, n = 30; double-steps: r = 0.98, a = 1.11, b = 3.1°, n = 84; Head: single-steps: r = 0.97, a = 0.47, b = 6.8°; double-steps: r = 0.73, a = 0.62, b = 3.0°. C and D: horizontal gaze and head end errors with respect to the noise target are plotted as a function of the initial horizontal head displacement. Single-step controls are plotted at 0 initial head displacement. If the head displacement is not accounted for, the data are expected to align with the diagonal line. The following errors (mean ± SD; positive when the movement ends to the right) were obtained for targets in left and right hemifields (i.e., left and right data clusters in C and D): Gaze: single-steps: GEL = 3 ± 4 deg, GER = 3 ± 3; double-steps: GEL = 6 ± 3, GER = 0 ± 4; Head: single-steps: HEL = 16 ± 7, HER = -2 ± 6; double-steps: HEL = 9 ± 5, HER = 0 ± 5. Subject BB.

Figure 6 also shows the horizontal gaze and head end errors (Fig. 6, C and D) as a function of the first, visually evoked head displacement. If the initial head displacement had not been accounted for, the data would be aligned along the diagonal. This is clearly not what happens. Full compensation requires the data to scatter around the horizontal line (0 error). Although this does not occur precisely, the gaze end errors obtained in double-step trials (Fig. 6C) were not very different from those measured in the single-step control condition (●; plotted at 0 initial head displacement).

When the gaze end errors were analyzed separately for movements toward acoustic targets in the left and right hemifields, small differences (mean <5°; P < 0.01) between the two conditions typically were observed (see, e.g., left and right data clusters in Fig. 6C). Roughly, the gaze end points deviated slightly to the right (positive end errors) when the initial head displacement was to the left (negative values) and vice versa.

As may be inferred from the data in Figs. 4 and 5, the positions of the eye and head in space are different at the end of the first gaze movement. Hence the eye and head are typically unaligned (i.e., the eye is not centered in the orbit) at the onset of the second eye-head movement. Therefore, to determine to what extent the audiomotor system accounts for the movements of both eye and head, the second gaze shifts (ΔG2) were described as a function of the initial target-re-head position (Thini), the eye-in-head position at the onset of the second gaze response (Eho,2), and the initial head displacement (ΔH1) (see Fig. 7):
ΔG2A = a · ThiniA + b · Eho,2A + c · ΔH1A + d     (1)

where the superscript A refers to the azimuth component of each vector. Fit parameters (a, b, and c) are dimensionless gains, whereas d is a bias (in deg).
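In practice, the parameters of Eq. 1 follow from ordinary least squares on the trial-by-trial data. A minimal sketch (Python; one azimuth value per double-step trial, each argument a NumPy array):

import numpy as np

def fit_eq1(dG2, Th_ini, Eh_o2, dH1):
    # Least-squares estimates of a, b, c, and d in Eq. 1.
    X = np.column_stack([Th_ini, Eh_o2, dH1, np.ones(len(dG2))])
    (a, b, c, d), *_ = np.linalg.lstsq(X, dG2, rcond=None)
    return a, b, c, d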



Fig. 7. Analysis of the double-step responses. First and 2nd gaze displacements were calculated (ΔG1 and ΔG2) as well as the actual gaze motor errors (GM) and the gaze end errors (GE). Same parameters were computed for the concomitant head movements (see METHODS for further details). Second gaze shift (ΔG2) was described as a function of the initial target-re-head position (Thini), the eye-in-head position at the onset of the 2nd gaze response (Eho,2), and the initial head displacement (ΔH1) (see Eq. 1). In this hypothetical example, no compensation for ΔG1 would yield a secondary gaze shift to site A. Compensation for eye position only would yield a response to site B. Full compensation, for both eye position and head displacement, would yield a spatially accurate movement to site C.

Note that if the changes in neither eye nor head position are accounted for (i.e., response to site A in Fig. 7), the gaze displacement would only correlate with the initial target-re-head position (a ≈ 1, b = c = 0). If subjects were to account only for eye position (Frens and Van Opstal 1994; Hartline et al. 1995; Jay and Sparks 1984), the movements would be directed to the head-centered location of the acoustic target (i.e., response to site B in Fig. 7). In this case, a negative correlation with eye position is expected, but no correlation with head displacement (a ≈ 1, b ≈ -1, and c = 0). However, when the movements of both eye and head are accounted for, that is, if the subject makes an orienting movement toward the spatial target location (i.e., response to site C in Fig. 7), negative gains are expected for both eye position and head displacement, and the value of these coefficients (b and c) should be close to -1. As may be derived from Table 1, this is indeed what is observed. Thus the data are described succinctly by: ΔG2 = GMini - ΔG1, where GMini is the gaze motor error with respect to the noise target during initial straight-ahead fixation and ΔG1 the first gaze displacement.
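That this succinct description follows from the fitted gains can be verified directly. With a = 1, b = c = -1, and d = 0, and assuming that eye and head have come to rest between the two gaze shifts (Gso,2 = Gse,1 and Hso,2 = Hse,1), Eq. 1 reduces to

ΔG2 = Thini - Eho,2 - ΔH1
    = (Ts - Hso,1) - (Gso,2 - Hso,2) - (Hso,2 - Hso,1)
    = Ts - Gso,2
    = GMini - ΔG1

where the last step uses GMini = Ts - Gso,1 and ΔG1 = Gso,2 - Gso,1.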


                              
Table 1. Multiple linear regression results of experiment 1 

The values of the gains for both the eye and head components (b and c) appeared to be slightly more negative than -1, indicating that there was a small overcompensation. The latter explains the small deviations observed in the gaze end errors (see preceding text). We suspect that this feature arose because our subjects were encouraged to make fast head movements (see METHODS). In this respect, it may be noted in Figs. 4 and 5 that the first response typically also ended beyond the position of the visual target (overshoots).

Experiment 2: tone-evoked eye-head saccades

As was shown in a recent study by Frens and Van Opstal (1995), ocular saccades evoked by a pure tone have accurate azimuth components, but the elevation component is independent of the actual stimulus elevation. The latter also is illustrated for one of our subjects in Fig. 1B.

It is conceivable that the localization of tone stimuli might improve appreciably under head-free conditions. As explained in the INTRODUCTION, the direction-dependent pinna filtering may give rise to specific changes in monaural sound intensity as a function of head position, resulting in otherwise absent elevation cues. Therefore it was tested whether head movements that are made during head-free gaze shifts would lead to an improvement of the subject's localization accuracy in the elevation domain.

The results indicate that the use of such head movements did not lead to better localization performance when compared with the head-fixed condition. This is shown in Fig. 8, A-D, which depicts the azimuth and elevation components of the primary gaze end points as a function of the respective target components for a 1.5-kHz tone. One may observe that the head-fixed (Fig. 8, A and C) and head-free (Fig. 8, B and D) responses were quite comparable both in azimuth and in elevation. Similar results were obtained when final gaze end points (i.e., after possible secondary saccades) were considered. Note, however, that our subjects were instructed to make rapid orienting movements; they were not allowed to make a variety of (slow) scanning head movements ("search strategy"). This is illustrated in Fig. 8E, showing examples of head and gaze trajectories obtained in the head-free condition. Interestingly, both the head and gaze trajectories tended to end at fixed, but different, elevations.



Fig. 8. Localization of pure tone stimuli (1,500 Hz) under head-fixed and -free conditions. A-D: azimuth and elevation components of primary gaze end points as a function of the respective target components. Localization under head-fixed and -free conditions is indistinguishable (see also Table 2). Obtained correlation coefficients (r), slopes (a), and biases (b) were Azimuth: Head-fixed: r = 0.93, a = 0.75, b = -4.7°; Head-free: r = 0.95, a = 0.83, b = -2.2°; Elevation: Head-fixed: r = 0.05, a = 0.04, b = 15.6°; Head-free: r = 0.09, a = 0.02, b = 14.4°. E: spatial trajectories of gaze (thin traces) and head (thick traces) movements in the 2-dimensional plane. Interestingly, both gaze and head movements tend to end at fixed, but different, elevations, despite the fact that tone stimuli (T) were presented at 16 different locations. Subject MF.

To quantify the tone-evoked responses in the elevation domain, linear regression lines were fitted to the gaze (GsE = a · TsE + b) and head data (HsE = a · TsE + b), respectively, yielding slopes a (dimensionless) and biases b (in deg). Table 2 lists these coefficients together with the correlation coefficients for all three subjects and both experimental conditions (1st and 2nd line of each entry). Note that the biases, b, strongly depended on the tone frequency (particularly in subject NC) (see also Frens and Van Opstal 1995). This frequency dependence, quite comparable in the two conditions, together with the subject's azimuth accuracy (not listed, but see Fig. 8, A and B, for typical examples), clearly indicates that the elevation component of tone-evoked saccades is based on the available acoustic input rather than on a nonspecific orienting strategy. Yet head-movement-related changes in sound intensity were apparently not used to improve the response accuracy because the slopes and correlation coefficients remained close to zero in the head-free condition.


                              
Table 2. Linear regression results of experiment 2 

When the test stimulus consisted of broadband noise, instead of a pure tone, gaze saccades with and without head movements were equally accurate in all directions (4 subjects; data not shown). This may not be too surprising because the localization performance under head-fixed conditions is already good for broadband stimuli (see following text) (Frens and Van Opstal 1995).

Experiment 3: changes in static head elevation

The results of experiment 1 show that accurate auditory-evoked orienting also includes the use of eye- and head-motor information. In experiment 3, static head elevation was changed to investigate whether head-position signals, rather than head-displacement signals, are involved in the programming of orienting movements toward acoustic stimuli.

Previous experiments have shown that ocular saccades toward broadband noise targets, with the head in the straight-ahead position, are accurate in all directions (e.g., Frens and Van Opstal 1995). Figure 9 shows that eye movements toward broadband noise bursts (500-ms duration) were also accurate when evoked from different static vertical eye-head positions (see also Table 3A). Similar results were recently obtained under dynamic, head-free conditions (Goossens and Van Opstal 1997).



Fig. 9. Localization of broadband noise targets from different vertical head positions. Plotted are the end point components of primary ocular saccades relative to the head (eye and head initially aligned). In contrast to the tone data in Fig. 1B, localization performance is unaffected when the stimulus consists of broadband noise (duration 500 ms), even when vertical head position is changed (see also Table 3A). Correlation coefficient r > 0.95 in all panels. Subject JG.


                              
Table 3. Multiple linear regression results of experiment 3

These results may be expected because for broadband noise stimuli, a change in static head elevation also results in a change of the head-centered spectral cues, revealing the actual target elevation relative to the head. Indeed this experiment is conceptually similar to a single-step visuomotor paradigm in which the retinal location of the target may change with eye position but the actual retinal error will always equal the motor error needed to foveate the stimulus. However, when the acoustic target is a pure tone stimulus, changes in head position yield no reliable changes in the elevation cues. This is clearly shown in experiment 2, where subjects were unable to extract sound elevation even when head movements were made (Fig. 8).

When based only on the head-centered acoustic input, it therefore is expected that tone-evoked ocular saccades from different eye-head positions in space (eye and head initially aligned) end at a fixed elevation relative to the head (Fig. 1C). However, as illustrated in Fig. 10A for a single-tone experiment with one of our subjects (JG), the saccade trajectories toward 5.0-kHz tone stimuli were neither parallel (which would indicate independence of head position) nor directed toward a fixed elevation in space (which would be expected in the case of a spatial code for the auditory target). Instead, the eye-movement trajectories appeared to be in a direction that is in between these two extremes.



Fig. 10. Localization of 5.0-kHz tone stimuli from different vertical head positions in a representative single-tone experiment. A: spatial trajectories of ocular saccades elicited from 3 different vertical positions, with the head approximately aligned (within 10°) during initial fixation. Note clear dependence of saccade direction on head position. B: saccade endpoint elevation relative to the head is related to static head elevation in space. Correlation coefficient: r = -0.70; Regression: slope = -0.36, bias = 10.7°. Qualitatively similar results were obtained for the other frequencies tested (see Table 3B). Subject JG.

This feature is quantified further in the scatter plot (Fig. 10B), showing the eye-in-head elevation of the first-saccade end points as a function of static head-in-space elevation. As was explained in the INTRODUCTION (Fig. 1C), a head-centered representation of the target would yield data points that scatter around a horizontal line (i.e., fixed eye-in-head elevation; slope = 0). For a space-fixed representation, one would have expected an alignment of the data along the diagonal line with slope -1 (i.e., full compensation). Instead it was found in this experiment that the data points scatter around a line with a slope of -0.36.

To quantify the influence of head position, we described the measured eye-in-head elevation (EhE) of the saccade end points as a function of both the static head-in-space elevation (HsE) and the actual target-re-head elevation (ThE):

EhE = a · HsE + b · ThE + c     (2)

with superscript E the elevation component of each vector and a-c the fit parameters of the multiple linear regression. A quantitative summary of the results for all experiments with each subject is given in Table 3.

Table 3A lists the results obtained with broadband noise stimuli. In contrast to the tone-evoked responses, the noise-evoked responses were determined mainly by the actual target-re-head elevation, whereas the head-in-space elevation had little or no influence (b ≫ a). Also, the biases, c, were close to zero for all three subjects. In fact, the data sets obtained from subjects JG and RV were equally well described when head position was excluded from the regression analysis (no difference between the 2 models; P > 0.1).

Table 3B lists the results when the test stimulus was a pure tone. Note that in the large majority of these experiments, the head-position-related gain, a, was significantly different from zero (P < 0.01) and typically between -1 and 0, whereas target elevation had no influence (a ≫ b, with b ≈ 0). Thus all subjects typically showed a partial rather than full compensation for changes in static head position. One may also observe a clear frequency dependence of the bias, c. Typically, the high-frequency tones were perceived at higher elevations than low-frequency tones when the head was oriented straight ahead (see also EXPERIMENT 2, Table 2) (Frens and Van Opstal 1995). More surprisingly, the head-position-related gain, a, appeared to depend on the applied frequency too, albeit not in a (simple) systematic way.

To study this frequency-dependent influence of head position under different conditions as well, we performed experiments in which different tone frequencies were interleaved randomly with broadband noise (see METHODS). Figure 11 shows the results of such an experiment with subject JO. Note that under these mixed-stimulus conditions too, both the slopes and the biases of the linear regression lines clearly depend on the applied frequency. As was likewise found in the single-tone experiments for this particular subject (see Table 3B), the higher frequency tones were localized at higher elevations relative to the head, and the responses toward these tones were more strongly influenced by head position.



Fig. 11. Frequency-dependent influence of static head elevation on tone-evoked orienting. Similar results as in Fig. 10 were obtained when 4 different frequencies (1.0, 2.0, 5.0, and 7.5 kHz) were interleaved randomly with broadband noise. Note, however, that both the biases and the slopes of the linear regression lines depend on the applied tone frequency (P < 0.001; see also Table 4). Subject JO.

Table 4 summarizes the results of the experiments with mixed stimuli. Note that both the head-position-related gain, a, and the bias, c, clearly depend on the tone frequency (P < 0.001; left columns) in each subject. One may also observe that the bias, c, varied with frequency in a manner similar to that observed in the other tone experiments (Table 2 and Table 3B) and that the head-position influence was always smallest for noise-evoked saccades. However, the absolute values of the head-position-related gain, a, appeared to differ somewhat from those obtained in the single-tone experiments (see Table 3B, subjects JG and JO). On average, the gain varied more systematically with tone frequency, although 1.0-kHz tones still yielded a higher gain than 5.0-kHz tones (P < 0.01) in subject JG. To test the reproducibility of this finding, we repeated the mixed-stimulus experiment in this subject. Note that, except for the 2.0-kHz tone, the regression results were virtually identical for the two experiments.


                              
Table 4. Multiple linear regression results of experiment 3: mixed stimuli

Unlike the noise-evoked responses, the tone-evoked responses were always equally well described by a linear regression model without target elevation (no significant difference; P > 0.1). This indicates, of course, that no valid spectral cues were present in the tone stimuli, as was also verified by measurements of the sound spectra (see METHODS). Note, however, that despite the lack of elevation cues, the subjects still had to rely on the acoustic input for adequate tone localization because target azimuth was always randomized (see METHODS). The azimuth accuracy thus provides a criterion to test, in each individual experiment, whether the tone-evoked responses were indeed guided by the acoustic stimulus.

To evaluate the azimuth accuracy in each experiment, we quantified the measured eye-in-head endpoint azimuth as a function of the actual target-re-head azimuth (EhA = a · ThA + b). Table 4 (right columns) lists the results of this analysis for the mixed-stimulus experiments. Note that a high correlation between saccade azimuth and target azimuth [r(a)] was obtained for both noise- and tone-evoked responses and that the regression coefficients were comparable for all stimulus types. Hence these data clearly indicate that the tone-evoked saccades were indeed based on the available auditory input. Similar results were obtained in all other tone localization experiments (see e.g., Fig. 8) (see also Frens and Van Opstal 1995).
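
This azimuth criterion is a simple first-order regression computed per stimulus type. A brief sketch in the same vein (again our illustration, not the study's code):

```python
import numpy as np

def azimuth_accuracy(Th_A, Eh_A):
    """Fit EhA = a * ThA + b and return (a, b, r); a slope near 1 and a
    high correlation r indicate stimulus-guided responses."""
    a, b = np.polyfit(Th_A, Eh_A, 1)   # first-order polynomial fit
    r = np.corrcoef(Th_A, Eh_A)[0, 1]  # Pearson correlation
    return a, b, r
```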


    DISCUSSION

The present study set out to investigate to what extent the auditory localization system accounts for intervening changes in eye and head orientation when programming a goal-directed gaze shift. The results of the double-step experiments indicate that when the relevant acoustic cues have been provided, the change in eye and head position is taken fully into account (Figs. 4-6; Table 1), so that the subsequent gaze shifts of both the eye and the head are directed toward the remembered acoustic target position. We therefore conclude that the auditory system does not maintain a static head-centered representation of acoustic targets but instead uses accurate information about previous movements of both the eye and the head. Both signals are needed to ensure spatially accurate localization, especially under open-loop conditions.

We are confident that the results of our tone localization experiments also reflect properties of the processes that underlie spatially accurate orienting behavior, for the following reasons. First, azimuth accuracy was maintained in all localization experiments (Figs. 8 and 9 and Table 4). Second, the bias of tone-evoked saccades with respect to the head was frequency dependent, both in experiment 2 (Table 2) and in experiment 3 (Fig. 11, Tables 3B and 4). Finally, the influence of head position depended on the tone frequency, both under single-tone conditions (Table 3B) and under mixed-stimulus conditions (Fig. 11 and Table 4). These features cannot be understood if the subjects had followed a nonspecific orienting strategy; in that case, the responses should have been identical for all tone stimuli irrespective of their frequency and spatial location.

Yet the finding that the two versions of experiment 3 yielded slightly different quantitative results (Tables 3 and 4) suggests that nonacoustic factors other than eye and head motor signals may also play a role in auditory-evoked orienting. For example, it is conceivable that the weights assigned to the head-position input are modulated by the system's confidence in the available elevation cues. In the single-tone condition, no valid spectral cues were present, but when tones were interleaved with broadband noise, the average reliability of the elevation cues was evidently different. As a consequence, the overall head-position influence may have been set at different values for the two conditions.

COMPETING MODELS. Although feedback about eye-head movements is clearly used by the audiomotor system, the double-step data of experiment 1 do not unequivocally show that acoustic targets are represented within a spatial frame of reference. Two possible interpretations may follow from these results.

According to the spatial model (Fig. 12A), the auditory system computes a target position in space-fixed coordinates, Ts, by combining the head-centered acoustic cues (reflecting target-re-head, Th) with head-position signals (Ts = Th + Hs; see e.g., Fig. 3). The latter may be derived from the vestibular system, from proprioception, or from an efference copy of the motor command. Regardless of intervening eye-head movements, the end point of a movement is always specified by the space-fixed target representation. In a subsequent stage, the gaze control system uses information about the actual eye and head positions to translate the desired gaze displacement, ΔGd, into appropriate motor commands for both the eye and head motor systems.
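
In a minimal one-dimensional (elevation-only) sketch, the spatial model reduces to two sums and a difference; the function and variable names below are ours, not part of the original scheme:

```python
def spatial_model_dGd(Th, Hs_sound, Eh_now, Hs_now):
    """Spatial model (Fig. 12A), 1-D sketch (all angles in deg).

    Th:       head-centered target elevation when the sound was presented
    Hs_sound: head-in-space elevation at that moment
    Eh_now, Hs_now: eye-in-head and head-in-space elevations at the time
                    the targeting gaze shift is programmed
    """
    Ts = Th + Hs_sound        # space-fixed target representation
    Gs_now = Eh_now + Hs_now  # current gaze direction in space
    return Ts - Gs_now        # desired gaze displacement, Delta-Gd
```

Because Ts is anchored in space, any intervening change in eye or head position drops out of the final gaze command automatically.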



Fig. 12. Two possible interpretations of the double-step results. A: spatial model. Acoustic localization cues that reflect the head-centered target coordinates (Th) are combined with a head-position signal (Hs) to yield a spatial target representation (Ts) (see also Fig. 13 for more details on this stage). Downstream gaze control system uses the actual eye (Eh) and head (Hs) positions to translate the spatial (or body-fixed) target coordinates into a motor command that specifies the desired gaze displacement (ΔGd). B: displacement model. Head-centered acoustic signals (Th) are transformed into an oculocentric target representation (Te) by subtracting eye position (Eh). Each time the eye moves, this oculocentric target representation must be updated by subtracting the gaze displacement (ΔG). Latter signal is derived from a so-called resettable neural integrator that integrates a gaze-in-space velocity signal (Ġs), equal to the vectorial sum of eye-in-head velocity (Ėh) and head-in-space velocity (Ḣs), and that must be reset to 0 ("reset" signal) after each gaze shift. Updated oculocentric target representation specifies the desired gaze displacement (ΔGd). Note that this model does not use a representation of head position. In both schemes, the eye and head motor systems are driven by an oculocentric (ΔG) and craniocentric (ΔH) error signal, respectively (see Goossens and Van Opstal 1997). Both signals are derived from the desired gaze-displacement command (ΔGd), which is assumed to arise from the deep layers of the superior colliculus (e.g., Freedman and Sparks 1997; Freedman et al. 1996).

In the displacement model (Fig. 12B), however, the head-centered acoustic signals are first mapped into an oculocentric reference frame by subtracting eye position (Te = Th - Eh; see Fig. 3) (see e.g., Jay and Sparks 1984). To compensate for intervening eye-head movements, each gaze displacement, ΔG, is subtracted from the oculocentric target coordinates, Te, yielding an updated target representation (Te,new = Te - ΔG) (Goldberg and Bruce 1990). This latter oculocentric signal specifies the desired gaze displacement, ΔGd (see legend of Fig. 12 for further details).
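
The displacement model admits the same reduced sketch; note that no head-position signal enters the computation anywhere:

```python
def displacement_model_dGd(Th, Eh_sound, gaze_shifts):
    """Displacement model (Fig. 12B), 1-D sketch (all angles in deg).

    Th:          head-centered target elevation when the sound was presented
    Eh_sound:    eye-in-head elevation at that moment
    gaze_shifts: intervening gaze displacements Delta-G, each delivered by
                 the resettable integrator of gaze velocity
    """
    Te = Th - Eh_sound        # oculocentric target representation
    for dG in gaze_shifts:
        Te -= dG              # update after every intervening gaze shift
    return Te                 # desired gaze displacement, Delta-Gd
```

For a double-step trial, both sketches return the same desired gaze displacement; the two schemes diverge only when head position changes without a corresponding gaze-displacement signal, as in the static head-position conditions of experiment 3.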

As outlined in the INTRODUCTION, comparable interpretations have been put forward in the oculomotor literature to explain the spatial accuracy of head-fixed double-step saccades toward extinguished visual targets (position model: Hallett and Lightstone 1976; Mays and Sparks 1980; Robinson 1975; displacement model: Goldberg and Bruce 1990; Jürgens et al. 1981). It should be realized, however, that a model relying only on a gaze displacement signal (without accounting for the position of the eye in the orbit and either the position or displacement of the head) cannot readily explain why head movements are goal-directed when the eye and head are not initially aligned (Goossens and Van Opstal 1997). Under these conditions, the eye and head move simultaneously in different directions during single-step gaze saccades toward both visual and auditory targets.

Strictly speaking, neither the position model nor the displacement model can, in its present form, account for the tone localization data of experiment 3, where the observed head-position gains were typically in between the two extremes predicted by these conceptual models. An important feature of the displacement model, however, is that it does not use a head-position signal (see preceding text and Fig. 12B). Therefore, according to this model, gaze trajectories elicited by pure tones are always expected to be identical when evoked from different static eye-head positions. Yet the data obtained in experiment 3 clearly show that the eye-movement trajectories are not invariant for the different head elevations (Fig. 10A). Although the gains measured for the saccade elevation components were typically different from -1, they were always negative and, in the large majority of cases, significant (Figs. 10 and 11, Tables 3B and 4). Furthermore, changing the initial vertical eye position with respect to the head while keeping the head fixed at the straight-ahead position yielded no significant changes in the perceived sound elevation (subjects JG and NC; 0.5, 2.0, and 7.5 kHz; data not shown) (see also Frens and Van Opstal 1994). Our present data therefore provide strong support for the idea that a signal about absolute head position, rather than head displacement, is used in programming goal-directed eye-head movements toward sound sources. However, to account for the low-gain results of the tone experiments, the position model of Fig. 12A has to be extended (see following text and Fig. 13, where the dashed box of Fig. 12A is detailed further).



Fig. 13. Conceptual model. Head-centered spectral elevation cues arise from the direction-dependent pinna filter functions (HRTFs) and are analyzed at a level where acoustic signals are represented tonotopically (Tonotopic Map). Head-position information is proposed to act at this acoustic level and to modulate all frequency channels in a frequency-specific manner (represented by the size of the arrows), which may be determined by the reliability of a given elevation cue in the HRTFs. Peak firing rate of auditory neurons in the tonotopic array could be influenced by changes in initial head position in a similar way as has been found in the visuomotor pathway: Fpk(H) = F0 · (1 + a(fc) · H), with a(fc) a head-position gain that depends on the characteristic frequency of the particular cell and F0 the peak firing rate of the cell at its characteristic frequency when looking straight ahead (H = 0). Weighted output of the entire population of cells, rather than any single frequency band, contains spatially accurate information that is subsequently used to drive the eye and the head toward the acoustic stimulus position. When only 1 tone is presented, the effective population gain of head position will be lower than 1. This model is thought to embody the dashed box in Fig. 12A. Indicated frequencies refer to the tone stimuli applied in this study.

TONE ELEVATION CUES. Tone stimuli do not contain sufficient spectral information to reliably determine target elevation relative to the head. This leaves the auditory system with two potential alternative cues: reflections from shoulders and trunk and changes in sound intensity (up to ~20 dB) due to the direction-dependent pinna filters (HRTFs). However, these cues are apparently of no use because subjects clearly perceive a fixed target elevation when the head is kept in the straight-ahead position (Frens and Van Opstal 1995; see also Figs. 1B and 8), despite large elevation changes of the target (±35°). Indeed reflections from shoulders and trunk are expected to be relevant only in the low-frequency range (<250 Hz), and without prior knowledge about the real source intensity and its distance to the head, there is no way the system can rely on intensity cues to detect target elevation under static conditions.

When head movements are made, however, the resulting changes in sound intensity relate directly to the movement. Such dynamic cues could provide, at least in principle, new consistent information about the target elevation, especially when they are combined with signals from the head motor system. Because our subjects reported that the apparent loudness (perceived intensity) of a tone stimulus varied across trials, it is conceivable that the direction-dependent variations within the HRTFs would be sufficient to yield detectable cues when the head moves. Nevertheless, it appeared that subjects could not use this information either, because their tone localization responses did not improve under head-free conditions (Fig. 8). Instead it was observed that the saccade endpoint elevations under head-free and -fixed conditions were quite comparable (Table 2).

For broadband noise, we likewise observed no improvement (data not shown). These results are consistent with earlier findings reported by Whittington et al. (1981) for horizontal gaze saccades of the monkey. Nevertheless, previous studies (Noble 1981; Perrott et al. 1987; Thurlow and Runge 1967) have demonstrated that in humans, head movements do contribute to improved sound localization performance. However, the head movements in the latter studies were slow compared with the saccadic head movements in the present experiments and in the Whittington et al. study. This suggests that acoustic feedback provided by slow head movements, in tasks other than rapid gaze orienting, may be useful for sufficiently long stimulus durations.

INFLUENCE OF STATIC HEAD POSITION. The results of the tone experiments demonstrate that a head position signal is used in the programming of auditory-evoked saccades. The question arises, however, why only a partial compensation for changes in static head position is found in these experiments (gains between -1 and 0; Figs. 10 and 11, Tables 3B and 4).

A remarkable feature is that the gains depend on the tone frequency, albeit not necessarily in a simple systematic way (at least not in the single-tone experiments). Note that this frequency dependence is not readily understood if the acoustic input were first mapped into a topographical representation of auditory space and only then, at a later stage, combined with a head-position signal. Rather, our findings are explained more elegantly if the head-position signal interacts within the auditory system itself, rather than in the spatially accurate gaze motor system (see dashed box in Fig. 12A). Thus we propose that (reafferent) motor input acts at a sensory level where sounds are still represented tonotopically rather than topographically.

Tonotopic representations are found throughout the auditory system (Irvine 1986; Popper and Fay 1992). So far, a topographic map of auditory space has been shown to exist in the barn owl inferior colliculus (Knudsen and Konishi 1978) and in the superior colliculus of a number of mammalian species (cats: Middlebrooks and Knudsen 1984; ferrets: King and Hutchings 1987; guinea pig: King and Palmer 1983). Superior colliculus neurons in monkey (Jay and Sparks 1984, 1987) and cat (Peck et al. 1995) change their auditory receptive fields with eye position, suggesting that at that level the auditory target has been transformed into oculocentric coordinates. Note, however, that the latter does not account for the spatial accuracy of auditory-evoked saccades, as demonstrated by the gaze double-step experiments. Up to now, it is not known whether an explicit auditory representation in spatial coordinates exists.

For adequate two-dimensional sound localization, the human (and monkey) auditory system requires a broadband activation of many frequency channels to enable a reliable spectral analysis of the pinna-based cues. This is clearly the case for broadband noise stimuli, which are indeed accurately localized under all conditions tested (Figs. 4-6 and 9; Tables 1, 3A, and 4). For tones, however, the spectral cues are incomplete and unrelated to the orientation of the head. If head position is accounted for within the auditory system, it may therefore be less surprising that the compensation for head orientation is suboptimal under these circumstances.

Figure 13 proposes how the tone localization data could be reconciled with the tonotopic organization of the auditory system. In this conceptual scheme, the head-centered spectral cues are processed within the tonotopic auditory map, and it is assumed that head-position information is distributed over the entire map. The activity of the narrowband cells may be modulated by head position, where the gain of the modulation is related to stimulus frequency (represented by different arrow sizes). Such modulation could be embodied by a mechanism similar to that reported for various stages of the visuomotor system, where a cell's peak firing rate is modulated by changes in initial eye position (so-called "gain fields") (e.g., in monkey posterior parietal cortex: Zipser and Andersen 1988; in monkey superior colliculus: Van Opstal et al. 1995). These cells have been hypothesized to embody the transformation of the visual target from an oculocentric (and retinotopic) reference frame into a head-centered representation.

According to the proposal of Fig. 13, the spatial representation of the sound results from adequate weighting of the activity of the entire population of such auditory cells rather than from any single frequency band. In the case of a broadband noise stimulus, a large portion of the frequency map contributes to this weighting process, and responses are spatially accurate (i.e., full compensation in experiment 1 and an almost unity head-position gain in experiment 3). For pure tones, however, only a limited fraction of the cell population is recruited, resulting in a smaller overall head-position influence (i.e., a lower head-position gain, which is frequency dependent). In subject JG, for example, 2.0-kHz tones yielded a low and more variable head-position gain, which is symbolized here by a small synaptic efficacy of the head-position input (thin arrowhead). The 7.5-kHz stimuli yielded a higher gain in this subject, and therefore this frequency band receives a stronger head-position input (thick arrowhead).
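
One way to make this weighting argument concrete is the following toy read-out, in which undriven channels fire at a small, head-position-independent baseline rate that dilutes the effective gain whenever only a single channel is driven. All parameter values are our assumptions; the model itself only requires that the gains a(fc) be frequency specific:

```python
import numpy as np

# Toy read-out of the Fig. 13 scheme; all numbers are assumptions.
n = 40
freqs = np.geomspace(0.25, 16.0, n)  # kHz, hypothetical tonotopic map
a_fc = np.full(n, -0.95)             # per-channel head-position gains
baseline = 0.025                     # unmodulated baseline activity

def effective_head_gain(driven):
    """Activity-weighted head-position gain of the population output.
    `driven` marks channels activated by the stimulus (1) or not (0)."""
    driven = np.asarray(driven, dtype=float)
    activity = driven + baseline     # per-channel peak activity
    return np.sum(driven * a_fc) / np.sum(activity)

tone = np.zeros(n)
tone[np.argmin(np.abs(freqs - 5.0))] = 1.0    # single 5.0-kHz channel
print(effective_head_gain(tone))              # ~ -0.48: partial compensation
print(effective_head_gain(np.ones(n)))        # ~ -0.93: near-full compensation
```

Letting a(fc) vary across channels would, in the same manner, reproduce the observed frequency dependence of the tone-evoked gains.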

Note that this model does not need an explicit topographic neural map of auditory space. The spatial signal, implicitly present in the population activity, combined with the frequency-specific weighting patterns, could be used directly for controlling the downstream motor systems (see also Van Opstal and Hepp 1995).

INFLUENCE OF EYE POSITION. Recently, Lewald and Ehrenstein (1996) reported a frequency-dependent effect of static eye position on auditory lateralization (i.e., left-right discrimination of dichotic stimuli). It was suggested that this effect could relate to a remapping of the acoustic input into oculocentric coordinates. In view of our present results, one should consider an alternative explanation. Several studies report a relation between eye position in the orbit and neck-muscle activity (e.g., Andre-Deshays et al. 1988; Lestienne et al. 1984; Vidal et al. 1982). It is therefore conceivable that eye position systematically influences proprioceptive head-position information. Because a head-position signal is clearly used by the auditory system, an effect of (initial) eye position on sound localization may come about in an indirect way. The idea that neck proprioception is indeed important in spatial perception is supported by studies showing that vibration of neck muscles affects the perceived "straight-ahead" direction, as well as visual orienting behavior (e.g., Karnath et al. 1994; Roll et al. 1991).

SPATIAL ORIENTING AND CALIBRATION. In conclusion, our data suggest a possible role for head-position information in the spatial representation of auditory targets. Such a sensorimotor transformation could be of benefit when the acoustic signal is used for controlling orienting movements not only of the eye and head but also of other motor systems, such as the body and limbs. If similar mechanisms were also to apply to other sensorimotor systems, a unified spatial representation could greatly simplify navigation and orienting within multimodal environments.

Another possible role for a head-position signal within the auditory system could be related to the need for an adequate and continuous calibration of the acoustic localization cues (both the spectral cues for sound elevation and the binaural difference cues for azimuth detection). It has been shown that the visual system plays an important role in training the auditory localization system of young barn owls (Knudsen and Knudsen 1985) and in the formation of the collicular auditory space maps of neonate ferrets (King et al. 1988) and guinea pigs (Withington-Wray et al. 1990). At present, it is unknown which sensorimotor systems may be involved in calibrating the human auditory localization system. Indeed, for spectrally rich sounds and sufficiently long stimulus durations, head movements could also provide accurate spatial information about the target, which the auditory system could use to update its current internal representations. Such a mechanism may be particularly useful in the periphery (where visual spatial resolution is relatively poor), for rear acoustic stimuli, or in darkness (when vision is not possible).


    ACKNOWLEDGMENTS

We acknowledge the participation of N. Cappaert, R. Veldman, and A. Van Beuzekom in setting up and evaluating experiments 2 and 3. We also thank H. Kleijnen and T. van Dreumel for valuable technical assistance. Dr. H. Misslisch is thanked for helpful criticism on a draft version of this manuscript.

H.H.L.M. Goossens was supported by Netherlands Foundation for the Life Sciences (SLW) Project 805-01.072. A. J. Van Opstal was supported by the Human Frontiers Science Program (RG0174/1998-B) and the University of Nijmegen.

Present address of H.H.L.M. Goossens: Dept. of Physiology I, Erasmus University Rotterdam, P.O. Box 1738, NL-3000 DR Rotterdam, The Netherlands.


    FOOTNOTES

Address for reprint requests: A. J. Van Opstal, Department of Medical Physics and Biophysics, University of Nijmegen, Geert Grooteplein 21, NL-6525 EZ Nijmegen, The Netherlands.

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received 31 December 1997; accepted in final form 17 February 1999.


    REFERENCES