Howard Hughes Medical Institute and Department of Neurobiology, Stanford University School of Medicine, Stanford, California 94305
ABSTRACT
Seidemann, Eyal and William T. Newsome. Effect of spatial attention on the responses of area MT neurons. This study examines the influence of spatial attention on the responses of neurons in the middle temporal visual area (MT or V5) of extrastriate cortex. Two monkeys were trained to perform a direction-discrimination task. On each trial, two apertures of random-dot stimuli appeared simultaneously at two spatially separated locations; the monkeys were required to discriminate the direction of stimulus motion at one location while ignoring the stimulus at the other location. After extensive training, we recorded the responses of MT neurons in two configurations: 1) Both apertures placed "within" the neuron's receptive field (RF) and 2) one aperture covering the RF while the other was presented at a "remote" location. For each unit we compared the responses to identical stimulus displays when the monkey was instructed to attend to one or the other aperture. The responses of MT neurons were 8.7% stronger, on average, when the monkey attended to the spatial location that contained motion in the "preferred" direction. Attentional effects were equal, on average, in the within RF and remote configurations. The attentional modulations began ~300 ms after stimulus onset, gradually increased throughout the trial, and peaked near stimulus offset. An analysis of the neuronal responses on error trials suggests that the monkeys failed to attend to the appropriate spatial location on these trials. The relatively weak attentional effects that we observed contrast strikingly with recent results of Treue and Maunsell, who demonstrated very strong attentional modulations (median effect >80%) in MT in a task that shares many features with ours. Our results suggest that spatial attention alone is not sufficient to induce strong attentional effects in MT even when two competing motion stimuli appear within the RF of the recorded neuron. 
The difference between our results and those of Treue and Maunsell suggests that the magnitude of the attentional effects in MT may depend critically on how attention is directed to a particular stimulus and on the precise demands of the task.
INTRODUCTION
Visual attention enables a subset of the available
visual signals to be selected and processed preferentially at the
expense of the rest. What are the neural mechanisms underlying
attention? Traditionally, the selection mechanisms subserving attention
have been divided into two classes: "bottom-up" and
"top-down" processes. Bottom-up processes are thought to be
automatic and to reflect receptive field properties such as
center-surround antagonism in early visual areas (e.g., Allman
et al. 1985). Such mechanisms can explain why a particular
stimulus attracts attention, or "pops out," when surrounded by
contrasting stimuli (a green apple in a box of red apples) but fails to
do so when surrounded by similar stimuli (a green apple in a box of
yellow apples). A "saliency map" created by bottom-up mechanisms
can be overridden, however, by top-down mechanisms that select stimuli
of relevance in specific behavioral contexts. In contrast to bottom-up
processes, which are involuntary, top-down processes involve a
voluntary selection mechanism that can be flexibly directed
to different stimuli in the same visual scene depending on current
behavioral demands.
A key question raised by psychophysical studies of attention concerns
the site(s) of attentional modulations within the brain. Although it is
commonly assumed that bottom-up mechanisms act at very early processing
stages, the site of action of top-down attentional mechanisms is
strongly debated. According to "early selection" models, top-down
attention influences relatively early stages of the visual system,
allowing for more efficient use of limited capacities at all subsequent
stages (Broadbent 1958, 1982). The "late
selection" models, on the other hand, assert that top-down mechanisms filter out irrelevant information only at late processing stages, potentially after perception is achieved but before behavioral responses are selected (e.g., Allport 1980, 1987).
Importantly, these models are not mutually exclusive. Attentional
filtering may be implemented at multiple levels, and the relative
contribution of early versus late stages may vary considerably
according to the precise demands of the behavioral task.
Over the past two decades, neurophysiological studies have revealed
striking examples of the effects of top-down attentional mechanisms on
responses of neurons in multiple brain areas, including early visual
areas in some cases (Colby 1991; Connor et al. 1997; Haenny and Schiller 1988; Haenny et al. 1988; Luck et al. 1997; Maunsell et al. 1991; Moran and Desimone 1985; Motter 1993, 1994a,b; Spitzer et al. 1988; Wurtz et al. 1980). In one recent study, Treue and Maunsell (1996) found dramatic attentional effects in the middle temporal visual area (MT) and the medial superior temporal area (MST), two extrastriate areas that are specialized for processing motion information (Baker et al. 1981; Dubner and Zeki 1971; Maunsell and Van Essen 1983; Van Essen et al. 1981; Zeki 1974). The
vast majority of neurons in MT and MST are directionally selective, meaning that they respond well to motion in a "preferred"
direction but little or not at all to motion in the opposite, or
"null," direction. In Treue and Maunsell's task, a monkey was
instructed to attend to one moving target in the presence of one or two
moving distracter targets and report when the attended target changed its speed. When two targets moved in opposite directions within the
receptive field of an MT or MST neuron, the response of the neuron was
dominated by the attended target. The cell gave a strong response when
the attended target moved in the cell's preferred direction and a weak
response when the attended target moved in the null direction even
though the visual stimulus display was identical in both cases. These
findings were quite novel because earlier studies in several
laboratories, including ours, using a variety of behavioral paradigms,
failed to find substantial attentional effects in MT (Ferrera and Lisberger 1997; Ferrera et al. 1994; Newsome et al. 1988; Recanzone et al. 1993; Seidemann et al. 1998). These
discrepancies raise important questions: exactly what behavioral
conditions are necessary and sufficient to induce strong attentional
effects at the level of MT? How do these conditions relate to early and
late selection models?
One interesting possibility is that strong attentional effects occur in MT whenever attention is directed to one of two or more competing motion stimuli that appear simultaneously within the cell's receptive field, as in the study of Treue and Maunsell. To test this possibility, we recorded the responses of MT neurons while monkeys performed a spatial attention task. On each trial, two apertures of random-dot stimuli appeared simultaneously in two spatially separated locations on a computer monitor; the monkey was required to discriminate the direction of motion in one aperture while ignoring the direction of motion in the other (distracter) aperture. The attentional effects revealed in our paradigm were an order of magnitude smaller than those observed by Treue and Maunsell; responses were only 8.7% stronger, on average, when the monkey attended to the spatial location containing the preferred direction stimulus. The attentional effect was equivalent, on average, for a configuration with both apertures positioned within the RF and a configuration with one aperture within the RF and the other at a remote location. In addition, the attentional effect had a long latency and a slow time course, and the effect varied markedly between correct and error trials.
Our results indicate that spatial attention alone is not sufficient to induce strong attentional modulations in MT, suggesting more generally that the site of action of attention depends strongly on the manner in which attention is directed and on the precise demands of the task. Our data are consistent with the notion that an early selection mechanism is recruited in the experiments of Treue and Maunsell, while a late selection mechanism predominates in our experiments.
Some of these results have been presented briefly elsewhere
(Seidemann and Newsome 1997).
METHODS
Two male rhesus monkeys were used in this study. Before the
experiments, each monkey was implanted surgically with a head-holding device, a search coil for measuring eye movements, and a recording cylinder that was placed over a craniotomy above the occipital lobe.
All surgical procedures were performed under aseptic conditions using
isoflurane anesthesia. Additional details regarding the physiological
techniques employed in this study can be found in Britten et al.
(1992).
Visual stimulus
Visual stimuli were generated on a PC (486) using a Pepper SGT
graphics board (Number 9) and presented on a Nanao 17-in Flexscan monitor (model T2-17ts, 60-Hz screen refresh), placed 57 cm away from
the monkey. The random-dot stimulus employed in this study has been
described in detail previously (Britten et al. 1992). Briefly, each dot was displayed for one frame and then replotted 50 ms
later either at an appropriate spatial displacement for the desired
speed and direction or at a random location. In the intervening video
frames, other dots were plotted so that on each frame only a third of
the dots appeared. The probability that a dot would be displaced rather
than plotted at a random location corresponds to the "coherence"
of the visual stimulus, that is, the percentage of dots that move coherently.
The size of each random dot was 0.75 × 0.75°; dot intensity was
24.3 cd/m² presented on a black background (intensity 0.004 cd/m²); dot density was 16.7 dots · deg⁻² · s⁻¹.
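The replotting rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the stimulus code used in the study: the function name `replot_dots`, the square field with wrap-around at the edges, and all numerical values are assumptions made for the example. Each call handles one of the three interleaved dot sets at the moment it is replotted.

```python
import random

def replot_dots(dots, coherence, dx, dy, size, rng):
    """Replot one interleaved set of dots (hypothetical helper).

    With probability `coherence` a dot is displaced by (dx, dy);
    otherwise it is replotted at a random location. Wrapping at the
    edges of a size x size field is an assumption of this sketch.
    """
    new_dots = []
    for x, y in dots:
        if rng.random() < coherence:
            # Coherent dot: carries the motion signal.
            new_dots.append(((x + dx) % size, (y + dy) % size))
        else:
            # Noise dot: random replacement.
            new_dots.append((rng.uniform(0, size), rng.uniform(0, size)))
    return new_dots

rng = random.Random(0)
dots = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(100)]
# 50% coherence, rightward displacement (values chosen for illustration only).
moved = replot_dots(dots, coherence=0.5, dx=0.2, dy=0.0, size=10.0, rng=rng)
```

At a coherence of 1.0 every dot in the set is displaced; at 0.0 every dot is replotted at random.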
Behavioral task
The monkeys were trained to discriminate the direction of
coherent motion in a random-dot display. On each trial, two circular stimulus apertures appeared simultaneously. The monkey was required to
discriminate the direction of motion in one aperture while ignoring
stimuli in the other aperture. The two apertures could be small and
abutting (Fig. 1A) or large
and spatially remote from each other (Fig. 1B). Each trial
started with the appearance of a small fixation point on a computer
monitor. After fixation, a stationary pattern of dots (the cue)
appeared at one of the two aperture locations for 500 ms (Fig. 1,
left). After a short gap of 200 ms, two apertures of dynamic
random-dot stimuli appeared simultaneously (Fig. 1, middle).
The monkey was required to discriminate the direction of motion at the
location cued by the stationary dots and ignore the direction of motion
at the other location. After stimulus presentation, the random-dot
stimuli and the fixation point disappeared and two targets appeared,
centered around the location of the attended aperture (Fig. 1,
right). The monkey indicated the perceived direction of
motion in the attended aperture by making a saccade to the target
corresponding to that direction. The reward for a correct response was
a drop of water or juice. The directions of motion in the two
apertures, as well as the attended location, were varied randomly from
trial to trial. The monkey's eye position was measured using a scleral
search coil system (CNC Engineering). Throughout the trial the monkey
was required to maintain fixation within a small window (3 × 3°) around the fixation point. Trials in which the monkey broke
fixation prematurely were aborted without reward. Before participation in this study, both monkeys had been trained extensively on a single-aperture version of the direction-discrimination task that did
not require careful allocation of spatial attention. All procedures used in this study conformed to guidelines established by the National Institutes of Health for the care and use of laboratory animals.
Single-unit recordings
After extensive training we recorded the responses of MT neurons
while the monkey performed the task. Single-unit activity was recorded
using parylene-coated tungsten microelectrodes (Micro Probe, impedance
1-2 MΩ at 1 kHz). The electrical signals were amplified and
filtered, and action potentials from single neurons were detected with
a time resolution of 1 ms using a time-amplitude window discriminator
(Bak Electronics). For each unit, the receptive field (RF) location and
the "preferred" direction were mapped using a random-dot stimulus
that was controlled interactively through a mapping program. The
parameters of the visual display condition then were tailored to the
properties of the recorded cell. The speed of the coherent dots
approximated the optimal speed for the recorded unit and the direction
of motion was either in the unit's preferred or null directions. Two
possible configurations for the aperture locations were used. In the
within RF configuration (Fig. 1A) both apertures were
presented within the RF of the recorded neuron. We tried to position
the two apertures in symmetrical locations relative to the center of
the receptive field so that both apertures would yield similar
responses. However, we made no quantitative measurements of receptive
field location, and in many cases post hoc analysis revealed
asymmetries in the responses to the two apertures (see
RESULTS). In addition, the two apertures were positioned
roughly at the same distance from the fixation point to minimize
imbalance in the animal's attention to the two locations. We
arbitrarily designated one of the apertures as aperture 1 and the other
as aperture 2. In the remote configuration (Fig. 1B), one
aperture covered the RF (aperture 1) while the other aperture (aperture
2) was presented at a remote location, typically in the opposite
hemifield. The two configurations were run in separate blocks of
trials. Overall there were eight trial types for each configuration:
two possible directions of motion in aperture 1 times two possible
directions in aperture 2 times two attended locations. The monkey
performed the task for a block of 160 trials in each configuration (20 trials per condition). For some units, the block was repeated if the
unit was well isolated and the monkey continued to perform the task.
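The trial structure described above (2 directions in aperture 1 × 2 directions in aperture 2 × 2 attended locations, 20 trials each) can be enumerated as follows. This is a minimal sketch; the label strings and the use of a seeded shuffle to randomize trial order are assumptions for illustration, not details taken from the study.

```python
import random
from itertools import product

DIRECTIONS = ("preferred", "null")        # motion direction in an aperture
ATTENDED = ("aperture 1", "aperture 2")   # cued location

# Each condition: (direction in aperture 1, direction in aperture 2, attended aperture)
conditions = list(product(DIRECTIONS, DIRECTIONS, ATTENDED))

TRIALS_PER_CONDITION = 20
block = [c for c in conditions for _ in range(TRIALS_PER_CONDITION)]

# Randomize trial order within the block (assumed scheme).
rng = random.Random(0)
rng.shuffle(block)
```

The four visual display conditions are the four direction pairings; each is run under both attentional states, which is what allows responses to identical displays to be compared.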
Psychophysical performance
The spatial attention task was quite demanding for these
animals, and extensive training was required to bring them to adequate performance. Even after several months of training, performance could
vary substantially from one stimulus configuration to another and
depended strongly on the coherence of the visual stimulus. We typically
used 50% coherence motion stimuli because this coherence was
sufficiently demanding in most configurations (see following text). In
a few cases, when the configuration was very difficult, we used 100%
coherent motion stimuli. In trials in which the direction of coherent
motion was the same in the two apertures ("same direction" trials), the monkeys performed near 100% correct in both
configurations (within RF configuration: 97 and 98% correct for
monkeys R and S, respectively; remote
configuration: 95 and 98% correct). In trials with opposite directions
of coherent motion in the two apertures (opposite-direction trials),
the monkeys' performance fell to 84% in the remote configuration (81 and 87% for monkeys R and S, respectively) and
to 74% in the within RF configuration (73 and 75% for monkeys
R and S, respectively). In the within RF configuration,
especially when the RFs were small and the preferred speed high, this
task became exceedingly difficult and the monkeys' performance could
fall to chance. We included in our data set only experiments in which
the monkeys' performance was ≥66% correct in the opposite-direction
trials. In other words, the monkey chose the direction of the attended
dots at least twice as often as the direction of the irrelevant dots in
opposite-direction trials.
Data set
We recorded from a total of 84 MT units (37 from monkey R and 47 from monkey S). A unit was included in our analysis if the monkey's performance was above the minimal performance level (see preceding text), if we obtained at least five correct trials per condition, if the unit was direction selective (2-tailed t-test on the distribution of responses to preferred and null, P < 0.005), and if the unit was well isolated. After these exclusions, our data set included 56 units for the within RF configuration (26 in monkey R and 30 in monkey S), and 66 units for the remote configuration (31 in monkey R and 35 in monkey S); 47 of these units were tested with both configurations. MT was identified based on the high frequency of direction-selective units, its characteristic topography, and the stereotyped sequence of gray matter, white matter and sulci along the electrode tracks. We have no histological confirmation of the recording sites because both monkeys are currently being used in related experiments.
Statistical methods
To quantify the attentional effect, we compared the responses of
individual MT neurons to identical visual display conditions when the
monkey was instructed to attend to one or the other aperture. Neuronal
responses were measured as the number of spikes that the cell fired
during the 1-s presentation of the motion stimuli. For each of the four
visual display conditions, we compared the mean response in the two
attentional states using the following selectivity ratio (SR) index:
$SR = (A_1 - A_2)/(A_1 + A_2)$, where $A_1$ and $A_2$ are the mean
responses when the monkey was instructed to attend to one aperture or
the other. This equation gives the general form of the SR index; the
specific conditions ($A_1$ and $A_2$) that were
compared depended on specific aspects of the visual configuration and
will be described in RESULTS. The SR can assume values
between -1 and 1. A value of 0.33, for example, indicates that the
responses were twice as strong in attentional state 1 ($A_1$)
as in attentional state 2 ($A_2$). A value close to zero, on
the other hand, would imply that the responses of the neuron are not
modulated by spatial attention. We assessed the statistical
significance of an attentional effect by a two-tailed t-test
on the distribution of the spike counts under the two attentional
states. Only trials in which the monkey responded correctly were used
for this analysis. Error trials were analyzed separately (see
RESULTS).
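As a minimal sketch of the SR computation, the function below takes per-trial spike counts for the two attentional states and returns the index. The function name and the spike counts are hypothetical; the t-test used to assess significance is not reproduced here.

```python
from statistics import mean

def selectivity_ratio(counts_state1, counts_state2):
    """SR = (A1 - A2) / (A1 + A2), with A1 and A2 the mean spike counts
    in attentional states 1 and 2 (correct trials only)."""
    a1 = mean(counts_state1)
    a2 = mean(counts_state2)
    return (a1 - a2) / (a1 + a2)

# Made-up spike counts: state 1 is twice as strong as state 2 on average.
sr = selectivity_ratio([40, 44, 36], [20, 22, 18])
print(round(sr, 2))  # 0.33
```

With means of 40 and 20 spikes, the index is 20/60, which matches the text's example that an SR of 0.33 corresponds to a twofold difference in response.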
For each MT unit we also computed the directionality index based on the
following formula: $I = 1 - N/P$, where
$P$ and $N$ are the mean responses when either
preferred or null direction motion, respectively, was present in
both apertures. $P$ and $N$ were
calculated as the mean firing rate during stimulus presentation minus
the spontaneous firing rate. The spontaneous rate was considered to be
the mean firing rate in the 400 ms after fixation but preceding onset
of the stationary cue dots.
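The directionality index can be sketched in the same way. The function name and the firing rates below are hypothetical; only the formula and the subtraction of the spontaneous rate come from the text.

```python
def directionality_index(pref_rate, null_rate, spontaneous_rate):
    """I = 1 - N/P, where P and N are the preferred and null responses
    after subtracting the spontaneous firing rate."""
    p = pref_rate - spontaneous_rate
    n = null_rate - spontaneous_rate
    return 1.0 - n / p

# Made-up rates (spikes/s): preferred 50, null 10, spontaneous 5.
di = directionality_index(50.0, 10.0, 5.0)
```

Because the spontaneous rate is subtracted, the index equals 1 when the null response matches the spontaneous rate and exceeds 1 when null direction motion suppresses firing below it.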
Analysis of eye position
Eye position signals were collected at 250 Hz using a magnetic
search coil system (Robinson 1963). To test whether the
monkey's mean eye position (MEP) differed significantly between
attentional states, eye position traces were subjected to the following
analysis. For each visual display condition in a given experiment
(e.g., aperture 1 contains preferred stimulus, aperture 2 null
stimulus), we first computed the MEP for each trial during the
presentation of the motion stimuli. We then divided this distribution
of MEPs into two subdistributions according to the attentional state on each trial (attend to aperture 1 or attend to aperture 2), and computed
the difference in the means of these two subdistributions. These
differences were typically very small (<0.1 deg), suggesting that mean
eye position varied little between the two attentional states. To
assess the statistical significance of these differences, we used a
permutation test (Efron and Tibshirani 1993) to
determine the probability that the observed difference (or a larger
difference) in the means of the two distributions could be generated by
chance. Let m be the number of trials in attentional state 1 and n be the number of trials in attentional state 2. Each
trial is characterized by a mean horizontal and vertical eye position
$(h_i, v_i)$, $i = 1, \ldots, (m + n)$. On each
iteration of the permutation test, we reassigned at random the
individual MEPs into two new groups with m and n trials, respectively. We then computed the difference in the mean values of MEP in the two randomly assigned groups. By repeating this
procedure several hundred times (>500), we obtained a distribution of
differences in mean MEP that are expected by chance. If the difference
in mean MEP actually observed for the two attentional states was >95%
of the differences created by the permutation procedure, we concluded
that the mean eye position was significantly different between the two
attentional states.
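The permutation procedure can be sketched as follows. The text does not specify how the horizontal and vertical components were combined into a single difference; this sketch assumes the Euclidean distance between the two mean positions. The function names, the iteration count, and the seed are likewise assumptions.

```python
import math
import random

def mean_position(points):
    """Mean (horizontal, vertical) eye position over a group of trials."""
    n = len(points)
    return (sum(h for h, _ in points) / n, sum(v for _, v in points) / n)

def mep_difference(group1, group2):
    """Distance between the mean eye positions of two groups (assumed metric)."""
    (h1, v1), (h2, v2) = mean_position(group1), mean_position(group2)
    return math.hypot(h1 - h2, v1 - v2)

def permutation_test(group1, group2, n_iter=1000, seed=0):
    """Return the observed difference and the fraction of random
    reassignments that give a difference at least as large."""
    rng = random.Random(seed)
    observed = mep_difference(group1, group2)
    pooled = list(group1) + list(group2)
    m = len(group1)
    exceed = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # reassign trials to groups of size m and n
        if mep_difference(pooled[:m], pooled[m:]) >= observed:
            exceed += 1
    return observed, exceed / n_iter
```

Under this sketch, a returned fraction below 0.05 corresponds to the criterion in the text: the observed difference exceeds 95% of the differences generated by random reassignment.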
RESULTS
Effect of spatial attention on responses of MT neurons
Previous studies of attentional effects in several extrastriate
areas indicate that the strongest attentional effects occur when two
competing stimuli appear within the RF of the recorded neuron (e.g.,
Luck et al. 1997; Moran and Desimone 1985; Treue and Maunsell 1996). In these
studies, responses to identical visual display conditions were
relatively strong when the animal attended to the cell's preferred
stimulus and relatively weak when it attended to the nonpreferred, or
null, stimulus. If spatial attention modulates the responses of MT
neurons, therefore, we would expect attentional effects to be maximized
when both apertures are presented within the RF. Furthermore we would
anticipate the responses to be strongest when the monkey attends to the
aperture containing the cell's preferred direction of motion and
weakest when the monkey attends to the aperture containing null
direction motion.
Figure 2 illustrates the responses of an
MT neuron that yielded one of the largest attentional effects we
observed in the within RF configuration. The four panels depict the
responses of the unit in the four possible visual display conditions
(correct trials only). The schematic drawing at the top of
each panel indicates the visual display condition. The
bottom of each panel shows the responses as two peristimulus
time histograms (PSTHs): one for trials in which the monkey was
instructed to attend to the lower aperture (solid line), and one for trials in
which the monkey was instructed to attend to the upper aperture
(- - -). Not surprisingly, the overall response of the unit was
strongest when dots in both apertures moved in the preferred direction
(Fig. 2A), and weakest when dots in both apertures moved in
the null direction (Fig. 2D). Responses were intermediate
when dots in the two apertures moved in opposite directions (Fig. 2,
B and C), consistent with previous reports (Britten and Newsome 1990; Qian and Andersen 1994; Snowden et al. 1991). In Fig. 2,
B and C, however, the response differed between
the two attentional states. The mean firing rate during presentation of
the moving dots was 44% stronger in B when the monkey was
instructed to attend to the lower aperture, which contained dots moving
in the neuron's preferred direction (t-test,
P < 0.005). Similarly, in C, the mean
response was 38% stronger when the monkey was instructed to attend to
the upper aperture (t-test, P < 0.01). Thus
the responses of the cell to identical visual display conditions were
modulated by the spatial location to which the monkey was instructed to
attend. Consistent with our expectation, the responses were stronger
when the monkey was instructed to attend to the spatial location that
contained the cell's preferred stimulus. No significant attentional
modulations occurred when the dots in the two apertures moved in the
same direction (Fig. 2, A and D).
Figure 3 displays the responses of one MT cell in the remote configuration (Fig. 1B). In this configuration, one aperture covered the RF (top right) while the other aperture was presented at a remote location (bottom left). As expected, responses were generally poor when null direction motion was presented in the RF aperture (Fig. 3, C and D). If spatial attention influences MT neurons in the remote configuration, we would expect the responses to be stronger when the monkey was instructed to attend to the stimulus within the cell's RF. This effect is evident in both Fig. 3A and Fig. 3B: the response was 23% stronger in B and 11% stronger in A when the monkey was instructed to attend to the RF aperture (t-test, P < 0.01 in B, and P = 0.08 in A). We found no significant attentional modulations when null direction motion appeared in the RF aperture (C and D).
To quantify the magnitude of the attentional effect, we computed a
selectivity ratio (SR, see METHODS) for each unit. Figure 4A displays the distribution
of the selectivity ratios in the within RF configuration combined over
the two monkeys. Significant effects are shown in black (2-tailed
t-test on the response distributions in the two attentional
states, P < 0.05). In the within RF condition, each
unit contributed two SRs to the distribution (Fig. 4A), one for each of the two possible combinations of preferred and null direction motion in the two apertures (e.g., Fig. 2, B and
C). These two SRs were computed as $SR = (A_P - A_N)/(A_P + A_N)$,
where $A_P$ and $A_N$ correspond to the mean responses
when the monkey was instructed to attend to the aperture containing the
preferred stimulus or null stimulus, respectively. The distribution of
the SRs is shifted to the right of zero, indicating that MT neurons, on
average, responded to identical visual stimuli more strongly when the
monkey attended to the spatial location that contained the preferred
direction of motion. The magnitude of this effect is modest but highly
significant (t-test, P < 0.00005). The
average SR is 0.042, which corresponds to an 8.7% increase in firing
rate when the monkeys attended to the preferred stimulus. The mean attentional effect was somewhat stronger in monkey R
(11.9%) than in monkey S (5.9%). The mean SR is
significantly different from zero for both animals (t-test,
P < 0.0005 for monkey R, P < 0.05 for monkey S), but the two distributions of SRs do not
differ significantly from each other (2-tailed t-test,
P = 0.15). No attentional effects were found when the
direction of motion was the same in both apertures (e.g., Fig. 2,
A and D). Even though the attentional effects
illustrated in Fig. 4A are significant, they are an order of
magnitude smaller, on average, than the attentional effects in MT
recently reported by Treue and Maunsell (1996). We
consider possible explanations for this discrepancy in the DISCUSSION.
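The correspondence between a selectivity ratio and a percentage increase in firing rate follows from the definition of the index: if $SR = (A_P - A_N)/(A_P + A_N)$, then $A_P/A_N = (1 + SR)/(1 - SR)$. The sketch below applies this conversion to the mean SR values reported in the text; the function name is hypothetical, and small differences from the reported percentages reflect rounding of the mean SR.

```python
def sr_to_percent_increase(sr):
    """Percentage by which the response in state P exceeds that in state N,
    given SR = (A_P - A_N) / (A_P + A_N)."""
    ratio = (1.0 + sr) / (1.0 - sr)  # A_P / A_N
    return 100.0 * (ratio - 1.0)

within_rf = sr_to_percent_increase(0.042)  # mean SR, within RF configuration
remote = sr_to_percent_increase(0.047)     # mean SR, remote configuration
print(round(within_rf, 1), round(remote, 1))  # 8.8 9.9
```

The same conversion gives the factor of two quoted in METHODS for an SR of 0.33, since (1 + 0.33)/(1 - 0.33) is approximately 2.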
Figure 4B illustrates the distribution of the selectivity
ratios in the remote configuration. These SRs were computed for trials
in which the dots in the RF moved in the preferred direction and the
dots at the remote location moved in the null direction (e.g., Fig.
3B). For the remote configuration, the SRs were computed as
$SR = (A_{ap1} - A_{ap2})/(A_{ap1} + A_{ap2})$, where $A_{ap1}$ and $A_{ap2}$ are the mean
responses when the monkey was instructed to attend to the RF and remote
apertures, respectively. This distribution is also shifted to the right
of zero (t-test, P < 0.001), indicating
that MT neurons, on average, responded more strongly to identical
visual display conditions when the monkey was instructed to attend to a
preferred stimulus within the RF. The average SR was 0.047, which
corresponds to a 9.9% increase in firing rate (9.8% in monkey R and 10.0% in monkey S). In contrast, no significant
attentional effects were found when dots in the RF moved in the null
direction (e.g., Fig. 3, C and D).
A somewhat different picture emerged when both the RF and remote apertures contained preferred direction motion (e.g., Fig. 3A). Figure 5 depicts the SR distribution measured in this condition for each of the two monkeys. In monkey S (Fig. 5A), the distribution of attentional effects is similar to that illustrated for both monkeys in Fig. 4A. For this animal, the SR distributions did not differ significantly between these two stimulus conditions (the remote aperture containing preferred or null direction motion: paired t-test, P > 0.7). In monkey R (Fig. 5B), however, no attentional modulation occurred when both apertures contained preferred direction motion (mean SR = 0.004; t-test for difference of the mean from 0, P > 0.6). This result suggests that the two monkeys used different strategies on trials in which both apertures contained preferred direction motion. Whereas monkey S continued to restrict his spatial attention to the cued aperture, monkey R appears to have released attentional control early in these trials. This strategy would make some sense because the correct answer is unambiguous when both apertures contain the same stimulus.
Inspection of the SR distributions in Fig. 4 suggests that attentional
modulations were equivalent in the within RF and remote configurations
(Fig. 4, A vs. B). A more direct comparison,
however, can be made using only the 47 units for which we obtained data in both configurations. To make this comparison, we
calculated a single SR for the within RF configuration by taking the
mean of the two SRs that each neuron contributed to the distribution in
Fig. 4A. A paired t-test revealed no significant
difference between the SRs in the within RF and remote configurations
(P > 0.9). This result was unexpected since previous
studies of attentional effects in MT (Treue and Maunsell 1996) and other extrastriate areas (Luck et al. 1997; Moran and Desimone 1985) revealed much larger attentional effects when two competing stimuli appeared within,
or very close to, the RF of the recorded neuron.
Conceivably, our failure to find a difference between attentional modulations in the within RF and remote configurations could be attributed in part to poor placement of the apertures in the within RF configuration. If one of the two apertures overlapped the RF only partially, for example, the responses to preferred motion in this aperture might have been weak. If the attentional effects tend to be small in such cases, we may have underestimated the magnitude of the attentional effect in the within RF configuration. To examine this possibility, we first tested each unit for a significant asymmetry in the response to preferred motion in each aperture. We compared the responses for the two conditions in which the apertures contained opposite directions of motion (e.g., Fig. 2, B and C) when the animal was instructed to attend to preferred direction motion in each aperture. For the 31 units that showed significant response differences (t-test, P < 0.05), we compared the attentional effect in the condition that gave the stronger response to the effect in the condition that gave the weaker response. These SRs did not differ significantly (mean SR = 0.048 for the former condition and 0.031 for the latter; paired t-test, P > 0.7). In addition, neither SR distribution was significantly different from the SR distribution for units which showed no asymmetry in their responses (mean SR = 0.042). We therefore conclude that asymmetries in the location of the two apertures relative to the RF did not cause us to significantly underestimate attentional effects in the within RF configuration.
To summarize, we observed weak effects of spatial attention in MT. Responses were ~8% stronger, on average, when the monkey attended to the aperture containing preferred direction motion. Attentional response modulations were similar in the within RF and remote configurations.
Correlation between psychophysical performance and the attentional effect
The attentional modulations that we observed varied considerably among MT units (Fig. 4). An obvious factor that might lead to excessive variability in these effects is fluctuation in the monkey's motivational state. Poor motivation, and a concomitant failure to allocate spatial attention appropriately, could easily result in weak or nonexistent attentional effects in some experiments. Similarly we might expect relatively strong attentional effects in experiments in which the monkey was highly motivated. To test this possibility, we used the monkey's psychophysical performance (the percentage of correct responses) as an indicator of motivational state. In general, performance is likely to be higher when the monkey is motivated properly, although some fluctuations in performance are expected simply from the changing psychophysical conditions from experiment to experiment (due to differences in RF eccentricity, size, and speed tuning of the recorded units). Our analysis concentrated on the within RF configuration because performance was consistently excellent in the less difficult remote configuration.
Figure 6 displays for each experiment the relationship between the attentional modulation measured for each unit (the SR) and the monkey's performance on opposite-direction trials. Again we find a difference between the two monkeys. In monkey S (Fig. 6A), the correlation between performance and attentional modulation was significant (Spearman's rank correlation, R = 0.34, P < 0.005). In monkey R, however (Fig. 6B), the magnitude of the attentional effects was not correlated with psychophysical performance (Spearman's R = 0.17, P > 0.1). We conclude that some of the variability in the attentional effects may be attributable to variability in psychophysical performance in monkey S. We could not detect such an effect in monkey R.
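The rank correlation used for this comparison can be sketched in a few lines. The example below is a minimal illustration, assuming the standard definition of Spearman's coefficient (the Pearson correlation of the rank-transformed values, with tied values given their average rank); it is not the analysis code used in the study, the function names are hypothetical, and it does not compute a P value.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Made-up values for illustration only (percent correct, SR per experiment).
performance = [70.0, 75.0, 80.0, 85.0, 90.0]
sr_values = [0.02, 0.01, 0.05, 0.04, 0.08]
print(round(spearman_r(performance, sr_values), 3))
```

Because only ranks enter the calculation, the coefficient is insensitive to the scale of either measure, which is convenient when percent correct is compared with a bounded index such as the SR.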
At a more global level, performance was generally better in the remote configuration than in the within RF configuration (see METHODS), yet attentional modulations were equivalent, on average, in the two conditions. It is perhaps not surprising, therefore, that covariation between performance and attentional modulation was relatively weak from experiment to experiment for the within RF configuration.
Time course of the attentional effect within single trials
Time course information can yield useful insights concerning mechanisms that might underlie the attentional effects in our experiments. Because the attentional effects are modest in amplitude, an accurate assessment of their time course within single trials could only be obtained by averaging data from a subset of MT neurons that showed the largest attentional effects. We therefore analyzed only units that exhibited a relatively large and significant attentional effect (SR > 0.1; t-test, P < 0.05) in at least one of the two opposite-directions conditions (e.g., Fig. 2, B and C) of the within RF configuration. A total of 16 units met these criteria.
For each combination of unit and visual display condition, we computed separate PSTHs for the two attentional states (as in Fig. 2, B and C) and then averaged PSTHs from all the selected neurons into a grand mean pair of PSTHs for the two attentional states (attend to preferred stimulus versus attend to null stimulus). Figure 7A displays the outcome of this analysis. The average PSTHs rose rapidly on the onset of the random-dot stimulus in both attentional states. On trials in which the attended spatial location contained preferred direction motion (solid line), the average response remained high throughout the stimulus presentation interval (black line below the abscissa). For the identical visual display condition, the response declined throughout the stimulus presentation interval when the monkey attended to the spatial location containing null direction motion (dashed line).
Figure 7B shows the difference between the average responses in the two attentional states. This time course plot reveals two interesting features of the attentional effect: the latency of the effect is long (about 250-300 ms), and the difference between the responses grows gradually during the trial, peaking near the offset of the visual stimulus. The results were essentially the same when the PSTHs were first normalized to the maximal firing rate before averaging across all conditions (data not shown).
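The averaging steps described above (a PSTH per unit and attentional state, a grand mean across units, and the difference between the two grand means) can be sketched as follows. This is a minimal illustration under stated assumptions: spike times are given in seconds relative to stimulus onset, the bin width is arbitrary, the function names are hypothetical, and the spike times are made up. It is not the authors' analysis code.

```python
def psth(trials, t_start, t_stop, bin_width):
    """Trial-averaged firing rate (spikes/s) in consecutive time bins.

    `trials` is a list of trials, each a list of spike times in seconds.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    n_bins = int(round((t_stop - t_start) / bin_width))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            if t_start <= t < t_stop:
                index = min(int((t - t_start) / bin_width), n_bins - 1)
                counts[index] += 1
    return [c / (len(trials) * bin_width) for c in counts]


def grand_mean(psths):
    """Average several equal-length PSTHs bin by bin."""
    return [sum(column) / len(psths) for column in zip(*psths)]


def difference(a, b):
    """Bin-by-bin difference between two PSTHs (a minus b)."""
    return [x - y for x, y in zip(a, b)]


# Made-up spike times for two hypothetical units, two trials per state.
attend_pref = [
    psth([[0.05, 0.15, 0.18], [0.04, 0.12]], 0.0, 0.2, 0.1),
    psth([[0.03, 0.11], [0.06, 0.14, 0.19]], 0.0, 0.2, 0.1),
]
attend_null = [
    psth([[0.05, 0.15], [0.04]], 0.0, 0.2, 0.1),
    psth([[0.03], [0.06, 0.14]], 0.0, 0.2, 0.1),
]
print(difference(grand_mean(attend_pref), grand_mean(attend_null)))
```

Averaging unnormalized PSTHs weights each unit by its firing rate; normalizing each PSTH to its maximal rate before averaging, as the text notes was also done, gives every unit equal weight.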
The slow time course of the attentional modulations in Fig. 7B suggests that we may have underestimated the attentional effect in MT by measuring over the entire 1-s stimulus presentation period (Figs. 4 and 5). To assess this possibility, we recomputed the SRs in the within RF configuration (for all units) in four nonoverlapping 250-ms time intervals. The mean SR for each interval is depicted in Fig. 7C. No attentional effects were present in the first 250 ms of the trial; the attentional effect increased gradually during the trial and reached its maximal value in the last 250-ms interval. The average SR in the last interval, however, was still only 0.068 (~15% difference in response between the two attentional states).
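The correspondence between an SR value and a percentage difference in response can be checked with a short calculation. The sketch below assumes that the SR is a contrast index of the form (a - b)/(a + b), where a and b are the responses in the two attentional states; the definition is given in METHODS and is not restated in this section, so the formula is an assumption, although it reproduces the correspondence quoted above (SR = 0.068, about 15%). The function name is hypothetical.

```python
def sr_to_percent_increase(sr: float) -> float:
    """Percentage by which response a exceeds response b, given
    sr = (a - b) / (a + b).

    The contrast-index form of sr is an assumption (see the lead-in).
    """
    if not -1.0 < sr < 1.0:
        raise ValueError("sr must lie strictly between -1 and 1")
    # Solving sr = (a - b) / (a + b) for the ratio gives a / b = (1 + sr) / (1 - sr).
    ratio = (1.0 + sr) / (1.0 - sr)
    return 100.0 * (ratio - 1.0)


print(f"{sr_to_percent_increase(0.068):.1f}%")
```

For small values the percentage difference is roughly 200 times the SR, so the index and the percentage convey the same information on different scales.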
Attentional effects on error trials
To perform the spatial attention task correctly, the monkey must attend to motion signals in the cued aperture while ignoring motion signals in the distracter aperture. On a minority of the trials, however, the monkey erred in his choice of motion direction. Errors could occur for at least three reasons: the monkey erroneously attended to the distracter aperture, the monkey attended to the correct aperture for most of the trial but made an error due to a lapse in attention at the last moment, or the monkey failed to attend to either aperture and simply guessed.
The responses of MT neurons on error trials may provide insight concerning which of these three possibilities is at work in our experiments. Under the first scenario, the monkey attends to the wrong aperture on error trials. If this is the case, we would expect attentional modulations to be reversed in order from those in the correct trials: responses should be larger when the cued aperture contains null direction motion because the monkey is actually attending on these trials to the distracter aperture that contains preferred direction motion. Similarly, responses should be smaller when the cued aperture contains preferred direction motion because the monkey is actually attending to null direction motion in the distracter aperture. In the second scenario, the attentional modulations should be similar in correct and error trials: the monkey attends to the proper aperture but is distracted and chooses the wrong target at the last moment. In the third scenario, attentional effects should simply disappear because the monkey fails to attend selectively to either aperture.
To distinguish between these possibilities, we averaged PSTHs for the two attentional states across the population of MT neurons that exhibited large and significant attentional effects (as described in the preceding section). This procedure was necessary because the attentional effects are weak and error trials are relatively infrequent. For this analysis only, we also included experiments in which the monkey's performance was poor (<66% correct on opposite-direction trials; see METHODS). We included only units for which we obtained at least three error trials in each attentional state. Because of the small number of errors made by the monkey, our data set for this analysis includes only 12 units, two of which qualified independently in both of the opposite-direction displays.
Figure 8 shows the results of this analysis. Interestingly, the order of the responses in the error trials is reversed in sign from that in the correct trials: responses were somewhat higher on trials in which the monkey was instructed to attend to the aperture containing null direction motion (dashed gray) (1-tailed paired t-test, P < 0.05). This result suggests that the monkey attended to the wrong aperture, at least on some error trials. In addition, the average firing rates for both types of errors (dashed lines) fell between the firing rates for correct trials, suggesting that the monkey may simply have failed to allocate spatial attention on other error trials. The latter comparison, however, is somewhat problematic because the selection criteria for inclusion in this analysis were based on strong attentional effects, assessed in correct trials only. In other words, we deliberately selected neurons from the end of the distribution in Fig. 4A for the analysis in Fig. 8. Even if the attentional effects are actually identical on correct and error trials, therefore, simple statistical considerations dictate that the measured attentional effects will be somewhat smaller in the error trials for the subset of neurons that yielded the highest attentional effects in the correct trials (i.e., "regression to the mean") (Galton 1889). This problem, however, cannot explain the reversal in the attentional effect between correct and error trials.
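The selection argument can be illustrated with a small simulation. The sketch below assumes that each unit has a fixed underlying effect and that two independent measurements of it carry independent Gaussian noise; units are selected on the first measurement and then evaluated on the second. All parameter values and names are arbitrary choices for illustration, not estimates from the data.

```python
import random


def simulate_selection(n_units=500, mean_effect=0.04, true_sd=0.05,
                       noise_sd=0.05, top_fraction=0.1, seed=1):
    """Select units by a noisy first measurement, then remeasure them.

    Returns the mean first and mean second measurement of the selected
    units. All parameters are arbitrary illustrative values.
    """
    rng = random.Random(seed)
    true_effect = [rng.gauss(mean_effect, true_sd) for _ in range(n_units)]
    first = [t + rng.gauss(0.0, noise_sd) for t in true_effect]
    second = [t + rng.gauss(0.0, noise_sd) for t in true_effect]
    n_selected = max(1, int(n_units * top_fraction))
    # Keep the units with the largest effect in the first measurement.
    selected = sorted(range(n_units), key=lambda i: first[i],
                      reverse=True)[:n_selected]
    mean_first = sum(first[i] for i in selected) / n_selected
    mean_second = sum(second[i] for i in selected) / n_selected
    return mean_first, mean_second


mean_first, mean_second = simulate_selection()
print(f"selected units, first measurement:  {mean_first:.3f}")
print(f"selected units, second measurement: {mean_second:.3f}")
```

In a simulation of this kind the second measurement of the selected units shrinks toward the population mean but keeps its sign, in line with the point that regression to the mean can reduce the measured effect on error trials but cannot reverse it.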
Analysis of eye position
Small, systematic differences in eye position between the two attentional states could contribute artifactually to the attentional effects that we measured. If, for example, the monkey tends to fixate toward opposite corners of the fixation window in the two attentional states, systematic differences in the position of the stimulus on the RF could generate response differences that might be misinterpreted as an attentional effect. To test this possibility, we analyzed eye position traces for each experiment to detect any significant difference in the mean eye position (MEP) between the two attentional states (see METHODS).
We focused our analysis on the trials in which the directions of coherent motion in the two apertures were opposite because these stimulus displays yielded the largest attentional effects. This analysis was restricted to monkey R because eye position signals were not saved for monkey S. In the within RF configuration, the average distance between the MEPs in the two attentional states was 0.052° (median = 0.046°). Of the 26 neurons tested (2 display conditions per unit), the permutation test (see METHODS) revealed a significant difference in mean eye position (P < 0.05) in only 7/52 cases (13.5%). In the remote configuration, the average distance between the two MEPs in the two attentional states was 0.063° (median = 0.048°). Of the 31 units tested (1 display condition per unit), only 4/31 (12.9%) exhibited a significant difference in mean eye position (permutation test, P < 0.05).
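A permutation test of the kind referred to here can be sketched as follows. The exact procedure is described in METHODS and is not restated in this section, so the details below are assumptions: each trial contributes one two-dimensional eye position, the test statistic is the Euclidean distance between the mean positions of the two attentional states, and the null distribution is built by randomly reassigning trials to the two states. Function names are hypothetical.

```python
import math
import random


def mean_point(points):
    """Mean (x, y) of a list of 2-D eye positions."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def mep_distance(a, b):
    """Euclidean distance between the mean eye positions of two trial sets."""
    (ax, ay), (bx, by) = mean_point(a), mean_point(b)
    return math.hypot(ax - bx, ay - by)


def permutation_test(a, b, n_perm=2000, seed=0):
    """Return the observed distance and a permutation P value.

    The P value estimates how often a random relabeling of trials gives
    a distance at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mep_distance(a, b)
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mep_distance(pooled[:len(a)], pooled[len(a):]) >= observed:
            count += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (count + 1) / (n_perm + 1)


# Made-up eye positions (degrees) for illustration only.
state_1 = [(0.02 * i, 0.01 * i) for i in range(10)]
state_2 = [(0.02 * i + 0.01, 0.01 * i) for i in range(10)]
distance, p_value = permutation_test(state_1, state_2)
print(f"distance = {distance:.3f} deg, P = {p_value:.3f}")
```

A test of this form makes no distributional assumption about eye position, which suits fixation data that are bounded by the fixation window.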
Even though the distances between the MEPs in the two attentional
states were very small, these differences possibly might be correlated
with the magnitude of the attentional effects. To test this
possibility, we computed the correlation between the attentional
modulation (the SR) and the difference in MEPs in the two attentional
states. These correlations were not significant in either configuration
(within RF configuration, Spearman's R = 0.19, P = 0.09; remote configuration, Spearman's
R = 0.21, P = 0.12).
To summarize, the differences in mean eye position between the two attentional states were very small. Furthermore, the distances between the mean eye position in the two attentional states were not correlated with the observed attentional modulations. Together, these results indicate that differences in mean eye position are unlikely to account for the attentional modulations that we observed.
Does the monkey use spatial attention to perform the task?
One aspect of our procedures raises the possibility that our monkeys might perform the direction discrimination task without using spatial attention at all. At the end of each trial, two saccade targets appeared at symmetrical positions about the cued aperture (Fig. 1A), providing the monkey with post hoc information concerning the relevant aperture. It is conceivable, therefore, that the monkeys remembered the direction of motion in both apertures and decided which direction to choose only at the end of the trial based on the location of the saccade targets. This strategy, while unlikely, would preclude the need for selective spatial attention during the stimulus presentation interval. We therefore performed two additional behavioral experiments in one monkey (S) to determine whether he actually used this strategy.
In the first behavioral manipulation, we eliminated the cue (the stationary dots) on some of the trials while keeping the saccade targets centered around the attended aperture as usual. If the cue is critical to the monkey's performance, we would expect his performance on no-cue trials to fall to chance. If, on the other hand, the monkey uses the alternative strategy of remembering both directions of motion until the saccade targets appear, this manipulation should have little or no effect on performance. To test this possibility, we randomly interleaved trials with no cue and trials in which the cue appeared at the beginning of the trial as usual. The monkey's performance on the opposite-direction trials was 78% correct in the cued trials (544/698). In contrast, performance on no-cue trials was only 53% correct (367/700) and was not significantly different from chance (Binomial test, P > 0.8).
In the second behavioral manipulation, we modified the task so that the target axis, while still parallel to the axis of coherent motion, was positioned midway between the two stimulus apertures. Under this condition, the location of the saccade targets provided no cue concerning the identity of the relevant aperture. The monkey could perform this task correctly only by using the cue at the beginning of the trial (the stationary dots). If the monkey had indeed used target location as a cue in previous experiments, we would expect his performance on the modified trials to be significantly worse than on trials of the original type. We tested this possibility by measuring performance with the two trial types randomly interleaved in single blocks of trials. Even though the monkey had never experienced the "between" condition before, we found no difference in the monkey's performance on the two types of trials. For opposite-direction conditions, the monkey made 76% correct decisions in the modified trials (872/1146) and 74% correct decisions in the original trials (847/1149).
Thus both experiments indicate unequivocally that the monkey used the cue at the beginning of the trial, and thus spatial attention, to perform the task.
DISCUSSION
The primary goal of the present study was to measure the effect of spatial attention on the responses of MT neurons. We found systematic differences between the responses of MT neurons to identical visual display conditions in the two attentional states, suggesting that spatial attention indeed modulates the responses of MT neurons. On average, responses were 8.7% stronger when the monkey was instructed to attend to the aperture containing preferred direction motion.
It seems unlikely that our data have been compromised by artifacts. First, an analysis of eye position traces stored during the experiments ruled out the possibility that small, systematic differences in eye position could account for our results. Second, additional psychophysical experiments provided strong evidence that the monkeys actually used spatial attention to perform the task rather than adopting a more convoluted strategy. A third factor that might influence our results artifactually is a simple sensory interaction between the stationary dots presented at the beginning of the trial (the attentional cue) and the stimulus dots presented subsequently at the attended location. This interpretation seems unlikely for two reasons. First, a 200-ms gap separated the offset of the stationary dots from the onset of the motion stimulus. At the time of onset of the motion stimulus, responses to the stationary dots (the cue) had generally decayed to levels close to baseline (Fig. 7A). In addition, the time course of the attentional effect was opposite to that expected if it was induced artifactually by the cue stimulus. Fig. 7B shows that the attentional effect was weakest at the beginning of the trial, increasing gradually during the trial and peaking near stimulus offset. We conclude that the response modulations observed in MT actually reflect the influence of spatial attention mechanisms.
Primary findings and their implications
One of the interesting results of this study is that attentional modulations were similar in magnitude in the within RF and remote configurations. Previous studies in extrastriate cortex generally have found that attentional effects are much larger when both the attended stimulus and the distracter(s) are presented simultaneously within the cell's RF (Luck et al. 1997; Moran and Desimone 1985; Treue and Maunsell 1996). These results, among others, led Desimone and colleagues to suggest the "biased competition" model for visual attention (Desimone and Duncan 1995). According to this model, local interactions within the cortex, such as the center-surround interactions observed in many visual areas, create competition for saliency between nearby visual objects (e.g., Koch and Ullman 1985). Top-down attention works by biasing this local competitive interaction in favor of the more relevant stimulus. According to this model, therefore, the attentional effects should decrease dramatically as the distance between the two objects becomes greater than the distance over which local cortical interactions occur. We found no evidence, however, for a decrease in the attentional effect, even when the two motion stimuli appeared in opposite hemifields (Fig. 4). Thus the mechanism that mediates spatial attention in our experiments is not likely to be based on local competitive interactions.
A second finding of the present study is that attentional modulations in our paradigm develop slowly: they begin ~250-300 ms after stimulus onset (Fig. 7) and increase gradually throughout the trial, peaking near the time of stimulus offset. The observed dynamics are therefore most compatible with slow, top-down attentional mechanisms that are likely to be mediated by the extensive feedback connections to MT from higher areas. Alternatively, this relatively slow time course could arise if the initial response transient in MT is less vulnerable to attentional modulation. The time course of attentional effects reported in previous studies of extrastriate cortex varies considerably from one study to another, even within the same brain area, suggesting that the time course can be influenced by the specific demands of the attentional task (e.g., Luck et al. 1997; Motter 1994a,b).
Third, we find an interesting difference between attentional modulations in correct trials and error trials (Fig. 8). For error trials, the attentional modulations were weaker and were reversed in sign: responses were actually stronger on trials in which the monkey was instructed to attend to the null stimulus. The reversed order of the attentional modulation in the error trials suggests that the monkey actually attended to the wrong aperture on some of these trials.
The most unexpected result of the present study is that the
attentional effects we observed (8.7% in the within RF configuration) are an order of magnitude smaller than the attentional effects measured
by Treue and Maunsell (1996) (median effect >80%),
even though both studies required the monkeys to attend to one motion stimulus in the RF while ignoring a second stimulus present
simultaneously within the RF. We suspect that this difference between
the two studies provides important clues about the neural mechanisms
underlying visual attention. Before discussing these, however, we
consider two potential explanations that would be considerably less interesting.
It is conceivable that attentional effects in our study were weak because of the particular visual stimuli we employed. We used 50% coherent motion stimuli, which may provide less of a directional signal than the moving spots used by Treue and Maunsell (recall that half of the dots in our display comprise random-motion noise). If attentional modulations were stronger for stimuli that generate better direction selectivity, the weaker effects in our study might logically follow.
To examine this possibility, we first compared the directionality of neurons in the current study to the directionality of MT neurons from previous studies that employed 100% coherent random-dot patterns. The distributions of direction indices (DI; see METHODS) for our two monkeys are shown in Fig. 9, A and B. The mean index in monkey R (DI = 0.97) is very similar to the mean directionality index published previously with 100% coherent motion [mean DI = 1.05 (Albright 1984); mean DI ~ 1.0 (Britten et al. 1992)], as well as with a single moving spot [mean DI = 0.98 (Albright 1984)]. In monkey R, then, it seems unlikely that differences in the selectivity of MT neurons to the motion stimuli could account for the differences between our results and those of Treue and Maunsell. In monkey S, however, the mean direction index is 0.69, which is significantly lower than in monkey R (t-test, P < 0.05). The mean attentional effect indeed appears somewhat weaker in monkey S (5.9%) than in monkey R (11.9%), but this difference was not statistically significant (2-tailed t-test, P = 0.15). To explore further the possibility that the size of the attentional effect depends on the stimulus selectivity of the unit, we computed the correlation coefficient between the two measures. Figure 9, C and D, displays the relationship between the directional index and the attentional effect (SR) for each unit. The correlation between the two measures is weak but significant in monkey S (Spearman's R = 0.36, P < 0.05) and is not significant in monkey R (Spearman's R = 0.16, P = 0.21). Even in monkey S, however, the mean SR for the most directionally selective units is still <0.1, fourfold smaller than the average attentional effect measured by Treue and Maunsell. We conclude that differences in stimulus selectivity fail to account for the large differences in attentional effects in the two studies, although they may account for a small portion of the difference in monkey S at least.
The differences between our results and those of Treue and Maunsell also would be less interesting if they resulted simply because our behavioral task was not as demanding in terms of attentional effort. A direct comparison of the two tasks with respect to attentional demands is not possible because we have no independent measurement of the attentional load in the two tasks. Nevertheless, several considerations suggest that a difference in overall attentional effort is not the main source of the contrasting results. First, there is no reason a priori to believe that our task is not sufficiently demanding because similar spatial attention tasks have yielded large attentional effects in other cortical areas (Luck et al. 1997; Moran and Desimone 1985). Second, our spatial attention task was quite difficult for these two monkeys. Even after months of training, performance fell dramatically as the distance between the two apertures became smaller and as the coherence of the motion stimulus was lowered. In some cases, when the two apertures were small and abutting and the speed of coherent motion was high, the task became so difficult that the monkeys could not perform above chance. Third, even though the performance of both monkeys was significantly better in the remote configuration, indicating that this configuration was easier, the attentional modulations in the two configurations were equivalent.
Thus we conclude that the differences between our results and those of Treue and Maunsell are not likely to be due to differences in either the amount of attentional demand or in the selectivity of MT neurons for the stimuli used in the two studies. Rather we believe that attention is acting at substantially different sites in the two paradigms. In our paradigm, attention appears to exert its primary effects downstream from MT, consistent with "late selection" models of visual attention. In the paradigm of Treue and Maunsell, however, attention exerts pronounced effects at, or before, the level of MT. A key question remains: what difference(s) between the two paradigms could be responsible for such a dramatic difference in the effects of attention in MT?
Potential sources for the contrasting results
Although the two paradigms are similar in requiring the monkeys to attend to one of two motion signals in the RF, the tasks differ in at least four important ways. First, in Treue and Maunsell's study, the animal was required to detect and quickly report a change in speed of the attended target that occurred at an unpredictable time during the trial. This task structure places a premium on continuous attention throughout the trial. In our task, on the other hand, the monkey was required only to determine the direction of motion in the attended aperture, raising the possibility that attention could lapse during some portion of the trial with little or no negative consequence for the animal. The short stimulus presentation that we used (1 s) should minimize this possibility, but this procedural difference may still account for some of the difference in results.
Second, attentional effects were measured in different time epochs in the two studies. Our measurements were computed from spike counts obtained during the entire 1-s stimulus presentation; Treue and Maunsell measured attentional effects during a later epoch in the trial, from 1 to 2 s after onset of the moving targets. Because the attentional effects in our experiments became larger with time (Fig. 7B), it is possible that the difference in measurement epoch could account for a portion of the contrasting results. Even during the last 250 ms of the trial, however, the attentional effects that we measured were fivefold smaller, on average, than the attentional effects obtained by Treue and Maunsell (Fig. 7C). In general, time-course data will be of considerable importance in evaluating the functional meaning of attentional effects. If strong attentional effects only arise late in the trial for a particular task, it becomes necessary to ask whether or not the attentional effect reflects a filtering operation that is necessary for performance of the task. An alternative interpretation is that the effect represents feedback influences from higher levels after a decision has been made.
A third possibility is that Treue and Maunsell's task recruited an additional attentional mechanism because it required the monkey to use attention to track a moving object in the presence of other distracter object(s). Our task, in contrast, is a relatively pure spatial attention task; the monkey simply had to attend to a single, constant location in space throughout the trial. Mental or attentional tracking may impose more stringent requirements that recruit strong attentional mechanisms at the level of MT. The ability to track objects with attention may be useful for planning saccades or reaching movements to a behaviorally relevant object moving among distracter objects.
Finally, the two tasks differ in the way in which attention is directed to, and maintained on, the relevant visual object. In our task the only feature important for directing and maintaining attention was the spatial location of the cued aperture. The direction of random-dot motion was the stimulus attribute to be discriminated and therefore of little or no use in guiding attention. In Treue and Maunsell's task, on the other hand, the direction of motion of each target was unambiguous, and indeed helped define the stimulus to be attended. In this case, therefore, motion direction (in addition to spatial location) could play a major role in directing attention to the relevant object. If motion direction helps to define the relevant stimulus, attentional filtering could be accomplished in part by a top-down mechanism that enhances the responses of MT neurons whose preferred direction corresponds to the direction of the attended target while suppressing the responses of those having other preferred directions. In this manner, the number of motion signals present in MT at a given moment in time could be reduced, perhaps facilitating detection of the critical speed change.
If this indeed proves to be the critical difference between the two tasks, it follows that attentional mechanisms can modulate the responses of MT neurons more effectively with reference to a combination of direction and space (Treue and Maunsell's paradigm) than to space alone (our paradigm). In other words, feature-based attentional mechanisms, using a particular direction of motion as the relevant feature, may contribute importantly to the attentional modulations observed by Treue and Maunsell, as described in other visual areas by Motter (1994a,b). This intriguing possibility raises questions concerning the interaction between bottom-up mechanisms and top-down mechanisms. As the target in Treue and Maunsell's task reverses its direction, bottom-up mechanisms must inform the top-down mechanisms that the relevant feature, namely the direction of motion of the attended target, has changed, and different populations of MT cells must then be modulated. If the top-down mechanisms suppress responses to unattended directions, what source provides signals to the top-down mechanisms that the direction of the attended target has changed?
Clearly, additional experiments are needed to address these issues and identify with certainty the aspects of a behavioral task that recruit the strongest attentional modulations in MT. Whatever the results of such experiments may be, the current contrasting results suggest that attentional mechanisms can act at multiple levels within the hierarchy of visual areas. "Early" selection may be optimal under some circumstances, but an unbiased representation in the early visual areas might be preferable under other circumstances. In the latter case, attentional mechanisms must operate at later processing stages downstream from MT. In exploiting the advantages of early and late selection mechanisms, therefore, the brain may get the best of both worlds, switching from one strategy to the other depending on subtle aspects of the task.
ACKNOWLEDGMENTS
We thank C. L. Barberini, G. D. Horwitz, and M. J. Nichols for helpful comments on the manuscript and J. Stein and C. J. Doane for excellent technical assistance.
This work was supported by National Eye Institute Grant EY-05603. W. T. Newsome is an Investigator at the Howard Hughes Medical Institute.
Present address of E. Seidemann: Dept. of Neurobiology, The Weizmann Institute of Science, Rehovot 76100, Israel.
FOOTNOTES
Address for reprint requests: W. T. Newsome, Dept. of Neurobiology, Stanford University School of Medicine, Sherman Fairchild Building, Room D209, Stanford, CA 94305.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received 31 March 1998; accepted in final form 16 November 1998.
REFERENCES