Task-Related Modulation of Visual Cortex

Alexander C. Huk and David J. Heeger

Department of Psychology, Stanford University, Stanford, California 94305-2130


    ABSTRACT

Huk, Alexander C. and David J. Heeger. Task-Related Modulation of Visual Cortex. J. Neurophysiol. 83: 3525-3536, 2000. We performed a series of experiments to quantify the effects of task performance on cortical activity in early visual areas. Functional magnetic resonance imaging (fMRI) was used to measure cortical activity in several cortical visual areas including primary visual cortex (V1) and the MT complex (MT+) as subjects performed a variety of threshold-level visual psychophysical tasks. Performing speed, direction, and contrast discrimination tasks produced strong modulations of cortical activity. For example, one experiment tested for selective modulations of MT+ activity as subjects alternated between performing contrast and speed discrimination tasks. MT+ responses modulated in phase with the periods of time during which subjects performed the speed discrimination task; that is, MT+ activity was higher during speed discrimination than during contrast discrimination. Task-related modulations were consistent across repeated measurements in each subject; however, significant individual differences were observed between subjects. Together, the results suggest 1) that specific changes in the cognitive/behavioral state of a subject can exert selective and reliable modulations of cortical activity in early visual cortex, even in V1; 2) that there are significant individual differences in these modulations; and 3) that visual areas and pathways that are highly sensitive to small changes in a given stimulus feature (such as contrast or speed) are selectively modulated during discrimination judgments on that feature. Increasing the gain of the relevant neuronal signals in this way may improve their signal-to-noise to help optimize task performance.


    INTRODUCTION

Recent neuroimaging experiments have demonstrated that cortical activity in early visual areas depends not only on which stimulus is being presented, but also on which task the subject is performing (sometimes referred to as effects of featural attention). Instructing subjects to perform a discrimination on a specific feature of a stimulus can selectively increase the activity in the secondary (extrastriate) visual areas that are believed to process information relevant to the task. For example, the human MT/MST complex (MT+, also known as V5) responds more strongly to moving than stationary stimuli and is also more active when subjects are instructed to make judgments about stimulus speed than when they are instructed to make judgments about other features such as color or shape (Corbetta et al. 1990, 1991) (see DISCUSSION for further references). The specificity of these effects implies that they do not simply reflect changes in the general arousal level of the observer. Rather, task specificity suggests a close relationship between specific tasks and specific brain areas or pathways. We emphasize, however, that the cortical mechanisms underlying task-specific, featural attention may be different from those underlying spatial attention.

Even primary visual cortex (V1) can be affected by task demands. V1 activity can be modulated by instructing subjects to alternately "attend to" and "passively view" a stimulus (Watanabe et al. 1998a). While these and other results confirm that early visual areas are affected by the behavioral state of the observer, the varieties of task instructions, stimuli, and experimental designs fall short of providing a clear and systematic understanding of how and when task performance can modulate cortical activity in early visual areas.

To clarify and quantify the relationships between task performance and activity in visual cortex, we performed a series of functional magnetic resonance imaging (fMRI) experiments with the following goals: 1) to replicate task-related modulations of V1 with task instructions that exert a clear control over the subject's behavioral state (i.e., using a 2-interval forced-choice threshold-level discrimination task, as opposed to instructing subjects simply to attend); 2) to test whether task performance reduces the variability of responses by controlling the subject's attentional state; 3) to evaluate the consistency of task-related modulations within and between subjects; 4) to replicate the motion specificity of task-related modulations in MT+; and 5) to test for specificity of task-related modulations in V1 and other early visual areas.


    METHODS

Stimuli

Stimuli were either random dot fields or radial gratings that moved with an average speed of either 8 or 0°/s. To minimize eye movements, subjects fixated a small, high contrast fixation mark (1° square, centered in the display) that was displayed continuously throughout each scan. In addition, the gratings and dots moved radially inward/outward (toward or away) from the fixation mark, avoiding a single, powerful optokinetic stimulus.

DOTS. The dot stimulus was a field of 75 white dots on a black background placed randomly inside a circular aperture that subtended 14° of visual angle. Each dot was shaped as a two-dimensional (2D) gaussian (with standard deviation = 0.15° of visual angle). The dots moved radially inward or outward from the fixation mark.

GRATINGS. Radial sinusoidal gratings (concentric rings) subtending the central 14° of visual angle moved inward or outward from the fixation mark. Spatial frequency was randomly varied (0.4-0.6 cycles/deg). Temporal frequency was selected, according to the randomized spatial frequency, to produce the desired speed.
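For a drifting grating, speed equals temporal frequency divided by spatial frequency, so the temporal frequency needed for a given speed follows directly from the randomized spatial frequency. The sketch below illustrates that relationship only; the function names are ours and are not taken from the stimulus software used in the study.

```python
import random

def temporal_frequency_hz(speed_deg_per_s, spatial_freq_cpd):
    """Drift rate (Hz) that moves a grating of the given spatial
    frequency (cycles/deg) at the given speed (deg/s)."""
    return speed_deg_per_s * spatial_freq_cpd

def random_grating(speed_deg_per_s, rng, sf_range=(0.4, 0.6)):
    """Draw a spatial frequency from the range given in the text and
    return it with the matching temporal frequency (hypothetical helper)."""
    sf = rng.uniform(*sf_range)
    return sf, temporal_frequency_hz(speed_deg_per_s, sf)

# Example: an 8 deg/s grating at 0.5 cycles/deg drifts at 4 Hz.
sf, tf = random_grating(8.0, random.Random(0))
```

Whatever spatial frequency is drawn, the ratio of temporal to spatial frequency stays at the requested speed.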

Tasks

Subjects performed one of four tasks (speed discrimination, direction discrimination, contrast discrimination, or passive fixation) during each block of each fMRI scan. All four tasks were performed in a two-interval forced-choice protocol. Each trial consisted of two stimulus presentations, each with a random duration between 400 and 600 ms. The second stimulus interval began 700 ms after the beginning of the first. Subjects registered their decision with a button press and then received feedback (correct/incorrect) immediately following each button press. Each trial began 2,000 ms after the start of the previous trial.

The exact values of the stimulus velocity and/or contrast for each task were determined individually for each subject to yield approximately 80% correct performance based on the results of pilot psychophysical experiments. These psychophysical measurements used standard adaptive-staircase methods of threshold estimation and were performed using identical stimuli and display equipment.
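The text does not state which staircase rule was used. As one common possibility, a 3-down/1-up rule converges near 79% correct, which is close to the 80% target. The sketch below shows a single update step of such a rule; the function name, step size, and floor are assumptions made for illustration.

```python
def staircase_step(level, correct, run, step, n_down=3, floor=0.0):
    """One update of an n-down/1-up staircase (illustrative only).

    level   -- current stimulus increment (e.g., speed or contrast difference)
    correct -- whether the subject's last response was correct
    run     -- number of consecutive correct responses so far
    Returns the new (level, run) pair.
    """
    if correct:
        run += 1
        if run == n_down:
            # n_down correct responses in a row: make the task harder.
            return max(floor, level - step), 0
        return level, run
    # Any error: make the task easier and restart the count.
    return level + step, 0
```

Averaging the levels at which the staircase reverses direction would then give a threshold estimate under this rule.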

Subjects were cued to switch between tasks by a change of the color of the fixation point. Subjects practiced extensively before being scanned so that task switching did not pose a significant attentional demand.

SPEED DISCRIMINATION. Subjects performed a speed discrimination task on the ~8°/s stimuli. Two intervals of a moving stimulus were presented. A slight speed increment was added to one interval and subtracted from the other to create a difference in speed while maintaining the average speed within a trial equal to 8°/s. Subjects indicated the interval that appeared to move faster. To minimize the effects of adaptation (motion aftereffect), the direction of motion alternated each trial, with the stimulus moving inward or outward in both intervals of each trial.

DIRECTION DISCRIMINATION. Subjects performed a direction discrimination task on the ~0°/s stimuli. Two stimulus intervals were presented. In one interval the stimulus moved very slowly outward, and in the other it moved very slowly inward; average velocity within a trial was therefore 0°/s. The stimulus motion was barely detectable, with speeds only slightly above motion detection thresholds (approximately 0.01°/s, determined separately for each subject). Subjects indicated the interval that appeared to move outward.

CONTRAST DISCRIMINATION. Subjects viewed two intervals of the gratings stimulus. In one interval the stimulus contrast was slightly higher than the baseline contrast of 20%, and in the other interval the contrast was slightly lower than the baseline. Subjects indicated the interval that appeared to have higher contrast.

PASSIVE FIXATION. Subjects passively viewed the stimuli; that is, they maintained fixation and pressed alternating buttons without actually making a judgment about the trial. The slight changes in contrast, speed, or direction of motion that were present when subjects were performing a task were removed from the passive viewing trials. This was done to prevent subjects from covertly performing the task. It is unlikely that removing the threshold-level velocity changes affected our fMRI measurements because 1) the average velocity and contrast per trial was the same during task-performance and passive-viewing trials; 2) the velocity and contrast changes were very small compared with the other stimulus attributes that were randomly varied from trial to trial; and 3) pilot experiments in which the increments were present showed the same pattern of results as scans with the increments removed.

MRI data acquisition

Each subject participated in several MR scanning sessions: one to obtain a standard, high-resolution, anatomical scan, one to functionally define the retinotopic visual areas, and several sessions (14 for subject ACH, 8 for subject BTB, 7 for subject AGP, and 2 for subject RMK) to measure fMRI responses in the various experimental conditions. Each subject repeated each experimental condition between 4 and 26 times in separate fMRI scans.

MR imaging was performed on a standard clinical GE 1.5 T Signa scanner with a custom designed dual surface coil. The experiments were undertaken with the written consent of each subject, and in compliance with the safety guidelines for MR research.

The stimuli were displayed on a flat-panel display (NEC MultiSync LCD 2000) positioned just beyond the end of the patient bed. The display was viewed through binoculars (Optolyth-Optik Alpin 8 × 30) specially modified with all the steel parts removed and replaced with beryllium-copper or brass. A pair of mirrors, angled at approximately 45°, was attached to the binoculars just beyond the two objective lenses to enable the subjects to see the LCD display.

A bite bar stabilized the subjects' heads. The time series of fMR images from each scan were visually inspected for head movements. A post hoc motion correction (Bergen et al. 1992; Black and Anandan 1996) was applied only to the scans in which head movements were apparent. Motion correction was applied to the scans from two scanning sessions, both from the same subject (AGP). The head motions were small (1-2 mm) and brief in duration, perhaps caused by the subject swallowing while on the bite bar.

Each fMRI scanning session began by acquiring a set of low resolution, sagittal, anatomical images used for slice selection. Eight adjacent planes were selected with the most ventral slice positioned along the boundary between the occipital lobe and the cerebellum. Approximately the same slices were chosen in each scanning session. A set of structural images was then acquired using a T1-weighted spin echo pulse sequence (500 ms repetition time, minimum echo time, 90° flip angle) in the same slices and at the same resolution as the functional images. These inplane anatomical images were registered to the high-resolution anatomical scan of each subject's brain so that all MR images (across multiple scanning sessions) from a given subject were coregistered (Nestares and Heeger 2000). Then a series of fMRI scans was performed using a spiral T2*-sensitive gradient recalled echo pulse sequence (1,500 ms repetition time, 40 ms echo time, 90° flip angle, 2 interleaves, effective inplane resolution = 1.94 × 1.94 mm, slice thickness = 4 mm) (Glover and Lai 1998; Noll et al. 1995). Spiral fMRI pulse sequences compare favorably with echo-planar imaging in terms of spatial resolution and sensitivity (Sawyer-Glover and Glover 1998).

fMRI data analysis

Each fMRI scan lasted 252 s. Data from the first 36-s cycle were discarded 1) to minimize effects of magnetic saturation, 2) to minimize effects of visual adaptation, and 3) to allow time for subjects to practice the tasks. During the remaining 6 cycles of each scan, 72 functional images (1 every 3 s) were recorded for each slice. For a given fMRI voxel (corresponding to a 2 × 2 × 4 mm brain volume), the image intensity changed over time and comprised a time series of data.
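The timing figures in this paragraph are mutually consistent, as a quick check shows. The sketch below only restates the arithmetic from the text; the names are ours.

```python
# Scan timing as stated in the text; names are illustrative only.
SCAN_DURATION_S = 252    # length of one fMRI scan
CYCLE_S = 36             # one task/stimulus cycle
VOLUME_INTERVAL_S = 3    # one functional image per slice every 3 s

def analysed_volumes(scan_s=SCAN_DURATION_S, cycle_s=CYCLE_S,
                     volume_s=VOLUME_INTERVAL_S, discarded_cycles=1):
    """Return (cycles kept, images kept per slice) after discarding
    the initial cycle(s)."""
    total_cycles = scan_s // cycle_s
    kept_cycles = total_cycles - discarded_cycles
    return kept_cycles, kept_cycles * cycle_s // volume_s

kept_cycles, kept_images = analysed_volumes()   # 6 cycles, 72 images
```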

The data were analyzed separately in each of several identifiable visual areas. We computed the fMRI response amplitudes and phases by 1) removing the linear trend in the time series, 2) dividing each voxel's time series by its mean intensity, 3) averaging the resulting time series over the set of voxels corresponding to the stimulus representation within a visual area (e.g., V1 or MT+), and then 4) calculating the amplitude and phase of the best fitting 36-s period sinusoid. The first step (removing the linear trend) compensates for the fact that the fMRI signal tends to drift very slowly over time (Smith et al. 1999). The second step converts the data from arbitrary image intensity units to units of percent signal modulation; this is especially important because the image intensity varies substantially with distance from the surface coil. Finally, we computed the vector average and standard deviation of these amplitudes and phases across measurements that were repeated in separate scans.
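The four analysis steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names are ours, the linear trend is removed while keeping the mean (so that the subsequent division by the mean is well defined), and the phase is expressed relative to a sine wave so that zero phase means the signal is higher in the first half of each cycle.

```python
import math

def detrend_keep_mean(series):
    """Remove the least-squares linear trend but leave the mean in place."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = (sum((i - t_mean) * (y - y_mean) for i, y in enumerate(series))
             / sum((i - t_mean) ** 2 for i in range(n)))
    return [y - slope * (i - t_mean) for i, y in enumerate(series)]

def percent_modulation(series):
    """Convert raw image intensity to percent change about the mean."""
    detrended = detrend_keep_mean(series)
    mean = sum(detrended) / len(detrended)
    return [100.0 * (y / mean - 1.0) for y in detrended]

def roi_amplitude_phase(voxels, period_s=36.0, sample_s=3.0):
    """voxels: one time series per voxel in the region of interest.
    Returns (amplitude in percent, phase in radians) of the best-fitting
    sinusoid, assuming the series spans a whole number of cycles."""
    normalised = [percent_modulation(v) for v in voxels]
    n = len(normalised[0])
    roi = [sum(v[i] for v in normalised) / len(normalised) for i in range(n)]
    w = 2 * math.pi * sample_s / period_s
    c = 2 / n * sum(y * math.cos(w * i) for i, y in enumerate(roi))
    s = 2 / n * sum(y * math.sin(w * i) for i, y in enumerate(roi))
    # Fitted model: amplitude * sin(w * i - phase)
    return math.hypot(c, s), math.atan2(-c, s)
```

Because detrending and the sinusoid fit are applied in sequence rather than jointly, the recovered amplitude can differ from the true value by a percent or two; a joint regression would avoid this at the cost of a slightly longer example.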

Some data we report in this paper (shown in Fig. 5) are vector averages of the response amplitudes and phases. However, in other experiments we were not interested in the phase component of the response, which represents the temporal delay of the signal relative to the beginning of the stimulus/task cycle. Because the temporal phase in most of the experiments only provided information about the hemodynamic delay, we converted the bivariate amplitude and phase measures to a univariate measure of signal amplitude. To do this, we estimated the hemodynamic delay for each subject and visual area by computing the mean phase for a reference scan that was repeated at each scanning session (see Reference scans in METHODS). We then projected the bivariate amplitude/phase responses to this common reference vector to create a univariate "projected amplitude."
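Both operations described here, the vector average and the projection onto a reference phase, reduce to a few lines of trigonometry. The sketch below is illustrative; the function names are ours, and phases are assumed to be in radians.

```python
import math

def vector_mean(amplitudes, phases):
    """Average responses as vectors in the amplitude/phase plane.
    Returns the (amplitude, phase) of the mean vector."""
    n = len(amplitudes)
    x = sum(a * math.cos(p) for a, p in zip(amplitudes, phases)) / n
    y = sum(a * math.sin(p) for a, p in zip(amplitudes, phases)) / n
    return math.hypot(x, y), math.atan2(y, x)

def projected_amplitude(amplitude, phase, reference_phase):
    """Component of a response along the reference (hemodynamic delay)
    direction. Positive values are in phase with the reference scan;
    negative values are in antiphase."""
    return amplitude * math.cos(phase - reference_phase)
```

A response with the same phase as the reference keeps its full amplitude, whereas a response half a cycle away is assigned a negative projected amplitude of the same magnitude.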

Localizing the retinotopically organized visual areas

Following well-established methods (Engel et al. 1994, 1997; Schneider et al. 1993; Sereno et al. 1995; Wandell 1999), the polar angle component of the retinotopic map was measured by recording fMRI responses as a stimulus rotated slowly (like the second hand of a clock) in the visual field. To visualize these retinotopy measurements, a high-resolution MRI of each subject's brain was computationally flattened (Wandell et al. 2000). Area V1 within each hemisphere was identified as a large region of cortex in/near the calcarine sulcus with a retinotopic map spanning half the visual field. Areas V2, V3, V3A, and V4v were likewise identified by their distinct retinotopic maps.

Localizing MT+

Following previous studies (Tootell et al. 1995a; Watson et al. 1993; Zeki et al. 1991), area MT+ was identified based on fMRI responses to stimuli that alternated in time between moving and stationary dot patterns. The dots (small white dots on a black background) moved (10°/s) radially inward and outward for 18 s, alternating direction once every second. Then the dot pattern was stationary for the next 18 s. This moving/stationary cycle was repeated seven times. We computed the cross-correlation between each fMRI voxel's time series and a sinusoid with the same (36 s) temporal period. We then drew MT+ regions by hand around contiguous areas of strong activation, lateral and anterior to the retinotopically organized visual areas.

Reference scans

The procedures to define the visual areas were performed only once per subject. Because the fMRI data recorded during successive scanning sessions in a given subject were all aligned to a common three-dimensional coordinate grid (see above), we could localize the areas across scanning sessions.

The visual areas were further restricted based on responses to a reference stimulus. The reference scan responses were used to exclude unresponsive voxels, e.g., brain regions that would have responded to visual field locations outside the 14 × 14° stimulus aperture, and voxels that had too little overlap with gray matter. The reference stimulus was the same moving versus stationary dot pattern used to localize area MT+. A reference scan was run during each scanning session, usually as the first fMRI scan of the session. Voxels that were unresponsive in the reference scans were discarded in the analysis of all subsequent scans in that scanning session. Responsive voxels were defined as those that were strongly correlated (r > 0.3 and 0-9 s time lag) with a 36-s period sinusoid.
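The voxel-selection criterion can be sketched as below. The text gives only the thresholds (r > 0.3, 0-9 s lag), so the way the correlation and lag are computed here is an assumption: the correlation is taken with the best-fitting 36-s sinusoid, and the lag is that sinusoid's delay relative to the start of the moving-dot period. Function names are ours.

```python
import math

def sinusoid_correlation(series, period_s=36.0, sample_s=3.0):
    """Correlation between a time series and its best-fitting sinusoid,
    plus the sinusoid's lag in seconds. Assumes the series spans a
    whole number of cycles."""
    n = len(series)
    mean = sum(series) / n
    y = [v - mean for v in series]
    w = 2 * math.pi * sample_s / period_s
    c = sum(v * math.cos(w * i) for i, v in enumerate(y))
    s = sum(v * math.sin(w * i) for i, v in enumerate(y))
    total = sum(v * v for v in y)
    if total == 0:
        return 0.0, 0.0          # flat series: no correlation defined
    r = math.sqrt(min(1.0, (2 / n) * (c * c + s * s) / total))
    lag_s = (math.atan2(-c, s) % (2 * math.pi)) / (2 * math.pi) * period_s
    return r, lag_s

def is_responsive(series, r_min=0.3, max_lag_s=9.0):
    """Apply the thresholds stated in the text."""
    r, lag_s = sinusoid_correlation(series)
    return r > r_min and lag_s <= max_lag_s
```

Restricting the lag to 0-9 s keeps voxels whose signal rises shortly after the moving stimulus appears, which is what a plausible hemodynamic delay would produce.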

Statistics

Two-tailed t-tests were used to determine the statistical significance of the measured modulations in cortical activity. First, we computed the fMRI response amplitude and phase for each repeat of each experiment (see fMRI data analysis, above). Second, for each subject and for each visual area, the responses to the aforementioned reference scans were averaged across scanning sessions. Third, we computed the component of the fMRI responses with zero phase lag relative to the responses from the reference scans. Fourth, we computed the mean and standard error of the resulting response amplitude components. Finally, we tested the null hypothesis that the mean response amplitude component was zero, i.e., that there was no modulation of cortical activity. Details of this procedure have been described previously (Heeger et al. 1999).
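The test procedure can be illustrated with a short sketch. It assumes a one-sample t-test on the projected components, which is our reading of the steps above; the function name is ours, and the P value itself is not computed here because it requires the Student t distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, stdev

def modulation_t_statistic(amplitudes, phases, reference_phase):
    """One-sample t statistic for the null hypothesis of zero modulation.

    amplitudes, phases -- one fitted response per repeated scan
    reference_phase    -- mean phase of the reference scans (radians)
    Returns (mean, standard error, t, degrees of freedom).
    """
    # Component of each response with zero phase lag relative to the reference.
    components = [a * math.cos(p - reference_phase)
                  for a, p in zip(amplitudes, phases)]
    m = mean(components)
    se = stdev(components) / math.sqrt(len(components))
    return m, se, m / se, len(components) - 1
```

The returned t value and degrees of freedom would then be compared with a two-tailed critical value from a t table or a statistics package.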

Some statistics reported in this paper (error bars shown in Fig. 2) were computed using a parametric bootstrapping procedure (Efron and Tibshirani 1993). This procedure works by randomly resampling (with replacement) from the Gaussian distributions defined by the sample mean and variance of each condition, and then computing the mean of the resampled data. These two steps were repeated 5,000 times for each condition. Finally, confidence intervals (corresponding to the standard error) were computed from these 5,000 replications.
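A parametric bootstrap of this kind can be sketched as follows. The number of replications is taken from the text; the function names, the fixed seed, and the percentile method used to read off the interval are assumptions made for illustration.

```python
import random
from statistics import mean, stdev

def parametric_bootstrap_means(values, n_boot=5000, seed=0):
    """Draw n_boot synthetic data sets from a Gaussian with the sample
    mean and standard deviation of `values`, and return their means."""
    rng = random.Random(seed)
    mu, sigma, n = mean(values), stdev(values), len(values)
    return [mean([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(n_boot)]

def central_interval(samples, coverage=0.68):
    """Percentile interval containing the central `coverage` fraction
    of the bootstrap replications."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    lo = ordered[int((1 - coverage) / 2 * last)]
    hi = ordered[int((1 + coverage) / 2 * last)]
    return lo, hi
```

For a ratio such as an index formed from two conditions, the same resampling would be applied to each condition separately and the ratio computed on each replication before taking the interval.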

Eye tracking

To determine whether patterns of eye movements might account for some of the fMRI signal modulations we observed, we performed a series of eye-movement experiments outside the scanner using an IR eye-tracking system (Ober 2, Timra, Sweden) that sampled horizontal and vertical eye positions at 100 Hz. In these experiments performed in a psychophysical laboratory, we recorded eye movements as subjects performed some of the tasks that they had in the fMRI experiments. Although the eye-movement measurements were performed in separate sessions outside of the scanner, the setups were as similar as possible; the two LCD displays were calibrated for the same luminance, contrast, and display style. The stimulus parameters and difficulty level were also the same as in the scanner. Although it would have been ideal to record eye movements and acquire functional data simultaneously, that was not possible with the equipment we had available. Behavioral performance (% correct) was recorded during both the eye-tracking and fMRI experiments so that performance in both experiments could be compared.

To analyze the eye-tracking data, we 1) removed eye blinks, which have obvious and stereotyped effects on the eye-movement trace that cannot be confused with saccades; 2) high-pass filtered the separate horizontal and vertical eye-position traces to remove the slow drift evident in the recordings; and 3) inspected the vertical and horizontal eye position traces separately for any evidence of unsteady fixation or systematic eye movements.


    RESULTS
TOP
ABSTRACT
INTRODUCTION
METHODS
RESULTS
DISCUSSION
REFERENCES

Performing a task modulates cortical activity

To test for task-related modulations in visual cortex, we conducted a series of fMRI scans in which subjects alternated every 18 s between performing a task and passive viewing. The baseline stimulus speed did not change for the length of each scan. During speed discrimination scans, the stimulus moved at an average speed of 8°/s, and subjects alternated between performing the speed discrimination task on this stimulus and passive viewing. Similarly, during direction discrimination scans, the stimulus moved with an average speed of 0°/s, and subjects alternated between performing the direction discrimination task and passive viewing.

Speed and direction discrimination scans were conducted using both the moving dots and gratings stimuli (see METHODS). Following previous psychophysical studies of speed discrimination (McKee et al. 1986), irrelevant stimulus parameters were pseudo-randomized to force subjects to base their responses only on stimulus speed. In particular, stimulus duration was randomized (400-600 ms) so subjects could not base their judgments of speed on the displacement of dots over time or by counting the temporal cycles of the grating stimulus. For the grating stimulus, contrast was randomized (16-24% contrast) so subjects could not use the alternative cue of apparent contrast, and spatial frequency was randomized (0.4-0.6 cycle/deg) so subjects could not base their responses only on perceived temporal frequency.

All three subjects show significant task-related modulations (Fig. 1). Because the stimulus speeds were essentially the same throughout each scan, the modulations in cortical activity are linked to the subject alternating between task performance and passive viewing. In the figure, positive responses indicate that the fMRI signal was greater during task performance than passive viewing, while negative responses indicate that the fMRI signal was greater during passive viewing. It is important to note that the patterns of results in scans using dots and gratings are qualitatively quite similar. This similarity suggests that effects of performing a task are separable from the stimulus on which the task is performed. At the same time, however, there are notable individual differences between subjects. For example, only two of the three subjects demonstrate a trend toward reduced V1 activity during direction discrimination, as shown by the negative responses in Fig. 1 (this tendency is significant only for ACH during dot-stimulus scans, P < 0.05).



Fig. 1. Task-related modulations in V1 and MT+. Top row: results from scans using dot stimulus. Bottom row: results using gratings stimulus. Each panel corresponds to 1 of 3 subjects. Dark and light bars correspond to speed discrimination and direction discrimination scans, respectively. Height of each bar indicates the functional magnetic resonance imaging (fMRI) response elicited by alternating between task performance and passive viewing. Error bars represent ±1 SE of the mean (n = 3-9 repeats per condition). * P < 0.05; ** P < 0.005 (2-tailed t-test, testing the null hypothesis that fMRI signal modulation is equal to 0).

Other retinotopic visual areas also showed significant task-related modulations. Responses in dorsal visual areas (V2d, V3d, V3A) were typically quite similar to those in MT+, and responses in ventral visual areas (V2v, V3v, V4v) were similar to those in V1.

The amplitudes of these task-related modulations need to be evaluated with respect to a baseline response level. The fMRI responses measured in this experiment correspond to the difference in fMRI signal during task performance versus passive viewing. Small task-related modulations might correspond to relatively large effects in a brain area that has an inherently low level of response to the stimulus itself. The baseline response to the stimulus could vary across stimulus speeds, visual areas, and/or individuals. To better evaluate the relative sizes of task-related modulations across individuals and visual areas, we quantified the size of these effects by computing a "task-dependence index." We created this index by comparing the responses elicited by task-versus-passive scans with the responses elicited during a separate series of baseline scans.

During the baseline scans, the stimulus alternated between 18-s periods of the dot stimulus (baseline speeds were 8 or 0°/s, measured in separate scans) and 18 s of a uniform black field. The stimulus was presented in the same 2-interval forced choice manner as in the main experiment, and subjects performed a speed or direction discrimination task when the stimulus was present.

The task-dependence index was computed as the ratio between the mean response from the main experiment and the mean response from the baseline experiment. This division effectively normalizes responses by the baseline response levels for each subject, brain area, and stimulus speed.

In V1 and MT+, the task index ranged from 13 to 27% in ACH and from 5 to 29% in BTB (Fig. 2). A task index of 100% would mean that passively viewing the stimulus produced the same reduction in signal as turning the stimulus off, so the index values of up to 29% demonstrate that the size of task-related modulations can be close to one-third that of stimulus-driven activity. The relatively large size of these modulations affirms that alternating between performing a task and passive viewing has a substantial influence on activity in V1 and MT+.
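The index itself is a simple ratio, sketched below. The function name is ours, and the response values in the example are hypothetical numbers chosen only to reproduce an index near the upper end of the reported range; they are not measurements from the study.

```python
def task_dependence_index(task_vs_passive, stimulus_vs_blank):
    """Task-related modulation expressed as a percentage of the
    stimulus-driven (baseline scan) response.

    task_vs_passive   -- mean response from task/passive scans
    stimulus_vs_blank -- mean response from stimulus on/off baseline scans
    Both must be in the same units (e.g., percent signal modulation).
    """
    return 100.0 * task_vs_passive / stimulus_vs_blank

# Hypothetical values: a 0.29% task modulation against a 1.0% baseline
# response gives an index of about 29%.
index = task_dependence_index(0.29, 1.0)
```

An index of 100% corresponds to a task-related modulation as large as the response to switching the stimulus on and off.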



Fig. 2. Task-dependence index (same format as Fig. 1). Each panel corresponds to 1 of 2 subjects. Index values were calculated for V1 and MT+ by dividing the mean fMRI response from each task/passive condition using the dot stimulus (top row of Fig. 1) by the mean response measured during baseline scans (dot stimulus on/off). Error bars represent a 68% confidence interval (estimated using bootstrap procedure, see Statistics in METHODS).

Performing a task does not affect response variance

Instructing subjects to perform a threshold-level psychophysical judgment exerts control over their attention and arousal levels. Passive viewing is comparatively uncontrolled, potentially allowing subjects to attend to different aspects of the stimulus or exhibit varying levels of vigilance. Therefore instructing subjects to perform a threshold-level discrimination task might lead to less variable fMRI responses over a series of repeated measurements, independent of changes to the mean levels of fMRI responses.

To test the effect of task performance on response variance, we conducted a series of scans using new combinations of the stimuli and tasks described previously. During all-task scans, the stimulus speed alternated every 9 trials (18 s), and subjects performed the appropriate task throughout the scan (alternating speed and direction discrimination as the stimulus periodically alternated speed from 8 to 0°/s). During all-passive scans, subjects passively viewed the stimuli throughout the scan. The all-task and all-passive scans presented the same stimuli and differed only in whether the subject performed a task or passively viewed for the length of the scan. We suspected that the variance of the responses might be smaller for repeats of the all-task scans than for repeats of the all-passive scans.

We found, however, that task performance does not decrease the variance of repeated measurements (error bars in Fig. 3). The standard deviation of the all-task scans is less than the standard deviation of the all-passive scans (i.e., falls below the lower bound of a 95% confidence interval) only for subject AGP in MT+ (P < 0.05). Conversely, for subject ACH, the all-task standard deviation observed in V1 is actually significantly higher than that of the all-passive scans (P < 0.05), i.e., the opposite of our expectation. No other comparison is significant (P > 0.2 for all). While task performance can increase the amplitude of fMRI responses, it does not reliably decrease the variance of repeated measurements.



Fig. 3. Responses to alternating stimulus during task performance and passive viewing (same format as Fig. 1). Each panel corresponds to 1 of 3 subjects. Error bars represent ±1 SE of the mean (n = 3-13 repeats per condition).

Because these scans involved a stimulus that alternated between moving at 8 and 0°/s, V1 and MT+ should be expected to modulate with the changing stimulus. However, the amplitude of stimulus-driven modulation is sometimes higher during all-task scans than during all-passive scans (Fig. 3). The pattern of results varies across the three subjects, with ACH and AGP showing all-task responses greater than all-passive responses in V1 (significant for ACH, P < 0.005), while BTB's pattern of results suggests a more substantial difference in MT+. In the next section, we describe how each individual's pattern of results shown in Fig. 3 can be predicted from the results shown in Fig. 1.

Consistency/additivity of task-related modulations

The task/passive experiments described above (Fig. 1) consisted of combinations of two speeds (8 and 0°/s) and two viewing conditions (performing a task or passive fixation). The all-task and all-passive experiments (Fig. 3) consisted of combinations of these same four conditions. Because all of these scans are combinations of the same four basic conditions, it should be possible to predict the responses of each subject in one set of experiments from their results in the other. Specifically, the difference of responses from the speed and direction discrimination (task/passive) scans should be equal to the difference of responses from the all-task and all-passive scans.1 However, this prediction will be consistent across these experiments only if the level of cortical activity measured during each of the four primary conditions is independent of the condition it has been paired against in a given scan.
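The prediction can be made explicit with a little algebra. The notation below is ours and is introduced only for illustration: R with subscripts T or P (task or passive) and 8 or 0 (stimulus speed in deg/s) denotes the level of cortical activity in each of the four conditions, and each scan is assumed to measure the difference between the two conditions it alternates.

```latex
\begin{align*}
S   &= R_{T,8} - R_{P,8} && \text{(speed discrimination vs. passive viewing, 8 deg/s)}\\
D   &= R_{T,0} - R_{P,0} && \text{(direction discrimination vs. passive viewing, 0 deg/s)}\\
A_T &= R_{T,8} - R_{T,0} && \text{(all-task scans)}\\
A_P &= R_{P,8} - R_{P,0} && \text{(all-passive scans)}\\[4pt]
S - D &= (R_{T,8} - R_{P,8}) - (R_{T,0} - R_{P,0})\\
      &= (R_{T,8} - R_{T,0}) - (R_{P,8} - R_{P,0}) = A_T - A_P
\end{align*}
```

The equality holds only if each of the four activity levels is the same regardless of which condition it alternates with in a given scan, which is the independence assumption stated in the text.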

These comparisons (speed minus direction and all-task minus all-passive) match as predicted (Fig. 4). The bars in each pair in the figure, across the three subjects and two visual areas, are about equal (P > 0.2 for all; we cannot reject the null hypothesis that the all-task minus all-passive responses equal the speed minus direction responses). In other words, measurements of cortical activity during each of the four conditions are independent of the conditions against which they are paired. More importantly, the consistency demonstrated by this analysis confirms that there are reliable individual differences in task-related modulations of cortical activity.



Fig. 4. Additivity of task-related modulation. Comparison of speed-discrimination/direction-discrimination difference and all-task/all-passive difference. The difference between speed and direction responses (from Fig. 1) is never significantly different from the difference between all-task and all-passive responses (from Fig. 3) (P > 0.2 for all). Error bars represent ±1 SE of the mean.

Task specificity: contrast discrimination versus speed discrimination

To test for task specificity in visual cortex, subjects were instructed to alternate between performing contrast and speed discrimination tasks on the same stimulus. Throughout each scan, the grating stimulus (moving with an average speed of 8°/s) was presented in a series of two-interval forced choice trials. Subjects began each scan by performing nine trials of contrast discrimination, switched to speed discrimination for the next nine trials, and continued to alternate between blocks of the contrast and speed tasks for the duration of the scan. Subjects were cued to alternate tasks by a change in the color of the center of the fixation point. Both a contrast and a speed change occurred within each pair of intervals in every trial, regardless of the task that the subject was currently performing. The positions (1st interval/2nd interval) of the speed and contrast increments were assigned randomly and independently so that subjects had to base their discrimination on the relevant feature (contrast or speed) to perform above chance. The sizes of the contrast and speed increments were chosen to yield 80% correct performance for both tasks, as determined by extensive psychophysical piloting (Table 2). As in the preceding experiments, spatial frequency and duration were randomized to force subjects to base their speed discriminations on speed per se (and not temporal frequency or displacement over time).


                              
Table 1. Psychophysical thresholds and performance for speed and direction discrimination


                              
Table 2. Psychophysical thresholds and performance for contrast/speed experiment

All four subjects demonstrate task-specific modulations (Fig. 5). In the polar plots, fMRI response amplitudes are represented as the radial distances from the origin, and fMRI response (temporal) phases are represented by the angles counterclockwise from the horizontal axis. A temporal phase near zero means that the fMRI signal increased when the subject performed the contrast discrimination task (during the 1st 18 s of each cycle) and decreased when the subject performed the speed discrimination task (during the 2nd 18 s of each cycle). fMRI responses that show the opposite pattern (decreasing during contrast discrimination and increasing during speed discrimination) have a temporal phase near 180°.
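One way to obtain an amplitude and a temporal phase of this kind is to fit a sinusoid at the task-alternation frequency (one cycle per 36 s) to a region's time series. The sketch below is not the authors' analysis code; the function name, the repetition time of 1.5 s, and the synthetic data are assumptions made for illustration. It adopts the convention described above, in which a phase of 0° means the signal is higher during the first half-cycle (contrast task) and 180° means it is higher during the second half-cycle (speed task).

```python
import numpy as np

def amplitude_and_phase(timeseries, tr_s, cycle_s=36.0):
    """Fit A*sin(w*t - phase) at the task-alternation frequency.

    Assumed convention: phase 0 deg means the signal is high during the
    first half-cycle; 180 deg means it is high during the second half.
    """
    y = np.asarray(timeseries, dtype=float)
    y = y - y.mean()                     # remove the baseline
    t = np.arange(y.size) * tr_s         # acquisition times in seconds
    w = 2.0 * np.pi / cycle_s            # angular frequency of one cycle
    # Least-squares fit of a*sin(w*t) + b*cos(w*t).
    design = np.column_stack([np.sin(w * t), np.cos(w * t)])
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    amplitude = float(np.hypot(a, b))
    # A*sin(w*t - phase) expands to a = A*cos(phase), b = -A*sin(phase).
    phase_deg = float(np.degrees(np.arctan2(-b, a)) % 360.0)
    return amplitude, phase_deg

# Synthetic example (hypothetical values): six 36-s cycles sampled every
# 1.5 s, with a response that peaks in the second (speed-task) half-cycle
# and lags by 20 deg, as a hemodynamic delay would produce.
tr_s = 1.5
t = np.arange(144) * tr_s
demo = 100.0 + 0.5 * np.sin(2.0 * np.pi * t / 36.0 - np.radians(200.0))
demo_amp, demo_phase = amplitude_and_phase(demo, tr_s)
```

In a polar plot, `demo_amp` would be the radial distance and `demo_phase` the angle counterclockwise from the horizontal axis; a delay in the response moves the point further counterclockwise.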



Fig. 5. Task-specific modulation of MT+, V1, and V3A. Polar plots of fMRI responses while subjects alternated between performing a contrast discrimination task and a speed discrimination task. Each panel corresponds to the responses in a visual area for all subjects (ACH: circles; AGP: squares; BTB: diamonds; RMK: triangles). Response amplitude is indicated by radial distance from the origin and response temporal phase is indicated by the angle counterclockwise from the horizontal axis. Note that MT+ responses are consistently near 180° for all 4 subjects (demonstrating that MT+ responses were higher during speed discrimination). Meanwhile, responses in V1 are more variable across subjects (note that in subject ACH, responses were reliably higher during contrast discrimination). Responses in V3A are also variable, and no subjects demonstrate stronger responses during speed discrimination. The slight counterclockwise rotation of the data points away from the horizontal axis is due to the hemodynamic delay. Error circles represent a 68% confidence interval on the bivariate response amplitudes and phases (n = 26 for ACH, n = 16 for RMK, n = 15 for BTB, n = 16 for AGP).

All four subjects show higher MT+ responses during speed discrimination than during contrast discrimination (ACH, P < 0.005; AGP, P < 0.04; BTB, P < 0.02; RMK, P < 0.001; 2-tailed t-test; Fig. 5, left). However, responses in V1 varied considerably across subjects (Fig. 5, middle). For subject ACH, V1 responses were reliably higher for the contrast task (P < 0.001; 2-tailed t-test); we are confident that this is a reliable effect because it is based on 26 repeated measurements collected in 4 separate scanning sessions. Conversely, for subject RMK, V1 responses tended to be higher during speed discrimination (P = 0.07). In subjects AGP and BTB, V1 responses were approximately equal during contrast and speed discrimination (P > 0.6 for both).

Although all subjects demonstrated higher MT+ responses during the speed task, no other dorsal visual areas (V2d, V3d, V3A) in any of the subjects showed a significant increase during speed discrimination. The distinct difference between task-specific modulations in MT+ and V3A is especially notable (see Fig. 5, right), given that recent fMRI experiments have reported that V3A demonstrates strong and specific responses to stimulus motion (Smith et al. 1998; Tootell et al. 1997). Ventral visual areas (V2v, V3v, V4v) usually showed a pattern of responses similar to those in V1 (with the exception of responses in subject RMK's V2v and V3v, which tended to increase during the contrast task); the pattern of individual differences observed in V1 was thus repeated throughout the ventral visual areas.

Eye movements

Differences in eye movements between conditions (and/or between subjects) could potentially account for some of our results. A tendency to make more or fewer fixational or pursuit eye movements during one stimulus/task condition than another might be sufficient to modulate the fMRI signal, particularly in V1, where even small eye movements can be significant relative to the size of the receptive fields. Differences in eye-movement patterns between subjects could also contribute to the individual differences observed.

To test for differences among eye-movement patterns, we measured the eye movements of subjects ACH, BTB, and RMK while they performed a subset of the tasks, in the same blocked design as in the fMRI experiments, outside the scanner. Eye movements were recorded for the task/passive experiments at baseline speeds of 8 and 0°/s and for the contrast/speed experiment. Inspection of the eye-movement data reveals that subjects were able to maintain steady fixation throughout both the task/passive and contrast/speed experiments. Representative eye-movement traces are shown in Fig. 6. Across subjects and conditions, eye positions were within the central ±0.5° approximately 95% of the time; no systematic changes in the variance of eye position were apparent.



Fig. 6. Eye position does not covary with task performance. A: time course of horizontal eye position (in deg of visual angle) during 5 cycles of task/passive experiment (direction discrimination, subject ACH). B: time course of horizontal eye position during contrast/speed experiment, subject BTB. Eye position is steady, and slight displacements do not vary systematically with task alternations (gray and white fields in each plot). Intervals including blinks have been removed and replaced with blank space; an example is indicated by the arrow in A. Eye position traces for the vertical axis, and traces of other subjects, were similar.

This is consistent with the fact that all subjects were experienced psychophysical observers who had, before scanning, practiced maintaining steady fixation while performing the tasks. Because the fMRI and eye-tracking experiments were both performed at psychophysical threshold, if subjects had moved their eyes (differently) in the scanner, we would expect performance to be different. However, during the eye-tracking sessions subjects performed the tasks at levels similar to their in-scanner performance (±5% correct).

Spatial attention does not account for task-specific modulations

Another possible explanation of the task-specific modulations we observed is that subjects, instead of moving their eyes, covertly shifted their spatial attention. For example, subjects may have attended to the central portion of the stimulus while performing the contrast task, and then switched their attention to the peripheral aspect of the stimulus while performing the speed task. Cortical activity in several visual areas, including V1, can be modulated by spatial attention (see Heeger 1999).

To test for the effect of spatial attention, we re-analyzed the data from the contrast-versus-speed scans in separate subregions of each subject's V1. Specifically, we defined several central and peripheral subregions of each subject's V1 and compared the modulations in these central and peripheral divisions. However, we never observed opposite patterns of modulation between central and peripheral V1, confirming that simple shifts of spatial attention cannot account for the modulations observed in our contrast-versus-speed experiment.


    DISCUSSION

The experiments reported in this paper have attempted to clarify some of the effects of task performance on cortical activity in early visual areas. The results allow for several conclusions. First, task-related modulations in cortical activity are evident in each of several visual areas, including primary visual cortex. Second, task performance affects the mean level of response but not the variability of repeated measurements; although instructing subjects to perform a threshold-level discrimination task might have exerted tighter control over the subjects' attention and arousal, repeated measurements during task performance do not exhibit lower variance than repeated measurements under passive viewing conditions. Third, the effects of task performance are consistent within each subject. Fourth, there are notable individual differences in task-related modulations of cortical activity. Fifth, MT+, as well as the retinotopic visual areas as early as V1, can exhibit task-specificity; activity in each of these brain areas can be preferentially increased by performing some, but not all, tasks.

Individual differences

Consistent individual differences in cortical activity were observed repeatedly across multiple stimulus conditions (dots and gratings) and across different combinations of tasks and stimuli (the additivity test discussed above). The individual differences were evident in V1 even when the subjects' attentional state was as controlled as possible, i.e., when subjects were performing threshold (80% correct) psychophysical judgments throughout each scan (as in the contrast/speed experiment). The individual differences in cortical activity suggest that subjects might have been using different strategies to perform the tasks. However, task performance (% correct) was very similar across subjects and the tasks were designed (randomizing duration, contrast, spatial frequency, etc.) to minimize the number of possible strategies. The lack of any apparent systematic eye movements suggests that individual differences in eye position cannot account for the individual differences in fMRI responses. Additionally, further analyses suggest that shifts of spatial attention, potentially associated with differences in strategy between tasks, are not responsible for the modulations of cortical activity that we observed in the task-specificity experiment. The individual differences we observed may complement results from electrophysiology. Ito and Gilbert (1999; see also Ghose and Maunsell 1999) reported individual differences in attentional modulations between monkeys at different stages of training.

Task specificity

Perhaps the most interesting result reported here is that human cortical activity can be selectively modulated when subjects alternate between two tasks of equal difficulty on the same stimulus (Fig. 5). This task specificity can occur as early as V1. We suggest that visual areas and pathways that are most sensitive to small changes in a given stimulus feature (such as contrast or speed) are selectively modulated during discrimination judgments on that feature. Specifically, we argue that MT+ neurons are more sensitive to changes in speed than contrast. Hence, increasing the gain of MT+ responses during speed discrimination would improve the signal-to-noise of the relevant neuronal signals, thereby helping to optimize task performance. V1 neurons are highly sensitive to small changes in contrast, so increasing the gain of V1 responses would improve contrast discrimination performance. However, some V1 neurons are a crucial part of the motion pathway as well, so increasing the gain of V1 responses during both speed and contrast discrimination could help performance in each task.

Evidence from both single-unit neurophysiology and fMRI experiments supports the hypothesis that contrast detection and discrimination performance is limited by neuronal signals in V1. The response of a typical V1 neuron increases gradually with stimulus contrast across a wide range of contrast values (Albrecht and Hamilton 1982; Carandini et al. 1997; Geisler and Albrecht 1997; Sclar et al. 1990). The responses of V1 neurons have been used to predict behavioral measurements of absolute contrast detection thresholds (Hawken and Parker 1990; Tolhurst et al. 1983). The responses of V1 neurons have also been used to predict entire contrast discrimination threshold curves, over the full range of contrasts (Barlow et al. 1987; Geisler and Albrecht 1997). fMRI measurements also demonstrate that V1 is sensitive to changes in contrast, and that V1 activity is correlated with contrast discrimination performance. fMRI responses in V1 increase gradually with contrast, roughly as a power law with an exponent of 0.3-0.4 (Boynton et al. 1996, 1999; Tootell et al. 1995a). Boynton et al. (1999) found that fMRI responses in V1 are consistent with psychophysical contrast discrimination judgments, i.e., that a contrast increment is detected when the cortical activity increases by a criterion amount. V1 thus appears capable of representing small changes in stimulus contrast and contributing useful information to our contrast discrimination task.
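To give a feel for how compressive a power law with an exponent of 0.3-0.4 is, the short sketch below evaluates a response of the form gain × contrast^exponent. The function name and the `gain` parameter are hypothetical, and the contrast values are chosen only for illustration; the exponent range is the one quoted above.

```python
def power_law_response(contrast, exponent, gain=1.0):
    """Response predicted by a simple power law of stimulus contrast."""
    return gain * contrast ** exponent

# Ratio of responses when contrast doubles from 20% to 40%.
ratio_low = power_law_response(0.4, 0.3) / power_law_response(0.2, 0.3)
ratio_high = power_law_response(0.4, 0.4) / power_law_response(0.2, 0.4)
```

Under this form, doubling the contrast multiplies the response by 2^0.3 to 2^0.4, that is, by roughly 1.23 to 1.32, independent of the starting contrast and of the gain.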

Neuronal activity in MT+, on the other hand, cannot appreciably contribute to performance in our contrast discrimination task. fMRI responses in MT+ show extremely high gain at low contrasts and near complete saturation at high contrasts (Demb et al. 1998; Tootell et al. 1995a), in concordance with results from single-cell recordings in monkey MT (Sclar et al. 1990). The contrast of our stimulus was 20%, well above contrast levels that saturate MT+ responses. When MT+ neurons are contrast-saturated, their responses cannot represent the small contrast increments that are nevertheless detectable by human observers performing the contrast discrimination task.

While V1 neurons are sensitive to small changes in contrast, they are not sensitive to changes in stimulus speed per se. Because the spatial frequency of the grating stimulus was randomized in our experiment, subjects had to base their judgments on changes in stimulus speed, not on changes in temporal frequency. The speed of a moving grating can be expressed as the ratio of temporal frequency to spatial frequency (speed = tf/sf), hence any particular speed can be achieved by a set of stimuli with appropriate combinations of spatial and temporal frequencies (e.g., a stimulus with tf = 2 cycle/s and sf = 1 cycle/deg will have a speed of 2°/s, as will a stimulus with tf = 4 cycle/s and sf = 2 cycle/deg). For a neuron to be explicitly speed-tuned, therefore, its preferred temporal frequency must change in proportion to stimulus spatial frequency. V1 neurons, however, confound information about stimulus speed with spatiotemporal frequency. When tested with moving grating stimuli, the spatial and temporal frequency tuning curves of a V1 neuron are largely independent of one another (Foster et al. 1985; Hamilton et al. 1989; Holub and Morton-Gibson 1981; Ikeda and Wright 1975; Tolhurst and Movshon 1975). In other words, the preferred temporal frequency of a V1 neuron is constant across changes in stimulus spatial frequency (and vice versa). On the other hand, information about stimulus speed is implicitly represented across the population of direction selective V1 neurons, and these neurons perform a critical processing step in the visual motion pathway.
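The distinction between speed tuning and separable spatiotemporal tuning can be made concrete with the relation speed = tf/sf. The sketch below is an idealization for illustration only; the function names are hypothetical and the two "neurons" are caricatures rather than fitted models. It contrasts a unit whose preferred temporal frequency scales with spatial frequency (speed-tuned) with one whose preferred temporal frequency is fixed (separable, as described for V1).

```python
def grating_speed(tf_hz, sf_cpd):
    """Speed (deg/s) of a drifting grating: temporal over spatial frequency."""
    return tf_hz / sf_cpd

def preferred_tf_speed_tuned(preferred_speed_deg_s, sf_cpd):
    """Idealized speed-tuned unit: preferred tf grows in proportion to sf."""
    return preferred_speed_deg_s * sf_cpd

def preferred_tf_separable(preferred_tf_hz, sf_cpd):
    """Idealized separable unit: preferred tf does not depend on sf."""
    return preferred_tf_hz

# The two gratings from the text share one speed despite different tf and sf.
speed_a = grating_speed(tf_hz=2.0, sf_cpd=1.0)   # 2 deg/s
speed_b = grating_speed(tf_hz=4.0, sf_cpd=2.0)   # 2 deg/s

# Speed the separable unit effectively prefers at each spatial frequency.
separable_speeds = [
    grating_speed(preferred_tf_separable(2.0, sf), sf) for sf in (1.0, 2.0)
]
```

For the speed-tuned unit, the preferred speed is the same at every spatial frequency by construction. For the separable unit, the preferred speed falls from 2°/s to 1°/s as spatial frequency doubles, which is the sense in which such a unit confounds speed with spatiotemporal frequency.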

Neuronal activity in MT+ appears to be well suited to our speed discrimination task. Lesions to MT/MST disturb psychophysical speed discrimination performance in monkeys, using stimuli much like ours (Orban et al. 1995; Pasternak and Merigan 1994). A proportion of MT neurons is tuned for stimulus speed per se; the preferred temporal frequencies of these MT neurons shift systematically with stimulus spatial frequency (Newsome et al. 1983). The role of human MT+ in motion perception has been addressed previously using a variety of techniques. Patients with lesions that include this brain area show deficits in motion perception (Vaina et al. 1994, 1998; Zihl et al. 1983, 1991). Transcranial magnetic stimulation (TMS) near MT+ in healthy volunteers interferes with motion perception (Beckers and Homberg 1992; Beckers and Zeki 1995; Hotson et al. 1994). Functional neuroimaging studies have shown that MT+ is strongly activated when subjects view stimuli that appear to be moving (DeYoe et al. 1996; McCarthy et al. 1995; Sereno et al. 1995; Smith et al. 1998; Tootell et al. 1995a; Watson et al. 1993; Zeki et al. 1991), even for illusory motion in stationary displays (Tootell et al. 1995b; Zeki et al. 1993). Activity in MT+ can be modulated by instructing subjects to attend to moving stimuli (Beauchamp et al. 1997; Corbetta et al. 1990, 1991; Gandhi et al. 1999; O'Craven et al. 1997). MT+ responds selectively when subjects simply imagine visual motion stimuli (Goebel et al. 1998). Most relevant to our speed discrimination task are 1) that both V1 and MT+ responses are correlated with individual differences in speed discrimination performance using grating stimuli (Demb et al. 1998) and 2) that MT+, but not V1, responses are correlated with (within subject) differences in speed discrimination performance across different types of transparent dot displays (Heeger et al. 1999).

In summary, contrast discrimination performance appears to be limited by signals in V1. Speed discrimination performance, meanwhile, may be limited primarily by signals in MT+ but neurons in V1 are likely to also play an important role. In the monkey, the direction selective neurons in layer 4b of V1 send a strong projection to MT (Movshon and Newsome 1996).

Increasing the gain of the relevant neuronal signals will improve their signal-to-noise, thereby helping to optimize task performance. For this reason, we expected MT+ responses to be higher during speed discrimination than during contrast discrimination. Because performance of either task could potentially benefit from increases in the gain of V1 responses, the direction of modulations of V1 activity is difficult to predict. The diverse pattern of results that we observed in V1 may reflect the fact that some V1 neurons can contribute indirectly to speed discrimination. Indeed, given the potential contributions of V1 to both tasks, it is noteworthy that we observed task-specific modulations in two of the four subjects.

Related neuroimaging results

A number of neuroimaging studies have reported task-related modulations in MT+. Corbetta et al. (1990, 1991) demonstrated task-related modulations of extrastriate areas using positron emission tomography (PET). In selective attention scans, subjects were instructed to detect a change in a single attribute of the stimuli (speed, color, or shape). In divided attention scans, they were instructed to detect a change in any of these three attributes. In passive scans, subjects maintained fixation while passively viewing the same stimuli. Among other results, MT+ responses were higher during speed discrimination than during the other conditions. Beauchamp et al. (1997) and Chawla et al. (1999) likewise observed that MT+ responses modulated as subjects alternated between performing speed and color discriminations. O'Craven et al. (1997) observed modulations of MT+ when subjects were simply instructed to covertly alternate their attention between moving and stationary dots, without overtly performing any task.

PET measurements by Orban et al. (1996, 1998) showed a pattern opposite from what we observed. Orban et al. failed to show a significant difference in MT+ activity between a speed discrimination condition and a condition comparable to our passive fixation condition; we observed strong MT+ modulations. However, the speed discrimination task used by Orban et al. (1998) differs substantially from the task we used. While our stimuli consisted of a set of radially moving dots or gratings within a central 14° circle, Orban and colleagues used translating dots of 50% coherence within a small (3°) aperture. Additionally, their stimulus moved in the same direction on every trial, which could have produced a motion aftereffect. Our stimulus alternated inward/outward to reduce the possibility of motion adaptation.

The significance of these differences in design is confirmed by the large discordance in psychophysical thresholds reported in the two studies. Subjects in our speed discrimination tasks performed at 80% correct with speed increments of 7-14% across both stimulus types, a value only slightly higher than published speed discrimination thresholds in the psychophysics literature (McKee et al. 1986). Subjects in the Orban et al. (1998) study, meanwhile, had a median speed discrimination threshold of 40%, which suggests that the difficulty or nature of their task was quite different from ours. In a related series of fMRI experiments, Orban and colleagues did observe weak task-related modulations in MT+ when subjects performed direction discrimination tasks (Cornette et al. 1998).

Task-related modulations in V1 have also been reported previously. Corbetta et al. (1991) reported that V1 activity was greater (relative to passive fixation) when subjects performed speed discrimination. Watanabe et al. (1998a) reported that the task-related modulations in V1 exhibit task specificity. In that experiment, subjects viewed a field of translating dots superimposed over a group of radially moving dots. Subjects were cued to attend to the radial motion, to attend to the translational motion, or to passively fixate. Significant increases in V1 activity (relative to passive fixation) were observed when subjects were cued to attend to the translating dots, but there was no difference in V1 activity (relative to passive fixation) when subjects were cued to attend to the radially moving dots. We, on the other hand, observed significant task-related increases in V1 activity using radially moving dots (further replicated using radial gratings). This discrepancy might arise from the fact that subjects in the Watanabe et al. study were instructed simply to "attend" to the various types of motion, without performing a well-controlled task. It is also possible that task-related modulations using radial motion are smaller than those using translational motion, and that the methodological differences in our study (e.g., many repeated measurements for each subject, with no averaging across subjects) enabled us to demonstrate significant results. Note especially that averaging across subjects may obscure otherwise significant effects because of the consistent individual differences in the task-related modulations.

Related electrophysiology results

A number of single-cell electrophysiology experiments have demonstrated that shifts of spatial attention can affect neuronal responses (see Desimone and Duncan 1995; Heeger 1999; Maunsell 1995 for reviews), but it is unknown whether the effects of spatial attention are the same as those of featural attention.

Treue and Martinez Trujillo (1999, experiment 2) demonstrated an effect of nonspatial, featural attention. They measured responses in MT to preferred-direction motion in the receptive field while monkeys attended to a moving random dot pattern positioned outside the receptive field. The attended stimulus moved in either the preferred or null direction. Responses were enhanced when the attended pattern moved in the same (preferred) direction as the stimulus inside the receptive field. In this experiment, the attended random dot pattern was always in the same location, so that spatial attention was fixed. Likewise, Haenny et al. (1988) reported that responses in V4 were contingent on the attended orientation in a match-to-sample task. Our experiments required subjects to attend alternately to different stimulus features (e.g., speed vs. contrast), while these electrophysiology experiments required monkeys to attend alternately to different values of a single stimulus feature (e.g., upward vs. downward motion, vertical vs. horizontal orientation). It is unclear whether these two behavioral protocols are tapping into common mechanisms. Unfortunately, the effects observed in the electrophysiological experiments cannot readily be replicated with fMRI because of its limited spatial resolution; it is unlikely that changing the value of the attended stimulus feature (e.g., from vertical to horizontal) would affect the average activity within a visual area, and it is not yet possible to routinely measure activity at the columnar scale with fMRI.

Several other electrophysiology experiments have examined the effects of stimulus selection based on cued features. However (as discussed by Treue and Martinez Trujillo 1999), these experiments either required a response (e.g., an eye movement) to a target at a specific location (Chelazzi et al. 1993; Motter 1994), or they simultaneously manipulated both the attended feature and the attended spatial location of the target (McAdams and Maunsell 1999). Thus they do not provide definitive measurements of featural attention in the absence of spatial attention.

The responses of single units are also sensitive to task difficulty. For example, Spitzer et al. (1988) demonstrated that V4 responses were increased and orientation tuning was sharpened by increasing the difficulty of a line orientation task from near-perfect performance to 73% correct (approximately threshold level). It is possible that similar effects underlie the modulations we observed in our task-versus-passive experiments.

Implications

The results of the experiments reported in this paper have important implications for the design and interpretation of neuroimaging experiments. Because fMRI responses are affected by what the subject is doing, experiments need to be designed to properly control the behavioral state of the subject throughout each scan. Experiments in which subjects only perform a task through part of the scan, or in which one stimulus is potentially more engaging than the other(s), may yield results that confound stimulus-elicited responses with task- or vigilance-driven changes in cortical activity. Moreover, the chosen tasks need to be as similar as possible throughout each scan. For example, several recent neuroimaging studies have claimed to demonstrate spatially selective effects of attention in V1 (Brefczynski and DeYoe 1999; Gandhi et al. 1999; Kastner et al. 1998; Somers et al. 1999; Tootell et al. 1998; Watanabe et al. 1998b), but in two of these studies the effect of spatial selection was confounded with the effect of changing the task. Watanabe et al. (1998b) instructed subjects to alternately attend and to passively view stimuli either to the left or right of fixation. Somers et al. (1999), on some trials, instructed subjects to attend to a series of five letters presented in the fovea to determine whether any letters were different from those presented during the previous trial. On other trials, subjects were instructed to attend to a surrounding peripheral stimulus to determine whether its motion direction matched that during the previous trial.

Finally, averaging neuroimaging data across subjects can miss reliable effects observed in individual subjects. For example, we observed individual differences in the contrast/speed experiment (1 subject showing significantly higher V1 activity during contrast discrimination, 1 showing higher activity during speed discrimination, and 2 showing intermediate effects). Although averaging the data across subjects could have demonstrated that task-specific modulations are more variable in V1 than in MT+, this type of analysis would still obscure the fact that some individuals show statistically reliable modulations as early as V1.


    ACKNOWLEDGMENTS

Special thanks to G. H. Glover (and the Richard M. Lucas Center for Magnetic Resonance Spectroscopy and Imaging, supported by a National Institutes of Health National Center for Research Resources grant) for technical support. Thanks to G. M. Boynton and W. A. Press for helpful comments on the manuscript.

A. C. Huk was supported by a National Science Foundation graduate research fellowship. D. J. Heeger was supported by National Eye Institute Grant R01-EY-11794.


    FOOTNOTES

Address reprint requests to D. J. Heeger (E-mail: heeger{at}stanford.edu).

1 This relationship becomes clear by considering each scan type as measuring a difference between two conditions. Speed discrimination scans compared cortical activity during 8°/s motion (with task) with cortical activity during 8°/s motion (passive). Direction discrimination scans compared 0°/s (with task) and 0°/s (passive). Denoting the responses from speed discrimination scans as (A - B) and denoting the responses from direction discrimination scans as (C - D), the difference between speed and direction responses is simply (A - B) - (C - D). Meanwhile, all-task scans compared 8°/s (with task) and 0°/s (with task); all-passive scans compared 8°/s (passive) and 0°/s (passive). Using the same notation, the difference between all-task and all-passive scans is (A - C) - (B - D), which is equal to the expression above for the difference between the two types of task/passive scans.
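The equality asserted in this footnote can be checked by expanding both expressions, with A, B, C, and D as defined above (A: 8°/s with task; B: 8°/s passive; C: 0°/s with task; D: 0°/s passive):

```latex
\begin{align*}
(A - B) - (C - D) &= A - B - C + D,\\
(A - C) - (B - D) &= A - C - B + D = A - B - C + D.
\end{align*}
```

Both differences reduce to the same term, so the two comparisons should agree whenever the response measured in each condition does not depend on the condition with which it is paired.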

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received 21 September 1999; accepted in final form 11 February 2000.


    REFERENCES

0022-3077/00 $5.00 Copyright © 2000 The American Physiological Society