¹Interdepartmental Program in Neuroscience and ²Department of Neurobiology and Brain Research Institute, UCLA School of Medicine, Los Angeles, California 90095-1763
ABSTRACT
Amador, Nelly, Madeleine Schlag-Rey, and John Schlag. Reward-Predicting and Reward-Detecting Neuronal Activity in the Primate Supplementary Eye Field. J. Neurophysiol. 84: 2166-2170, 2000. In addition to cells specifically active with visual stimuli, saccades, or fixation, the supplementary eye field contains cells that fire in precise temporal relationship with the occurrence of reward. We studied reward-related activity in two monkeys performing a prosaccade/antisaccade task and in one monkey trained in memory prosaccades only. Two types of neurons were distinguished by their reciprocal firing pattern: reward-predicting (RP) and reward-detecting (RD). RP neurons linearly increased their firing as early as 150 ms before saccade onset until the occurrence of reward, at which time they abruptly ceased firing. In contrast, RD neurons fired in phase with reward delivery, even when its duration was varied and when it was repeated at different frequencies. RD discharges were little affected or unaffected by the position of a visual cue that briefly anchored the goal at the onset of reward. The complementary firing patterns of the RP and RD neurons could provide a feedback mechanism necessary for learning and performing the task.
INTRODUCTION
To a monkey performing a
task, the reward given on correct performance is perhaps the most
significant event of the trial. Until the monkey "learns the rules of
the game," she proceeds by repeating only those behaviors that are
rewarded and by avoiding those that are not. In primates,
reward-dependent neuronal activity has been found in the prefrontal
cortex, parietal cortex, rostral cingulate cortex, and basal ganglia
(see DISCUSSION). The present study is especially concerned
with neuronal signals time-locked to the reward event that were
recorded in the supplementary eye field (SEF). A preliminary report
appeared in Amador et al. (1999).
Patterns of neuronal activity concomitant with reward were analyzed in three monkeys who were trained to perform delayed and nondelayed saccade tasks, including prosaccades and antisaccades. In our experiments, monkeys often had to make a saccade to an unmarked location. When the saccade terminated in a window centered on the required goal and gaze remained in the window for 300-500 ms, a flash of light paired with juice delivery indicated this goal (but no flash/juice appeared on incorrect trials). The concomitant occurrence of the feedback flash and the juice is here defined as reward. One objective of this study was to determine how these events were represented in the reward-related neuronal activity.
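The reward criterion described above can be sketched in a few lines. This is only an illustration, not the authors' task-control code: the function name `earns_reward`, the square window shape, the window half-width, the sampling interval, and the sample data are all assumptions. Only the requirement that gaze stay in a window centered on the goal for a 300-500 ms hold period comes from the text.

```python
def earns_reward(gaze_samples, goal, half_width_deg, hold_ms, sample_interval_ms):
    """Return True if gaze stays inside the goal window for at least hold_ms.

    gaze_samples: list of (x_deg, y_deg) eye positions starting at saccade end.
    goal: (x_deg, y_deg) center of the window (the required saccade goal).
    half_width_deg and sample_interval_ms are hypothetical parameters; the
    paper does not state the window size or the eye-position sampling rate.
    """
    needed = int(round(hold_ms / sample_interval_ms))  # samples that must lie in the window
    if len(gaze_samples) < needed:
        return False
    for x, y in gaze_samples[:needed]:
        if abs(x - goal[0]) > half_width_deg or abs(y - goal[1]) > half_width_deg:
            return False  # gaze left the window before the hold period elapsed
    return True


# Hypothetical trials sampled every 1 ms, goal at (10, 0) deg, 2-deg half-width.
steady = [(10.1, 0.2)] * 400
drifting = [(10.1, 0.2)] * 200 + [(14.0, 0.2)] * 200
steady_ok = earns_reward(steady, (10.0, 0.0), 2.0, 300, 1.0)      # True
drifting_ok = earns_reward(drifting, (10.0, 0.0), 2.0, 300, 1.0)  # False
```

In this sketch the flash and the juice would both be triggered when the function returns True, and neither would be delivered otherwise.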
Reward-related neurons were recorded in the SEF region containing
saccade-related and/or visual fixation cells similar to those
previously described (Schlag and Schlag-Rey 1987; Schlag et al. 1992). Two types of neurons were
distinguished by their reciprocal firing patterns: reward-predicting
(RP) and reward-detecting (RD).
METHODS
Single-unit recordings were made from the SEF of three rhesus
monkeys (implanted with scleral search coils) performing memory-guided saccades and pro/antisaccades (Amador et al. 1998; Schlag-Rey et al. 1997). Receptive fields and/or
movement fields were first mapped by presenting light spots (0.3°) on
a tangent screen. Saccade targets were then presented at the center
of, or diametrically opposite, the response field. Correct performance was
rewarded by 50% diluted apple juice (sweetened with aspartame) paired
with a flash of light at the prescribed saccade goal. An infrared
camera was used to observe licking and drinking movements. All task
events were controlled by computer. Prior electrical stimulation (40 µA) ascertained that the recording sites were in the SEF. Final
verification by histological examination was obtained in two monkeys.
All procedures conformed with National Institutes of Health guidelines
for the care and use of laboratory animals and were approved by the
UCLA Animal Research Committee.
When a neuron showed a pattern of firing that was time-locked to reward, one or more tests were applied: 1) the flash or the juice, or both, was omitted; 2) within a trial, the frequency and/or the duration of the compound reward was varied; 3) juice and feedback light were repeated at asynchronous frequencies. A Fourier analysis was then used to determine the relative power of light-related and juice-related signals.
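The logic of test 3 can be illustrated with a small sketch: when light and juice are repeated at different frequencies, the power of the firing-rate signal at each repetition frequency indicates which event the neuron follows. The text does not describe the exact computation, so everything specific here is an assumption: the function name `power_at`, the 2-Hz and 3-Hz repetition rates, the 10-ms bins, and the synthetic firing rate.

```python
import math


def power_at(rate, freq_hz, bin_s):
    """Power of a binned firing-rate signal at one frequency (direct DFT projection)."""
    n = len(rate)
    mean = sum(rate) / n  # remove the DC component before projecting
    re = sum((r - mean) * math.cos(2 * math.pi * freq_hz * i * bin_s)
             for i, r in enumerate(rate))
    im = sum((r - mean) * math.sin(2 * math.pi * freq_hz * i * bin_s)
             for i, r in enumerate(rate))
    return (re * re + im * im) / (n * n)


# Hypothetical example: light repeated at 2 Hz, juice at 3 Hz, 10-ms bins, 6 s of data.
bin_s, light_hz, juice_hz = 0.01, 2.0, 3.0
n_bins = 600

# Synthetic neuron (spikes/s) that follows the light strongly and the juice weakly.
rate = [20
        + 10 * math.cos(2 * math.pi * light_hz * i * bin_s)
        + 2 * math.cos(2 * math.pi * juice_hz * i * bin_s)
        for i in range(n_bins)]

p_light = power_at(rate, light_hz, bin_s)
p_juice = power_at(rate, juice_hz, bin_s)
light_fraction = p_light / (p_light + p_juice)  # about 0.96 for this synthetic neuron
```

The recording length is chosen so that both frequencies complete a whole number of cycles, which keeps the two projections from leaking into each other. With real data a windowed spectrum estimate would be a more careful choice.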
RESULTS
Three types of reward-related activity were distinguished. One of
them (n = 11/65) was invariably associated with oral
movements. Because it was observed at posterior locations, possibly
at sites in the supplementary motor area (SMA) (see
Schall 1991), it will not be considered here. The other
two types were complementary: one predicting the imminence of reward
(RP neurons; n = 23/65), the other signaling its
occurrence (RD neurons; n = 31/65). RP and RD neurons
were recorded within the physiologically and anatomically defined SEF
(Schlag and Schlag-Rey 1987; Shook et al. 1990), i.e., at a level slightly anterior to the bend
of the arcuate sulcus, at sites 3-4 mm distant from the midline in
superficial and deep layers of the dorsomedial cortex. RP and RD
neurons were found intermingled with saccade-related (SR) neurons
within the same or adjacent tracks. In recent experiments, the ratio of
reward-related (RP + RD) neurons to saccade-related neurons was 2:1.
However, this proportion may have been influenced by the complexity of the task and/or by its recent learning (see Hollerman et al. 1998).
The firing of RP neurons often started before the eye movement but culminated at the onset of reward (Fig. 1B). This smooth increase in firing resembled that of SR neurons commonly encountered in the SEF. However, SR and RP neurons were easily distinguished by the timing of their peak firing. Typical SR firing (Fig. 1A) peaked during the saccade and then progressively declined whereas the RP discharge (Figs. 1B and 2A) kept increasing throughout the movement and throughout a subsequent fixation period (300-500 ms), peaking, on average, 133 ± 32 ms before reward onset. For 9/23 RP neurons, the prelude started after the saccade (Fig. 2B). RP neurons were further characterized by a striking inhibition starting with and lasting through the reward epoch. The difference between peak firing rate and activity following reward onset (both measured over 100 ms, paired t-test) was significant at P < 0.001 for all but one neuron (P < 0.05).
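The comparison reported above (peak firing versus activity after reward onset, each measured over 100 ms and compared with a paired t-test) can be sketched as follows. This is a minimal illustration that assumes the pairing is done across trials of a single neuron, which the text does not state explicitly. The helper names `rate_in_window` and `paired_t` and all numerical values are hypothetical. Only the t statistic is computed, because the Python standard library has no t distribution; `scipy.stats.ttest_rel` would also return the P value.

```python
import math
import statistics


def rate_in_window(spike_times_ms, start_ms, width_ms=100.0):
    """Firing rate (spikes/s) in the window [start_ms, start_ms + width_ms)."""
    count = sum(1 for t in spike_times_ms if start_ms <= t < start_ms + width_ms)
    return count * 1000.0 / width_ms


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for equal-length samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the per-trial differences
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical spike times (ms) for one trial: 3 spikes fall in the first 100 ms.
example_rate = rate_in_window([5.0, 20.0, 40.0, 130.0], 0.0)  # 30.0 spikes/s

# Hypothetical per-trial rates (spikes/s): 100-ms window at peak vs. after reward onset.
peak_rates = [60.0, 50.0, 70.0, 40.0, 60.0]
post_rates = [10.0, 20.0, 10.0, 0.0, 20.0]
t_stat, dof = paired_t(peak_rates, post_rates)
```

A large positive `t_stat` corresponds to the abrupt drop in RP firing at reward onset described in the text.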
To probe the role of the reward event in inhibiting RP neurons, trials ending with and without reward were interleaved. The monkey could not predict when this reward/no reward paradigm would be introduced. When reward was omitted, the RP activity was prolonged, as if the monkey still expected the reward, but then decreased much more slowly (Fig. 1B, dotted curve) than when reward was given (Fig. 1B, solid curve). Thus the RP profile appears not only to predict the timing of reward but also to reflect the subject's confidence in performing correctly because, in previous trials, correct performance was always rewarded.
In contrast to RP neurons, RD neurons (Fig. 2, D-F) were silent before reward onset, but they fired at a time when the RP neurons were inhibited; compare Fig. 2, A and D. The difference between peak firing and activity preceding reward (measured as for RP neurons) was significant at P < 0.001 (n = 13), P < 0.01 (n = 7), and P < 0.05 (n = 1). Whereas the RP neurons stopped firing just before reward onset (mean = 3 ± 9.3 ms, SE; n = 22), RD neurons produced a few spikes before that event (mean = 10 ± 4.9 ms; n = 20; this group excludes neurons tested with asynchronous light and juice). As expected, RD neurons were completely silent on incorrect trials. All of the 31 RD neurons signaled the occurrence of the reward regardless of the location of the visual stimulus or the direction of eye movement. RD bursts depended neither on visual fixation per se (as evident from other periods of fixation during a trial; see Fig. 2E), nor on small corrective saccades (which sometimes, but not always, occurred after reward onset; see Fig. 3A), nor on licking and swallowing movements (with which RD bursts were visibly asynchronous).
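The latencies above are given as mean ± SE across neurons. As a reminder of how such a summary is obtained, here is a minimal sketch; the helper name `mean_and_se` and the latency values are hypothetical and are not data from this study.

```python
import math
import statistics


def mean_and_se(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical per-neuron latencies (ms before reward onset; negative = after onset).
latencies_ms = [12.0, -5.0, 3.0, 8.0, -3.0]
mean_ms, se_ms = mean_and_se(latencies_ms)
```

An SE that is large relative to the mean, as for the RP neurons (3 ± 9.3 ms), indicates that the population average is not clearly distinguishable from reward onset itself.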
RP and RD neurons remained contrasted when juice was given with, and then without, feedback light (compare Fig. 2, B and E). Typically, RP firing increased in expectation of the reward (light and juice) and was inhibited afterwards. Juice given alone (Fig. 2B, second reward) also inhibited RP firing, but not immediately. As for the RD response, it was still present but significantly weaker in the juice-alone condition.
When we varied the frequency and duration of the compound reward, RP neurons still exhibited their predictive pattern (Fig. 2C). For RD neurons, increasing the duration of the reward epoch prolonged the activity (Fig. 2F, group profiles) whereas increasing the frequency of shorter rewards increased the number of discrete bursts (not illustrated).
For 10 neurons in 28 experiments with repeated reward, juice and light of various durations were combined at different frequencies. A Fourier analysis was used to discriminate signals related to juice delivery from those related to feedback light during stable fixation (Fig. 3A). The majority of neurons (n = 8/10) responded exclusively or more strongly to the light (Fig. 3B) than to the juice (even when the light was given for 2 ms), whereas two neurons responded exclusively to the juice (Fig. 3C).
DISCUSSION
The major finding of the present study is that neurons
displaying reciprocal firing patterns hinged on reward for correct eye
movements can be found in the SEF. Signals related to reward expectation or detection of correct hand movements have been reported in many studies and in diverse areas such as the caudate nucleus (Hikosaka et al. 1989), prefrontal cortex (Rosenkilde et al. 1981; Tremblay and Schultz 1999; Watanabe 1989), and rostral cingulate motor area (Shima and Tanji 1998).
In contrast with studies showing modulations of visual or motor signals
resulting from differential reward (e.g., Hollerman et al. 1998; Kawagoe et al. 1998; Platt and Glimcher 1999; Shima and Tanji 1998; Watanabe 1996), in our experiments, the reward remained
invariant during a session.
The RP profile of excitation/inhibition could not be confounded with a
presaccadic prelude of activity because it had a clearly different time
course and was time-locked to a different event. RD bursts were
specifically triggered by the reward onset and, in most cases, they
lasted for the duration of the reward epoch. These activities did not
depend on fixation, corrective saccades, or oral movements. More data
are needed to specify the exact temporal relationship between RP
inhibition and RD excitation, but their striking coincidence suggests
that they might be complementary elements in a control system used to
self-evaluate performance of a task (see also Stuphorn et al.
1999).
Except in recent tests, our monkeys always saw a feedback light appear
simultaneously with juice delivery. The juice served to sustain the
monkey's motivation for performing the task and the feedback light
provided an error estimate of the difference between the required goal
and the achieved goal of the saccade. Because this light was constantly
paired with the primary reinforcer (juice), it played an important role
as a secondary reinforcer. What was the respective influence on RD
firing of the motivation and cognitive elements confounded in a
compound reward occurrence? Our signal analysis suggests that, in our
experiments, the feedback light had a greater influence than the juice.
Accurate feedback may be particularly important when making a saccade
to an internally defined goal, such as in antisaccade trials. The fact
that the neurons described were found in the SEF, where
sensori-oculomotor signals are processed, is consistent with a
cognitive role for this region (Chen and Wise 1995).
Further research will have to determine whether this cognitive aspect
of reward is more prominently represented in the SEF than, for example,
in the orbitofrontal cortex, where neurons appear to be more attuned to
motivational parameters of reward (for a recent review see Schultz et al. 2000).
The SEF may be one of the brain sites that encode the expected timing of reward and record its actual occurrence, as distinct from the quality of reward. Whether this activity is unique to the SEF or whether it is prompted by the complexity of the task remains to be seen.
ACKNOWLEDGMENTS
We thank the anonymous referees for constructive suggestions, A. Dorfsman for computer expertise, and Y. Kwon for general assistance.
This work was supported by National Eye Institute Grants EY-02305 and EY-05879.
FOOTNOTES
Address for reprint requests: M. Schlag-Rey, Dept. of Neurobiology, UCLA Medical Center (CHS), 10833 Le Conte Ave., Box 951763, Los Angeles, CA 90095-1763 (E-mail: msr{at}ucla.edu).
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received 28 March 2000; accepted in final form 19 June 2000.
REFERENCES