Institute of Physiology, University of Fribourg, CH-1700 Fribourg, Switzerland
ABSTRACT
Hassani, Oum K., Howard C. Cromwell, and Wolfram Schultz. Influence of Expectation of Different Rewards on Behavior-Related Neuronal Activity in the Striatum. J. Neurophysiol. 85: 2477-2489, 2001. This study investigated how different expected rewards influence behavior-related neuronal activity in the anterior striatum. In a spatial delayed-response task, monkeys reached for a left or right target and obtained a small quantity of one of two juices (apple, grenadine, orange, lemon, black currant, or raspberry). In each trial, an initial instruction picture indicated the behavioral target and predicted the reward. Nonmovement trials served as controls for movement relationships. Consistent preferences in special reward choice trials and differences in anticipatory licks, performance errors, and reaction times indicated that animals differentially expected the rewards predicted by the instructions. About 600 of >2,500 neurons in anterior parts of caudate nucleus, putamen, and ventral striatum showed five forms of task-related activations, comprising responses to instructions, spatial or nonspatial activations during the preparation or execution of the movement, and activations preceding or following the rewards. About one-third of the neurons showed different levels of task-related activity depending on which liquid reward was predicted at trial end. Activations were either higher or lower for rewards that were preferred by the animals as compared with nonpreferred rewards. These data suggest that the expectation of an upcoming liquid reward may influence a fraction of task-related neurons in the anterior striatum. Apparently the information about the expected reward is incorporated into the neuronal activity related to the behavioral reaction leading to the reward. 
The results of this study are in general agreement with an account of goal-directed behavior according to which the outcome should be represented already at the time at which the behavior toward the outcome is performed.
INTRODUCTION
One of the central structures
involved in the motivational control of behavior appears to be the
striatum (Beninger 1983; Fibiger and Phillips 1986; Robbins and Everitt 1996; Wise 1982). Studies searching for neurophysiological correlates of
motivational functions in primates describe neurons in the striatum
(caudate nucleus, putamen, ventral striatum including nucleus
accumbens) that process reward information in several distinctive
forms. Some striatal neurons respond phasically following the delivery of a drop of liquid reward during well-established behavior, thus detecting the outcome of a behavioral action (Aosaki et al. 1994; Apicella et al. 1991a, 1997; Bowman et al. 1996; Hikosaka et al. 1989; Shidara et al. 1998). Other striatal neurons are
activated during several seconds before the occurrence of a reward,
suggesting access to stored information about the expected outcome
(Apicella et al. 1992; Hikosaka et al. 1989; Hollerman et al. 1998; Schultz et al. 1992). With changing reward contingencies during learning, striatal neurons show changes in reward expectation activity that correspond closely to adaptations in the animals' behavior
(Tremblay et al. 1998). Particularly interesting reward
influences are seen in the striatum during the preparation and
execution of movement. Neuronal activity is particularly prominent when
rewards are expected as opposed to no reward (Hollerman et al. 1998; Kawagoe et al. 1998). These studies
suggest that the striatum may constitute one of the prominent reward
centers in the brain.
To elucidate the role of the primate striatum in the processing of
reward information, we investigated to what extent neurons in the
anterior striatum might discriminate between different rewards and how
such information could influence neuronal activity related to the
behavior leading to these rewards. We aimed primarily for the anterior,
or "associative," striatum, which encompasses all three major
striatal subdivisions (caudate, putamen, ventral striatum including
nucleus accumbens), and made comparisons among these parts.
Delayed-response tasks can be considered as crucial tests for assessing
the behavioral functions of the striatum and frontal cortex
(Divac et al. 1967; Jacobsen and Nissen 1937). We used a modified delayed-response task to test,
separately, the behavioral reaction to be performed (arm movement to
the left vs. right, movement vs. nonmovement reaction) and the type of liquid reward to be obtained (different juices). These procedures allowed us to investigate how an expected reward may influence neuronal
activity during the decision, preparation, initiation, and execution of
the behavioral reactions. The results were previously presented in
abstract form (Cromwell et al. 1998).
METHODS
The study was performed on three macaque monkeys (A:
Macaca fascicularis, female, 3.4 kg; B: M. fascicularis, male, 2.8 kg; C: M. mulatta,
male, 6.0 kg). The activity of single neurons was recorded with
moveable microelectrodes during performance of a spatial
delayed-response task with different liquid rewards while monitoring
arm muscle activity, licking movements, and eye movements. Electrode
positions were reconstructed from small electrolytic lesions on
50-µm-thick, cresyl-violet-stained histological brain sections. Most
methods were similar to those described in detail before
(Hollerman et al. 1998). All experimental protocols
conformed to National Institutes of Health guidelines and the Swiss
Animal Protection Law; they were supervised by the Fribourg Cantonal Veterinary Office.
Behavioral procedures
Animals performed a spatial delayed-response task for liquid reward. The monkey kept its right hand relaxed on an immovable, touch-sensitive resting key. It faced a 13-in computer monitor positioned behind a transparent plastic wall in which two small levers were mounted to the left and right of the midline. At the start of each trial, a color instruction picture (13 × 13°) appeared for 1 s on the computer screen above the left or right lever (Fig. 1, top). The instruction indicated both the target of a future arm movement and the kind of liquid reward obtained at trial end. After a randomly varying delay of 3.5-4.5 s following instruction onset, two identical red squares appeared simultaneously as movement trigger at the left and right positions of the instruction. The trigger determined the time of the behavioral response without indicating the spatial target or the specific reward. The animal released the resting key, touched the lever at the position previously indicated by the instruction and received the liquid reward indicated by the instruction. Both trigger squares were extinguished on correct lever touch. Muscle contractions during the instruction-trigger delay, precocious key release, incorrect lever touch, or failure to touch the correct lever within 2 s after trigger onset were considered errors, led to cancellation of the trial, and went unrewarded.
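The error criteria listed above can be summarized as a small decision routine. The following Python sketch is only an illustration of that logic, not the task-control software used in the study; the `Trial` record, its field names, and the `classify` function are hypothetical, and the order in which the criteria are checked is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Lever must be touched within 2 s of trigger onset (value taken from the text).
RESPONSE_WINDOW_S = 2.0


@dataclass
class Trial:
    """Hypothetical record of one movement trial; times are in seconds
    relative to instruction onset."""
    instructed_side: str            # "left" or "right"
    trigger_onset_s: float          # 3.5-4.5 s after instruction onset
    key_release_s: Optional[float]  # None if the resting key was never released
    touched_side: Optional[str]     # None if no lever was touched
    touch_s: Optional[float]        # time of lever touch, if any
    emg_during_delay: bool          # muscle contraction in the instruction-trigger delay


def classify(trial: Trial) -> str:
    """Return "correct" or a label naming the first error criterion met."""
    if trial.emg_during_delay:
        return "error: muscle contraction during delay"
    if trial.key_release_s is not None and trial.key_release_s < trial.trigger_onset_s:
        return "error: precocious key release"
    if trial.touched_side is None or trial.touch_s is None:
        return "error: no lever touch"
    if trial.touched_side != trial.instructed_side:
        return "error: incorrect lever"
    if trial.touch_s - trial.trigger_onset_s > RESPONSE_WINDOW_S:
        return "error: late lever touch"
    return "correct"
```

Any trial labeled as an error would be cancelled and go unrewarded, as described in the text.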
Liquid rewards (0.10-0.20 ml) were dispensed 2.0 s after lever touch by computer-controlled liquid valves from spouts at the animal's mouth. Three spouts (2 mm ID) were positioned in a horizontal arrangement (distance, 7.5 mm) in front of the animal's mouth. Each spout delivered only a single liquid during any given day. Liquids were apple, grenadine, orange, lemon, black currant, and raspberry juices. A specific instruction picture indicated at trial onset the juice to be delivered for correct performance at trial end. Each picture remained constantly associated with the same specific juice throughout experimentation with the same animal. However, we used two to six different sets of three instruction pictures for the same three rewards in each animal to distinguish between the influences of visual features versus rewards on neuronal responses (Fig. 1). Only two instruction pictures with their associated two liquid rewards were used in a given block of trials. All neurons were tested with at least one picture set in one block of trials, and the majority of neurons was tested with all three pictures of a given set, necessitating at least two trial blocks. About 100 neurons were tested with two or three sets of pictures, including neurons showing responses to the pictures. All rewards were used in combinations in which animals showed reliable and persistent preferences throughout at least one block of trials and usually during one or several days or weeks. The effects of satiation on particular juices were not tested.
The two spatial targets and two liquid rewards alternated semi-randomly with the consecutive occurrence of same trial types being restricted to three trials. Trials lasted 12 s irrespective of behavioral performance; intertrial intervals were 2-3 s. Closed-circuit video systems served to continuously supervise limb and mouth movements. Animals were partially fluid-deprived during weekdays and were returned to their home cages after each daily session.
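A semi-random trial order of this kind can be sketched in Python as follows. This is an illustrative reconstruction only: the function names are hypothetical, the reward labels are placeholders, and it is an assumption that the three-trial limit applies to the combined trial type (target and reward together) rather than to target and reward separately.

```python
import random
from itertools import product


def semi_random_sequence(n_trials, max_run=3, seed=None):
    """Draw trial types at random, allowing at most max_run identical
    trial types in a row (assumed reading of the restriction)."""
    rng = random.Random(seed)
    # Four trial types: 2 spatial targets x 2 liquid rewards.
    trial_types = list(product(("left", "right"), ("reward_1", "reward_2")))
    sequence = []
    for _ in range(n_trials):
        candidates = trial_types
        # If the last max_run trials were all of one type, exclude that type.
        if len(sequence) >= max_run and len(set(sequence[-max_run:])) == 1:
            candidates = [t for t in trial_types if t != sequence[-1]]
        sequence.append(rng.choice(candidates))
    return sequence


def longest_run(sequence):
    """Length of the longest run of identical consecutive items."""
    longest = current = 0
    previous = object()  # sentinel that equals no trial type
    for item in sequence:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest
```

Calling `longest_run(semi_random_sequence(200, seed=1))` gives a quick check that no trial type repeats more than three times in succession.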
Two additional trial types served for control purposes. First, we used nonmovement trials to assess further movement relationships of selected neurons. These trials were semi-randomly interspersed with left and right movement trials in the delay task. The instruction picture was presented for 1.0 s at the center of the monitor, instead of the left or right position, and the trigger stimulus was the same as in movement trials. The animal kept its hand on the resting key for 2.0 s beyond the instruction-trigger delay to receive the liquid reward indicated by the instruction. In the second task variation, we assessed preferences for liquid reward in blocks of two-reward choice trials several times on each day on which neurons were recorded. In otherwise unchanged delayed-response trials, two different instructions for two rewards, instead of one instruction for one reward, were shown simultaneously, their left and right positions alternating semi-randomly. This allowed the animal to choose its reward by touching the appropriate lever following the trigger stimulus. Each pair of instructions was composed of one picture associated with a preferred and one with a nonpreferred reward.
Data acquisition
Following behavioral conditioning, animals were implanted under deep pentobarbital sodium anesthesia and aseptic conditions with two horizontal cylinders for head fixation and a stainless steel chamber permitting vertical access with microelectrodes to the left striatum. The dura was left intact. Teflon-coated, multistranded, stainless steel wires were implanted into the extensor digitorum communis and biceps brachii muscles of the right arm for electromyographic (EMG) recordings. The implant was fixed to the skull with stainless steel screws and several layers of dental cement. Animals received postoperative analgesics and antibiotics.
Glass-insulated, platinum-plated tungsten microelectrodes positioned inside a metal guide cannula served to record extracellularly the activity of single neurons, using conventional electrophysiological techniques. Inspections of histological sections revealed that the tips of the guide cannulas ended above the most dorsal parts of striatum. Although guide cannulas damaged more tissue than solid microelectrodes, they permitted the use of thin microelectrodes causing very little damage to the areas investigated. Discharges from neuronal perikarya were converted into standard digital pulses by means of an adjustable Schmitt trigger. EMGs were converted into standard digital pulses by a Schmitt trigger. EMGs and horizontal and vertical eye positions (infrared oculometer, Iscan) were collected during neuronal recordings. Licking movements were recorded during neuronal recordings as standard digital pulses produced by tongue interruptions of an infrared light beam at the liquid spout.
Pulses from neuronal discharges and EMGs were sampled together with digital signals from the behavioral task by a computer, together with analog signals from electrooculograms. Only data from neurons sampled and displayed by the computer for at least 10 trials in each of the four trial types (2 spatial targets and 2 liquid rewards) are reported. All data from neurons suspected to covary with some task component, and occasionally from unmodulated neurons, were stored uncondensed on computer disks.
Data analysis
Task-related increases of activity from the group of slowly discharging striatal neurons were assessed during individual task periods with the nonparametric one-tailed Wilcoxon signed-rank test incorporated into the evaluation software (P < 0.01). Data are only reported from neurons showing statistically significant activity increases in relation to at least one task event compared with a 1- or 2-s control period. This period was immediately before the instruction as first task event or at a period of apparent lack of modulation in cases of suspected activations preceding the instruction. Preinstruction activations are not reported. Depressions of the low background activity were difficult to assess during any task period and were not further studied.
Task-related increases of individual neurons were compared on the basis of impulse counts from individual trials during identical task periods and durations with the two-tailed Mann-Whitney U test (P < 0.01). We compared between two different rewards, left and right targets, movement and nonmovement trials, and corresponding instructions of different picture sets. Comparisons between different rewards used the following standard time windows: 0-1.0 s after instruction, 0-3.0 s before the trigger, 0-1.0 s after trigger, 0-2.0 s before the reward, and 0-2.0 s after reward. Only data from neurons with insignificant differences in control periods were considered (P > 0.05).
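The comparison of impulse counts between two rewards can be illustrated with a short Python sketch. The evaluation software used in the study is not described in detail, so everything here is an assumption made for illustration: the function names are hypothetical, spike times are assumed to be in seconds, and the P value uses the normal approximation with a tie correction and no continuity correction, which may differ from the original implementation.

```python
import math


def count_in_window(spike_times_s, event_time_s, start_s, end_s):
    """Count impulses with start_s <= (t - event_time_s) < end_s."""
    return sum(1 for t in spike_times_s if start_s <= t - event_time_s < end_s)


def mann_whitney_u(x, y):
    """Two-tailed Mann-Whitney U test on two lists of per-trial counts.
    Returns (U for sample x, approximate P value)."""
    n1, n2 = len(x), len(y)
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - mean_u) / math.sqrt(var_u)
    return u1, math.erfc(abs(z) / math.sqrt(2.0))
```

With counts taken, for example, from the 0-1.0 s window after the instruction in trials with each of two rewards, an activation would be treated as reward-discriminating when the returned P value is below 0.01, provided the control periods do not differ (P > 0.05).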
Durations of anticipatory licks were measured from instruction onset to
reward onset in each trial. Reaction times (from trigger stimulus onset
to key release) and movement times (from key release to lever touch)
were collected during neuronal recording sessions. Licks, reaction
times, and movement times were parametrically distributed and tested
for predicted differences between two rewards with the one-tailed
Student's t-test. Error rates were measured for movements
to the left or right target levers. Their skewed distributions were
compared with the one-tailed Wilcoxon test. Anatomical distributions of
task-related activations were assessed with the χ² test.
RESULTS
Behavioral performance
All three animals performed the task >95% correctly throughout neuronal recording periods. In two-reward choice trials, animals reliably preferred the same reward in 65-99% of trials irrespective of left or right instruction positions. Animal A preferred grenadine or lemon over orange over apple juice. Animal B preferred black currant or raspberry over orange over grenadine or lemon juice. Animal C preferred raspberry over black currant over orange or lemon juice. Some of the preferences changed over periods of days or weeks, but none changed reliably on satiation during individual days.
Licks were assessed during trial periods preceding reward as a measure of reward expectation. They began mostly after instruction onset and became more frequent toward the reward (Fig. 2). When pooled over all trials of a given two-reward comparison, anticipatory licks lasted significantly longer with preferred than nonpreferred rewards in 7 of 13 comparisons (P < 0.025 to P < 0.0005; 1-tailed t-test; Fig. 3, top). Errors in task performance were significantly lower with preferred rewards in 9 of 14 comparisons (P < 0.1 to P < 0.001; 1-tailed Wilcoxon test on errors in individual trial blocks; Fig. 3, middle). Reaction times of arm movements were significantly shorter with preferred rewards in 8 of 14 comparisons (P < 0.025 to P < 0.0005; 1-tailed t-test in pooled trials; Fig. 3, bottom). Movement times differed inconsistently between rewards.
Behavioral reactions in individual trial blocks rarely showed significant differences between rewards. Anticipatory licks differed in 87 of 444 blocks (20%), and reaction times differed in 49 of 610 neuronal recording blocks (8%; P < 0.025; 1-tailed t-test).
Behavioral differences were slightly more pronounced in trials in which neurons discriminated between rewards, as compared with trials with nondiscriminating neurons. This was seen in pooled trials (licks, 7 vs. 4 of 13 reward comparisons, P < 0.025; errors, 6 vs. 4 of 14 comparisons, P < 0.1; reaction times, 9 vs. 5 of 14 comparisons, P < 0.025) and in individual blocks (licks, 26% of 156 vs. 16% of 288 blocks; reaction times, 9% of 221 vs. 7% of 389 blocks; P < 0.025).
Muscle activity differed during the reaching movement between left and right targets in the extensor digitorum communis and biceps but failed to vary systematically between different rewards (Fig. 4). Despite occasional activities before trial onset, these muscles were relaxed during the instruction-trigger delay. Gaze and eye movements were comparable for the different rewards (Fig. 5). The instruction elicited an ocular saccade to a relatively fixed position on each instruction picture unless the gaze was already there. The trigger stimulus in both movement trials elicited a saccade to the appropriate response lever. In no cases were differences of neuronal activity between rewards clearly related to differences in eye movements.
Neuronal database
We studied >2,500 slowly discharging striatal neurons with
control rates of 0.1-2.5 imp/s in the spatial delayed response task,
and a subset of 105 of them in the delayed go-nogo task. Of the >2,500
neurons, 610 showed 716 statistically significant task-related
activations (63, 251, and 296 neurons recorded in animals A,
B, and C, respectively, requiring 302 recording days). The remaining neurons failed to show task-related modulations in raster
displays during the experiment and were not further investigated. Five
forms of task relationship were found during the different trial
periods and consisted of responses to instructions, activations
preceding the trigger, activations following the movement trigger,
activations preceding rewards, and responses to rewards. The separate
group of tonically active striatal neurons (Apicella et al. 1991b; Kimura et al. 1984) with discharge rates
of 3-8 imp/s was not further studied.
A total of 216 of the 610 task-related neurons (35%) showed statistically significant differences in 242 task-related activations between at least two liquid rewards. Reward-discriminating activity was seen in 17-49% of task-related neurons (Table 1). Activations were higher with either preferred (62% of discriminations; Figs. 7-9, 12B, and 14, A and B) or nonpreferred rewards (38%; Figs. 6, 10-12A, and 13) in all five forms of task relationships. Reward-discriminating differences in activity were 112 ± 18.3% (mean ± SE) for the 61 instruction responses, 76 ± 6.5% for the 50 activations preceding the trigger, 65 ± 12.4% for the 28 activations following the trigger, 69 ± 12.5% for the 25 activations preceding rewards, and 109 ± 16.9% for the 78 reward responses.
Responses to instructions
A total of 124 neurons responded to the instructions with transient activations which subsided within 500 ms after stimulus offset (Table 1). Of these, 61 neurons (49%) discriminated between two liquid rewards. Responses discriminating between left and right instruction positions were seen in 29 neurons, 18 of which discriminated between rewards, all of them only on one side of instruction presentation (Fig. 6). Responses to instructions presented on the other side either failed to discriminate between rewards (10 neurons) or were entirely absent (8 neurons). Spatially unselective neurons discriminated frequently also between rewards (Fig. 7). A subset of 22 nonspatial neurons were tested in nonmovement trials, and 6 of them showed higher responses in movement than nonmovement trials. Of these, one neuron also discriminated between rewards in nonmovement trials (Fig. 7C). Fourteen reward-discriminating neurons were tested with multiple instruction picture sets (3 spatial, 11 nonspatial neurons). Responses in nine of these neurons maintained the same reward discriminations and varied insignificantly between instructions indicating the same rewards (Figs. 6, A vs. B, and 7, A vs. B).
Activations during the instruction-trigger delay
A total of 153 task-related neurons showed activations which began during the instruction-trigger interval and terminated >500 ms after instruction offset, either before or immediately after the trigger stimulus (Table 1). Activations in 50 of these neurons differed between rewards (33%). Of 31 neurons with spatially discriminating activations, 10 discriminated between rewards (Fig. 8A). Reward discriminations occurred only on one side of instruction presentations in all but three neurons. Activations with instructions on the other side either failed to discriminate (1 neuron) or were entirely absent (6 neurons). A subset of 48 nonspatial neurons was tested in nonmovement trials, and 24 neurons showed higher activations in movement than nonmovement trials. Of these, eight neurons discriminated between rewards in movement trials (Fig. 9A). Spatially unselective neurons with substantial nonmovement activity discriminated also between rewards (Fig. 10). Reward-discriminating activity was unrelated to reaction times, which overlapped considerably (Figs. 8B and 9B).
Activations following the trigger stimulus
A total of 168 task-related neurons showed activations that closely followed the movement trigger stimulus (Table 1). Activations in 28 of these neurons discriminated between rewards (17%). Ranking of trials according to the interval between the trigger stimulus and movement onset allowed us to determine the temporal relationships to these two events. Accordingly, the activations were classified as movement-related (123 neurons, 22 of them reward-discriminating; Fig. 11), undefined (39 neurons, 6 of them reward-discriminating; Fig. 12, A and B), or trigger responses (6 neurons, none of them reward-discriminating). Twelve of the 53 neurons differentiating between left and right movement targets discriminated between rewards, 9 of them on one side only (Fig. 11). Activations with movement targets on the other side either failed to discriminate (7 neurons) or were entirely absent (2 neurons). A subset of 35 nonspatial neurons were tested in nonmovement trials, and 27 of them showed higher responses in movement than nonmovement trials. Of these, four neurons discriminated between rewards in movement trials (Fig. 12A). Reward-related differences in neuronal activity were unrelated to reaction times, which overlapped considerably (Fig. 11B).
Activations preceding rewards
A total of 101 task-related neurons showed activations that usually began well before the reward (Table 1). Most of these activations remained present until the reward was delivered and terminated <500 ms afterward, even when reward occurred before or after the usual time. These activations showed similar, insignificantly varying magnitudes in trials with left versus right movement targets. They occurred in both movement and nonmovement trials. Prereward activations in 25 of the 101 neurons (25%) discriminated between rewards (Fig. 13). Studies of the licking behavior suggested that the reward-discriminating, anticipatory neuronal activities were not due to differences in anticipatory licking. In the example of Fig. 13, higher neuronal activity was associated with the less preferred reward in anticipation of which the animal licked less.
Activations following rewards
A total of 170 task-related neurons showed responses that followed
the delivery of a reward and subsided before the instruction of the
subsequent trial (Table 1). Activations showing close temporal
relationships to licking movements (Apicella et al. 1991a) were discarded from the data sample. The reward
responses varied insignificantly in magnitude between left and right
movement targets and occurred in both movement and nonmovement trials.
Activations in 78 of the 170 neurons (46%) discriminated between
rewards (Fig. 14) irrespective of the
side of the movement target. Reward-discriminating neuronal activations
occurred in trial periods in which there were no major differences in
licking behavior (Fig. 14A). They were also observed in
nonmovement trials (Fig. 14B).
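The counts given for the five forms of task relationship can be cross-checked against the totals stated under Neuronal database. The short Python sketch below only redoes that arithmetic with numbers taken from the text; the variable names are illustrative. Because a single neuron could show more than one form of activation, the sums refer to activations rather than neurons.

```python
# (task-related activations, reward-discriminating activations) per form,
# as reported in the text.
forms = {
    "responses to instructions": (124, 61),
    "activations preceding the trigger": (153, 50),
    "activations following the trigger": (168, 28),
    "activations preceding rewards": (101, 25),
    "responses to rewards": (170, 78),
}

total_activations = sum(n for n, _ in forms.values())
total_discriminating = sum(d for _, d in forms.values())

# Percentage of reward-discriminating activations for each form,
# rounded to the nearest integer.
percentages = {name: round(100 * d / n) for name, (n, d) in forms.items()}

# Fraction of task-related neurons that discriminated between rewards.
neuron_percentage = round(100 * 216 / 610)
```

The two sums correspond to the 716 task-related activations and the 242 reward-discriminating activations reported above, and the per-form percentages span the stated range of 17-49%.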
Recording positions
Histological reconstructions of recording positions revealed that neurons were sampled in caudate nucleus, putamen, and ventral striatum, including nucleus accumbens, between rostrocaudal levels A18 and A25 and thus mostly rostral to the anterior commissure. Recordings were made throughout the entire dorsoventral extent of these structures and were mediolaterally concentrated around the internal capsule (Fig. 15).
Reward-discriminating neurons were found in the caudate (53 of 165 task-related neurons; 32%), putamen (73 of 219 neurons; 33%), and
ventral striatum (90 of 226 neurons; 40%). Their distribution failed
to vary significantly between the three structures (P = 0.19; χ² test) and among the rostrocaudal
levels explored (A18-25; P = 0.17). The distribution
of spatially discriminating neurons varied significantly among these
structures (P < 0.0005), being lower in ventral
striatum (10% of neurons with instruction, delay or trigger
activations) as compared with caudate (22%) and putamen (24%). The
distribution of reward-discriminating neurons among the spatially
discriminating neurons varied insignificantly among the three
structures (P = 0.26). Very similar results were
obtained when the same comparisons were made separately for the two
monkey species employed.
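The comparison of reward-discriminating neurons across the three structures can be illustrated with a χ² test on a 2 × 3 contingency table built from the counts given above. This Python sketch is an assumption about how such a test is commonly computed, not a description of the software actually used; the function names are hypothetical. For 2 degrees of freedom the P value has the closed form exp(-χ²/2), so no statistics library is needed.

```python
import math


def chi_square_2xk(hits, totals):
    """Pearson chi-square for a 2 x k table of hits vs. non-hits.
    Returns (chi-square statistic, degrees of freedom)."""
    misses = [t - h for h, t in zip(hits, totals)]
    p_hit = sum(hits) / sum(totals)  # pooled proportion under the null hypothesis
    chi2 = 0.0
    for h, m, t in zip(hits, misses, totals):
        expected_hit = t * p_hit
        expected_miss = t * (1.0 - p_hit)
        chi2 += (h - expected_hit) ** 2 / expected_hit
        chi2 += (m - expected_miss) ** 2 / expected_miss
    return chi2, len(totals) - 1


def p_value_df2(chi2):
    """Upper-tail probability of the chi-square distribution with 2 df."""
    return math.exp(-chi2 / 2.0)


# Reward-discriminating / task-related neurons in caudate, putamen,
# and ventral striatum, as reported in the text.
chi2, df = chi_square_2xk([53, 73, 90], [165, 219, 226])
p = p_value_df2(chi2)
```

The result is nonsignificant and of the same order as the reported P = 0.19, although it need not match exactly, because details of the original computation are not given.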
DISCUSSION
The present data show that some of the behavior-related activity
of neurons in the anterior striatum distinguished among different liquid rewards in a spatial delayed response task. All forms of task-related activity differed depending on the type of liquid reward
expected at trial end. Reward relationships were observed during the
preparation, initiation, and execution of the movement leading to the
reward. They occurred apparently on the basis of differential reward
expectations. These data extend comparable results from a previous
study on rewarded versus unrewarded movements in the striatum
(Hollerman et al. 1998) and suggest that information about expected reward may influence neuronal activity related to the
behavioral action leading to the reward. These activities may
contribute to brain mechanisms directing behavioral reactions to
rewarding goals.
Task and behavior
As judged from the effects of lesions, the spatial delayed
response task tests the mnemonic and movement preparatory functions of
primate prefrontal cortex and striatum (Divac et al. 1967; Jacobsen and Nissen 1937). The initial,
occasion-setting instruction cue determines the operant response to a
subsequent stimulus. In our task, the instruction in addition contained
information about the expected liquid reward. This association was
acquired by a Pavlovian mechanism, as the animal could disregard the
instruction picture and still receive the predicted reward. The
consistent reward preferences in choice trials with two different
instruction pictures, the durations of anticipatory licks, the errors
in behavioral performance, and the reaction times suggest that the
animals discriminated between the different rewards and had established
expectations of the upcoming reward. Apparently, the animals expected
the type of reward produced by the movement.
Mechanisms of reward discrimination
DIFFERENTIAL REWARD EXPECTATIONS VERSUS BEHAVIORAL DIFFERENCES.
Although animals appeared to distinguish between the different liquid
rewards, the neuronal differences did not seem to be predominantly due
to behavioral differences. The ranking of trials according to reaction
time revealed that neuronal activity related to the preparation or
execution of movement differed between rewards despite comparable
reaction times in individual trials (Figs. 8B,
9B, 11, and 12B). A similar result was obtained
with reward-dependent activity in the striatum in rewarded versus
unrewarded movements (Hollerman et al. 1998).
Differences in anticipatory licking were also an unlikely factor, and
some neurons showed even higher activity with less licking toward the
nonpreferred reward (Fig. 13).
IMPORTANCE OF REWARD EXPECTATION. All forms of task-related activity found in our experiments discriminated between the different liquid rewards employed. However, the frequencies of reward discrimination varied (Fig. 16). They were highest for responses following rewards and for responses to the initial, reward-predicting instruction. They were lowest for activity during the execution of movement following the trigger stimulus. Reward discriminations were not better for activity closer to the reward as compared with earlier periods during which the behavior toward the reward was being prepared. This suggests that the expectation of reward exerted a strong influence during task periods in which the reward information was used for organizing the behavioral reactions toward the reward.
REWARD VERSUS OBJECT RELATIONSHIPS.
The use of different instruction cues allowed us to assess the
contribution of visual features to reward-discriminating activity. Object-related visual responses may occur in the anterior striatum due
to inputs from ventrolateral prefrontal cortex and inferotemporal cortex (Liu and Richmond 2000; Miller et al. 1996; Selemon and Goldman-Rakic 1985) and are
reported in more posterior striatal regions (Brown et al. 1995). More than half of our neurons tested with multiple
instruction sets for the same rewards maintained their discriminating
activity with the rewards, suggesting that these neurons discriminated
between instructions on the basis of the associated reward rather than
visual features. The impact of reward associations over visual features
may be even higher in the orbitofrontal cortex where a considerable
fraction of instruction responses reflected the predicted type of food
or liquid reward in a delayed response task (Tremblay and Schultz 1999).
REWARD VERSUS MOVEMENT RELATIONSHIPS.
Spatially discriminating neurons frequently distinguished between
rewards for only one of the two spatial targets. Such neurons could
show strong activations for both rewards on one side and for only one
reward on the other side (Figs. 6 and 11), or they were activated only
on one side and only with one reward. However, more gradual differences
were also observed (Fig. 8). Likewise, neurons discriminating between
movement and nonmovement trials were activated with both rewards on one
trial type and only one reward on the other trial type (Fig. 10), in
only one trial type with only one reward (Figs. 7 and 9), or showed
more gradual differences. Selective striatal activations in rewarded
movement trials, as opposed to unrewarded movement trials or
nonmovement trials, were also observed in a delayed go-nogo task
(Hollerman et al. 1998). When more than two different
rewards are compared, neurons in the prefrontal cortex may show even
finer differences with rewards that may occur on both sides in a
two-target delayed response task (Watanabe 1996).
LOW REWARD DISCRIMINATION DURING MOVEMENT EXECUTION.
The lowest degree of reward discrimination was found in neurons showing
activations related to the execution of movement following the
movement-triggering stimulus. This corresponds to the comparatively low
fraction of movement execution neurons discriminating between rewarded
and unrewarded trials in the anterior striatum and the putamen motor
region (Hollerman et al. 1998). In a similar way, activity during the execution of eye movements, as compared with other
task periods, showed relatively little discrimination between different
reward magnitudes in the posterior parietal cortex (Platt and Glimcher 1999). These data suggest that the expectation of reward has a weaker influence on activity related to the execution of
movement than on activity related to other behavioral components.
REWARD DISCRIMINATION VERSUS AROUSAL.
The expectation of highly valued rewards may be accompanied by high
arousal levels, and differences in arousal might contribute to
differential reward-related activities. This argument might be made
when neuronal activity is higher in rewarded than unrewarded trials
(Hollerman et al. 1998; Tremblay et al. 2000) or with larger than smaller rewards (Leon and Shadlen 1999; Platt and Glimcher 1999). However,
about one-third of the presently reported activations were stronger
with less preferred rewards (Figs. 6, 10, 11, 12A, and 13),
suggesting that the differential reward activities may not in a simple
way reflect different arousal levels.
REWARD PREFERENCE VERSUS PHYSICAL PROPERTIES.
A previous study showed that neurons in the orbitofrontal cortex may
distinguish between different rewards on the basis of the motivational
value, as expressed by the animal's preference behavior, rather than
the physical properties of the reward objects (Tremblay and
Schultz 1999). It is interesting to note that neuronal activity
in the present study on the anterior striatum was higher for preferred
than for nonpreferred rewards in nearly two-thirds of the
reward-discriminating neurons. This may indicate that a similar coding
mechanism also exists in some striatal neurons. However, a more
explicit assessment of motivational value would require a different
experimental plan, and a good distinction between striatal coding of
motivational value versus physical identity is difficult to make from
the present data.
Generation of striatal reward activities
The present results reveal several ways in which rewards are
processed by different groups of striatal neurons. The differential responses to rewards and reward-predicting stimuli may be involved in
the perception of rewarding events. The integrated
reward-discriminating and behavior-related activities may provide
information about the type of reward expected for a particular
behavioral reaction. This combination of reward and behavioral
processing follows the general idea of an anatomically based
limbic-motor convergence in the basal ganglia by which information
about behavioral reactions is combined with the motivational aspects to
execute the behavior (Mogenson et al. 1980).
The observed heterogeneous activities may reflect inputs from various
reward-related neurons in cortical and subcortical structures (Schultz et al. 2000). Dopamine neurons seem to detect
an error in the prediction of reward and produce a neuronal signal
suitable for approach learning (Schultz 1998). Probably
all striatal neurons receive dopaminergic inputs (Freund et al. 1984; Smith et al. 1994) and may be influenced
by the dopamine reward signal. Although dopamine neurons discriminate
between rewarded and unrewarded stimuli, they respond similarly to
different food and liquid rewards. They might contribute to striatal
reward processes and even be involved in sustained neuronal activities
(Durstewitz et al. 2000) but are unlikely to be
responsible for their reward-discriminating capacities.
Neurons in the anterior orbitofrontal cortex and the amygdala detect
rewards and reward-predicting stimuli, are active during the
expectation of immediate rewards, and differentiate well between different reward objects, possibly on the basis of relative reward value (Critchley and Rolls 1996; Hikosaka and Watanabe 2000; Nishijo et al. 1988; Schoenbaum et al. 1998, 1999; Thorpe et al. 1983; Tremblay and Schultz 1999). Inputs from
such neurons may contribute to the reward-discriminating responses and
sustained reward expectation activities reported here.
Neurons in the dorsolateral prefrontal cortex and posterior parietal
cortex show sustained activations related to mnemonic processes and the
preparation of movements. Some of these activities appear to be
influenced by expected rewarding outcomes (Leon and Shadlen 1999; Platt and Glimcher 1999; Watanabe 1996), suggesting coding of both the reward and the behavioral
reaction toward the reward. Activities in these neurons may contribute
to the sustained reward and movement preparation-related activities
reported presently.
Given the existence of partly closed frontal cortex-basal ganglia loops, it cannot be ruled out that striatal reward activities are generated through reverberating loop activity with additional inputs from the amygdala and possibly other structures in which reward information is processed. This suggestion would require that reward-related activities exist also in other components of the loops, such as the globus pallidus and anterior thalamus, which remain to be shown.
Neurophysiological basis for goal-directed behavior?
Rewards may serve as goals for voluntary behavior when the
behavior is intentionally directed at obtaining the reward. According to motivational theory (Dickinson and Balleine 1994),
there should be representations of the outcome already at the time at
which behavioral reactions toward the reward are being prepared and executed (knowing the outcome when doing the action). Behavioral choices are made according to the motivational value of rewards (quality, quantity, and probability of reward). Monkeys can estimate the outcome of behavior and consistently choose among different outcomes irrespective of spatial positions or visual features of cues,
as observed with video games (Washburn et al. 1991), food versus cocaine discrimination (Nader and Woolverton 1991), and nutrient rewards (Baylis and Gaffan 1991; Tremblay and Schultz 1999; present data).
The present data on differential reward-related activity may point to
neurophysiological mechanisms in the striatum related to the
representation of goals before and during the execution of actions.
Groups of striatal neurons not only carry information that a given
movement will produce a reward, as opposed to no reward
(Hollerman et al. 1998; Kawagoe et al. 1998), but their activity also reflects which of several rewards will
likely be obtained. Such detailed representations would permit striatal
neurons to contribute important information to mechanisms involved in
making choices between different movements toward different rewards. A
similar coding of specific, basic reward aspects, namely quality,
quantity, and probability of reward, appears to take place in the
prefrontal and posterior parietal cortex (Leon and Shadlen 1999; Watanabe 1996), where such neurons may be
involved in making decisions between different outcomes (Platt and Glimcher 1999). The localization of reward-dependent,
behavior-related activity in closely related striatal and cortical
circuits may contribute to an anatomical and functional basis for a
future framework of neuronal mechanisms of decision-making.
ACKNOWLEDGMENTS |
---|
We thank B. Aebischer, J. Corpataux, A. Gaillard, B. Morandi, and F. Tinguely for expert technical assistance.
This study was supported by the Swiss National Science Foundation (Grants 31.43331.95 and NFP38.4038-43997), the Biomed 2 program of the European Community via the Swiss Office of Education and Science (BMH4-CT95-0608 via 95.0313-1), and by an International Research Fellowship Award from the National Science Foundation (U.S.) to H. C. Cromwell (INT-9802538).
FOOTNOTES |
---|
Address for reprint requests: W. Schultz (E-mail: Wolfram.Schultz{at}unifr.ch).
Received 2 November 2000; accepted in final form 14 March 2001.
REFERENCES |
---|