Modifications of Reward Expectation-Related Neuronal Activity During Learning in Primate Striatum
Léon Tremblay,
Jeffrey R. Hollerman, and
Wolfram Schultz
Institute of Physiology, University of Fribourg, CH-1700 Fribourg, Switzerland
ABSTRACT
Tremblay, Léon, Jeffrey R. Hollerman, and Wolfram Schultz. Modifications of reward expectation-related neuronal activity during learning in primate striatum. J. Neurophysiol. 80: 964-977, 1998. This study investigated neuronal activity in the anterior striatum while monkeys repeatedly learned to associate new instruction stimuli with known behavioral reactions and reinforcers. In a delayed go-nogo task with several trial types, an initial picture instructed the animal to execute or withhold a reaching movement and to expect a liquid reward or not. During learning, new instruction pictures were presented, and animals guessed and performed one of the trial types according to a trial-and-error strategy. Learning of a large number of pictures resulted in a learning set in which learning took place in a few trials and correct performance exceeded 80% in the first 60-90 trials. About 200 task-related striatal neurons studied in both familiar and learning conditions showed three forms of changes during learning. Activations related to the preparation and execution of behavioral reactions and the expectation of reward were maintained in many neurons but occurred in inappropriate trial types when behavioral errors were made. The activations became appropriate for individual trial types when the animals' behavior adapted to the new task contingencies. In particular, reward expectation-related activations occurred initially in both rewarded and unrewarded movement trials and became subsequently restricted to rewarded trials. These changes occurred in parallel with the visible adaptation of reward expectations by the animals. The second learning change consisted in decreases of task-related activations that were either restricted to the initial trials of new learning problems or persisted during the subsequent consolidation phase. They probably reflected reductions in the expectation and preparation of upcoming task events, including reward. The third learning change consisted in transient or sustained increases of activations. These might reflect the increased attention accompanying learning and serve to induce synaptic changes underlying the behavioral adaptations. Both decreases and increases often induced changes in the trial selective occurrence of activations. In conclusion, neurons in anterior striatum showed changes related to adaptations or reductions of expectations in new task situations and displayed activations that might serve to induce structural changes during learning.
INTRODUCTION
Several lines of evidence suggest a role of the basal ganglia in various forms of learning. Deficits in patients with Parkinson's and Huntington's disease suggest an involvement in motor learning, habit formation, and procedural memory (Butters et al. 1985; Canavan et al. 1989; Harrington et al. 1990; Knopman and Nissen 1991; Saint-Cyr et al. 1988; Vriezen and Moscovitch 1990). This view is also supported by increases in striatal blood flow during motor learning (Seitz and Roland 1992) and, by exclusion, by preserved procedural learning after temporal lobe lesions leading to declarative memory deficits (Mishkin and Appenzeller 1987; Phillips and Carr 1987). Although a specific involvement in procedural memory has been questioned (Gaffan 1996; Wise 1996), a role in motor learning would be compatible with the known motor functions of basal ganglia, including the formation of movement sequences (Graybiel 1995; Hikosaka et al. 1995). Psychopharmacological experiments suggest a role for the basal ganglia, in particular the ventral striatum and the dopamine systems, in incentive or reward-directed learning (Beninger 1983; Fibiger and Phillips 1986; Robbins and Everitt 1992; Wise 1982). Further cognitive learning functions are suggested by the effects of lesions of the monkey anterior striatum, which induce deficits in spatial delayed response and alternation learning (Bättig et al. 1960; Divac et al. 1967). Striatal lesions in rats lead to learning deficits in spatial navigation (Whishaw et al. 1987) and radial arm maze tasks (Packard et al. 1989). Finally, the basal ganglia may mediate the influence of declarative memory functions of the temporal lobe on behavioral output controlled by the frontal lobe, as visual discrimination learning remains unimpaired after section of all temporal lobe projections to prefrontal cortex and diencephalon except those via the striatum (Gaffan 1996). Taken together, the basal ganglia appear to be involved in a considerable number of learning functions, rather than subserving a single learning mechanism. This appears to be in accordance with the multiple behavioral functions attributed to these nuclei.
Recent neurophysiological experiments investigated neuronal mechanisms underlying some of these learning functions. Tonically active striatal neurons (Aosaki et al. 1994) and midbrain dopamine neurons (Ljungberg et al. 1992) acquired responses during appetitive learning and may be involved in reward detection, acquisition of stimulus-response associations, and reward prediction. Neurons in the tail of caudate failed to display major changes during discrimination learning (Brown et al. 1995). However, learning-related changes were found in frontal cortical areas projecting to the head and body of striatum. Neurons in dorsolateral prefrontal cortex transiently lost their behavioral selectivity during the initial learning of new instruction stimuli in delayed response tasks (Niki et al. 1990; Watanabe 1990). Selectivity reappeared when task performance reached 85-90% correct trials. Neurons in premotor cortex and supplementary and frontal eye fields showed decreased, increased, or even new task-related activations, as well as changes in behavioral selectivity, when new instruction stimuli were introduced in conditional motor tasks (Chen and Wise 1995a,b; Mitz et al. 1991).
In the present study, we used a learning set situation in which only a single task component changed. Animals acquired the capacity to learn the new component within a few trials, and learning could be tested repeatedly during experimental sessions (Gaffan et al. 1988; Harlow 1949). Individual neurons were studied during a whole learning episode, and their activity was compared with familiar performance. Given the heterogeneity of task relationships among striatal neurons, this approach appeared more appropriate than studying different neurons before, during, and after the learning of entirely new tasks. The learning set was based on a delayed go-nogo paradigm comparable with the tasks used in learning studies on cortical neurons. Depending on an initial instruction picture, animals executed an arm movement reinforced either by liquid or a conditioned sound, or they withheld the movement and were rewarded by liquid. Neurons in the anterior striatum showed several forms of activations related to the preparation of movement and the expectation of reward during the performance of this task with familiar instructions (Hollerman et al. 1998). The present report describes changes of neuronal activity during the association of new visual instruction stimuli with known behavioral reactions and reward. These data were previously presented in abstract form (Tremblay et al. 1994).

FIG. 1. Examples of instruction pictures for the 3 trial types. Top row: familiar stimuli. Middle and bottom: stimuli for 2 learning problems. From left to right, instructions indicate rewarded movement trials, rewarded nonmovement trials, and unrewarded movement trials, respectively.
METHODS
The study was performed on the same two Macaca fascicularis monkeys (A and B) using the same experimental procedures with the same delayed go-nogo task as described in the preceding report (Hollerman et al. 1998). One of three colored instruction pictures was presented on a computer monitor in front of the animal for 1.0 s (13 × 13°) and specifically indicated one of three trial types (rewarded movement, rewarded nonmovement, and unrewarded movement). A red trigger stimulus presented at a random interval of 2.5-3.5 s after instruction onset required the animal to execute or withhold a reaching movement according to each trial type (13 × 13°, same position as instruction). The trigger was the same in each trial type. In rewarded movement trials, the animal released a resting key and touched a small lever below the trigger to receive a small quantity of apple juice (0.15-0.20 ml) after a delay of 1.5 s. In rewarded nonmovement trials, the animal remained motionless on the resting key for 1.5 s and received the same liquid reward after a further 1.5 s. In unrewarded movement trials, the animal reacted as in rewarded movement trials, but correct performance was followed not by liquid reward but by a 1-kHz sound. The sound constituted a conditioned auditory reinforcer, because it visibly helped the animal to perform the task, but it was not an explicit reward; hence the simplifying term "unrewarded" movements. Thus each instruction was the unique stimulus in each trial indicating the behavioral reaction to be performed after the trigger (execution or withholding of movement) and predicting the type of reinforcer (liquid or sound). Correctly performed unrewarded movements were followed by one of the rewarded trials. Any incorrectly performed trial was repeated. Apart from that, the three trial types alternated semirandomly, with the number of consecutive trials of the same type restricted to three rewarded movement trials, one or two nonmovement trials, and a single unrewarded movement trial. Trials lasted 11-13 s, and intertrial intervals were 4-7 s.
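To make the trial-scheduling constraints above concrete, the following minimal Python sketch draws a semirandom sequence that respects the run-length limits, repeats erroneous trials, and follows a correct unrewarded movement with a rewarded trial. The function and label names, and the uniform choice among admissible trial types, are illustrative assumptions rather than the authors' scheduling software.

import random

# Run-length caps as described above: <=3 rewarded movement trials,
# <=2 nonmovement trials, and a single unrewarded movement trial in a row.
MAX_RUN = {"rewarded_move": 3, "nonmove": 2, "unrewarded_move": 1}

def next_trial(history):
    """Draw the next trial type semirandomly, respecting run-length caps,
    repeating erroneous trials, and following a correct unrewarded movement
    with one of the rewarded trial types."""
    last_type, last_correct = history[-1] if history else (None, True)
    if not last_correct:
        return last_type                      # incorrectly performed trials are repeated
    if last_type == "unrewarded_move":
        return random.choice(["rewarded_move", "nonmove"])
    run = 0
    for trial_type, _ in reversed(history):   # length of the current run of last_type
        if trial_type == last_type:
            run += 1
        else:
            break
    candidates = [t for t in MAX_RUN if not (t == last_type and run >= MAX_RUN[t])]
    return random.choice(candidates)

# Example: simulate 20 correctly performed trials
sequence = []
for _ in range(20):
    sequence.append((next_trial(sequence), True))
print([trial_type for trial_type, _ in sequence])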

FIG. 2. A: learning curves for the 3 trial types, averaged from 117 blocks of trials presented during neuronal recordings with monkey A. B: stability of learning curves during the course of the experiment, shown here for rewarded movement trials in monkey A. The 1st 39 problems were presented during the initial period of recording, the 2nd 39 at the midpoint, and the 3rd 39 at the end of neuronal recordings. In A and B, trial numbers start with introduction of a new learning problem, and % indicates level of correct performance. C: average learning during neuronal recordings for familiar (top) and learning trials (bottom) in monkey A. Tick marks below curves indicate learning problems, consisting of 3 new instruction stimuli. Percent correct performance was calculated from the 1st 15 trials of each type for each block.
In the learning situation, a new instruction picture was presented in each of the three trial types, whereas all other task events and the global task structure remained unchanged. Learning consisted in associating each new instruction picture with the execution or withholding of movement and expecting liquid reward versus the conditioned auditory reinforcer. Whereas each familiar instruction consisted of a single fractal image (Fig. 1, top), each learning instruction was composed of two partly superimposed, simple geometric shapes that were drawn randomly from 256 images (64 shapes having one of the colors red, yellow, green, or blue), resulting in a total of 65,536 possible stimuli (Fig. 1, middle and bottom). A color subtraction mode produced superimposed composite pictures with up to five colors on a black background of 13 × 13° on the computer monitor (CopyBits procedure of the Macintosh Operating System, transfer mode 38).
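As a check on the combinatorics stated above (64 shapes × 4 colors = 256 elementary images; two superimposed images give 256 × 256 = 65,536 composites), a short Python sketch follows. The shape identifiers and the assumption of ordered pairs drawn with repetition are illustrative, not the authors' stimulus software.

import itertools
import random

# 64 hypothetical shape identifiers x 4 colors = 256 elementary images
shapes = [f"shape_{i:02d}" for i in range(64)]      # placeholder names
colors = ["red", "yellow", "green", "blue"]
images = list(itertools.product(shapes, colors))    # 256 elementary images
assert len(images) == 256

# A learning instruction superimposes two of them, giving 256 * 256 composites
n_composites = len(images) ** 2
assert n_composites == 65_536

# Draw one random learning instruction: two partly superimposed images
instruction = (random.choice(images), random.choice(images))
print(instruction)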

FIG. 3. Development of movement parameters during learning, distinguishing rewarded from unrewarded movement trials. Differences in movement parameters in rewarded vs. unrewarded movement trial types were stable throughout a block of familiar trials (left panels). In contrast, during learning the movement parameters differentiated with experience with the new stimuli (right panels). Sequential occurrence of each trial type after introduction of a new problem is indicated below curves. Curves show medians obtained from 117 problems tested during neuronal recordings with monkey B.

FIG. 4. Development of muscle activity distinguishing rewarded from unrewarded movements during learning. Left: muscle activity accompanying the return movement from the lever to the resting key was similar in initial learning trials with movements of both types and later became distinctive in unrewarded movement trials (arrows). Right: occasionally, movement parameters in monkey A were similar in the 2 movement trial types with familiar instructions and failed to differentiate during learning. All muscle recordings were obtained during neuronal recordings from electrodes chronically implanted in the extensor digitorum communis. Dots in rasters indicate the times at which rectified muscle activity exceeded a preset level. Original trial sequence is shown from top to bottom in each part.
Animals were first conditioned over 4 mo to perform all three trial types with the familiar pictures at >90% correct. Then each of the three instruction stimuli was successively replaced by a novel (learning) stimulus, the other task components being unchanged. When >90% correct performance was reached with a problem of three learning stimuli, those stimuli were discarded, and three new stimuli were successively introduced until a total of ~15 new images had been learned. This stage lasted ~1.5 mo. Subsequently, new learning stimuli for all three trial types were introduced at once as a new problem, and monkeys A and B were trained for a further 3 and 1.5 mo, respectively. The learning rate increased progressively with each new problem until an asymptote was reached and a learning set established. Errors in behavioral performance led to cancellation of all further signals in a given trial, including reward. Learning was facilitated by repeating erroneous trials until correct performance was obtained.
The activity of single neurons was recorded with moveable microelectrodes from histologically reconstructed positions in anterior parts of the left caudate, putamen, and ventral striatum during contralateral task performance, together with activity from the right forearm extensor digitorum communis and biceps muscles. Every neuron was tested in 40-80 trials with the 3 familiar instructions and, in a separate block, with >60 learning trials using 3 new instruction pictures that were discarded afterward. The sequence of testing varied randomly between familiar and learning blocks. Task-related neuronal activations were assessed with the sliding window procedure based on the one-tailed Wilcoxon test (P < 0.01). Neurons not activated in familiar or learning trials are not reported. Differences in magnitudes and latencies of task-related changes between familiar and learning trials were assessed in individual neurons and groups of neurons with the two-tailed Mann-Whitney U test (P < 0.05). Movement parameters were evaluated in terms of reaction time (from trigger onset to key release), movement time (from key release to lever touch), and return time (from lever touch back to touch of resting key). They were compared on a trial-by-trial basis with the Kolmogorov-Smirnov test (P < 0.001).
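For readers unfamiliar with the sliding window procedure, the sketch below illustrates one plausible implementation: per-trial spike counts in successive post-event windows are compared against a control period with a one-tailed Wilcoxon signed-rank test. The window length, step size, and baseline period are assumptions for illustration; the parameters actually used in the study are not restated here.

import numpy as np
from scipy.stats import wilcoxon

def sliding_window_activation(spike_times, event_times, baseline=(-1.0, 0.0),
                              window=0.25, step=0.05, t_max=2.0, alpha=0.01):
    """Flag post-event windows whose per-trial firing exceeds the control period
    (one-tailed Wilcoxon signed-rank test across trials).

    spike_times : list of 1-D arrays, one per trial, spike times in seconds
    event_times : array of event onsets, one per trial, same time base
    Window, step, and baseline values are illustrative assumptions.
    """
    base_counts = np.array([np.sum((st >= ev + baseline[0]) & (st < ev + baseline[1]))
                            for st, ev in zip(spike_times, event_times)], float)
    base_dur = baseline[1] - baseline[0]
    significant = []
    for start in np.arange(0.0, t_max, step):
        counts = np.array([np.sum((st >= ev + start) & (st < ev + start + window))
                           for st, ev in zip(spike_times, event_times)], float)
        # convert counts to rates so unequal window and baseline durations compare
        rate_diff = counts / window - base_counts / base_dur
        if np.any(rate_diff != 0):
            _, p = wilcoxon(rate_diff, alternative="greater")
            if p < alpha:
                significant.append((start, start + window))
    return significant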
RESULTS
Learning behavior
The repeated learning of new stimuli within the same task structure resulted in a learning set in which each problem of three new stimuli was learned rapidly. At the onset of neuronal recordings, monkeys A and B had learned 275 and 78 problems, respectively. Thereafter, learning was relatively stable, with an overall >80% correct performance in blocks of 60-90 learning trials. Learning occurred largely within the first few trials and approached an asymptote within 5-10 trials of each trial type, although animals occasionally made errors during the subsequent consolidation period (Fig. 2A; see also Figs. 13, 16, and 17). Learning curves remained stable during the course of experimentation in all three trial types (Fig. 2B), although reduced learning was occasionally observed (Fig. 2C). Medians of correct performance in the 1st 15 trials in each of the 3 trial types were, respectively, 87, 100, and 93% (monkey A) and 73, 80, and 93% (monkey B). Performance in familiar trials exceeded 95% throughout the period of neuronal recording.
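The medians quoted here summarize, for each problem, correct performance over the first 15 trials of a given type after the new instructions were introduced (cf. Fig. 2). A minimal sketch of such a computation, with a hypothetical data layout, is given below.

import numpy as np

def percent_correct_first_n(blocks, trial_type, n_trials=15):
    """Per-block % correct over the first n trials of one trial type after a new
    problem is introduced, plus the median across blocks.

    blocks : list of problems; each problem maps a trial type to the ordered list
             of correct/incorrect outcomes (True/False) for that type.
    The data layout is a hypothetical convenience, not the authors' format.
    """
    per_block = []
    for block in blocks:
        outcomes = block.get(trial_type, [])[:n_trials]
        if outcomes:                               # skip blocks without this trial type
            per_block.append(100.0 * np.mean(outcomes))
    median = np.median(per_block) if per_block else np.nan
    return per_block, median

# Example with two hypothetical problems
blocks = [{"rewarded_move": [False, True, True, True] + [True] * 11},
          {"rewarded_move": [True] * 15}]
print(percent_correct_first_n(blocks, "rewarded_move"))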

FIG. 13. Transiently decreased activations in 2 caudate neurons during learning. A: during initial learning, decrease of the selective pretrigger activation in rewarded movement trials. B: decrease of posttrigger activation during initial learning trials with gradual buildup during subsequent trials. The posttrigger activation occurred in all 3 trial types during both familiar performance and learning. Correctly and incorrectly performed trials are indicated by plus sign and minus sign, respectively. Only rewarded movement trials are shown in A and B.

FIG. 16. Appearance of activations with different durations in 3 caudate neurons during learning. A: transient appearance of activation following the trigger stimulus during learning in rewarded movement trials. A posttrigger activation occurred also in unrewarded movement trials and was maintained during learning (not shown). B: sustained appearance of selective posttrigger activation in initial nonmovement trials during learning, with gradual disappearance during subsequent trials. C: sustained appearance of instruction response in rewarded movement trials during learning. This neuron also showed a new instruction response in unrewarded movement trials during learning. Correctly and incorrectly performed trials are indicated by plus sign and minus sign, respectively.

FIG. 17. Sustained increase and new appearance of instruction response in a caudate neuron during learning. With familiar instruction stimuli, the neuron responded almost exclusively in rewarded movement trials. During learning, the response was increased in rewarded movement trials. In nonmovement trials, a new response appeared and became gradually weaker during the course of learning. In unrewarded movement trials, a response appeared also but remained present during the testing period (not shown). The appearance of substantial responses in other than rewarded movement trials constituted a reduction in trial selectivity. Correctly and incorrectly performed trials are indicated by plus sign and minus sign, respectively.
Trial-by-trial analysis revealed that both animals adopted a similar learning strategy. They withheld the movement in 70-90% of the first trials with new stimuli (Fig. 2A), which was well above the probability of 0.33 for this trial type. This strategy yielded a high correct performance on initial nonmovement trials that remained above the chance level in subsequent trials. Both animals showed the steepest learning curves for unrewarded movement trials. Rewarded movement trials were learned slowest. This learning strategy remained rather stable throughout the course of neuronal recording.

FIG. 5. Adaptation of reward expectation-related sustained instruction response during learning. During familiar performance, this caudate neuron showed a sustained response in rewarded movement trials (top) and a transient response in nonmovement trials (not shown). Typically, the hand returned later to the resting key in rewarded as compared with unrewarded movement trials (return times of 958-2,539 ms vs. 404-735 ms). During learning, the sustained response occurred initially also in unrewarded movement trials, which were performed with parameters of rewarded movements (return times of 1,606-2,971 ms in trials 1-9 and 1,141-2,814 ms in trials 13-16). The response disappeared when movement parameters became typical for unrewarded movements (return times of 687-888 ms in trials 10-12 and 457-700 ms in trials 17-end; brackets to the right). Rewarded movement, nonmovement, and unrewarded movement trials alternated semirandomly during the experiment and were separated for analysis. Familiar and learning trials were performed in separate blocks. Dots in rasters denote the time of occurrence of neuronal impulses, referenced to instruction onset. Each line of dots represents 1 trial. In this and the following figures, the sequence of trials is plotted chronologically from top to bottom, learning rasters beginning with the 1st presentations of new instructions. Data in familiar and learning trials are only shown from correctly performed trials, unless mentioned otherwise.
Rewarded and unrewarded movements in familiar trials differed in both reaction time and return time, often in monkey A and always in monkey B (Fig. 3, left). This allowed identification of typical rewarded versus unrewarded movements. With new stimuli, mean reaction times in the first movement trial were intermediate between the two movement trial types. By contrast, mean return times were initially typical for rewarded movements. Both parameters became significantly different and typical for the two types of movement trial after the first few correct trials of each type (P < 0.001; Fig. 3, right). Movement times developed similarly to return times in monkey B but varied inconsistently in monkey A. Erroneous movements in nonmovement trials were usually performed with reaction times typical for rewarded movements.

FIG. 6. Adaptation of reward expectation-related activation following the trigger stimulus during learning. This caudate neuron was activated when a movement for liquid reward was performed, as opposed to sound reinforcement, and showed no activation in nonmovement trials. During learning, activations occurred in correct rewarded movement trials and in initial unrewarded movement trials performed with parameters of rewarded movements, as judged by the return to the resting key.

FIG. 7. Adaptation of reward expectation-related activation during learning. This ventral striatum neuron was activated during familiar performance and learning in both rewarded trials irrespective of movement (A and B), but not in trials in which only sound reinforcement was given (C). During learning, activations occurred in addition in initial unrewarded movement trials that were performed with parameters of rewarded movements (F). Neuronal activations in unrewarded movement trials disappeared 3 trials before the animal performed with parameters of unrewarded movements, as judged by return of the hand to the resting key (arrows at right in F). In agreement with activations in familiar nonmovement trials (B), erroneous nonmovement reactions in rewarded movement trials were also accompanied by activations (G). Here, impulses were referenced to trigger offset as the last signal before potential reward delivery, preceding reward delivery by a constant 1.5 s. In erroneous nonmovement trials in which a movement was performed (H), the trigger disappeared already on erroneous key release, and the movement was aborted by the animal before lever touch (return to key in H). This suggests an absence of reward expectation and may explain the missing activation in H. In addition, activations in correctly performed rewarded movement trials during learning (D) were increased compared with familiar trials (A).
Forearm muscle activity developed in a similar way during learning. In rewarded movement trials, muscle activity between key release and return to the resting key resembled that seen in familiar trials and remained stable over consecutive learning trials (Fig. 4, top). In unrewarded movement trials, muscle activity was frequently typical of rewarded movements in initial learning trials and later approached a pattern typical for unrewarded movements with shorter return times (Fig. 4, bottom left). Occasionally with monkey A, muscle activity and return times were similar in both familiar trial types and did not develop differentially during learning (Fig. 4, right). Thus movement parameters and muscle activity in initial learning trials were typical for rewarded movement trials and subsequently differentiated according to the two types of reinforcement.
Overview of neuronal changes
A total of 205 slowly discharging, task-related neurons were tested in both familiar and learning trials in the anterior striatum between the anterior commissure and 7 mm rostral to it. Their behavioral relationships were characterized in two respects. First, neuronal activations were event-related, temporally preceding or following the instructions, the trigger, or the reinforcers. According to the preceding report (Hollerman et al. 1998), activations preceding individual task events were probably related to the expectation of the future event or the preparation of the behavioral reaction. Activations following an event constituted a response to the event or, when following the trigger stimulus, may have been related to the execution or withholding of movement. Second, neuronal activations were trial selective, occurring only in single trial types or combinations of two trial types. Activations were more frequent in trials reinforced by liquid as opposed to sound. They further discriminated between movement and nonmovement trials. Several neurons showed multiple event relationships, each one often with different trial selectivity.

FIG. 8. Adaptation of reward expectation-related activation during learning without changes in movement parameters. This ventral striatum neuron was activated before the liquid reward in rewarded movement and nonmovement trials during familiar and learning situations (only movement trials shown). In contrast to the preceding figure, movements in rewarded and unrewarded trials were performed with similar parameters, in both familiar and learning situations.
Both event relationships and trial selectivities were influenced by learning (Table 1). In many cases, neuronal activations maintained their basic event relationships and trial selectivities and adapted to new task contingencies in parallel with the behavior. In other forms, event-related activations were decreased or increased, often leading to changes in trial selectivity. Neurons with multiple task relationships occasionally showed combinations of these learning changes with different task events and trial types, resulting in considerably modified task relationships. Learning-related changes were qualitatively comparable in all 25 neurons tested in 2 blocks of learning trials, and in 3 neurons tested in 3 blocks. This was seen with all six forms of activation related to the three task events, including instruction responses in eight neurons.
Adaptation of maintained neuronal activations
Activations were maintained in 150 of the 205 tested neurons (73%), with at least one event relationship in at least one trial type failing to show significant differences in magnitude between familiar and learning trials (Table 1). However, activations in only 39 neurons (19%) showed unchanged magnitudes in every event relationship in every trial type. Activations in 141 of the 205 tested neurons (69%) remained selective for the same trial types as with familiar performance. Activations reflected the type of trial actually performed by the animal, which was not necessarily the trial type indicated by the instruction. They occurred in inappropriate trial types when animals performed a different type of trial than indicated by the instruction. After a few learning trials, animals adapted their behavior to the type of trial indicated, and neuronal activations adapted accordingly. Maintained trial selectivities involved activations with unchanged, decreased, or increased magnitudes. Activations following the three task events showed mean latencies and durations of 112-380 ms and 250-2,400 ms, respectively, which did not vary significantly during learning.
RELATIONSHIPS TO REWARD.
As described in the preceding report, many of the task-related neuronal activations in the anterior striatum showed pronounced relationships to the reinforcers during familiar performance (Hollerman et al. 1998). Most activations related to the instructions, trigger, or reinforcers occurred only in rewarded movement trials, in both rewarded trials irrespective of movement, or occasionally only in unrewarded movement trials. Many preinstruction activations occurred selectively after rewarded trials and probably reflected an upcoming unrewarded trial.
Many activations selective for liquid-rewarded trials during familiar performance were maintained during learning but occurred also in unrewarded movement trials. Animals indiscriminately performed most movements in initial learning trials with parameters of rewarded movements. Activations in unrewarded movement trials disappeared once the animals learned to differentiate between rewarded and unrewarded movements, or even a few trials earlier. This was observed with instruction responses (Fig. 5), pretrigger activations, activations following the trigger (Fig. 6), prereward activations (Fig. 7), and preinstruction activations (Table 2). Activations occurred also in initial unrewarded movement trials when rewarded and unrewarded movements were performed with similar parameters (Fig. 8). Activations in these neurons occurred also with erroneous movements in nonmovement trials, consistent with the frequent treatment of all initial movements as rewarded. The transient occurrence of inappropriate reward-related activations in unrewarded movement trials is schematically shown in Fig. 9. Responses following the reward remained mostly restricted to trials with liquid reward.

FIG. 9. Schematic overview of the development of reward expectation-related neuronal activations during learning. A: activations during the instruction-trigger interval related conjointly to movement preparation and expectation of liquid reward. Activations occurred inappropriately also in unrewarded movement trials during early learning but became restricted again to rewarded movement trials in later learning stages. B: activations during the trigger-reward interval related to expectation of liquid reward irrespective of movement. Activations occurred inappropriately also in unrewarded movement trials during early learning. In A and B, inappropriate neuronal activations during early learning occurred in parallel with inappropriate behavioral reactions, indicating that animals had not yet acquired a differential reward expectation from the new instructions. Filled bars indicate activations occurring appropriately according to trial types; open bars show inappropriate activations. Drawings are based on time courses of activations in individual striatal neurons during the instruction-trigger and trigger-reinforcement intervals, original examples of which are shown in Figs. 5 and 6-8, respectively. Below graphs, i, t, and r indicate occurrence of the instruction, trigger, and reinforcer, respectively.
Adaptations in the opposite direction were observed with activations occurring only in unrewarded and not in rewarded movement trials (Fig. 10). Activations were absent in initial unrewarded movement trials during learning and reappeared after a few trials, consistent with the animal's habitual treatment of initial movement trials as rewarded. This occurred also when movement parameters were occasionally similar in rewarded and unrewarded trials. Activations were also absent with erroneous movements in nonmovement trials.

FIG. 10. Adaptation of sustained instruction response related to the expectation of the conditioned auditory reinforcer during learning. With familiar instructions, this caudate neuron was only activated in unrewarded movement trials (top, nonmovement trials not shown). During learning, the response was initially very weak in unrewarded movement trials and became substantial only after the 3rd trial of this type (bottom right). The initial absence of response was consistent with the animal's habitual treatment of initial movements as rewarded. However, in this instance, movements in rewarded and unrewarded trials were performed with similar parameters, in both familiar and learning situations. Only data from correctly performed trials are shown.
RELATIONSHIP TO BEHAVIORAL REACTIONS.
Several task-related neuronal activations in anterior striatum differentiated between the execution and withholding of movement during familiar performance (Hollerman et al. 1998). This was seen in relation to all three task events by activations occurring selectively or preferentially in rewarded, unrewarded, or both movement trials, or in nonmovement trials. Some preinstruction activations occurred selectively after nonmovement trials and probably reflected an upcoming movement trial.
Many activations selective for movement trials during familiar performance were maintained during learning and occurred also with erroneous behavioral reactions in inappropriate trial types. Activations occurred in correctly performed movement trials and in erroneous nonmovement trials in which a movement was made but were absent in erroneous movement trials in which no movement was made. This was observed with instruction responses (Fig. 11), pretrigger activations, activations following the trigger, and preinstruction activations (Table 1). Conversely, activations related to withholding of movement occurred in correct nonmovement trials and in erroneous movement trials in which no movement was made, but were absent in erroneous nonmovement trials in which a movement was made. All inappropriate activations disappeared after a few trials when animals learned the new instructions and reacted appropriately.

FIG. 11. Adaptation of movement preparatory activity during learning. With familiar instructions, this caudate neuron showed a sustained instruction response only in rewarded movement trials and a transient response in nonmovement trials. During learning, the sustained response was slightly increased in movement trials (bottom left) and occurred also in erroneous nonmovement trials in which a movement was performed (right). The response in correct nonmovement trials resembled that in familiar trials. Same neuron as Fig. 5.
RECORDING POSITIONS.
Neurons with maintained relationships during learning were distributed throughout the entire recording area in caudate nucleus (n = 71), putamen (n = 46), and ventral striatum (n = 24), without local preferences for any form of task relationship (Fig. 12).

FIG. 12. Anatomic positions of neurons in both monkeys showing maintained relationships to behavior and reinforcers during learning. Positions of neurons are labeled according to the 6 possible task relationships, namely responses to and activations preceding instruction, trigger, and reinforcers. Interrupted lines within sections indicate approximate borders of striatal regions (Cd, caudate; Put, putamen; V St, ventral striatum; AC, anterior commissure). Data from both monkeys are plotted on coronal sections from the left hemisphere of one monkey, which are labeled according to the distances from the interaural line (A18-A25).
Decreased neuronal activations
A total of 90 of the 205 tested neurons (44%) showed significantly decreased magnitudes of at least one event-related activation in at least one trial type. In 46 of these neurons, the decreases occurred transiently during initial learning trials until more consistent task performance was obtained. In 48 neurons, decreases were sustained, outlasting this period and often remaining present throughout the entire testing period of 60-90 trials (Table 1). Combinations of transient and sustained decreases occurred occasionally with different task events or trial types. Transient or sustained decreases resulted in the complete absence of activation in some trial types in 40 and 33 of these neurons, respectively. This abolished entirely the responsiveness of neurons activated in a single familiar trial type and increased the trial selectivity of activations in neurons activated in several familiar trial types. Neurons with decreased activations during learning were unpreferentially distributed over the entire recording area in caudate nucleus (n = 48), putamen (n = 22), and ventral striatum (n = 20).
Transient decreases during the initial two or three learning trials were seen with instruction responses, pretrigger activations (Fig. 13A), activations following the trigger (Fig. 13B), prereward activations (Fig. 14), reinforcer responses, and preinstruction activations. Sustained decreases outlasting the initial learning trials are shown in Fig. 15. During familiar trials, this neuron showed a combination of activations following the instructions, trigger stimulus, and liquid reward in both rewarded trials irrespective of movement. During learning, the instruction responses were nearly completely abolished, trigger responses were considerably decreased, and reward responses remained largely unchanged.

FIG. 14. Transiently decreased reward expectation-related activations during learning. In familiar trials, this putamen neuron was activated preceding liquid reward in rewarded movement and nonmovement trials, but not in unrewarded movement trials. During learning, the activations were initially reduced in both trial types and subsequently attained the level of familiar performance. Only data from correctly performed trials are shown.

FIG. 15. Differential sustained decreases of activations following several task events during learning. In familiar trials, this ventral striatum neuron was activated following the instructions, trigger, and liquid reward selectively in rewarded movement and nonmovement trials. During learning, the activation following the instruction was almost completely abolished, the activations following the trigger decreased considerably, and the reward responses remained largely unchanged in both rewarded movement and nonmovement trials.
Increased neuronal activations
A total of 95 of the 205 tested neurons (46%) showed significantly increased magnitudes of at least 1 event-related activation in at least 1 trial type. The increases occurred transiently during initial learning trials in 38 of these neurons and were sustained beyond learning, often during the entire 60-90 testing trials, in 72 neurons (Table 1). Combinations of transient and sustained increases occurred occasionally with different task events or trial types. The transient or sustained increases resulted in the appearance of new activations in some trial types in 28 and 38 of these neurons, respectively. Thus 11 neurons not activated in any familiar trial type showed task-related activations during learning, and the trial selectivity of neurons with existing task relationships was decreased. These increases do not include the transiently appearing reward-related activations when initial unrewarded movement trials were performed with parameters of rewarded movements. Neurons with increased activations were unpreferentially distributed over the entire recording area in caudate nucleus (n = 54), putamen (n = 22), and ventral striatum (n = 19).
Examples of newly appearing activations are shown in Fig. 16. The posttrigger activation of the neuron in A was restricted to the initial learning trials during which the animal made several errors. The posttrigger activation in B decreased gradually after the initial learning trials and disappeared completely before the end of the habitual testing period, although the animal happened to perform every trial correctly during learning. The newly appearing instruction response in C lasted during the entire testing period during learning, again without error performance. Figure 17 shows an increased instruction response in rewarded movement trials together with a newly appearing response in nonmovement trials that decreased the trial selectivity. The animal made several errors in both trial types that decreased only gradually during learning. Examples of increases of reward-related activations are shown in Fig. 18. Prereward activations showed transient (A) or sustained (B) increases. Similar increases were seen with reward responses (C).

FIG. 18. Increases and new appearances of reward-related activations in 3 caudate neurons during learning. A: transient appearance of reward expectation-related activation in rewarded movement trials. This neuron also showed reward expectation-related activations in nonmovement trials during familiar performance and learning (not shown). B: sustained, massive increase of reward expectation-related activation in nonmovement trials. This neuron showed a similar activation increase in rewarded movement trials, and in initial unrewarded movement trials performed with parameters of rewarded movements (not shown). C: sustained appearance of reward response in nonmovement trials. This neuron failed to respond to liquid reward or sound reinforcer in any other trial type during familiar performance or learning. Only data from correctly performed trials are shown.
DISCUSSION
These data show that task-related neuronal activations in the anterior striatum undergo changes when animals rapidly adapt their behavioral reactions to new instruction stimuli. In the first form of neuronal change during learning, the behavioral adaptations were accompanied by adapting neuronal activations that otherwise maintained their relationships to behavioral reactions and upcoming reinforcers. In the second form, neuronal activations were decreased or abolished. In the third form, neurons were more strongly or exclusively activated during learning. Together, these data suggest an involvement of the striatum in associating new environmental signals with known behavioral reactions and reinforcers.
Learning behavior
In most learning situations, only a part of the environment changes, and learning is confined to the modified contingencies. Repeated learning in an unchanged context leads to a learning set in which new stimuli are fully learned in a few trials (Gaffan et al. 1988; Harlow 1949). The rapid learning is particularly appropriate for the limited periods of recordings from single neurons. The present learning paradigm was based on a conditional, delayed go-nogo task involving associations between visual stimuli and behavioral reactions without explicit sensorimotor guidance. Delay tasks typically test preparatory and expectation-related functions of the striatum (Alexander and Crutcher 1990; Apicella et al. 1992; Divac et al. 1967; Hikosaka et al. 1989). Similar conditional delay tasks were used in learning set situations for investigating neurons in prefrontal and premotor cortex (Mitz et al. 1991; Niki et al. 1990; Watanabe 1990) and supplementary and frontal eye fields (Chen and Wise 1995a,b, 1996).
The present experiment comprised three learning stages. Animals first learned the basic task, discriminating on the basis of different instructions between execution and withholding of movement, and between liquid reward and conditioned sound reinforcement. The second stage involved the acquisition of a learning set. New instruction stimuli were repeatedly introduced while keeping the remaining task components unchanged, and animals acquired the capacity to associate new instructions with known behavioral reactions and reinforcers within a few trials. The third stage comprised stable performance in the learning set, and neuronal recordings were done during early learning and subsequent task consolidation. Animals applied a trial-and-error learning strategy with a win-stay lose-shift rule, as sketched below. They usually showed a nonmovement reaction to the first instruction of a new problem, apparently economizing effort. They subsequently learned unrewarded movements, thus avoiding rerunning error trials and further postponing the reward. In this way, learning occurred rapidly, involved changes only in the instruction as the crucial task component, was based on a previously learned, unchanged background situation, and was stationary over successive learning episodes.
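The win-stay lose-shift rule named above can be written schematically as follows. This is an illustration of the rule, not a model fitted to the behavioral data; the action labels and the initial bias toward withholding are taken from the description above.

def guess_next_action(last_action, last_reinforced, actions=("move", "withhold")):
    """Schematic win-stay/lose-shift policy resembling the described strategy."""
    if last_action is None:        # first trial of a new problem:
        return "withhold"          # the animals usually withheld the movement
    if last_reinforced:
        return last_action         # win-stay
    return [a for a in actions if a != last_action][0]   # lose-shift

# Example: after an unreinforced movement, the policy switches to withholding
print(guess_next_action("move", False))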
Initial unrewarded movements were often performed with parameters and muscle activations typical for rewarded movements, as judged by the return movement from the touch lever to the resting key. This suggests that animals initially expected a reward with all movements and only after a few trials distinguished between rewarded and unrewarded movements. The well-organized behavioral strategy suggests that animals guessed the significance of novel instructions and accordingly performed the three trial types with nearly the same parameters as familiar trials of corresponding types. Apparently, animals had an existing set of expectations corresponding to their previous experience in the familiar task in which only one of three trial types was possible. When confronted with a novel instruction, they could draw on these expectations to test the novel stimulus and later adapt their expectations and behavior to the new situation. Inappropriate or "erroneous" behavioral performance would thus reflect inappropriate but otherwise unchanged expectations evoked by the new instruction. The simplicity of evoking existing expectations and matching them to the experienced meaning of new instructions may explain the remarkable speed of learning.
Adaptation of maintained activations
Many neurons in the anterior striatum are activated in relation to the preparation of movements and the expectation of individual task events, including reward (Alexander and Crutcher 1990; Apicella et al. 1992; Hikosaka et al. 1989; Hollerman et al. 1998; Schultz et al. 1992). Apparently, these neurons have access to predictive information stored during previous task experience. In the present experiments, neuronal activity related to movement preparation during the instruction-trigger interval maintained the behavioral relationship and reflected the actually executed behavioral reaction, which did not necessarily correspond to the trial type indicated by the instruction. Thus erroneously performed movements in nonmovement trials were accompanied by inappropriate but otherwise maintained movement-related neuronal activations. Also, the initial "default" expectation of reward with all movements was frequently paralleled by reward expectation-related neuronal activity, which in later trials became restricted to rewarded movements. This occurred in all reward expectation-related activations preceding or following the instruction, trigger, and reward. Correspondingly, activations reflecting the expectation of the auditory reinforcer were rare in initial learning trials and reappeared subsequently. It appears that initially inappropriate neuronal activations reflected inappropriate expectations evoked by instructions whose type was guessed. These activations adapted and became appropriate during the course of learning, when expectations were matched to the new task contingencies and evoked appropriately. Interestingly, neuronal activations adapted a few trials before the behavioral reactions, which was also observed in prefrontal neurons (Watanabe 1990). Taken together, these data suggest a mechanism for adaptive learning in which existing expectation-related neuronal activity is simply matched to the new conditions, rather than all task contingencies being acquired from scratch.
Comparable learning changes were found in several regions of frontal cortex projecting to anterior striatum. In a go-nogo learning set paradigm similar to the present one, some dorsolateral prefrontal neurons maintained their movement preparatory activity in close correspondence with the actually executed behavioral response (Niki et al. 1990; Watanabe 1990). Neurons in premotor cortex, supplementary eye field, and orbitofrontal cortex also showed maintained task relationships, albeit with lower incidence than in anterior striatum (Chen and Wise 1995a,b, 1996; Mitz et al. 1991; Tremblay and Schultz 1996). Orbitofrontal cortical neurons changed their activity in parallel with behavioral changes during reversal of visual stimuli (Rolls et al. 1996; Thorpe et al. 1983). Interestingly, the inappropriate coding of sample stimuli by hippocampal place cells was suggested to play a role in bringing about inappropriate behavioral reactions (Deadwyler et al. 1996). By contrast, task-specific expectations would not exist when a new task without preconceived structure is learned by a naive animal. Prefrontal neurons began to show expectation- and preparation-related activations during the learning of spatial delayed response tasks as soon as animals attained substantial performance (Germain and Lamarre 1993; Kubota and Komatsu 1985). Similarly, most tonically active neurons in striatum acquired responses to reward-predicting stimuli only after conditioning (Aosaki et al. 1994). Thus inputs from frontal cortex could mediate some of the maintained activations presently observed during learning set performance in anterior striatum, whereas the contributions of cortical changes during the acquisition of new tasks are difficult to evaluate.
Decreased neuronal activations
A sizeable fraction of the investigated striatal neurons showed decreased event-related activations during learning compared with familiar performance. Decreases occurred in relation to any of the events and often resulted in the complete disappearance of activations in particular trial types, thus making the neurons either entirely unresponsive or increasingly task selective. The shorter lasting decreases resembled the "learning-dependent" changes that occurred transiently in premotor cortex and supplementary eye field during learning and subsequently recovered to levels of familiar performance (Chen and Wise 1995a; Mitz et al. 1991). Similar changes were also seen in prefrontal (Niki et al. 1990) and hippocampal neurons (Cahusac et al. 1993). The current longer lasting decreases resemble the "learning-static" decreases outlasting the learning phase in frontal eye field and supplementary eye field neurons (Chen and Wise 1995b).
About one-half of the presently observed decreases lasted only during the initial learning trials. Apparently, some striatal neurons lost their relationship to the preparation of movement and expectation of reward during a transient period of uncertainty with the incompletely established task. The activations regained their magnitudes when events became better predicted with firmly established stimulus-response associations and reestablished behavioral performance. Other decreases outlasted the learning phase. Whereas the familiar instructions consisted of the same, highly structured, fractal images that were used for many months, the learning pictures were simple geometric forms that were usually presented for <100 trials and then replaced by new stimuli. The animals conceivably differentiated between the class of familiar stimuli and the class of novel, "disposable" stimuli. Thus the long-lasting decreases during learning may have reflected the transient nature of frequently changing associations.
Increased neuronal activations
Several striatal neurons showed transient or sustained increases of existing event-related activations in learning trials. Activations appearing in additional trial types during learning induced decreases in trial selectivity. These striatal neurons apparently processed unidentified instructions during learning with less selectivity and thus broadened their potential coding of task events. The transient nature of increases in several striatal neurons suggests that a comparable trial selectivity was regained after the significances of instructions were acquired.
The quantitative increases resembled augmented visual responses in the tail of caudate during learning (Brown et al. 1995) and might correspond to rapidly habituating responses to new visual stimuli in the same structure (Caan et al. 1984). The activations occurring exclusively during learning were similar to "learning-selective" activations in the supplementary eye field (Chen and Wise 1995a) and comparable changes in hippocampus (Cahusac et al. 1993). Such pronounced increases during learning are compatible with decreases in movement-related activations in the supplementary motor cortex with overtrained lever pressing (Aizawa et al. 1991). The present increases outlasting the learning phase resembled the learning-static increases in frontal and supplementary eye fields (Chen and Wise 1995b). Decreases in trial selectivity were also observed in prefrontal and orbitofrontal cortex (Tremblay and Schultz 1996; Watanabe 1990) and were compatible with the modification or breakdown of directional selectivity in the supplementary eye field (Chen and Wise 1996). Taken together, frontal cortical areas projecting to the striatum show learning-related changes that may contribute to the increased activations found in the anterior striatum.
It is conceivable that the activation increases during learning might simply reflect differences in visual stimulus features, in particular with instructions. However, many learning changes occurred only during initial learning trials, whereas the instruction pictures remained the same during a whole block of learning trials. Although some activation increases and changes in trial selectivity lasted during whole learning blocks, they remained comparable when different sets of learning pictures with different visual features were tested. Increases outlasting initial learning trials were also seen with activations following the trigger, although this stimulus was the same as in familiar trials. Taken together, many of the observed learning changes appeared to be unrelated to differences in visual features. By contrast, neuronal responses related to visual features were found in the tail of caudate (Brown et al. 1995).
In contrast to visual stimulus features, attention-related mechanisms may well have contributed to increased activations. Most task events were probably processed with increased levels of attention during learning. The new instruction stimuli were novel and had future behavioral significance, and the other task events were processed in a less automatic manner. The occurrence of environmental stimuli or behavioral reactions during heightened foveal or extrafoveal selective attention is accompanied by enhanced neuronal activations in cortical areas, in particular the parietal cortex (Bushnell et al. 1981; Mountcastle et al. 1981; Sakata et al. 1983). Inputs from these areas to the striatum might account, at least in part, for the observed learning-related increases in activation.
The increased task-related activations during learning might influence the storage of new stimulus-response associations in memory. Associating new instructions with known behavioral reactions and reinforcers would require modifications of neuronal processing, most probably involving changes at the synaptic level. These changes should take place between the introduction and acquisition of new instructions, and during the subsequent consolidation phase. Neuronal activations were presently found to be increased during both periods. According to the notion of Hebbian learning, synaptic transmission may undergo long-lasting changes following repeated and conjoint pre- and postsynaptic activations. The presently reported increased activations of striatal neurons most likely reflected increased cortical input and thus increased use of corticostriatal synapses. This may induce synaptic changes at corticostriatal synapses, which are known to undergo long-term potentiation (Boeijinga et al. 1993
; Calabresi et al. 1992
; Pennartz et al. 1993
; Wickens et al. 1996
). It could be speculated that transient increases induce synaptic changes at corticostriatal synapses and that more sustained increases participate in the consolidation of these synaptic modifications.
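To make the Hebbian notion invoked here concrete, a minimal formulation (illustrative only; the symbols are not taken from the present study) would let the efficacy $w$ of a corticostriatal synapse change in proportion to the product of presynaptic cortical activity $x$ and postsynaptic striatal activation $y$,

\[ \Delta w = \eta \, x \, y, \]

with a small learning-rate constant $\eta > 0$. Under such a rule, the increased corticostriatal activity reported here during learning and consolidation would translate directly into larger weight changes, whereas activity in only one of the two elements would leave the synapse unchanged.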
Comparison with responses of dopamine neurons
Midbrain dopamine neurons respond to primary rewards and to conditioned, reward-predicting stimuli in a manner compatible with the concept of an error in the prediction of reward (Ljungberg et al. 1992
; Mirenowicz and Schultz 1994
; Schultz et al. 1993
, 1997
). They are activated by unpredicted rewards, are not influenced by fully predicted rewards, and are depressed when predicted rewards are omitted. Prediction errors crucially determine the rate of associative learning (Dickinson 1980
; Rescorla and Wagner 1972
). It appears that the rather homogeneous and simultaneous responses of dopamine neurons may constitute a teaching signal that is propagated as a global reinforcement message along diverging axons to neurons in the striatum and frontal cortex. By contrast, many of the presently found changes of striatal neurons were related to adaptations of previously acquired expectations to new task situations, or to reductions of such expectations. This would correspond to the striatal involvement in expectation- and preparation-related cognitive processes. These adaptive changes would not constitute a reward prediction signal suitable for modifying synaptic weights. However, the increased striatal activations during learning might contribute to synaptic modifications, although their heterogeneous nature and the different architecture of their axonal projections would not allow them to serve as a global reinforcement message.
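For reference, the prediction-error account cited above can be summarized by the Rescorla-Wagner learning rule (Rescorla and Wagner 1972; notation illustrative), in which the associative strength $V_i$ of stimulus $i$ changes on each trial according to

\[ \Delta V_i = \alpha_i \, \beta \left( \lambda - \sum_j V_j \right), \]

where $\lambda$ is the asymptotic associative strength supported by the reinforcer actually received, $\sum_j V_j$ is the prediction based on all stimuli present, and $\alpha_i$ and $\beta$ are salience and learning-rate parameters. The bracketed term is the prediction error: learning proceeds only to the extent that the reward differs from its prediction, which is the quantity the dopamine responses described above appear to report.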
 |
ACKNOWLEDGEMENTS |
We thank D. Ballard, A. Dickinson, and M. Watanabe for discussions and comments and B. Aebischer, J. Corpataux, A. Gaillard, A. Pisani, A. Schwarz, and F. Tinguely for expert technical assistance.
This study was supported by Swiss National Science Foundation Grants 31-28591.90, 31.43331.95, and NFP38.4038-43997, by the Roche Research Foundation, Switzerland, and by postdoctoral fellowships from the Fondation pour la Recherche Scientifique of Quebec to L. Tremblay and National Institute of Mental Health Grant MH-10282 to J. R. Hollerman.
Present addresses: L. Tremblay, INSERM Unit 289, Hôpital de la Salpêtrière, 47 Boulevard de l'Hôpital, F-75651 Paris, France; and J. Hollerman, Dept. of Psychology, Allegheny College, Meadville, PA 16335.
 |
FOOTNOTES |
Received 11 February 1998; accepted in final form 9 April 1998.
 |
REFERENCES |
-
AIZAWA, H.,
INASE, M.,
MUSHIAKE, H.,
SHIMA, K.,
TANJI, J.
Reorganization of activity in the supplementary motor area associated with motor learning and functional recovery.
Exp. Brain Res.
84: 668-671, 1991.[Medline]
-
ALEXANDER, G. E.,
CRUTCHER, M. D.
Preparation for movement: neural representations of intended direction in three motor areas of the monkey.
J. Neurophysiol.
64: 133-150, 1990.[Abstract/Free Full Text]
-
AOSAKI, T.,
TSUBOKAWA, H.,
ISHIDA, A.,
WATANABE, K.,
GRAYBIEL, A. M.,
KIMURA, M.
Responses of tonically active neurons in the primate's striatum undergo systematic changes during behavioral sensorimotor conditioning.
J. Neurosci.
14: 3969-3984, 1994.[Abstract]
-
APICELLA, P.,
SCARNATI, E.,
LJUNGBERG, T.,
SCHULTZ, W.
Neuronal activity in monkey striatum related to the expectation of predictable environmental events.
J. Neurophysiol.
68: 945-960, 1992.[Abstract/Free Full Text]
-
BÄTTIG, K.,
ROSVOLD, H. E.,
MISHKIN, M.
Comparison of the effects of frontal and caudate lesions on delayed response and alternation in monkeys.
J. Comp. Physiol. Psychol.
53: 400-404, 1960.[Medline]
-
BENINGER, R. J.
The role of dopamine in locomotor activity and learning.
Brain Res. Rev.
6: 173-196, 1983.
-
BOEIJINGA, P. H.,
MULDER, A. B.,
PENNARTZ, C.M.A.,
MANSHANDEN, I.,
LOPES DA SILVA, F. H.
Responses of the nucleus accumbens following fornix/fimbria stimulation in the rat. Identification and long-term potentiation of mono- and polysynaptic pathways.
Neuroscience
53: 1049-1058, 1993.[Medline]
-
BROWN, V. J.,
DESIMONE, R.,
MISHKIN, M.
Responses of cells in the caudate nucleus during visual discrimination learning.
J. Neurophysiol.
74: 1083-1094, 1995.[Abstract/Free Full Text]
-
BUSHNELL, M. C.,
GOLDBERG, M. E.,
ROBINSON, D. L.
Behavioral enhancement of visual responses in monkey cerebral cortex. I. Modulation in posterior parietal cortex related to selective visual attention.
J. Neurophysiol.
46: 755-772, 1981.[Free Full Text]
-
BUTTERS, N.,
WOLFE, J.,
MARTONE, M.,
GRANHOLM, E.,
CERMAK, L. S.
Memory disorders associated with Huntington's disease: verbal recall, verbal recognition and procedural memory.
Neuropsychologia
23: 729-743, 1985.[Medline]
-
CAAN, W.,
PERRETT, D. I.,
ROLLS, E. T.
Responses of striatal neurons in the behaving monkey. 2. Visual processing in the caudal neostriatum.
Brain Res.
290: 53-65, 1984.[Medline]
-
CAHUSAC, P.M.B.,
ROLLS, E. T.,
MIYASHITA, Y.,
NIKI, H.
Modification of hippocampal neurons in the monkey during the learning of a conditional spatial response task.
Hippocampus
3: 29-42, 1993.[Medline]
-
CALABRESI, P.,
PISANI, A.,
MERCURI, N. B.,
BERNARDI, G.
Long-term potentiation in the striatum is unmasked by removing the voltage-dependent magnesium block of NMDA receptor channels.
Eur. J. Neurosci.
4: 929-935, 1992.[Medline]
-
CANAVAN, A.G.M.,
PASSINGHAM, R. E.,
MARSDEN, C. D.,
QUINN, N.,
WYKE, M.,
POLKEY, C. E.
The performance on learning tasks of patients in the early stages of Parkinson's disease.
Neuropsychologia
27: 141-156, 1989.[Medline]
-
CHEN, L. L.,
WISE, S. P.
Neuronal activity in the supplementary eye field during acquisition of conditional oculomotor associations.
J. Neurophysiol.
73: 1101-1121, 1995a.[Abstract/Free Full Text]
-
CHEN, L. L.,
WISE, S. P.
Supplementary eye field contrasted with the frontal eye field during acquisition of conditional oculomotor associations.
J. Neurophysiol.
73: 1122-1134, 1995b.[Abstract/Free Full Text]
-
CHEN, L. L.,
WISE, S. P.
Evolution of directional preferences in the supplementary eye field during acquisition of conditional oculomotor associations.
J. Neurosci.
16: 3067-3081, 1996.[Abstract/Free Full Text]
-
DEADWYLER, S. A.,
BUNN, T.,
HAMPSON, R. E.
Hippocampal ensemble activity during spatial delayed-nonmatch-to-sample performance in rats.
J. Neurosci.
16: 354-372, 1996.[Abstract]
-
DICKINSON, A.
Contemporary Animal Learning Theory. Cambridge, UK: Cambridge Univ. Press, 1980.
-
DIVAC, I.,
ROSVOLD, H. E.,
SZWARCBART, M. K.
Behavioral effects of selective ablation of the caudate nucleus.
J. Comp. Physiol. Psychol.
63: 184-190, 1967.[Medline]
-
FIBIGER, H. C.,
PHILLIPS, A. G.
Reward, motivation, cognition: psychobiology of mesotelencephalic dopamine systems.
In: Handbook of Physiology. The Nervous System. Intrinsic Regulatory Systems of the Brain. Bethesda, MD: Am. Physiol. Soc., 1986, sect. 1, vol. IV, p. 647-675.
-
GAFFAN, D.
Memory, action and the corpus striatum: current developments in the memory-habit distinction.
Semin. Neurosci.
8: 33-38, 1996.
-
GAFFAN, E. A.,
GAFFAN, D.,
HARRISON, S.
Disconnection of the amygdala from visual association cortex impairs visual reward association learning in monkeys.
J. Neurosci.
8: 3144-3150, 1988.[Abstract]
-
GERMAIN, L.,
LAMARRE, Y.
Neuronal activity in the motor and premotor cortices before and after learning the associations between auditory stimuli and motor responses.
Brain Res.
611: 175-179, 1993.[Medline]
-
GRAYBIEL, A. M.
Building action repertoires: memory and learning functions of the basal ganglia.
Curr. Opin. Neurobiol.
5: 733-741, 1995.[Medline]
-
HARLOW, H. F.
The formation of learning sets.
Psychol. Rev.
56: 51-65, 1949.
-
HARRINGTON, D. L.,
HAALAND, K. Y.,
YEO, R. A.,
MARDER, E.
Procedural memory in Parkinson's disease: impaired motor but not visuoperceptual learning.
J. Clin. Exp. Neuropsychol.
12: 323-329, 1990.[Medline]
-
HIKOSAKA, O.,
RAND, M. K.,
MIYACHI, S.,
MIYASHITA, K.
Learning of sequential movements in the monkey: process of learning and retention of memory.
J. Neurophysiol.
74: 1652-1661, 1995.[Abstract/Free Full Text]
-
HIKOSAKA, O.,
SAKAMOTO, M.,
USUI, S.
Functional properties of monkey caudate neurons. III. Activities related to expectation of target and reward.
J. Neurophysiol.
61: 814-832, 1989.[Abstract/Free Full Text]
-
HOLLERMAN, J. R.,
TREMBLAY, L.,
SCHULTZ, W.
Influence of reward expectation on behavior-related neuronal activity in primate striatum.
J. Neurophysiol.
80: 947-963, 1998.[Abstract/Free Full Text]
-
KNOPMAN, D.,
NISSEN, M. J.
Procedural learning is impaired in Huntington's disease: evidence from the serial reaction time task.
Neuropsychologia
29: 245-254, 1991.[Medline]
-
KUBOTA, K.,
KOMATSU, H.
Neuron activities of monkey prefrontal cortex during the learning of visual discrimination tasks with go/no-go performances.
Neurosci. Res.
3: 106-129, 1985.[Medline]
-
LJUNGBERG, T.,
APICELLA, P.,
SCHULTZ, W.
Responses of monkey dopamine neurons during learning of behavioral reactions.
J. Neurophysiol.
67: 145-163, 1992.[Abstract/Free Full Text]
-
MIRENOWICZ, J.,
SCHULTZ, W.
Importance of unpredictability for reward responses in primate dopamine neurons.
J. Neurophysiol.
72: 1024-1027, 1994.[Abstract/Free Full Text]
-
MISHKIN, M.,
APPENZELLER, T.
The anatomy of memory.
Sci. Am.
256: 80-89, 1987.
-
MITZ, A. R.,
GODSCHALK, M.,
WISE, S. P.
Learning-dependent neuronal activity in the premotor cortex: activity during the acquisition of conditional motor associations.
J. Neurosci.
11: 1855-1872, 1991.[Abstract]
-
MOUNTCASTLE, V. B.,
ANDERSON, R. A.,
MOTTER, B. C.
The influence of selective attentive fixation upon the excitability of the light-sensitive neurons of the posterior parietal cortex.
J. Neurosci.
1: 1218-1235, 1981.[Abstract]
-
NIKI, H.,
SUGITA, S.,
WATANABE, M.
Modification of the activity of primate frontal neurons during learning of a go/no-go discrimination and its reversal.
In: Vision, Memory and the Temporal Lobe, edited by
E. Iwai,
and M. Mishkin.
New York: Elsevier, 1990, p. 295-304.
-
PACKARD, M. G.,
HIRSH, R.,
WHITE, N. M.
Differential effects of fornix and caudate lesions on two radial arm maze tasks: evidence for multiple memory systems.
J. Neurosci.
9: 1465-1472, 1989.[Abstract]
-
PENNARTZ, C.M.A.,
AMEERUN, R. F.,
GROENEWEGEN, H. J.,
LOPES DA SILVA, F. H.
Synaptic plasticity in an in vitro slice preparation of the rat nucleus accumbens.
Eur. J. Neurosci.
5: 107-117, 1993.[Medline]
-
PHILLIPS, A. G.,
CARR, G. D.
Cognition and basal ganglia: a possible substrate for procedural knowledge.
Can. J. Neurol. Sci.
14: 381-385, 1987.[Medline]
-
RESCORLA, R. A.,
WAGNER, A. R.
A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement.
In: Classical Conditioning. II. Current Research and Theory, edited by
A. H. Black,
and W. F. Prokasy.
New York: Appleton Century Crofts, 1972, p. 64-99.
-
ROBBINS, T. W.,
EVERITT, B. J.
Functions of dopamine in the dorsal and ventral striatum.
Semin. Neurosci.
4: 119-128, 1992.
-
ROLLS, E. T.,
CRITCHLEY, H. D.,
MASON, R.,
WAKEMAN, E. A.
Orbitofrontal cortex neurons: role in olfactory and visual association learning.
J. Neurophysiol.
75: 1970-1981, 1996.[Abstract/Free Full Text]
-
SAINT-CYR, J. A.,
TAYLOR, A. E.,
LANG, A. E.
Procedural learning and neostriatal function in man.
Brain
111: 941-959, 1988.[Abstract]
-
SAKATA, H.,
SHIBUTANI, H.,
KAWANO, K.
Functional properties of visual tracking neurons in posterior parietal association cortex of the monkey.
J. Neurophysiol.
49: 1364-1380, 1983.[Free Full Text]
-
SCHULTZ, W.,
APICELLA, P.,
LJUNGBERG, T.
Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task.
J. Neurosci.
13: 900-913, 1993.[Abstract]
-
SCHULTZ, W.,
APICELLA, P.,
SCARNATI, E.,
LJUNGBERG, T.
Neuronal activity in monkey ventral striatum related to the expectation of reward.
J. Neurosci.
12: 4595-4610, 1992.[Abstract]
-
SCHULTZ, W.,
DAYAN, P.,
MONTAGUE, P. R.
A neural substrate of prediction and reward.
Science
275: 1593-1599, 1997.[Abstract/Free Full Text]
-
SEITZ, R. J.,
ROLAND, P.
Learning of sequential finger movements in man: a combined kinematic and positron emission tomography (PET) study.
Eur. J. Neurosci.
4: 154-165, 1992.[Medline]
-
THORPE, S. J.,
ROLLS, E. T.,
MADDISON, S.
The orbitofrontal cortex: neuronal activity in the behaving monkey.
Exp. Brain Res.
49: 93-115, 1983.[Medline]
-
TREMBLAY, L.,
HOLLERMAN, J. R.,
SCHULTZ, W.
Neuronal activity in primate striatum neurons during learning.
Soc. Neurosci. Abstr.
20: 780, 1994.
-
TREMBLAY, L.,
SCHULTZ, W.
Neuronal activity in primate orbitofrontal cortex during learning.
Soc. Neurosci. Abstr.
22: 1388, 1996.
-
VRIEZEN, E. R.,
MOSCOVITCH, M.
Memory for temporal order and conditional associative-learning in patients with Parkinson's disease.
Neuropsychologia
28: 1283-1293, 1990.[Medline]
-
WATANABE, M.
Prefrontal unit activity during associative learning in the monkey.
Exp. Brain Res.
80: 296-309, 1990.[Medline]
-
WHISHAW, I. Q.,
MITTLEMAN, G.,
BUNCH, S. T.,
DUNNETT, S. B.
Impairments in the acquisition, retention and selection of spatial navigation strategies after medial caudate-putamen lesions in rats.
Behav. Brain Res.
24: 125-138, 1987.[Medline]
-
WICKENS, J. R.,
BEGG, A. J.,
ARBUTHNOTT, G. W.
Dopamine reverses the depression of rat corticostriatal synapses which normally follows high-frequency stimulation of cortex in vitro.
Neuroscience
70: 1-5, 1996.[Medline]
-
WISE, R. A.
Neuroleptics and operant behavior: the anhedonia hypothesis.
Behav. Brain Sci.
5: 39-87, 1982.
-
WISE, S. P.
The role of the basal ganglia in procedural memory.
Semin. Neurosci.
8: 39-46, 1996.