Howard Hughes Medical Institute, Department of Physiology and Biophysics, and National Primate Research Center, University of Washington, Seattle, WA 98195-7290, USA
Abstract
Introduction
The persistent spike activity that is commonly observed from neurons in the association cortex is thought to provide a neural substrate by which information can span such time gaps. It has been implicated in working memory (Fuster and Alexander, 1971; Fuster, 1973; Miyashita and Chang, 1988; Funahashi et al., 1989), motor planning (Evarts and Tanji, 1976; Bruce and Goldberg, 1985; Gnadt and Andersen, 1988) and decision making (Kim and Shadlen, 1999; Shadlen and Newsome, 2001; Hernandez et al., 2002). A reasonable hypothesis is that such persistent activity represents the outcome of a process of integration with respect to time, that is, the accumulation and storage of information. By analogy to neural integrators in the brainstem that convert eye velocity to position (Robinson, 1989; Fukushima et al., 1992), our hypothesis is that cortical neural integrators combine evanescent sensory data to generate an evolving conception about the state of the world, which can then be used to plan appropriate behaviors.
In this paper we explore the idea that the accumulation of sensory information underlies a simple perceptual decision. We review evidence suggesting that rhesus monkeys make decisions about the direction of random dot motion by integrating, with respect to time, the information they receive through direction-selective neurons in the visual cortex. We test this idea by developing a model of a neural circuit that we believe may perform such a computation. The model encompasses three stages of neural processing: (i) the representation of visual motion by ensembles of direction-selective neurons; (ii) the representation of a decision variable by ensembles of neurons that accumulate input from the first stage; and (iii) a comparison to threshold, which terminates the process and determines the model's choice. The neurons that comprise the first stage are based on the known properties of neurons in area MT. The second and third stages of the model predict the response properties of neurons in LIP and behavioral measurements obtained from monkeys. Our results show that temporal integration can explain a fairly wide range of behavioral and physiological observations, suggesting that this simple computation may play an essential role in sensorimotor decisions.
Background
The processes underlying sensorimotor decisions in humans and monkeys have been investigated using a motion discrimination task (Fig. 1) (Britten et al., 1992; Britten, 2003). The sensory stimulus in the task consists of a dynamic display of dots which appear and disappear at random locations within a 5–10° circular aperture. A fraction of these dots are displaced at some fixed offset to impart an overall sense of motion in one direction or its opposite (e.g. left versus right). The subject's goal is to discriminate between these two alternatives. Rhesus monkeys are trained to indicate their choice of direction by making a saccadic eye movement to one of two targets, corresponding to the two motion directions. If the fraction of moving dots, termed the percent coherence, is sufficiently high, it is easy to identify the correct direction. By controlling the percent coherence, the task can be made arbitrarily difficult or easy.
The motion information in the random dots task is represented by direction selective neurons in the extrastriate visual cortex, especially area MT (also known as area V5). When random dot motion appears in the receptive field of an MT neuron, there is a large initial burst followed by a sustained response whose magnitude depends on the strength and direction of motion (Fig. 2A). Evidence from microstimulation, lesion, and neural recording experiments demonstrates that such responses underlie the monkey's judgment in the random dots task (Newsome and Paré, 1988; Salzman et al., 1990; Britten et al., 1992, 1993, 1996; Bisley et al., 2001; Britten, 2003). In particular, the animal's ability to discriminate weak motion appears to be limited by variability in the response of MT neurons (Parker and Newsome, 1998). The responses of MT neurons provide the evidence upon which the monkey bases its decision about direction, but they do not gather this evidence together to form a decision. Their activity fluctuates with each passing random dot, and the evidence they provide is just as evanescent. To reach a decision, other neurons must read out the evidence from area MT.
In this study we use simulated neural responses to test the notion that temporal integration of sensory signals from area MT forms the basis for direction discrimination in the random dots task, and that neurons with persistent activity preceding eye movements represent this integration. For the latter, we focus on neurons in the lateral intraparietal area (LIP), which have been studied using both fixed duration (FD) and reaction time (RT) versions of the task. We begin by simulating the well-described responses of MT neurons to random dot motion, and we model LIP responses as a simple time-integral of the difference from opposing pools of MT neurons. We simulate two ensembles of LIP neurons, one for each of the possible eye movement responses. The decision is made when the activity of one of the LIP ensembles exceeds a critical value, that is, when the evidence reaches a threshold. The model furnishes several novel interpretations of behavioral and physiological measurements obtained using the random dots task.
The electrophysiological and behavioral data in this paper have been described in earlier publications (Britten et al., 1992; Shadlen and Newsome, 2001; Roitman and Shadlen, 2002).
Model Design
In the second stage we simulate the responses of two pools of LIP neurons. One of the LIP pools represents a plan to choose the right choice target (Fig. 3, LIP right choice), whereas the other represents the opposite plan (Fig. 3, LIP left choice). Unlike the MT neurons, the expected firing rates of these neurons are time-dependent, determined by the time integral of the difference in the output of the right and left MT pools. The expected LIP firing rate is calculated by integrating the difference in spike rate signals from MT, starting from when the coherence-dependent MT response begins (the MT latency), then delaying the result by the LIP latency; we do not simulate the LIP response before the integration begins, at a time equal to the sum of the MT and LIP latencies. Because of the temporal fluctuations in the responses of the MT pools (see equations A1 and A2, Supplementary Material), the expected LIP responses (equations A3 and A4, Supplementary Material) exhibit random deviations around the motion-dependent drift, similar to a biased diffusion process. As with the MT stage, the averaged spike rate from the two pools of LIP neurons furnishes the output of this stage of the model.
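The integration step described above can be illustrated with a minimal sketch. It is not the published implementation (equations A1 to A4 are in the Supplementary Material and are not reproduced here). The function name `integrate_mt_difference`, the 0.1 s latencies, the 1 ms time step and the Gaussian noise on the MT rates are all assumptions chosen for illustration, and the left-choice drive is simply taken as the mirror image of the right-choice drive.

```python
import numpy as np

def integrate_mt_difference(rate_right, rate_left, dt, mt_latency, lip_latency):
    """Return hypothetical drives to the right-choice and left-choice LIP pools.

    The difference between the two MT pool rates is integrated over time,
    starting at mt_latency, and the result is delayed by lip_latency.
    """
    right = np.asarray(rate_right, dtype=float)
    left = np.asarray(rate_left, dtype=float)
    n = right.size
    start = int(round(mt_latency / dt))   # first sample that is integrated
    shift = int(round(lip_latency / dt))  # delay applied after integration

    diff = right - left                   # new array, inputs are not modified
    diff[:start] = 0.0                    # nothing accumulates before the MT latency
    integral = np.cumsum(diff) * dt       # running time integral (rectangle rule)

    lip_right = np.zeros(n)
    if shift < n:
        lip_right[shift:] = integral[:n - shift]
    # Assumption: the left-choice pool receives the opposite-signed integral.
    return lip_right, -lip_right

# Noisy MT rates (assumed values) give a drift with random deviations,
# in the spirit of a biased diffusion process.
rng = np.random.default_rng(0)
n_samples, dt = 1000, 0.001
mt_right = 30.0 + 5.0 * rng.standard_normal(n_samples)
mt_left = 10.0 + 5.0 * rng.standard_normal(n_samples)
lip_right, lip_left = integrate_mt_difference(mt_right, mt_left, dt, 0.1, 0.1)
```

With constant MT rates the output is a pure ramp; the noise on the MT rates is what makes individual traces wander around that ramp.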
For simulations of the reaction time (RT) task, the LIP responses are compared to a decision threshold. The two ensemble average spike rates are smoothed using a first order filter with a time constant of 0.1 s in order to tame the moment-to-moment fluctuations. The smoothed LIP signals race against each other to provide the weight of evidence for their preferred choice direction. The first to reach the decision threshold determines the target choice and the decision time. A random time is then added before saccade initiation. Note that the reaction time in a trial is the sum of the decision time plus the non-decision intervals (saccadic latency and the MT and LIP latencies, amounting to 300 ms of non-decision time on average; Luce, 1986).
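A hedged sketch of the race to threshold follows. The first order filter is implemented here as a forward-Euler exponential smoother, which is one common choice and an assumption on our part. The function names, the ramp inputs and the threshold of 10 are hypothetical. The non-decision time is treated as a fixed 0.3 s rather than a random variable, purely to keep the example short.

```python
import numpy as np

def first_order_filter(signal, dt, time_constant):
    """Smooth a signal with a first order low-pass filter (forward Euler)."""
    signal = np.asarray(signal, dtype=float)
    out = np.empty_like(signal)
    alpha = dt / time_constant
    out[0] = signal[0]
    for i in range(1, signal.size):
        out[i] = out[i - 1] + alpha * (signal[i] - out[i - 1])
    return out

def race_to_threshold(lip_right, lip_left, dt, threshold, time_constant=0.1):
    """Return (choice, decision_time), or (None, None) if no signal crosses."""
    smooth_right = first_order_filter(lip_right, dt, time_constant)
    smooth_left = first_order_filter(lip_left, dt, time_constant)
    crossed = (smooth_right >= threshold) | (smooth_left >= threshold)
    if not crossed.any():
        return None, None
    first = int(np.argmax(crossed))       # index of the first crossing
    choice = "right" if smooth_right[first] >= smooth_left[first] else "left"
    return choice, first * dt

# Hypothetical ramping inputs: the right pool rises, the left pool falls.
dt = 0.001
t = np.arange(1000) * dt
choice, decision_time = race_to_threshold(50.0 * t, -50.0 * t, dt, threshold=10.0)

# Reaction time = decision time + non-decision time (assumed fixed at 0.3 s here).
reaction_time = decision_time + 0.3
```

Because the smoothed signal lags the raw ramp, the crossing occurs later than the 0.2 s at which the unsmoothed ramp would reach the threshold.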
For simulations of the fixed duration (FD) task, the decision is generated in one of two ways. In the first method, LIP signals are allowed to integrate for the full duration of the simulation. The decision in this case is determined by simply comparing the value of the smoothed LIP signals corresponding to the two possible choices: whichever one is higher at the end of the trial determines the choice. In the second method, a decision threshold is applied, and if one of the LIP signals crosses it, the decision is determined as in the RT simulation. If neither LIP signal crosses the threshold by the end of the simulation, the decision is determined by whichever of the LIP signals is higher.
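The two FD decision rules can be contrasted in a short sketch. The function `fd_choice` and the toy signal values are hypothetical; the inputs are assumed to be the already smoothed LIP signals for the two choices.

```python
import numpy as np

def fd_choice(smooth_right, smooth_left, threshold=None):
    """Return (choice, sample_index) for a fixed duration trial.

    threshold=None : compare the two signals at the end of the trial.
    threshold=value: the first signal to cross decides; if neither crosses,
                     fall back to the comparison at the end of the trial.
    """
    right = np.asarray(smooth_right, dtype=float)
    left = np.asarray(smooth_left, dtype=float)
    if threshold is not None:
        crossed = (right >= threshold) | (left >= threshold)
        if crossed.any():
            first = int(np.argmax(crossed))
            return ("right" if right[first] >= left[first] else "left"), first
    last = right.size - 1
    return ("right" if right[last] >= left[last] else "left"), last

# Toy signals: the right signal peaks early, the left signal is higher at the end.
right_signal = [0.0, 5.0, 12.0, 3.0, 1.0]
left_signal = [0.0, 1.0, 2.0, 4.0, 6.0]

full_integration = fd_choice(right_signal, left_signal)                # end comparison
early_commitment = fd_choice(right_signal, left_signal, threshold=10)  # threshold rule
```

In this toy case the two rules disagree: full integration chooses left, whereas the threshold rule commits to right at the early crossing and ignores what follows.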
A complete description of the model parameters along with a justification of the values we used can be found in the Supplementary Material. Importantly, although there are many parameters, most are constrained by measurement. The decision threshold is the only parameter in the model that was fit to match behavioral accuracy (see Supplementary Material).
Results
The results are presented in three parts. First, we compare the predictions of the model to behavioral and physiological data obtained in the reaction time (RT) direction discrimination task, where the subject responds as soon as they arrive at a decision. We then test whether the assumptions of the model hold in the fixed duration (FD) task, where subjects must view the motion stimulus for a specified duration. Finally, we examine the predictions of the model for the relationship between single unit responses and behavior.
Reaction-time Direction Discrimination
The combined measurement of RT and perceptual choice permits an examination of the neural responses that underlie decision formation in the time frame used by the animal to solve the task. The task furnishes two behavioral measures, choice and response time, which must be accounted for by a model of the decision process.
Behavior: Accuracy and Decision Time
The model in Figure 3 works by integrating the sensory evidence (the difference in ensemble spike rates from area MT representing the alternative directions) toward a threshold level. The level of this threshold controls the model's accuracy: a higher threshold requires more evidence to be accumulated before a response, leading to fewer errors than would be obtained with a lower threshold. Importantly, the threshold is the only parameter in the model that was fit to behavioral data. We adjusted it to match the overall accuracy seen in the monkey. At this level, the model produces a psychometric function with similar shape to that measured experimentally (Fig. 4A).
While integration to threshold accounts for accuracy and RT on correct choices, it fails to account for the pattern of RT on error trials. As shown in Figure 4B (gray curves), the model predicts that error RTs will be faster than correct RTs at each coherence, which is inconsistent with the data (black curves). These fast errors appear to result from early threshold crossings, which are caused by the instantaneous fluctuations in the model LIP activity. We will consider possible solutions to this shortcoming in Discussion. Such errors constitute only 14% of trials.
LIP Physiology
According to our hypothesis, neurons that represent the read-out of area MT compute the integral of the difference between opposing direction signals. Here we examine how well this idea predicts neural responses seen in area LIP. Figure 5 shows averaged responses from 54 neurons in area LIP recorded during the combined direction-discrimination RT task (Roitman and Shadlen, 2002). Recall that the task is arranged so that one of the choice targets is in the neuron's response field (RF). The responses increase gradually before saccades to the target in the RF, and they decrease gradually before saccades to the target outside the RF (Fig. 5A). The rate of increase or decrease initially depends on the motion strength: when responses are aligned to the onset of the motion (Fig. 5A, left panel), the spike rates diverge more rapidly for strong motion than for weak motion. This part of the graph emphasizes the epoch of decision formation. When we focus instead on the epoch immediately preceding the behavioral choice (Fig. 5A, right panel), the dependence of the responses on motion strength appears greatly reduced. Indeed, on trials when the monkey chose the target in the RF (solid curves), the responses appear to converge to a common point, independent of the motion coherence. In summary, the data show that (i) different strengths of motion result in different rates of accumulation; and (ii) the responses converge to a common point before decisions that motion is toward the RF but not before decisions in favor of the opposite direction. These properties are explained by the model.
The model produces a family of predicted LIP responses that displays many of the same properties as seen in neural recordings from LIP. Figure 5B shows the simulated responses of LIP neurons that would signal a rightward choice. In other words, the RF of these model neurons is aligned to the rightward choice target. When aligned to the onset of motion (Fig. 5B, left panel), the average spike rates of these model neurons trace out ramp-like linear trajectories. The slopes of the ramps are positive (increasing) for trials leading to rightward choices and negative (decreasing) for leftward choices, and the ramps are steeper for stronger motion than for weaker motion. When the responses are aligned to the time of the motor response (Fig. 5B, right panel), the increasing responses for rightward choices look nearly identical, whereas responses associated with leftward choices do not achieve a stereotyped level of activity but instead retain their dependency on motion strength.
These properties are explained by the underlying computations in the model. Responses aligned to the beginning of the trial reflect the integral of the difference between rightward- and leftward-preferring MT ensemble responses. Because the magnitude and sign of the MT difference signal vary with the strength and direction of motion, the average rate of rise or decay in the LIP response also depends on motion strength and direction. In contrast, during the time period immediately preceding the motor response, the responses of rightward-choice LIP neurons during rightward-choice trials appear similar for all motion strengths. This is because the responses are constrained to cross a common decision threshold 100 ms before saccade initiation regardless of whether the threshold was approached slowly or quickly. Indeed, the precise moment of this crossing is often determined by a brief upward fluctuation in the ensemble LIP signal. On the other hand, rightward-choice neurons are not constrained to cross a common threshold during leftward choices because decision time in that case is governed by the opposing population of LIP neurons, whose RFs align with the left-choice target. Thus, the average responses from rightward-choice neurons (dashed curves) retain a dependency on motion strength even at the end of the decision process culminating in a leftward choice. The model thus explains this puzzling asymmetry that is evident in the data (Fig. 5A).
One somewhat surprising observation is that the simulated response rates achieve peak values well above the decision threshold (Fig. 5B, right panel). This occurs because the LIP ensemble activity is smoothed in the formation of the decision signal, which is then compared to the decision threshold. Such smoothing delays the growth of the decision signal relative to the spike rates of the neurons within LIP. As a result, the spike rates of single LIP neurons continue their trajectory beyond the threshold by the time the decision is made. The degree of overshoot depends on the value of the smoothing time constant, but the effect is seen with even modest smoothing. For a time constant of 100 ms, the model achieves spike rates of 65–70 spikes/s at the end of the simulation, 100 ms before the saccade, similar to the spike rates seen in the data at this time.
Fixed-duration Direction Discrimination
Accuracy
Up to this point we have mainly discussed the reaction time version of the random dot motion task, where subjects report their decision as soon as possible. Random dot motion discrimination has been studied more extensively under conditions in which the subject views the motion stimulus for a duration that is controlled by the experimenter. In these fixed duration (FD) experiments, it is commonly assumed that the subject makes a decision using all the available motion information, that is, that the decision occurs after the motion is turned off. This idea receives experimental support, especially for short stimulus durations in the range of 100–800 ms (Gold and Shadlen, 2000; see also Britten et al., 1992; Burr and Santoro, 2001). This assumption leads to one obvious prediction: accuracy in 1 s FD experiments should be better than the accuracy in RT experiments because the animals typically view motion for less than a second in the RT task.
However, the experimental evidence contradicts this prediction. Figure 6A shows psychometric functions obtained from experiments in which monkeys alternated blocks of trials in the RT task with blocks of trials in a 1 s FD task. The fits are very similar; in fact, performance on the FD task is slightly worse (P < 0.05). This is remarkable considering that the monkey took an average of only 680 ms to initiate an eye movement response in the RT task. Subtracting out visual latencies and motor preparation time, this finding implies that the monkey performed about equally well in the RT task as it did in the FD task even though it had about half the viewing time. Clearly, this is at odds with the most straightforward version of the integration model. If the monkey has twice the viewing time, we would expect an improvement in signal to noise of √2, which would translate to a left shift of the psychometric function by about this same amount (Fig. 6B).
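The expected gain from doubling the viewing time can be worked out under a simple assumption that is not spelled out in this section: the evidence accumulates with a constant mean rate and independent noise from moment to moment. The symbols μ (mean evidence per unit time) and σ (noise scale) are introduced here only for illustration.

```latex
\begin{align}
\text{accumulated signal} &= \mu t, \qquad \text{accumulated noise (s.d.)} = \sigma\sqrt{t},\\
\mathrm{SNR}(t) &= \frac{\mu t}{\sigma\sqrt{t}} = \frac{\mu}{\sigma}\sqrt{t},\\
\frac{\mathrm{SNR}(2t)}{\mathrm{SNR}(t)} &= \sqrt{2} \approx 1.41 .
\end{align}
```

Under this assumption, perfect integration over twice the viewing time improves signal to noise by a factor of about 1.41, not 2.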
First, the hypothesis that both RT and FD tasks use the race-to-threshold mechanism predicts that the performance in the FD task should be similar to that obtained in the RT task so long as the decision threshold is the same (Fig. 6B; compare RT prediction against the FD prediction for integration to threshold). In fact, accuracy should be slightly worse on the FD task because a decision must be rendered from 1 s of stimulus viewing, even if the evidence is below the threshold. In principle, there is no reason to set the threshold to the same level for both FD and RT tasks, but we suspect that this occurs because the monkeys performed RT and FD experiments in alternating blocks of trials on the same day. The critical point is that even when given a fixed amount of time to solve the task, the monkey might operate in the same fashion as in the RT task. We next consider the implications of this hypothesis for neural activity in LIP.
LIP Physiology
Figure 7A (adapted from Roitman and Shadlen, 2002) shows the responses of LIP neurons recorded in 1 s FD experiments; these are the same neurons whose responses in the RT task are plotted in Figure 5A. The responses during FD trials begin with trajectories that are similar to those recorded in RT trials, but as time passes, the response begins to flatten, eventually reaching a plateau rate whose magnitude depends on the strength of the motion. The pattern is best seen in the larger data set of Shadlen and Newsome (2001), shown in Figure 7B, using viewing durations of 0.5, 1 and 2 s.
Relationships between Single Neurons and Behavior on Single Trials
One of the main dividends of the style of model that we have developed is that it portrays the activity of neurons as sequences of spikes that resemble spike trains from neurons in areas MT and LIP, incorporating their associated variability. We can therefore use the model to formulate quantitative predictions about the relationship between neural activity and decisions on a trial-by-trial basis. A basic assumption is that neural signals are represented by the average spike rate from a large ensemble of neurons with similar response properties. Consequently, the impact of any one neuron on the animal's behavior is relatively small. In this section, we examine this assumption quantitatively. First, we consider how trial-to-trial variability in the response of single MT neurons relates to the decisions on those trials and to the monkey's overall performance. Then, we examine how the variability of single neurons in LIP relates to the monkey's RT on individual experimental trials and to the monkey's overall performance.
Single Neurons in Area MT
In their original studies, Newsome and colleagues made two quantitative comparisons between MT neurons and behavior. First, they compared the sensitivity of single neurons to the monkey's behavioral sensitivity in a 2 s FD task (Newsome et al., 1989; Britten et al., 1992). To quantify the neural sensitivity, they compared the distributions of spike counts measured in response to preferred- and null-direction motion. They calculated the probability that a spike count drawn from the distribution of preferred direction responses would exceed a spike count drawn from the distribution of null direction responses. They thus predicted the level of performance that would be achieved if the monkey could compare responses from a pair of neurons tuned to opposite directions of motion, that is, the recorded neuron and one just like it but for the opposite direction preference. This analysis showed that single neurons are surprisingly sensitive to random dot motion: the predicted accuracy versus coherence function from one neuron, which they termed the neurometric function, was strikingly similar to the monkey's psychometric function on average (Fig. 8A). This result implies that the monkey could achieve observed levels of accuracy using only a single pair of MT neurons with opposed direction preferences.
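The probability described above, that a count drawn from the preferred-direction distribution exceeds a count drawn from the null-direction distribution, can be estimated directly from two samples of spike counts. The sketch below is an illustration only: the function name `prob_pref_exceeds_null` and the example counts are hypothetical, and counting ties as one half is a common convention that we assume here rather than a detail taken from the original studies.

```python
import numpy as np

def prob_pref_exceeds_null(pref_counts, null_counts):
    """Estimate P(preferred count > null count) over all pairs of trials.

    Ties contribute one half, so identical distributions give 0.5 (chance).
    """
    pref = np.asarray(pref_counts, dtype=float)[:, None]  # column vector
    null = np.asarray(null_counts, dtype=float)[None, :]  # row vector
    greater = (pref > null).mean()   # fraction of pairs with pref > null
    ties = (pref == null).mean()     # fraction of tied pairs
    return greater + 0.5 * ties

# Hypothetical spike counts from three trials in each direction.
predicted_accuracy = prob_pref_exceeds_null([3, 4, 5], [1, 2, 3])
```

Evaluating this quantity at each motion coherence and plotting it against coherence would trace out a function of the kind the text calls the neurometric function.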
Together, these two results presented a paradox. The similarity between the psychometric function and the neurometric function based on a pair of neurons seemed to suggest that the brain does not combine signals from many MT neurons: more neurons would improve the monkey's accuracy to an unrealistic level. On the other hand, if this were truly the case, the few neurons underlying the decision should be highly correlated with the monkey's choice. A decision based on a comparison between two neurons would yield a CP of 0.85 (Newsome et al., 1989; Shadlen et al., 1996), whereas the average CP was closer to 0.5, suggesting a very weak relationship. Thus the data appear incompatible with either a population-coding model or a model where a few MT neurons determine the decision.
Two solutions to this paradox were proposed by Newsome and colleagues (Shadlen et al., 1996). First, one could assume that the brain uses many MT neurons whose preferred directions are not as well matched to the random dot motion as the neurons that were studied by Newsome et al. (1989), where the motion was precisely aligned to the preferred direction of the neurons. Pooling from such a mixture of optimal and sub-optimal neurons would lead to less improvement in accuracy. However, this resolution was not sufficient to explain the CP. Even with arbitrarily large pools, and regardless of the mix of optimal and sub-optimal neurons, single neurons would be expected to retain a higher degree of correlation with the monkey's decision than observed (the predicted CP was 0.65). This is because the neurons in the pools are thought to covary weakly in their responses. To reduce the predicted CP, they assumed that some additional noise is added during read-out of the MT signal; they referred to this as pooling noise. By adjusting the mixture of neurons and pooling noise, the model could account for both the accuracy of the monkey and the trial-by-trial relationship between single neurons and decisions.
The model in Figure 3 offers a simpler account of these results. When the model is allowed to integrate for a full 2 s, it yields the same problematic predictions as the Shadlen et al. (1996) pooling model: a level of sensitivity that is about two times better than single neurons and the monkey (Fig. 8B), and CP values that are higher than those observed in the monkey (Fig. 8E). Suppose, however, that instead of accumulating evidence for the full duration of the motion stimulus, the monkey reaches a decision when the evidence reaches a threshold, as in the RT task (see Fixed-duration Direction Discrimination). According to this idea, MT spikes would only contribute to the decision during the portion of the trial before the threshold crossing. Importantly, the experimenter does not know when this time occurs and thus includes all spikes from the full 2 s recording into calculations of neural sensitivity and CP.
This alternative decision model resolves the paradox posed by the sensitivity of single MT neurons and the weak trial-by-trial correlation between their variable responses and the monkey's decisions. First, the model no longer attains the 2-fold improvement over the measured sensitivity of single neurons because the decision process uses only a fraction of the 2 s viewing time. In fact, the predicted psychometric function is now approximately the same as the neurometric function calculated from just one typical MT neuron using 2 s of spike discharge (Fig. 8C). This correspondence reflects the fact that the decision takes 500 ms on average, which is only one-quarter of the total viewing time, so the model's accuracy is a factor of 2 lower than the theoretical maximum. Secondly, the predicted CP in this model is reduced to 0.57 (Fig. 8F), which is within the range of values calculated by Britten et al. (1996). Again, this occurs because many of the MT spikes used to compute the CP do not actually contribute to the decision, and thus they show no correlation with the choice on a given trial. Hence, the model advanced here can account for all sources of noise between the sensory representation in MT and the decision. Moreover, as shown in Figure 7, it explains features of the LIP responses recorded in FD experiments.
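The factor of 2 quoted above follows from simple arithmetic if sensitivity is assumed to grow with the square root of the integration time, as expected for the accumulation of independent noisy samples. This scaling is our stated assumption for the purpose of the worked example.

```latex
\[
\frac{\text{sensitivity with a 0.5 s decision}}{\text{sensitivity with full 2 s integration}}
\approx \sqrt{\frac{0.5\ \mathrm{s}}{2\ \mathrm{s}}}
= \sqrt{\tfrac{1}{4}}
= \tfrac{1}{2}
\]
```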
The model makes one prediction that is contradicted by the data. According to our idea, spikes recorded late in the trial are less likely to affect the monkey's choices because they are often recorded after the monkey has made its decision. Assuming, as we have, that these spikes are independent of those occurring earlier in the trial, we would expect that they would exhibit no correlation with the monkey's choices. The data from Britten et al. (1996) appear to contradict this prediction (see their figure 11). We will return to this issue in Discussion.
Single Neurons in Area LIP
Our model instantiates the idea that LIP represents the accumulation of evidence for choosing the direction of motion associated with the choice target in the neuron's response field. As with MT, we assume that ensembles of noisy, weakly correlated, spiking neurons supply the signals used for making decisions about the direction of motion. Therefore, the same intuitions should apply: responses from single LIP neurons should covary weakly with the ensemble signals to which they contribute and therefore with the monkey's behavioral response.
The LIP neurons studied by Shadlen and Newsome (2001) and Roitman and Shadlen (2002) exhibit responses that can be interpreted as motor planning signals. Not surprisingly, in the 100 ms epoch just before the monkey makes an eye movement, the responses are clearly associated with the eye movement response. Indeed, for many neurons, it is possible to predict with 100% reliability the monkey's choice based on the spikes in that trial. A more interesting quantity is the spike rate in the epoch ending 100 ms before eye movement initiation in the RT task (Fig. 9A), about the time when the evidence reaches the threshold for commitment. Although the ensemble response in LIP determines the monkey's choice, individual neurons predict the monkey's choices with only modest reliability (mean CP = 0.73). Interestingly, the model yields a very similar CP value for single neurons in the same epoch (Fig. 9B).
Discussion
Our model instantiates the idea that the decision in the random dots task is based on the time integral of motion evidence represented by neurons in area MT. The simulations described in this paper do not test the idea, but instead explore its implications. In that sense, the model exposes important features of the data, furnishes novel interpretations, and makes predictions about future experiments (see Supplementary Material). In what follows, we provide some intuition for the structure of the model, explain the rationale behind several assumptions, and explore its successes, limitations and alternatives.
Structure of the Model
A substantial body of experimental results indicates that neurons in area MT represent the critical sensory signals on which monkeys base their judgment of random dot motion (for reviews, see Parker and Newsome, 1998; Britten, 2003). This evidence fluctuates in time, reflecting both the variability in the dynamic stimulus and the noise in the neural response (Mazurek and Shadlen, 2002). It must be accumulated in order for the brain to reach an accurate decision. The model suggests that this integrated motion evidence is carried in the responses of sensorimotor association areas such as LIP. When the accumulated evidence reaches a threshold level, the decision process terminates. The higher the threshold, the longer it will take to reach a decision but the more accurate the decision will be, on average. The model thus takes us from the representation of motion to a commitment to a choice.
The model's structure incorporates several assumptions about the nature of the neural signals in these computations. These are detailed in the Supplementary Material along with a description of the mathematical operations used to perform the simulations.
What Do We Learn from the Model?
Integration of MT signals to a decision threshold explains the effect of motion strength on the monkey's behavior in the reaction time (RT) task (Fig. 4). It reconciles the accuracy of the monkey's decisions with the amount of time required to make them. Stronger motion produces a rapid accumulation of evidence toward the correct decision threshold. Weaker motion produces a more random accumulation that meanders between the two decision thresholds, leading to a mixture of correct and incorrect responses. For the 0% coherent motion, the accumulation is equally likely to diffuse toward either threshold.
Accumulation to threshold also accounts for the psychometric function observed in the fixed duration (FD) task, where the experimenter controls the stimulus duration (Fig. 6). This idea resolves a puzzling observation: monkeys achieve a similar level of accuracy on RT and 1 s FD tasks (Fig. 6A), despite the longer viewing times provided in the FD experiment. We propose that this surprising behavior occurs because the monkey employs a threshold of evidence for committing to a decision in the FD task, just as in the RT task (Link and Heath, 1975; Meyer et al., 1988; Ratcliff, 1988; Link, 1992). The similarity in performance suggests that the monkeys used the same level of the decision threshold in the two tasks.
This idea does not imply that monkeys cannot improve their performance with increasing viewing times. If the viewing duration is relatively short, the integrated evidence may fail to reach threshold. In that case the optimal solution is to respond based on the available evidence, which is expected to increase in reliability as a function of t. Behavioral evidence for such integration can be found in FD experiments (Britten et al., 1992; Gold and Shadlen, 2000; Gold and Shadlen, 2003).
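The combination of these two ideas (a threshold that ends integration, and a choice based on the available evidence when the threshold is not reached) can be sketched as follows. This is again a generic accumulator rather than the model's actual implementation; the values of `drift`, `bound` and `dt` are assumptions made for illustration, and positive drift is taken to mean that the upper threshold is the correct choice.

```python
import random

def fd_choice(drift, duration, bound, rng, dt=0.01):
    """Simulate one fixed-duration trial and report whether it was correct.

    Evidence is frozen once either threshold is crossed; otherwise the
    choice follows the sign of the evidence at the end of the stimulus.
    """
    evidence = 0.0
    step_sd = dt ** 0.5
    for _ in range(int(round(duration / dt))):
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        if abs(evidence) >= bound:
            break  # commitment: later input is ignored
    return evidence > 0

def fd_accuracy(duration, drift=0.64, bound=1.0, n_trials=4000, seed=0):
    """Proportion correct for a given viewing duration (in seconds)."""
    rng = random.Random(seed)
    hits = sum(fd_choice(drift, duration, bound, rng) for _ in range(n_trials))
    return hits / n_trials

if __name__ == "__main__":
    for duration in (0.1, 0.25, 0.5, 1.0, 2.0, 4.0):
        print(f"{duration:4.2f} s: proportion correct = {fd_accuracy(duration):.3f}")
```

With these assumed parameters, accuracy should rise with viewing time for short stimuli and then level off once most trials terminate at the threshold before the stimulus ends.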
Integration to threshold explains the time course of the responses from LIP neurons in both RT and FD tasks (Figures 5 and 7). This observation reconciles a remarkable discrepancy in the experimental data: the responses during motion viewing exhibit different time courses depending on whether the monkey is performing the RT or the FD task (Roitman and Shadlen, 2002). Our simulations demonstrate that integration to threshold can account for the time course of the responses in both of these situations, predicting both ramp-like increases and decreases in the RT task and saturating functions of time in the FD task. The model also reconciles some puzzling observations about the LIP activity: (i) the convergence of rising responses preceding eye movements to the RF in the RT task (Fig. 5A,B, solid curves, right panels); (ii) the absence of convergence of declining responses before eye movements away from the RF in the RT task (Fig. 5A,B, dashed curves, right panels); (iii) the absence of convergence of rising responses (i.e. the coherence-dependent saturation) in FD trials (Fig. 7); and (iv) the apparent failure of averaged responses in FD experiments to reach a common threshold.
Because the model explicitly represents the activity of single MT and LIP neurons, it lends insight into the relationship between behavioral measurements and single neuron recordings obtained in the laboratory. By mimicking the spike discharge of single MT neurons, the model explains the observed relationship between MT responses and the monkey's decisions on a trial-to-trial basis (choice probability; Fig. 8D-F). By mimicking the spike discharge of single LIP neurons, the model explains the observed relationship between LIP responses and behavior in the RT task (Figures 9 and 10). The explanation of these single-trial comparisons, which are only available from combined psychophysics-neurophysiology experiments, is the main dividend of our modeling exercise. Finally, the idea that evidence is integrated to a threshold, even when given a fixed viewing duration, serves to reconcile the exquisite sensitivity of single MT neurons with the overall sensitivity of the monkey to random dot motion (Fig. 8A-C).
Model Limitations and Failures
A major limitation of our model is that it fails to specify how key computations are achieved by neurons. It offers no insight into the neural mechanisms that would underlie integration, or direction selectivity, or even addition and subtraction. Nor does the model attempt to explain the mechanism by which the LIP activity is compared to the decision threshold. The biophysical and circuit properties underlying operations such as integration are an active topic of investigation at the theoretical and experimental level (Camperi and Wang, 1997, 1998; Seung et al., 2000; Aksay et al., 2001; Koulakov et al., 2002). Our model circumvents these issues by performing simple mathematical calculations on spike rate signals, but obviously this leaves open the question of how the brain actually accomplishes the calculations.
Similarly, our model provides little insight into the control of the integration process. For example, we do not attempt to explain the early dip and recovery of neural activity that occurs just after onset of random dot motion. These dips are a common feature of neurons with persistent activity (Sato et al., 2001), and we suspect that this phenomenon might play a role in initiating the integration, perhaps serving as a reset. Similarly, we assume that LIP stops receiving sensory input after the decision threshold is crossed, even when motion information remains present as in the FD task. This may be plausible in the context of the random dot motion task, but it is undoubtedly an oversimplification. Clearly, neural integration for high-level cognitive behavior requires complex controls and is susceptible to numerous influences, and our model makes no attempt to capture these various factors.
Finally, neither our data nor our model addresses the question of whether integration actually occurs in LIP or elsewhere, only to be relayed to LIP. Several brain areas show decision-related activity similar to what is observed in LIP, including the prefrontal cortex and the superior colliculus (Kim and Shadlen, 1999; Horwitz and Newsome, 2001). It is unknown whether any or all of these areas are responsible for integration of the evidence from MT, or to what degree their responses simply reflect this process.
In addition to its limitations, the model makes several predictions that are known to be incorrect. The most serious discrepancy between model and data is in the RT associated with error trials. The model predicts relatively fast error responses, whereas monkeys tend to take slightly more time to respond on error trials than on correct trials at each coherence level (Fig. 4). Diffusion or random walk models, which are related to ours, predict that errors and correct trials share the same chronometric function (RT versus motion strength) (Luce, 1986). Our model produces fast errors because of variability in the LIP signal, resulting in early threshold crossings. How might these fast errors be remedied? Ratcliff and Rouder (1998) demonstrated that the inclusion of trial-to-trial variability in the effective stimulus strength produces slower errors in diffusion models sharing features with ours. This source of variability is not incorporated in our model: the expected MT response is identical across trials sharing the same motion strength, which is clearly an oversimplification. Remedying this might bring the predicted error RTs closer to those observed in the data. In addition, we should note that monkeys performing the discrimination task probably make errors for a wide variety of reasons, many of which are not captured by the integration model (e.g. lapses in attention, blinks). Hence, error RTs may be harder to explain than other aspects of the behavior.
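The effect of trial-to-trial variability in stimulus strength on error times can be illustrated with a small simulation of a generic diffusion process. This sketch is not the model presented in this paper and does not reproduce the analysis of Ratcliff and Rouder (1998); the mean drift, the standard deviation of the drift across trials (`drift_sd`) and the threshold height are assumed values, and "correct" is taken to mean reaching the upper threshold.

```python
import random

def run_trial(drift, rng, bound=1.0, dt=0.01, max_time=10.0):
    """Return (correct, decision_time); correct is None if no threshold is hit."""
    evidence, t = 0.0, 0.0
    step_sd = dt ** 0.5
    while t < max_time:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
        if abs(evidence) >= bound:
            return evidence > 0, t
    return None, t

def error_minus_correct_time(mean_drift, drift_sd, n_trials=4000, seed=1):
    """Mean decision time on error trials minus mean time on correct trials.

    The drift is redrawn on every trial from a normal distribution;
    drift_sd = 0 gives the same drift on every trial.
    """
    rng = random.Random(seed)
    correct_times, error_times = [], []
    for _ in range(n_trials):
        drift = rng.gauss(mean_drift, drift_sd)
        correct, t = run_trial(drift, rng)
        if correct is None:
            continue  # undecided trials are dropped
        (correct_times if correct else error_times).append(t)
    mean = lambda xs: sum(xs) / len(xs)
    return mean(error_times) - mean(correct_times)

if __name__ == "__main__":
    print("fixed drift:   ", round(error_minus_correct_time(1.5, 0.0), 3))
    print("variable drift:", round(error_minus_correct_time(1.5, 0.75), 3))
```

With a fixed drift the difference should be close to zero, whereas with variable drift errors arise disproportionately on trials with weak effective evidence, which are also the slow trials, so the difference should be positive.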
There is at least one other prediction of the model that is contradicted by data. In simulations using a fixed 2 s duration, the model posits that MT inputs are ignored after LIP crosses the decision threshold. This predicts that MT spikes occurring early in the viewing period should correlate with the monkey's decision more strongly than spikes occurring later in the trial, which is contrary to the findings of Britten et al. (1996). One possible explanation may be that the monkeys in the Britten et al. study began integrating MT spikes at a variety of latencies during the 2 s motion viewing period. This is consistent with our observations that the animals require much less than 2 s to reach a decision. Some support for this idea can be found in Shadlen and Newsome's study of LIP (Shadlen and Newsome, 2001). They found that exposure to viewing durations shorter than 2 s caused an acceleration in the build-up of LIP activity for trials of all durations. Direct measurements of MT neurons in monkeys trained to perform both FD and RT tasks may help to clarify these issues.
Model Extensions and Alternatives
The analyses presented in this paper show that time integration of sensory evidence to a threshold accounts for a wide variety of behavioral and physiological observations in monkeys trained to discriminate the direction of random dot motion. Several extensions to the basic idea deserve consideration. Variability in the parameters of integration, such as the starting time, the baseline spike rate, and the drift rate, may be helpful in remedying the failures and improving the generality of the model, especially when incorporating prior biases into the decision (Carpenter and Williams, 1995; Ratcliff and Rouder, 1998). Similarly, the decision threshold may be variable and is likely to change as a function of time. A dynamic decision threshold embodies the idea that commitment to one or another behavioral option may need to occur by some time point or with some degree of urgency (Reddi and Carpenter, 2000). In general, the decision threshold is likely to incorporate knowledge of the accuracy that has been achieved, the passage of time, and the rate at which reward is attained (Gold and Shadlen, 2002). These factors will need to be incorporated into a more complete computational model.
The main alternative to temporal integration is a decision process that is not based on an accumulation of information but instead on extreme values in the sensory data, considered as a passing stream. This type of process is often termed probability summation (Watson, 1979). Like integration, probability summation predicts that accuracy and decision time will depend on motion strength. However, probability summation would not predict the existence of neurons like those in LIP that represent a decision variable that evolves gradually. Rather, it predicts that at any moment, the brain is either committed to a decision or it has yet to detect the decisive sensory data. The pattern of neural responses seen in LIP fails to support this alternative, but individual neurons could in principle undergo the kind of state change predicted by this model on individual trials. The idea is that the average of many such switches from an intermediate spike rate to a high spike rate (or a low spike rate for the opposite choice) at different times could give rise to the ramp-like trajectories seen in Figure 5. The evidence at present favors integration over probability summation (see Gold and Shadlen, 2000; Roitman and Shadlen, 2002), but further experiments will be necessary to clarify the matter.
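The point that averaging can turn abrupt single-trial switches into a gradual ramp can be illustrated with a short sketch. The spike rates (`low`, `high`) and the assumption that switch times are uniformly distributed over the trial are hypothetical choices made only for illustration; they are not estimates from the LIP data.

```python
import random

def step_trial(switch_time, times, low=20.0, high=60.0):
    """Rate on one trial: jumps from low to high at switch_time (hypothetical rates)."""
    return [high if t >= switch_time else low for t in times]

def average_rate(n_trials=500, duration=1.0, dt=0.01, seed=0):
    """Average many step-like trials whose switch times differ across trials."""
    rng = random.Random(seed)
    times = [i * dt for i in range(int(round(duration / dt)) + 1)]
    total = [0.0] * len(times)
    for _ in range(n_trials):
        switch_time = rng.uniform(0.0, duration)  # assumed uniform over the trial
        for i, rate in enumerate(step_trial(switch_time, times)):
            total[i] += rate
    return times, [s / n_trials for s in total]

if __name__ == "__main__":
    times, avg = average_rate()
    for i in range(0, len(times), 20):
        print(f"t = {times[i]:.2f} s: mean rate = {avg[i]:.1f} spikes/s")
```

Although every simulated trial contains a single instantaneous jump, the trial average rises approximately linearly, which is why averaged responses alone cannot distinguish the two accounts.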
Concluding Remarks |
---|
Supplementary Material |
---|
Notes |
---|
Address correspondence to Michael N. Shadlen, Department of Physiology and Biophysics, University of Washington Medical School, Box 357290, Seattle, WA 98195-7290, USA. Email: shadlen@u.washington.edu.
References |
---|
Bisley JW, Zaksas D, Pasternak T (2001) Microstimulation of cortical area MT affects performance on a visual memory task. J Neurophysiol 85:187-196.
Britten K (2003) The middle temporal area: motion processing and the link to perception. In: The visual neurosciences (Chalupa WJ, ed.). Boston, MA: MIT Press.
Britten KH, Shadlen MN, Newsome WT, Movshon JA (1992) The analysis of visual motion: a comparison of neuronal and psychophysical performance. J Neurosci 12:4745-4765.
Britten KH, Shadlen MN, Newsome WT, Movshon JA (1993) Responses of neurons in macaque MT to stochastic motion signals. Vis Neurosci 10:1157-1169.
Britten KH, Newsome WT, Shadlen MN, Celebrini S, Movshon JA (1996) A relationship between behavioral choice and the visual responses of neurons in macaque MT. Vis Neurosci 13:87-100.
Bruce CJ, Goldberg ME (1985) Primate frontal eye fields. I. Single neurons discharging before saccades. J Neurophysiol 53:603-635.
Burr DC, Santoro L (2001) Temporal integration of optic flow, measured by contrast and coherence thresholds. Vision Res 41.
Camperi M, Wang X-J (1997) Modeling delay-period activity in the prefrontal cortex during working memory tasks. In: Computational neuroscience (Bower, ed.), pp. 273-279. New York: Plenum Press.
Camperi M, Wang X-J (1998) A model of visuospatial working memory in prefrontal cortex: recurrent network and cellular bistability. J Comput Neurosci 5:383-405.
Carpenter R, Williams M (1995) Neural computation of log likelihood in control of saccadic eye movements. Nature 377:59-62.
Celebrini S, Newsome WT (1994) Neuronal and psychophysical sensitivity to motion signals in extrastriate area MST of the macaque monkey. J Neurosci 14:4109-4124.
Evarts EV, Tanji J (1976) Reflex and intended responses in motor cortex pyramidal tract neurons of monkey. J Neurophysiol 39:1069.
Fukushima K, Kaneko C, Fuchs A (1992) The neuronal substrate of integration in the oculomotor system. Prog Neurobiol 39:609-639.
Funahashi S, Bruce C, Goldman-Rakic P (1989) Mnemonic coding of visual space in the monkey's dorsolateral prefrontal cortex. J Neurophysiol 61:331-349.
Fuster JM, Alexander GE (1971) Neuron activity related to short-term memory. Science 173:652-654.
Fuster JM (1973) Unit activity in prefrontal cortex during delayed-response performance: neuronal correlates of transient memory. J Neurophysiol 36:61-78.
Glimcher PW, Sparks DL (1992) Movement selection in advance of action in the superior colliculus. Nature 355:542-545.
Gnadt JW, Andersen RA (1988) Memory related motor planning activity in posterior parietal cortex of macaque. Exp Brain Res 70:216-220.
Gold JI, Shadlen MN (2000) Representation of a perceptual decision in developing oculomotor commands. Nature 404:390-394.
Gold JI, Shadlen MN (2002) Banburismus and the brain: decoding the relationship between sensory stimuli, decisions, and reward. Neuron 36:299-308.
Gold JI, Shadlen MN (2003) The influence of behavioral context on the representation of a perceptual decision in developing oculomotor commands. J Neurosci 23:632-651.
Goldman MS, Kaneko CR, Major G, Aksay E, Tank DW, Seung HS (2002) Linear regression of eye velocity on eye position and head velocity suggests a common oculomotor neural integrator. J Neurophysiol 88:659-665.
Hernandez A, Zainos A, Romo R (2002) Temporal evolution of a decision making process in medial premotor cortex. Neuron 33:959-972.
Horwitz GD, Newsome WT (2001) Target selection for saccadic eye movements: prelude activity in the superior colliculus during a direction-discrimination task. J Neurophysiol 86:2543-2558.
Kim J-N, Shadlen MN (1999) Neural correlates of a decision in the dorsolateral prefrontal cortex of the macaque. Nat Neurosci 2:176-185.
Koulakov AA, Raghavachari S, Kepecs A, Lisman JE (2002) Model for a robust neural integrator. Nat Neurosci 5:775-782.
Link SW (1992) The wave theory of difference and similarity. Hillsdale, NJ: Lawrence Erlbaum Associates.
Link SW, Heath RA (1975) A sequential theory of psychological discrimination. Psychometrika 40:77-105.
Luce RD (1986) Response times: their role in inferring elementary mental organization. New York: Oxford University Press.
Mazurek ME, Shadlen MN (2002) Limits to the temporal fidelity of cortical spike rate signals. Nat Neurosci 5:463-471.
Meyer DE, Irwin DE, Osman AM, Kounios J (1988) The dynamics of cognition and action: mental processes inferred from speed-accuracy decomposition. Psychol Rev 95:183-237.
Miyashita Y, Chang H (1988) Neuronal correlate of pictorial short-term memory in the primate temporal cortex. Nature 331:68-70.
Newsome WT, Paré EB (1988) A selective impairment of motion perception following lesions of the middle temporal visual area (MT). J Neurosci 8:2201-2211.
Newsome WT, Britten KH, Movshon JA (1989) Neuronal correlates of a perceptual decision. Nature 341:52-54.
Newsome WT, Britten KH, Movshon JA, Shadlen M (1989) Single neurons and the perception of visual motion. Neural mechanisms of visual perception. In: Proceedings of the Retina Research Foundation (Lam DM-K, Gilbert CD, eds), vol. 2, pp. 171-198. The Woodlands, TX: Portfolio Publishing Company.
Parker AJ, Newsome WT (1998) Sense and the single neuron: probing the physiology of perception. Annu Rev Neurosci 21:227-277.
Ratcliff R (1988) Continuous versus discrete information processing: modeling accumulation of partial information. Psychol Rev 95:238-255.
Ratcliff R, Rouder JN (1998) Modeling response times for two-choice decisions. Psychol Sci 9:347-356.
Reddi BA, Carpenter RH (2000) The influence of urgency on decision time. Nat Neurosci 3:827-830.
Robinson DA (1989) Integrating with neurons. Annu Rev Neurosci 12:33-45.
Roitman JD, Shadlen MN (2002) Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. J Neurosci 22:9475-9489.
Salzman CD, Britten KH, Newsome WT (1990) Cortical microstimulation influences perceptual judgements of motion direction. Nature 346:174-177.
Sato T, Murthy A, Thompson KG, Schall JD (2001) Search efficiency but not response interference affects visual selection in frontal eye field. Neuron 30:583-591.
Seung HS, Lee DD, Reis BY, Tank DW (2000) Stability of the memory of eye position in a recurrent network of conductance-based model neurons. Neuron 26:259-271.
Shadlen MN, Newsome WT (2001) Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. J Neurophysiol 86:1916-1936.
Shadlen MN, Britten KH, Newsome WT, Movshon JA (1996) A computational analysis of the relationship between neuronal and behavioral responses to visual motion. J Neurosci 16:1486-1510.
Tanji J, Hoshi E (2001) Behavioral planning in the prefrontal cortex. Curr Opin Neurobiol 11:164-170.
Watson AB (1979) Probability summation over time. Vision Res 19:515-522.
Wurtz RH, Albano JE (1980) Visual-motor function of the primate superior colliculus. Annu Rev Neurosci 3:189-226.