W.M. Keck Laboratory, The Neurosciences Institute, 10640 John Jay Hopkins Drive, San Diego, CA 92121, USA
Jeffrey L. Krichmar, W.M. Keck Laboratory, The Neurosciences Institute, 10640 John Jay Hopkins Drive, San Diego, CA 92121, USA. Email: krichmar{at}nsi.edu.
Abstract
Introduction
In dealing with this degree of complexity, careful experimental analysis and theory building are obviously essential. However, analytic approaches conducted separately at each level are unlikely alone to provide a full picture of neural patterns in a behaving organism. There are obvious limits on the number of levels simultaneously observable during any given experiment. Moreover, despite the power of mathematical and computational approaches, they have not yet provided a multilevel picture of the non-linear relationships between brain and behavioral events.
To confront these issues and complement these approaches, we have adopted a procedure called synthetic neural modeling (Reeke et al., 1990; Edelman et al., 1992). This consists of building devices capable of behavior, providing them with a computationally simulated nervous system based on known biological principles of neuroanatomical organization and physiological activity, and then following the behavioral and neuronal responses of such a construct in real time, in a real-world environment. By following behavioral and brain responses completely at all levels of control in a particular environment, one can formulate a synthetic picture that has heuristic value in interpreting data obtained from behaving animals.
A series of such brain-based devices capable of increasingly sophisticated autonomous performance has been tested over the last decade (Edelman et al., 1992; Almassy et al., 1998; Krichmar et al., 2000; Sporns et al., 2000). In these earlier devices, we demonstrated the learning of perceptual responses and emphasized the role of value systems. Value systems are neural structures that are necessary for an organism to modify its behavior based on the salience or value of an environmental cue (Friston et al., 1994). The value system in a brain-based device is analogous to ascending neuromodulatory systems in that its units show uniform phasic responses when activated and its output acts diffusely over multiple pathways by modulating synaptic change (Schultz et al., 1997; Sporns et al., 2000).
In the present report, we describe the construction and performance of Darwin VII, a device capable of perceptual categorization and conditioned behavior. We have extended previous conditioning experiments to include second-order conditioning and have carried out an extensive analysis of the responses of simulated neuronal units. By probing simultaneous brain and behavioral responses at all levels during perceptual and conditioning tasks, we have obtained several new insights into the organization of autonomous behavior. These include a richer understanding of the effects of individual history on learning, of the possible origins of invariant object recognition in an analogue of the inferotemporal cortex, and of the relation of changes in synaptic efficacy to appetitive and aversive conditioned responses.
Materials and Methods
In the present experiments, the simulated nervous system contained 18 neuronal areas, 19 556 neuronal units, and ~450 000 synaptic connections. Figure 2 shows a high-level diagram of the different neural areas and the synaptic connections between neural areas in the simulated nervous system. Further details of the parameters describing neuronal unit activity and neuronal unit connectivity can be found in the Appendix (see Tables A1 and A2, and Appendix, parts B and C). Each simulation cycle took ~200 ms of real time. A simulation cycle is the period during which the current sensory input is processed, the activities of all neuronal units are computed, the connection strengths of all plastic connections are computed, and motor output is generated (see Appendix, parts A and B). Connections between and within neuronal areas were subject to activity-dependent modification following a value-independent (see Appendix, part C) or value-dependent (see Appendix, part D) synaptic rule. Synaptic modification was determined by both pre- and post-synaptic activity and resulted in either strengthening or weakening of the synaptic efficacy between two neuronal units. We used a modified Bienenstock, Cooper and Munro (BCM) learning rule to govern synaptic change because it has a region in which weakly correlated inputs are depressed and strongly correlated inputs are potentiated (Bienenstock et al., 1982). Simplifying the BCM learning rule by making it piecewise linear and fixing the thresholds resulted in an efficient, biologically based learning rule (see Appendix, part C). Plastic connections that were value-dependent were made between areas involved in responses to salient environmental events [A1/IT → Mapp/Mave, A1/IT → S; see (Aston-Jones and Bloom, 1981; Ljungberg et al., 1992)]. Plastic connections that were not value-dependent were made between areas where perceptual categories were to be learned from experience (VAP → IT, LCoch/RCoch → A1). Non-plastic connections were made between neural areas where there were reflex responses (Tapp/Tave → Mapp/Mave, R → C), as local projections within an area (IT → IT, A1 → A1), or between areas where it was assumed that plasticity had already occurred very early in development [R → VAP; see Crair et al. (1998)]. On the assumption that these synaptic changes do not saturate or persist indefinitely, we used a passive synaptic decay term (see Appendix, part C) to express a decline in synaptic strength in the absence of activity. Activation of the simulated value system (area S, Fig. 2) signaled the occurrence of salient sensory events and contributed to the modulation of connection strengths of all active synapses in the affected pathways (see value-dependent projections in Fig. 2). For example, tasting a block picked up by Darwin VII's gripper is a salient event affecting subsequent behavior that is reinforced or weakened through synaptic change. Area S is thus analogous to an ascending neuromodulatory value system (Schultz et al., 1997; Sporns et al., 2000).
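The synaptic rule described above can be illustrated with a minimal sketch. The exact equations and parameter values are given in the Appendix, which is not reproduced here, so the function shape, the thresholds `theta1` and `theta2`, the slopes `k1` and `k2`, the learning rate `eta`, the decay rate `eps` and the multiplicative use of `value` are all illustrative assumptions rather than the values used in Darwin VII.

```python
def bcm_piecewise(s, theta1=0.2, theta2=0.5, k1=1.0, k2=2.0):
    """Piecewise-linear, BCM-like function of postsynaptic activity s.

    Assumed shape: no change below theta1, depression between theta1 and
    theta2 (deepest at the midpoint), potentiation above theta2.
    """
    mid = (theta1 + theta2) / 2.0
    if s < theta1:
        return 0.0
    if s < mid:
        return k1 * (theta1 - s)   # increasingly negative
    if s < theta2:
        return k1 * (s - theta2)   # negative, returning to zero at theta2
    return k2 * (s - theta2)       # positive: potentiation


def update_weight(c, c0, s_pre, s_post, eta=0.1, eps=0.01, value=None):
    """Return the new connection strength after one simulation cycle.

    c      current strength, c0 its initial strength
    value  None for a value-independent connection; otherwise the activity
           of the value system, which scales the activity-dependent term
           (hypothetical form of the value-dependent rule).
    """
    hebbian = eta * s_pre * bcm_piecewise(s_post)
    if value is not None:
        hebbian *= value
    decay = eps * (c0 - c)         # passive decay toward the initial strength
    return c + decay + hebbian
```

In this sketch a value-dependent connection changes only while the value system is active, whereas the decay term acts on every cycle regardless of activity.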
In the experiments in which individual variation was to be examined, each Darwin VII subject shared the same physical device, but had an instantiation in which the simulated nervous system was unique, as a result of different random initializations within the constraints given by Table A2, in both the connectivity between individual neuronal units and the initial connection strengths between those units. Because the connectivity between neuronal units was constrained by a common set of projections, however, large-scale connectivity (i.e. projections between neural areas) was similar between subjects. Details of the neuro-anatomical constraint parameters for each synaptic projection, as well as parameters for the synaptic efficacy rules and the projection distributions, can be found in the Appendix (part C and Tables A1 and A2).
Darwin VII's environment consisted of an enclosed area with black walls and a floor covered with opaque black plastic panels, on which we distributed stimulus blocks (6 cm metallic cubes) in various arrangements (Fig. 1). The top surfaces of the blocks were covered with removable black and white patterns; the other surfaces of the blocks were featureless and black. All experiments reported in this paper were carried out with multiple exemplars of two basic designs: blobs (several white patches 2–3 cm in diameter) and stripes (width 0.6 cm, evenly spaced). Stripes on blocks in the gripper could be viewed in either horizontal or vertical orientations, yielding a total of three stimulus classes of visual patterns to be discriminated (blob, horizontal and vertical). A flashlight mounted on Darwin VII and aligned with its gripper caused the blocks, which contained a photodetector, to emit a beeping tone when Darwin VII was in the vicinity. The sides of the stimulus blocks were metallic and could be rendered either strongly conductive (good taste or appetitive) or weakly conductive (bad taste or aversive). Gripping of stimulus blocks activated the appropriate taste neuronal units (either area Tapp or area Tave) to a level sufficient to drive the motor areas above a behavioral threshold. In the experiments described in this paper, strongly conductive blocks with a striped pattern and a 3.9 kHz tone were chosen arbitrarily to be positive value exemplars, whereas weakly conductive blocks with a blob pattern and a 3.3 kHz tone represented negative value exemplars.
Basic modes of behavior built into Darwin VII included IR sensor-dependent obstacle avoidance, visual exploration, visual approach and tracking, gripping and tasting, and two main classes of innate behavioral reflex responses (appetitive and aversive). With the exception of obstacle avoidance, selection among the above behaviors was under control of the simulated nervous system. Appetitive and aversive responses were triggered when the difference in activity between motor areas Mapp and Mave exceeded a threshold (Fig. 2). These responses could be activated by taste (the unconditioned stimulus, US, triggering an unconditioned response, UR) or by auditory or visual stimuli (the conditioned stimulus, CS, triggering a conditioned response, CR). Prior to conditioning, taste triggered the behavioral responses; after conditioning, either a visual pattern or an auditory pattern could evoke behavioral responses. Unconditioned appetitive and aversive behavioral responses consisted of prolonged gripping and tasting of a stimulus block, releasing the block, and then turning counterclockwise. Conditioned appetitive responses, which occurred when motor area activity exceeded the threshold before tasting, differed from unconditioned appetitive responses in that a clockwise turn was executed after tasting a block. In conditioned aversive responses, Darwin VII avoided a stimulus block by backing away without picking it up and then turning clockwise. Thus, during the conditioning experiments, in which many stimuli were encountered over an extended period of time, Darwin VII developed perceptual categories that modified its behavioral responses.
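The response selection just described can be sketched as a threshold test on the difference between the two motor areas. This is a simplified illustration, not the device's control code: the function name, the numeric `threshold`, the sign convention (Mapp greater than Mave meaning appetitive) and the use of a single `tasted` flag to separate unconditioned from conditioned responses are assumptions made for the example.

```python
from typing import Optional


def select_response(m_app: float, m_ave: float, tasted: bool,
                    threshold: float = 0.3) -> Optional[str]:
    """Pick a reflex response from mean activity in motor areas Mapp and Mave.

    Returns None while the difference in activity stays at or below the
    behavioral threshold. If the threshold is crossed before the block has
    been tasted, the response counts as conditioned (driven by the CS);
    otherwise it counts as unconditioned (driven by taste, the US).
    """
    diff = m_app - m_ave
    if abs(diff) <= threshold:
        return None
    valence = "appetitive" if diff > 0 else "aversive"
    kind = "unconditioned" if tasted else "conditioned"
    return f"{valence} {kind} response"


# Before conditioning: the motor areas are driven only once taste is present.
early = select_response(m_app=0.8, m_ave=0.1, tasted=True)
# After conditioning: a visual or auditory cue alone crosses the threshold.
late = select_response(m_app=0.1, m_ave=0.7, tasted=False)
```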
Results
Perceptual Categorization
Perceptual categorization is the ability to discriminate and categorize sensory stimuli (Clark et al., 1988; Kilgard and Merzenich, 1998). Development of this ability is obviously necessary for learning and conditioning and, for this reason, was extensively explored in Darwin VII. In primates, the inferotemporal cortex is an area that is believed to be associated with visual object recognition (Tanaka, 1996). In Darwin VII, activity in the simulated inferotemporal cortex, area IT (Fig. 2), provided the basis for visual perceptual categorization. Initially, IT's responses to visual stimuli were weak and diffuse (see IT activity in Fig. 3A). After approximately five stimulus encounters, activity-dependent plasticity between VAP and IT caused IT responses to the different stimuli to become strong, sharp and separable (see IT activity in Fig. 3B). It is this strong, discriminative activity of neuronal groups within IT in response to visual stimuli, together with the appropriate behavioral response, that we refer to as visual categorization in Darwin VII.
In animals, perceptual categorization in the inferotemporal cortex is invariant with respect to differences in position, scale and rotation of an object (Tanaka et al., 1991; Tovee et al., 1994; Ito et al., 1995; Rolls and Tovee, 1995; Tanaka, 1996). Such invariant object recognition has been difficult to achieve in computer vision systems (Mundy and Zisserman, 1992; Mundy et al., 1992; Shashua, 1993; Weinshall, 1993). In the present work, however, Darwin VII's object recognition was observed to be invariant with respect to scale, position and rotation. Visual categorization of a stimulus occurred no matter where an object appeared in Darwin VII's visual field, with the apparent size of the stimulus ranging from a maximum when the object was directly in front of Darwin VII (Fig. 3A) to one-quarter of the maximum size when the object was distal to Darwin VII. Correct categorization of striped blocks in Darwin VII's field of vision, when blocks were not in its gripper, occurred when the stripes on the blocks were rotated over a range of ±30° of a horizontal or vertical reference.
Invariant object recognition required continuous, time-varying sensory input while Darwin VII moved. Invariant responses developed as a result of competition among activity-dependent plastic connections between retinotopically mapped VAP and non-topographically mapped IT. The connections that were potentiated earliest were those with VAP receptive fields corresponding to regions near Darwin VII's gripper, regions where IT responses to the neural stimulus were first sustained (Fig. 4, top, 'First Horizontal Striped Block'). These connections had a competitive advantage; they received not only the earliest but also the longest exposure to the stimulus as a result of the time spent by the block in the gripper. The maintenance of discriminative, persistent patterns of neuronal groups in IT required sustained high activity resulting from strengthening of plastic connections with VAP neuronal units that received continuously varying images of the block. Upon each approach and withdrawal from the stimulus block, the number of potentiated connections increased, resulting in recruitment of neuronal units with receptive fields that responded to visual stimuli beyond Darwin VII's gripper (Fig. 4, top). An example of the resultant activity in VAP and IT during invariant object recognition is shown in the bottom two rows of Figure 4. When the temporal sequence of the images leading to invariance was artificially shuffled (Almassy et al., 1998), invariant object recognition did not occur. As further considered in the Discussion, the invariance arose mainly as a result of an initial strengthening of VAP to IT synapses that was reinforced and expanded by subsequent inputs from the stimulus block during Darwin VII's movements.
Differences in an individual's perceptual history can have an effect on the organization and response of the nervous system. For example, more neurons in the monkey inferotemporal cortex respond to familiar than to unfamiliar stimuli (Kobatake et al., 1998). Using Darwin VII, we performed experiments concerned with experience-dependent effects on categorization during the development of perceptual categories as well as after such development.
We first investigated the effect of variations in presentation frequency of each stimulus class on the development of neuronal unit responses. Darwin VII explored an environment that was partially segregated into two equal-sized areas. One area mainly contained blocks with blobs and the other area mainly contained blocks with stripes. In each of 14 separate experiments, Darwin VII started with an identical simulated nervous system that had not sampled any stimuli. The number of neuronal units in IT responding to a given stimulus (whether blob, horizontal stripe or vertical stripe) increased selectively with an increase in the frequency of presentation of that stimulus class. Statistical significance was tested using Pearson's product-moment correlation coefficient, r. Stimulus presentation frequency was found to be positively correlated with patterned neural activity in IT that was individually characteristic for each of the visual stimulus classes (blob: r = 0.71, P < 0.005; horizontal: r = 0.75, P < 0.003; vertical: r = 0.61, P < 0.03). These findings are similar to the results of neuronal recordings in the monkey inferotemporal cortex in that more IT neurons responded to familiar than to unfamiliar objects in recognition tasks (Kobatake et al., 1998).
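For readers who want to reproduce this kind of analysis, the correlation coefficient can be computed as in the following sketch. The two lists are made-up toy numbers chosen only to exercise the function; they are not data from the experiments, and the variable names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy values: presentations of one stimulus class per run, and
# the number of IT units responding to that class at the end of the run.
presentations = [1, 2, 3, 4, 5, 6]
responding_units = [12, 15, 21, 22, 30, 31]
r = pearson_r(presentations, responding_units)
```

A significance level such as those reported above would additionally require a test of r against the sample size, which is omitted here.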
In these experience-dependent responses, competitive and selective interactions among neuronal units from VAP to IT and within IT governed the changes in the number of those units that responded to a stimulus. An increase in neuronal group size reflected the activity-dependent changes in synaptic connections from neuronal units in VAP to neuronal units in IT, leading to increased activity in IT. Through intrinsic excitatory connections, this increased activity further recruited neighboring neuronal units in IT. The change in neuronal group size was competitive: a group specific to one stimulus could grow at the expense of another neuronal group associated with another stimulus (Clark et al., 1988).
In the second set of experiments on experience-dependent perceptual categorization, we studied the effect of stimulus presentation frequency on neural mapping in IT after visual categories had already been developed. To reach this level of experience, Darwin VII sampled an equal proportion (eight each) of blocks in the three stimulus classes. Darwin VII then sampled either eight additional stimuli containing all three stimulus classes or eight additional stimuli containing any two out of the three stimulus classes. Thus, some stimuli were more frequently sampled than others.
In contrast to the previous experiments on early development, after extensive experience, the number of neuronal units in IT responding to more frequently sampled stimuli did not change significantly, suggesting that responses in IT had become saturated with respect to the familiar stimuli. However, in the experiments in which Darwin VII responded to the less frequently sampled stimulus, the number of IT neuronal units was significantly less than that in the controls (Table 1). Two factors appear to be responsible for these results. First, the growth of the neuronal groups in IT was limited by intrinsic excitatory and inhibitory connections. Recurrent excitation caused the size of the groups to grow, but lateral inhibition kept that growth in check (see IT activity in Fig. 3B). At a certain size, the different neuronal groups that were active in response to a visual stimulus competed with each other and their growth was halted. In essence, the memory for these perceptual categories is stable. Secondly, the decrease in neuronal units responding to an under-sampled stimulus was governed, in part, by the decay rate in the activity-dependent synaptic efficacy rule (see Appendix, part C). This caused the efficacy of each synaptic connection that had not been recently updated to decay towards its original value. If, for example, the blob visual stimulus was not encountered for a protracted period of time, synaptic connections from VAPB to IT weakened and fewer IT neuronal units responded to that stimulus class. In essence, the perceptual category was forgotten.
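The forgetting described here can be made concrete with a short derivation. As an assumption (the Appendix is not reproduced here), suppose that in the absence of activity a connection strength $c$ relaxes toward its initial value $c(0)$ at a fixed rate $\varepsilon$ per simulation cycle, starting from some cycle $t_0$ after which the stimulus is no longer encountered; the symbols are illustrative.

```latex
\begin{align}
c(t+1) &= c(t) + \varepsilon\,\bigl(c(0) - c(t)\bigr), \qquad 0 < \varepsilon < 1,\\
c(t+1) - c(0) &= (1-\varepsilon)\,\bigl(c(t) - c(0)\bigr),\\
c(t_0+n) - c(0) &= (1-\varepsilon)^{n}\,\bigl(c(t_0) - c(0)\bigr).
\end{align}
```

Under this assumed form, the learned excess over the initial strength shrinks geometrically, halving roughly every $\ln 2/\varepsilon$ cycles when $\varepsilon$ is small.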
In contrast to the limited number of cells whose activity can be monitored in live animals, the design of Darwin VII allowed us to record all such activity in all neuronal units. Neurophysiologists often test whether limited samples from brain areas are robust predictors of responses to input stimuli (Bialek and Zee, 1990; Theunissen and Miller, 1991; Brown et al., 1998). It was therefore of interest to investigate whether a sparse sampling of neural patterns in the simulated area IT would reliably predict the response to visual stimuli by Darwin VII. We allowed seven individually different Darwin VII subjects to sample at least 20 aversive and 20 appetitive blocks. For each Darwin VII subject, patterns of activity in IT during the development of visual categories (i.e. exposure to the first 10 aversive and appetitive exemplars) were compared with patterns of activity in IT after categorization (i.e. exposure to the last 10 aversive and appetitive exemplars; see Appendix, part G). The accuracy of classification based on IT activity improved with each stimulus exemplar to near perfect performance (Fig. 6). Classifications remained accurate even when relatively small sub-populations (1% of the neuronal units) in IT were sampled; below this range, prediction failed. The relatively small proportion of neuronal units in IT sufficient to classify responses to a given stimulus is in accord with results in live animals, as seen, for example, in the limited number of hippocampal neurons needed to reconstruct a rat's position in space (Wilson and McNaughton, 1993) or the limited number of motor cortical neurons needed to predict a monkey's hand position (Georgopoulos et al., 1986).
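One simple way to picture classification from a sparse sample is a nearest-template classifier restricted to a random subset of units. The method actually used is described in the Appendix (part G), which is not reproduced here, so the sketch below is an assumed stand-in: the function names, the Euclidean distance measure and the toy activity patterns are all hypothetical.

```python
import random


def sample_units(n_units, fraction, seed=0):
    """Choose a random subset of unit indices, e.g. fraction=0.01 for 1%."""
    rng = random.Random(seed)
    k = max(1, round(n_units * fraction))
    return sorted(rng.sample(range(n_units), k))


def mean_template(patterns, units):
    """Average activity of the sampled units over a list of patterns."""
    return [sum(p[u] for p in patterns) / len(patterns) for u in units]


def classify(pattern, templates, units):
    """Return the label whose template is closest on the sampled units."""
    def squared_distance(label):
        return sum((pattern[u] - t) ** 2
                   for u, t in zip(units, templates[label]))
    return min(templates, key=squared_distance)


# Toy activity patterns over 100 units (illustrative only): each class
# drives a different half of the population.
appetitive = [[1.0] * 50 + [0.0] * 50 for _ in range(10)]
aversive = [[0.0] * 50 + [1.0] * 50 for _ in range(10)]

units = sample_units(100, 0.05)
templates = {
    "appetitive": mean_template(appetitive, units),
    "aversive": mean_template(aversive, units),
}
probe = [0.9] * 50 + [0.1] * 50
label = classify(probe, templates, units)
```

In a real analysis the templates would be built from activity recorded after categorization and the probes drawn from held-out encounters, with accuracy plotted against the sampled fraction.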
In a series of conditioning experiments, Darwin VII was trained to associate the taste value of objects with their visual and auditory characteristics. Weakly conductive objects were assigned innate negative value (bad taste) and strongly conductive objects were assigned innate positive value (good taste). In accord with our prior and arbitrary assignments of block properties, Darwin VII, through experience-dependent learning, associated the blob visual pattern and 3.3 kHz beeping tone with negative value, and the striped visual pattern and 3.9 kHz beeping tone with positive value. Seven individually different Darwin VII subjects participated in the experiments, in which each subject encountered at least 10 appetitive and 10 aversive blocks. In experiments in which only visual stimuli were paired with taste, positive conditioned responses occurred in >70% of trials after encountering the sixth exemplar and in >90% after encountering the tenth exemplar. In auditory conditioning trials, conditioned responses occurred in over 80% of trials following exposure to the sixth exemplar. While performance improved with training, it never reached perfection and occasional mistakes were made. This unpredictability is a property of selectionist systems in general. These are systems consisting of a population of variant repertoires which can be differentially amplified, thus yielding responses to unpredicted or novel events. Such selection has been proposed as being a property of real nervous systems (Edelman, 1987). The unpredictability of behavioral responses in Darwin VII coupled with the variability of a complex environment did not, however, prevent the device from learning after mistakes, from generalizing over sensory inputs, and even from dealing with novel situations.
Early during the conditioning trials, Darwin VII picked up and tasted blocks that led to either appetitive or aversive responses (see Fig. 3A). During this period, it was the output of the taste neuronal units that activated the value system (S) and drove the motor neuronal units (Mapp and Mave) to cause a behavioral response. After conditioning, however, both the value system and the motor neuronal units were immediately activated upon the onset of IT's response to a visual pattern or A1's response to a tone. This shift following learning, from value system activity that was triggered in early trials by the unconditioned stimulus to value system activity triggered at the onset of the conditioned stimulus, is analogous to the shift in dopaminergic neuronal activity found in the primate ventral tegmental area after conditioning (Schultz et al., 1997).
After associating visual patterns with taste, Darwin VII continued to pick up and taste stripe-patterned blocks, but avoided blob-patterned blocks (see Fig. 3B). After associating auditory sounds with taste, Darwin VII continued to pick up the high-frequency beeping blocks, but avoided the low-frequency beeping blocks (see Fig. 3C).
We extended the training paradigm by carrying out second-order conditioning experiments (Rescorla, 1980). In the first stage of conditioning, a single conditioned stimulus (CS1; either the tone or the visual pattern) was paired with taste for ~10 encounters with each block type until learning was achieved. In the second stage of conditioning, the two conditioned stimuli (CS1 and CS2, tone and visual pattern) were paired together for ~10 encounters with each block type. After the second stage, Darwin VII's performance was tested by presenting CS2 alone for 10 encounters of each block type. There were four possible behavioral responses for each stimulus encounter: (i) appetitive unconditioned response, (ii) appetitive conditioned response, (iii) aversive unconditioned response, and (iv) aversive conditioned response. When CS1 was visual and CS2 was auditory (see high tone and low tone on the left side of Fig. 7), Darwin VII made the appropriate appetitive and aversive conditioned responses to auditory stimuli. However, when CS1 was auditory and CS2 was visual, Darwin VII responded incorrectly to visual aversive stimuli (see blobs on the left side of Fig. 7). As we discuss later, this resulted from the fact that, with this sequence of conditioned stimuli in the aversive learning condition, the blocks were avoided before gripping, and thus taste reinforcement could not occur. Examination of the synaptic weights between area IT and the motor neuronal units in this case showed that the connection strengths from IT to Mapp were greater than from IT to Mave (Fig. 8A). By altering the synaptic efficacy function (Fig. 8, inset), we were able to ensure that aversive stimuli evoked a stronger learning response than appetitive stimuli (Fig. 8B). This change led to more balanced synaptic weights and more appropriate behavioral responses (Fig. 7, right side).
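The effect of giving aversive events a larger learning gain can be illustrated with a deliberately simplified toy model. It is not the synaptic efficacy function of Fig. 8: the saturating update, the function name `train`, the gains, and the numbers of reinforced encounters are all assumptions chosen to show the qualitative point that fewer reinforced aversive encounters can be offset by a larger change per encounter.

```python
def train(n_app, n_ave, gain_app=1.0, gain_ave=1.0, eta=0.05):
    """Return (w_app, w_ave) after reinforced encounters of each kind.

    w_app stands in for the IT -> Mapp connection strength and w_ave for
    IT -> Mave. Each reinforced encounter moves the relevant weight a
    fraction eta * gain of the way toward a ceiling of 1.0.
    """
    w_app = 0.0
    w_ave = 0.0
    for _ in range(n_app):
        w_app += eta * gain_app * (1.0 - w_app)
    for _ in range(n_ave):
        w_ave += eta * gain_ave * (1.0 - w_ave)
    return w_app, w_ave


# Assumed scenario: aversive blocks are often avoided before gripping, so
# fewer aversive encounters are reinforced by taste than appetitive ones.
equal_gain = train(n_app=10, n_ave=4)
larger_aversive_gain = train(n_app=10, n_ave=4, gain_ave=2.5)
```

With equal gains the appetitive weight ends up well above the aversive one; with the larger aversive gain the two weights finish much closer together.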
Discussion
These considerations suggest that synthetic modeling of the kind described in this paper may be a useful strategy in attempts to understand higher brain functions. The behavior of Darwin VII shows that a synthetic brain-based device operating on biological principles and without pre-specified instructions can carry out perceptual categorization and conditioned responses. The successful performance of the device rests on the selectional modulation of its neuronal activity by behavior as well as on the existence of constraints provided by its value system. In both the perceptual categorization and conditioning experiments, the development of categorical responses required exploration of the environment and sensorimotor adaptation through specific and highly individual changes in connection strengths. We observed Darwin VII's overall behavior while at the same time recording the state of every neuronal unit and synaptic connection in its simulated nervous system. By collecting these neuronal data, we were able to demonstrate the development of neuronal groups during categorization and recognition by individual subjects (Fig. 5), to show that reliable classification of responses to visual stimuli could be based on the sampling of a small sub-population of neuronal units (Fig. 6), and to relate learning responses to functional changes in synaptic efficacy (Figs 7 and 8).
Darwin VII's nervous system has three features that are critical for understanding the mechanisms underlying perceptual categorization: (i) connectivity from a topographically mapped primary area with transient activity to a non-topographically mapped higher area with more persistent activity; (ii) sensory input that is continuous and temporally correlated with self-generated movement; and (iii) activity-dependent learning in which competitive mechanisms categorize sensory information and select for appropriate behavioral repertoires.
All of these features were necessary for Darwin VII to achieve invariant object recognition. Because a given visual stimulus spent more of the time in Darwin VII's gripper, VAP neuronal units with receptive fields near the gripper were initially selected for and their corresponding connections to neuronal units in IT were potentiated (see Fig. 4, 'First Horizontal Striped Block'). Once localized and patterned activity began in IT, it tended to sustain itself via local recurrent excitation combined with lateral inhibition. Continual overlapping input from VAP as Darwin VII moved toward and away from the stimulus block (Almassy et al., 1998) led to the reinforcement of the specific pattern of changes in synaptic strength from the retinotopically mapped VAP neuronal units to the non-topographic populations of neuronal units in IT. By these means, the activity of VAP neuronal units that drove IT neuronal units into a stimulus-dependent pattern of activity expanded from those with receptive fields near Darwin VII's gripper to those involving an almost complete coverage of the visual field (see Fig. 4, 'Fifth Horizontal Striped Block'). As a result, IT neuronal units were primed to respond to stimuli over a wider range of Darwin VII's visual field. Invariant object recognition is thus a system property that emerges dynamically from competitive neuronal group interactions within and between neural areas. These interactions differ from those of other models in which images are typically static and invariant object recognition is achieved by arranging features to line up across multiple views (Mundy and Zisserman, 1992; Mundy et al., 1992; Shashua, 1993; Weinshall, 1993), by deriving a learning rule that utilizes the temporal trace of neural activity (Wallis and Rolls, 1997; Rolls and Stringer, 2001), or by placing the main responsibility for invariance on neuronal properties alone (Tovee et al., 1994; Rolls and Tovee, 1995).
One striking characteristic of Darwin VII observed under all circumstances was the individuality of the patterns displayed by each subject's neural responses even for repetitions of the same behavior (see Fig. 5). This is consistent with the observation that adaptive behaviors tend to remain similar despite changes in context and variance in system properties resulting from multiple interactions across circuitry, plastic synaptic connections, fluctuating value systems, and variable object encounters. Thus, Darwin VII is structurally and behaviorally degenerate: different circuits and dynamics can yield similar behavior (Tononi et al., 1999; Edelman and Gally, 2001). The developmental experiments comparing responses to strongly biased samples of appetitive or aversive stimuli indicate, however, that even with identical starting architectures, changes in experiential sequences can have profound effects. While this has been documented phenomenologically with living organisms, the experiments reported here may suggest possible mechanisms underlying such epigenetic biases.
The ability to modify various levels of control in Darwin VII provides insights into the neural mechanisms of learning during conditioning. For example, when CS1 was an auditory cue and CS2 was a visual cue, our second-order conditioning experiments revealed an asymmetry that was initially unexpected: a predominance of appetitive conditioned responses over aversive responses that is analogous to the psychological phenomenon of overshadowing. Overshadowing occurs when an intense, salient stimulus gains control of responses at the expense of another less salient stimulus (Pavlov, 1927; Staddon, 1983). In the second-order conditioning experiments where CS1 was an auditory cue and CS2 was a visual cue, behavior similar to that of overshadowing occurred in Darwin VII for two reasons. First, because of the simple tonotopic mapping in A1, responses to auditory stimuli were stronger and easier to categorize than visual stimuli. No overshadowing occurred when CS1 was visual and CS2 was auditory, since visual categories in IT and the appropriate behavioral response developed during primary conditioning when visual stimuli (CS1) were paired with taste (US). Secondly, during the second stage of conditioning when both CS1 and CS2 were present, responses to the reinforcement (i.e. taste) of appetitive stimuli overshadowed aversive learning. This is attributable to the fact that after aversive learning the blocks were avoided before gripping and therefore taste reinforcement did not take place. Thus, in this sequence, Darwin VII generalized incorrectly that all visual stimuli were predictive of positive value. In the appetitive learning condition, this avoidance did not occur and reinforcement came from the US (taste) and CS1 (auditory cue). Conditioning performance more consistent with animal models was obtained by altering the synaptic efficacy so that changes in plasticity were on average larger for aversive events than for appetitive events (see Figs 7 and 8). These results are consistent with the observation that the brain uses different learning rates for punishment and reward and that, in some cases, this difference may be crucial for an organism's survival (Garcia et al., 1955; Siucinska et al., 1999; O'Doherty et al., 2001).
The design of brain-based devices such as Darwin VII that possess neuroanatomical structure and large-scale neural dynamics differs fundamentally from that of robots. Unlike Darwin VII, robotic approaches using classical artificial intelligence are based on data representation, rule-driven algorithms, and the manipulation of formal symbol systems (Moravec, 1983; Nilsson, 1984). Artificial intelligence has been somewhat successful in emulating logical aspects of human behavior, but has been less successful in dealing with perception, categorization and movement in the world, which is a strength of synthetic neural models and brain-based devices (Reeke and Edelman, 1988; Pfeifer and Scheier, 1997). Purely reactive or behavior-based robots carry out actions that are controlled through arbitration of several primitive behavioral repertoires without neural architectures (Brooks, 1986; Arkin, 1993). Evolutionary robotics, in which control systems are selected after each trial or lifetime according to a fitness function (Nolfi and Floreano, 2000), can evolve complex behaviors with very simple systems, but it also does not emphasize neuronal responses. A recent hybrid between evolutionary algorithms and artificial neural network learning rules was designed to mutate learning rules between trials, allowing learning during the lifetime of the robot (Floreano and Mondada, 1998). Typically, however, the artificial neural networks controlling the evolutionary robots' behavior were small (on the order of tens of artificial neural units) and they did not reflect neuroanatomical organization.
In its present form, Darwin VII has several limitations. In comparison to the organisms whose behavior it mimics, it has an extremely simple nervous system. Re-entrant connections (Edelman, 1987) within a neural area are present in the model, but re-entrant connections between neural areas, such as A1 and IT, are currently not implemented. This limits intra-modal and cross-modal interactions, making its behavior purely stimulus driven. Moreover, the activity in a simulation cycle is the average of a relatively small population of neurons over 100–200 ms, and the spiking dynamics of individual neurons cannot presently be explored with this model. Despite these limitations, Darwin VII's performance shows that, regardless of the existence of individual variance, neurally based devices acting in the real world can carry out consistent behaviors.
One might ask why the simulation must include behaviors in the real world. Why not simulate the environment as well as the brain? The answer rests in the constructive nature of the brain and behavioral responses to real-world inputs (Chiel and Beer, 1997; Clark, 1997). For example, to specify the outlines of an environmental object in a pure computer simulation of the environment would contribute an a priori bias in the form of a detailed albeit implicit instruction. In contrast, by acting in the real world, Darwin VII decides for itself on object properties and then constructs appropriate responses. By using a real-world environment, not only is the risk of introducing biases into the model reduced, but also the experimenter is freed from the substantial burden of constructing a highly complex simulated environment (Edelman et al., 1992).
Although the world of Darwin VII is much simpler than a real econiche, there does not seem to be a fundamental restriction on constructing a more complex phenotype to deal with richer environments. Experiments exploring the effects of different neuroanatomical arrangements, of lesions, and of altered synaptic responses are also now possible. As in the present experiments, the behaviors of the resulting brain-based devices would emerge solely as a result of internally generated activity of their nervous systems rather than of responses to any programmed instructions from computer software. Devices of this kind might prove useful in situations of novelty where computation is not possible or in cases of great local complexity where programming proves infeasible. In the near future, such devices are not likely to include behaviors as rich as those of higher vertebrates, and therefore their greatest practical use may at present be to complement computers in a hybrid arrangement, i.e. a brain-based device linked to a conventional digital computer. Since the fundamental operation of such devices includes random fluctuations and unpredictable behaviors, they are not in any strict sense Turing machines. Although the phrase "machine psychology" may thus appear to be a misnomer, it may nevertheless be usefully applied to the behavior of non-living things that learn. In any case, providing such synthetic constructions with increasingly sophisticated neural circuits and body forms should give further valuable insights into the relationships between brain, body and behavior.
Appendix: Specifics of Neuronal Responses, Input and Output in Darwin VII |
---|
The simulated nervous system was run on a Quad Pentium III Xeon Linux workstation capable of communicating with the mobile base. The workstation received visual input via radio frequency (RF) video transmission from a CCD camera mounted on the mobile base (see Appendix, part E). The workstation received auditory and gripper information, and transmitted motor and actuator commands via an RF modem (see Appendix, part F).
B. Neuronal Unit Activity
The total contribution of input to unit i is given by
![]() |
![]() |
![]() |
C. Activity-dependent Synaptic Plasticity
Activity-dependent synaptic changes in c_ij were given by
![]() |
![]() |
See Fig. 8 inset for a representative chart of the function. Specific parameter settings for fine-scale synaptic connections are given in Table A2.
D. Value-dependent Synaptic Plasticity
A value term was computed as
![]() |
![]() |
E. Visual System and its Input
The CCD camera sent 320 × 240 pixel monochrome video images, via an RF transmitter, to an ImageNation PXC200 frame grabber attached to the computer running the neural simulation. The image was clipped, such that only the center square of the image remained, and it was then spatially averaged to produce a 64 × 64 pixel image. Each pixel was normalized between 0 (black) and 1 (white) and mapped directly to neuronal units of area R in the neural simulation. R neuronal units projected retinotopically to neuronal units in neural area VAP, which in turn projected to neural area IT non-topographically (see Fig. 2 and Table A2).
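The clipping, averaging and normalization steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the original implementation: the function names, the 8-bit gray-level range (0 to 255) and the handling of the non-integer 240-to-64 ratio (averaging blocks that are 3 or 4 pixels on a side) are assumptions, since the text does not specify them.

```python
def clip_center_square(frame):
    """Keep only the centered square of a row-major grayscale frame."""
    height, width = len(frame), len(frame[0])
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return [row[left:left + side] for row in frame[top:top + side]]


def spatial_average(square, out_size=64):
    """Average non-overlapping pixel blocks down to out_size x out_size.

    Assumption: block edges are placed by integer division, so for a
    240-pixel side the blocks are 3 or 4 pixels wide.
    """
    side = len(square)
    edges = [i * side // out_size for i in range(out_size + 1)]
    result = []
    for r in range(out_size):
        rows = square[edges[r]:edges[r + 1]]
        out_row = []
        for c in range(out_size):
            block = [v for row in rows for v in row[edges[c]:edges[c + 1]]]
            out_row.append(sum(block) / len(block))
        result.append(out_row)
    return result


def normalize(image, max_level=255):
    """Map raw gray levels to [0, 1]; max_level=255 is an assumed 8-bit range."""
    return [[v / max_level for v in row] for row in image]


def frame_to_area_r(frame):
    """Hypothetical helper: 320 x 240 frame -> 64 x 64 activity values for area R."""
    return normalize(spatial_average(clip_center_square(frame)))
```

Calling `frame_to_area_r` on a frame stored as 240 rows of 320 gray levels returns a 64 × 64 grid of values between 0 (black) and 1 (white), one per neuronal unit of area R.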
F. Auditory System and its Inputs
Microphone input was amplified and filtered in hardware. An RMS (root mean square) chip measured the amplitude of the signal and a comparator chip produced a square waveform that allowed frequency to be measured. Every millisecond, the microcontroller on NOMAD calculated the overall microphone amplitude by averaging the current signal amplitude measurement with the previous three signal amplitude measurements. The microcontroller calculated the frequency of the microphone signal at each time point by inverting the average period of the last eight square waves. LCoch and RCoch each had 64 neuronal units. Their response was based on the frequency and amplitude information received from the microcontroller via the RF modem. Each cochlear neuronal unit had a cosine tuning curve with a tuning width of 1 kHz and a preferred frequency, which ranged over the ensemble of units from 2.9 to 4.2 kHz. Activity of a cochlear neuronal unit was obtained by multiplying the value from its cosine tuning curve by the amplitude of the microphone signal. Cochlear neuronal units projected tonotopically to neuronal units in neural area A1 (see Fig. 2 and Table A2).
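The auditory front end described above can be illustrated with the following Python sketch. It is not the original microcontroller code. The class and function names are hypothetical, the preferred frequencies are assumed to be evenly spaced between 2.9 and 4.2 kHz, and the exact shape of the cosine tuning curve is assumed to be a single half-cycle cosine lobe that reaches zero at half the tuning width on either side of the preferred frequency; the text gives only the width and the frequency range.

```python
import math
from collections import deque

N_UNITS = 64
F_MIN_HZ, F_MAX_HZ = 2900.0, 4200.0
TUNING_WIDTH_HZ = 1000.0


class MicrophoneTracker:
    """Running amplitude and frequency estimates (hypothetical helper)."""

    def __init__(self):
        self.amplitudes = deque(maxlen=4)  # current + previous three samples
        self.periods = deque(maxlen=8)     # last eight square-wave periods (s)

    def add_amplitude(self, value):
        self.amplitudes.append(value)

    def add_period(self, seconds):
        self.periods.append(seconds)

    def amplitude(self):
        if not self.amplitudes:
            return 0.0
        return sum(self.amplitudes) / len(self.amplitudes)

    def frequency_hz(self):
        # Frequency is the inverse of the average period.
        if not self.periods:
            return 0.0
        return len(self.periods) / sum(self.periods)


def preferred_frequencies(n=N_UNITS, lo=F_MIN_HZ, hi=F_MAX_HZ):
    # Assumption: evenly spaced preferred frequencies across the ensemble.
    return [lo + (hi - lo) * k / (n - 1) for k in range(n)]


def cosine_tuning(freq_hz, preferred_hz, width_hz=TUNING_WIDTH_HZ):
    # Assumed curve: 1 at the preferred frequency, 0 at +/- width/2 and beyond.
    offset = freq_hz - preferred_hz
    if abs(offset) >= width_hz / 2:
        return 0.0
    return math.cos(math.pi * offset / width_hz)


def cochlear_activity(freq_hz, amplitude):
    """Activity of each cochlear unit: tuning-curve value times amplitude."""
    return [cosine_tuning(freq_hz, p) * amplitude
            for p in preferred_frequencies()]
```

In this sketch a tone near the middle of the range activates a band of neighboring units most strongly around the unit whose preferred frequency is closest to the tone, which is the kind of graded, ordered response that a tonotopic projection to A1 could use.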
G. Sampling of IT Activity for Classification of Responses
To test the classification of responses to visual stimuli based on IT activity, the patterns of activity in IT during the development of visual categories were compared with templates consisting of individual patterns of IT activity in response to visual stimuli after categorization had developed. Since patterns of activity varied for each Darwin VII subject, a separate template and comparison were needed for each individual. A template was created for each Darwin VII subject by taking the average activity of neuronal units in area IT in response to the last 10 presentations of a particular stimulus class. Templates were made for each of the visual stimulus classes (blob, horizontal, vertical), along with a random-activity template created by shuffling the blob template. The metric used to compare activity of IT with the templates was
![]() |
![]() |
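The template construction described in this section can be sketched as follows. This Python sketch is illustrative only: all function names are hypothetical, and because the comparison metric itself is not reproduced in this text, cosine similarity is used purely as a stand-in and should not be read as the metric used in the study.

```python
import math
import random


def build_template(responses, last_n=10):
    """Average IT activity over the last `last_n` presentations of one class.

    `responses` is a list of activity vectors (one per presentation),
    ordered in time.
    """
    recent = responses[-last_n:]
    n_units = len(recent[0])
    return [sum(r[i] for r in recent) / len(recent) for i in range(n_units)]


def shuffled_template(template, seed=0):
    """Random-activity control: the same values in a shuffled unit order."""
    rng = random.Random(seed)
    shuffled = list(template)
    rng.shuffle(shuffled)
    return shuffled


def cosine_similarity(a, b):
    # Stand-in metric (assumption), not the metric from the original paper.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_matching_class(activity, templates):
    """Name of the template most similar to the given IT activity pattern."""
    return max(templates,
               key=lambda name: cosine_similarity(activity, templates[name]))
```

A separate dictionary of templates (for example with keys `"blob"`, `"horizontal"`, `"vertical"` and `"random"`) would be built for each Darwin VII subject, since activity patterns differed between individuals.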
References |
---|
Arkin RC (1993) Modeling neural function at the schema level: implications and results for robotic control. In: Biological neural networks in invertebrate neuroethology and robotics (Beer RD, Ritzmann RE, McKenna T, eds), pp. 383–409. Boston, MA: Academic Press.
Aston-Jones G, Bloom FE (1981) Norepinephrine-containing locus coeruleus neurons in behaving rats exhibit pronounced responses to non-noxious environmental stimuli. J Neurosci 1:887–900.
Bialek W, Zee A (1990) Coding and computation with neural spike trains. J Stat Phys 59:103–115.
Bienenstock EL, Cooper LN, Munro PW (1982) Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex. J Neurosci 2:32–48.
Brooks RA (1986) A robust layered control system for a mobile robot. IEEE J Robotics Automation 2:14–23.
Brown EN, Frank LM, Tang D, Quirk MC, Wilson MA (1998) A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells. J Neurosci 18:7411–7425.
Chiel HJ, Beer RD (1997) The brain has a body: adaptive behavior emerges from interactions of nervous system, body and environment. Trends Neurosci 20:553–557.
Clark A (1997) Being there. Putting brain, body, and world together again. Cambridge, MA: MIT Press.
Clark SA, Allard T, Jenkins WM, Merzenich MM (1988) Receptive fields in the body-surface map in adult cortex defined by temporally correlated inputs. Nature 332:444–445.
Crair MC, Gillespie DC, Stryker MP (1998) The role of visual experience in the development of columns in cat visual cortex. Science 279:566–570.
Edelman GM (1987) Neural Darwinism: the theory of neuronal group selection. New York: Basic Books.
Edelman GM, Gally JA (2001) Degeneracy and complexity in biological systems. Proc Natl Acad Sci USA 98:13763–13768.
Edelman GM, Reeke GN, Gall WE, Tononi G, Williams D, Sporns O (1992) Synthetic neural modeling applied to a real-world artifact. Proc Natl Acad Sci USA 89:7267–7271.
Floreano D, Mondada F (1998) Evolutionary neurocontrollers for autonomous mobile robots. Neural Netw 11:1461–1478.
Fox K (1995) The critical period for long-term potentiation in primary sensory cortex. Neuron 15:485–488.
Friston KJ, Tononi G, Reeke GN, Sporns O, Edelman GM (1994) Value-dependent selection in the brain: simulation in a synthetic neural model. Neuroscience 59:229–243.
Garcia J, Kimeldorf DJ, Koelling RA (1955) A conditioned aversion towards saccharin resulting from exposure to gamma radiation. Science 122:157–159.
Georgopoulos AP, Schwartz AB, Kettner RE (1986) Neuronal population coding of movement direction. Science 233:1416–1419.
Ito M, Tamura H, Fujita I, Tanaka K (1995) Size and position invariance of neuronal responses in monkey inferotemporal cortex. J Neurophysiol 73:218–226.
Kato N, Artola A, Singer W (1991) Developmental changes in the susceptibility to long-term potentiation of neurones in rat visual cortex slices. Brain Res Dev Brain Res 60:43–50.
Kilgard MP, Merzenich MM (1998) Cortical map reorganization enabled by nucleus basalis activity. Science 279:1714–1718.
Kirkwood A, Lee HK, Bear MF (1995) Co-regulation of long-term potentiation and experience-dependent synaptic plasticity in visual cortex by age and experience. Nature 375:328–331.
Kobatake E, Wang G, Tanaka K (1998) Effects of shape-discrimination training on the selectivity of inferotemporal cells in adult monkeys. J Neurophysiol 80:324–330.
Krichmar JL, Snook JA, Edelman GM, Sporns O (2000) Experience-dependent perceptual categorization in a behaving real-world device. In: Animals to animats 6: Proceedings of the Sixth International Conference on the Simulation of Adaptive Behavior (Meyer J-A, Berthoz A, Floreano D, Roitblat H, Wilson SW, eds), pp. 41–50. Cambridge, MA: MIT Press.
Ljungberg T, Apicella P, Schultz W (1992) Responses of monkey dopamine neurons during learning of behavioral reactions. J Neurophysiol 67:145–163.
Moravec HP (1983) The Stanford cart and the CMU rover. Proc IEEE 71:872–884.
Mundy J, Zisserman A, eds (1992) Geometric invariance in computer vision. Cambridge, MA: MIT Press.
Mundy JL, Welty RP, Brill MH, Payton PM, Barrett EB (1992) 3-D model alignment without computing pose. Image understanding workshop, Morgan Kaufmann, San Mateo, CA.
Nilsson N (1984) Shakey the robot. Menlo Park, CA: SRI International.
Nolfi S, Floreano D (2000) Evolutionary robotics: the biology, intelligence, and technology of self-organizing machines. Cambridge, MA: MIT Press.
O'Doherty J, Kringelbach ML, Rolls ET, Hornak J, Andrews C (2001) Abstract reward and punishment representations in the human orbitofrontal cortex. Nat Neurosci 4:95–102.
Pavlov IP (1927) Conditioned reflexes. London: Oxford University Press.
Pfeifer R, Scheier C (1997) Sensory–motor coordination: the metaphor and beyond. Robotics Autonomous Syst 20:157–178.
Reeke GN, Edelman GM (1988) Real brains and artificial intelligence. Daedalus Proc Am Acad Arts Sci 117:143–173.
Reeke GN, Sporns O, Edelman GM (1990) Synthetic neural modeling: the Darwin series of recognition automata. Proc IEEE 78:1498–1530.
Rescorla RA (1980) Pavlovian second-order conditioning: studies in associative learning. Hillsdale, NJ: Lawrence Erlbaum Associates.
Rolls ET, Stringer SM (2001) Invariant object recognition in the visual system with error correction and temporal difference learning. Netw Comput Neural Syst 12:111–129.
Rolls ET, Tovee MJ (1995) The responses of single neurons in the temporal visual cortical areas of the macaque when more than one stimulus is present in the receptive field. Exp Brain Res 103:409–420.
Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275:1593–1599.
Shashua A (1993) Projective depth: a geometric invariant for 3D reconstruction from two perspective/orthographic views and for visual recognition. International Conference on Computer Vision, Berlin.
Siucinska E, Kossut M, Stewart MG (1999) GABA immunoreactivity in mouse barrel field after aversive and appetitive classical conditioning training involving facial vibrissae. Brain Res 843:62–70.
Sporns O, Almassy N, Edelman GM (2000) Plasticity in value systems and its role in adaptive behavior. Adaptive Behav 8:129–148.
Staddon JER (1983) Adaptive behavior and learning. Cambridge: Cambridge University Press.
Tanaka K (1996) Inferotemporal cortex and object vision. Annu Rev Neurosci 19:109–139.
Tanaka K, Saito H, Fukada Y, Moriya M (1991) Coding visual images of objects in the inferotemporal cortex of the macaque monkey. J Neurophysiol 66:170–189.
Theunissen FE, Miller JP (1991) Representation of sensory information in the cricket cercal sensory system. II. Information theoretic calculation of system accuracy and optimal tuning-curve widths of four primary interneurons. J Neurophysiol 66:1690–1703.
Tononi G, Sporns O, Edelman GM (1999) Measures of degeneracy and redundancy in biological networks. Proc Natl Acad Sci USA 96:3257–3262.
Tovee MJ, Rolls ET, Azzopardi P (1994) Translation invariance in the responses to faces of single neurons in the temporal visual cortical areas of the alert macaque. J Neurophysiol 72:1049–1060.
Turrigiano GG, Nelson SB (2000) Hebb and homeostasis in neuronal plasticity. Curr Opin Neurobiol 10:358–364.
Wallis G, Rolls ET (1997) Invariant face and object recognition in the visual system. Prog Neurobiol 51:167–194.
Weinshall D (1993) Model based invariants for 3-D vision. Int J Comput Vis 10:27–42.
Wilson MA, McNaughton BL (1993) Dynamics of the hippocampal ensemble code for space. Science 261:1055–1058.