1 Department of Psychology, University of Hull, Hull HU6 7RX, UK, 2 Department of Methodology & Statistics, Utrecht University, 3508 TC Utrecht, The Netherlands, 3 School of Psychology, University of St. Andrews, Fife KY16 9JU, UK
Abstract
Key Words: feature integration, high-level visual processing, temporal cortex, ventral visual stream
Introduction
Existing ideas about where in the brain the processing of the different features of a visual stimulus occurs are heavily influenced by the Ungerleider–Mishkin model (Ungerleider and Mishkin, 1982), and by a subsequent adaptation by Milner and Goodale (1995). The Ungerleider–Mishkin model essentially distinguishes two visual cortical streams: a dorsal 'where' stream, extending into the inferior parietal cortex, primarily dealing with the spatial position of objects; and a ventral 'what' stream, extending into the inferior temporal cortex, dealing with the shape and identity of objects (Desimone and Ungerleider, 1989; Haxby et al., 1991; Köhler et al., 1995). Milner and Goodale questioned the strict what–where dichotomy, and suggested that space and form are processed in both parietal and temporal areas, but for different purposes (e.g. Goodale et al., 1991; Milner and Goodale, 1995). In their view, the ventral stream subserves visual perception, i.e. object and scene recognition, requiring allocentric spatial coding to represent the enduring characteristics of objects. This idea has gained support from studies at the cellular level. For example, Dobbins et al. (1998) reported cells coding for object distance in area V4 within the ventral stream.
An area of great interest with respect to visual integration is the anterior part of the Superior Temporal Sulcus (STSa; encompassing both the upper and lower banks and the fundus). A subset of the STSa, consisting of the upper bank and fundus, is often called STP (Superior Temporal Polysensory area; Bruce et al., 1981), as many of the cells in this area respond to auditory and/or somesthetic stimuli in addition to visual stimuli. The STSa is often reported as a focus for the processing of the visual appearance of the face and body, body postures and actions (Gross et al., 1972
; Perrett et al., 1982
; 1985a,b, 1989b; Jellema and Perrett, 2002
). The cells often generalize their selectivity for faces across changes in size, retinal position, orientation, the species (human or monkey), luminance and colour (e.g. Bruce et al., 1981
; Perrett et al., 1984
, 1989b; Rolls and Baylis, 1986
; Ashbridge et al., 2000
). The upper bank of the STS is thought to form an interface between the ventral and dorsal streams (Karnath, 2001
). Indeed, it has been demonstrated that activity of single visual STSa cells is determined by information about both form and motion of animate objects (Oram and Perrett, 1994
, 1996; Tanaka et al., 1999
; Jellema and Perrett, 2003
).
Although early observations of cells in STSa sensitive to spatial cues were made by Bruce et al. (1981), and sensitivity to spatial position has been documented in posterior regions of the STS (STPp) by Hikosaka et al. (1988
), the influence of spatial location on STSa cells and interaction with other visual cues has been largely unexplored. Recently, we discovered that some cell populations in STSa are sensitive to the spatial location of animate objects that moved out of sight behind a screen (Jellema and Perrett, 1999
; Baker et al., 2000
, 2001). These findings prompted further investigation of whether single-cell sensitivity for location is combined with form and motion sensitivity and, if so, in what way. The current study provides the first detailed demonstration that single STSa cells integrate information about the form, motion and location of animate objects. We discuss the implications of our findings for ideas about higher-order visual integration.
Materials and Methods
The experiments were performed on two awake rhesus macaque monkeys (Macaca mulatta, aged 4–6 years). A detailed description of the surgical procedures can be found elsewhere (Oram and Perrett, 1996). Animal care and experimental procedures were performed in accordance with UK Home Office guidelines.
Recording
Single cell recordings (using standard methods, see Oram and Perrett, 1996) were made while the monkey was seated in a primate chair. Spikes were captured online onto a PC (CED 1401plus and Spike2 software, Cambridge Electronic Design, UK). Additionally, spikes were stored on an audio track of a HiFi videotape recorder. The stimulus events (seen from the subject's perspective) were recorded with a video camera and stored simultaneously on the video track of the same tape. Eye movements were recorded with a second (infrared-sensitive) camera mounted on the primate chair. The signals from the two cameras were integrated (Panasonic WJ-AVE7 VHS video mixer) prior to recording. The signal from the eye camera was also recorded separately on a second videotape recorder, synchronized with a time-code generator and frame counter (VITC, Horita VG-50), for offline analysis of eye position.
Stimuli and Testing Procedure
The visual stimulus consisted of a 3-D live presentation of a human agent positioned within the testing room. The agent walked toward or away from the subject in a compatible (i.e. walking forward, head and body facing the same direction as overall movement) or in an incompatible manner (walking backward, head and body facing in the opposite direction to overall movement). This allowed for testing cell sensitivity to three basic features of the visual stimulus: motion, form, and location, with two levels within each modality. The levels of the factor motion were motion away from the subject and motion toward the subject; the levels of the factor form were front view of body and back view of body; and the levels for the factor location were near to subject and far away from subject. The walking agent was chosen as stimulus because it allowed for easy manipulation of the three factors. Furthermore, STSa cells respond maximally to animate actions, of which walking actions are especially well represented (cf. Oram and Perrett, 1996).
The testing procedure consisted of systematic manipulation of the levels of the three factors motion, form and location. In a typical experiment the agent walked (compatibly or incompatibly) toward or away from the subject. The total walking distance per trial was 4.5 m (walking velocity 1 m/s), between the subject and the opposite wall. The walking space in the testing room (dimensions: 5 m depth and 4 m width, relative to the subject) was pragmatically divided into three zones: near (between 1 and 2 m from the subject), middle (2–3 m) and far (3–4 m) (see Fig. 1A for a plan view of the testing room). Only the near and far locations were used in the analysis, resulting in 2 × 2 × 2 = 8 conditions.
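As a quick sanity check of the factorial design, the eight analysed conditions can be enumerated; the level labels below are informal shorthand, not terms taken from the paper:

```python
from itertools import product

# Two levels per factor, as described in the text.
factors = {
    "motion":   ["away", "toward"],
    "form":     ["front view", "back view"],
    "location": ["near", "far"],
}

# Crossing the levels gives the 2 x 2 x 2 = 8 analysed conditions
# (the middle zone of the walking space is excluded from analysis).
conditions = list(product(*factors.values()))
print(len(conditions))  # 8
for motion, form, location in conditions:
    print(f"{motion:6s} | {form:10s} | {location}")
```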
The velocity of walking (1 m/s), and the gait and appearance of the agent, were kept constant (one and the same agent was used for the live 3-D presentations). Other agents (with different gait and appearance) were regularly tested in addition to the standard one, but this was never observed to produce different results.
Definition of Cells Sensitive to Walking
A cell was defined as sensitive to walking when it responded significantly more to either forward or backward walking in at least one of the four directions tested (toward and away, and to the left and right, with respect to subject), compared to a range of control stimuli. No cells were found to respond to both forward and backward walking. In cases of cell sensitivity to, for example, the back view of the agent walking away from the subject, the primary control conditions consisted of the static back view of the agent (to exclude cell sensitivity to just the back view of the head and/or body), and of the agent walking away from the subject in backward manner (to exclude sensitivity to just the direction of motion, and at the same time to exclude sensitivity to spatial position, irrespective of the shape of the object). Note that the latter control condition also served as experimental condition. The necessity of the walking action was further tested by presenting (i) single arm or leg movements, which formed part of the preferred walking action [cell selectivity for individual limb articulation has been reported (Perrett et al., 1985b; Jellema et al., 2000
)]; (ii) a variety of other whole-body actions; and (iii) moving non-animate control objects. The latter consisted of rigid objects of comparable size (e.g. a large screen on wheels covered by a lab coat), which were pushed at similar velocity along the walking trajectories by the experimenter (who remained out of sight behind the screen). Cells defined as responding to walking were significantly less excited by each of these control conditions.
Multiple Visual Cues Contribute to Each of the Three Factors
One should bear in mind that multiple visual cues are likely to have contributed to each of the three main factors motion, form and location, and that the computations involved in producing each individual factor may have taken place either inside or outside STSa.
At least two sources of visual form information exist: form from physical body cues with rigid motion, and form-from-motion (i.e. the typical pattern of articulation of the limbs characteristic of forward or backward walking). Previous studies from our lab indicated that both sources contribute to STSa cell sensitivity for walking agents (Oram and Perrett, 1994, 1996). Virtually all STSa cells sensitive to walking agents are sensitive to form cues derived or available from the body moving rigidly. This is suggested by the finding that equivalent translations of the body, in which the direction of motion and the body/head view were the same as in the preferred walking action, also excited the cells, but at a reduced firing rate compared to the articulated walking action (Oram and Perrett, 1996). The body translations consisted of an agent standing on a mobile platform while the platform was made to move. The reduced firing rate during body translations indicates that the limb articulation during walking contributed to the responses to walking. STSa cell sensitivity for form-from-articulated-motion is further supported by findings that ~25% of cells sensitive to a walking agent responded to biological motion stimuli corresponding to the specific walking action, but again at a reduced firing rate (Oram and Perrett, 1994). In the biological motion condition the form of the body is defined only by the motion of light patches attached to the points of limb articulation. We therefore assume that all STSa cells sensitive to a walking agent use form information from body cues (e.g. head and body view), and to a certain extent the biological motion of the articulating limbs.
Visual cues to an object's spatial location are numerous and include the retinal image size of the object (especially for familiar objects), expansion/contraction, disparity, and environmental cues to distance. Again, it is likely that several of these cues contributed to the sensitivity for spatial location. It is unclear to what extent location may have been computed inside STSa, or elsewhere and fed into STSa. Findings of spatial sensitivity in single neurons in V4 suggest that distance may be computed at stages earlier in the processing chain than STSa (Dobbins et al., 1998).
The motion of the object is most likely coded in the prestriate areas MT/MST (V5) and fed into STSa (Boussaoud et al., 1990; Oram and Perrett, 1996).
In the present study we did not attempt to assess the relative contributions of the different visual cues, because we were interested in the contribution of each factor per se, irrespective of its origin.
Stimulus Presentation
The stimuli were typically presented live from behind a fast rise-time liquid crystal shutter (aperture 20 × 20 cm at a distance of 15 cm). Between five and 12 repetitions were used per condition. In some cases a mechanical shutter with a larger aperture was used to avoid narrowing the subject's scope of view. In addition to live presentation, stimuli were sometimes presented on film projected onto a screen at life size. The video stimuli were made with a camera positioned at the subject's location to produce a realistic image. The live stimuli were shown at 1–4 m distance from the subject. Retinal images of live presented bodies varied from 67° × 23° (vertically × horizontally) at 1.5 m distance, to 28° × 9° at 4 m distance. Control stimuli consisted of objects of comparable size moved in compatible ways.
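The reported retinal image sizes follow from the standard visual-angle formula. The sketch below assumes a body of roughly 2.0 m height and 0.6 m width (dimensions not stated in the text) and reproduces the quoted angles to within a degree:

```python
import math

def visual_angle_deg(extent_m: float, distance_m: float) -> float:
    """Visual angle subtended by an object of a given linear extent
    viewed at a given distance: 2 * atan(extent / (2 * distance))."""
    return math.degrees(2 * math.atan(extent_m / (2 * distance_m)))

# Assumed body dimensions (not given in the text): ~2.0 m tall, ~0.6 m wide.
height, width = 2.0, 0.6

print(round(visual_angle_deg(height, 1.5)))  # 67 deg vertically at 1.5 m
print(round(visual_angle_deg(width, 1.5)))   # 23 deg horizontally at 1.5 m
print(round(visual_angle_deg(height, 4.0)))  # 28 deg vertically at 4 m
print(round(visual_angle_deg(width, 4.0)))   # 9 deg horizontally at 4 m
```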
Analysis
Offline spike sorting was routinely performed (Spike2, Cambridge Electronic Design, UK). Spike counts were obtained during 1 s epochs in which the agent was walking in the near and far zones; the middle zone was discarded from the analysis. The analysis epoch did not start directly following shutter opening: upon shutter opening the subject was confronted with the sight of the agent standing still, waiting to commence walking. This image lasted for 2 s, after which walking started. The analysis epoch started after walking had begun, thus excluding the acceleration at the start, and ended before the end of the walk, so as to exclude the deceleration at the end. This was done to avoid confounding the results with cell sensitivity to changes in velocity.
Cell responses were analyzed using ANOVAs and Newman–Keuls post-hoc testing (significance level = 0.05). In addition, multiple regression analysis (with effect coding: −1 and +1 for the two levels of each factor) was performed on each individual cell tested in all eight stimulus conditions, to fit a linear equation to the responses. This allowed an estimate to be made of the relative weight of each factor, and of all two-way and the three-way factor interactions. Only cells tested in all eight stimulus conditions were included in the analysis (n = 31).
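The effect-coded regression can be sketched as follows. The spike counts are invented for illustration; with eight orthogonal predictors and eight condition means the fit is exact, so the recovered coefficients play the role of the factor weights reported in Table 1:

```python
import numpy as np
from itertools import product

# Effect codes (-1 / +1) for the two levels of each factor
# (motion, form, location), one row per condition: 8 x 3.
codes = np.array(list(product([-1, 1], repeat=3)), dtype=float)
m, f, l = codes.T

# Design matrix: intercept, three main effects, three two-way
# interactions, and the three-way interaction (8 x 8).
X = np.column_stack([np.ones(8), m, f, l, m * f, m * l, f * l, m * f * l])

# Hypothetical mean spike rates (spikes/s) for one cell in the eight
# conditions -- illustrative numbers only, not data from the paper.
y = np.array([18.0, 11.0, 4.0, 2.0, 15.0, 10.0, 3.0, 1.0])

# The design is orthogonal and square, so least squares recovers the
# weights exactly; beta[0] is the intercept (grand mean rate).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 2))
```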
Eye position was analyzed offline (Iview, Sensomotoric Instruments, Germany). Statistical analysis of the percentages of time the subject spent fixating with different eye positions during the recording periods indicated that the response magnitude was not related to the pattern of fixation.
Cell Localization
A detailed description of the cell localization procedure can be found elsewhere (Jellema et al., 2000). At completion of each experiment, frontal and lateral X-ray photographs were taken with the electrode still in place, to locate the electrode and the recorded cells with respect to specific bone landmarks (Aggleton and Passingham, 1981
). During the final experiment, electrolytic microlesions were produced at the site of recording. The subject was then sedated and given a lethal dose of anaesthetic. After transcardial perfusion the brain was removed, and coronal sections (25 µm) were cut, photographed and stained. The X-ray photographs were aligned with the histological sections to determine the cell locations (accuracy ~1 mm).
Although the histological reconstructions indicated that all recordings in the current study were made from cells in the upper bank and fundus of the STSa (i.e. STP, polysensory area; Fig. 4B), we used the term STSa for the sake of consistency with our previous studies. Also, the error in the accuracy of reconstructions (1 mm) means that we cannot exclude the possibility that some cells may have been located in the lower bank of STSa, outside STPa.
Results
Cells Sensitive to the Spatial Location of the Agent
Eighteen out of the 31 cells (58%) showed sensitivity for the spatial location of the walking agent. These cells responded significantly differently depending on where the agent was walking: in the near or in the far location. Two typical examples of such cells are given in Figures 1 and 2.
Figure 2 illustrates another cell, which responded maximally to walking at the near location, provided the front view of the body was visible and the direction of motion was away from the subject. Again, changing the body view or the motion direction abolished responses [F(7,40) = 18.56, P < 0.00001]. Post-hoc testing showed that the combination motion away/front body view/near location evoked a larger response than all other combinations (P < 0.0002).
Cells not Sensitive to the Spatial Location of the Agent
In the remaining 13 of the 31 cells no sensitivity for location was found. A relatively large number of these cells responded in an object-centred manner (9/13, 69%). That is, they coded for one type of walking, either forward or backward walking, irrespective of the direction of motion and body view (cf. Perrett et al., 1989b). Object-centred coding was also found in cells that were sensitive to location, albeit less frequently (5/18, 28%). Equal numbers of cells responded to forward and to backward walking.
The cell illustrated in Figure 3 is a typical example of an object-centred cell, insensitive to the location of the agent. This cell spiked vigorously as long as the agent walked forward, either away from the subject (Fig. 3A, top left) or toward the subject (bottom right). Backward walking in both directions evoked significantly smaller responses (top right and bottom left) [F(7,62) = 24.4, P < 0.00001]. In none of the conditions did the response significantly differ between the near and far locations (P > 0.1). The stimulus combinations in which body view and motion direction indicated forward walking (front view, motion toward; back view, motion away) evoked significantly larger responses than the combinations indicating backward walking (P < 0.0002), at both the near and the far location. The sensitivity to forward walking also extended into other directions, i.e. to the left and right of the subject (Fig. 3B). No location sensitivity was found between the left and right locations (P > 0.1).
|
Population Response
Table 1 summarizes the multiple regression analyses performed on each individual cell. The main effects for the factors motion (M), form (F) and location (L), and the two- and three-way interactions, are indicated. The table should be read as follows (using cell 1 as an example): the intercept value (9.0) represents the mean number of spikes/s over all eight conditions. The entry in column M gives the number of spikes that should be added or subtracted in case of a main motion effect: the entry (−7.5) is multiplied by the effect code (−1 or +1). Since motion away was given an effect code of −1, the effect of walking away is an increase in spiking activity of 7.5 spikes, to 16.5 spikes/s. Walking toward (effect code: +1) results in a decrease in spiking activity of 7.5 spikes, to 1.5 spikes/s. Similarly, the F and L columns give the increase/decrease in spiking activity in case of a main effect of form (effect code −1 for back body view, +1 for front body view) and location (effect code −1 for near, +1 for far), respectively. To obtain the contributions of the two- and three-way interactions, the entry is multiplied by the effect codes of the appropriate levels. Thus, the contribution of the three-way interaction between, for example, motion away (−1), back view (−1) and far (+1) is an increase of 2.3 spikes [(−1) × (−1) × (+1) × 2.3 = +2.3], to 11.3 spikes/s (a non-significant effect). In this way, the mean cell response in each of the eight conditions can be calculated. The results of these calculations are given in Table 2, where C1 to C8 are the eight conditions tested.
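The worked arithmetic for cell 1 can be written out directly. Only the intercept (9.0), the motion weight and the three-way weight (2.3) are stated in the text; the motion weight is taken as −7.5 so that the away code (−1) produces the described +7.5 spikes/s increase:

```python
# Reconstructing the Table 1 arithmetic for cell 1 (other weights omitted).
b0 = 9.0    # intercept: mean rate over all eight conditions (spikes/s)
bM = -7.5   # main-effect weight for motion (column M)
bMFL = 2.3  # three-way interaction weight (column MFL)

# Main motion effect: code -1 (away) adds 7.5 spikes/s, +1 (toward) subtracts.
rate_away = b0 + bM * (-1)    # 16.5 spikes/s
rate_toward = b0 + bM * (+1)  # 1.5 spikes/s

# Three-way contribution for motion away (-1), back view (-1), far (+1):
mfl = (-1) * (-1) * (+1) * bMFL  # +2.3 spikes/s
rate_with_mfl = b0 + mfl         # ~11.3 spikes/s
print(rate_away, rate_toward, rate_with_mfl)
```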
Of the significant two-way interactions, the form x motion interaction was most frequently encountered (in 87% of cells, 27/31). The interactions with location were less frequent: the location x form interaction was found in 32% (10/31) and location x motion interaction in 29% (9/31) of cells. The three-way form x motion x location interaction was least frequently found (in 19% of cells, 6/31; see Figs 1 and 2 for examples of such cells).
Although individual cells discriminated sharply between conditions, one-way ANOVA showed that the population of cells as a whole did not discriminate between them [F(7,240) = 1.7, P = 0.101].
Cell Preference for Certain Combinations of Factor Levels
The population of 31 cells turned out to be quite heterogeneous with respect to the particular combination of factors most effective in driving the cell. For each of the eight possible factor-level combinations, a cell could be found that was maximally excited by it, or which produced a response not significantly different from the maximal response (Table 2). Certain combinations were, however, much more prevalent than others.
Close examination of the motion × form interactions showed that 19 of the 27 cells (70%) were sensitive to form and motion interactions indicating forward walking (i.e. motion away/back body view; motion toward/front body view). These are the cells in Table 1 with significant positive values in the MF column, because multiplication of the effect codes for motion away (−1) and back body view (−1) gives +1, as does that for motion toward (+1) and front body view (+1). A minority of the cells (8/27, 30%) responded to motion × form interactions that indicated backward walking (motion away/front body view; motion toward/back body view). The roughly 3:1 ratio of cells coding for forward and backward walking confirms earlier reports (Perrett et al., 1985a, 1989b; Oram and Perrett, 1996). The ratio might reflect the prevalence of forward walking in human, and to a lesser extent in monkey, society.
With respect to interactions with the factor location, one might expect the near location to be preferentially combined with motion toward and the front body view. Similarly, one might expect the far location to be preferentially combined with motion away and the back body view. The underlying idea is that visual features that convey a similar message are combined (cf. Mistlin and Perrett, 1990). However, due to the relatively small number of cells, firm conclusions could not yet be drawn; the cell numbers are provided here merely to give an indication. Of the nine cells with significant motion × location interactions, three showed the expected preference (the cells with a negative value in the ML column). Of the ten cells with significant form × location interactions, seven showed the expected preference (cells with a negative value in the FL column).
Cell Locations
For the majority of the cells that integrated form, motion and location, histological reconstructions were made. Figure 4 shows a reconstruction of the locations of cells for one subject. All cells were located within the upper bank and fundus of STSa, between 12 and 20 mm anterior to the interaural plane.
Discussion
Cellular Computations
The intricate and varied ways in which the three factors explained the cells' responses provide a glimpse of the computations that take place at the cellular level, presumably within cortical columns. It appears that the weight of the input of the three factors varies from cell to cell, while the cell's output does not merely reflect a summation of the individual inputs, but follows conditional rules. For example, the location near may have a significant influence on the responses of a cell on the condition that the agent is moving away from the subject and the front view of the body is visible; changing the level of just one of these factors abolishes responses (see Fig. 2).
Suggestions as to the underlying computations also come from recordings of cells located close together along a single electrode track. Such cells sometimes showed surprisingly large differences in preferred stimulus configuration. Although these data are anecdotal, they nicely highlight the complexity of cellular computations within a small patch of STSa. An example is given in Figure 5, which shows recordings from three neighbouring cells along a single track in STSa. The upper cell responded maximally to the agent walking compatibly away at the far location. The middle cell, located 85 µm deeper in the brain (according to the reading on the micromanipulator), required the agent to walk away incompatibly at the near location. The third cell, located just 15 µm deeper than the second cell, responded to incompatible walking in either direction, irrespective of location. Thus, along this single track, the contributions of all three factors varied significantly over a distance of ~100 µm. In contrast, there were other examples of tracks of neighbouring cells with quite similar response patterns. Since electrode tracks were positioned perpendicular to the longitudinal axis of the STS, different recording tracks likely sampled from different columns. The inaccuracy of the histological procedures for reconstructing cell location (Fig. 4) does not allow conclusions to be drawn as to the number of columns that might have been traversed by individual electrode tracks.
The location information in STSa could have come from various parts of the brain and could have been based on various visual cues. For instance, the distance sensitivity observed in cells in area V4 (Dobbins et al., 1998) might extend into STSa, since V4 forms the main visual input into the inferior temporal cortex (IT), and IT projects heavily onto STSa (Felleman and Van Essen, 1991
). Of particular interest are the nearby hippocampus and parahippocampal gyrus (the latter projects to STSa via the perirhinal cortex; Seltzer and Pandya, 1994
). The primate hippocampus contains cells that code for the view of a particular part of the testing environment (i.e. allocentric place coding; Tamura et al., 1992
; O'Mara, 1995
; Rolls et al., 1997
). In humans, the parahippocampal gyrus becomes active during passive viewing of the geometrical layout of local coherent space (Epstein and Kanwisher, 1998
; Epstein et al., 1999
), and when subjects have to navigate or recall navigational routes (Maguire et al., 1997
). The parahippocampal area thus seems to code the spatial layout irrespective of the presence of discrete objects at those locations, whereas the spatial sensitivity we found in STSa requires the presence of an (animate) object at the effective location. Thus, information about spatial location has a profound influence in the temporal lobe, but its utilization in the visual processing of complex animate objects and their actions is still largely unknown.
The retinal size of the object might also have been a cue to its distance. For familiar objects of known dimensions (such as humans), a small retinal size indicates a far away location and a large retinal size a nearby location. However, the prevalent view is that cells in inferior temporal cortex (including STSa) generalize over object size, as over many other stimulus characteristics (such as illumination, contrast, and orientation; Gross et al., 1972; Bruce et al., 1981
; Perrett et al., 1982
, 1985b; Rolls and Baylis, 1986
; Kovács et al., 2003
). This is consistent with their presumed role in object recognition and object constancy. There are nevertheless indications that generalization over object size may be less ubiquitous in STSa than previously assumed (Ashbridge et al., 2000
).
The Frame of Reference for Spatial Coding
We labelled the locations of the walking agent from the subject's perspective, i.e. near to or far from the subject, to the subject's left or right. Although this reflects an egocentric frame of reference, we cannot exclude an allocentric frame of reference (i.e. spatial descriptions based on environmental landmarks rather than on the subject's own position and orientation). The abundance of cells in STSa sensitive to the perspective view of the object (viewer-centred framework), compared to the minority of cells responding equally well to all views of an object (object-centred framework) (Perrett et al., 1991; Ashbridge and Perrett, 1998
), suggests that a spatial sensitivity of STSa cells might also be expressed in relation to the observer (i.e. egocentric; but see Milner and Goodale, 1995
). However, allocentric coding has been observed for STSa cells sensitive to goal-directed actions (Perrett et al., 1989b
) and occluded agents (Baker et al., 2001
). Ego- and allocentric coding do not have to be mutually exclusive. Processing of position could begin with an egocentric frame and progress to an allocentric one, in much the same way as view-general (object-centred) cell properties can be generated by combining view-specific cell properties (Perrett et al., 1989b
).
Some authors have emphasized the distinction between categorical and coordinate spatial representations (e.g. Kosslyn, 1987). Categorical representations specify relative spatial relations (e.g. on top of or below, to the left or to the right), which are especially relevant for detecting the goals and consequences of actions. Coordinate representations specify exact spatial positions, and are typically used for guiding actions (e.g. picking up an object). Our data are consistent with the notion that spatial coding in STSa is of the categorical type, i.e. independent of the absolute positions of the agent and the object in space, but dependent on their relative positions.
The Functional Significance of Positional Coding in STSa
Why would the ventral visual stream care about distance if its purpose is to perform object identification? One would rather expect it to ignore distance in order to achieve object constancy.
Previously we suggested that STSa plays a role in the visual analysis of the intentions and goals of others' actions (i.e. social cognition), in addition to animate object identification (Emery and Perrett, 1994; Jellema and Perrett, 2002). We argue that the significance of spatial coding in STSa must be seen in this light. The spatial positions that an individual occupies with respect to other individuals or objects contain vital information for an observer when it comes to determining the goal or intention of that individual.
Cells in STSa have been shown to code for congruent sets of body actions and postures, which convey information about the direction of others' attention (Perrett et al., 1992), their intentions (Jellema et al., 2000
) and goals (Perrett et al., 1989b
). Such body actions typically relate to particular locations, e.g. reaching toward a location at which a food reward is kept, or walking toward the door. The previous studies, however, did not define the role of these target locations in the cell sensitivity. It might well be that spatial sensitivity also extends to these situations, but this remains to be investigated.
Why Was Spatial Coding in the STS not Found Before?
Our results suggest that spatial coding may indeed be widespread in STSa, which prompts the question of why it was not found before. The only reports so far come from our lab showing that STSa cells are sensitive to the location of occluded agents (Jellema and Perrett, 1999; Baker et al., 2000
, 2001). One reason is probably that, given the predominant view of the functions of the dorsal and ventral visual streams (the Ungerleider and Mishkin, 1982, model), most studies of the ventral stream were biased towards investigating object recognition, neglecting possible effects of position.
Another reason may be related to the function of the STS in social perception (Allison et al., 2000; Jellema and Perrett, 2002). Spatial relationships may be coded in the STS only insofar as they contribute to the social significance of the visual stimuli. This might explain the failure to activate the STS in imaging studies that used socially meaningless spatial relationships between human figures to localize spatial coding in the brain (e.g. Courtney et al., 1996).
Our findings suggest that spatial coding is not an exclusive property of the dorsal visual stream, but occurs in the ventral visual stream (STSa) as well. Moreover, our findings open up the prospect of a much more elaborate integration of visual information about animate objects at the single cell level in STSa. Such integration may support the comprehension of animals and their actions.
Notes
Address correspondence to Tjeerd Jellema, Department of Psychology, University of Hull, Hull HU6 7RX, UK. Email: T.Jellema@hull.ac.uk.
References
Allison T, Puce A, McCarthy G (2000) Social perception from visual cues: role of the STS region. Trends Cogn Sci 4:267–278.

Ashbridge E, Perrett DI (1998) Generalising across object orientation and size. In: Perceptual constancy (Walsh V, Kulikowski J, eds), pp. 192–209. Cambridge: Cambridge University Press.

Ashbridge E, Perrett DI, Oram MW, Jellema T (2000) Effect of image orientation and size on object recognition: responses of single units in the macaque monkey temporal cortex. Cogn Neuropsychol 17:13–34.

Baker CI, Keysers C, Jellema T, Perrett DI (2000) Coding of spatial position in the superior temporal sulcus of the macaque. Curr Psychol Lett Behav Brain Cogn 1:71–87.

Baker CI, Keysers C, Jellema T, Wicker B, Perrett DI (2001) Neuronal representation of disappearing and hidden objects in temporal cortex of the macaque. Exp Brain Res 140:375–381.

Boussaoud D, Ungerleider LG, Desimone R (1990) Pathways for motion analysis: cortical connections of the medial superior temporal and fundus of the superior temporal visual areas in the macaque. J Comp Neurol 296:462–495.

Bruce C, Desimone R, Gross CG (1981) Visual properties of neurons in a polysensory area in superior temporal sulcus of the macaque. J Neurophysiol 46:369–384.

Courtney SM, Ungerleider LG, Keil K, Haxby JV (1996) Object and spatial visual working memory activate separate neural systems in human cortex. Cereb Cortex 6:39–49.

Desimone R, Ungerleider LG (1989) Neural mechanisms of visual processing in monkeys. In: Handbook of neuropsychology (Boller F, Grafman J, eds), vol. 2, pp. 267–299. Amsterdam: Elsevier.

Dobbins AC, Jeo RM, Fiser J, Allman JM (1998) Distance modulation of neural activity in the visual cortex. Science 281:552–555.

Emery NJ, Perrett DI (1994) Understanding the intentions of others from visual signals: neurophysiological evidence. Curr Psychol Cognit 13:683–694.

Epstein R, Kanwisher N (1998) A cortical representation of the local visual environment. Nature 392:598–601.

Epstein R, Harris A, Stanley D, Kanwisher N (1999) The parahippocampal place area: recognition, navigation, or encoding? Neuron 23:115–125.

Felleman DJ, Van Essen DC (1991) Distributed hierarchical processing in the primate cerebral cortex. Cereb Cortex 1:1–47.

Goodale MA, Milner AD, Jakobson LS, Carey DP (1991) A neurological dissociation between perceiving objects and grasping them. Nature 349:154–156.

Gross CG, Rocha-Miranda CE, Bender DB (1972) Visual properties of neurons in inferotemporal cortex of the macaque. J Neurophysiol 35:96–111.

Haxby JV, Grady CL, Horwitz B, Ungerleider LG, Mishkin M, Carson RE, Herscovitch P, Schapiro MB, Rapoport SI (1991) Dissociation of object and spatial visual processing pathways in human extrastriate cortex. Proc Natl Acad Sci USA 88:1621–1625.

Hikosaka K, Iwai E, Saito H-A, Tanaka K (1988) Polysensory properties of neurons in the anterior bank of the caudal superior temporal sulcus of the macaque monkey. J Neurophysiol 60:1615–1637.

Jellema T, Perrett DI (1999) Coding of object position in the banks of the superior temporal sulcus of the macaque. Soc Neurosci Abstr 25:919.

Jellema T, Baker CI, Wicker B, Perrett DI (2000) Neural representation for the perception of the intentionality of actions. Brain Cogn 44:280–302.

Jellema T, Perrett DI (2002) Neural coding for visible and hidden objects. In: Attention and performance XIX, pp. 356–380.

Jellema T, Baker CI, Oram MW, Perrett DI (2002) Cell populations in the banks of the superior temporal sulcus of the macaque and imitation. In: The imitative mind: development, evolution, and brain bases (Prinz W, Meltzoff A, eds), pp. 267–290. Cambridge: Cambridge University Press.

Jellema T, Perrett DI (2003) Perceptual history influences neural responses to face and body postures. J Cogn Neurosci 15:961–971.

Karnath H-O (2001) New insights into the functions of the superior temporal cortex. Nat Rev Neurosci 2:568–576.

Kosslyn SM (1987) Seeing and imagining in the cerebral hemispheres: a computational approach. Psychol Rev 94:148–175.

Köhler S, Kapur S, Moscovitch M, Winocur G, Houle S (1995) Dissociation of pathways for object and spatial vision: a PET study in humans. Neuroreport 6:1865–1868.

Kovács G, Sáry G, Köteles K, Chadaide Z, Tompa T, Vogels R, Benedek G (2003) Effects of surface cues on macaque inferior temporal cortical responses. Cereb Cortex 13:178–188.

Livingstone M, Hubel D (1988) Segregation of form, color, movement, and depth: anatomy, physiology, and perception. Science 240:740–749.

Maguire EA, Frackowiak RSJ, Frith CD (1997) Recalling routes around London: activation of the right hippocampus in taxi drivers. J Neurosci 17:7103–7110.

Milner AD, Goodale MA (1995) The visual brain in action. Oxford: Oxford University Press.

Mistlin AJ, Perrett DI (1990) Visual and somatosensory processing in the macaque temporal cortex: the role of expectation. Exp Brain Res 82:437–450.

O'Mara SM (1995) Spatially-selective firing properties of hippocampal neurons in rodents and primates. Prog Neurobiol 45:253–274.

Optican LM, Richmond BJ (1987) Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex. III. Information theoretic analysis. J Neurophysiol 57:162–178.

Oram MW, Perrett DI (1994) Responses of anterior superior temporal polysensory (STPa) neurons to biological motion stimuli. J Cogn Neurosci 6:99–116.

Oram MW, Perrett DI (1996) Integration of form and motion in the anterior superior temporal polysensory area (STPa) of the macaque monkey. J Neurophysiol 76:109–129.

Perrett DI, Rolls ET, Caan W (1982) Visual neurones responsive to faces in the monkey temporal cortex. Exp Brain Res 47:329–342.

Perrett DI, Smith PAJ, Potter DD, Mistlin AJ, Head AS, Milner AD, Jeeves MA (1984) Neurones responsive to faces in the temporal cortex: studies of functional organization, sensitivity to identity and relation to perception. Hum Neurobiol 3:197–208.

Perrett DI, Smith PAJ, Mistlin AJ, Chitty AJ, Head AS, Potter DD, Broennimann R, Milner AD, Jeeves MA (1985a) Visual analysis of body movements by neurons in the temporal cortex of the macaque monkey: a preliminary report. Behav Brain Res 16:153–170.

Perrett DI, Smith PAJ, Potter DD, Mistlin AJ, Head AS, Milner AD, Jeeves MA (1985b) Visual cells in the temporal cortex sensitive to face view and gaze direction. Proc R Soc Lond B Biol Sci 223:293–317.

Perrett DI, Mistlin AJ, Harries MH, Chitty AJ (1989a) Understanding the visual appearance and consequences of hand actions. In: Vision and action: the control of grasping (Goodale MA, ed.). Norwood, NJ: Ablex Publishing Corporation.

Perrett DI, Harries MH, Bevan R, Thomas S, Benson PJ, Mistlin AJ, Chitty AJ, Hietanen JK, Ortega JE (1989b) Frameworks of analysis for the neural representation of animate objects and actions. J Exp Biol 146:87–113.

Perrett DI, Oram MW, Harries MH, Bevan R, Hietanen JK, Benson PJ, Thomas S (1991) Viewer-centred and object-centred coding of heads in the macaque temporal cortex. Exp Brain Res 86:159–173.

Perrett DI, Hietanen JK, Oram MW, Benson PJ (1992) Organization and functions of cells responsive to faces in the temporal cortex. Philos Trans R Soc Lond B Biol Sci 335:23–30.

Perrett DI, Oram MW (1998) Visual recognition based on temporal cortex cells: viewer-centred processing of pattern configuration. Z Naturforsch C 53:518–541.

Reynolds JH, Desimone R (1999) The role of neural mechanisms of attention in solving the binding problem. Neuron 24:19–29.

Rolls ET, Baylis GC (1986) Size and contrast have only small effects on the responses to faces of neurons in the cortex of the superior temporal sulcus of the monkey. Exp Brain Res 65:38–48.

Rolls ET, Robertson RG, Georges-François P (1997) Spatial view cells in the primate hippocampus. Eur J Neurosci 9:1789–1794.

Seltzer B, Pandya DN (1978) Afferent cortical connections and architectonics of the superior temporal sulcus and surrounding cortex in the rhesus monkey. Brain Res 149:1–24.

Seltzer B, Pandya DN (1994) Parietal, temporal and occipital projections to cortex of the superior temporal sulcus in the rhesus monkey: a retrograde tracer study. J Comp Neurol 343:445–463.

Singer W, Gray CM (1995) Visual feature integration and the temporal correlation hypothesis. Annu Rev Neurosci 18:555–586.

Tamura R, Ono T, Fukuda M, Nakamura K (1992) Spatial responsiveness of monkey hippocampal neurons to various visual and auditory stimuli. Hippocampus 2:307–322.

Tanaka YZ, Koyama T, Mikami A (1999) Neurons in the temporal cortex changed their preferred direction of motion dependent on shape. Neuroreport 10:393–397.

Ungerleider LG, Mishkin M (1982) Two cortical visual systems. In: Analysis of visual behavior (Ingle DJ, Goodale MA, Mansfield RJW, eds), pp. 549–586. Cambridge, MA: MIT Press.