Department of Physiology, University of Szeged, H-6720, Dóm tér 10, Szeged, Hungary and 1 Laboratorium voor Neuro- en Psychofysiologie, Katholieke Universiteit te Leuven, Campus Gasthuisberg, Herestraat, B-3000, Leuven, Belgium
Address correspondence to G. Benedek, Department of Physiology, University of Szeged, Dóm tér 10, Szeged, Hungary H-6720. Email: benedek{at}phys.szote.u-szeged.hu.
Introduction
It is an everyday experience that object recognition is to a large extent independent of another change in the retinal image, the change (and reduction) in surface detail of the object: recognizing an object depicted as a line drawing or black and white photograph is approximately as difficult as recognizing the same object in a coloured photograph. This phenomenon has been widely used by artists (e.g. Matisse or Picasso) and by professional illustrators (e.g. the emergency pictograms found in every aeroplane). Indeed, in a human psychophysical experiment, it has been found (Biederman and Ju, 1988) that the naming latencies of masked objects presented as coloured photographs or as line drawings were essentially the same. A series of experiments revealed no benefits for chromatic over achromatic representations (Ostergaard and Davidoff, 1985), or over line drawing representations (Davidoff and Ostergaard, 1988) in different classification tasks. This suggests that surface characteristics such as colour, texture and shading play only a secondary role in object recognition once contour information is available. This finding is in line with edge-based theories of object recognition (Grossberg and Mingolla, 1985; Biederman, 1987; Ullman, 1989).
Hayward and colleagues (Hayward et al., 1999) compared the abilities of human subjects to recognize silhouettes and shaded images of objects rotated in depth. They found that humans use the three-dimensional (3-D) representation for object recognition and that silhouettes provide only partial 3-D information, due to the lack of shading. This suggests that the interpretation of the 3-D structure of an object is enhanced by shading and internal contours (Cavanagh, 1991).
In the present study, we systematically examined whether the shape selectivity of IT neurons is dependent on changes in retinal input caused by variations of the surface attributes of the presented objects. Each object inside its occluding contours was systematically reduced as follows. First, we removed the texture and shading, keeping the inner contours; at this stage, the contrast polarity was varied too. Secondly, the internal contours were also removed, leaving merely a silhouette of the object. During the experiments, we recorded the single-cell activity of certain IT neurons in awake, fixating monkeys. For each individual neuron, from a standard set of 20 coloured objects, we first identified two objects to which the neuron responded vigorously (effective stimuli) and two to which it did not respond (non-effective stimuli). Next, we compared the responses of the neurons to these four objects presented under the progressively reduced conditions. This procedure is similar to the step-by-step stimulus reduction paradigm employed in an earlier study (Tanaka et al., 1991), during which a 3-D object is gradually reduced by removal of its colour, texture, shading, contours and object-parts, which allows determination of the critical features for the neurons in anaesthetized animals. However, there is a conceptual difference between that systematic stimulus reduction method and our method. Instead of first identifying an object that a particular neuron responds to and then reducing it to determine a feature that is still essential for maximal activation of the neuron, we used both the original objects and their reduced variants as stimuli in order to be able to compare the behaviour of the neurons at a population level.
In psychophysical tests on one animal, we additionally measured its ability to discriminate between objects in the original and some of the surface-reduced variants. The animal was first taught to classify eight coloured images into two groups. When it had perfected this discrimination task, we presented the same objects under different surface-reduced conditions and measured the animal's spontaneous ability to discriminate between them. We found that rhesus monkeys are able to identify the shape of a coloured object in its reduced variants as well. Some of the results reported here appeared earlier in abstract form (Kovács et al., 1998).
Materials and Methods
Two adult macaque monkeys [one a Macaca mulatta, monkey C (11 kg) and one a Macaca nemestrina, monkey K (12 kg)] were used as subjects; only monkey C was tested in the psychophysical experiment. The monkeys were deprived of water for 20 h preceding the experimental sessions. After the daily experimental sessions, the animals received supplementary water, vitamins, fruits and vegetables as necessary and had access to dry food ad libitum. Recordings were generally made for 2–3 h a day, four or five times a week. During a session, each monkey typically consumed 200–300 ml of water or fruit juice. The weight of the animals was checked regularly and was kept at 90% of the normal body weight. Special attention was paid to the animals' general condition, with frequent checks of their body weight, fur and excrement. Training or recording sessions were interrupted for 1 month every 2–3 months.
Surgery
Before surgery, the animals were adapted to the laboratory and to the primate chair. A scleral search coil was implanted into one eye of monkey C, according to procedures described previously (Judge et al., 1980); at the same time, a stainless steel peg was cemented to the skull for head fixation purposes. The head of monkey K was fixed by the reversible method developed in an earlier study (Pigarev et al., 1997) and after a 2–3 week recovery period a scleral search coil was implanted into one eye. A recording chamber was next implanted over the anterior dorsolateral part of the skull in both animals (Vogels, 1999b). The position of the recording chamber was determined with the help of magnetic resonance and computerized tomography (CT) images taken before the operation. The centre of the recording chamber was situated 17 mm anterior to the auditory meatus and 24 mm lateral to the sagittal midline over the left hemisphere in monkey C and 17 mm anterior to the auditory meatus and 23 mm lateral to the sagittal midline over the right hemisphere in monkey K. The chamber of monkey C was tilted 6° inward, while that of monkey K was positioned vertically. Recording chambers were implanted over both hemispheres in monkey C, but all recordings were made in the chamber positioned over the left hemisphere. All surgical procedures were carried out under full anaesthesia and under aseptic conditions. Anaesthesia was initiated with an i.m. injection of ketamine (Calypsol; 8 mg/kg) and atropine (0.05 mg/kg). An endotracheal tube was placed into the trachea and anaesthesia was maintained with a mixture of N2O and O2 in a ratio of 2:1. An i.v. line was inserted for continuous access and additional fentanyl (i.v., 2–4 µg/kg) was given whenever necessary. Before the surgical procedure, a preventive dose of antibiotic was given (i.v. Augmentin, 500 mg amoxycillin and 100 mg clavulanic acid). The same doses of antibiotics were given i.v. on the first five postoperative days. The incision was infiltrated with local anaesthetic (Procaine). Nalbuphin and non-steroidal anti-inflammatory drugs were administered to the animals postoperatively. Arterial oxygen saturation, expired CO2 level, heart rate and rectal temperature were monitored continuously throughout the surgery and kept within normal limits.
At the end of the recording sessions, several penetrations were made in the brain of monkey C with steel wires under ketamine anaesthesia. The monkey was then killed with an overdose of Nembutal and perfused with fixative. Recording sites were reconstructed by identifying the tracks of the last few penetrations in coronal brain sections (100 µm) stained with cresyl violet. Monkey K is still being used for ongoing experiments (Tompa et al., 2001). All procedures conformed to the guidelines of the NIH for the care and use of laboratory animals and were approved by the Ethical Committee of the University of Szeged.
Apparatus
During the recording sessions, the monkey sat in a custom-made primate chair with its head fixed. A standard 17 in. monitor (74 Hz refresh rate) was placed in front of the animal, 57 cm from the eye. A PC recorded eye movements (200 Hz sampling rate), delivered the reward and controlled the animal's behaviour. Other computers presented stimuli and collected electrophysiological data.
Sterile tungsten electrodes (FHC, parylene-coated with an impedance of 1.0–2.0 MΩ), held by a Narishige hydraulic microdrive, were used for single-cell recordings. Signals were amplified, frequency-filtered and fed into the recording PC, audio monitor and oscilloscope. Single-cell discrimination was performed with an amplitude window discriminator for monkey C and with a spike separator system (SPS-8701, Real Time Waveform Discriminator System; Malvern, SA, Australia) for monkey K. The background luminance in the experimental room was kept constant at a level <1 cd/m2.
Stimuli
Stimuli were presented on a uniform grey background square (side, 18°; luminance, 8 cd/m2) positioned in the centre of the screen. A set of chromatic stimuli (COL) composed of 20 figures was used (Fig. 1). Half of the figures were simple geometrical shapes filled with a coloured, textured pattern, created by commercial image-processing software. These stimuli occupied the same area (6 x 5°) and had an average luminance of 7.9 cd/m2 (SD = 5.6 cd/m2). The other half were chromatic images of natural and artificial objects [occupying the central 10 x 7° of the screen, with an average luminance of 4.8 cd/m2 (SD = 3.0 cd/m2)], chosen randomly from the image pool of the laboratory. Stimuli were presented centrally during the fixation of a small blue fixation spot (0.1° radius and 5.5 cd/m2 luminance) that remained on screen throughout the trial.
Stimulus Sequence and Behavioural Paradigms
Single-cell Recording
A simple fixation paradigm was used during the single-cell recording sessions. Initially, the screen was black. A trial started with the presentation of the fixation spot. If the animal foveated the fixation spot, the uniform grey background pattern was presented for 500 ms, after which the stimulus appeared for another 500 ms. Animals were rewarded for maintaining the fixation within a 0.5 x 0.5° square window until the stimulus offset. If they left the fixation window earlier than the stimulus offset, the trial was considered aborted and excluded from further analysis. To associate the reward and stimuli, but not the fixation spot, reinforcement was given immediately after the stimulus offset, while the fixation spot remained on-screen for a variable time (100–300 ms). The inter-trial interval was 1000 ms.
Behavioural Test of Object Discrimination
In the object discrimination training, the stimulus offset was followed by the appearance of two circular red targets (0.45° radius and 2 cd/m2 luminance), flanking the fixation spot at a distance of 7.5° to the right or left. After completing the single-cell recording sessions, monkey C was trained to make a saccadic eye movement to the left or to the right target after the presentation of each of eight individual COL stimuli. These stimuli were classified into two groups, such that the discrimination task could not be solved by the presence or absence of one particular feature (i.e. by the detection of a particular colour, texture, shading or inner contour) in the images. Four stimuli (1, 3, 13 and 17 in Fig. 1) were associated with one side and the remaining four (2, 10, 12 and 20 in Fig. 1) with the other side. During training, the animal was rewarded for correct responses with drops of water. Once the animal had reached an average of 90% correct responses for the eight objects, we introduced object discrimination transfer probe test trials. During these trials, the previously learned COL stimuli were intermixed with either the BLD or the SIL versions of the same objects. First, COL and BLD and, secondly, COL and SIL versions of the objects were presented with equal frequencies of 10 trials for each stimulus. The animal was rewarded in these probe trials for the BLD and SIL conditions, regardless of the responses it made. This equal reinforcement for correct and incorrect responses allowed us to measure the spontaneous categorization of the novel BLD and SIL stimuli (Vogels, 1999a).
Single-cell Recording Protocol
We searched for single cells by presenting our standard set of 20 COL stimuli. Once a cell was isolated and found to be responsive to at least one of the COL stimuli, it was tested further. To test stimulus selectivity, we ran tests by presenting four objects, two eliciting larger firing rates of the particular neuron and two less effective stimuli, determined by auditory feedback and upon inspection of the peristimulus time histograms (PSTH). Each of the four objects was then presented as COL and under the surface-reduced stimulus conditions. Each stimulus condition was presented at least 10 times in an interleaved fashion.
Data Analysis
Off-line spike counts were computed trialwise with a 500 ms bin, starting 50 ms after stimulus onset. Net responses were calculated by trialwise subtraction of the neural activity during a fixation period of the same duration as the stimulus time window, but just preceding stimulus onset. Analysis of variance [ANOVA (Kirk, 1968)] was used to test the significance of the responses to the stimuli and the significance of shape selectivity. Tests were classified as significant if the corresponding type I error was <0.05. To determine the responsiveness of the neurons, ANOVA was performed on the neural data with the stimulus and the time period of the firing activity (before versus after stimulus onset) as factors [split-plot design (Kirk, 1968)]. A cell was considered responsive to COL stimuli if the main effect of the responses was significant. To determine the selectivity of the neurons, for each neuron and condition we first ranked the four objects according to their net responses under the COL condition. Secondly, we calculated the average firing rate separately for each unit and each condition as a function of the stimulus rank. Neurons that exhibited an interaction effect of the two factors (stimulus rank and time period of neural activity) were considered shape-selective under given conditions. The responses to the different conditions were compared by generating responsivity indices (RIs). For each cell, we subtracted the average net firing rates in response to the preferred stimulus under the BLD, DLD, LD and SIL conditions from the average net response to the preferred stimulus under COL conditions and divided this difference by the sum of the two responses.
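The net response and RI definitions above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the analysis code used in the study: the function names, the millisecond spike-time representation and the handling of a zero denominator are all assumptions.

```python
def net_response(spike_times_ms, onset_ms, delay_ms=50.0, window_ms=500.0):
    """Net spike count for one trial (hypothetical helper).

    Counts spikes in a window of `window_ms` starting `delay_ms` after
    stimulus onset and subtracts the count in a baseline window of the
    same duration that ends at stimulus onset.
    """
    start = onset_ms + delay_ms
    stim = sum(start <= t < start + window_ms for t in spike_times_ms)
    base = sum(onset_ms - window_ms <= t < onset_ms for t in spike_times_ms)
    return stim - base


def responsivity_index(col_net, reduced_net):
    """RI = (COL - reduced) / (COL + reduced) for the preferred stimulus.

    Returns NaN when the two net responses sum to zero (an assumption;
    the text does not say how such cases were treated).
    """
    total = col_net + reduced_net
    if total == 0:
        return float("nan")
    return (col_net - reduced_net) / total
```

Under this definition an RI of 0 means equal responses under the two conditions, values near 1 mean that the reduced stimulus evoked almost no net response, and values above 1 arise when the reduced stimulus drives the cell below its baseline rate.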
The response onset latency was calculated by using Poisson spike train analysis (Hanes et al., 1995), modified from Legéndy and Salcman (Legéndy and Salcman, 1985). In this analysis, for each cell and stimulus, the trialwise average of the onset times of the first activations was used as latency.
Results
After the animal had reached a discrimination performance of at least 90% correct for each COL object, we introduced the object discrimination probe test trials for the BLD and SIL conditions. Figure 2 shows the average performance of monkey C for the COL, BLD and SIL conditions during the first 160 probe test trials. The animal performed well above chance level in the object discrimination task for the eight stimuli (mean and standard error of correct responses in the first 160 BLD and SIL trials: 85 ± 5 and 75 ± 8.24%, respectively; binomial test; both P < 0.01). Monkey C's performance did not show significant stimulus-specificity, i.e. there were no shapes for which the animal performed much worse than the average (cross-tabulation of COL stimuli versus responses, Pearson's χ2, d.f. = 7, not significant).
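For readers who want to see what a test against chance looks like for a two-choice task, the following is a minimal sketch of a one-sided exact binomial test. The function name is hypothetical, and the counts mentioned after the block (136 and 120 correct out of 160) are back-calculated from the reported percentages purely for illustration; the text does not state how trials were pooled for the reported test.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p).

    With p = 0.5 this is the one-sided probability of obtaining at
    least k correct choices out of n by guessing in a two-choice task.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
```

Calling `binomial_upper_tail(136, 160)` or `binomial_upper_tail(120, 160)` with these illustrative counts gives probabilities far below 0.01, in line with the conclusion that performance was above chance under both probe conditions.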
This result suggests that monkeys, like humans, are able to identify the shape of an object largely independently of its surface characteristics. The lower performance of the animal under the SIL condition parallels the more difficult recognition of silhouette images by humans (Kovács et al., 1996). An alternative, albeit unlikely, explanation of this lower performance is that, owing to the non-differential rewarding in the probe test trials, the monkey learned that it did not need to perform very well in order to be rewarded, i.e. its drive was reduced. This may account for the lower performance of the animal under the SIL condition, which was tested second. However, since such a general decrease of motivation could only decrease performance, the main conclusion that there is a transfer of object knowledge from COL to BLD and SIL would not be affected.
Single-cell Recording
A total of 714 single neurons were tested in the IT of the two animals. The present study is based on 149 neurons that proved to be visually responsive and selective for the chromatic versions of the objects (67 and 82 neurons in monkeys C and K, respectively). The remaining neurons are not considered further here. Table 1 lists the numbers of cells recorded under each stimulus condition.
For 90 neurons, we tested how the removal of internal texture and shading cues and their replacement with a uniform surface brighter than the background (BLD) affects the neural responses. Figure 4A presents examples of the stimuli and the responses of a typical IT neuron for the COL and three reduced stimulus conditions. This neuron responded vigorously to the chromatic versions of stimuli 20 and 17 in Figure 1. These responses were not significantly different from those observed under the BLD condition (Scheffé's post hoc analysis, P > 0.7 for each stimulus). Furthermore, the shape selectivity was also preserved in the responses under each condition [ANOVA, interaction between rendering condition and stimuli: F(3,74) = 0.28, not significant].
There were neurons for both the BLD (12, 13%) and DLD (20, 26%) conditions with RIs >0.8, suggesting that, at least for some cells, removal of internal shading did affect response rates.
For 57 cells, we tested the effect of the presence or absence of internal contours. Comparison of the DLD and SIL conditions in Figure 4A,B shows that for two neurons removal of the dark occluding contours and of the inner lines separating the main parts of the objects had no effect on the neural responses and selectivity. At a population level, the neurons had similar response rates and selectivities under the two conditions. Figure 5C shows the distribution of RIs for COL and SIL with a median of 0.19 (1st quartile, 0.025; 3rd quartile, 0.69; n = 106), suggesting somewhat decreased firing rates for the SIL images. Twenty-one (20%) of the neurons had RIs >0.8, suggesting sensitivity for the internal structure of the images. A comparison of the DLD and SIL responses revealed only very small differences [Fig. 5D; median 0.09 (1st quartile, 0.02; 3rd quartile, 0.31; n = 76)].
Of the 44 neurons tested, 70% remained responsive to the LD stimuli, as revealed by ANOVA (see Materials and Methods), when we removed all contrast from the images but retained the contours. However, this stimulus modification reduced the neuronal responses significantly. Typical neuronal responses are presented under the COL, DLD, SIL and LD conditions in Figure 4B. This response reduction was a general finding, as revealed by analysis of the 31 COL- and LD-responsive shape-selective neurons. The median RI for the COL–LD comparison was 0.77 (Fig. 5E; 1st quartile, 0.29; 3rd quartile, 1.17; n = 44), suggesting significantly larger responses under the COL than under the LD conditions (Wilcoxon matched pair test, T = 78; P < 0.05).
Response Latencies
The median response latencies under the COL and BLD conditions were 103 and 110 ms, respectively, a difference not statistically significant (Wilcoxon matched pair test, T = 1217, not significant, n = 79). The median response latencies under the BLD and DLD conditions were 114 and 110 ms, respectively, again, not a statistically significant difference (Wilcoxon matched pair test, T = 583, not significant, n = 49). The difference in median response latencies was not statistically significant for the DLD and SIL conditions either (110 ms under the DLD and 112 ms under the SIL conditions, Wilcoxon matched pair test, T = 534, not significant, n = 50).
The median of the distribution of the neuronal latencies for the LD conditions was 101 ms, a value not significantly different from the median latencies of the same neuronal population under the COL and DLD conditions (Wilcoxon matched pair test, T = 38 and T = 39, respectively, not significant).
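The T values quoted for the Wilcoxon matched pair tests are rank sums. As a rough sketch of how such a statistic is obtained from paired latencies, the snippet below computes T as the smaller of the positive and negative signed-rank sums, dropping zero differences and giving tied absolute differences their average rank. The function name is hypothetical and the conventions (zero handling, choice of the smaller sum) are common textbook ones assumed here, not taken from the text.

```python
def wilcoxon_t(x, y):
    """Wilcoxon matched-pairs T: the smaller of the two signed-rank sums."""
    # Paired differences, discarding pairs with no difference.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(w_plus, w_minus)
```

For example, with made-up latencies `[103, 110, 98, 120]` and `[110, 114, 99, 118]` (in ms) the differences are -7, -4, -1 and 2, so the positive rank sum is 2, the negative rank sum is 8 and T is 2. Significance is then read from tables or a normal approximation for the given number of pairs.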
Effects of Colour, Texture, Shading and Internal Contour Removal on Shape Selectivity
The shape selectivities of the neurons were also similar under the COL and the BLD and DLD conditions (Fig. 6A). The average net responses under both texture-removed conditions decreased significantly with increasing stimulus rank (for this analysis, ranking was performed according to the neuronal responses under COL conditions), demonstrating similar shape selectivities with and without internal texture information. The net response–shape rank curves, however, are flatter under the BLD and DLD conditions than under the COL condition (Fig. 6A). To determine whether this is merely a consequence of the lower response rates seen under the surface-reduced conditions or constitutes a genuine difference in shape selectivity between the COL and BLD/DLD conditions, we additionally calculated the average net normalized responses (Fig. 6B), dividing the responses by the response in rank 1 of the COL condition. Normalization eliminates the absolute differences in net responses. As shown in Figure 6B, normalization reduced the difference between the COL and BLD/DLD conditions. However, the decrease in the normalized firing rate with increasing stimulus rank was still significantly less under both the BLD and DLD texture-removed conditions compared with the COL condition [ANOVA, interaction of rendering condition and stimulus ranking: COL–BLD, F(3,267) = 20.65, P < 0.01; COL–DLD, F(3,225) = 35.87, P < 0.01], indicating that, at a population level, colour, texture and shading removal affected the shape selectivity weakly.
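The ranking and normalization steps described above can be illustrated with a short sketch. It follows the text in ranking stimuli by their COL net responses and dividing by the rank 1 COL response; the function names, the list-based data layout and the example numbers in the note below are assumptions made for illustration.

```python
def rank_order(col_net):
    """Stimulus indices sorted from the largest to the smallest COL net response."""
    return sorted(range(len(col_net)), key=lambda i: col_net[i], reverse=True)


def normalised_rank_curve(col_net, cond_net):
    """Net responses of one condition, ordered by COL rank and divided
    by the rank 1 COL response of the same cell."""
    order = rank_order(col_net)
    reference = col_net[order[0]]
    if reference == 0:
        raise ValueError("rank 1 COL net response is zero; cannot normalise")
    return [cond_net[i] / reference for i in order]
```

With hypothetical COL net responses `[5.0, 20.0, 1.0, 10.0]` and BLD net responses `[4.0, 12.0, 2.0, 9.0]` for the same four objects, the COL curve is 1.0, 0.5, 0.25, 0.05 and the BLD curve is 0.6, 0.45, 0.2, 0.1: the BLD curve keeps the same ordering but is flatter, which is the kind of difference the interaction term of the ANOVA tests.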
To determine the generality of our finding that shape selectivity is similar after colour, texture and shading removal, we grouped our neuronal sample according to behaviour. We defined four groups of neurons: neurons maintaining exact ranking order (1-2-3-4); neurons whose rank 1 is the same in the COL and in the surface-reduced conditions (1-x-x-x); neurons whose rank 2 in the COL condition became rank 1 in the surface-reduced condition (2-x-x-x); and, finally, neurons whose rank 3 or rank 4 of the COL condition became rank 1 in the surface-reduced condition. As can be seen from Table 2, 22–33% of the recorded neurons had exactly the same shape preference order in the COL and in the surface-reduced conditions. We emphasize here that both stimuli leading to responses under rank 1 and rank 2 conditions were selected for greater effectiveness, while rank 3 and rank 4 conditions were selected as examples for ineffectiveness. This explains why both the 1-x-x-x and 2-x-x-x cell categories are consistent with generalization across rendering conditions. Thus, when considered together, ~80% of the neurons had the shape being defined as rank 1 or rank 2 in the COL condition as rank 1 or rank 2 in the surface-reduced conditions as well. These data suggest robust independence of the neuronal shape selectivity from the rendering condition.
As Figure 4A shows, for one neuron reversal of the contrast sign affected neither the neural firing nor the shape selectivity. Analysis of the 57 COL-responsive and -selective neurons revealed that this was a general finding: there is no significant difference between BLD and DLD in the firing-rate–stimulus-rank function [see Fig. 6, ANOVA, interaction of rendering conditions and stimulus ranking: F(3,168) = 1.04, not significant], suggesting that selectivity is similar for stimuli brighter or darker than the background pattern, i.e. when the sign of the contrast between the object and the background is reversed.
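The four rank-preservation groups defined above (1-2-3-4, 1-x-x-x, 2-x-x-x, and rank 3 or 4 becoming rank 1) amount to a simple classification rule, sketched below. The function name and the label used for the fourth group are hypothetical, and ties between net responses are broken by stimulus order, which is an assumption not addressed in the text.

```python
def rank_group(col_net, reduced_net):
    """Assign one cell to a rank-preservation group.

    `col_net` and `reduced_net` hold the net responses to the same four
    objects under the COL and one surface-reduced condition.
    """
    col_order = sorted(range(len(col_net)), key=lambda i: col_net[i], reverse=True)
    red_order = sorted(range(len(reduced_net)), key=lambda i: reduced_net[i], reverse=True)
    if red_order == col_order:
        return "1-2-3-4"  # exact ranking order maintained
    # COL rank (1-based) of the object that is best under the reduced condition.
    col_rank_of_best = col_order.index(red_order[0]) + 1
    if col_rank_of_best == 1:
        return "1-x-x-x"
    if col_rank_of_best == 2:
        return "2-x-x-x"
    return "3/4-x-x-x"  # hypothetical label for the fourth group
```

Note that the exact-order group is tested first, so a cell labelled 1-x-x-x here keeps its preferred object but reorders at least two of the remaining three.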
It is obvious from a comparison of the DLD and SIL objects presented in Figure 4A,B that removal of the contour lines from the images does not have equal effects on the perception of a simple, one-part object, such as a circle, or of a more complex object with several different components, e.g. a drum. It is possible that the apparent lack of any difference we obtained under DLD and SIL conditions is due to averaging of the response differences for simple objects and objects composed of several parts. To test this hypothesis, we made a separate analysis for those cells whose preferred stimulus (determined as rank 1) was composed of at least five parts (i.e. the object could be separated into at least five closed, convex components by the dark inner lines under DLD condition; e.g. stimulus 20 in Fig. 1). However, there was no significant difference between the selectivities of these cells (n = 29) under DLD and SIL conditions [ANOVA, interaction of DLD and SIL, F(3,84) = 1.31, not significant], suggesting that this lack of difference in shape selectivity does not depend on the number of object components. The response strengths for these 29 neurons were also similar under DLD and SIL conditions: RI for DLD and SIL with a median of 0.11 (1st quartile, 0.02; 3rd quartile, 0.27; n = 29; Fig. 5D).
Figure 4B shows the shape selectivities for COL, DLD, SIL and LD conditions for a TE neuron. At population level, the average net normalized response decreases significantly less with increasing stimulus rank under the LD condition as compared with the COL condition [ANOVA, interaction of rendering condition and stimulus ranking, F(3,63) = 10.13, P < 0.001, n = 22], indicating that the removal of texture, shading and contrast affected shape selectivity.
Figure 7 shows that, at a population level, the IT neurons exhibit similar selectivities for images with and without internal contours [ANOVA, interaction of rendering condition (DLD and SIL) and stimulus ranking: F(3,165) = 1.07, not significant].
Analysis of Different Response Intervals
The basic information about shapes is present in the very early part of the neuronal responses (Rolls and Tovée, 1994; Kovács et al., 1995b). Further, as found by Sugase et al. (Sugase et al., 1999), IT neurons convey global information about the category (faces, shapes) of the stimulus in the earliest phase of their responses, while fine information about the identity or facial expression of the stimuli is conveyed later in the response. These two sets of data encouraged us to conduct a separate analysis on our data set. First, we determined the response latency of each cell under the COL condition for the stimulus leading to the largest response. Secondly, we determined two response windows: a 100 ms window that immediately followed the response onset (early) and a 100 ms window starting at the end of the early response window (late). Next, spike counts were computed off-line, trialwise for each stimulus condition with the previously determined 100 ms bins. Net responses were calculated by trialwise subtraction of the neural activity during a fixation period of 100 ms just preceding the stimulus onset. For an analysis of shape responsivity we separately computed RIs for both the early and late response windows. None of the differences in RIs, obtained for the COL–BLD, COL–DLD, COL–SIL and COL–LD comparisons, were significant (t-test for dependent variables, not significant) between the early and late response windows. This suggests that information about the stimulus condition is similarly present in the early and late windows of the response.
A similar test was performed to determine whether the shape selectivity of the neurons was different for the early and late phases of the responses. We determined a selectivity index (SI). For each cell and each stimulus condition, we subtracted the average net firing rate in response to the least-preferred stimulus (i.e. rank 4) from the average net response to the preferred stimulus (i.e. rank 1) under the same stimulus conditions and divided this difference by the sum of the two responses.
None of the SIs differ in the early and late response windows, suggesting that shape selectivity is similar for the early and late response components.
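The early/late window analysis and the SI can be sketched as follows. This is an illustrative reading of the description above rather than the original analysis code: the function names, the millisecond spike-time lists and the NaN returned for a zero denominator are assumptions.

```python
def window_count(spike_times_ms, start_ms, width_ms=100.0):
    """Number of spikes in the half-open window [start, start + width)."""
    return sum(start_ms <= t < start_ms + width_ms for t in spike_times_ms)


def early_late_net(spike_times_ms, onset_ms, latency_ms, width_ms=100.0):
    """Net spike counts of one trial in the early and late response windows.

    The early window starts at response onset (stimulus onset plus the
    cell's latency), the late window starts where the early one ends,
    and the baseline is the window of equal width just before onset.
    """
    base = window_count(spike_times_ms, onset_ms - width_ms, width_ms)
    early_start = onset_ms + latency_ms
    early = window_count(spike_times_ms, early_start, width_ms) - base
    late = window_count(spike_times_ms, early_start + width_ms, width_ms) - base
    return early, late


def selectivity_index(rank1_net, rank4_net):
    """SI = (rank 1 - rank 4) / (rank 1 + rank 4) within one stimulus condition."""
    total = rank1_net + rank4_net
    if total == 0:
        return float("nan")  # assumption: undefined when the responses cancel
    return (rank1_net - rank4_net) / total
```

Computing the SI separately from the early and the late net responses of each cell yields the paired values that can then be compared across the two windows.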
Discussion
Texture and Shading
Few data are available regarding the question of how a change of texture alters the shape selectivity of IT neurons. In this study, instead of merely changing the texture, we removed all texture elements from within the objects. This stimulus variation affected the shape sensitivity of the IT neurons only weakly, suggesting the relatively low importance of texture in IT stimulus selectivity. However, the response rate did decrease under the texture-removed conditions, suggesting some degree of interaction of texture and shading with shape. This is in agreement with the conclusion of earlier workers (Vogels et al., 1999), who systematically tested the effect of the angle of illumination on the IT neural responses and found that, for approximately half of the neurons, the direction of the illumination (i.e. the variations of shading) changed the neural activity.
Silhouettes
The recognition of objects in contre jour situations, when they are illuminated by a strong light from behind, can easily go astray (an example is that of children's shadow-theatres). This shows that the outer or occluding contours of the objects alone are not always sufficient for proper recognition. On the other hand, schematic line drawings containing the inner contours that distinguish the main parts of the objects are at least as effective for object recognition as grey-scale or coloured representations (Biederman and Ju, 1988). This is reflected by the somewhat lower performance transfer of the monkey for SIL stimuli in the behavioural test (75%) than for BLD (85%). Furthermore, this was not stimulus-dependent: stimulus complexity had no effect on the discrimination transfer in the probe test trials. At a neuronal population level, we observed similarly decreased firing rates for the objects containing the inner contours (DLDs) and for the SILs compared with the chromatic versions, a result supported by another study (Vogels, 1999b). This suggests that inner contours are not necessary for the selective response of these neurons.
The explanation of the different effects of the elimination of internal contours on the behavioural and neuronal responses demands further studies. However, this discrepancy can be related to the different effects of the stimulus position on behavioural performance and neuronal selectivity (Vogels, 1999b): changes of stimulus position left the neuronal responses largely unchanged, while categorization performance was affected strongly. It is possible that other neurons (within the IT or in different cortical areas) are responsible for the poorer recognition of images presented in different locations and of SIL images.
Contrast
In the real world, the sign of the contrast across the occluding contours of objects varies significantly, depending on factors such as the changing illumination and texture properties of the background. Nonetheless, perception is largely invariant to contrast changes in the objects. Indeed, real-time object-naming performance, long-term priming and immediate image integration processes are unaffected by the polarity of the contours and inner surfaces of non-face images (Subramaniam and Biederman, 1997).
Comparison of our line drawing stimuli having higher (BLD) or lower (DLD) luminance values than that of the background showed no differences in either neural response rate or shape selectivity. This indicates that the responses of IT neurons do not reflect the contrast sign of the stimuli, suggesting that the IT may play a role in the contrast-invariant recognition of objects. This result is apparently in conflict with another report (Ito et al., 1994), whose authors measured how the reversal of luminance contrast between object and background alters the neural responses in the anterior IT. Using the stimulus reduction method (Tanaka et al., 1991) in anaesthetized animals, they found that for 60% of the neurons, contrast reversal reduces the responses by >50%. Furthermore, 57% of their 19 recorded cells also displayed significant changes in shape selectivity with contrast reversal. They concluded that IT neurons carry information about contrast polarity. The apparent disagreement between our finding and that from the study by Ito et al. can be attributed to the fundamental differences in the experimental approaches. First, Ito et al. changed the contrast polarity of the objects and the backgrounds as well (i.e. they presented bright objects on dark surfaces or vice versa), while we presented our stimuli on an identical medium-grey background, making comparison of the two results difficult. Secondly, Ito et al. used the stimulus reduction paradigm (Tanaka et al., 1991), starting with a 3-D object and eliminating step by step cues such as colour, texture and object parts, in this way determining the critical feature for the neurons. During this process Ito et al. intentionally excluded those neurons that had texture or colour as critical features and studied only a small subsample of neurons that had their optimal stimuli defined exclusively by shape. This means that the neuron populations in the two studies overlapped only partially. Finally, Ito et al. used anaesthetized animals, while we used awake, fixating monkeys. Although our animals were not engaged actively in any shape discrimination task during the recording sessions, we made attempts to draw their attention to the stimuli (see Materials and Methods), making the correlation of the perceptual and neuronal results in our study a plausible one. [In fact, during the recording sessions we had the common experience that, whenever the stimulus set was changed (from our standard set of 20 COL objects to the test objects under COL, BLD, DLD, LD and SIL conditions), the animals had several aborted trials for a while, as if they were surprised by the sudden change of stimuli, showing the involvement of active attentional processes.]
Removal of all contrast from within the objects and generating line drawings resulted in significantly lower response rates and changed selectivity. This result is in accord with the results of Ito et al. (Ito et al., 1994), who also found changed selectivity for line drawing stimuli compared with objects with surface cues.
Effect of Practice
To test the possible effect of extended practice on the shape selectivity of our neuronal sample, we analysed the temporal distribution of the selectivity of the neurons that were responsive and selective for the two conditions under consideration. To do this, we divided the total length of the recording period (56 days for monkey C and 70 days for monkey K; note that only days of successful recordings were counted) into four 15-day periods.
We performed a three-way ANOVA of the normalized firing rates, ranked according to COL (dependent variables) and recording period, with repeated-measure design as independent variable. This analysis suggested similar selectivity curves for COL and surface-reduced conditions in each recording period, which argues against a significant effect of practice on the response selectivity of the recorded sample.
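The ranking step that precedes such an analysis can be illustrated with a minimal Python sketch. It is not the analysis code used in this study: the function names, the division by the peak COL response and the simple day-to-period binning are assumptions made for illustration only, and the ANOVA itself is not reproduced.

```python
from typing import Dict, List

PERIOD_LENGTH_DAYS = 15  # period length stated in the text


def period_index(recording_day: int) -> int:
    """Map a 1-based recording day to a 0-based 15-day period.

    Hypothetical helper: plain consecutive binning is assumed here.
    """
    if recording_day < 1:
        raise ValueError("recording_day must be >= 1")
    return (recording_day - 1) // PERIOD_LENGTH_DAYS


def selectivity_curves(
    rates: Dict[str, List[float]], reference: str = "COL"
) -> Dict[str, List[float]]:
    """Rank stimuli by the reference (COL) response and normalize.

    Stimuli are ordered by descending firing rate in the reference
    condition, and every condition is divided by the peak reference
    rate (an assumed normalization, chosen for illustration).
    """
    ref = rates[reference]
    peak = max(ref)
    if peak <= 0:
        raise ValueError("reference condition has no positive response")
    # Stimulus indices sorted from the best to the worst COL response.
    order = sorted(range(len(ref)), key=lambda i: ref[i], reverse=True)
    return {
        cond: [values[i] / peak for i in order]
        for cond, values in rates.items()
    }
```

With curves built in this way for each neuron, conditions and recording periods can be compared on a common stimulus rank axis, which is the form of data that a repeated-measures design requires.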
The physiological results regarding the shape selectivity of the IT neurons presented here fit well with psychophysical data from human and monkey experiments: both behavioural performance and neuronal shape selectivity were largely invariant to the elimination of colour, to the inversion of contrast sign and, to a lesser degree, to the elimination of texture and shading. These results agree with the hypothesis that the IT plays a significant role in the discrimination and recognition of degraded images of objects under the variety of conditions encountered in natural environments, independently of the cues present in the image.
References
Biederman I (1995) Visual identification. In: An invitation to cognitive science, vol. 2 (Kosslyn SF, Osherson DN, eds), pp. 121–165. Cambridge, MA: MIT Press.
Biederman I, Ju G (1988) Surface versus edge-based determinants of visual recognition. Cogn Psychol 20:38–64.
Cavanagh P (1991) Representations of vision: trends and tacit assumptions in vision research (Gorea A, Fregnac Y, Kapoula Z, Findlay J, eds), pp. 295–304. Cambridge: Cambridge University Press.
Davidoff JB, Ostergaard AL (1988) The role of colour in categorial judgements. Q J Exp Psychol A 40:533–544.
Dean P (1976) Effects of inferotemporal lesions on the behavior of monkeys. Psychol Bull 83:41–71.
Grossberg S, Mingolla E (1985) Neural dynamics of form perception: boundary completion, illusory figures, and neon color spreading. Psychol Rev 92:173–211.
Hanes P, Thompson KG, Schall JD (1995) Relationship of presaccadic activity in frontal eye field and supplementary eye field to saccade initiation in macaque: Poisson spike train analysis. Exp Brain Res 103:85–96.
Hayward WG, Tarr MJ, Corderoy AK (1999) Recognizing silhouettes and shaded images across depth rotation. Perception 28:1197–1215.
Ito M, Fujita I, Tamura H, Tanaka K (1994) Processing of contrast polarity of visual images in inferotemporal cortex of the macaque monkey. Cereb Cortex 4:499–508.
Ito M, Tamura H, Fujita I, Tanaka K (1995) Size and position invariance of neuronal responses in monkey inferotemporal cortex. J Neurophysiol 73:218–226.
Judge SJ, Richmond BJ, Chu FC (1980) Implantation of magnetic search coils for measurement of eye position: an improved method. Vision Res 20:535–538.
Kirk RE (1968) Experimental design: procedure for the behavioral sciences. Belmont, CA: Brooks-Cole.
Kovács G, Vogels R, Orban GA (1995a) Selectivity of macaque inferior temporal neurons for partially occluded shapes. J Neurosci 15:1984–1997.
Kovács G, Vogels R, Orban GA (1995b) Cortical correlate of pattern backward masking. Proc Natl Acad Sci USA 92:5587–5591.
Kovács G, Kéri S, Benedek G (1996) Object recognition based on surface and contour information. Perception 25(Suppl.):88b.
Kovács G, Sáry G, Köteles K, Chadaide Z, Benedek G (1998) Effect of surface attributes on the shape selectivity of inferior temporal neurons. Soc Neurosci Abstr 24:898 (355.4).
Kovács G, Sáry G, Köteles K, Chadaide Z, Fiser J, Benedek G, Biederman I (1999) Effect of contour deletion on the responses of inferior temporal neurons. Perception 28(Suppl.):13a.
Legéndy CR, Salcman M (1985) Bursts and recurrences of bursts in the spike trains of spontaneously active striate cortex neurons. J Neurophysiol 53:926–939.
Logothetis NK, Sheinberg DL (1996) Visual object recognition. Annu Rev Neurosci 19:577–621.
Ostergaard AL, Davidoff JB (1985) Some effects of color on naming and recognition of objects. J Exp Psychol Learn Mem Cogn 11:579–587.
Pigarev NI, Nothdurft HC, Kastner S (1997) A reversible system for chronic recordings in macaque monkeys. J Neurosci Methods 77:157–162.
Rolls ET, Tovee MJ (1994) Processing speed in the cerebral cortex and the neurophysiology of visual masking. Proc R Soc Lond B Biol Sci 257:9–15.
Sáry Gy, Vogels R, Orban GA (1993) Cue-invariant shape selectivity of macaque inferior temporal neurons. Science 260:995–997.
Subramaniam S, Biederman I (1997) Does contrast reversal affect object recognition? Invest Ophthalmol Vis Sci 38:46383628.
Schwartz EL, Desimone R, Albright TD, Gross CG (1983) Shape recognition and inferior temporal neurons. Proc Natl Acad Sci USA 80:5776–5778.
Sugase Y, Yamane S, Ueno S, Kawano K (1999) Global and fine information coded by single neurons in the temporal visual cortex. Nature 400:869–873.
Tanaka K (1996) Inferotemporal cortex and object vision. Annu Rev Neurosci 19:109–139.
Tanaka K, Saito H, Fukada Y, Moriya M (1991) Coding visual images of objects in the inferotemporal cortex of the macaque monkey. J Neurophysiol 66:170–189.
Tanaka H, Takanori U, Kenji Y, Makoto K, Ichiro F (2001) Processing of shape defined by disparity in monkey inferior temporal cortex. J Neurophysiol 85:735–744.
Tompa T, Chadaide Z, Lenti L, Csifcsák G, Kovács G, Benedek G (2001) Invariances of shape-processing for reduced surface cues: how IT neurons and psychophysics correlate in the macaque. In: Proceedings of the Third International Conference on Cognitive Science (ICCS 2001), pp. 40–44. Beijing: Press of USTC.
Ullman S (1989) Aligning pictorial description: an approach to object recognition. Cognition 32:193–254.
Vogels R (1999a) Categorization of complex visual images by rhesus monkeys. Part 1: behavioral study. Eur J Neurosci 11:1223–1238.
Vogels R (1999b) Categorization of complex visual images by rhesus monkeys. Part 2: single cell study. Eur J Neurosci 11:1239–1255.
Vogels R, Biederman I, Bar M (1999) Sensitivity of macaque inferior temporal neurons to variations in object shading. Soc Neurosci Abstr 25:918 (370.7).