1Department of Cognitive Neuroscience, Osaka University Medical School and 2Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, Osaka 565-0871; and 3Laboratory for Cognitive Neuroscience, Department of Biophysical Engineering, Graduate School of Engineering Science, Osaka University, Osaka 560-8531, Japan
ABSTRACT
Uka, Takanori, Hiroki Tanaka, Kenji Yoshiyama, Makoto Kato, and Ichiro Fujita. Disparity Selectivity of Neurons in Monkey Inferior Temporal Cortex. J. Neurophysiol. 84: 120-132, 2000. The inferior temporal cortex (IT) of the monkey, a final stage in the ventral visual pathway, has been known to process information on two-dimensional (2-D) shape, color, and texture. On the other hand, the dorsal visual pathway leading to the posterior parietal cortex has been known to process information on location in space. Likewise, neurons selective for binocular disparity, which convey information on depth, have been found mainly in areas along the dorsal visual pathway. Here, we report that many neurons in the IT are also selective for binocular disparity. We recorded extracellular activity from IT neurons and found that more than half of the neurons changed their response depending on the disparity added. The change was not attributed to monocular responses or eye movements. Most neurons selective for disparity were "near" or "far" cells; they preferred either crossed or uncrossed disparity, and only a small population was tuned to zero disparity. Disparity-selective neurons were also selective for shape. Most preferred the same type of disparity irrespective of the shape presented. Disparity preference was also invariant for the fronto-parallel translation of the stimuli in most of the neurons. Finally, nearby neurons exhibited similar disparity selectivity, suggesting the existence of a functional module for processing of binocular disparity in the IT. From the above and our recent findings, we suggest that the IT integrates shape and binocular disparity information, and plays an important role in the reconstruction of three-dimensional (3-D) surfaces.
INTRODUCTION
When horizontal binocular disparity is added to a
part of a two-dimensional (2-D) shape, we perceive a three-dimensional
(3-D) structure consisting of multiple surfaces at different depths and
orientations (Wheatstone 1838). Binocular disparity is
thus an important cue that aids the visual system in reconstructing 3-D
surface structures from 2-D retinal images.
The primary visual cortex (area 17 or V1) of the cat and monkey contains neurons that encode binocular disparity (Barlow et al. 1967; Poggio and Fischer 1977). In the monkey, disparity information is then thought to flow mainly along the magnocellular-dominated stream, and subsequently along the dorsal visual pathway. Disparity-selective neurons are found in greater numbers in the thick stripes of V2 than in the other stripes (Hubel and Livingstone 1987; Peterhans and Von der Heydt 1993; Roe and Ts'o 1995), and many areas in the dorsal visual pathway have been shown to contain disparity-selective neurons (Maunsell and Van Essen 1983; Roy et al. 1992; Sakata et al. 1997). Furthermore, DeAngelis et al. (1998) provided evidence from microstimulation experiments that neuronal activity in area MT indeed contributes to the discrimination of stereoscopic depth.
Neurons in the ventral visual pathway, on the other hand, have been shown to respond to the surface characteristics of objects such as color, 2-D shape, and texture. Neurons in V4 are selective for color (Schein and Desimone 1990; Zeki 1978) and shape (Gallant et al. 1993; Kobatake and Tanaka 1994), and neurons in the inferior temporal cortex (IT) are selective for color (Komatsu et al. 1992), texture (Sáry et al. 1995), and shape (Desimone et al. 1984; Gross et al. 1972; Tanaka et al. 1991). Information on the location of a stimulus in the fronto-parallel plane is largely lost in the IT, a characteristic known as position invariance (Ito et al. 1995; Schwartz et al. 1983).
To perceive 3-D surfaces, however, the brain must know the depth as well as the other attributes of a surface, such as its shape. How, then, is information on shape and location in depth integrated? One possibility is that the two types of information are integrated in a single neuron. Neurons selective for both disparity and shape might be found in either the dorsal or the ventral visual pathway.
Neurons in the dorsal visual pathway have long been thought to lack information on shape. Sereno and Maunsell (1998), however, recently reported shape-selective neurons in area LIP (also see Murata et al. 1996; Taira et al. 1990). Their finding suggests that neurons in the dorsal visual pathway may have the potential to integrate shape and disparity information, although they did not examine whether shape and disparity information converge in a single neuron.
A recent study by Janssen et al. (1999) showed that neurons in the IT, a higher stage in the ventral visual pathway, are selective for disparity gradients. Their finding suggests that IT neurons are sensitive to binocular disparity. In the present study, we address the potential role of the IT in stereopsis by quantitatively examining how IT neurons are tuned to binocular disparity. The results indicate that a large population of IT neurons is indeed selective for binocular disparity, as well as shape. Furthermore, we show that their disparity selectivity is invariant for position, and that neurons with similar disparity selectivity are clustered together. Preliminary results have been reported elsewhere (Uka et al. 1997).
METHODS
Subjects
Two male Japanese monkeys (Macaca fuscata, 7 and 11 kg body wt) were used. The monkeys were subjected to a psychophysical study described in a different paper (Uka et al. 1999)
and were confirmed to have stereoscopic vision. In short, the monkeys
were trained on a two-alternative forced choice discrimination task, where they discriminated a cross with crossed disparity on its horizontal limb from a cross with uncrossed disparity on its horizontal limb. After extensive training, discrimination tests were conducted to
investigate whether the training effect extended to crosses segmented
into two crossing bars by occluding contours. Transfer of the training
effect confirmed that the monkeys were discriminating depth
reconstructed from the disparity cues.
All animal care and experimental procedures were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals (1996) and were approved by the animal experiment committee of Osaka University Medical School. The monkeys were rewarded for correct responses with a drop of water during the experimental sessions. After each session, they were returned to their cages and given an adequate amount of vegetables or fruits. Water supply to the monkeys was restricted at their home cages throughout the experiments. Monkey chow was made available to them ad libitum.
Surgery
First, we attached a head holder to the monkeys' skulls to fix
their heads to the chair and implanted a search coil under the
conjunctiva of one eye for monitoring eye position (Judge et al.
1980), using standard aseptic surgical procedures and
pentobarbital sodium anesthesia (35 mg/kg ip). After the surgery, the
monkeys were treated with an antibiotic (piperacillin sodium, 30 mg/kg im), an analgesic (ketoprofen, 0.5 mg/kg im), and a corticosteroid (dexamethasone sodium phosphate, 0.1 mg/kg im) to reduce potential inflammation. They were allowed to recover for at least 3 wk before the
first training session for a fixation task. After the monkeys were able
to maintain their fixation for a 2-s period of stimulus presentation,
we performed a second operation. A recording chamber was attached to
one side of the monkeys' skull, and a search coil was implanted into
the other eye. After a further 2 wk of recovery, we started
electrophysiological recordings in the hemisphere with the recording
chamber. After completing the recordings in this hemisphere, we
performed a third surgery to attach a recording chamber to the other
side of the skull.
Task and visual stimuli
The monkeys were made to sit on a primate chair and face a
15-in. color monitor (screen size: 260 mm × 195 mm) placed 57 cm away. The monkeys were trained for a computer-controlled fixation task
(PC486FS: Epson, Suwa, Japan). The positions of both eyes were sampled
at the rate of 100 Hz using the search coil technique (Judge et
al. 1980) and stored for off-line analysis, although the
position of only one of the eyes was monitored on-line. Stimuli were
presented using a PC/AT computer (Asus Computer International, San
Jose, CA, display resolution: 1,024 × 768 pixels).
A gray spot (0.2 × 0.2°) was presented at the center of the monitor on a black background (luminance 1.0 cd/m²), and the monkeys were required to acquire fixation within a 2.0 × 2.0° "electronic" window within 500 ms. A stimulus appeared at the fixation point after another 500 ms. To receive a drop of water, the monkeys were required to maintain their fixation within the fixation window throughout the 2-s period of visual stimulus presentation. If they broke fixation, the trial was aborted immediately.
For each neuron recorded, effective stimuli were first determined from
a stimulus set consisting of bars, crosses, squares, a circle, an oval,
and a star (Fig. 1B) at zero
disparity. Except for the two short white bars (luminance 15.8 cd/m²), the figures were red (luminance 5.7 cd/m²), because red phosphors have a shorter persistence than those of other colors and therefore give better stereo separation between the two eyes. Only the neurons that
responded to one or more stimuli in the set at zero disparity were
further analyzed in this study. Disparity was then added to the most
effective figure, and in some cases, to sub-optimal and nonresponsive
figures as follows: nine disparities at −0.8, −0.4, −0.2, −0.1, 0, 0.1, 0.2, 0.4, and 0.8°. A maximum of five figures was used to
determine disparity selectivity for one neuron. Stereoscopic figures
were displayed using a liquid crystal stereoscopic modulator (SGS610:
Tektronix, Beaverton, Oregon, refresh rate: 70 Hz for each eye). The
stimuli at each disparity level were presented 10 times in random
order. For monocular stimulation, a blank screen with a fixation spot was presented to the unstimulated eye.
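As a rough worked example of the stimulus geometry, the sketch below converts a disparity in degrees into a horizontal offset on the monitor, using the viewing distance (57 cm), screen width (260 mm), and horizontal resolution (1,024 pixels) given above. The function names are hypothetical, and the mapping of the full 1,024-pixel width onto the 260-mm screen for each eye's image is an assumption that the text does not confirm.

```python
import math

VIEW_DIST_MM = 570.0   # viewing distance from the text (57 cm)
SCREEN_W_MM = 260.0    # screen width from the text
SCREEN_W_PX = 1024     # horizontal display resolution from the text

def deg_to_mm(deg, dist_mm=VIEW_DIST_MM):
    """Lateral extent on the screen that subtends `deg` degrees at the eye."""
    return 2.0 * dist_mm * math.tan(math.radians(deg) / 2.0)

def deg_to_px(deg):
    """Same extent in pixels (assumes the full pixel width spans the screen)."""
    return deg_to_mm(deg) * SCREEN_W_PX / SCREEN_W_MM

# Disparity magnitudes used in the study, in degrees.
DISPARITY_MAGNITUDES_DEG = [0.1, 0.2, 0.4, 0.8]

if __name__ == "__main__":
    for d in DISPARITY_MAGNITUDES_DEG:
        print(f"{d:.1f} deg -> {deg_to_mm(d):.2f} mm -> {deg_to_px(d):.1f} px")
```

At this viewing distance 1° corresponds to roughly 1 cm on the screen, so under these assumptions the smallest disparity (0.1°) would amount to only a few pixels.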
Electrophysiological recordings
We recorded the extracellular activity of single neurons from
three hemispheres of the two monkeys. A small hole (3 mm diam) was made
in the skull within the recording chamber a day before we started
recordings. An elgiloy recording electrode (tip diameter 7-15 µm,
impedance 2-3 MΩ at 1 kHz) was advanced from the lateral side of the
skull with a micromanipulator (MO-95s, Narishige, Tokyo) mounted on the
recording chamber. The electrode penetrated the dura mater to reach the
lateral surface of the IT. Single neurons were isolated using a
conventional amplifier and a window discriminator. The number of action
potentials recorded during the task was counted by a computer for
off-line analysis. After 1-2 wk of recording, the hole was closed with
dental cement, and a new hole was made for another 1-2 wk of
recording. This procedure was repeated until all the approachable areas
in the IT were thoroughly surveyed.
Histology
After all the experiments were completed, we implanted two pins
into the brain at the anterior and posterior edges of the recording
chamber. The animals were anesthetized with an overdose of
pentobarbital sodium (60 mg/kg ip), the chest cavity was opened, and
heparin (200 IU/kg) was injected into the heart. The animals were
transcardially perfused with 500 ml of phosphate-buffered saline (PBS,
37°C) and then with a fixative solution consisting of 1,000 ml of
ice-cold 4% paraformaldehyde, 0.1% glutaraldehyde in 0.1 M PBS, and
800 ml of ice-cold 4% paraformaldehyde in 0.1 M PBS. The brains were
removed, photographed, blocked, postfixed overnight in the
last-mentioned fixative, and soaked in 0.1 M PBS containing a graded
series of sucrose (10-30%). The location of the implanted pins was
verified for reconstruction of the recording area. The primary visual
cortex (V1) from these brains was used in a different anatomical study
(Wang et al. 1998).
Data analysis
The spontaneous firing rate was calculated from the 500-ms period immediately before stimulus presentation, while the monkey was fixating. The response magnitude was calculated from the firing rate during a 2-s period starting 80 ms after the onset of stimulus presentation. Both were calculated for each trial. The spontaneous activity was then averaged over all trials for each neuron, and the response magnitude was averaged over the 10 trials for each stimulus, together with its standard error of the mean. All statistical analyses were performed on the response magnitude.
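The windowing described above can be sketched as follows. This is a minimal illustration, assuming spike times are available in seconds relative to an arbitrary clock; the function names and data layout are hypothetical and are not taken from the original analysis software.

```python
import math

def firing_rate(spike_times_s, start_s, stop_s):
    """Spikes per second within the half-open window [start_s, stop_s)."""
    n = sum(1 for t in spike_times_s if start_s <= t < stop_s)
    return n / (stop_s - start_s)

def trial_rates(spike_times_s, stim_onset_s):
    """Return (spontaneous, response) rates for one trial.

    Windows follow the text: 500 ms before stimulus onset, and a 2-s
    period starting 80 ms after stimulus onset.
    """
    spont = firing_rate(spike_times_s, stim_onset_s - 0.5, stim_onset_s)
    resp = firing_rate(spike_times_s, stim_onset_s + 0.08, stim_onset_s + 2.08)
    return spont, resp

def mean_and_sem(values):
    """Mean and standard error of the mean across trials (needs >= 2 trials)."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)
    return m, math.sqrt(var / n)
```

For each stimulus, `mean_and_sem` would be applied to the 10 per-trial response rates, while the spontaneous rate would be averaged over all trials of the neuron.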
RESULTS
Recording site
Histological analysis revealed that our recording sites (striped area in Fig. 1A) were around the posterior middle temporal sulcus. The recording region contained areas TEd, TEOd, and the ventral bank of the superior temporal sulcus. The two dots in Fig. 1A show the location of the implanted pins. The data from all these areas were combined, since we found no difference in disparity selectivity among the areas.
Disparity selectivity of IT neurons
We recorded from 225 neurons (n = 89 in monkey 1, n = 136 in monkey 2), which responded to at least one stimulus in the initial stimulus set (Fig. 1B) at zero disparity. Of these, 142 (63.1%, n = 54 in monkey 1, n = 88 in monkey 2) changed their response depending on the binocular horizontal disparity added to at least one of the stimuli presented (ANOVA, P < 0.05). We will refer to these neurons as "disparity-selective neurons."
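The selectivity criterion above is a one-way ANOVA across disparity levels. As a sketch of that computation, the function below returns the F statistic and its degrees of freedom from per-trial response magnitudes grouped by disparity. The function name and input layout are hypothetical; the P value would then be read from the F distribution with a statistics package.

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic.

    groups: list of lists, one list of per-trial responses per disparity.
    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Variation of the group mean around the grand mean.
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Trial-to-trial variation inside the group.
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With nine disparities and 10 trials each, the degrees of freedom would be 8 and 81.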
Figure 2 shows an example of the
responses of a disparity-selective neuron. This neuron responded to a
red cross at zero disparity (cf. Fig. 8A). In addition to
the red cross, the neuron responded to a red vertically oriented bar at
nonzero disparity. Figure 2 shows the responses of the neuron to the
red vertical bar. ANOVA revealed a significant modulation of the
response amplitude of the neuron by the addition of binocular disparity
(P < 0.0001). The neuron responded more strongly when
a crossed disparity was added to the bar than when an uncrossed
disparity was added. The disparity tuning curve in Fig. 2B
shows that the neuron is a "near" cell as described by Poggio and Fischer (1977) or a "tuned near" cell as described by Poggio et al. (1988).
To determine whether this response modulation was caused simply by the translation of the stimulus figure in either eye, we examined the responses of the neuron to the disparity figures presented to each eye separately. The monocular tuning curves in Fig. 2B show that the response of the neuron was significantly modulated by the translation of the stimulus figure in the right eye but not in the left eye (ANOVA, P < 0.0001 and P > 0.05 for the right and left eye, respectively). However, the modulation was small and did not account for the binocular effect, which rules out the possibility that the neuron was responding merely to the translation of the stimulus figure. We analyze the responses to monocular figures further below.
Figure 3 shows an analysis of the eye movements during the presentation of the disparity stimuli to the neuron in Fig. 2. The traces of the vergence angle in Fig. 3A show that the monkey did not respond with vergence movements to disparity stimuli. The monkey's average vergence angle deviated by no more than 0.1° with the introduction of disparity stimuli, as can be seen in Fig. 3B. There was also no difference in the average vergence angle for various disparity stimuli (ANOVA, P > 0.3). This was also true for other neurons. In most cases (94%) in which eye positions were compared between zero-disparity trials and trials with each disparity level, the average vergence angle deviated by no more than 0.1° (by no more than 0.05° in 71% of the cases). This indicates that for most trials with disparity cues, the monkeys looked at the same plane in depth as they did for zero-disparity trials. From these control studies, we conclude that the response modulation of the neuron was due neither to monocular responses nor to eye movements, but represented genuine selectivity for binocular disparity.
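The vergence control can be illustrated with a short sketch. It assumes that the vergence angle is taken as the difference between the horizontal positions of the two eyes (left minus right is an arbitrary sign convention chosen here) and that each trial has been reduced to its mean vergence angle. The function names and the dictionary layout are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def vergence_trace(left_h_deg, right_h_deg):
    """Sample-by-sample vergence angle from horizontal eye positions (deg)."""
    return [l - r for l, r in zip(left_h_deg, right_h_deg)]

def vergence_deviation(trial_means_by_disparity, zero_key=0.0):
    """Mean vergence at each nonzero disparity minus that at zero disparity.

    trial_means_by_disparity: {disparity_deg: [mean vergence per trial, ...]}
    """
    baseline = mean(trial_means_by_disparity[zero_key])
    return {
        d: mean(v) - baseline
        for d, v in trial_means_by_disparity.items()
        if d != zero_key
    }

def fraction_within(deviations, limit_deg=0.1):
    """Fraction of comparisons whose absolute deviation is at most limit_deg."""
    values = list(deviations.values())
    return sum(1 for v in values if abs(v) <= limit_deg) / len(values)
```

Applied across all neurons and disparity levels, `fraction_within` with limits of 0.1° and 0.05° would give the kind of percentages reported above.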
Types of disparity-selective neurons
Poggio and Fischer (1977) defined four types of
disparity tuning. In Fig. 4, we show
examples of disparity-selective IT neurons for each type. Figure
4A shows a typical "far" neuron that responded more
strongly to uncrossed disparity cues than to crossed disparity cues.
This neuron is complementary to the "near" neuron shown in Fig. 2.
Figure 4B shows a typical "tuned excitatory" neuron that
responded strongly to zero disparity: the tuning curve was sharp and
symmetric around the peak. Figure 4C shows a typical "tuned inhibitory" neuron, inhibited at zero disparity, with a sharp symmetric tuning around the peak. Some neurons, however, did not
fall into any of these four types. The neuron in Fig. 4D,
for example, had a tuning curve that could be classified either as a
near neuron or a tuned inhibitory neuron. We categorized these
intermediate types of neurons as "unclassifiable." The
classification procedure was purely subjective.
Most of the neurons to which we presented more than one figure exhibited the same type of disparity tuning for each figure and were classified accordingly. A small population of neurons that showed different types of disparity tuning for different shapes was classified as "mixed."
Figure 5 shows the distribution of each type of neuron. Most of the disparity-selective neurons were classified as near cells or far cells (40% in monkey 1, 39% in monkey 2), and there were only a few tuned excitatory and tuned inhibitory neurons (3% in monkey 1, 7% in monkey 2). Two findings were consistent across the two monkeys: more than half of the IT neurons were disparity selective, and most of these were near or far cells. The distribution was similar to that observed in the MT (Maunsell and Van Essen 1983) and MST (Roy et al. 1992). In the present analysis, we have included the "tuned near" and "tuned far" cells described by Poggio et al. (1988) in the near and far cells, respectively. We might have underestimated the percentage of tuned near or tuned far neurons, as well as that of tuned inhibitory neurons, since we did not analyze neurons that did not respond at zero disparity. Although most of the disparity-selective neurons were classified into the discrete types described by Poggio and Fischer (1977), a substantial number (16% in monkey 1, 13% in monkey 2) were unclassifiable. These data lead one to question the existence of discrete types of disparity tuning in the IT.
Tuning of disparity-selective neurons
We next determined the disparity tuning index to show the degree
of modulation by disparity of the response of these neurons. The
disparity tuning index is calculated using the following formula
![]() |
Monocular responses and binocular interaction
Most neurons in the IT were binocular in that they responded equally to monocular presentation of a figure to either eye, whether or not they were disparity selective. The ocular dominance histogram based on the responses of 59 neurons indicates that most neurons have no preference for the ipsilateral or contralateral eye (Fig. 7, class 4). A large population of neurons (36%, 21/59, class I) was not responsive to monocular figures, although these neurons were responsive to binocular figures, while some neurons responded more strongly to monocular figures than to binocular figures. However, we do not know whether any neuron responded only to monocular and not to binocular presentation of a figure, because we did not analyze neurons that were not activated binocularly.
The presentation of monocular figures of the disparity stimuli to either the right or left eye, which results in a translation of the figure across the fronto-parallel plane, did modulate the responses of the neurons. A large population of neurons (69%, 41/59) modulated their responses for the right or left eye depending on the location of the stimuli as shown in Fig. 2B (ANOVA, P < 0.05). Although we do not know the exact cause of the modulation by the translation of the monocular figure across the fronto-parallel plane, one explanation could be that the neurons were responding to the combined shape of the fixation point and the stimulus. Because the fixation point was visible during the presentation of the figures, and because the figures were presented on the fixation point, the translation of monocular figures resulted in a change of the compound shape of the visual stimuli, which might have modulated the responses of the neurons. In any case, the modulation by translation in one eye did not explain the selectivity for binocular disparity of the cell shown in Fig. 2B, as described above. We now examine whether this holds true for other neurons.
We conducted a two-way ANOVA between each set of monocular responses (i.e., responses to the right or left eye) and the binocular responses to the most effective stimulus for each of the 39 disparity-selective neurons from which we were able to obtain monocular responses. The main factors were ocularity (binocular and right eye, or binocular and left eye) and translation across the fronto-parallel plane. We defined a neuron as having binocular selectivity (not explained by monocular modulation) if the two-way ANOVA showed significance for either ocularity or the interaction between ocularity and translation (P < 0.05). Ninety percent (35/39) of the neurons showed a significant effect for either ocularity or the interaction, indicating that for the majority of neurons, the binocular responses did not correspond to responses to either the right or left eye.
We also tested whether a linear summation or an average of the monocular responses could explain the binocular responses. We conducted a two-way ANOVA between the summation or the average of the net monocular responses and the net binocular responses to the most effective stimulus for the 39 disparity-selective neurons. We defined a neuron as having binocular selectivity explainable by a linear summation or an average of the monocular responses if it showed significance for neither ocularity nor the interaction between ocularity and translation (P > 0.05). Only 2 of the 39 neurons met this criterion for the linear summation, and only 1 met it for the average. The results show that for most neurons, neither a linear summation nor an average of the monocular responses can explain the binocular responses.
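The two monocular predictions compared with the binocular responses can be sketched as below. The sketch assumes that a "net" response is the response magnitude minus the spontaneous firing rate, which the text does not state explicitly, and it only builds the predictions; the two-way ANOVA itself is not reproduced. All names are hypothetical.

```python
def net(responses, spontaneous):
    """Net responses: response magnitude minus spontaneous rate (assumed)."""
    return [r - spontaneous for r in responses]

def monocular_predictions(left, right, spontaneous):
    """Predicted binocular tuning from monocular tuning curves.

    left, right: mean responses at each stimulus position for each eye.
    Returns (linear_sum, average), both as net responses.
    """
    net_left = net(left, spontaneous)
    net_right = net(right, spontaneous)
    linear_sum = [a + b for a, b in zip(net_left, net_right)]
    average = [(a + b) / 2.0 for a, b in zip(net_left, net_right)]
    return linear_sum, average

def residuals(binocular, prediction, spontaneous):
    """Net binocular response minus a prediction, position by position."""
    return [b - p for b, p in zip(net(binocular, spontaneous), prediction)]
```

Large, disparity-dependent residuals would correspond to the case reported above, in which neither prediction accounts for the binocular tuning.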
Shape selectivity of disparity-selective neurons
Next, we examined whether disparity-selective IT neurons were also selective for shape. If disparity-selective neurons were also selective for shape, the integration of shape and disparity information in a single neuron would enable them to encode the depth of a particular shape. Figure 8 shows two examples of neurons selective for both disparity and shape. Figure 8A shows the responses of the neuron in Fig. 2 to horizontal and vertical bars, a cross, a star, and a circle. The response magnitude changed depending on the shape of the stimulus. Two-way ANOVA revealed a main effect for shape and disparity (P < 0.0001), and also the interaction between shape and disparity (P < 0.0001).
For 36 disparity-selective neurons, we examined the disparity selectivity for more than three shapes. Two-way ANOVA revealed that all of the 36 neurons were selective for shape in that they exhibited a main effect for shape (P < 0.05). A majority of neurons (78%, 28/36) also exhibited a significant interaction between shape and disparity (P < 0.05).
The interaction between shape and disparity information observed in the two-way ANOVA, however, seems to be a side effect of shape selectivity. The neuron in Fig. 8A exhibited a significant interaction (P < 0.0001), yet it exhibited a near type response for most of the figures presented, and none of the figures produced a far type response. The same observation holds for the neuron in Fig. 8B. This neuron exhibited a far type response for all the figures and never exhibited a near type response (interaction, P < 0.0001). The interaction seems to arise because some figures elicited no or weak response, or because responses to some shapes were more sharply tuned to a particular disparity than those to others. Overall, the neurons seemed to have the same type of disparity preference (i.e., near or far preference) when presented with different shapes. Indeed, only 5 of the 36 neurons changed their type of disparity tuning (near, far, tuned excitatory, or tuned inhibitory) when presented with different shapes. Of the five neurons that did change their disparity tuning, three changed from near or far to tuned excitatory or tuned inhibitory, while only two changed from near to far. Thus the peak of the disparity tuning curve did not change much.
To show quantitatively that neurons did not change their disparity preference, we focused on 28 disparity-selective neurons for which we examined disparity selectivity with more than three shapes and which had a near or far preference. First, we calculated the average response to crossed and uncrossed disparity cues for the shape that gave the maximal response. We classified a neuron as having a near or far preference if the average response to crossed disparity cues was larger or smaller, respectively, than that to uncrossed disparity cues (t-test, 2-tailed, P < 0.05). Then we calculated the average responses to crossed and uncrossed disparity cues for the other shapes tested.
In Fig. 9, we plotted the average responses to crossed disparity cues after subtracting those to uncrossed disparity cues for the shape that gave the maximal response on the abscissa, and the average responses to crossed disparity cues after subtracting those to uncrossed disparity cues for other sub-optimal shapes on the ordinate. If the type of disparity tuning does not change, there should be a positive correlation between the abscissa and the ordinate. Open circles correspond to shapes for which the neuron exhibited a significant near or far preference, whereas closed circles correspond to those without significant preference. A majority of open circles (95%, 35/37) were found in the top right quadrant or the bottom left quadrant, indicating that they had the same type of disparity preference as the shape that gave the maximal response. Furthermore, there was a strong positive correlation between the abscissa and the ordinate (r = 0.81, P < 0.0001, n = 71), which supported the idea that most neurons have the same type of disparity preference even if they are presented with different shapes. Therefore although the two-way ANOVA revealed an interaction between shape and disparity information, a majority of neurons did not change their disparity preference depending on the shape presented, but rather only changed their response magnitude and/or tuning width.
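The quantity plotted on each axis, and the quadrant count, can be sketched as follows. The sketch assumes that crossed disparities are coded as negative values and uncrossed disparities as positive values; that sign convention, the function names, and the dictionary layout are illustrative assumptions.

```python
def mean(xs):
    return sum(xs) / len(xs)

def crossed_minus_uncrossed(tuning):
    """Average response to crossed minus average response to uncrossed cues.

    tuning: {disparity_deg: mean response}; negative keys are treated as
    crossed and positive keys as uncrossed (assumed convention).
    """
    crossed = [r for d, r in tuning.items() if d < 0]
    uncrossed = [r for d, r in tuning.items() if d > 0]
    return mean(crossed) - mean(uncrossed)

def same_quadrant_fraction(points):
    """Fraction of (x, y) points in the top right or bottom left quadrant.

    x: difference for the shape giving the maximal response;
    y: difference for another shape tested on the same neuron.
    """
    agree = sum(1 for x, y in points if x * y > 0)
    return agree / len(points)
```

A positive difference corresponds to a near preference and a negative one to a far preference, so points in the top right or bottom left quadrant are those for which the preference is the same for both shapes.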
Position invariance of disparity selectivity
Next, we examined whether the responses of the disparity-selective neurons were invariant for position. If a neuron was selective for disparity on a stimulus shape presented at the center of the monitor, we presented the same disparity-added shape outside the center of the monitor in random sequence. Figure 10A shows an example of a neuron responding to disparity-added shapes presented at different locations. The stimuli were presented 2° from the center to the ipsilateral, contralateral, upper, or lower visual fields. This neuron had a far cell type response, regardless of whether the stimulus was presented in the center, ipsilateral, contralateral, or lower visual field, although the response was weaker and modulated to the largest extent by disparity (i.e., the disparity tuning index was the largest) at the center. However, the different degrees of modulation between different stimulus locations might be attributable to the fact that we did not present the stimuli at the center in the same block of trials as those in which the stimuli were presented outside the center.
We examined a total of nine disparity-selective neurons at five positions as above, and five other disparity-selective neurons at three positions (center, and 2° to the ipsilateral and contralateral visual fields; an example is shown in Fig. 10B). Twelve of the 14 neurons exhibited a near or far response to the stimulus presented at the center (t-test, 2-tailed, P < 0.05). For these neurons, we plotted the average responses to crossed disparity cues after subtracting those to uncrossed disparity cues for the stimulus presented at the center on the abscissa, and the average responses to crossed disparity cues after subtracting those to uncrossed disparity cues for other positions on the ordinate (Fig. 11). If the type of disparity tuning does not change, there should be a positive correlation between the abscissa and the ordinate. Open circles correspond to locations where the neuron produced a significant near or far preference, whereas closed circles correspond to those without significant preference. A majority of open circles (77%, 23/30) were in the top right quadrant or the bottom left quadrant, indicating that they had the same type of disparity preference as the stimulus presented in the fovea. Furthermore, there was a strong positive correlation between the abscissa and the ordinate (r = 0.65, P < 0.0001, n = 40), which supported the idea that for most neurons, the type of disparity preferred is position invariant.
In the present study, we did not map the receptive fields of the neurons. Thus we do not know how much of each receptive field was covered by the translation of the figures. The receptive fields of IT neurons are known to be large (Boussaoud et al. 1991; Kobatake and Tanaka 1994). According to Kobatake and Tanaka (1994), the average sizes of the receptive fields were 16.5 and 5.4° for areas TEd and TEOd, respectively. The receptive fields of most of the neurons covered the center of vision. Therefore the figures were presumably presented inside the receptive fields of the neurons analyzed in the present study. For neurons in TEOd, the translation of 2° presumably covered most of the receptive field. For neurons in area TEd with large receptive fields, however, the translation of 2° presumably covered only a small part of the receptive field. Nonetheless, to the extent that we were able to examine, the type of disparity preference was position invariant for most IT neurons.
Local clustering of disparity-selective neurons
In some cases, we were able to record from more than one unit with a single electrode: either from two single units, or from one single unit and a background multiple unit. We next report the results of such simultaneous recordings.
For multiple unit recordings, we used two conventional window discriminators. Recordings from more than one unit were made only when the isolation of one single unit (or 2 single units) was reliable, to prevent contamination of spikes. The upper level of one window discriminator was set substantially below the lower level of the other window discriminator to exclude the spikes of the single unit from the other single unit or the multiple unit. The spontaneous firing rate of the multiple unit was on average 7.3 times higher than that of the single unit. Assuming comparable firing rates across neurons, our multiple unit recordings therefore reflected the activity of approximately seven neurons.
Figure 12A shows the
disparity tuning curves from two single neurons recorded simultaneously
from the same electrode. Both neurons were near tuned, had a large peak
at 0.4° and a small peak at 0.4°, and exhibited a strong
inhibition of responses below their spontaneous firing rates. To
analyze the similarity of the responses of the two neurons, we
calculated Pearson's correlation coefficient of the average response
magnitude to disparity stimuli between the two neurons. Values near 1 indicate that the response selectivity for disparity was similar for
the two neurons. We will refer to this value as the response
correlation. The response correlation r for this pair of
neurons was 0.95 (P < 0.0001).
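As a rough sketch of the response correlation, Pearson's coefficient can be computed between the mean responses of two units at the same set of disparities. The tuning curves below are hypothetical values chosen for illustration, and the function name is an assumption.

```python
import math


def response_correlation(curve_a, curve_b):
    """Pearson's r between two tuning curves sampled at the same disparities."""
    if len(curve_a) != len(curve_b):
        raise ValueError("tuning curves must have the same number of points")
    n = len(curve_a)
    mean_a = sum(curve_a) / n
    mean_b = sum(curve_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(curve_a, curve_b))
    var_a = sum((a - mean_a) ** 2 for a in curve_a)
    var_b = sum((b - mean_b) ** 2 for b in curve_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a flat tuning curve has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical mean responses (spikes/s) of two units at seven disparities.
unit_1 = [30.0, 22.0, 10.0, 4.0, 6.0, 5.0, 3.0]
unit_2 = [18.0, 15.0, 9.0, 2.0, 3.0, 4.0, 1.0]

print(response_correlation(unit_1, unit_2))
```

Because the coefficient is invariant to scaling and offset, two units with very different firing rates can still have a response correlation near 1 when their tuning curves have the same shape.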
Figure 12B shows the disparity tuning curves from one single
neuron and the background multiple unit recorded simultaneously from
the same electrode. The single neuron is the same as the one shown in
Fig. 2. Both of the units were near tuned, and peaked at 0.4°. The
response correlation r for this pair of units was 0.85 (P < 0.005). The spontaneous firing rate of the
multiple unit was 14 times larger than that of the single unit.
We recorded simultaneously from 30 pairs of units (13 pairs of 2 single
neurons and 17 pairs of a single unit and a multiple unit) in which at
least one of the neurons was selective for disparity. Figure
13A shows the distribution
of the response correlation for these pairs of units. Only the
responses to the most effective shape were analyzed for each unit. The
overall distribution was shifted toward positive values (median = 0.30). The distribution was also similar for single unit-single unit
pairs and single unit-multiple unit pairs. We also
calculated the response correlation for pairs of units that were not
recorded simultaneously; i.e., pairs of units from different locations.
We calculated all possible combinations of response correlations for
the 60 units from 30 recording sites, and excluded those calculated
from the same recording site (Fig. 13B). The distribution of
response correlation from different recording sites peaked near 0 (median =
0.01), showing no overall correlation of disparity
selectivity between pairs of neurons recorded apart. The distributions
shown in Fig. 13, A and B, are significantly
different (Mann-Whitney U test, P < 0.0005).
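The construction of the control distribution can be sketched as follows. The code enumerates all pairs of units that come from different recording sites and computes a Mann-Whitney U statistic by direct counting. It is an illustration under assumptions, not the authors' procedure: the site labels assume two units per site, the function names are hypothetical, and the P value is not computed (a library routine such as `scipy.stats.mannwhitneyu` could be used for that).

```python
from itertools import combinations


def cross_site_pairs(site_of_unit):
    """Index pairs of units recorded at different sites."""
    return [
        (a, b)
        for a, b in combinations(range(len(site_of_unit)), 2)
        if site_of_unit[a] != site_of_unit[b]
    ]


def mann_whitney_u(xs, ys):
    """U statistic for xs: number of (x, y) pairs with x > y, ties count 0.5."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# 60 units from 30 sites, assuming two units per site.
sites = [i // 2 for i in range(60)]
pairs = cross_site_pairs(sites)
print(len(pairs))  # 1770 possible pairs minus 30 same-site pairs = 1740

# Hypothetical response correlations for same-site and cross-site pairs.
same_site_r = [0.9, 0.5, 0.3]
cross_site_r = [0.1, -0.2]
print(mann_whitney_u(same_site_r, cross_site_r))
```

A rank-based test is a natural choice here because correlation coefficients are bounded and their distributions need not be normal.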
The results indicate that nearby neurons have similar disparity selectivity; in other words, neurons with similar disparity selectivity are clustered together. This suggests that a functional module of disparity-selective neurons exists in the IT.
DISCUSSION
The present study shows that IT neurons process binocular
disparity in addition to color (Komatsu et al. 1992),
texture (Sáry et al. 1995), and 2-D shape
(Desimone et al. 1984; Gross et al. 1972; Tanaka et al. 1991). More than half of the neurons in
the IT were disparity selective as well as shape selective, and their disparity selectivity was invariant for position. This suggests that
similar disparity and shape information from different retinotopic locations converges onto a single IT neuron. Furthermore, we showed that neurons with similar disparity selectivity are clustered together.
Together with our recent findings on the role of the interaction
between shape and disparity information (Uka et al. 1997), the present results suggest that the IT plays an
important role in the reconstruction of 3-D surfaces.
Disparity selectivity in IT and other visual areas
Poggio and Fischer (1977) were the first to
describe disparity-selective neurons in areas V1, V2, and V3 in awake
monkeys. Subsequent studies showed that there exists a larger population of disparity-selective neurons in the thick stripes than in the
thin and pale stripes in V2 (Hubel and Livingstone 1987; Peterhans and Von der Heydt 1993; Roe and Ts'o 1995). Along with the findings that many neurons in the
dorsal visual pathway leading to the posterior parietal cortex are
disparity selective (Maunsell and Van Essen 1983; Roy et al. 1992; Sakata et al. 1997), the
dorsal visual pathway has been thought to be important for stereopsis (e.g., Sakata et al. 1997). Results of positron-emission
tomography (PET) studies in human subjects (Gulyas and Roland 1994) are also consistent with this idea. Activation was
observed mainly in the occipital and parietal regions, but not in the
temporal regions, during visual discrimination of binocular disparity.
The present results show that the dorsal visual pathway is not the only
pathway that processes disparity information. This is consistent with a
recent finding that IT neurons respond to disparity gradients
(Janssen et al. 1999). Rather than having an independent
pathway for disparity processing, the visual system seems to process
disparity information in parallel in several visual cortical areas,
each possibly playing different roles in stereopsis. Further studies
are necessary to distinguish among the different roles of different
visual areas in stereopsis.
DeAngelis and Newsome (1999) recently reported that
disparity-selective neurons in MT are clustered to form a columnar
organization. The observation that disparity-selective IT neurons are
locally clustered is consistent with this finding, although we do not know whether disparity-selective IT neurons are clustered into columns.
Clustering of disparity-selective neurons may be a common feature
across different visual areas. It should be examined whether disparity-selective neurons in other areas are also clustered. It would
also be interesting to investigate the relationship between disparity
clusters and columns for visual features of objects (Fujita et al. 1992) in the IT.
How do IT neurons become disparity selective?
The binocular responses of disparity-selective IT neurons could
not be explained by either a linear summation or an average of the
monocular responses. This is unlike simple cells in V1 that encode
binocular disparity with a characteristic receptive field structure
(Anzai et al. 1997; DeAngelis et al. 1991). The responses of disparity-selective IT neurons are more
similar to those of complex cells, which are sensitive to
small changes in binocular disparity, although this sensitivity is not due to the
translation of stimuli in each eye (Ohzawa et al. 1990).
Just as disparity-selective complex cells are thought to receive
information from disparity-selective simple cells (Ohzawa et al. 1997), it is likely that IT neurons do not detect binocular
disparity per se, but receive information on disparity from neurons in
earlier visual areas. The fact that major inputs into the IT do not
arise from neurons that carry information on monocular images, such as
those in the LGN or V1, is also consistent with this idea (but
see Yukie and Iwai 1981).
Where do disparity-selective IT neurons receive their inputs from? As a
larger population of disparity-selective neurons exists in the thick
stripes than in the thin and pale stripes in V2 (Hubel and
Livingstone 1987; Peterhans and Von der Heydt 1993; Roe and Ts'o 1995), the majority of the
disparity information fed to the dorsal visual pathway is thought to
arise from disparity-selective neurons in the thick stripes. On the
other hand, projections to the IT are known to arise from the thin and
pale stripes of V2, via V4 (Felleman et al. 1997). Thus
a major candidate pathway for conveying disparity information into the
IT would be from the disparity-selective neurons in the thin stripes through V4,
although the thin stripes contain a smaller population of disparity-selective neurons
than the thick stripes (Peterhans and Von der Heydt 1993). Another candidate pathway would be from the
disparity-selective neurons in the thick stripes, into V4, via some
other cortical areas such as V3, V3A, or MT. In any case,
disparity-selective neurons should exist in V4. DeYoe and Van
Essen (1985) reported that some neurons in V4 have binocular interaction, although no study has ever extensively examined the effects of binocular disparity on the responses of V4 neurons (but see
Felleman and Van Essen 1987). Yet another candidate
pathway for conveying disparity information into the IT would be from the posterior parietal cortex to areas in the superior temporal sulcus (STS), which in turn project to the IT.
The result that nearby IT neurons have the same type of disparity selectivity irrespective of the shape or location of the stimulus imposes a constraint on the organization of the inputs into disparity-selective IT neurons. As discussed above, IT neurons possibly do not detect binocular disparity per se, but receive information on disparity from neurons in earlier visual areas. Since nearby IT neurons have the same type of disparity selectivity irrespective of the shape or location of the stimulus, they are likely to receive the same type of disparity information from earlier cortical areas. We propose that disparity information inputs from local disparity detectors into clusters of disparity-selective IT neurons, arising first from V1, come from neurons with similar disparity selectivity, irrespective of their orientation selectivity or receptive field location, at least in central vision.
Role of disparity-selective neurons in the IT
The presence of disparity-selective neurons in the IT does not
necessarily mean that the IT is involved in stereoscopic depth perception. For example, Cumming and Parker (1997) showed that responses of disparity-selective neurons in monkey V1 do
not correlate with perception of depth, but rather signal information
possibly used for vergence eye movements (Masson et al. 1997).
Cowey and Porter (1979) found that lesions of the monkey
IT impair the ability of the animal to discriminate depth in random dot
stereograms. Ptito and Zatorre (1988) and Ptito et al. (1991) also found that lesions of the temporal lobe
impair the ability of humans to discriminate depth in random dot
stereograms without disturbing stereoacuity. These results suggest that
the role of the IT or the ventral visual pathway in stereopsis is to
reconstruct 3-D surfaces from local disparity cues.
Cowey (1985) later reported that lesions of the monkey
IT impaired stereoacuity. Although this finding contradicts those reported in humans by Ptito et al. (1991), it is
possible that disparity-selective IT neurons are involved in
stereoacuity. How the coarse tuning of disparity-selective neurons can
achieve stereoacuity, however, remains to be explained (Lehky
and Sejnowski 1990).
Recently, we found that a population of disparity-selective IT neurons
is not selective only for local disparity cues. Some disparity-selective IT neurons responded to the depth order of surfaces, irrespective of the type of disparity added (Uka et al. 1997), and also irrespective of whether the cues defining the structure were binocular disparity or occlusion cues (our
unpublished observation). Thus an important role of disparity-selective
IT neurons is to represent the depth order of surfaces reconstructed from local disparity cues, and not to detect and signal the local disparity cues per se. Consistent with these findings, Janssen et al. (1999) recently described evidence that some IT neurons respond to disparity gradients rather than the local disparity cues.
This suggests that IT neurons may be involved in the reconstruction of
3-D shape.
Another possible role of disparity-selective IT neurons is to process
disparity cues to reconstruct shape from disparity, such as in random
dot stereograms. Indeed, we have recently found that some neurons in
the IT respond to shape in random dot stereograms (Tanaka et al.
1999).
In spite of the various speculations, the role of disparity-selective IT neurons still remains unknown. Future studies correlating the responses of IT neurons with a particular behavior might clarify the role of disparity-selective IT neurons in stereopsis.
ACKNOWLEDGMENTS
We thank Dr. Gregory C. DeAngelis for valuable comments on the manuscript.
This work was supported by grants to I. Fujita from Core Research for Evolutional Science and Technology of the Japan Science and Technology Corporation, Science and Technology Agency (Special Coordination Funds for Promoting Science and Technology), and the Ministry of Education, Science, Sports, and Culture (09268222). T. Uka and H. Tanaka are recipients of the Japan Society for the Promotion of Science Research Fellowship for Young Scientists.
FOOTNOTES
Address for reprint requests: I. Fujita, Laboratory for Cognitive Neuroscience, Dept. of Biophysical Engineering, Graduate School of Engineering Science, Osaka University, Machikaneyama 1-3, Toyonaka, Osaka 560-8531, Japan (E-mail: fujita{at}bpe.es.osaka-u.ac.jp).
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received 15 November 1999; accepted in final form 15 March 2000.
REFERENCES