Group in Vision Science, School of Optometry, University of California, Berkeley, California 94720-2020
ABSTRACT
Anzai, Akiyuki, Izumi Ohzawa, and Ralph D. Freeman. Neural Mechanisms for Processing Binocular Information II. Complex Cells. J. Neurophysiol. 82: 909-924, 1999. Complex cells in the striate cortex exhibit extensive spatiotemporal nonlinearities, presumably due to a convergence of various subunits. Because these subunits essentially determine many aspects of a complex cell receptive field (RF), such as tuning for orientation, spatial frequency, and binocular disparity, examination of the RF properties of subunits is important for understanding functional roles of complex cells. Although monocular aspects of these subunits have been studied, little is known about their binocular properties. Using a sophisticated RF mapping technique that employs binary m-sequences, we have examined binocular interactions exhibited by complex cells in the cat's striate cortex and the binocular RF properties of their underlying functional subunits. We find that binocular interaction RFs of complex cells exhibit subregions that are elongated along the frontoparallel axis at different binocular disparities. Therefore responses of complex cells are largely independent of monocular stimulus position or phase as long as the binocular disparity of the stimulus is kept constant. The binocular interaction RF is well described by a sum of binocular interaction RFs of underlying functional subunits, which exhibit simple cell-like RFs and a preference for different monocular phases but the same binocular disparity. For more than half of the complex cells examined, subunits of each cell are consistent with the characteristics specified by an energy model, with respect to the number of subunits as well as relationships between the subunit properties. Subunits exhibit RF binocular disparities that are largely consistent with a phase mechanism for encoding binocular disparity. 
These results indicate that binocular interactions of complex cells are derived from simple cell-like subunits, which exhibit multiplicative binocular interactions. Therefore binocular interactions of complex cells are also multiplicative. This suggests that complex cells compute something analogous to an interocular cross-correlation of images for a local region of visual space. The result of this computation can be used for solving the stereo correspondence problem.
INTRODUCTION
Complex cells are nonlinear computing devices.
This was already apparent in Hubel and Wiesel's original description
of complex cells in the cat's striate cortex (Hubel and Wiesel
1962). They observed that receptive fields (RFs) of complex
cells generally do not show discrete ON and OFF
subregions, but appear to consist of overlapping ON and
OFF regions. The subregions, when found, do not follow the
rules of summation between ON (or OFF)
subregions and antagonism between ON and OFF
subregions. In fact, complex cells respond to a stimulus regardless of
its position within the RF. Otherwise, like simple cells, they exhibit
selectivity to stimulus orientation. As a possible scheme for
explaining complex cell RFs, Hubel and Wiesel (1962)
proposed a hierarchical model in which simple cells with similar
orientation preferences but different RF positions feed into a complex cell.
Because complex cells do not satisfy the principle of linear
superposition, their first-order responses (e.g., responses to single
bars) do not predict their RF properties such as tuning for
orientation, spatial frequency, and binocular disparity. However, researchers have found that second-order responses (e.g., responses to
pairs of bars) do provide useful predictions for RF properties of
complex cells. For example, Movshon et al. (1978)
measured two-bar interaction profiles of complex cells in the cat's
striate cortex and found that the interaction profiles along the
direction perpendicular to the cells' preferred orientations exhibit
ON and OFF subregions similar to those of
simple cells (see also Baker and Cynader 1986; Gaska et al. 1994; Rybicki et al. 1972).
They showed that the inverse Fourier transform of spatial frequency tuning measured with drifting sinusoidal gratings agrees well with the
two-bar interaction profile (see also Gaska et al. 1994). This suggests that there are linear subunits underlying
the RFs of complex cells. Later, Szulborski and Palmer (1990)
measured two-dimensional profiles of the second-order
interaction using a pair of small square or rectangular stimuli and
found that the interaction profile consists of ON and
OFF subregions that are elongated along the axis of a
cell's preferred orientation (see also Heggelund 1981).
The second-order interaction also has been examined in the joint
space-time domain (Baker and Cynader 1986; Emerson et al. 1987, 1992; Gaska et al. 1994; Movshon et al. 1978).
Emerson et al. (1987, 1992) measured two-bar interactions exhibited by
complex cells in the cat's striate cortex using ternary white noise.
They found that direction-selective complex cells exhibit
space-time-oriented interaction profiles, indicating that underlying
subunits are direction selective. Because the interaction does not
depend on the positions of the two bars within the RF as long as the
interspacing and time offset of the bars are kept constant, they
concluded that subunits are distributed uniformly across the RF.
Ohzawa et al. (1990, 1997) examined the second-order
interaction between the two eyes by measuring binocular interaction
profiles of complex cells in the cat's striate cortex with a pair of
bars (one in each eye) flashed randomly across the RF. They found that the profiles are largely independent of the monocular stimulus position. That is, complex cells respond to bars regardless of their
monocular positions as long as the interocular spatial offset, i.e.,
binocular disparity, is kept constant (see also von der Heydt et al. 1978
for a similar observation). These results suggest that
underlying subunits are binocular and share the same optimal binocular
disparity (see also Ohzawa and Freeman 1986).
All of these studies indicate that complex cells are composed of
subunits that are, to a first approximation, linear. Subunit RFs are
strikingly similar to those of simple cells, and they seem to determine
RF properties of complex cells. Therefore these results are consistent
with the hierarchical model of Hubel and Wiesel (1962).
However, it should be noted that these measured subunits do not
necessarily represent individual afferent neurons. Because second-order
interaction profiles are likely to reflect responses of multiple
afferent neurons, the subunits should be regarded as
functional (Emerson et al. 1987; see also Szulborski and Palmer 1990) rather than cellular units.
The above-mentioned studies also suggest that subunits that feed into a
complex cell are relatively homogeneous in the sense that they share
some of the same optimal stimulus parameters, such as orientation,
spatial frequency, direction selectivity, and binocular disparity.
However, because complex cells respond to bright and dark stimuli at
the same location within the RF, ON and OFF
subregions of the subunits need to overlap extensively to make up a
complex cell RF. In other words, the spatial relationship of subunit
RFs must conform to one of the following conditions: the RFs are
located at different positions as Hubel and Wiesel (1962) originally suggested; they are at the same position but have different spatial phases; or they are at different positions and
have different spatial phases.
Various models of complex cells have been proposed using one of the
spatial relationships among subunit RFs (e.g., Cavanagh 1984; Glezer et al. 1980, 1982; Pollen and Ronner 1983; Pollen et al. 1989; Spitzer and Hochstein 1985, 1988).
For instance, Pollen et al. (Pollen and Ronner 1983; Pollen et al. 1989) proposed that a complex cell consists of four subunits: a
pair of even- and odd-symmetric subunits (a quadrature pair) and their
sign-inverted versions. This is an attractive model from a
computational point of view because these four subunits are sufficient
to represent a local Fourier spectrum of the stimulus (Pollen and Ronner 1982; Pollen et al. 1989) and are
building blocks of what is known as an energy model (Adelson and Bergen 1985; Watson and Ahumada 1985).
A linear filter followed by a squaring device and then an integrator is
called an energy detector (Green and Swets 1966). It
generally is modeled as two linear band-pass filters that are in a
quadrature phase relationship, with the outputs of the linear filters
squared and then summed (Adelson and Bergen 1985; Watson and Ahumada 1985).
The model of Pollen et al. (Pollen and Ronner 1983; Pollen et al. 1989) described in the preceding text is a more physiologically
plausible variation of the energy model in that each linear filter is
subdivided further into two linear filters that are sign-inverted
versions of each other, and their outputs are rectified before being
squared and summed. As a model for binocular complex cells,
Ohzawa et al. (1990) proposed a binocular version of the
energy model that responds to the stimulus energy associated with
binocular disparity. The model provides a good first approximation to
binocular interactions exhibited by complex cells in the cat's striate
cortex (Ohzawa et al. 1990, 1997). In fact, the
second-order interactions exhibited by complex cells described earlier
(Emerson et al. 1987, 1992; Movshon et al. 1978; Ohzawa et al. 1990, 1997; Szulborski and Palmer 1990; see also Baker and Cynader 1986; Gaska et al. 1994; Rybicki et al. 1972)
are all, at least qualitatively, consistent with
an energy model.
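The energy computation described above can be illustrated with a minimal numerical sketch. It assumes 1D Gabor-shaped filters, a sinusoidal grating stimulus at the filters' preferred frequency, and arbitrary parameter values; the function names (`half_square`, `gabor`, `energy_response`) are hypothetical and are not taken from any of the cited models. The sketch shows two points: a half-squared filter plus its sign-inverted, half-squared partner is equivalent to full squaring, and the summed output of a quadrature pair is nearly independent of the spatial phase of the stimulus.

```python
import math

def half_square(v):
    """Half-wave rectify, then square."""
    return max(v, 0.0) ** 2

def gabor(x, sigma, freq, phase):
    """1D Gabor profile (Gaussian envelope times a cosine carrier)."""
    return math.exp(-x * x / (2.0 * sigma ** 2)) * math.cos(2.0 * math.pi * freq * x + phase)

def energy_response(stim_phase, sigma=1.0, freq=1.0, n=801, half_width=4.0):
    """Response of a four-subunit energy unit to a grating of a given phase."""
    xs = [-half_width + 2.0 * half_width * i / (n - 1) for i in range(n)]
    dx = xs[1] - xs[0]
    stim = [math.cos(2.0 * math.pi * freq * x + stim_phase) for x in xs]
    total = 0.0
    for rf_phase in (0.0, -math.pi / 2.0):  # even- and odd-symmetric filters
        linear = sum(gabor(x, sigma, freq, rf_phase) * s for x, s in zip(xs, stim)) * dx
        # A filter and its sign-inverted version, each half-squared
        total += half_square(linear) + half_square(-linear)
    return total

# Responses to gratings at eight different spatial phases
responses = [energy_response(2.0 * math.pi * k / 8.0) for k in range(8)]
print(responses)
```

The residual variation across phases in this sketch comes only from the finite filter support; with an ideal quadrature pair the response would be exactly phase invariant.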
Although the energy model has been increasingly popular for complex
cells (e.g., Emerson et al. 1992; Fleet et al. 1996; Ohzawa et al. 1990, 1997; Pollen et al. 1989; Qian 1994; Qian and Zhu 1997),
quantitative evaluations of the model have been limited. In particular, binocular properties of subunits that underlie complex
cells have not been examined to determine if they are consistent with
subunit components of an energy model.
Here, the analysis of nonlinear binocular interactions is extended to
complex cells to learn about how they process binocular information.
Binocular interaction RFs and monocular RFs of complex cells in the
cat's striate cortex are measured with spatiotemporal white noise
generated according to binary m-sequences (Sutter 1992).
Through the examination of binocular interaction RFs, the responses of
complex cells to binocular disparity are described. Functional subunits
that underlie individual complex cells are estimated by applying
singular value decomposition (SVD) on the binocular interaction RF of
each cell. To evaluate an energy model for complex cells, the number of
subunits as well as the RF properties of subunits are compared with
those predicted by the energy model. Phase and position disparities
between left and right eye RFs of subunits also are estimated to
address the issue of how complex cells encode binocular disparity.
Results of these analyses provide important clues for understanding the
neural computations performed by binocular complex cells and the
participation of subunits in the computations. Possible functional
roles of complex cells in processing binocular information are considered.
METHODS
Details of surgical and histological procedures, apparatus, and
recording methods are identical to those described in the preceding
papers (Anzai et al. 1999a,b). Binocular interaction RFs
and monocular RFs of complex cells are measured using dichoptic one-dimensional (1D) binary m-sequence noise (for details of the stimulus configuration, see Anzai et al. 1999a). The RFs
are constructed as described in Anzai et al. (1999b).
The binocular interaction RFs are decomposed into those of functional
subunits that underlie complex cells using the singular value
decomposition (SVD). Phase and position disparities between left and
right eye RFs of subunits are estimated to determine the relative
contribution of the two disparities to the encoding of binocular
disparity through complex cells.
SVD of binocular interaction RFs
To estimate functional subunits that underlie binocular complex
cells, an SVD is performed for each cell on its binocular interaction
RF at the optimal correlation delay (the delay at which the sum of
squared values of all data points in the RF is maximum). The SVD is a
standard technique of linear algebra (e.g., Press et al.
1992) that can be used to obtain a description of data in terms
of orthogonal (quadrature) components, i.e., components that are
mutually uncorrelated. The original data are described as a linear sum
of the SVD components, which are ordered such that each component
accounts for a progressively smaller fraction of the total variance in
the data. Mathematically, the SVD is equivalent to principal component analysis.
Performed on the binocular interaction RF (B) of a complex cell, the SVD breaks the RF into a number of binocular interaction RFs, each of which represents an SVD component (see Fig. 3 for an example of SVD). The SVD components are considered subunits of the complex cell. However, it should be noted that the SVD components do not necessarily represent actual afferent neurons underlying the cell. Rather they are likely to represent a combination of multiple afferent neurons. Therefore they should be regarded as functional subunits.
The binocular interaction RF of each SVD component is described by the
product of left (L) and right (R) eye RFs, weighted by a constant (W).
In a matrix notation, the SVD is formulated as
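$$B = L\,W\,R^{T} \tag{1}$$

$$B = \sum_{i} w_i\,\mathbf{l}_i\,\mathbf{r}_i^{T} \tag{2}$$

where $\mathbf{l}_i$ and $\mathbf{r}_i$ are the $i$th columns of $L$ and $R$ (the left and right eye RFs of the $i$th SVD component) and $w_i$ is the $i$th diagonal element of $W$.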
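As an illustration of this decomposition, the following is a minimal sketch using NumPy's `numpy.linalg.svd`. The 16 x 16 matrix `B` is synthetic (two separable terms plus a small amount of Gaussian noise) and only stands in for a measured binocular interaction RF; the helper names `svd_subunits` and `reconstruct` are hypothetical. Assigning rows to left eye positions and columns to right eye positions is also an assumption of the sketch.

```python
import numpy as np

def svd_subunits(B):
    """Decompose B into left eye RFs (columns of L), weights w, and right eye RFs (columns of R)."""
    U, w, Vt = np.linalg.svd(B, full_matrices=False)
    return U, w, Vt.T

def reconstruct(L, w, R, k):
    """Sum of the first k components, each a weighted product of left and right eye RFs."""
    return (L[:, :k] * w[:k]) @ R[:, :k].T

# Synthetic stand-in for a measured binocular interaction RF
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 16)
envelope = np.exp(-x**2 / (2.0 * 0.4**2))
even = envelope * np.cos(2.0 * np.pi * x)
odd = envelope * np.sin(2.0 * np.pi * x)
B = np.outer(even, even) + np.outer(odd, odd) + 0.01 * rng.standard_normal((16, 16))

L, w, R = svd_subunits(B)          # weights come back in decreasing order
B_two = reconstruct(L, w, R, 2)    # approximation from the first two components
print(w[:4])
```

In this synthetic case the first two weights dominate and the remaining ones reflect only the added noise, which is the pattern an energy model would predict.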
To estimate a noise level for each SVD component, the SVD also is
conducted on binocular interaction RFs that contain only noise. For
each cell, binocular interaction RFs are obtained at noncausal
correlation delays (the delays for which the response precedes the
stimulus) ranging from 45 to
240 ms at a 5-ms interval, and an SVD
is performed on each RF. Then weights of the noise SVD components are
averaged separately for each component order. The mean weights (e.g.,
Fig. 3B) represent estimated noise levels for the SVD
components obtained from the RF at the optimal correlation delay.
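The noise-level estimate can be sketched as follows. The noise-only RFs here are random matrices standing in for RFs computed at noncausal correlation delays (40 of them, matching delays of 45 to 240 ms in 5-ms steps), and the function name `mean_noise_weights` is hypothetical. The sketch shows only the averaging of singular values by component order.

```python
import numpy as np

def mean_noise_weights(noise_rfs):
    """Average the singular values across noise RFs, separately for each component order."""
    weights = np.array([np.linalg.svd(rf, compute_uv=False) for rf in noise_rfs])
    return weights.mean(axis=0)

rng = np.random.default_rng(1)
# 40 hypothetical noise-only RFs (16 x 16 each), one per noncausal delay
noise_rfs = [rng.standard_normal((16, 16)) for _ in range(40)]
noise_level = mean_noise_weights(noise_rfs)
print(noise_level)
```

Because each SVD returns its weights in decreasing order, the averaged noise weights also decrease with component order, so the comparison with the signal weights is made order by order.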
Estimating interocular RF disparities of SVD components
Monocular RFs of the first and second SVD components are fitted
with a 1D Gabor function (see Anzai et al. 1999a for
details of the fitting procedure). Then RF position and phase
disparities of the first SVD component are computed for each cell by
applying a reference-cell method (Anzai et al. 1999a) to
the first two SVD components of the same cell rather than two different
cells. An RF phase disparity of the first SVD component is obtained as the difference in RF phase between the left and right eye RFs of the
component. An RF position disparity of the first SVD component is
obtained as the difference in RF position between the two eyes, while
left and right eye RFs of the second SVD component (a reference) are
assumed to be at retinal correspondence (i.e., 0 RF position disparity). Therefore RF position disparities measured here are relative position disparities and are subjected to a statistical analysis to estimate true position disparities. See Anzai et al. (1999a)
for formal definitions of the RF disparities and a
statistical analysis of the RF position disparity.
RESULTS
Monocular RFs and binocular interaction RFs have been obtained for 64 binocular complex cells in 15 adult cats. Of these, 48 cells exhibited significant binocular interactions and are analyzed here. The remaining 16 cells showed very weak, if any, binocular interactions due to low signal-to-noise ratios (5 cells were nevertheless strongly responsive to stimulation of either eye; the other 11 cells were either ocularly unbalanced, responding almost exclusively to only one eye, or were not responsive to stimulation of either eye). These cells have been excluded from the analysis.
Examples of monocular RFs and binocular interaction RFs
Figure 1 shows examples of monocular RFs (L and R) and binocular interaction RFs (B) for six complex cells. Most complex cells respond to bright and dark stimuli at the same location in space. Therefore their monocular RFs, the responses to bright stimuli minus the responses to dark stimuli, are in general relatively flat (e.g., Fig. 1, C and F), although there are some cells that exhibit significant residual responses in monocular RFs (e.g., Fig. 1, A and D).
The binocular interaction RF is a profile of responses to stimuli of
matched polarity (bright-bright and dark-dark) in the two eyes minus
the responses to stimuli of mismatched polarity (bright-dark and
dark-bright) in the two eyes. It represents the responses attributable
to nonlinear binocular interaction. Unlike simple cells, complex cells
exhibit binocular interaction RFs that are not left-right separable, as
shown in Fig. 1. Instead their binocular interaction RFs consist of
subregions that are elongated along the frontoparallel axis
XF. Therefore responses of complex
cells are largely independent of monocular stimulus position or phase
as long as the binocular disparity of the stimulus is kept constant. In
other words, complex cells are truly tuned to binocular disparity.
Similar observations have been made by von der Heydt et al.
(1978) and Ohzawa et al. (1990, 1997). The profile along the binocular disparity axis D varies from cell to cell,
suggesting that each cell has a different tuning function for binocular
disparity. These binocular interaction RFs are qualitatively consistent
with those predicted by a binocular-disparity energy model
(Ohzawa et al. 1990, 1997). In the next section, the
binocular interaction RFs are examined to see if they agree
quantitatively with the predictions of the energy model.
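The definition of the binocular interaction RF given above can be written out as a short sketch. The response maps are generated by a hypothetical model unit that squares the sum of its left and right eye inputs; this unit, the RF shapes, and the use of a plain sum (rather than an average) over polarity combinations are assumptions made only for illustration.

```python
import numpy as np

def binocular_interaction_rf(r_bb, r_dd, r_bd, r_db):
    """Matched-polarity responses minus mismatched-polarity responses."""
    return (r_bb + r_dd) - (r_bd + r_db)

# Hypothetical monocular RF profiles of a single squaring unit
x = np.linspace(-1.0, 1.0, 16)
f_left = np.exp(-x**2 / 0.18) * np.cos(2.0 * np.pi * x)
f_right = np.exp(-x**2 / 0.18) * np.sin(2.0 * np.pi * x)

def unit_response(pol_left, pol_right):
    """Response map: rows index left eye bar position, columns right eye bar position."""
    drive = pol_left * f_left[:, None] + pol_right * f_right[None, :]
    return drive ** 2

B_toy = binocular_interaction_rf(
    unit_response(+1, +1), unit_response(-1, -1),   # bright-bright, dark-dark
    unit_response(+1, -1), unit_response(-1, +1),   # bright-dark, dark-bright
)
```

For this single unit the monocular terms cancel and only the cross term survives, so `B_toy` is proportional to the product of the left and right eye RFs, i.e., it is left-right separable. An inseparable profile of the kind seen in complex cells requires a sum of several such terms.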
Singular value decomposition (SVD) of binocular interaction RFs
Binocular complex cells have been modeled as detectors of the
stimulus energy associated with binocular disparity (Fleet et al. 1996; Ohzawa et al. 1990, 1997; Qian 1994). Figure 2 shows the
structure of the model. It consists of two major units, each of which
is enclosed by a dashed line in the figure. These units are said to be
in quadrature, i.e., the spatial phases of monocular RFs for one unit
and those for the other are 90° apart. Thus the model responds to a
stimulus independent of its spatial phase. Each quadrature unit
consists of two simple cell-like subunits, each of which is modeled as
a linear binocular filter followed by a half-squaring nonlinearity (the
structure described for simple cells in the previous paper,
Anzai et al. 1999b). These subunits have monocular RF
profiles that are sign-inverted versions of each other so that the
model responds equally to both bright and dark bars at the same
location of space.
Because outputs of the subunits are combined linearly in this model, the binocular interaction RF of the model is a sum of binocular interaction RFs for individual subunits. Therefore if binocular complex cells are consistent with the model, then one should be able to describe their binocular interaction RFs as a sum of binocular interaction RFs for subunits that are in quadrature. To test this prediction, SVD has been performed on the binocular interaction RF of each complex cell to obtain a description of the RF in terms of orthogonal (quadrature) components (see METHODS for details about the SVD). The SVD components are ordered such that each component accounts for a progressively smaller fraction of the total variance in the data. Because an energy model consists of a pair of units that are in quadrature, the model predicts that the number of SVD components necessary to represent its binocular interaction RF is two. Note that the subunits comprising each quadrature unit are not independent, but are sign-inverted versions of each other. Therefore these subunits would be represented by a single SVD component.
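The prediction that an energy model yields exactly two SVD components can be checked with a small numerical sketch. It assumes that each subunit contributes a term proportional to the product of its left and right eye RFs, uses Gabor-shaped RFs with arbitrary, hypothetical parameter values, and applies a numerical threshold of 1e-10 relative to the largest weight to decide which components are nonzero.

```python
import numpy as np

def gabor(x, sigma, freq, phase):
    """1D Gabor profile (Gaussian envelope times a cosine carrier)."""
    return np.exp(-x**2 / (2.0 * sigma**2)) * np.cos(2.0 * np.pi * freq * x + phase)

x = np.linspace(-2.0, 2.0, 16)
sigma, freq = 0.8, 0.5      # hypothetical envelope width and spatial frequency
dphi = np.pi / 4.0          # hypothetical interocular phase disparity

B_model = np.zeros((16, 16))
for phase in (0.0, np.pi / 2.0):   # two units in quadrature
    for sign in (+1.0, -1.0):      # sign-inverted subunits within each unit
        left = sign * gabor(x, sigma, freq, phase)
        right = sign * gabor(x, sigma, freq, phase + dphi)
        B_model += np.outer(left, right)

weights = np.linalg.svd(B_model, compute_uv=False)
n_components = int(np.sum(weights > 1e-10 * weights[0]))
print(n_components)
```

The two sign-inverted subunits of each unit contribute identical terms, so the four subunits collapse to two linearly independent components, as stated in the text.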
Figure 3 shows an example of the SVD
analysis. The monocular RFs and binocular interaction RF of the raw
data are shown in Fig. 3A. The SVD has been conducted on the
binocular interaction RF to obtain 16 mutually uncorrelated components.
Weights of the components are shown in Fig. 3B along
with mean weights of components obtained from the SVD performed on
estimated noise in the binocular interaction RF. Only the first
two components have weights that are significantly above those of the
noise SVD components. These two components account for >80% of the
variance in the raw data. The percentage goes up to 95% if the
variance accounted for by noise is subtracted. Binocular interaction
RFs and monocular RFs of the first six components are shown in Fig. 3,
C-H. The first two components exhibit monocular and
binocular interaction RFs that are strikingly similar to those of
simple cells (see Anzai et al. 1999b). Although the
binocular interaction RF of an SVD component is, by definition
(Eq. 1), the product of its left and right eye RFs, the
actual shape of the RFs is determined by the data. These results are
consistent with the prediction of an energy model.
However, slight deviations from the prediction also have been observed for some cells. In Fig. 4, another example of the SVD analysis is shown. As in the previous example, there are two major components (the 1st and 2nd) that account for ~85% of the variance in the raw data. Their RFs are very much like those of simple cells. In addition to these two components, this cell also exhibits two weak but significant components (the 3rd and 4th) whose RFs do not resemble those of simple cells. Altogether, the first four components account for >95% of the total variance in the data. The existence of the third and fourth components suggests that relationships among subunits of complex cells may not be as constrained as those of an energy model. An interpretation of these extra components is considered later in the DISCUSSION.
To examine if individual cells are consistent with an energy model, the
minimum number of SVD components necessary to represent the binocular
interaction RF is estimated for each cell. The number is determined by
dividing a plot of component weights into two portions according to the
rate of change in component weight (Scree test) (Gorsuch
1983) and counting the number of components in the first
portion. For example, a plot of component weights shown in Fig.
3B consists of two parts: a quickly decreasing part (the first 2 components) and a more gradually and linearly decreasing part
(the third and the rest of the components). The latter portion is
virtually indistinguishable from the noise level and is not necessary
to represent the binocular interaction RF. Therefore the number of SVD
components for this cell is considered to be two. Likewise, the minimum
number of SVD components for the cell shown in Fig. 4 is determined to
be four. For most cells examined, the transition between the two
portions is abrupt and quite obvious. However, some cells exhibit
transitions that are gradual, and it is not immediately clear how many
components these cells should be considered to have. In such cases, the
latter portion is determined first as a gradually and linearly
decreasing part, and the remaining part then is assigned to the first portion.
In Fig. 5A, a histogram of the minimum number of SVD components for the population of complex cells examined is shown. The majority (56%) of the cells exhibit two components. Therefore these cells are consistent with an energy model. Almost all of the remaining cells exhibit either three or four SVD components, indicating that binocular complex cells are composed of only a small number of functional subunits that are linearly independent.
Although the existence of the extra components is a clear deviation from the prediction of an energy model, an energy model still provides a good approximation to the data on average. Figure 5B shows a summary of how much variance in the raw data each SVD component accounts for. Each data point is a mean value for the population of complex cells examined. The variance accounted for by noise was subtracted from the data for each cell before the mean was computed. Open circles represent percentages of the total variance accounted for by each SVD component, and open triangles indicate cumulative percentages. Error bars represent ±SD. On average, the first and second SVD components account for ~50 and 30% of the variance in the data, respectively. Each of the remaining components contributes an average of only ~6% or less of the total variance. Therefore in general, binocular interaction RFs of complex cells can be well approximated by the sum of binocular interaction RFs of two units that are in quadrature, i.e., the binocular interaction RF of an energy model.
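One way to compute the percentage of variance attributed to each SVD component is sketched below. It assumes that the variance accounted for by a component is proportional to its squared weight, which holds for the sum of squared values of a matrix under SVD; the weights in the example are hypothetical, and the subtraction of noise variance described in the text is not shown.

```python
def variance_accounted(weights):
    """Return per-component and cumulative percentages of the summed squared weights."""
    total = sum(w * w for w in weights)
    fractions = [100.0 * w * w / total for w in weights]
    cumulative = []
    running = 0.0
    for f in fractions:
        running += f
        cumulative.append(running)
    return fractions, cumulative

# Hypothetical component weights, in decreasing order
fractions, cumulative = variance_accounted([5.0, 4.0, 1.5, 1.0, 0.5])
print(fractions)
print(cumulative)
```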
Comparisons between the first and second SVD components
In addition to the number of underlying functional subunits, an
energy model also makes a prediction about the RF properties of the
subunits. A binocular-disparity energy model assumes that the RF
properties of subunits are the same except for their spatial phases,
which are constrained to be in quadrature. Therefore if binocular
complex cells are consistent with the energy model, then their SVD
components should have all RF parameters but spatial phase in common.
To examine if this prediction holds, left and right eye RFs of the
first and second SVD components were fitted with a 1D Gabor function,
and the center coordinate of the Gaussian envelope, envelope width,
spatial frequency, and binocular phase disparity of the RFs were
extracted (see Anzai et al. 1999a for a definition of
the 1D Gabor function). Figure 6 shows
scatter plots of the RF parameters for the second SVD components
against those for the first SVD components. A slope of unity is
indicated by the solid line. Most of the data points are scattered
around the solid line, indicating that RF properties of the first and second SVD components are very similar. Therefore the first two SVD
components of binocular complex cells are consistent with subunits of
an energy model in regard to RF properties.
Interocular RF disparities of SVD components
In the first paper of this series (Anzai et al.
1999a), it is shown that the range of position disparities
between the left and right eye RFs of simple cells is relatively small
compared with that of RF phase disparities. This suggests that RF phase disparity plays a major role in encoding binocular disparity for simple
cells. However, because RF phase disparities of cells tuned to high
spatial frequencies are necessarily small in degree visual angle (deg
VA), RF position disparity may still play an important role in encoding
binocular disparity for those cells.
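The dependence of phase disparity on spatial frequency when expressed in visual angle follows from a simple conversion: a phase disparity is a fraction of one cycle, and one cycle spans the reciprocal of the spatial frequency. The sketch below uses a hypothetical function name and example values.

```python
def phase_to_visual_angle(phase_deg, spatial_freq_cpd):
    """Convert a phase disparity (deg PA) to visual angle (deg VA).

    spatial_freq_cpd is the RF spatial frequency in cycles per degree.
    """
    return (phase_deg / 360.0) / spatial_freq_cpd

# The same quarter-cycle phase disparity at a low and a high spatial frequency
low_sf = phase_to_visual_angle(90.0, 0.5)
high_sf = phase_to_visual_angle(90.0, 2.0)
print(low_sf, high_sf)
```

A fourfold increase in spatial frequency reduces the disparity encoded by a given phase difference by a factor of four, which is why position disparity may still matter for cells tuned to high spatial frequencies.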
Because monocular RFs of complex cells do not exhibit
structures that allow one to measure the binocular disparities of RFs, the neural mechanism through which complex cells encode binocular disparity is not well understood. Here this issue is addressed by
examining RF phase and position disparities of SVD components by
applying a reference-cell method (see METHODS) (see also
Anzai et al. 1999a for details of estimating RF phase
and position disparities). The relative contributions of RF phase and
position disparities to the encoding of binocular disparity are
examined in relation to various RF parameters.
DISPARITY HISTOGRAMS.
Figure 7 shows histograms of RF phase and
position disparities for the first SVD components. In Fig.
7A, a histogram of RF phase disparity in degree phase angle
(deg PA) is shown. The distribution is centered around zero, indicating
that components with similar RF profiles in the two eyes are most
numerous. However, there are also many components that exhibit large
disparities, suggesting that their RF profiles are quite dissimilar
between the two eyes. The majority of the components have RF phase
disparities within ±90°. Therefore the first SVD components of most
complex cells satisfy the quarter cycle limit suggested by Marr
and Poggio (1979) for unambiguously encoding binocular
disparity through band-pass filters.
RELATIONSHIP BETWEEN POSITION AND PHASE DISPARITIES. Because the overall preference of cells for binocular disparity is determined by the sum of RF phase and position disparities, it would be interesting to know if there is a relationship between the two types of RF disparities. For example, they may always add up or they may always partially cancel each other. Figure 8 shows a scatter plot of position disparity against phase disparity. Data points are scattered widely along the phase disparity axis. Although a linear regression analysis indicates that there is a weak but significant correlation between the two disparities (P = 0.01), the correlation coefficient is only 0.36, and just 13% of the variance in the data is accounted for by the regression. Therefore there is a tendency for phase and position disparities to add up, but the tendency is weak.
RELATIONSHIP BETWEEN DISPARITY AND RF ORIENTATION.
As described in Anzai et al. (1999a) and in previous
studies (DeAngelis et al. 1991, 1995; Ohzawa et al. 1996), profiles of left and right eye RFs are relatively
well matched for simple cells tuned to horizontal orientations, whereas
those for cells tuned to vertical orientations are predominantly
dissimilar. However, this is not the case for the first SVD components
of binocular complex cells. In Fig.
9A, magnitudes of phase
disparities in deg PA are plotted as a function of RF orientation (the
cell's optimal orientation for gratings). Orientations of 0 and 90°
correspond to horizontal and vertical, respectively. Phase disparities
of SVD components for complex cells tuned to horizontal orientations (
20°) are widely scattered. Because simple cells tuned to
horizontal orientations do not exhibit phase disparities >90 deg PA
(Anzai et al. 1999a), it seems unlikely that they can
account for the large phase disparities of the SVD components at
horizontal orientations. Therefore it is possible that some of the
complex cells that are tuned to horizontal orientations receive
nonsimple cell input and are not consistent with the hierarchical model
of Hubel and Wiesel (1962). Except for the large phase
disparities at horizontal orientations, the data are generally
comparable with those for binocular simple cells reported in
Anzai et al. (1999a).
RELATIONSHIP BETWEEN DISPARITY AND RF SPATIAL FREQUENCY. Figure 10 shows how position and phase disparities of the first SVD components depend on RF spatial frequency. In Fig. 10A, magnitudes of phase disparities in deg PA are plotted as a function of RF spatial frequency. No obvious correlation is found in the data. Therefore whether spatial profiles of left and right eye RFs are similar or dissimilar does not depend on the RF spatial frequency of the components.
DISCUSSION
In this study, white noise analysis has been applied to measurements of binocular interaction RFs and monocular RFs for complex cells in the cat's striate cortex. Binocular interaction RFs of complex cells are found to be elongated along the frontoparallel axis at a particular binocular disparity. In other words, the binocular interaction exhibited by complex cells is independent of monocular stimulus position (within limits) or phase as long as stimulus binocular disparity is kept constant. In this sense, complex cells are truly tuned to binocular disparity. The binocular interaction RF is shown to be well described by a sum of binocular interaction RFs of underlying functional subunits that exhibit simple cell-like RFs and preference for different monocular phases but the same binocular disparity. A majority of the complex cells examined are found to be consistent with an energy model, with respect to the number of subunits, as well as to the relationships between RF properties of subunits. Subunits also exhibit interocular RF disparities that are largely consistent with a phase mechanism for encoding binocular disparity. These results indicate that binocular interactions of complex cells are derived from simple cell-like subunits, which exhibit multiplicative binocular interactions. Therefore binocular interactions of complex cells are also multiplicative. This suggests that complex cells compute something analogous to the interocular cross-correlation of images within a local region of space. The result of the computation can be used for solving the stereo correspondence problem.
Binocular interaction RF and binocular disparity tuning
As described in the preceding paper (Anzai et al.
1999b), binocular interaction RFs of simple cells are
left-right separable; this indicates that the binocular interaction
depends on monocular phases of the stimulus. On the other hand,
binocular interaction RFs of complex cells consist of subregions that
are elongated along the frontoparallel axis, and they are left-right
inseparable. The inseparable RF presumably is constructed by combining
separable RFs of simple cell-like subunits that exhibit preferences for different monocular phases but for the same binocular disparity. This
eliminates the monocular phase dependency at the complex cell level.
Therefore complex cells respond to a stimulus regardless of its
monocular phase as long as the binocular disparity of the stimulus is
kept constant. Because of this, the binocular interaction RF can be
reduced to a one-dimensional function of binocular disparity by
integrating the RF along the frontoparallel axis (Ohzawa et al. 1997). The resulting function represents the binocular
disparity tuning of a cell.
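As an illustration of this reduction, the following minimal Python sketch builds a binocular interaction RF from two hypothetical quadrature subunits and integrates it along the frontoparallel axis. The Gabor form of the monocular RFs, the parameter values (`sf`, `sigma`), and the function names are assumptions made for illustration only and are not taken from the measurements reported here.

```python
import numpy as np

def gabor(x, sf, phase, sigma):
    """Hypothetical 1D monocular RF profile (Gabor function)."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * sf * x + phase)

def disparity_tuning(B, dx):
    """Integrate B[i, j] (rows: left-eye position, columns: right-eye position)
    along lines of constant disparity x_R - x_L, i.e., along the diagonals."""
    n = B.shape[0]
    offsets = np.arange(-(n - 1), n)
    tuning = np.array([np.trace(B, offset=k) for k in offsets]) * dx
    return offsets * dx, tuning

x = np.linspace(-4.0, 4.0, 161)      # position in deg (assumed range)
dx = x[1] - x[0]
sf, sigma = 0.5, 1.0                 # assumed spatial frequency and envelope width

# Sum of left-right separable RFs of two subunits in quadrature,
# each with zero interocular phase disparity.
B = np.zeros((x.size, x.size))
for phase in (0.0, np.pi / 2):
    left = gabor(x, sf, phase, sigma)
    right = gabor(x, sf, phase, sigma)
    B += np.outer(left, right)

disparity, tuning = disparity_tuning(B, dx)
peak = disparity[np.argmax(tuning)]  # preferred disparity of the model cell
```

In this sketch the two eyes have identical RF profiles, so the resulting tuning function is symmetric about zero disparity.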
Ferster (1981) modeled the binocular disparity tuning of
simple cells in areas 17 and 18 of cats as a cross-correlation between left and right eye RFs. It is shown in the preceding paper
(Anzai et al. 1999b) that this model is indeed
appropriate for the binocular disparity tuning of simple cells. He also
applied the same model to complex cells (Ferster 1981).
He measured activity profiles of complex cells using a pair of bars;
one bar was presented to one eye as a conditioning stimulus to raise
the overall response level and the other was swept across the RF of the
other eye to obtain an activity profile. By doing this for each eye, he
obtained left and right eye activity profiles and computed an
interocular cross-correlation of the profiles to predict the binocular
disparity tuning. Although it is not clear if the activity profiles are analogous to the interocular two-bar interaction profiles described in
this study, if they were, then they would correspond to monocular RFs
of underlying functional subunits of a complex cell. Therefore the
binocular disparity tuning obtained as a cross-correlation between the
left and right eye activity profiles is of an underlying functional
subunit (which exhibits a separable binocular interaction RF) rather
than of the complex cell itself (which exhibits an inseparable
binocular interaction RF). To obtain binocular disparity tuning for a
complex cell, one needs to sum interocular cross-correlations of
monocular RFs for all subunits. This may explain why the model worked
better for simple cells than for complex cells (Ferster 1981). Nonetheless, because subunits of a complex cell are
expected to have similar binocular disparity tuning, the model still
should provide a good approximation to the tuning of the cell.
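The relationship between the tuning of a single subunit and that of the whole cell can be sketched numerically. The example below assumes hypothetical Gabor-shaped monocular RFs and a common phase disparity of 90 deg for both subunits; all names and parameter values are illustrative and do not come from the data.

```python
import numpy as np

def gabor(x, sf, phase, sigma):
    """Hypothetical 1D monocular RF profile (Gabor function)."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * sf * x + phase)

x = np.linspace(-4.0, 4.0, 161)               # position in deg (assumed)
dx = x[1] - x[0]
sf, sigma, phase_disparity = 0.5, 1.0, np.pi / 2

# Disparity (x_R - x_L) associated with each entry of a "full" correlation.
lags = np.arange(-(x.size - 1), x.size) * dx

subunit_tuning = []
for left_phase in (0.0, np.pi / 2):           # quadrature pair of subunits
    left = gabor(x, sf, left_phase, sigma)
    right = gabor(x, sf, left_phase + phase_disparity, sigma)
    # Interocular cross-correlation: the entry at lag k holds
    # sum_n right[n + k] * left[n].
    subunit_tuning.append(np.correlate(right, left, mode="full") * dx)

# Tuning of the model complex cell: sum over subunits.
cell_tuning = np.sum(subunit_tuning, axis=0)

peak_subunit = lags[np.argmax(subunit_tuning[0])]
peak_cell = lags[np.argmax(cell_tuning)]
```

Under these assumptions the single-subunit cross-correlation and the summed tuning peak at nearly the same disparity, which is why the single-profile prediction remains a reasonable approximation.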
In any case, the results of Ferster (1981) and of this
study suggest that binocular interactions exhibited by complex cells are multiplicative, a direct consequence of inheriting multiplicative binocular interactions from underlying subunits. This has an important implication as to what the functional roles of complex cells might be,
which is discussed in the following text. It should be noted that the
binocular summation at the input stage of subunits is still linear
(Ohzawa and Freeman 1986), and this is not incompatible with a multiplicative interaction, which is observed at the output stage of complex cells.
Subregions of the binocular interaction RF
The binocular interaction RFs of complex cells consist of
subregions that are elongated along the frontoparallel axis at
different binocular disparities. The subregions of positive values
(solid contours in Fig. 1) can be attributed (but not necessarily
exclusively) to responses to interocular polarity-matched stimuli
(bright or dark bars presented to the 2 eyes), and the subregions of
negative values (dashed contours in Fig. 1) to responses to interocular polarity-mismatched stimuli (a bright bar presented to 1 eye and a dark
bar to the other eye). On the basis of similar observations, Ohzawa et al. (1990) suggested that complex cells
respond to different binocular disparities depending on the interocular
polarity combination of the stimulus. However, because the binocular
interaction RF of a complex cell can be described as a sum of
cross-correlations between left and right eye RFs for underlying
subunits, an alternative interpretation is possible.
Consider a stimulus, say a sinusoidal grating (or a Fourier component of a more complicated stimulus), presented at zero-disparity. Figure 11 shows 1D profiles of the luminance distribution relative to the mean luminance level for the left (L) and right (R) eye images of such a stimulus. The contour plot in the figure is obtained by multiplying the left and right eye stimulus profiles. This plot represents the spatial structure of the interocular cross-correlation for the stimulus, in the sense that integrating it along the frontoparallel axis XF yields the interocular cross-correlation function of the stimulus. The solid and dashed contours represent positive (stimulus polarities are matched between the 2 eyes) and negative (stimulus polarities are mismatched between the 2 eyes) values, respectively. The solid horizontal lines are constant disparity lines that go through solid-contour regions. In other words, along the solid lines, stimulus polarities between the two eyes are always matched. On the other hand, stimulus polarities between the two eyes are always opposite (i.e., only dashed-contour regions are found) along the dashed horizontal lines. The dashed lines indicate constant disparities that are shifted from the disparities indicated by the solid lines by an amount that is equivalent to 180 deg PA of the sinusoidal grating. Therefore an extended periodic (or band-pass filtered) stimulus has an interocular cross-correlation structure that consists of two types (polarity-matched and polarity-mismatched) of subregions at different binocular disparities, despite the fact that the stimulus itself is defined at a single binocular disparity (0-disparity for the example shown in Fig. 11). 
The stimulus illustrated here would be effective for a complex cell that exhibits a binocular interaction RF that consists of polarity-matched response subregions at zero disparity and polarity-mismatched response subregions at disparities equivalent to ±180 deg phase of the sinusoidal grating. Because binocular interaction RFs of complex cells are elongated along the frontoparallel axis, the stimulus would be effective regardless of its monocular phases as long as the binocular disparity of the stimulus is kept constant. Therefore it seems that subregions of the binocular interaction RF represent a structure suitable for detecting a stimulus as being at a particular binocular disparity rather than a mechanism designed for detecting different disparities depending on the interocular polarity combination of the stimulus.
As mentioned earlier, integrating the contour plot in Fig. 11 along the frontoparallel axis XF yields the interocular cross-correlation function of the stimulus. Note that this is also how the binocular disparity tuning of a cell is obtained from the binocular interaction RF. In this sense, the binocular disparity tuning function of a cell can be interpreted as a matching template to be compared with, or a filter to be applied to, the interocular cross-correlation function of the stimulus.
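The structure described for Fig. 11 can be written out for a sinusoidal grating at zero disparity. The symbols f (grating spatial frequency), x_L and x_R (positions in the left and right eye images), and d (binocular disparity, with the sign convention d = x_R - x_L) are introduced here for illustration and are not notation from the figure.

```latex
\begin{aligned}
L(x_L)\,R(x_R) &= \cos(2\pi f x_L)\,\cos(2\pi f x_R) \\
 &= \tfrac{1}{2}\left[\cos\!\big(2\pi f (x_R - x_L)\big) + \cos\!\big(2\pi f (x_L + x_R)\big)\right] \\
 &= \tfrac{1}{2}\left[\cos(2\pi f d) + \cos(4\pi f X_F)\right],
 \qquad d = x_R - x_L,\quad X_F = \tfrac{1}{2}(x_L + x_R), \\
\big\langle L\,R \big\rangle_{X_F} &= \tfrac{1}{2}\cos(2\pi f d).
\end{aligned}
```

Here the angle brackets denote an average over one period along X_F. The averaged product is positive at d = 0 and negative at d = ±1/(2f), i.e., at ±180 deg PA, which corresponds to the polarity-matched and polarity-mismatched subregions at different disparities.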
Assumptions involved in the SVD analysis and interpretation of SVD components
In this study, the SVD has been applied to binocular interaction RFs of complex cells to estimate RFs of underlying subunits. The SVD components obtained from the analysis are assumed to represent underlying functional subunits, but not necessarily actual afferent neurons. But what does it mean that subunits are functional? How should they be interpreted? Before answering these questions, assumptions involved in the SVD analysis need to be examined.
The use of the SVD on the binocular interaction RF involves three
assumptions. First of all, binocular interaction RFs of subunits are
assumed to be left-right separable, i.e., the binocular interaction RF
of a subunit is proportional to the product of left and right eye RFs
of the subunit. In the preceding paper (Anzai et al.
1999b), binocular interaction RFs of most simple cells were
shown to be proportional to the product of their left and right eye
RFs. Therefore if one assumes that subunits of complex cells are either
simple cells or LGN cells that are arranged in such a way that they are
functionally equivalent to individual simple cells at the dendrites of
complex cells, then binocular interaction RFs of subunits are expected
to be separable.
Second, the binocular interaction RF of a complex cell is assumed to be
a sum of binocular interaction RFs of subunits. In other words, a
subunit's output has an additive contribution to the complex cell.
Although there is no direct evidence for this assumption, the behavior
of complex cells is consistent with the assumption (e.g.,
Emerson et al. 1992; Gaska et al. 1994; Glezer et al. 1980; Hubel and Wiesel 1962; Movshon et al. 1978; Ohzawa et al. 1990, 1997; Spitzer and Hochstein 1985), and
there is no evidence that suggests otherwise.
Finally, RFs of subunits are assumed to be mutually orthogonal or in a
quadrature phase relationship. This is probably the most critical
assumption for understanding what SVD components represent. Because
nearby simple cells have been shown to be in quadrature (Liu et al. 1992; Pollen and Ronner 1981), it is
possible that subunits that feed into a complex cell are indeed in
quadrature. However, because there is no firm evidence for or against
this assumption, interpretation of SVD components needs to be
considered both for the case where this assumption holds and for the
case where it does not.
If subunits are indeed in quadrature and exhibit preference for the same spatial frequency, then the number of SVD components should be two, as an energy model predicts. However, the converse is not true. If the number of SVD components is two, subunits may or may not be in quadrature. Because subunits are assumed to be linearly summed to make up a complex cell, there is no unique solution for dividing a binocular interaction RF of a complex cell into RFs of subunits unless one makes an assumption regarding relationships among the subunits, such as a quadrature phase constraint. Therefore SVD components do not necessarily represent individual subunits but linear combinations of subunits. In this sense, SVD components are only functionally equivalent to subunits. This is not a major limitation if one would like to know functional structures of complex cells. It is a problem, however, if one wishes to identify the actual physical implementation of the functional structures.
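The nonuniqueness of the decomposition can be illustrated with a small numerical sketch. It assumes hypothetical Gabor-shaped subunit RFs that are identical in the two eyes; the function `subunit_pair_rf` and all parameter values are illustrative only. Two different quadrature pairs (differing in their absolute phase) produce the same binocular interaction RF, and the SVD of that RF returns two components in either case.

```python
import numpy as np

def subunit_pair_rf(x, sf, sigma, base_phase):
    """Binocular interaction RF of a hypothetical quadrature pair whose
    monocular phases are base_phase and base_phase + 90 deg."""
    envelope = np.exp(-x**2 / (2 * sigma**2))
    B = np.zeros((x.size, x.size))
    for phase in (base_phase, base_phase + np.pi / 2):
        rf = envelope * np.cos(2 * np.pi * sf * x + phase)
        B += np.outer(rf, rf)        # left-right separable subunit RF
    return B

x = np.linspace(-4.0, 4.0, 161)      # position in deg (assumed)
B0 = subunit_pair_rf(x, sf=0.5, sigma=1.0, base_phase=0.0)
B1 = subunit_pair_rf(x, sf=0.5, sigma=1.0, base_phase=0.7)

# Number of singular values that are not negligible relative to the largest.
s = np.linalg.svd(B0, compute_uv=False)
n_components = int(np.sum(s > 1e-8 * s[0]))

# Different subunit pairs, same summed RF: the SVD cannot tell them apart.
same_rf = np.allclose(B0, B1)
```

Because both pairs sum to the same RF, the SVD components recovered from it can only be interpreted as functional subunits, in line with the argument above.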
If the number of SVD components is more than two, that indicates the existence of extra subunits that are not in quadrature, provided that the subunits have the same spatial frequency as that of the first two components (i.e., quadrature subunits). However, RFs of the extra SVD components may not represent those of the underlying nonquadrature subunits. This again is because SVD components are linear combinations of real subunits. Therefore the extra SVD components, which are likely to be linear combinations of quadrature as well as nonquadrature subunits, generally do not exhibit RFs that resemble those of simple cells (e.g., Fig. 4, E and F).
If SVD components do not represent individual subunits but functional subunits, then what do comparisons between RF properties of the first and second SVD components (Fig. 6) and those between RF phase and position disparities (Figs. 7-10) show? For cells with only two SVD components, the comparisons describe functional structures of the cells in terms of functional subunits. For cells with more than two SVD components, they describe functional structures that approximate behavior of the cells best.
System structure of complex cells and comparisons with that of an energy model
Complex cells have been modeled as a system that consists of
parallel subunits (e.g., Glezer et al. 1980, 1982; Hubel and Wiesel 1962; Ohzawa and Freeman 1986; Ohzawa et al. 1990, 1997; Pollen and Ronner 1983; Pollen et al. 1989; Spitzer and Hochstein 1985, 1988). Each subunit is
modeled as a linear filter followed by a static nonlinearity and is
assumed to represent a simple cell or a collection of LGN cells.
Variations of the model in the number of subunits and relationships
between the subunits can account for the behavior of various complex cells.
The SVD analysis conducted in this study indicates that two functional subunits that form a quadrature pair are sufficient to account for binocular interaction RFs of a majority of complex cells. In other words, most complex cells are consistent with an energy model. For the model to be more physiologically plausible, each member of a quadrature pair needs to be represented by two subunits that are sign-inverted versions of each other. Therefore at least four subunits are needed to model a complex cell.
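The equivalence between four sign-inverted subunits and a quadrature pair of squaring units can be checked with a short sketch. It assumes that each subunit applies a half-squaring output nonlinearity to the sum of its left and right eye linear filter outputs; the random numbers below merely stand in for those filter outputs and are not data.

```python
import numpy as np

def half_square(u):
    """Assumed static nonlinearity: half-wave rectification followed by squaring."""
    return np.maximum(u, 0.0) ** 2

rng = np.random.default_rng(0)
# Stand-ins for left and right eye linear filter outputs of a quadrature
# pair (rows) over many hypothetical stimuli (columns).
L = rng.normal(size=(2, 1000))
R = rng.normal(size=(2, 1000))
drive = L + R                        # linear binocular summation

# Four subunits: each quadrature member plus its sign-inverted version.
four_subunits = (half_square(drive) + half_square(-drive)).sum(axis=0)

# Two full-squaring units, and their monocular and binocular terms.
two_squarers = (drive ** 2).sum(axis=0)
monocular_terms = (L ** 2 + R ** 2).sum(axis=0)
binocular_term = 2.0 * (L * R).sum(axis=0)   # multiplicative interaction
```

Under this assumption the four-subunit output equals the two-unit energy, and the only term that depends on both eyes is the product of left and right eye filter outputs.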
Some complex cells are shown to deviate slightly from an energy model; more than two SVD components are required for these cells to describe their binocular interaction RFs. This indicates the existence of nonquadrature subunits. There are at least two possibilities for the origin of the nonquadrature subunits. One is that two subunits that make up a member of a quadrature pair may not be exactly sign-inverted versions of each other. This could explain why monocular RFs of some complex cells are not entirely flat. Nonquadrature subunits are likely to be due to a misalignment of the RF position.
Another possibility is that the number of subunits may be more than
four. An energy model predicts that the aspect ratio of the binocular
interaction RF should be one, i.e., the extent of the RF along the
frontoparallel axis should be the same as that along the binocular
disparity axis. However, some complex cells exhibit binocular
interaction RFs that are elongated along the frontoparallel axis more
than is expected from their extent along the binocular disparity axis.
This suggests that more subunits may be added to expand the overall RF.
As a special case, additional subunits may form quadrature pairs
themselves. It has been demonstrated that spatial pooling of multiple
quadrature pairs improves the reliability of disparity tuning
(Qian and Zhu 1997). Therefore the deviations from an
energy model seen in some complex cells actually may be advantageous
from a computational point of view.
In this study, cells that did not exhibit significant binocular
interaction RFs were not analyzed. However, some of these cells still
can be activated by stimulation of either eye alone. Similar cells also
were reported previously (e.g., Ferster 1981; Ohzawa and Freeman 1986). These cells can be explained
by a difference in ocular dominance among subunits (Ohzawa and Freeman 1986). That is, subunits of these cells are presumably
quite monocular so that they do not exhibit a significant binocular
interaction. However, some subunits are left eye dominant, whereas
others are right eye dominant. Therefore such complex cells still
respond to stimulation of either eye. There is also a possibility that subunits are binocular, but their preferred binocular disparities are
uniformly distributed (Ohzawa and Freeman 1986). In any
case, these cells cannot encode binocular disparity, and they are
likely to play little or no direct role in the processing of binocular disparity information.
Finally, it should be pointed out that the model examined in this
study is a feed forward model and is by no means complete. Complex
cells exhibit various nonlinear properties, including contrast gain
control (e.g., Ohzawa et al. 1982, 1985) and end- and
side-inhibition (e.g., Blakemore and Tobin 1972; DeAngelis et al. 1994; DeValois et al. 1985; Hubel and Wiesel 1968; Kato et al. 1978; Maffei and Fiorentini 1976). Although it
is not clear at this point if these nonlinearities are essential for
the processing of binocular information, they eventually need to be
incorporated into any complete model of complex cells.
RF position and phase disparities of complex-cell subunits
To examine how complex cells encode binocular disparity, RF
position and phase disparities of underlying functional subunits were
estimated from left and right eye RFs of SVD components. The range of
RF position disparities is found to be quite small compared with that
of RF phase disparities. In addition, RF phase disparity, but not RF
position disparity, was found to exhibit a dependency on the RF
spatial frequency, a result consistent with the size-disparity
correlation observed in human psychophysics (DeValois 1982; Felton et al. 1972; Kulikowski 1978; Legge and Gu 1989; Richards and Kaye 1974; Schor and Wood 1983; Schor et al. 1984a,b; Smallman and MacLeod 1994).
Therefore it appears that complex cells encode binocular disparity
mainly through the RF phase disparity. However, because RF phase
disparities for cells tuned to high spatial frequencies are necessarily
small in deg VA, RF position disparities still may play an important role in encoding binocular disparity for these cells.
A reference-cell method (see Anzai et al. 1999a for
details) was applied for the estimation of position disparities; the
position disparity of the first SVD component was measured for each
cell with respect to RF positions of the second SVD component (a
reference) of the same cell. In other words, the position disparity
measured here is the relative position disparity of the first SVD
component to that of the second SVD component. Assuming that true
position disparities of the first and second SVD components are
uncorrelated, the distribution of relative position disparities is
expected to be broader than that of true position disparities by a
factor of √2 (see APPENDIX in Anzai et al. 1999a for details). However, because phase disparities of the
first and second SVD components are correlated, as shown in Fig.
6D, it is possible that their position disparities also are
correlated. If that is the case, then the correction factor should be
smaller than √2, and the use of √2 would
underestimate the standard deviation of the distribution for true
position disparity. Unfortunately, it is not possible to determine
whether position disparities of the first and second SVD components are
correlated. However, because position and phase disparities of simple
cells are not correlated (Anzai et al. 1999a), if simple
cells with the same phase disparity are selected randomly to feed into
a complex cell, then their position disparities would not be
correlated. Therefore unless position disparities of the subunits are
negligible compared with their phase disparities, complex cells would
have to be able to select subunits that exhibit not only the same phase
disparity but also the same position disparity for them to maintain a
selectivity to a constant binocular disparity.
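The effect of a correlation on the correction factor can be illustrated with a small simulation. It assumes, purely for illustration, that true position disparities of the two components are normally distributed with equal variance; the function `relative_sd_ratio` and the correlation value of 0.5 are hypothetical.

```python
import numpy as np

def relative_sd_ratio(rho, n=200_000, sigma=1.0, seed=0):
    """Ratio of the SD of relative position disparities (first minus second
    component) to the SD of true position disparities, for correlation rho."""
    rng = np.random.default_rng(seed)
    first = rng.normal(0.0, sigma, n)
    noise = rng.normal(0.0, sigma, n)
    # Second component has the same variance and correlation rho with the first.
    second = rho * first + np.sqrt(1.0 - rho**2) * noise
    relative = first - second
    return relative.std() / first.std()

ratio_uncorrelated = relative_sd_ratio(0.0)  # expected to be near sqrt(2)
ratio_correlated = relative_sd_ratio(0.5)    # expected to be near sqrt(2 * (1 - 0.5))
```

With a positive correlation the ratio falls below the uncorrelated value, so dividing the measured spread by the uncorrelated factor would yield too small an estimate of the true spread.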
Suppose that the position disparities of SVD components indeed were
correlated. Then the distribution of true position disparities would be
broader than that estimated in this study. How much broader would it
be? This question cannot be answered unless one measures a degree of
correlation between position disparities of the first and second SVD
components. However, if one assumes that subunits are simple cells,
then the distribution of position disparities for simple cells provides
the upper limit for the broadness of the distribution. A comparison
between the phase disparity distribution of complex cells and the
distribution for true position disparity of simple cells (Anzai
et al. 1999a) indicates that the former is broader than the
latter (see RESULTS). Therefore the range of binocular
disparities that can be encoded through RF phase disparity is still
larger than that for RF position disparity.
On the basis of the results presented in Anzai et al.
(1999a) and in this study, the following picture emerges as a
neural mechanism for encoding binocular disparity. Depending on the
phase disparity, the profile (phase) of the binocular interaction RFs along the binocular disparity axis changes, i.e., subregions of the
binocular interaction RF are located at different depths. Cells are
tuned to spatial frequency, and therefore, binocular disparity is
encoded at each spatial scale. The binocular interaction RFs would be
large for cells tuned to low spatial frequencies, and hence they could
encode a wide range of binocular disparity. Cells tuned to somewhat
higher spatial frequencies would have correspondingly smaller binocular
interaction RFs, and hence they encode a smaller range of binocular
disparity (the size-disparity correlation). Small RF position
disparities would not affect their binocular disparity tuning. However,
for cells tuned to very high spatial frequencies, the RF position
disparities may not be negligible. Therefore monocular RFs are no
longer at the corresponding points, and the location of the binocular
interaction RFs may be shifted in depth around the fixation plane. This
would effectively expand the range of binocular disparity that could be
encoded by cells tuned to high spatial frequencies. In other words, the
range of binocular disparity would be determined by the range of RF
position disparity, and would no longer be a function of spatial
frequency (a constant disparity limit).
Functional roles of binocular complex cells
In the preceding paper (Anzai et al. 1999b),
binocular interactions exhibited by simple cells were shown to be
multiplicative at the output stage. It also was shown that, because of
the multiplicative binocular interaction, responses of binocular simple
cells contain a component that is formally equivalent to a
cross-correlation of the left and right eye images that are band-pass
filtered. Because subunits of complex cells are functionally equivalent to simple cells, complex cells would be expected to exhibit a multiplicative binocular interaction. Therefore complex cells also
compute something analogous to an interocular cross-correlation of
images in a local region. The difference between the interocular cross-correlation computed by simple cells and that computed by complex
cells is that the former depends on the monocular stimulus phases,
whereas the latter does not; i.e., the binocular interaction RF is
left-right separable for simple cells, whereas it is inseparable for
complex cells.
An interocular cross-correlation is a fundamental computation for the
processing of binocular information. For example, it has been shown
that the stereo correspondence problem can be solved by computing the
interocular cross-correlation of stereo images (Jenkin and Jepson 1988; Sanger 1988). There are also
psychophysical studies that indicate that the visual system is very
sensitive to the interocular correlation of images (Cormack et al. 1991, 1993; Stevenson et al. 1991, 1992; Tyler and Julesz 1978) and that cyclopean processing in
humans is consistent with multiplicative mechanisms such as an
interocular cross-correlation (Cormack et al. 1991; Stevenson et al. 1991). The results of this
study suggest that complex cells may underlie these psychophysical data
and play an important role in solving the stereo correspondence problem.
The multiplicative binocular interaction results from a squaring
nonlinearity that follows a linear binocular filter of a subunit.
Because a linear binocular filter is simply the sum of left and right
eye linear filters, the monocular interaction also is expected to be
multiplicative. This suggests that complex cells may compute something
analogous to autocorrelation of the monocular image in a local region
(Movshon et al. 1978), which is also an algorithm useful
for detecting stimulus attributes such as form and motion. Combining
monocular and binocular processing, complex cells can be considered
local spatiotemporal correlators. From a computational point of view,
this description may be preferable to an energy detector because it
indicates the algorithm of the neural computations that they perform.
However, it should be noted that because a Fourier transform of an
autocorrelation function of signals yields a Fourier power spectrum of
the signals, the description of complex cells as a local correlator is
equivalent to the notion of an energy detector.
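The equivalence between the correlator and energy descriptions rests on the relationship between an autocorrelation function and a power spectrum, which can be verified numerically. The sketch below uses a random test signal and a circular (periodic) autocorrelation; both choices are assumptions made to keep the discrete example exact.

```python
import numpy as np

rng = np.random.default_rng(1)
signal = rng.normal(size=64)         # arbitrary test signal

# Circular autocorrelation: r[k] = sum_n signal[n] * signal[n + k].
autocorr = np.array([np.dot(signal, np.roll(signal, -k))
                     for k in range(signal.size)])

# Fourier power spectrum of the signal.
power = np.abs(np.fft.fft(signal)) ** 2

# Fourier transform of the autocorrelation function.
spectrum_of_autocorr = np.fft.fft(autocorr)
matches = np.allclose(spectrum_of_autocorr.real, power)
```

In this discrete setting the transform of the autocorrelation is real and equals the power spectrum up to numerical precision.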
In the series of three papers presented here, we have described the
functional architecture of neurons in the striate cortex for processing
binocular information. Simple cells exhibit interocular RF phase
disparities that are suitable for detecting binocular disparities in
the retinal images (Anzai et al. 1999a). The binocular disparity information encoded through such a mechanism then is subjected to a nonlinearity to perform a computation that is analogous to an interocular cross-correlation of images in a local region of
space (Anzai et al. 1999b). Simple cells provide
monocular phase specific components of the computation (Anzai et al. 1999b), whereas complex cells combine outputs of simple
cell-like subunits to eliminate the monocular phase specificity as
shown in the current paper. The results of the computation are useful
for solving the stereo correspondence problem. Considered together with
the previous work we have described, we now have a good understanding
of the functional roles of simple and complex cells with respect to
binocular vision. Therefore our findings provide a solid foundation on
which to base exploration of the next stages of binocular visual processing.
ACKNOWLEDGMENTS
We are grateful to Dr. Erich Sutter for advice on binary m-sequences and their applications to receptive field mapping and to Dr. Stanley Klein for advice on singular value decomposition analysis. We also thank Drs. Russell DeValois and Edwin Lewis for discussions and helpful comments and suggestions.
This work was supported by research and CORE grants from the National Eye Institute (EY-01175 and EY-03176).
FOOTNOTES
Address reprint requests to: R. D. Freeman, 360 Minor Hall, School of Optometry, University of California, Berkeley, CA 94720-2020.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received 2 June 1998; accepted in final form 2 April 1999.
REFERENCES