Neural Mechanisms for Processing Binocular Information I. Simple Cells

Akiyuki Anzai, Izumi Ohzawa, and Ralph D. Freeman

Group in Vision Science, School of Optometry, University of California, Berkeley, California 94720-2020


    ABSTRACT

Anzai, Akiyuki, Izumi Ohzawa, and Ralph D. Freeman. Neural Mechanisms for Processing Binocular Information I. Simple Cells. J. Neurophysiol. 82: 891-908, 1999. The visual system integrates information from the left and right eyes and constructs a visual world that is perceived as single and three dimensional. To understand neural mechanisms underlying this process, it is important to learn about how signals from the two eyes interact at the level of single neurons. Using a sophisticated receptive field (RF) mapping technique that employs binary m-sequences, we have determined the rules of binocular interactions exhibited by simple cells in the cat's striate cortex in relation to the structure of their monocular RFs. We find that binocular interaction RFs of most simple cells are well described as the product of left and right eye RFs. Therefore the binocular interactions depend not only on binocular disparity but also on monocular stimulus position or phase. The binocular interaction RF is consistent with that predicted by a model of a linear binocular filter followed by a static nonlinearity. The static nonlinearity is shown to have a shape of a half-power function with an average exponent of ~2. Although the initial binocular convergence of signals is linear, the static nonlinearity makes binocular interaction multiplicative at the output of simple cells. This multiplicative binocular interaction is a key ingredient for the computation of interocular cross-correlation, an algorithm for solving the stereo correspondence problem. Therefore simple cells may perform initial computations necessary to solve this problem.


    INTRODUCTION

Neural signals from the left and right eyes are segregated until they reach the striate cortex and converge onto single cells to form binocular neurons. Therefore it is believed that binocular neurons in the striate cortex perform initial computations for mediating binocular fusion and stereoscopic depth perception (e.g., Barlow et al. 1967; Pettigrew 1965; Pettigrew et al. 1968). To identify the neural computations carried out by the binocular neurons, it is essential to obtain rules of how signals from the two eyes are combined at the level of single neurons, i.e., the binocular interaction of signals.

Hubel and Wiesel (1959) were the first to describe binocular interactions exhibited by simple cells in the cat's striate cortex. They observed that stimulating ON (or OFF) subregions of the left and right eye receptive fields (RFs) simultaneously results in response summation, whereas stimulating an ON subregion in one eye and an OFF subregion in the other eye cancels the response. This suggests that the binocular interaction of signals may be linear. They also reported that some cells respond only when stimulated binocularly (Hubel and Wiesel 1962), which is indicative of a nonlinear binocular interaction. However, this still could be attributed to a subthreshold summation that is linear (Ohzawa and Freeman 1986). Because they found that left and right eye RFs occupy corresponding positions on the two retinae and are strikingly similar in their organization, they thought that retinal images of objects either in front of or behind the point of visual fixation would not be effective for evoking responses from the cells (Hubel and Wiesel 1959, 1962). Therefore they concluded that binocular cells in the striate cortex are probably not involved in stereoscopic depth discrimination (Hubel and Wiesel 1959, 1962, 1970, 1973). Instead, it was thought that such cells may be related to mechanisms of binocular fixation (Hubel and Wiesel 1959).

Other studies also found that the binocular interaction of signals results in response facilitation, summation, or occlusion, but contrary to Hubel and Wiesel's claim, these studies reported that a substantial number of cells are selective to binocular disparity (Barlow et al. 1967; Bishop et al. 1971; Blakemore 1969; Ferster 1981; Fischer and Kruger 1979; Kato et al. 1981; LeVay and Voigt 1988; Maske et al. 1986a,b; Pettigrew 1965; Pettigrew et al. 1968; von der Heydt 1978). For instance, Pettigrew et al. (1968) measured the tuning for binocular disparity of cells in the cat's striate cortex using moving bright bars of various binocular disparities. They found that some cells are narrowly tuned to binocular disparity and that the optimal disparity and the width of the tuning vary from cell to cell. Others found similar results (Barlow et al. 1967; Blakemore 1969; Bishop et al. 1971; Ferster 1981; Fischer and Kruger 1979; Kato et al. 1981; LeVay and Voigt 1988; Maske et al. 1986a,b; von der Heydt 1978).

Cells selective to binocular disparity also are found in monkey striate cortex (Cumming and Parker 1997; Gonzalez et al. 1993; Poggio 1990; Poggio and Fischer 1977; Poggio and Talbot 1981; Poggio et al. 1985, 1988). A proportion of these cells are shown to respond to dynamic random-dot stereograms (Cumming and Parker 1997; Gonzalez et al. 1993; Poggio 1990; Poggio et al. 1985, 1988), which suggests that the stereo correspondence problem may be solved, at least partially, at the striate cortex (Gonzalez et al. 1993; Poggio et al. 1985; but see Cumming and Parker 1997). Indeed, some of the cells are sensitive to binocular image correlation (Gonzalez et al. 1993; Poggio et al. 1985, 1988).

Although these studies have established that responses of binocular cells are modulated depending on the binocular disparity of a stimulus, there are some problems that make interpretation of the results difficult. First of all, the use of moving bars confounds spatial and temporal factors. When the binocular disparity of the stimulus is changed, the timing at which left and right eye bars reach the corresponding positions of the retinae also is changed. In other words, a binocular disparity introduces an interocular temporal offset as well as a spatial offset. Therefore it is not clear whether binocular disparity tuning results from differential responses to binocular disparity, the temporal sequence of bar stimulation, or both.

Second, there are many pairs of monocular bar positions that yield the same binocular disparity. Therefore it is possible that cells respond differently to the same binocular disparity depending on the monocular positions of the bars. The previous studies ignored this possibility either by averaging responses over space using moving bars or by the use of extended stimuli such as dynamic random-dot stereograms (but see Ohzawa et al. 1990).

Another problem is that there is some evidence that suggests that binocular disparity tuning is stimulus dependent. Maske et al. (1986a) measured the tuning for binocular disparity of cells in the cat's striate cortex using bright and dark bars. They found that tuning curves obtained with these stimuli are different for some cells. Ohzawa et al. (1990) measured binocular interaction profiles of cells in the cat's striate cortex using not only bright and dark bars but also a combination of the two, i.e., a bright bar in one eye and a dark bar in the other eye. They found that the profiles depend on the stimulus (see also Cumming and Parker 1997). Therefore binocular disparity tuning measured with only bright or dark bars/dots, as in most of the previous studies, is incomplete.

There is also an important issue that most of the previous studies could not address (but see Ferster 1981): what are the neural mechanisms underlying binocular interactions that make these cells selective to binocular disparity? Ohzawa and Freeman (1986) measured the tuning for interocular phase disparity of simple cells in the cat's striate cortex using drifting sinusoidal gratings. They found that most cells show a phase-specific binocular interaction that is consistent with the predictions of linear binocular summation. Therefore they concluded that the binocular interaction exhibited by simple cells is linear. This suggests that a simple linear mechanism is responsible for a cell's selectivity to binocular disparity.

On the other hand, there is also evidence for nonlinear binocular interactions. Ferster (1981) measured the tuning of cells in areas 17 and 18 of cats for binocular disparity using moving bright bars. He compared the binocular disparity tuning with the profiles of left and right eye RFs and found that the binocular disparity tuning can be predicted by taking a cross-correlation between the left and right eye RF profiles. This indicates that the binocular interaction is multiplicative and suggests that the mechanism underlying binocular disparity selectivity is nonlinear. This result appears to be at odds with Ohzawa and Freeman's result that binocular interaction is linear. A resolution of this apparent contradiction requires a more detailed analysis of binocular interaction and monocular RFs.

To avoid the problems of the previous studies and address the issue of neural mechanisms underlying binocular interaction, white noise analysis (e.g., Marmarelis and Marmarelis 1978) is conducted in this study. Spatiotemporal white noise generated according to binary m-sequences (Sutter 1987, 1992) is used to measure binocular interaction RFs and monocular RFs of simple cells in the cat's striate cortex. The binocular interaction RF represents how signals from the left and right eyes are combined at each pair of monocular positions. It describes how a cell responds to stimuli of various binocular disparities and how that depends on monocular stimulus position. Therefore the question of whether binocular disparity tuning depends on the monocular position of a stimulus can be addressed. The noise stimulus covers the entire left and right eye RFs and is updated rapidly so that binocular disparity exists everywhere in the RFs all the time. This ensures that spatial and temporal parameters of the stimulus are not confounded. Moreover, the stimulus contains all binocular combinations of bright and dark bars (bright-bright, dark-dark, bright-dark, and dark-bright), so that the measurement is complete.

The use of white noise also allows one to examine the system structure of cells and estimate parameters for the components of the system (see Anzai 1997 for a review on this topic). It has been proposed that simple cells can be modeled as a system that has a structure of a linear filter followed by a static nonlinearity (e.g., Albrecht and Geisler 1991; Andrews and Pollen 1979; DeAngelis et al. 1993; Hamilton et al. 1989; Heeger 1992b; Jagadeesh et al. 1993, 1997; Mancini et al. 1990; Movshon et al. 1978; Ohzawa and Freeman 1986; Pollen et al. 1988; Tadmor and Tolhurst 1989; Tolhurst and Dean 1987, 1990), and that the static nonlinearity is a half-squaring function (e.g., Emerson et al. 1989; Heeger 1992b; Mancini et al. 1990). However, most of the studies that examined the linearity of simple cells conducted rather relaxed tests of linearity (see Anzai 1997 and Heeger 1992b for a review), and the linearity was not tested for each point of the RF in space and time. In addition, any deviation from a linear prediction often was attributed to a static nonlinearity without an appropriate analysis of the nonlinearity. White noise analysis offers an alternative method of identifying the system structure of cells (e.g., Billings and Fakhouri 1978; Chen 1995; Chen et al. 1986; Hunter and Korenberg 1986; Korenberg and Hunter 1986; Marmarelis and Marmarelis 1978). For example, if a cell has a system structure of a linear binocular filter followed by a static nonlinearity, its binocular interaction RF and monocular RFs are expected to show a certain relationship. Therefore by examining the relationship among the RFs, one can determine if the system structure of binocular simple cells is consistent with the model. A similar analysis has been applied to temporal interaction (Emerson et al. 1989; Mancini et al. 1990) and spatiotemporal interaction (Jacobson et al. 1993; Emerson 1997) for monocular responses of simple cells. Once the system structure is identified, one can estimate parameters for the system components (Emerson et al. 1989; Mancini et al. 1990). In particular, parameters for nonlinear components of the system (e.g., the shape of the static nonlinearity) are important because they represent the underlying nonlinear computations performed by the cell.

Here, by determining the system structure for binocular simple cells and describing the nature of nonlinearities in the system, neural mechanisms underlying binocular interactions are identified. Thus the issue of whether binocular interaction is linear or nonlinear is resolved. This analysis also provides important clues as to what kind of neural computations are performed by binocular simple cells. Possible roles of binocular simple cells in binocular fusion and stereopsis are considered.


    METHODS

Surgical and histological procedures, apparatus, and recording procedures are identical to those described in the preceding paper (Anzai et al. 1999a). Binocular interaction RFs of simple cells are obtained, along with their monocular RFs, using dichoptic one-dimensional (1D) binary m-sequence noise (for details of the stimulus configuration, see Anzai et al. 1999a). The relationship between the binocular interaction RF and monocular RFs is analyzed for each cell to determine whether binocular simple cells behave in a way that is consistent with a model of a linear binocular filter followed by a static nonlinearity. Then for those cells that are consistent with the model, the shape of the static nonlinearity is estimated.

Construction of RF maps and their interpretation

Each spike train recorded as a response to binary m-sequence noise is cross-correlated with the stimulus sequence to obtain RF maps. The cross-correlation between the stimulus sequence in the left eye and a spike train yields a left eye RF (L). Substituting the stimulus sequence for the right eye into the cross-correlation yields a right eye RF (R). The cross-correlation among the stimulus sequences in the left and right eyes and the spike train yields a binocular interaction RF (B). The cross-correlations are computed by means of the fast m-transform (Sutter 1991), which is a very efficient algorithm for the computations. Operationally, these computations can be described as follows.

To obtain a monocular RF, first a spike train is cross-correlated with a binary m-sequence at each position of the stimulus elements. This yields a cross-correlogram that represents a temporal response profile (in steps of 5 ms) of the RF for each position (Fig. 1). Then a spatial response profile of the RF is constructed by taking a value from each correlogram at a correlation delay (τ). The monocular RF represents the responses to bright bars minus the responses to dark bars and provides the best linear approximation, in a mean-squared error sense, to the stimulus-response relationship of the cell.



Fig. 1. Construction of monocular receptive field (RF). First, a spike train is cross-correlated with a stimulus m-sequence at each position of stimulus elements. This gives a cross-correlogram that represents a temporal response profile of the RF for each position. Then a spatial response profile of the RF is constructed by taking a value from each correlogram at a correlation delay, τ.
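The cross-correlation step described above can be illustrated with a minimal sketch. It is not the fast m-transform used in the study: it computes the correlation directly, it substitutes a pseudorandom ±1 sequence for a true binary m-sequence, and it treats one stimulus frame as the unit of time. The function name `monocular_rf`, the simulated weights, and the one-frame latency are all hypothetical and serve only to show that the correlation recovers the linear weights of a toy cell.

```python
import random

def monocular_rf(stimulus, spikes, max_delay):
    """Cross-correlate spike counts with a +1/-1 (bright/dark) stimulus.

    stimulus[t][i] is the polarity of bar position i in frame t.
    spikes[t] is the spike count in frame t.
    Returns rf[delay][i], proportional to the mean response following a
    bright bar minus the mean response following a dark bar.
    """
    n_frames = len(stimulus)
    n_pos = len(stimulus[0])
    rf = []
    for delay in range(max_delay + 1):
        row = []
        for i in range(n_pos):
            total = sum(spikes[t] * stimulus[t - delay][i]
                        for t in range(delay, n_frames))
            row.append(total / (n_frames - delay))
        rf.append(row)
    return rf

# Toy check: a simulated linear cell with hypothetical weights and a
# latency of one frame (assumptions for illustration only).
random.seed(0)
n_frames, weights = 4000, [0.0, 1.0, -1.0, 0.0]
stimulus = [[random.choice((-1, 1)) for _ in weights] for _ in range(n_frames)]
spikes = [2.0] + [2.0 + sum(w * s for w, s in zip(weights, stimulus[t - 1]))
                  for t in range(1, n_frames)]
rf = monocular_rf(stimulus, spikes, max_delay=2)
print([round(v, 2) for v in rf[1]])  # close to the simulated weights
```

The spatial profile at a given delay is simply one row of the returned table, which mirrors the construction in Fig. 1.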

A binocular interaction RF is constructed as illustrated in Fig. 2. There is a region in space that is covered by both left and right eye stimuli, which is labeled as the binocular view field in the figure. Any point in the binocular view field can be specified by two stimulus bar locations, one in each eye. If this region is filled with bright dots when the corresponding left and right eye bars have the same polarity and with dark dots when the polarities are different, a two-dimensional (2D) noise pattern like that shown in the figure is obtained. This pattern changes every 40 ms according to the same m-sequence used to generate the dichoptic 1D noise stimulus. The sequence has a different time shift for each point in the binocular view field so that the synthesized pattern is uncorrelated in space and time for the purpose of RF mapping. Then one can compute a cross-correlation between a spike train and the sequence of the synthesized pattern and obtain a 2D activity map in the same way that the monocular RF is obtained. The map is called a binocular interaction RF and represents the responses to stimuli of matched polarity in the two eyes minus the responses to stimuli of mismatched polarity in the two eyes. This map reflects only responses due to nonlinear binocular interaction; i.e., if the left and right eye signals are summed linearly without any further nonlinear processing, the map is uniformly zero. As illustrated in Fig. 2, right, the binocular interaction RF has axes of left eye bar position, XL, and right eye bar position, XR. The vertical axis, D, represents binocular disparity, and the frontoparallel axis, XF, runs in the horizontal direction.



Fig. 2. Construction of binocular interaction RF. There is a region in space that is covered by left and right eye stimuli that is labeled as the binocular view field. Each point in the binocular view field can be specified by a bar position in the left eye and a bar position in the right eye. If the binocular view field is filled with bright dots when the corresponding left and right eye bars have the same polarity and with dark dots when the polarities are different, a 2-dimensional (2D) noise pattern like that shown inside the binocular view field is obtained. This pattern is also m-sequence noise and changes every 40 ms as the dichoptic noise stimulus is updated. Then a cross-correlation can be computed between a spike train and the sequence of the synthesized noise pattern in the same manner as monocular RFs are computed. The resulting activity map is the binocular interaction RF. Binocular interaction RF is mapped in the binocular view field and has axes of left eye bar position, XL, and right eye bar position, XR. Therefore binocular disparity changes along the vertical axis, D, and the frontoparallel axis, XF, runs in the horizontal direction.
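As a rough illustration of this construction, the sketch below correlates a response with the product of the left and right eye polarities, which is +1 for matched and -1 for mismatched bars. It assumes pseudorandom ±1 stimuli in place of m-sequences, a single correlation delay, and two toy cells with hypothetical weights: one that sums the two eyes linearly and one that squares the sum (full squaring is used here only for simplicity). The function name `binocular_interaction_rf` is likewise an assumption.

```python
import random

def binocular_interaction_rf(stim_left, stim_right, spikes, delay):
    """B[i][j]: correlation of the response with the polarity product of
    left eye bar i and right eye bar j (matched minus mismatched)."""
    n_frames = len(spikes)
    n_l, n_r = len(stim_left[0]), len(stim_right[0])
    b = [[0.0] * n_r for _ in range(n_l)]
    for t in range(delay, n_frames):
        sl, sr = stim_left[t - delay], stim_right[t - delay]
        for i in range(n_l):
            for j in range(n_r):
                b[i][j] += spikes[t] * sl[i] * sr[j]
    norm = n_frames - delay
    return [[v / norm for v in row] for row in b]

# Toy comparison with hypothetical monocular weights wl and wr.
random.seed(1)
n_frames = 5000
wl, wr = [1.0, -1.0], [1.0, 1.0]
stim_l = [[random.choice((-1, 1)) for _ in wl] for _ in range(n_frames)]
stim_r = [[random.choice((-1, 1)) for _ in wr] for _ in range(n_frames)]
drive = [sum(a * s for a, s in zip(wl, l)) + sum(a * s for a, s in zip(wr, r))
         for l, r in zip(stim_l, stim_r)]
linear_cell = [4.0 + d for d in drive]  # linear binocular sum plus baseline
squaring_cell = [d * d for d in drive]  # linear sum followed by squaring
b_lin = binocular_interaction_rf(stim_l, stim_r, linear_cell, delay=0)
b_sq = binocular_interaction_rf(stim_l, stim_r, squaring_cell, delay=0)
print(b_lin)  # entries near zero: no nonlinear binocular interaction
print(b_sq)   # entries near 2 * wl[i] * wr[j]
```

For the squaring cell the map comes out close to twice the outer product of the two sets of monocular weights, while the purely linear cell gives a map near zero, in line with the statement that the map reflects only nonlinear binocular interaction.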

Identification of the system structure for binocular simple cells

White noise analysis allows one to determine the system structure of cells (see Anzai 1997 for a review). For a binocular simple cell that has a structure of a linear binocular filter followed by a static nonlinearity, as depicted in Fig. 3, a relationship exists between the binocular interaction RF and monocular RFs. That is, such a cell satisfies the following condition (see APPENDIX for derivation)
$$B_{i,j}(\tau) \approx \alpha \cdot L_i(\tau) \cdot R_j(\tau) \qquad (1)$$
where L, R, and B are the left eye, right eye, and binocular interaction RFs, respectively. The subscripts, i and j, denote stimulus bar positions in the left and right eyes, respectively. The variable, τ, indicates a cross-correlation delay, and α represents a constant. This equation states that if a binocular simple cell has the system structure illustrated in Fig. 3, then its binocular interaction RF is approximately proportional to the product of the left and right eye RFs (note that for a Gaussian input stimulus this relationship is exact, not an approximation). In other words, if one plots the binocular interaction RF, B_{i,j}(τ), against the product of the left and right eye RFs, L_i(τ) · R_j(τ), then the data points should fall on a straight line through the origin with a slope α. A linear regression analysis is conducted on such a plot for each cell to examine if the condition stated in Eq. 1 is met.



Fig. 3. Model of binocular simple cells. Model consists of 2 major parts: a linear binocular filter (the dashed box) and a static nonlinearity (N). Binocular filter is subdivided into a left eye linear filter (L) and a right eye linear filter (R). These monocular filters receive inputs [SL(t) and SR(t)] from multiple spatial locations in the left and right eyes. Outputs from the monocular filters are combined linearly to produce the output of the binocular filter, W(t). Finally, the static nonlinearity converts W(t) to Y(t), the output response of the cell to the stimuli SL(t) and SR(t). The variables SL, SR, W, and Y are all functions of time t.

In this analysis, data points outside the cell's monocular RFs are excluded; the values of i, j, and τ in Eq. 1 are restricted to be within the extent of the cell's RFs. The extent of each monocular RF is defined by the smallest region in space and time outside of which the squared value of each data point is <5% of the squared value of the peak data point.
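The test of Eq. 1 can be sketched as follows, under simplifying assumptions: the RFs are taken at a single correlation delay (so the extent is computed in space only), and the fit is an ordinary least-squares line with an intercept. The helper names `rf_extent` and `product_model_fit` and the example values are hypothetical.

```python
import math

def rf_extent(rf, fraction=0.05):
    """Smallest index range outside of which every squared value is below
    `fraction` of the squared peak value."""
    peak = max(v * v for v in rf)
    inside = [k for k, v in enumerate(rf) if v * v >= fraction * peak]
    return range(inside[0], inside[-1] + 1)

def product_model_fit(left, right, b):
    """Pearson r and regression slope of b[i][j] against left[i] * right[j],
    restricted to positions inside both monocular RF extents."""
    idx = [(i, j) for i in rf_extent(left) for j in rf_extent(right)]
    xs = [left[i] * right[j] for i, j in idx]
    ys = [b[i][j] for i, j in idx]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy), sxy / sxx

# Synthetic example in which B is exactly alpha * L * R with alpha = 3.
left = [0.0, 1.0, -2.0, 0.1]
right = [0.05, -1.0, 1.5, 0.0]
b = [[3.0 * lv * rv for rv in right] for lv in left]
r_value, alpha = product_model_fit(left, right, b)
print(list(rf_extent(left)), r_value, alpha)
```

In this synthetic case the weak flanking values fall below the 5% criterion and are excluded, and the fit returns a correlation of essentially 1 with a slope equal to the chosen α.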

Estimating the shape of the static nonlinearity

For cells that are well described by a linear binocular filter followed by a static nonlinearity (i.e., those that satisfy Eq. 1), the shape of the static nonlinearity is estimated from its input-output relationship. The input to the static nonlinearity, i.e., the output of a linear binocular filter [denoted by W(t) in Fig. 3], is estimated by convolving the monocular RFs (L and R) with the noise stimuli (SL and SR) used to obtain the RFs (see footnote 1). The output of the static nonlinearity [Y(t) in Fig. 3] is the spike train recorded as a response to the noise stimuli. Both W and Y represent a time series. A value of W indicates an input to the static nonlinearity summed over a period of 40 ms, which is the stimulus update period. Likewise, a value of Y represents a total spike count for a 40-ms period. The input-output relationship of the static nonlinearity then is obtained by plotting Y values against W values for the entire record of the spike train (~20 min long). Because spike generation is a stochastic process, the same input value of W does not necessarily yield the same spike count Y. Therefore the axis for the input W is divided into bins (see the legend of Fig. 9 for details of binning) and a mean Y value and a mean W value of the data points are computed for each bin. A curve connecting the mean values for all bins defines the shape of the static nonlinearity. See Fig. 9 for an example.
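A minimal sketch of this two-step estimate is given below. The binning rule used in the study is given in the legend of Fig. 9, which is not reproduced here, so equal-count bins are used purely as an assumption. The function names `filter_output` and `binned_nonlinearity`, the frame-based time axis, and the noiseless half-squaring example are also hypothetical.

```python
def filter_output(rf_left, rf_right, stim_left, stim_right):
    """W[t]: sum over delays and positions of RF weight times stimulus value.

    rf_left[d][i] and rf_right[d][j] are RF values at delay d (in frames);
    stim_left[t][i] and stim_right[t][j] are +1/-1 bar polarities.
    The first len(rf_left) - 1 frames are skipped (incomplete history).
    """
    n_delays = len(rf_left)
    w = []
    for t in range(n_delays - 1, len(stim_left)):
        total = 0.0
        for d in range(n_delays):
            total += sum(a * s for a, s in zip(rf_left[d], stim_left[t - d]))
            total += sum(a * s for a, s in zip(rf_right[d], stim_right[t - d]))
        w.append(total)
    return w

def binned_nonlinearity(w, y, n_bins):
    """Sort (W, Y) pairs by W, split them into equal-count bins (the last
    bin takes any remainder), and return (mean W, mean Y) for each bin."""
    pairs = sorted(zip(w, y))
    size = len(pairs) // n_bins
    curve = []
    for k in range(n_bins):
        end = (k + 1) * size if k < n_bins - 1 else len(pairs)
        chunk = pairs[k * size:end]
        curve.append((sum(p[0] for p in chunk) / len(chunk),
                      sum(p[1] for p in chunk) / len(chunk)))
    return curve

# Noiseless example: a half-squaring nonlinearity applied to known inputs.
w_demo = [x / 10 for x in range(-50, 50)]
y_demo = [max(v, 0.0) ** 2 for v in w_demo]
curve = binned_nonlinearity(w_demo, y_demo, n_bins=10)
print(curve[0], curve[-1])
```

With real spike counts the per-bin averaging is what smooths out the trial-to-trial variability in Y for a given W.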

As shown in RESULTS, the static nonlinearity turns out to be an expansive function. Such a function can be well described by a half-power function of the form
$$Y = \max[\beta(W + \theta),\ 0]^{n} \qquad (2)$$
This equation is a combination of a threshold and an expansive (for n > 1) power function. The parameter θ represents a threshold value, and β scales the input W. Max[ ] denotes the largest of the values inside the brackets and acts as a rectifier. The exponent n determines the steepness of the expansive function. When n is 1 and θ is 0, Eq. 2 reduces to a half-rectification. If n equals 2, then it is a half-squaring function (Heeger 1992b). Taking the logarithm of Eq. 2 (for β(W + θ) > 0) yields
$$\mathrm{Log}\, Y = n\, \mathrm{Log}\, \beta + n\, \mathrm{Log}\,(W + \theta) \qquad (3)$$
This equation indicates that the relationship between Log Y and Log W can be described by a straight line with a slope n for W >> θ. It also predicts that Log Y values at W ≈ θ would deviate from the straight line (see Fig. 9B for an example). To estimate the exponent of the static nonlinearity, we plot the data on log-log coordinates and fit a straight line to three consecutive data points to find the maximum slope, which is taken as an estimate of the exponent.
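The maximum-slope estimate can be sketched as below. The input is assumed to be a list of (mean W, mean Y) pairs ordered by W, and only points with positive W and Y are used, since the logarithm is undefined otherwise. The function names `slope` and `estimate_exponent` and the test curves are hypothetical.

```python
import math

def slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    return (sum((x - mx) * (y - my) for x, y in points)
            / sum((x - mx) ** 2 for x, _ in points))

def estimate_exponent(curve):
    """Maximum slope of a line fitted to three consecutive points of the
    input-output curve on log-log coordinates."""
    logs = [(math.log10(w), math.log10(y)) for w, y in curve if w > 0 and y > 0]
    return max(slope(logs[k:k + 3]) for k in range(len(logs) - 2))

# Pure half-squaring (theta = 0, beta = 1, n = 2): every window has slope 2.
pure = [(w, w ** 2) for w in (0.5, 1, 2, 4, 8)]
# Same exponent with theta = 0.5: the slope approaches 2 only for W >> theta.
shifted = [(w, (w + 0.5) ** 2) for w in (1, 2, 4, 8, 16, 32, 64)]
print(estimate_exponent(pure), estimate_exponent(shifted))
```

The second example shows the behaviour predicted by Eq. 3: with a nonzero θ the log-log curve bends at small W, and the slope recovers the exponent only where W is large relative to θ.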


    RESULTS

Monocular RFs and binocular interaction RFs have been obtained for 85 binocular simple cells in 16 adult cats. Of these, 49 cells exhibited substantial binocular interactions and are analyzed here. The remaining 36 cells showed only weak binocular interactions. Ten of these cells were responsive to stimulation of either eye, but their signal-to-noise ratios for the binocular interaction RF are low due to relatively low spike counts. The rest of the cells were either ocularly very unbalanced or not very responsive to stimulation of either eye. Therefore these 36 cells have been excluded from the analysis.

Examples of monocular RFs and binocular interaction RFs

Figure 4 shows examples of monocular RFs (L and R) and binocular interaction RFs (B) for six simple cells. For each cell, the RFs are constructed at a common correlation delay, which is chosen from optimal correlation delays of the RFs. The optimal correlation delay of an RF is defined as the delay at which the sum of squared values of all data points in the RF is maximum. For a given cell, two monocular RFs and a binocular interaction RF generally had the same optimal correlation delay. When they had different optimal delays (differences never exceeded 20 ms), the one that maximizes signal-to-noise ratios of the RFs was chosen to be the common correlation delay.
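The definition of the optimal correlation delay translates directly into a few lines of code. In this sketch each RF is assumed to be stored as one flat list of values per delay (a 2D binocular interaction RF would be flattened first); the function name `optimal_delay` and the example values are hypothetical.

```python
def optimal_delay(rf_by_delay):
    """Index of the delay at which the sum of squared RF values is largest."""
    energy = [sum(v * v for v in rf) for rf in rf_by_delay]
    return max(range(len(energy)), key=energy.__getitem__)

# Three delays of a toy RF; the middle one carries the most signal.
toy_rf = [[0.1, -0.1, 0.0], [1.0, -2.0, 0.5], [0.5, 0.5, -0.2]]
best = optimal_delay(toy_rf)
print(best)
```

The returned value is an index into the list of delays, which would still need to be mapped onto the actual delay in milliseconds.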



Fig. 4. Examples of monocular RFs and binocular interaction RFs. One-dimensional profiles of left (L) and right (R) eye RFs and contour plots of binocular interaction RFs (B) are shown for 6 simple cells (A-F). The symbols, XL and XR, denote positions in the left and right eyes, respectively. The axes, XF and D, represent position along the frontoparallel axis and binocular disparity, respectively. For each cell, monocular RFs and the binocular interaction RF are plotted on the same scale. Contour lines are drawn such that they divide the response amplitude between zero and either a positive or negative peak, whichever is greater, into 8 equally spaced levels. Solid and dashed contours indicate positive and negative values, respectively. Checkered pattern of the binocular interaction RF is a characteristic of simple cells. Cross-correlation delays: 44 (A), 49 (C and F), and 54 ms (B, D, and E).

Monocular RFs of the cells shown in Fig. 4, A and D, have similar profiles in the two eyes, indicating relatively small RF phase disparities. On the other hand, the rest of the cells have clearly different RF profiles in the two eyes, i.e., some degree of RF phase disparity. By definition, these RFs represent spatial structures that characterize the best linear transformation between stimulus and response (in a mean-squared error sense). In other words, if the left and right eye signals were summed linearly without any further nonlinear processing, these RFs would be sufficient to characterize the cell's responses to binocular stimulation. However, binocular simple cells also exhibit nonlinear response properties, as evidenced by the binocular interaction RFs shown in the figure. This indicates that a cell's response to binocular stimulation is determined not just by the structure of the monocular RFs but also by the structure of the binocular interaction RF.

The common feature of the binocular interaction RFs is their checkered patterns. The checker elements indicated by the solid and dashed contours represent combinations of positions in the left and right eyes at which the cell responds preferentially to the interocular polarity matched and mismatched stimuli, respectively. The polarity of the checker elements changes along the axes of stimulus position in the left eye (XL) and in the right eye (XR). This indicates that the binocular interaction of simple cells depends on the monocular stimulus position or phase. Because of this, the strength of binocular interaction depends not only on the stimulus binocular disparity (D), but also on the stimulus position or phase along the frontoparallel axis (XF) where binocular disparity is constant. However, because the polarity of the checker elements does not change along the frontoparallel axis, integrating the binocular interaction RF along the constant disparity axis would yield a binocular disparity tuning function. Therefore binocular disparity tuning exhibited by simple cells is a consequence of their tuning for monocular phase in each eye and not for binocular disparity per se.
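The integration along the constant-disparity axis can be sketched as a sum over the diagonals of the binocular interaction RF. The indexing b[i][j] (left eye position i, right eye position j), the sign convention disparity = j - i, and the function name `disparity_tuning` are assumptions made for illustration.

```python
def disparity_tuning(b):
    """Sum b[i][j] over all position pairs that share the disparity j - i."""
    tuning = {}
    for i, row in enumerate(b):
        for j, value in enumerate(row):
            tuning[j - i] = tuning.get(j - i, 0.0) + value
    return dict(sorted(tuning.items()))

# Toy checkered map: the outer product of L = [1, -1] and R = [1, -1].
checkered = [[1, -1],
             [-1, 1]]
print(disparity_tuning(checkered))  # {-1: -1.0, 0: 2.0, 1: -1.0}
```

When the map is the outer product of two monocular profiles, as in this toy case, the summed diagonals are the cross-correlation of those profiles at each interocular shift.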

The checkered pattern also suggests that the binocular interaction RF is separable into left and right eye functions, i.e., the RF is described as the product of two functions, one for each eye. In fact, locations of checker elements seem to be aligned with locations of peaks and troughs of monocular RFs, implying that the left and right eye RFs may be the two functions. As described in METHODS, if a binocular simple cell has a system structure of a linear binocular filter followed by a static nonlinearity, the binocular interaction RF should be proportional to the product of the left and right eye RFs. This prediction is examined next.

Structure analysis of binocular simple cells

To determine if binocular interaction RFs are proportional to the product of left and right eye RFs, first qualitative comparisons are made between the predictions and raw data in Fig. 5. In the figure, binocular interaction RFs of three cells from Fig. 4 are shown on the left (Raw data). The product of the left and right eye RFs is computed for each cell and is shown on the right side of the figure (Prediction), along with 1D profiles of the left and right eye RFs. Contour plots for the predictions are quite similar qualitatively to those for the raw data, suggesting that they are proportional to each other. This is consistent with the results of Ferster (1981), who showed that the binocular disparity tuning of simple cells can be predicted by taking a cross-correlation between the left and right eye RF profiles (dot products of the left and right eye RF profiles at various interocular RF shifts).



Fig. 5. Comparisons between binocular interaction RFs and the product of left and right eye RFs. Binocular interaction RFs (B) of 3 cells (A-C) from Fig. 4 are shown on the left (Raw data). Contour plots for the product of left and right eye RFs (L×R) are shown for each cell (D-F) on the right (Prediction) along with 1-dimensional (1D) profiles of the left (L) and right (R) eye RFs. Contour plots for the prediction are scaled so that each has the same peak as that of the corresponding plot for the raw data. Predictions are qualitatively very similar to the raw data.

This finding is further confirmed by the following quantitative comparisons. In Fig. 6, the value of each data point in the binocular interaction RF is plotted against that of the corresponding point in the predicted interaction RF, i.e., the product of left and right eye RFs, for each of the cells shown in Fig. 4. The solid lines indicate linear regression lines fitted to the data. Clearly, a straight line provides a good fit. Pearson's correlation coefficient r is indicated at the top right of each plot. The coefficients are very high (>0.9) for all cells shown here, suggesting that binocular interaction RFs are proportional to the product of left and right eye RFs.



Fig. 6. Scatter plots showing the relationship between values from the binocular interaction RF (B) and those from the predicted RF (L×R) for the cells shown in Fig. 4 (A-F). Open circles: values from individual locations in a binocular interaction RF at a cross-correlation delay. Solid lines: linear regression fits. Correlation coefficients (r) are in general very high (>0.9), suggesting a linear relationship between the raw data and the prediction.

Figure 7 shows a histogram of correlation coefficients for the population of cells examined. The distribution is strongly biased toward high values, and ~80% of the cells have an r value >= 0.75. Therefore most binocular simple cells behave in a manner that is consistent with the model of a linear binocular filter followed by a static nonlinearity, as depicted in Fig. 3. Similar results have been obtained for temporal interaction data of simple cells (Emerson et al. 1989; Mancini et al. 1990).



Fig. 7. A histogram of correlation coefficients (r) for a population of binocular simple cells (n = 49). Distribution is heavily biased toward high values, and ~80% of the cells have a correlation coefficient of >= 0.75, which suggests that binocular interaction RFs of most cells are proportional to the product of the left and right eye RFs. Therefore a majority of binocular simple cells can be modeled as a linear binocular filter followed by a static nonlinearity.

Figure 7 also indicates that 10 cells (20% of sample) have correlation coefficients of <0.75; their binocular interaction RFs are correlated only moderately with the products of left and right eye RFs. One example of such cells is shown in Fig. 8. The binocular interaction RF (Raw data in Fig. 8A) of this cell is somewhat elongated along the frontoparallel axis and therefore cannot be described by the product of the left and right eye RFs (Prediction in Fig. 8B). When data points of the binocular interaction RF are plotted against those of the predicted RF (Fig. 8C), they scatter vertically around a linear regression line (---), resulting in only a moderate correlation (r = 0.7). Of 10 cells with correlation coefficients <0.75, 8 exhibit a left-right inseparable binocular interaction RF at one or more cross-correlation delays (see also Emerson 1997; Jacobson et al. 1993). Therefore these cells have a system structure that is different from that depicted in Fig. 3. However, it is not clear if they are real variations of simple cells or simple cell-like complex cells because binocular interaction RFs of complex cells are inseparable (Anzai et al. 1999b). In the following section, only those cells with a correlation coefficient of >= 0.75 (n = 39) are considered to have the system structure illustrated in Fig. 3 and are subjected to further analysis.



Fig. 8. Example of a simple cell for which the binocular interaction RF (Raw data) is only weakly correlated with the product of the left and right eye RFs (Prediction). Binocular interaction RF (B) is elongated slightly along the frontoparallel axis XF (A), and therefore cannot be described as the product (L×R) of left (L) and right (R) eye RFs (B). Correlation delay: 79 ms. Scatter plot (C) shows the relationship between the binocular interaction RF and the predicted RF. ---, linear regression fit. Correlation coefficient (r) is 0.70, which suggests that a model of a linear binocular filter followed by a static nonlinearity does not provide an accurate description of the data for this cell.

Shape of the static nonlinearity

Having identified the system structure for binocular simple cells, one can proceed to estimating parameters for the components of the system. There are three components in the system: a left eye linear filter, a right eye linear filter, and a static nonlinearity (Fig. 3). Because monocular RFs are already in hand, the shapes of the left and right eye filters are known. Only the shape of the static nonlinearity needs to be determined.

The shape of the static nonlinearity is obtained from its input-output function (see METHODS for details). As shown in Fig. 3, the input to the static nonlinearity W(t) is the output of the linear binocular filter and can be estimated by convolving monocular RFs with the stimulus used to obtain the RFs. The output of the static nonlinearity Y(t) is the spike train obtained as the cell's response to the stimulus. Figure 9A shows an example of the input-output function plotted on linear coordinates. Each dot represents a data point for a pair with input value W and output value Y. Because Y is a spike count, it is never negative and takes discrete values, whereas W is continuous and can be negative. The horizontal axis is divided into bins (see the legend of Fig. 9A for details of binning), and mean W and Y values of the data points are computed for each bin. The mean data are indicated by open circles and open triangles in the figure. Note that the mean Y values do not necessarily fall on the middle of the data ranges along the vertical axis. This is because the distribution of the data points along the axis is generally heavily biased toward zero. Solid lines connecting the open symbols represent the shape of the static nonlinearity. As seen in this example, the static nonlinearity has the shape of an expansive function.
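The binning step can be illustrated with the following Python sketch. It assumes the scheme given in the legend of Fig. 9 (bins 0.1 wide for W < 0.1 and 0.1 log unit wide for W >= 0.1); the alignment of the linear bins to the split point, the function name binned_means, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def binned_means(w, y, lin_step=0.1, log_step=0.1, split=0.1):
    """Mean W and mean Y per bin: linear bins below `split`, log bins at or above it.

    Bin indices are negative for linear bins (-1 is [split - lin_step, split))
    and non-negative for log bins (0 starts at `split`).
    """
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = np.empty(w.shape, dtype=int)
    low = w < split
    # Linear binning for small or negative inputs
    idx[low] = np.floor((w[low] - split) / lin_step).astype(int)
    # Logarithmic binning for larger inputs
    idx[~low] = np.floor(np.log10(w[~low] / split) / log_step).astype(int)
    bins = np.unique(idx)
    mean_w = np.array([w[idx == b].mean() for b in bins])
    mean_y = np.array([y[idx == b].mean() for b in bins])
    return bins, mean_w, mean_y

# Toy data (hypothetical): normalized inputs and spike counts per 40-ms period
demo_w = np.array([-0.05, 0.05, 0.06, 0.1, 0.11, 0.5])
demo_y = np.array([0, 0, 2, 1, 3, 10])
bins, mean_w, mean_y = binned_means(demo_w, demo_y)
for b, mw, my in zip(bins, mean_w, mean_y):
    print(f"bin {b:3d}: mean W = {mw:.3f}, mean Y = {my:.2f}")
```

Averaging within bins is what turns the widely scattered, discrete spike counts into a smooth estimate of the input-output function.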



Fig. 9. Example of a static nonlinearity. Static nonlinearity can be characterized by its input-output function. Input to the static nonlinearity [W(t) in Fig. 3] is the output of the linear binocular filter, which is estimated by convolving monocular RFs (L and R) with the stimulus (SL and SR) used to map the RFs. Output of the static nonlinearity [Y(t) in Fig. 3] is the spike train obtained as a response to the stimulus. A: example of the input-output function plotted on linear coordinates. Dots, spike count [Y(t)] for 1 of the 40-ms periods taken from the entire spike train that is ~20 min long, plotted against the normalized input [W(t)] estimated for the same period. x axis is divided into bins; the bin size is set to 0.1 for W < 0.1 (linear binning) and to 0.1 log unit for W >= 0.1 (log binning). A mean Y value and a mean W value of the data points are computed for each bin and are indicated by open triangles and open circles for linear and log binning, respectively. Error bars indicate ±1 SE and are generally smaller than the size of the symbols. Solid lines connecting the open symbols, shape of the static nonlinearity. Static nonlinearity of simple cells is essentially an expansive function. B: input-output function plotted on log-log coordinates. Only mean data at W >= 0.1 (open circles in A) are plotted. Three consecutive data points are fitted by a straight line to find the maximum slope, which is taken as an estimate of the exponent (n) for the static nonlinearity. Filled circles, 3 data points that yield the maximum slope (n = 2.08) for a straight line fit (---). Static nonlinearity of this cell is approximately a half-squaring function. Deviation from the straight line of the data points at W < 0.2 is predicted by the effect of the threshold (θ in Eq. 3).

Because the data points in Fig. 9A are scattered widely, one might wonder if the static nonlinearity is actually a half-rectification but noise in the system makes it look like an expansive function on average. It is also possible that the threshold for spiking changes from time to time. Then the shape of the static nonlinearity would be smeared when averaged over time and a half-rectification could look like an expansive function. Although we cannot rule out these possibilities, it is nonetheless important to characterize the shape of the static nonlinearity as a functional description of the cell. The result shown in Fig. 9A indicates that, regardless of its true shape, the static nonlinearity acts like an expansive function on average.

As described in METHODS, an expansive function like that seen in Fig. 9A can be well described by a half-power function (Eq. 2). The degree of expansion is represented by an exponent (n in Eq. 2) of the power function, which can be estimated quite easily by plotting the input-output function on log-log coordinates as shown in Fig. 9B. On log-log coordinates, the exponent corresponds to the slope of a straight line (see Eq. 3). We fit a straight line to three consecutive data points to find the maximum slope, which is taken as an estimate of the exponent for the static nonlinearity (see METHODS for details). For the example shown in Fig. 9B, the three data points indicated by filled circles yield the maximum slope of 2.08 for a straight line fit (---). The static nonlinearity of this cell is, therefore, approximately a half-squaring function. Note that the deviation from the straight line of the data points at W < 0.2 is predicted by the effect of a threshold (θ in Eq. 3). It is also interesting that this cell does not show clear response saturation, despite the fact that instantaneous spike rates could exceed 400 spikes/s (Fig. 9A). This is in marked contrast to the response saturation seen in contrast response functions (e.g., Albrecht and Hamilton 1982; Anzai et al. 1995; Dean 1981; Maffei and Fiorentini 1973; Movshon and Tolhurst 1975; Sclar et al. 1990; Tolhurst et al. 1981). It is possible that response saturation is a consequence of adaptation to a prolonged exposure of the cell to a band-limited stimulus (such as a sinusoidal grating) of high contrast, and that, without such adaptation, cells can produce a much higher spike rate instantaneously.
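The maximum-slope procedure can be sketched as follows. This is a minimal Python illustration, assuming that the binned means are already available and strictly positive; the function name max_slope_exponent and the synthetic data are hypothetical. In particular, the small additive floor in the synthetic data is only a device to flatten the low end of the curve and is not the threshold term of Eq. 3.

```python
import numpy as np

def max_slope_exponent(mean_w, mean_y, window=3):
    """Fit a line to every run of `window` consecutive points on log-log
    coordinates and return (maximum slope, index of the first point of that run)."""
    log_w = np.log10(np.asarray(mean_w, dtype=float))
    log_y = np.log10(np.asarray(mean_y, dtype=float))
    best_slope, best_start = -np.inf, -1
    for i in range(len(log_w) - window + 1):
        # Slope of the least-squares line through this run of points
        slope = np.polyfit(log_w[i:i + window], log_y[i:i + window], 1)[0]
        if slope > best_slope:
            best_slope, best_start = slope, i
    return best_slope, best_start

# Synthetic mean data at W >= 0.1, spaced by 0.1 log unit (assumption)
demo_w = 0.1 * 10.0 ** (0.1 * np.arange(10))
demo_y = 5.0 * demo_w**2 + 0.05   # squaring plus a floor that flattens low W

n_est, start = max_slope_exponent(demo_w, demo_y)
print(f"estimated exponent n = {n_est:.2f} from points {start}..{start + 2}")
```

Taking the maximum slope rather than a fit to all points reduces the influence of the flattened region at low W on the estimate of the exponent.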

Figure 10 shows more examples of the input-output function on log-log coordinates. The effect of a threshold is apparent at low W values in any of the cells shown, but only a slight hint of response saturation can be seen in some cells (D-F). The maximum slope (n) of a straight line fit (---) varies from cell to cell, indicating that each cell has a different exponent. In Fig. 11, a histogram of exponents is shown for the population of simple cells examined. The exponent ranges from 1.32 to 3.11. The distribution has a mean of 2.17 ± 0.53 SD. Therefore the exponent of the static nonlinearity for binocular simple cells is, on average, ~2. Emerson and his collaborators (Emerson et al. 1989; Mancini et al. 1990) conducted a similar analysis on temporal interaction data of simple cells and found that a second-degree polynomial captures the main characteristic of the shape of the static nonlinearity. They concluded that the static nonlinearity is basically a half-squaring function, which is concordant with the results presented here.



Fig. 10. Examples of static nonlinearities. Each plot (A-F) indicates the input-output function plotted on log-log coordinates of the static nonlinearity for a simple cell. Only mean data points (filled and open circles) are shown for clarity. ---, best fits to the 3 consecutive data points (filled circles) that yield the maximum slope. Maximum slope represents the exponent (n) of a power for the static nonlinearity and is indicated at top right of each plot.



Fig. 11. Histogram of exponents of the static nonlinearity for the population of binocular simple cells examined (n = 39). Exponent ranges from 1.32 to 3.11. Mean and SD of the distribution are 2.17 ± 0.53.


    DISCUSSION

In this study, white noise analysis has been applied to measurements of binocular interaction RFs and monocular RFs for simple cells in the cat's striate cortex. Binocular interaction RFs of most simple cells are found to be proportional to the product of left and right eye RFs. This indicates that the binocular interaction depends not only on stimulus binocular disparity but also on stimulus position or phase in the left and right eyes. The binocular interaction RF is consistent with that of a linear binocular filter followed by a static nonlinearity. The static nonlinearity is well characterized by a half-power function with an average exponent of ~2, i.e., a half-squaring function. This squaring nonlinearity is an implementation of a multiplicative operation and may play a fundamental role in computations performed by simple cells. In the context of binocular information processing, the squaring nonlinearity makes the initial linear convergence of signals from the left and right eyes multiplicative at the output of simple cells. This multiplicative binocular interaction is a key ingredient for the computation of interocular cross-correlation, an algorithm for solving the stereo correspondence problem. Therefore the process of solving the stereo correspondence problem may begin with these binocular simple cells.

Binocular interaction RF and binocular disparity tuning

The binocular interaction RF is a response map of nonlinear binocular interaction. It describes how a cell responds to stimuli at various positions in the left and right eyes, compared with the prediction from a linear sum of responses to stimulation of either eye. Therefore it represents the tuning of a cell for binocular disparity and how it depends on stimulus position in each eye.

In most previous studies, disparity tuning was measured with moving bars or extended stimuli such as random-dot stereograms. As a result, the dependency of the tuning on monocular stimulus position could not be examined. Binocular interaction RFs reported in this study reveal that binocular interactions exhibited by simple cells do depend on monocular stimulus positions and in a predictable manner. The binocular interaction RF is proportional to the product of the left and right eye RFs. Therefore the binocular disparity tuning of a simple cell, which may be obtained by integrating its binocular interaction RF along the frontoparallel axis, is predictable from its monocular RFs.

Ferster (1981) described how the binocular disparity tuning of simple cells can be predicted from monocular RFs. He computed dot products of left and right eye RFs for various interocular RF shifts (i.e., binocular disparities) and obtained the predicted tuning for binocular disparity. He found that the predicted and measured tuning matched very well. This computation corresponds to a cross-correlation between left and right eye RFs and is operationally equivalent to deriving a binocular interaction RF as the product of left and right eye RFs (as shown in Fig. 5, right) and integrating the binocular interaction RF along the frontoparallel axis (see Fig. 12). Because the measured binocular interaction RF is proportional to the product of left and right eye RFs, this is, in fact, an appropriate way of predicting binocular disparity tuning from monocular RFs.
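The operational equivalence described above can be checked numerically. The Python sketch below is an illustration under assumptions, not a reproduction of either study's analysis: the monocular RFs are hypothetical Gabor profiles sampled on the same grid, disparity is taken as the left-eye index minus the right-eye index (the sign convention is arbitrary here), and lines of constant disparity are the diagonals of the product matrix.

```python
import numpy as np

def gabor(x, sigma, freq, phase):
    """1D Gabor profile, used here as a hypothetical monocular RF."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def tuning_by_integration(rf_left, rf_right):
    """Sum the product RF along each diagonal of constant disparity.

    Assumes both RFs have the same number of samples.
    """
    b = np.outer(rf_right, rf_left)          # b[i, j] = R[i] * L[j]
    n = len(rf_left)
    return np.array([np.trace(b, offset=k) for k in range(-(n - 1), n)])

def tuning_by_xcorr(rf_left, rf_right):
    """Dot products of the two RFs at every interocular shift."""
    return np.correlate(rf_left, rf_right, mode="full")

x = np.linspace(-2.0, 2.0, 41)
rf_l = gabor(x, 0.8, 0.5, 0.0)
rf_r = gabor(x, 0.8, 0.5, np.pi / 2)   # 90 deg RF phase disparity (assumption)

tuning_a = tuning_by_integration(rf_l, rf_r)
tuning_b = tuning_by_xcorr(rf_l, rf_r)
print("methods agree:", np.allclose(tuning_a, tuning_b))
```

Changing the phase of rf_r in this sketch is a simple way to explore how the shape of the predicted tuning curve depends on the RF phase disparity.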



Fig. 12. Schematic illustration of how binocular disparity tuning curves of various shapes can be obtained from a pair of monocular RFs with various phase disparities. A: for a cell with the same RF profile in the 2 eyes (i.e., no phase disparity), the binocular interaction RF, which is proportional to the product of the left and right eye RFs, is symmetric around the axis of a constant binocular disparity that goes through the peak. Therefore the binocular disparity tuning function obtained by integrating the binocular interaction RF along the frontoparallel axis is expected to be symmetric around the peak of the tuning function. It will belong to the tuned excitatory category. B: as the RF phase disparity increases, the binocular interaction RF becomes more and more asymmetric, resulting in asymmetric disparity tuning. Tuning will be classified as tuned near or tuned far if the RF spatial frequency is high (subregions of the monocular RF are small) and near or far if the RF spatial frequency is low (RF subregions are large). C: if the left and right eye RFs are sign-inverted versions of each other (i.e., RF phase disparity of 180°), then the binocular disparity tuning will be symmetric around the negative peak, similar to that of the tuned inhibitory category.

This predictability of binocular disparity tuning from monocular RFs implies that there will be some relationships between the parameters of binocular disparity tuning and monocular RFs. First of all, the cell's optimal disparity should correspond to the distance between the peaks of left and right eye RFs, i.e., the RF phase disparity (see footnote 2). When the RF phase disparity is small, i.e., left and right eye RFs are similar in shape, the disparity tuning function is expected to be symmetric around the peak of the tuning function because the binocular interaction RF is symmetric around the axis of a constant binocular disparity that goes through the peak (Fig. 12A). It will resemble the disparity tuning function for cells in the tuned excitatory category according to Poggio's classification (Poggio and Fischer 1977; see Poggio 1995 for a review). As the RF phase disparity increases, the disparity tuning becomes more and more asymmetric (Fig. 12B). It will be similar to that of tuned near or tuned far cells if the RF spatial frequency is high (subregions of the monocular RF are small) and near or far cells if the RF spatial frequency is low (RF subregions are large). If the RF phase disparity is maximum (±180°), i.e., left and right eye RFs are sign-inverted versions of each other, then the binocular disparity tuning will be symmetric around the negative peak (Fig. 12C), similar to that of tuned inhibitory cells. If monocular RFs have multiple subregions, then the disparity tuning also should have multiple peaks. The width of disparity tuning will be proportional to the size of the subregions or inversely proportional to the RF spatial frequency. Therefore the profiles of the monocular RFs are important in determining the shape of the binocular disparity tuning for simple cells.

Because binocular disparity tuning depends on monocular RFs, simple cells are not truly tuned to binocular disparity per se. They are simply tuned to the spatial phases of left and right eye stimuli. However, because the polarity of the checkered pattern in the binocular interaction RF does not change along the frontoparallel axis (see Fig. 4), a group of simple cells that have the same RF phase disparity but different monocular RF phases can represent a binocular disparity independent of monocular stimulus phase. As shown in the following paper (Anzai et al. 1999b), binocular interaction RFs of complex cells are consistent with this scheme.

Do simple cells respond to random-dot stereograms?

The behavioral demonstration through the use of random-dot stereograms that binocular disparity alone is sufficient to mediate the perception of depth (Julesz 1960) illuminated a fundamental aspect of stereoscopic depth perception. It revealed that recognition of object form is not necessary to solve the stereo correspondence problem, which in turn implies that the correspondence problem may be solved at very early stages of binocular information processing.

Physiological evidence supporting this implication was provided by Poggio and his collaborators, who found that some cells in V1 and V2 of macaque monkeys respond to cyclopean stimuli embedded in random-dot stereograms (Poggio 1990; Poggio et al. 1985, 1988). Interestingly, these cells were predominantly complex cells, which suggests that mostly complex cells are responsible for solving the correspondence problem. It also suggests that the hierarchical notion of Hubel and Wiesel (1962) that simple cells feed into complex cells may not be correct because stimuli that complex cells respond to must, by that model, also be effective for simple cells. Are simple cells really not responsive to random-dot stereograms?

In fact, random-dot stereograms are not the only stimuli that are reportedly ineffective for simple cells. Hammond and MacKay (1975, 1977) claimed that complex cells but not simple cells respond to monocularly presented moving random-dot patterns (see also Morrone et al. 1982). If simple cells do not respond to monocular random-dot patterns, then it is not surprising that they do not respond to random-dot stereograms, either. However, Hammond and MacKay's results later were challenged by studies that demonstrated that simple cells do respond to random-dot patterns (Skottun et al. 1988; see also Casanova et al. 1995; Gulyas et al. 1987). There is also a theoretical framework that predicts that simple cells should respond to such patterns (Grzywacz and Yuille 1990, 1991), and simple cell behavior matches that predicted by the theory (Skottun et al. 1994). Furthermore the fact that RFs of simple cells can be mapped with the 2D white noise used in the previous paper (Anzai et al. 1999a; see also Jacobson et al. 1993; Reid et al. 1997) is compelling evidence that they do in fact respond to random-dot patterns.

Likewise, the fact that simple cells respond to dichoptic white noise and exhibit binocular interactions as shown in Fig. 4 strongly suggests that they should respond to random-dot stereograms. Random-dot stereograms can be considered a special case of white noise; the left and right eye patterns are both monocularly white but are interocularly correlated. Nonlinear interactions exhibited by simple cells are, in general, of low orders (perhaps the first few), and the strength of interactions declines progressively as the order of interaction increases (Mancini et al. 1990). Therefore responses of simple cells to cyclopean stimuli in random-dot stereograms are likely due to low-order interactions, mostly of the second order. Then the binocular interaction RF, which represents second-order binocular interactions, should indicate how a cell responds to random-dot stereograms. In other words, the disparity tuning obtained by integrating the binocular interaction RF along the frontoparallel axis (as illustrated in Fig. 12) should be very similar, if not identical, to the tuning obtained with random-dot stereograms.

Then why did Poggio's group find that the overwhelming majority (90%) of cells that respond to random-dot stereograms are complex cells (Poggio 1990; see also Cumming and Parker 1997; Gonzalez et al. 1993)? The binocular interactions exhibited by simple cells depend on monocular positions, as shown in Fig. 4. This indicates that simple cells will not respond well to stereograms with monocular phases that are not optimal for them. Because the monocular phase of dynamic random-dot stereograms changes constantly, it is to be expected that simple cells would respond in a sporadic rather than a sustained fashion to the stereograms. Therefore it may be difficult to associate their responses with the binocular disparity of the cyclopean stimulus. However, if monocular spatial phases of the stimulus are distributed evenly over the stimulus presentation period and measurements are repeated many times, responses of simple cells to random-dot stereograms should become apparent, and the binocular disparity tuning would emerge, as was the case for a minority of simple cells reported in the previous studies.

System structure for binocular simple cells

In this study, the system structure for most binocular simple cells has been identified as a linear binocular filter followed by a static nonlinearity. This result is concordant with the results of previous studies that conducted similar analyses on simple cells. Mancini (Mancini 1983; Mancini et al. 1990) measured responses of simple cells in the cat's striate cortex to temporal white noise generated according to binary m-sequences at various positions over the RF. He obtained a temporal profile of the RF as well as a profile for second-order temporal interaction and found that responses of simple cells can be well described by a model of a linear temporal filter followed by a static nonlinearity.

Emerson et al. (1989) successfully applied a more general structure comprising a cascade of a linear filter, a static nonlinearity, and another linear filter to describe responses of simple cells. Although one would not expect a linear filter to follow the output nonlinearity of simple cells, certain aspects of spike generation (e.g., a slow inactivation of sodium channels) (French and Korenberg 1989) and temporal binning of spikes in the analysis could introduce additional temporal filtering. Therefore some deviations from the proportionality condition of Eq. 1 seen in our data could be accounted for by the linear filter after the static nonlinearity. Nonetheless the fact that most simple cells satisfy Eq. 1 indicates that the second linear filter is at most a minor component of a model of simple cells, if it is necessary at all. Indeed, the model's performance does not change very much with or without it (Jacobson et al. 1993).

In contrast to these findings, Jacobson et al. (1993) found that the structure of a linear filter followed by a static nonlinearity can explain, on average, only ~60% of the responses of simple cells in the striate cortex of macaque monkeys. They measured responses of simple cells to white noise and obtained monocular RFs as well as monocular spatiotemporal (second-order) interaction RFs. They show in their paper some examples of interaction RFs that are elongated (i.e., inseparable) and therefore cannot be described by the product of monocular RFs. A minority of the simple cells examined in our current study also exhibit inseparable binocular interaction RFs at one or more cross-correlation delays. These results suggest that some simple cells are not consistent with a model of a linear filter followed by a static nonlinearity; their structure may consist of parallel streams of a linear filter followed by a static nonlinearity (Jacobson et al. 1993). However, it is not clear if these cells are real variations of simple cells or simple cell-like complex cells since complex cells exhibit inseparable binocular interaction RFs (Anzai et al. 1999b).

It should be pointed out that the system structure estimated in this study is by no means complete. It has been known that simple cells exhibit various other nonlinear properties. For example, responses of cells are normalized according to stimulus contrast, which is known as contrast gain control or contrast normalization (e.g., Albrecht and Geisler 1991; Bonds 1991; Geisler and Albrecht 1992; Heeger 1992a; Ohzawa et al. 1982, 1985). The gain control signal presumably is provided by a group of other cortical cells as a feedback signal. Because the noise stimuli used in this study have an average contrast that is relatively constant over time, the response gain of the cell is also expected to be relatively steady. Therefore the effect of the feedback signal can be considered constant, and the feedback circuitry can be separated effectively from the feedforward circuitry. In other words, the structure studied here only applies to the feedforward circuitry. There also are known inhibitory influences originating outside of the classical RF such as end and side inhibition (e.g., DeAngelis et al. 1994; DeValois et al. 1985; Hubel and Wiesel 1968; Kato et al. 1978; Maffei and Fiorentini 1976). In this study, the stimuli used were only slightly larger than the classical RF. Therefore inhibitory surrounds were not stimulated to any great extent. These nonlinear mechanisms need to be examined separately to build a more complete model of simple cells.

Static nonlinearity of simple cells

The results of this study show that the static nonlinearity of simple cells is a half-power function with an exponent of ~2. This suggests that the static nonlinearity of simple cells performs a nonlinear computation that is more than just thresholding. If the static nonlinearity were to serve as only a threshold, a half-rectification (an exponent of 1) would be sufficient. In that case, the output would be proportional to the input that exceeds a threshold, and therefore the underlying computation represented by the static nonlinearity would be essentially linear above the threshold. The fact that the exponent of a half-power function ranges approximately from 1.32 to 3.11 (much larger than 1) suggests that the expansive nonlinearity may be fundamental to the computations performed by simple cells.

It is interesting that the range of exponents is rather small. Any exponent other than 1 signifies some sort of nonlinear computation, but is there any reason why the exponent needs to be in this range? Obviously, the exponent should be significantly higher than 1 for simple cells to perform nonlinear computations without restricting the response dynamic range (exponent values <1 also represent nonlinearities, but they are of a compressive type). However, if exponents are too high, then a half-power function becomes similar to an over-rectification, i.e., a rectification (the exponent is 1) with a high threshold and high gain (slope). Therefore it may be approximated as linear for inputs above the threshold. Although the sensitivity to small changes in input would increase, a high gain also has an undesirable effect of reducing the input range that cells can encode because the output reaches the maximum quickly as input increases. Taken together, the range of exponents seen among simple cells may reflect a range suitable for nonlinear computations that can be implemented within the limitation imposed by the maximum firing rate.

Given that the exponent is somewhere near 2, what kind of computations can be achieved by the static nonlinearity? The exponent of 2, a squaring, is an attractive operation from a computational point of view. First of all, because simple cells are selective to spatial frequency and phase, their output, if squared, corresponds to something analogous to a phase-specific component of Fourier energy in a local region of the stimulus. This may be an ideal way of preserving local amplitude and phase information (Pollen and Ronner 1982). Second, the squared output of a linear filter is a building block for an energy model (Adelson and Bergen 1985; Ohzawa et al. 1990; Watson and Ahumada 1985). Third, the squaring enhances stimulus selectivity (Albrecht and Geisler 1991; J. L. Gardner, A. Anzai, R. D. Freeman, and I. Ohzawa, unpublished data); the tuning of cells for stimulus parameters such as orientation and spatial frequency becomes narrower, and the tuning band edges steeper, than they would be without squaring. Finally, the squaring makes second-order interactions multiplicative. This is an important consequence of having an exponent near 2 because multiplication is a fundamental nonlinear operation. The implication of this multiplicative nonlinearity for functional roles of simple cells in binocular information processing will be discussed later. It should be noted that the above arguments should not depend critically on the exponent being exactly 2.

The neural bases and/or biophysical mechanisms responsible for the expansive nonlinearity are not known. One possibility is that spike generation at the soma is a function of the square of the average membrane potential over time. Another possibility is that the expansive nonlinearity can be a form of a dynamic nonlinearity, such as contrast normalization (Heeger 1992a,b). In this scheme, the static part of the nonlinearity is considered a half-rectification, i.e., the exponent is 1. However, the response gain (the slope of the half-rectification) and threshold (the position of the rectification) change dynamically according to stimulus contrast (assuming that the contrast is relatively low to avoid response saturation) such that the time average of the dynamic nonlinearity mimics a static nonlinearity with an exponent near 2 (see Suarez and Koch 1989 for a similar model). Because the gain normalization signal is thought to come from a group of other cortical neurons (Heeger 1992a), a feedback circuitry is involved in mediating the multiplicative nonlinearity in this scheme. There is also a suggestion that recurrent cortical excitation could amplify input signals (e.g., Douglas et al. 1995; Somers et al. 1995).

Multiplicative operations can be performed at dendritic trees as well. Mel (1992, 1993) showed that a model pyramidal cell driven by strong N-methyl-D-aspartate synaptic currents and/or containing dendritic Ca2+ or Na+ channels responds more strongly to synaptic inputs that are spatially clustered than to those distributed diffusely. Therefore such a neuron could perform multiplications among the neighboring synaptic inputs and sum the results along the dendritic trees. This type of neuron is equivalent to what is known as a Sigma-pi neuron (Rumelhart et al. 1986), and its potential importance in nonlinear computations has been suggested (e.g., Durbin and Rumelhart 1989; Koch and Poggio 1992; Rumelhart et al. 1986). Whether or not real neurons, including simple cells in the striate cortex, are Sigma-pi neurons remains to be seen.

Is the binocular interaction exhibited by simple cells linear or multiplicative?

Ohzawa and Freeman (1986) measured the tuning for interocular phase disparity using drifting sinusoidal gratings to study the binocular interactions exhibited by simple cells in the cat's striate cortex. They found that most cells show tuning that is consistent with the predictions of linear binocular summation. On the other hand, Ferster (1981) measured the binocular disparity tuning of simple cells in the cat's striate cortex using moving bright bars and found that the disparity tuning can be predicted by taking a cross-correlation between left and right eye RF profiles. This result suggests that binocular interaction is multiplicative.

The results obtained in our current study offer a resolution to this apparent contradiction regarding the binocular interaction exhibited by simple cells. The system structure for binocular simple cells has been identified as a linear binocular filter followed by a half-power function with an exponent near 2. This can be formulated as
\[
\begin{array}{ll}
(W_L + W_R)^2 = W_L^2 + W_R^2 + 2 W_L W_R & \text{if } (W_L + W_R) > 0 \text{ and} \\
= 0 & \text{otherwise}
\end{array}
\tag{4}
\]
where $W_L$ and $W_R$ are the outputs of the left and right eye linear filters, respectively. This equation indicates that the initial convergence of binocular signals is linear (the term $W_L + W_R$ on the left-hand side of the equation), but a squaring nonlinearity makes the binocular interaction multiplicative (the term $2 W_L W_R$ on the right-hand side of the equation). Therefore the answer to the question of whether binocular interaction is linear or nonlinear depends on whether one looks at the input stage (before the static nonlinearity) or the output stage (after the static nonlinearity). Although linear filtering is an important function of simple cells, one should incorporate the static nonlinearity into the description of their functional roles because what is important is the spike activity (after the static nonlinearity) that is sent to the next processing stages. Thus it seems appropriate to consider the binocular interaction as nonlinear (multiplicative). This interpretation offers an important functional role for binocular simple cells, as discussed next.
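
A minimal numerical sketch of Eq. 4 follows, assuming an exponent of exactly 2 and hypothetical filter outputs; the function names are illustrative only. It shows that the rectified, squared sum equals the two monocular terms plus the multiplicative binocular term whenever the summed input is positive, and is zero otherwise.

```python
def half_square(w_left, w_right):
    """Linear binocular sum followed by half-wave rectification and squaring."""
    w = w_left + w_right
    return w * w if w > 0 else 0.0


def decompose(w_left, w_right):
    """Monocular and binocular terms of the expanded square.

    The decomposition describes the output only when w_left + w_right > 0.
    """
    return w_left ** 2, w_right ** 2, 2.0 * w_left * w_right


# Hypothetical filter outputs (assumed values, chosen to be exact in binary).
mono_left, mono_right, binocular = decompose(0.5, 0.25)
response = half_square(0.5, 0.25)
print(response, mono_left + mono_right + binocular)  # 0.5625 0.5625

# A net negative input is rectified away before squaring.
print(half_square(-0.5, 0.25))  # 0.0
```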

Functional roles of binocular simple cells

The fact that the binocular interaction exhibited by simple cells is multiplicative has an important implication as to their functional role in processing binocular information. It has been suggested that the stereo correspondence problem can be solved by taking the interocular cross-correlation of stereo images (Jenkin and Jepson 1988; Sanger 1988). For cortical cells to compute an interocular cross-correlation, they must be able to perform multiplication between left and right eye signals. Because simple cells exhibit multiplicative binocular interactions, they potentially could compute something analogous to an interocular cross-correlation to solve the stereo correspondence problem.

As formulated in Eq. 4, simple cells sum the outputs of left and right eye linear filters. The results then are rectified and squared. Because the output of a linear filter is a weighted sum of the stimulus over space (see footnote 3), i.e., a dot-product of the stimulus and the RF, the first line of Eq. 4 (for the positive output of the linear binocular filter) can be rewritten as
\[
\begin{aligned}
&\left[ \int L(X_L) S_L(X_L) \, dX_L + \int R(X_R) S_R(X_R) \, dX_R \right]^2 \\
&\quad = \left[ \int L(X_L) S_L(X_L) \, dX_L \right]^2 + \left[ \int R(X_R) S_R(X_R) \, dX_R \right]^2 + 2 \int L(X_L) S_L(X_L) \, dX_L \int R(X_R) S_R(X_R) \, dX_R
\end{aligned}
\tag{5}
\]
where L and R denote left and right eye RFs, respectively. The variables S and X represent stimulus and stimulus position, respectively, and their subscripts L (left) and R (right) denote the eye of origin. The first two terms of the right hand side of the equation represent responses due to monocular stimulation. Therefore these terms do not depend on binocular disparity. The third term depends on binocular disparity and is presumably relevant to the computations leading to binocular fusion and stereopsis. Using the relationship that the binocular interaction RF is proportional to the product of left and right eye RFs, the third term can be rearranged as
\[
2 \iint L(X_L) R(X_R) S_L(X_L) S_R(X_R) \, dX_L \, dX_R = \alpha \iint B(X_L, X_R) S_L(X_L) S_R(X_R) \, dX_L \, dX_R \tag{6}
\]
where B denotes a binocular interaction RF and $\alpha$ indicates a constant. By changing variables $X_L$ and $X_R$ to $X$ and $X + \delta$, respectively, Eq. 6 becomes
\[
\alpha \iint B(X, X + \delta) S_L(X) S_R(X + \delta) \, dX \, d\delta \tag{7}
\]
where $X$ indicates a monocular stimulus position and $\delta$ denotes a binocular disparity. Comparing Eq. 7 with an interocular cross-correlation, $\Phi_{LR}(\delta)$, of the stimulus given by
\[
\Phi_{LR}(\delta) = \int S_L(X) S_R(X + \delta) \, dX \tag{8}
\]
it is clear that Eq. 7 is an interocular cross-correlation that is weighted according to the binocular interaction RF, which then is integrated over all binocular disparities. Therefore the computation performed by binocular simple cells includes interocular cross-correlation due to the static nonlinearity.
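
The rearrangement from the binocular term of Eq. 5 to Eq. 7 can be checked numerically on a small grid. The sketch below is a discrete stand-in under stated assumptions: the RF and stimulus values are hypothetical, both RFs are sampled at the same number of positions, the binocular interaction RF is taken to be exactly the product L(X)R(X + δ) (so that the constant α equals 2), and time is ignored as in the text.

```python
def binocular_term(rf_left, rf_right, stim_left, stim_right):
    """Third term of Eq. 5 in discrete form: 2 * (L . S_L) * (R . S_R)."""
    dot_left = sum(l * s for l, s in zip(rf_left, stim_left))
    dot_right = sum(r * s for r, s in zip(rf_right, stim_right))
    return 2.0 * dot_left * dot_right


def weighted_cross_correlation(rf_left, rf_right, stim_left, stim_right,
                               alpha=2.0):
    """Discrete analogue of Eq. 7 with B(X, X + delta) = L(X) * R(X + delta)."""
    n = len(rf_left)
    total = 0.0
    for delta in range(-(n - 1), n):      # every disparity available on the grid
        for x in range(n):                # every left eye position
            x_right = x + delta
            if 0 <= x_right < n:          # stay inside the right eye RF
                b = rf_left[x] * rf_right[x_right]
                total += b * stim_left[x] * stim_right[x_right]
    return alpha * total


# Hypothetical RF profiles and stimuli (assumed values, not measured data).
rf_left = [0.5, 1.0, -0.5, -1.0]
rf_right = [-0.5, 1.0, 0.5, -1.0]
stim_left = [1.0, -1.0, 1.0, 1.0]
stim_right = [-1.0, 1.0, 1.0, -1.0]

direct = binocular_term(rf_left, rf_right, stim_left, stim_right)
rearranged = weighted_cross_correlation(rf_left, rf_right, stim_left, stim_right)
print(direct, rearranged)  # -12.0 -12.0
```

The inner loop over `x` at a fixed `delta` is the RF-weighted counterpart of the cross-correlation in Eq. 8, and the outer loop performs the integration over disparities.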

This interpretation requires the following comments. First, the output of simple cells contains monocular terms as indicated in Eq. 5 while the interocular cross-correlation defined in Eq. 8 does not. Therefore strictly speaking, simple cells do not compute interocular cross-correlation. However, because the monocular terms in Eq. 5 are independent of binocular disparity, responses to cyclopean stimuli entirely depend on the binocular term in Eq. 5 (i.e., Eq. 7). In this sense, simple cells can be considered to be computing something analogous to the cross-correlation of left and right eye images that are band-pass filtered.

It is also important to realize that the interocular cross-correlation performed by simple cells is local, i.e., the computation is restricted within its RF. Therefore they do not provide the complete solution to the stereo correspondence problem (Cumming and Parker 1997). In fact, they signal false matches as well. However, false matches can be rejected easily by combining the local solutions at various spatial locations, scales/frequencies, and orientations (Fleet et al. 1996).

Finally, the computation of interocular cross-correlation depends on the monocular spatial phase of a stimulus because simple cells are sensitive to monocular spatial phase. However, complex cells, which are not phase-sensitive, can provide phase independent interocular cross-correlation, as demonstrated in the following paper (Anzai et al. 1999b).


    APPENDIX

Derivation of Eq. 1

Suppose that a binocular simple cell has the system structure of a linear binocular filter followed by a static nonlinearity, as depicted in Fig. 3. The output of the linear filter (W(t) in Fig. 3) is described by a sum of convolution integrals at various positions in space
\[
W(t) = \sum_{i}^{n_L} \int L_i(\tau) S_i(t - \tau) \, d\tau + \sum_{j}^{n_R} \int R_j(\tau) S_{n_L + j}(t - \tau) \, d\tau \tag{A1}
\]
where L and R are impulse response functions for the left and right eye linear filters, respectively, and S denotes the input stimulus generated according to a binary m-sequence. The subscripts, $i$ and $j$, indicate stimulus positions in the left and right eyes, and $n_L$ and $n_R$ denote the total number of stimulus positions in the left and right eyes, respectively. The static nonlinearity (N in Fig. 3) can be represented, with any desired accuracy in a mean-squared error sense, by a polynomial of an arbitrary degree, provided that it is continuous within a given interval (Billings and Fakhouri 1978). Therefore the output of the neuron [Y(t) in Fig. 3] can be described as
\[
Y(t) = \sum_{d} \gamma_d \{ W(t) \}^d \tag{A2}
\]
where $\gamma_d$ ($d = 0, 1, 2, \ldots$) is a constant. Substituting Eq. A1 into Eq. A2 yields
\[
\begin{aligned}
Y(t) = \gamma_0 &+ \gamma_1 \left[ \sum_{i}^{n_L} \int L_i(\tau) S_i(t - \tau) \, d\tau + \sum_{j}^{n_R} \int R_j(\tau) S_{n_L + j}(t - \tau) \, d\tau \right] \\
&+ \gamma_2 \left[ \sum_{i}^{n_L} \int L_i(\tau) S_i(t - \tau) \, d\tau + \sum_{j}^{n_R} \int R_j(\tau) S_{n_L + j}(t - \tau) \, d\tau \right]^2 \\
&+ \gamma_3 \left[ \sum_{i}^{n_L} \int L_i(\tau) S_i(t - \tau) \, d\tau + \sum_{j}^{n_R} \int R_j(\tau) S_{n_L + j}(t - \tau) \, d\tau \right]^3 + \ldots
\end{aligned}
\tag{A3}
\]
The left eye RF of the neuron is obtained by cross-correlating the output of the neuron and the stimulus for the left eye
\[
\hat{L}_i(\tau) = E[ Y(t) S_i(t - \tau) ] \tag{A4}
\]
where $E[\,\cdot\,]$ indicates the expected value of the quantity inside the brackets, and $\hat{L}_i$ is the measured left eye RF.

A binary m-sequence stimulus with a power density $P$ and a stimulus update period $\Delta$ has the following $r$th-order correlation property
\[
E\left[ \prod_{k}^{q} \{ S_{h_k}(t - \tau_k) \}^{m_k} \right] = \left( \frac{P}{\Delta} \right)^{r/2} \quad \text{where } r = \sum_{k}^{q} m_k \tag{A5}
\]
if $m_k$ is even for all possible values of $k$, and
\[
E\left[ \prod_{k}^{q} \{ S_{h_k}(t - \tau_k) \}^{m_k} \right] \approx 0 \tag{A6}
\]
if $m_k$ is odd for at least one value of $k$.
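
The low-order cases of this property can be illustrated with a short m-sequence. The sketch below is an assumption-laden toy, not the stimulus used in the experiments: it generates a 31-sample sequence from the primitive polynomial x^5 + x^2 + 1 (the sequence length, polynomial, and seed are arbitrary choices), maps the bits to +1/-1, and evaluates first- and second-order correlations over one full period. With unit-amplitude samples, the even case (a sample multiplied by itself) averages to 1, whereas the odd cases (the mean, or a product of samples at two different lags) average to -1/31, which approaches 0 for longer sequences.

```python
def m_sequence_31():
    """One period (31 samples) of a binary m-sequence with values +1/-1.

    Uses the recurrence a[k+5] = a[k+2] XOR a[k], which corresponds to the
    primitive polynomial x^5 + x^2 + 1 over GF(2). The seed is arbitrary
    as long as it is not all zeros.
    """
    bits = [1, 0, 0, 0, 0]
    while len(bits) < 31:
        bits.append(bits[-3] ^ bits[-5])
    return [1 - 2 * b for b in bits]      # bit 0 -> +1, bit 1 -> -1


def circular_correlation(seq, lag):
    """Sum over one period of seq[k] * seq[k + lag], with wrap-around."""
    n = len(seq)
    return sum(seq[k] * seq[(k + lag) % n] for k in range(n))


seq = m_sequence_31()
zero_lag = circular_correlation(seq, 0)
other_lags = {circular_correlation(seq, lag) for lag in range(1, len(seq))}

print(sum(seq))     # first order (odd):             -1
print(zero_lag)     # second order, same sample:     31
print(other_lags)   # second order, different lags:  {-1}
```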

Substituting Eq. A3 into Eq. A4, and using the property described above, the left eye RF becomes
\[
\hat{L}_i(\tau) = \gamma_1 \left( \frac{P}{\Delta} \right) L_i(\tau) + 3 \gamma_3 \left( \frac{P}{\Delta} \right)^2 L_i(\tau) \left[ \sum_{f}^{n_L} \int L_f^2(\tau_a) \, d\tau_a + \sum_{g}^{n_R} \int R_g^2(\tau_b) \, d\tau_b \right] - 2 \gamma_3 \left( \frac{P}{\Delta} \right)^2 L_i^3(\tau) + \ldots \tag{A7}
\]
Assuming that no single $L_i(\tau)$ dominates, so that
\[
L_i(\tau) \left[ \sum_{f}^{n_L} \int L_f^2(\tau_a) \, d\tau_a + \sum_{g}^{n_R} \int R_g^2(\tau_b) \, d\tau_b \right] \gg L_i^3(\tau) \tag{A8}
\]
for any $i$ and $\tau$, and that similar relationships hold for higher-order (>3) polynomial terms in Eq. A7, Eq. A7 becomes
\[
\hat{L}_i(\tau) \approx \beta_m L_i(\tau) \tag{A9}
\]
where $\beta_m$ is a constant defined by
\[
\beta_m = \gamma_1 \left( \frac{P}{\Delta} \right) + 3 \gamma_3 \left( \frac{P}{\Delta} \right)^2 \left[ \sum_{f}^{n_L} \int L_f^2(\tau_p) \, d\tau_p + \sum_{g}^{n_R} \int R_g^2(\tau_q) \, d\tau_q \right] + \ldots \tag{A10}
\]
Likewise, the right eye RF $\hat{R}_j$ is described as
\[
\hat{R}_j(\tau) \approx \beta_m R_j(\tau) \tag{A11}
\]
Equations A9 and A11 state that the measured monocular RFs are approximately proportional to the impulse response functions of the monocular filters.

Similarly, a binocular interaction RF $\hat{B}_{i,j}$ is obtained by taking a cross-correlation among the output of the neuron and the left and right eye stimuli
\[
\hat{B}_{i,j}(\tau) = E[ Y(t) S_i(t - \tau) S_{n_L + j}(t - \tau) ] \tag{A12}
\]
Substituting Eq. A3 into Eq. A12 and using the property described in Eqs. A5 and A6 and assuming Eq. A8 and similar relationships for higher order (>3) polynomial terms, Eq. A12 becomes
\[
\begin{aligned}
\hat{B}_{i,j}(\tau) &= 2 \gamma_2 \left( \frac{P}{\Delta} \right)^2 L_i(\tau) R_j(\tau) \\
&\quad + 4 \gamma_4 \left( \frac{P}{\Delta} \right)^3 L_i(\tau) R_j(\tau) \left\{ 3 \left[ \sum_{f}^{n_L} \int L_f^2(\tau_a) \, d\tau_a + \sum_{g}^{n_R} \int R_g^2(\tau_b) \, d\tau_b \right] - 2 \{ L_i^2(\tau) + R_j^2(\tau) \} \right\} + \ldots \\
&\approx \beta_b L_i(\tau) R_j(\tau)
\end{aligned}
\tag{A13}
\]
where $\beta_b$ is a constant defined by
\[
\beta_b = 2 \gamma_2 \left( \frac{P}{\Delta} \right)^2 + 12 \gamma_4 \left( \frac{P}{\Delta} \right)^3 \left[ \sum_{f}^{n_L} \int L_f^2(\tau_a) \, d\tau_a + \sum_{g}^{n_R} \int R_g^2(\tau_b) \, d\tau_b \right] + \ldots \tag{A14}
\]
Equation A13 states that the binocular interaction RF is approximately proportional to the product of the left and right eye impulse response functions. From Eqs. A9, A11, and A13,
\[
\hat{B}_{i,j}(\tau) \approx \alpha \hat{L}_i(\tau) \hat{R}_j(\tau) \tag{A15}
\]
where $\alpha = \beta_b / \beta_m^2$. In other words, the binocular interaction RF is approximately proportional to the product of the left and right eye RFs. Note that for a Gaussian white noise stimulus Eqs. A9, A11, A13, and A15 are exact (e.g., Marmarelis and Marmarelis 1978), not an approximation.
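
The logic of the derivation can be explored with a toy model cell. The sketch below is not the procedure used in the experiments: it drops the time dimension, replaces the m-sequence with an exhaustive enumeration of all +1/-1 stimulus patterns (so that the expected values are exact rather than estimated), assumes a half-squaring static nonlinearity, and uses hypothetical filter weights. Under these assumptions the measured binocular interaction RF equals the product of the underlying left and right eye weights, and each measured monocular RF value has the same sign as the corresponding weight.

```python
from itertools import product

# Hypothetical spatial weights of the left and right eye linear filters.
L_WEIGHTS = [0.2, 1.0, -0.6]
R_WEIGHTS = [-0.4, 0.9, 0.3]


def half_square(w):
    """Assumed static nonlinearity: half-wave rectification, then squaring."""
    return w * w if w > 0 else 0.0


def measure_rfs(l_weights, r_weights):
    """Cross-correlate the model output with every binary stimulus pattern.

    Returns first-order estimates for each eye (as in Eq. A4) and the
    second-order interocular estimate (as in Eq. A12).
    """
    n_l, n_r = len(l_weights), len(r_weights)
    patterns = list(product((-1, 1), repeat=n_l + n_r))
    n = len(patterns)
    l_hat = [0.0] * n_l
    r_hat = [0.0] * n_r
    b_hat = [[0.0] * n_r for _ in range(n_l)]
    for s in patterns:
        w = (sum(l_weights[i] * s[i] for i in range(n_l))
             + sum(r_weights[j] * s[n_l + j] for j in range(n_r)))
        y = half_square(w)
        for i in range(n_l):
            l_hat[i] += y * s[i] / n
            for j in range(n_r):
                b_hat[i][j] += y * s[i] * s[n_l + j] / n
        for j in range(n_r):
            r_hat[j] += y * s[n_l + j] / n
    return l_hat, r_hat, b_hat


l_hat, r_hat, b_hat = measure_rfs(L_WEIGHTS, R_WEIGHTS)

# Ratio of the measured interaction RF to the product of measured monocular
# RFs; Eq. A15 predicts an approximately constant value (alpha).
alphas = [b_hat[i][j] / (l_hat[i] * r_hat[j])
          for i in range(len(L_WEIGHTS)) for j in range(len(R_WEIGHTS))]
print(min(alphas), max(alphas))
```

The spread between the printed minimum and maximum indicates how closely Eq. A15 holds for these particular weights; with a binary stimulus the relationship is expected to be approximate, as stated in the text.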


    ACKNOWLEDGMENTS

We are grateful to Dr. Erich Sutter for advice on binary m-sequences and their applications to receptive field mapping and to Dr. Stanley Klein for advice on nonlinear systems analysis and help with APPENDIX. We also are indebted to Dr. E. J. Chichilnisky for kindness in sharing with us his unpublished results regarding the shape of the static nonlinearity in retinal ganglion cells. We also thank Drs. Russel DeValois and Edwin Lewis for discussions and helpful comments and suggestions.

This work was supported by research and CORE Grants EY-01175 and EY-03176 from the National Eye Institute.


    FOOTNOTES

Address for reprint requests: R. D. Freeman, 360 Minor Hall, School of Optometry, University of California, Berkeley, CA 94720-2020.

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1 For a cell with a structure illustrated in Fig. 3, measured RFs, L and R, are actually equivalent to impulse response functions of the left and right eye linear filters only up to some unknown scaling factor (see Eqs. A9 and A11 in the APPENDIX). Therefore the input to the static nonlinearity W(t), when estimated using L and R, also is scaled by the same factor. For this reason, W will be presented as a normalized quantity. However, the shape of the static nonlinearity depends neither on the scaling factor nor on the normalization.

2 For RFs that are modeled as a Gabor function, the distance between peaks in the left and right eye RFs is actually slightly smaller than the RF phase disparity; as the RF phase disparity increases, the interocular peak distance increases slightly less. However, the difference is generally insignificant.

3 Although the output of a linear filter is a convolution over time and a weighted sum over space between a stimulus and RF, the time domain is ignored here for simplicity.

Received 2 June 1998; accepted in final form 2 April 1999.


    REFERENCES

0022-3077/99 $5.00 Copyright © 1999 The American Physiological Society