Contextual Influence on Orientation Discrimination of Humans and Responses of Neurons in V1 of Alert Monkeys

Wu Li,1 Peter Thier,2 and Christian Wehrhahn1

 1Max-Planck-Institut für Biologische Kybernetik; and 2Sektion für Visuelle Sensomotorik, Neurologische Universitätsklinik, 72076 Tübingen, Germany


    ABSTRACT

Li, Wu, Peter Thier, and Christian Wehrhahn. Contextual Influence on Orientation Discrimination of Humans and Responses of Neurons in V1 of Alert Monkeys. J. Neurophysiol. 83: 941-954, 2000. We studied the effects of various patterns as contextual stimuli on human orientation discrimination and on responses of neurons in V1 of alert monkeys. When a target line is presented along with various contextual stimuli (masks), human orientation discrimination is impaired. For most V1 neurons, responses elicited by a line in the receptive field (RF) center are suppressed by these contextual patterns. Orientation discrimination thresholds of human observers are elevated slightly when the target line is surrounded by orthogonal lines. For randomly oriented lines, thresholds are elevated further and even more so for lines parallel to the target. Correspondingly, responses of most V1 neurons to a line are suppressed. Although contextual lines inhibit the amplitude of orientation tuning functions of most V1 neurons, they do not systematically alter the tuning width. Elevation of human orientation discrimination thresholds decreases with increasing curvature of masking lines, as does the inhibition of V1 neuronal responses. A mask made of straight lines yields the strongest interference with human orientation discrimination and produces the strongest suppression of neuronal responses. Elevation of human orientation discrimination thresholds is highest when a mask covers only the immediate vicinity of the target line. Increasing the masking area results in less interference. In contrast, suppression of neuronal responses in V1 increases with increasing mask size. Our data imply that contextual interference observed in human orientation discrimination is in part directly related to contextual inhibition of neuronal activity in V1. However, the finding that interference with orientation discrimination is weaker for larger masks suggests a figure-ground segregation process that is not located in V1.


    INTRODUCTION

In the primary visual cortex (V1) responses of neurons to stimuli inside the classical receptive field (CRF) can be modulated by contextual stimuli outside (Allman et al. 1985; Blakemore and Tobin 1972; DeAngelis et al. 1994; Gilbert et al. 1996; Knierim and Van Essen 1992; Lamme et al. 1998; Levitt and Lund 1997; Li and Li 1994; Maffei and Fiorentini 1976; Nelson and Frost 1978; Nothdurft et al. 1999; Sengpiel et al. 1997). These observations suggest that even the earliest stage of visual cortical processing is involved in complex visual perception. Cells do not merely extract through their tiny and discrete RFs simple visual attributes such as orientation, spatial size, disparity, and color. Modulation of cell responses by contextual stimuli may have its neural substrate in long-range horizontal connections between cells in the same area as well as in feedback connections from higher cortical areas (for recent review, see Lamme et al. 1998). Contextual modulation in V1 has been interpreted as the neural substrate of many psychophysical phenomena such as the tilt illusion (Gilbert and Wiesel 1990), perceptual pop-out (Kastner et al. 1997; Knierim and Van Essen 1992; Nothdurft et al. 1999), detection of focal orientation discontinuity (Sillito et al. 1995), simultaneous brightness contrast (Rossi et al. 1996), and figure-ground segregation (Lamme 1995; Zipser et al. 1996).

For a century, the approach of masking has been used widely to probe visual functions (Breitmeyer 1984). The visibility and contrast sensitivity for oriented stimuli under masking is an extensively studied topic (Foster and Westland 1995; Gilinsky 1968; Macknik and Livingstone 1998; Polat and Sagi 1993; Sekuler 1965; Virsu and Taskinen 1975; Wolfson and Landy 1999), and the effects of relative timing of target and mask presentation on target detection have been well documented (for recent work, see Macknik and Livingstone 1998). Masking procedures also were applied in human hyperacuity studies (Waugh et al. 1993; Wehrhahn et al. 1996; Westheimer and Li 1996). In one of the latter studies (Wehrhahn et al. 1996), the effects of masking on human orientation discrimination were investigated. We found that orientation discrimination of a line in the fovea of human observers was impaired to varying extents when the target line was followed immediately by masks of various configurations. Our data suggest that the influence of masks on orientation discrimination could result from the interactions within the orientation domain between the target line and the masking elements. We proposed that the underlying neural substrate might be in the interactions between cells in V1 because V1 contains many orientation-selective units the responses of which can be inhibited markedly by contextual modulation mechanisms. Many electrophysiological studies showed that for most neurons in the primary visual cortex of cats and monkeys, stimuli presented outside the CRF inhibit responses elicited by stimuli inside (Blakemore and Tobin 1972; DeAngelis et al. 1994; Kastner et al. 1997; Knierim and Van Essen 1992; Li and Li 1994; Nothdurft et al. 1999; Sengpiel et al. 1997). A study using optical recording from monkey primary visual cortex also revealed that large surround stimuli usually suppress the cortical responses to a center stimulus (Grinvald et al. 1994). 
All these findings make V1 a good candidate for the neural substrate underlying the masking effects observed in our previous psychophysical study.

In the present study, we compared the effects of various contextual stimuli on orientation discrimination of humans with the effects of the same contextual patterns on neuronal activity in V1 of alert monkeys at comparable parafoveal locations. Our data show that although there are some discrepancies, the contextual stimuli have many similar effects on both the performance of human orientation discrimination and the responses of V1 neurons.

Preliminary accounts of our observations have been given at the 28th Annual Meeting of the Society for Neuroscience (Li et al. 1998; Wehrhahn and Li 1998).


    METHODS

The hardware and software used to generate the stimuli were identical for the psychophysical and physiological experiments. The images from two CRT displays were superimposed through a pellicle mirror (National Photocolor, ST-SQ-NP40) that provided 40% transmission/reflection of the light. The viewing distance was set to 60 cm. Under computer (Pentium MMX) control, the fixation spot and all stimuli were generated by use of bright line segments (~0.5 mm wide) on a vector scope (HP1345A, P31 phosphor) at a frame rate of 100 Hz. A homogeneous luminance background could be generated by an image synthesizer (Innisfree, Picasso) on the other CRT (Tektronix 608, green phosphor), which was refreshed at 200 Hz.

Human psychophysical experiments

Unless otherwise specified, the following stimulus parameters were used throughout. A 0.1° fixation spot was shown 500 ms before stimulus onset and lasted until 500 ms after stimulus offset. The stimuli, the center of which was located at 2.5° left and 2.5° down with respect to the fixation spot, consisted of a center target line and a surround square mask of various configurations on a dark (0.001 cd/m2) background (see Figs. 4 and 5, insets, for examples). In this paper, the terms mask and contextual stimuli are used interchangeably, as are target line and test line. The target line length was 0.8°. The mask was 5 × 5° in size and composed of 7 × 7 masking elements, each 0.7° long. In most experiments, a 2 × 2° blank window was put in the center of the mask to exclude the masking pattern in the immediate vicinity of the target. Except for two experiments (Figs. 7 and 8) in which masking lines were 16 times dimmer than the target line, in all the other experiments the target line and masking elements were of the same brightness and the average luminance of the mask was ~1.5 cd/m2. In one experiment (Fig. 6), a homogeneous bright (1.5 cd/m2) surround was used as mask instead of textured patterns. Target and mask were presented simultaneously for only 40 ms to prevent saccades toward the stimuli. Experiments were conducted binocularly in a dark room on four (2 male and 2 female) subjects. Two of them were naïve as to the purpose of the experiment. The observers' task was to determine, by pressing the left or right mouse button, whether the target line appeared tilted to the left or right with respect to the vertical. No error feedback was given. The method of constant stimuli was used to determine orientation discrimination thresholds. Data shown in this paper were collected after 1-2 wk of training, when stable thresholds had been reached at the parafoveal location tested. Each data point was based on ≥280 responses collected over several days. The various conditions of an experiment were presented in an interleaved fashion.
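As a rough illustration of how the stated stimulus dimensions relate to the 60-cm viewing distance, the following sketch converts between physical size on the display and visual angle. The function names are hypothetical and the small-object geometry (a flat stimulus viewed head-on) is an assumption; only the 60-cm distance, the ~0.5-mm line width, and the 0.8° target length come from the text.

```python
import math

VIEWING_DISTANCE_MM = 600.0  # 60 cm, as stated in the text


def visual_angle_deg(size_mm: float, distance_mm: float) -> float:
    """Visual angle (degrees) subtended by an object of size_mm seen head-on."""
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm)))


def size_for_angle_mm(angle_deg: float, distance_mm: float) -> float:
    """Physical size (mm) that subtends angle_deg at distance_mm."""
    return 2.0 * distance_mm * math.tan(math.radians(angle_deg) / 2.0)


# Width of a ~0.5 mm line segment, in degrees and in minutes of arc.
line_width_deg = visual_angle_deg(0.5, VIEWING_DISTANCE_MM)
line_width_arcmin = 60.0 * line_width_deg

# Physical length on the screen of the 0.8 deg target line.
target_length_mm = size_for_angle_mm(0.8, VIEWING_DISTANCE_MM)

print(f"line width: {line_width_deg:.4f} deg ({line_width_arcmin:.1f} arcmin)")
print(f"0.8 deg target line: {target_length_mm:.2f} mm on the display")
```

Under these assumptions the line width works out to roughly 3 min of arc and the target line to a little over 8 mm on the screen.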

As our interest focused on the spatial interaction between the target line and the contextual patterns, not on the effects of relative timing of target and mask presentation, we used a simultaneous masking procedure to minimize the so-called backward masking (or metacontrast) effect, which greatly impairs the visibility of the target (Breitmeyer 1984; Macknik and Livingstone 1998; Ramachandran and Cobb 1995). In all experiments, the stimuli were suprathreshold to ensure clear identification of the target line.

Physiological experiments

ANIMALS AND PREPARATION. Two male juvenile rhesus monkeys (Macaca mulatta) were used in physiological recordings. The monkeys first were trained to detect the dimming of a small spot of light by touching and releasing a lever (Wurtz 1969). Subsequently, a search coil, a head post, and a stainless steel chamber were implanted under intubation anesthesia (for details, see Thier et al. 1988). Inside the chamber, the skull was removed to enable access to V1. After recovery from the surgical operations, the monkeys were trained further with decreasing fixation windows before recordings. All clinical and experimental procedures complied with the National Institutes of Health guidelines and the policy of The American Physiological Society and were approved by the regional animal care committee.

APPARATUS AND SETUP. The setup was the same as that used in human psychophysical experiments (refer to the foregoing text for details). The monkeys, seated in a primate chair in the dark room with head fixed, looked into the pellicle mirror. Under computer control, grating patches and homogeneous luminance backgrounds could be generated by the Innisfree image synthesizer on the Tektronix CRT. The fixation spot and stimuli made of bright line segments could be displayed on the HP scope.

RECORDINGS. Single units or multiunits were recorded with tungsten-in-glass microelectrodes of 1-3 MΩ impedance. The electrodes were driven into the cortex using a David Kopf Micropositioner (model 650) mounted on the recording chamber. Spikes were detected with a two-level window discriminator (List L/M-S40) or alternatively with a multi-spike detector (Alpha Omega, Israel) and then fed into a computer (HP Vectra Pentium Pro). Eye-position signals, which were recorded at a sampling rate of 600 Hz by a custom-made computerized search coil system (Bechert and Koenig 1996), also were sampled by this computer for fixation control. The fixation window size was between 0.4 × 0.4° and 1.0 × 1.0° but was 0.5 × 0.5° for most recordings.

MAPPING OF THE CRFS. Once neuronal activity was isolated, an orientation tuning curve was measured by collecting cell responses to a large (9.5° diam) patch of drifting square-wave gratings the orientation of which was varied in steps of 10-20°. The optimal orientation was determined from the peak of the tuning curve (i.e., the orientation that produced the strongest responses) for those recording sites having a distinct tuning peak, which was the case for most sites. For a small portion of recording sites lacking a clear tuning peak (in particular those broadly tuned recording sites), the orientation corresponding to the middle point of the tuning curve was taken as the optimum. Subsequently, by listening to the speaker, the other grating parameters (spatial frequency and moving velocity) were adjusted to the optimal values for the cell. Then the CRF was mapped as the minimal responsive rectangular area by moving a small grating patch along the axes parallel and orthogonal to the optimal orientation of the cell with all grating parameters set to the optimal values for maximum activation of neuronal responses (for details, see Li and Li 1994). The four boundaries of the CRF were determined from the data points on the response profiles where the cell responses were not significantly different from the spontaneous activity. The RF center also was determined and all subsequent stimuli were centered on the RF.

STIMULI. Grating patches were only used to map the RFs and measure the tuning properties of cells. All the other stimuli (the test line and contextual stimuli) were identical to those used in human psychophysical experiments except that in all experiments, including those two in Figs. 7 and 8, the composing elements of contextual stimuli and the test line were of the same brightness. All stimuli were centered on the RF (see Figs. 4 and 5, insets, for examples). The test line and the contextual stimuli were presented simultaneously for 500 ms, beginning after the monkeys had held their fixation for 500 ms within the fixation window. Test line length was the same as RF length. In most experiments, the contextual stimuli were confined to a 5 × 5° area around the test line and composed of 7 × 7 masking elements, 0.7° long each. A 2 × 2° blank window was used in the mask center so that all the contextual stimuli were outside the CRF. For each stimulus condition, a total of 5-20 trials were tested, and for each experiment different stimulus conditions were randomized.


    RESULTS

Human psychophysical experiments were conducted on four observers in a parafoveal area the center of which was 2.5° left and 2.5° down relative to the fovea, corresponding to an eccentricity of 3.5°. In physiological experiments, a total of 169 single units or multiunits were recorded from V1 of two monkeys. We did not distinguish single cells from multiunits, and most recordings were from multiunits (clusters of 2-4 units). Throughout the text the term recording site is used to indicate a single unit as well as a cluster of multiunits, and the term receptive field (RF) represents the RF of a single unit or the aggregate RF of multiunits. All RFs had sizes between 0.5 and 2.0° (1.0 ± 0.3°, mean ± SD) and eccentricities between 0.9 and 4.2° (2.8 ± 0.6°). Comparisons were made, under similar contextual masking conditions, between human psychophysical performance and population responses of all recording sites.
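The stated eccentricity follows from the horizontal and vertical offsets of the stimulus center by the Pythagorean relation. A minimal sketch, assuming offsets are expressed in degrees relative to fixation (the function name and sign convention are hypothetical):

```python
import math


def eccentricity_deg(x_deg: float, y_deg: float) -> float:
    """Radial distance from fixation for a point offset by (x_deg, y_deg)."""
    return math.hypot(x_deg, y_deg)


# 2.5 deg left and 2.5 deg down; negative signs are only a convention here.
ecc = eccentricity_deg(-2.5, -2.5)
print(f"{ecc:.2f} deg")  # prints "3.54 deg", i.e. about 3.5 deg
```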

Effect of target line length on human orientation discrimination

In human foveal vision, the threshold for orientation discrimination of a line decreases with increasing target line length (Wehrhahn et al. 1996; Westheimer 1981). To select an appropriate target line length to explore the influence of contextual stimuli on orientation discrimination in the parafovea, we first determined the threshold for orientation discrimination as a function of target line length. Data from four human observers were averaged. As shown in Fig. 1, near minimal threshold (1.15 ± 0.16 deg of angle, mean ± SE, n = 4) was reached at a line length of ~0.8° (arrow in Fig. 1). Therefore for all subsequent psychophysical experiments a line of 0.8° in length was used as the target stimulus.



Fig. 1. Threshold for orientation discrimination of a line around the vertical as a function of line length. Center of the target line was located at 2.5° left and 2.5° down with respect to the fixation spot, corresponding to an eccentricity of 3.5°. Exposure duration was 40 ms. Data were averaged for 4 human subjects. Error bars represent ±1 SE. Note that threshold drops to near minimal value at a line length of ~0.8° (arrow).

Contextual mask produces general suppression of neuronal responses in V1

We found in an earlier psychophysical study (Wehrhahn et al. 1996) that human orientation discrimination of a line in the fovea was impaired if the line was followed immediately by a surround mask of various configurations. If the masking effects observed in human orientation discrimination were related to interactions between cells in V1, we would expect to see inhibition of cell responses by contextual stimuli (masks).

By presenting a test line in the RF center and varying the orientation of the line, we measured the orientation tuning curves from 68 recording sites in V1 with and without the presence of a surround mask composed of randomly oriented lines (see Fig. 2A, insets). For most recording sites, the responses to the center line were suppressed at all orientations when the surround mask was present, resulting in a general inhibition of the tuning curves, no matter whether the cells were narrowly (Fig. 2A) or broadly (Fig. 2B) tuned. For only a small portion of the tested recording sites did the surround of random lines have no significant effect on the tuning curves (Fig. 2, C and D). In Fig. 2, A and B, the peak of the tuning curve measured under the masking condition was normalized to the peak of the tuning curve obtained without masking, and the resultant curve is shown in the same plot for comparison.



Fig. 2. Examples of orientation tuning curves of cells in V1 measured with and without the presence of a surround mask (see insets). All the recording sites shown in this figure have receptive field (RF) sizes between 0.5 and 1.0°. A test line of the same length as RF was centered on RF (denoted by gray areas in insets). Mask was 5 × 5° in size and composed of 7 × 7 evenly distributed short lines, 0.7° long each. Composing lines in the mask center were occluded by a 2 × 2° blank window to ensure the mask was outside the RF. Orientation of each line in the surround was set randomly from trial to trial. Test line and the mask were presented simultaneously for 500 ms, and the mean responses and SE were calculated for this period. A and B: examples of recording sites showing general suppression in tuning curves with the presence of a surround mask. Most V1 neurons show this effect. Peak of the tuning curve measured under the masking condition was normalized to the peak of the tuning curve obtained without masking, and the resultant curve was inserted in the same plot for comparison. C and D: examples of recording sites the tuning curves of which were not significantly affected by the surround mask.

From the orientation tuning curves, the optimal responses and the tuning widths can be determined quantitatively and compared for individual recording sites under the conditions with and without a surround mask. Figure 3A is a comparison of the mean responses at optimal orientation with and without the mask, and Fig. 3B compares the tuning widths measured as the width at half-amplitude. In Fig. 3A, the optimal orientation used to measure the optimal response was determined from the peak of the tuning curve for those recording sites having a clear tuning peak, such as the examples shown in Fig. 2, A, C, and D, which was the case for most sites. For those units lacking such a clear tuning peak, the orientation corresponding to the middle point of the tuning curve was taken as the optimum, such as the example shown in Fig. 2B. For each recording site, the responses at the same orientation for no-mask condition and masking condition were compared. In Fig. 3B, in measuring the tuning width, the amplitude of the tuning curve was counted from the minimum response to the peak response on the curve, and the tuning width was taken from the tuning curve as the difference between the two orientations giving half-amplitude responses.
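The half-amplitude width measurement described above can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the crossing points are found by linear interpolation between sampled orientations (the text says only that the width was read from the tuning curve), the curve is assumed to be sampled on both flanks of a single peak without wrap-around, and the response values are invented for the example.

```python
def half_amplitude_width(orientations, responses):
    """Width of a tuning curve at half-amplitude, in the units of `orientations`.

    Amplitude is counted from the minimum response to the peak response.
    The width is the distance between the two orientations, one on each
    flank of the peak, where the linearly interpolated curve crosses the
    half-amplitude level. Returns None if no crossing exists on a flank.
    """
    r_min, r_max = min(responses), max(responses)
    if r_max == r_min:
        return None  # flat curve: width is undefined
    half = r_min + 0.5 * (r_max - r_min)
    peak = responses.index(r_max)

    def crossing(step):
        i = peak
        while 0 <= i + step < len(responses):
            j = i + step
            if responses[j] <= half:
                # Sample i is above the half level, sample j is at or below it.
                frac = (responses[i] - half) / (responses[i] - responses[j])
                return orientations[i] + frac * (orientations[j] - orientations[i])
            i = j
        return None

    left, right = crossing(-1), crossing(+1)
    if left is None or right is None:
        return None
    return right - left


# Hypothetical tuning curves (spikes/s) sampled every 20 deg around the optimum.
oris = [-60, -40, -20, 0, 20, 40, 60]
no_mask = [2.0, 5.0, 14.0, 22.0, 16.0, 6.0, 2.0]
masked = [r * 0.5 for r in no_mask]  # uniform suppression of the whole curve

width_no_mask = half_amplitude_width(oris, no_mask)
width_masked = half_amplitude_width(oris, masked)
print(f"{width_no_mask:.1f} deg vs {width_masked:.1f} deg")
```

In this toy example a purely multiplicative suppression lowers the peak response but leaves the half-amplitude width unchanged, which is the kind of outcome the comparison in Fig. 3 is designed to detect.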



Fig. 3. Comparisons, for individual V1 recording sites, between the conditions with and without a surround mask (see Fig. 2, insets) of the mean responses at optimal orientation (A) and orientation tuning widths at half-amplitude (B). In A, the optimal orientation used to measure the optimal response was determined from the peak of the tuning curve for those recording units having a clear tuning peak, such as the examples shown in Fig. 2, A, C, and D, which was the case for most sites. For those units lacking such a clear tuning peak, the orientation corresponding to the middle point of the tuning curve was taken as the optimum, such as the example shown in Fig. 2B. For each recording site, the responses at the same orientation for no-mask condition and masking condition were compared. In B, in measuring the tuning width, the amplitude of the tuning curve was counted from the minimum response to the peak response on the curve, and the tuning width was measured directly from the tuning curve at half-amplitude. Each symbol represents 1 recording site; n = 68.

It can be seen from Fig. 3 that although there is no systematic change in orientation tuning width (paired t-test, t = 0.05, P > 0.5), a surround mask made of randomly oriented lines significantly suppresses neuronal responses to the center line by ~17% on average (paired t-test, t = 6.97, P < 0.01). The mean orientation tuning widths obtained here (53 ± 16° for no-mask condition, and 53 ± 14° for masking condition, mean ± SD) agree well with the values reported previously (Albright 1984; Celebrini et al. 1993; De Valois et al. 1982; Vogels and Orban 1990).
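For readers unfamiliar with the paired comparison used here, the t statistic can be sketched with the standard formula, t = mean(d) / (SD(d) / sqrt(n)), where d is the per-site difference between the two conditions. The function name and the five response pairs below are hypothetical and are not data from this study; obtaining a P value would additionally require the t distribution with n - 1 degrees of freedom, which is not computed here.

```python
import math
import statistics


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2 pairs)")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical optimal responses (spikes/s) at five recording sites.
no_mask = [40.0, 55.0, 32.0, 61.0, 48.0]
mask = [33.0, 47.0, 30.0, 50.0, 39.0]

t_value, df = paired_t(no_mask, mask)
print(f"t = {t_value:.2f}, df = {df}")
```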

Orientation interaction between center and surround stimuli

Many studies in both cat area 17 (Blakemore and Tobin 1972; Li and Li 1994; Nelson and Frost 1978; Sengpiel et al. 1997) and monkey V1 (Grinvald et al. 1994; Kastner et al. 1997; Knierim and Van Essen 1992; Levitt and Lund 1997; Nothdurft et al. 1999; Sillito et al. 1995) have shown that inhibitory modulation from outside the CRF depends on the relative orientation of stimuli within and outside the CRF. The inhibition is strongest when the orientation of surround stimuli matches that of center stimuli. In the next experiment, we compared human performance in orientation discrimination and neuronal activity in V1 of alert monkeys under the same contextual masking conditions, in which masking lines were either parallel or orthogonal to the target line, or randomly oriented. The stimuli are schematically illustrated in Fig. 4, top.



Fig. 4. Effects of orientation of contextual lines on orientation discrimination of humans and responses of neurons in V1. Stimulus configurations are shown schematically as insets at the top of the figure. A: human thresholds for orientation discrimination under different conditions. Data were averaged for 4 observers. Target line length was 0.8°. Masks were 5 × 5° in size and composed of 7 × 7 lines, 0.7° long each. Position of each element in the masks was set randomly in each trial. A 2 × 2° blank window was put around the target line. Target and mask were presented simultaneously for 40 ms. Error bars represent ±1 SE. Horizontal line is the threshold level for no-mask condition. B-D: responses of neurons in V1 under similar masking conditions. Mask configurations were the same as described in A. A line of the same length as the RF was centered on the RF at optimal orientation as the conditioning stimulus (test line, the counterpart of target line in psychophysical experiments). B: peristimulus time histograms (PSTHs) from 1 typical recording site under different masking conditions. Neuronal responses to the surround lines alone are also shown here as controls. Time 0 indicates stimulus onset, 500 ms indicates stimulus offset. A binwidth of 20 ms was used to construct the PSTHs. C: population responses from all recording sites (PSTHs averaged for all the recording sites for each bin, n = 61). Before averaging across all recording sites, the PSTHs from each recording site under all test conditions were normalized in such a way that the peak of the PSTH for no-mask condition was always 100. Binwidth here is 10 ms. D: relative inhibition of cell responses averaged for all the recording sites (n = 61). Inhibition in percentage was calculated as follows: for a given condition the net decrease of mean response relative to the no-mask condition averaged between 40 and 500 ms after stimulus onset was divided by the mean response to the center test line alone. Forty milliseconds was taken as the response latency of the cells. Error bars represent ±1 SE.

The psychophysical experiments (Fig. 4A) showed that human threshold for orientation discrimination was elevated slightly when the target line was surrounded by orthogonal lines. For a mask composed of randomly oriented lines, threshold was elevated further. An array of lines parallel to the target line produced the strongest interference with orientation discrimination. These results are consistent with the orientation interactions reported in the V1 studies cited in the preceding text. It is noteworthy that the implicit reference orientation in our experiments was the vertical, but the human observers were not able to use the additional explicit orientation of the contextual lines as a reference to achieve an improvement in orientation discrimination. The contextual line stimuli, regardless of whether they were horizontal or vertical, always interfered with human orientation discrimination.

As a comparison, we collected responses of 61 recording sites in V1 of monkeys to the same stimulus patterns used in the human psychophysical experiments. A line of the same length as the CRF was presented at optimal orientation inside the CRF, and the masks were presented well outside. Results (peristimulus time histograms, PSTHs) from one typical recording site are shown in Fig. 4B. The neuronal responses to the surround lines alone are also shown here as controls. Figure 4C shows the population PSTHs averaged for all the recording sites. Before averaging across the population, the PSTHs from each recording site under all test conditions were normalized in such a way that the peak of the PSTH for the no-mask condition was always taken as 100. Time 0 indicates stimulus onset, and 500 ms indicates stimulus offset. The binwidth is 20 ms in Fig. 4B and 10 ms in 4C.
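The PSTH construction and peak normalization described above can be sketched as follows. This is an illustrative reading of the procedure, not the analysis code used in the study: the function names, the dictionary layout, and the spike times are all hypothetical, and spike times are assumed to be given in milliseconds relative to stimulus onset.

```python
def psth(spike_times_per_trial, t_start, t_stop, bin_ms):
    """Trial-averaged firing rate (spikes/s) in consecutive bins of bin_ms."""
    n_bins = int((t_stop - t_start) // bin_ms)
    counts = [0] * n_bins
    for trial in spike_times_per_trial:
        for t in trial:
            idx = int((t - t_start) // bin_ms)
            if 0 <= idx < n_bins:  # ignore spikes outside the analysis window
                counts[idx] += 1
    n_trials = len(spike_times_per_trial)
    return [c / n_trials / (bin_ms / 1000.0) for c in counts]


def normalize_to_no_mask(psths, reference="no_mask"):
    """Scale every condition of one site so the reference PSTH peaks at 100."""
    peak = max(psths[reference])
    return {cond: [100.0 * r / peak for r in rates] for cond, rates in psths.items()}


def population_average(site_psths):
    """Bin-by-bin mean across sites of normalized PSTHs for one condition."""
    n_sites = len(site_psths)
    return [sum(vals) / n_sites for vals in zip(*site_psths)]


# Hypothetical spike times (ms after stimulus onset), two trials per condition.
site = {
    "no_mask": psth([[45, 60, 130], [50, 70]], 0, 200, 20),
    "mask": psth([[48], [66, 140]], 0, 200, 20),
}
norm = normalize_to_no_mask(site)
pop = population_average([norm["mask"], norm["mask"]])  # two identical toy sites
print(max(norm["no_mask"]), norm["mask"][2])
```

Normalizing each site before averaging keeps sites with high firing rates from dominating the population PSTH, which appears to be the purpose of the step described in the text.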

It can be seen from Fig. 4, B and C, that by adding surround masks composed of lines of different orientations, neuronal responses to the center line were suppressed in all cases. A mask of orthogonal lines produced the weakest suppression, whereas lines parallel to the test line gave the strongest inhibition. These results confirm those reported by Knierim and Van Essen (1992).

In Fig. 4D, we replotted the neuronal response data using the percentage of inhibition as a measure. The inhibition of cell responses was calculated relative to the no-mask condition (no inhibition) for a time period between 40 and 500 ms after stimulus onset. That is, for a given masking condition, the decrease of mean response relative to the no-mask condition averaged between 40 and 500 ms after stimulus onset was divided by the mean response to the center test line alone for the same period. Forty milliseconds was taken as the response latency of the cells. The inhibition was calculated for each recording site and then averaged across the whole population. By comparing Fig. 4, D with A, we can see clearly the similarities between the impairment of orientation discrimination and the suppression of neuronal responses in V1 by the same contextual patterns. In both cases, contextual lines that were parallel to the target line yielded the strongest interference, whereas orthogonal lines gave the weakest effect. The masking strength of randomly oriented lines was in between.
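The inhibition measure defined above amounts to the relative decrease in mean firing rate within the 40-500 ms window, computed per site and then averaged. A minimal sketch under that reading, with hypothetical function names and invented rates; the windowing helper assumes the first PSTH bin starts at stimulus onset:

```python
def mean_rate(psth_rates, bin_ms, t_from=40, t_to=500):
    """Mean of the PSTH bins covering [t_from, t_to) ms after stimulus onset."""
    window = psth_rates[int(t_from // bin_ms):int(t_to // bin_ms)]
    return sum(window) / len(window)


def percent_inhibition(no_mask_rate, mask_rate):
    """Decrease in mean response relative to the no-mask condition, in percent.

    Positive values indicate suppression; negative values indicate
    facilitation relative to the test line presented alone.
    """
    return 100.0 * (no_mask_rate - mask_rate) / no_mask_rate


def population_inhibition(sites):
    """Average of per-site inhibition over (no_mask_rate, mask_rate) pairs."""
    values = [percent_inhibition(nm, m) for nm, m in sites]
    return sum(values) / len(values)


# Hypothetical mean rates (spikes/s) in the 40-500 ms window at three sites.
sites = [(50.0, 40.0), (30.0, 27.0), (20.0, 22.0)]
per_site = [percent_inhibition(nm, m) for nm, m in sites]
pop_inhibition = population_inhibition(sites)
print(per_site, round(pop_inhibition, 2))
```

Computing the ratio per site before averaging gives every recording site equal weight regardless of its absolute firing rate; the third toy site shows how facilitation appears as a negative value.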

Effects of configurations of contextual stimuli

The results shown in Fig. 4 indicate that the interaction between target and mask takes place in the orientation domain. This suggests that a mask with straight lines should have a stronger masking effect than masks of curved elements or dots, both on responses of orientation-selective cells in V1 of monkeys and on orientation discrimination of humans. To test this hypothesis, we chose contextual patterns composed of either random dots or lines of different curvature, such as circles, semicircles or straight lines (see Fig. 5, insets). The total length of elements in the mask was the same in all masking conditions, and so was the overall average luminance.



Fig. 5. Effects of contextual stimuli of various configurations (see insets). Total length of mask elements was kept constant in all masking conditions in which the elements were random dots, circles, semicircles, and straight lines, respectively. In masking conditions other than dot mask, the masks were divided into 7 × 7 equally spaced compartments for holding individual elements. In circle mask, some spatial jitter (from 0 to 1/4 the size of a compartment) was introduced to randomize the circle position. In semicircle mask and line mask, the orientation of each element was randomized in each trial. A 2 × 2° blank window was put around the target line to occlude masking patterns. A: orientation discrimination thresholds averaged for 4 human observers. B-D: responses of neurons in V1 under the same masking conditions. Negative value in D represents facilitation of responses relative to the no-mask condition. Refer to Fig. 4 legend for details of stimulation and data analysis.

Figure 5A shows the averaged data from psychophysical experiments on four human subjects. As expected from our hypothesis, masking strength increased with decreasing line curvature, that is, as the mask elements changed from circles to semicircles to straight lines. A mask made of random dots had almost no effect on orientation discrimination.

Applying the same contextual patterns to V1 neurons of monkeys, we observed a pattern of interference similar to that observed in human perception. Data shown in Fig. 5 were analyzed and arranged in the same way as in Fig. 4 (refer to the foregoing text for details). Figure 5B shows the responses from one typical recording site under different conditions, 5C shows the relative population responses averaged for all the recording sites, and 5D shows the averaged relative inhibition of cell responses.

By comparing Fig. 5, D with A, we see that as far as contextual interference (masking) is concerned, configurations of contextual stimuli exert similar relative masking strength in human orientation discrimination and V1 neuronal responses. A mask made of straight lines produced the strongest interference, whereas random dots had the weakest masking effect. Interference strength of contextual circles and semicircles was in between. Two differences were nevertheless observed. First, masks composed of semicircles and circles made no significant difference in suppressing neuronal responses, whereas in human orientation discrimination semicircles produced stronger interference than circles. Second, a random dot surround yielded facilitation of overall cell responses (black downward bar with negative value in D), whereas it gave weak interference with human orientation discrimination.

Comparing the no-mask condition with the dot-mask condition in Fig. 5, B and C, we see that the random dots do inhibit the initial cell responses. However, the initial suppression is followed by a substantial increase of the firing rate relative to the responses without a mask. About 40% of the recording sites showed this late-phase facilitation in the presence of surround random dots. The recording site shown in Fig. 5B is a typical example.

For a further comparison, we tested the effects of a featureless homogeneous surround whose luminance was the same as the average luminance of all textured masks. Three human observers and 12 V1 recording sites were tested. Much like random dots, a homogeneous surround yielded some late facilitation of cell responses (Fig. 6, B-D) and a weak elevation of human orientation discrimination thresholds (Fig. 6A). This corroborates the point that textured stimuli, in particular straight lines, inhibit responses of V1 neurons most effectively and likewise interfere most strongly with orientation discrimination of humans.



Fig. 6. Effects of homogeneous surround the luminance of which was the same as the average luminance of textured patterns. Results from masking by randomly oriented lines are also shown here for comparison. Three human observers and 12 V1 recording sites were tested. A: orientation discrimination thresholds averaged for 3 human observers. B-D: responses of neurons in V1 under the same masking conditions. Negative value in D represents facilitation of responses relative to the no-mask condition. Refer to Fig. 4 legend for details of stimulation and data analysis.

Masking size dependency

Our results shown so far suggest that the interaction between the CRF and its surround, that is, between cells distributed over a large area in V1, is involved in the perceptual masking of orientation discrimination by contextual stimuli. What if the immediate vicinity of the target line also is masked? How does the masking effect depend on the size of the mask? By masking different areas, we again determined orientation discrimination in four human subjects and measured neuronal activity in V1 of monkeys.

Figure 7A shows the effects on human orientation discrimination when different regions relative to the target line were masked. In surround masking experiments described previously (Figs. 4 and 5), a 2 × 2° blank window was put in the center of the mask. The target line was always clearly identifiable even though it was of the same brightness as the contextual lines. However, when the target line was also covered by the masking patterns (center- and full-mask conditions, Fig. 7, insets), the subjects reported difficulties in distinguishing the target line from the masking lines at the parafoveal location tested. To solve this problem, we used masking lines that were 16 times dimmer than the target line.



Fig. 7. Effects of masking of different regions. A: orientation discrimination thresholds of human observers (n = 4). Brightness of masking lines was 16 times dimmer than that of the target line to ensure clear identification of the target line in the parafovea tested. In center-mask condition, the mask was confined to a 2 × 2° region centered on the target line. In surround-mask condition, the mask was 5 × 5° in size with a 2 × 2° blank window in the center, from which masking pattern was excluded. In full-mask condition, the mask, which was the sum of the center and surround mask, was 5 × 5° in size, consisting of 7 × 7 randomly oriented lines, 0.7° long each. B-D: responses of neurons in V1. Mask configurations were the same as described in A except that the masking lines were of the same brightness as the test line. Refer to Fig. 4 legend for details of stimulation and data analysis.

Figure 7A shows that when both the center and surround areas were covered by random lines (full-mask condition), thresholds increased by a factor of about two relative to the no-mask condition. A surprising phenomenon was observed if only the center region was masked (center-mask condition). The thresholds increased by about another factor of two relative to full-mask condition. This means that by adding a surround mask, which by itself produced some interference, the interference produced by the center mask was much weakened. The surround mask has a disinhibitory effect on the smaller center mask.

From the data shown in Fig. 7A, we can infer that the bigger the masking area is, the weaker the masking strength will be. We can see this more clearly in another experiment shown in Fig. 8A. By adding more elements to the mask to increase its size, the thresholds for orientation discrimination dropped dramatically, implying that a figure-ground segregation process could be involved. When the mask gets bigger, the target line gets more distinct from the masking elements.



Fig. 8. Effects of mask size on human orientation discrimination and cell responses. A: orientation discrimination thresholds of human observers (n = 4). Masking lines were 16 times dimmer than the target line. Masks were composed of 3 × 3, 5 × 5, and 7 × 7 elements, giving mask sizes of 2.1, 3.5, and 4.9°, respectively. A 0.7 × 0.7° blank window was put in the center of all masks to exclude the center element. B-D: responses of neurons in V1. Mask configurations were the same as described in A except that the masking lines were of the same brightness as the test line. Refer to Fig. 4 legend for details of stimulation and data analysis.
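The mask sizes quoted in the Fig. 8 legend follow from the number of elements per side and the 0.7° element size. The short Python sketch below reproduces that arithmetic and lists the compartment centers left outside the central blank window. The function names, the coordinate convention (target at the origin), and the rule of excluding a compartment when its center lies inside the window are illustrative assumptions, not the authors' stimulus code.

```python
ELEMENT_DEG = 0.7  # element (compartment) size in degrees, from the Fig. 7 and 8 legends


def mask_side_deg(n_per_side, element_deg=ELEMENT_DEG):
    """Side length of a square mask built from n_per_side x n_per_side elements."""
    return n_per_side * element_deg


def mask_compartments(n_per_side, element_deg=ELEMENT_DEG, blank_deg=0.7):
    """Compartment centers (x, y) in degrees, with the target at the origin.

    Assumption: a compartment is dropped when its center falls inside the
    central blank window of side blank_deg.
    """
    half = (n_per_side - 1) / 2
    centers = []
    for row in range(n_per_side):
        for col in range(n_per_side):
            x = (col - half) * element_deg
            y = (row - half) * element_deg
            if abs(x) < blank_deg / 2 and abs(y) < blank_deg / 2:
                continue  # inside the blank window around the target
            centers.append((x, y))
    return centers


for n in (3, 5, 7):
    print(n, round(mask_side_deg(n), 1), len(mask_compartments(n)))
```

Under these assumptions the 3 × 3, 5 × 5, and 7 × 7 masks contain 8, 24, and 48 masking elements, respectively, once the center element is excluded.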

Using similar stimulus patterns as those used in psychophysical experiments shown in Figs. 7A and 8A, we tested the neuronal activity in V1. In these two physiological experiments, the masking lines and the test line were of the same brightness as in all the other physiological experiments. The results are shown in Figs. 7, B-D, and 8, B-D, respectively. Data were analyzed in the same way as in Figs. 4-6. The example of a typical recording site is shown in B. The relative population PSTHs are shown in C, and D shows the averaged relative inhibition of cell responses.
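To make the quantity plotted in the D panels concrete, here is a minimal Python sketch of a relative inhibition index. The exact definition is given in the Fig. 4 legend, which is not reproduced in this section; the formula below (percent reduction of the masked response relative to the no-mask response, so that negative values indicate facilitation) is an assumption chosen to match the sign convention stated in the figure legends, and the function names are hypothetical.

```python
def relative_inhibition(no_mask_rate, mask_rate):
    """Percent reduction of the masked response relative to the no-mask response.

    Assumed definition: positive values mean inhibition, negative values
    mean facilitation relative to the no-mask condition.
    """
    if no_mask_rate <= 0:
        raise ValueError("no-mask rate must be positive")
    return 100.0 * (no_mask_rate - mask_rate) / no_mask_rate


def mean_relative_inhibition(sites):
    """Average the index over recording sites given (no_mask, mask) rate pairs."""
    values = [relative_inhibition(r0, rm) for r0, rm in sites]
    return sum(values) / len(values)


# Hypothetical firing rates (spikes/s) for two recording sites.
print(mean_relative_inhibition([(40.0, 30.0), (20.0, 10.0)]))
```

Averaging the index per site, as here, weights all sites equally regardless of firing rate; averaging the raw rates first would be a different design choice.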

Similar to the effects observed in human orientation discrimination, responses of neurons to a line in the RF center (no-mask condition, Figs. 7 and 8, dotted curves) were inhibited by all these masking patterns. Moreover, a mask covering the center area (center and full mask in Fig. 7) was much more effective than a mask covering only the surround in suppressing neuronal responses (Fig. 7D), similar to the interference with human orientation discrimination (Fig. 7A). There was, however, a major difference. Inhibition of neuronal responses increased with increasing size of mask (Fig. 8D), whereas increasing masking area resulted in weaker interference with human orientation discrimination (Fig. 8A).

Time course of contextual influence on neuronal responses

The inhibition of neuronal responses in V1 produced by surround contextual lines has been reported to appear early in the population responses (Knierim and Van Essen 1992; Nothdurft et al. 1999). To see more clearly the time course of suppression of cell responses, we normalized the population responses shown in Figs. 4C, 5C, 7C, and 8C, for each bin between 40 and 500 ms, to the no-mask condition (Figs. 4C, 5C, 7C, and 8C, dotted curves). That is, for any given time slice, the responses activated by center test line alone were always taken as 100%. Here 40 ms was taken as the response latency of the population responses. The normalized PSTHs are shown in Fig. 9, A-D, respectively, where we can compare the relative firing rate within any given time slice (e.g., 40-50, 50-60, 60-70 ms, and so on) for the no-mask condition (dotted horizontal lines in the plots) and all masking conditions. The population response obtained in the experiment using homogeneous luminance surround (Fig. 6C, thin solid curve) also was normalized to the no-mask condition (Fig. 6C, dotted curve) in the same way, and the resulting curve was inserted into Fig. 9B for a comparison with dot-mask condition. Time 0 in the plots corresponds to stimulus onset.
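The bin-by-bin normalization described above can be sketched in a few lines of Python. The 10-ms bin width and the 40-500 ms window are taken from the text and the Fig. 9 legend; the function name, the list-based PSTH representation (one value per bin, bin 0 starting at stimulus onset), and the handling of empty reference bins are assumptions made for illustration.

```python
def normalize_to_no_mask(mask_psth, no_mask_psth, bin_ms=10, start_ms=40, end_ms=500):
    """Express each bin of a masked PSTH as a percentage of the no-mask PSTH.

    Bins are indexed from stimulus onset; only bins in [start_ms, end_ms)
    are returned. A bin with zero no-mask response yields NaN (an assumption;
    the source does not say how such bins were treated).
    """
    first = start_ms // bin_ms
    last = min(end_ms // bin_ms, len(mask_psth), len(no_mask_psth))
    normalized = []
    for i in range(first, last):
        ref = no_mask_psth[i]
        normalized.append(float("nan") if ref == 0 else 100.0 * mask_psth[i] / ref)
    return normalized


# Hypothetical population rates for the first 80 ms (eight 10-ms bins).
no_mask = [0, 0, 0, 0, 20, 40, 80, 60]
masked = [0, 0, 0, 0, 10, 10, 20, 30]
print(normalize_to_no_mask(masked, no_mask))
```

With this convention the no-mask condition is 100% in every bin by construction, and values below 100% indicate suppression in that time slice.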



Fig. 9. Time course of suppression of cell responses under masking of various contextual stimuli. Population PSTHs under various masking conditions shown in Figs. 4C, 5C, 7C, and 8C were normalized, for each bin between 40 and 500 ms, to the no-mask condition, and the normalized results are shown in A-D in this figure, respectively. Forty milliseconds was taken as the response latency of the population responses. Population PSTH of 12 recording sites to a line with the presence of a homogeneous luminance surround (Fig. 6C) also was normalized to the no-mask condition in the same way and the resultant curve was inserted into B for a comparison with dot-mask condition. In all plots time 0 indicates stimulus onset. Bin width is 10 ms.

As shown in Fig. 9, for all the masking conditions including random dots and a homogeneous surround of equal luminance, the inhibition starts in the initial phase of cell responses and increases rapidly to maximal values. This is in accord with what has been reported in studies using straight lines as contextual stimuli (Knierim and Van Essen 1992; Nothdurft et al. 1999).

We also can see that the strongest inhibition occurs at about the same time (70-80 ms after stimulus onset) when cell responses to the center test line reach the maximum (compare Figs. 4C, 5C, 7C, and 8C with Fig. 9, A-D, respectively). This supports the notion that the initial burst of neuronal responses carries more information than the later sustained response components and is important in perception (Macknik and Livingstone 1998; Thorpe et al. 1996).

As also shown in Fig. 9, the strong fast initial inhibition of cell responses by various masks was followed by a weaker inhibitory period in which for all the masking conditions the cell responses returned toward the control (no mask) level. For some recording sites during this period, surround random dots or a homogeneous luminance surround even produced facilitation of responses relative to the no mask condition (Figs. 5 and 6, B and C, and 9B).

It is noteworthy that during the whole period of cell responses, although the inhibition strength for each masking condition varies with time, the relative inhibition strength for different masking conditions in each experiment is quite stable. For example, inhibition produced by contextual lines parallel to the test line is always stronger than the inhibition produced by orthogonal lines (Fig. 9A).


    DISCUSSION

In this study, we compared the influence of various contextual stimuli on human orientation discrimination with the influence of the same contextual patterns on neuronal activity in V1 of awake monkeys. These contextual stimuli interfere similarly in many respects with human orientation discrimination as well as with V1 neuronal responses.

Similarities between human perception and neuronal activity in V1

Our data show that human orientation discrimination of a line always is impaired by contextual patterns of various configurations. The strongest interference occurs when the target line is surrounded by straight lines and when the straight lines are parallel to the target. Although the orientation of contextual lines could serve as explicit reference of orientation (when the surround lines are orthogonal or parallel to the target line), human observers cannot take advantage of this additional reference. In any of these cases, only impairment of orientation discrimination was observed. As responses of neurons in V1 to a line in the RF center are suppressed in a similar way by contextual stimuli, this suggests that spatial interactions in the orientation domain between cells in V1 could be part of the neural basis of the interference observed psychophysically in orientation discrimination.

Knierim and Van Essen (1992) studied the effects of the orientation of contextual lines on neuronal responses to a line within the CRF. They reported that for ~80% of cells in V1 of alert monkeys the responses to a center line were suppressed significantly by the textured surrounds, and cells tended to respond more strongly to a stimulus in which there was a contrast in orientation between the center and surround (surround lines orthogonal to the center line) than to stimuli lacking such contrast (surround lines parallel to the center line or randomly oriented). Similar observations were later made in anesthetized animals (Nothdurft et al. 1999). Our results (Fig. 4) are also in accord with these results. In fact in their data, even for so-called orientation contrast cells, an array of surround lines orthogonal to the center line also suppressed the neuronal responses to the center line. That is, a line in the RF center without any textured surrounds always activates the cells most efficiently. The orientation contrast between center and surround does not really help the cells to respond better to a line inside the CRF.

Discrepancies between human perception and neuronal activity in V1

From our data, a larger mask, compared with a smaller one, yields weaker interference with human orientation discrimination (Fig. 8A). A figure-ground segregation process may explain this result. It is conceivable that segmentation between target and mask is difficult if the spatial extent of target and mask is similar but becomes easier when the mask is getting larger. We did not see a comparable effect of mask size on neuronal responses in V1. On the contrary, a larger mask tends to produce stronger suppression of neuronal responses to a line in the RF center (Fig. 8D). This suggests that the figure-ground segregation process involved in perception is located in a cortical area beyond V1. An alternative explanation for this discrepancy is that in human orientation discrimination, observers attended to the target line, whereas in physiological recordings, the monkeys attended to the fixation spot only. A better approach for a further investigation would be that the monkeys also are trained to do the orientation discrimination task as human observers do. Many lines of evidence show that in extrastriate cortical areas attention can strongly modulate cell responses (Bushnell et al. 1981; Maunsell 1995; McAdams and Maunsell 1999; Treue and Maunsell 1996). The influence of attention on V1 neuronal responses is still a matter of debate. As yet it has been reported that attention also can modulate cell responses in V1 (Ito and Gilbert 1999; McAdams and Maunsell 1999; Roelfsema et al. 1998), but none of these studies has identified functional mechanisms allowing us to explain the perceptual phenomenon we observed in terms of effects of selective attention in V1.

As mentioned earlier, in the psychophysical experiments shown in Figs. 7A and 8A (experiments of different masking areas), when the immediate vicinity of the target line also was covered by the masking patterns, the subjects had difficulties in distinguishing the target line from the masking lines in the parafovea, and we had to use masking lines dimmer than the target line to solve this problem. We do not think that the difference in mask luminance can account for the discrepancy between human perception and neuronal responses. The point we want to make in those experiments is that increasing mask size has opposite relative effects on orientation discrimination of humans (with dimmer masks) on the one hand and on the firing rate of orientation-selective cells in V1 (with brighter masks) on the other. As a matter of fact, our previous psychophysical study in the fovea using a backward masking approach (Wehrhahn et al. 1996) showed that when the target line was followed by masks made of lines with the same brightness as the target, a bigger full mask also produced a much weaker masking effect than a small center mask. This supports the point we want to make in this study.

Contextual modulation observed in V1 has been suggested to contribute to figure-ground segregation in perception (Lamme 1995; Zipser et al. 1996). In a study of contextual modulation in V1, Zipser et al. (1996) reported that when any cue (disparity, color, luminance, or orientation) defined a textured figure centered on the RF, there was always an enhancement of neuronal responses 80-100 ms after stimulus onset. In their study, the control condition relative to which the enhancement was shown was a large homogeneous textured pattern (background) covering both the RF and the extra-RF regions. As we showed in this study, using the responses activated by center stimulation alone as a control, all textured patterns but random dots outside the CRF suppress neuronal responses to the center stimuli. Even for random dots and a homogeneous surround of equal luminance, the initial responses activated by the center line were suppressed (Fig. 9B). We interpret this as the neuronal basis of the masking effects observed in human orientation discrimination, because any pattern surrounding the target line impairs the orientation discrimination of the target. One phenomenon observed in our study is nevertheless quite similar to the late facilitation of responses reported by Lamme (1995) and Zipser et al. (1996). As shown in Fig. 9, the strong fast initial inhibition of cell responses by various masks was followed by a weaker inhibitory period in which for all the masking conditions the cell responses returned toward the control (no mask) level. For some cells during this period, surround random dots or a homogeneous luminance surround even produced facilitation of responses relative to the no mask condition (Figs. 5 and 6, B and C, and 9B). This disinhibitory or facilitatory phase has about the same latency and duration as the late facilitation observed by Lamme (1995) and Zipser et al. (1996), and we think that both have the same origin but were interpreted differently.
As the latency of this response phase is very long (100 ms), one plausible explanation is that the underlying neural substrate could be the feedback connections from the extrastriate cortical areas, which may enable the cells in V1 to diminish the inhibition produced by the contextual stimuli and may play an important role in figure-ground segregation. A recent study using reversible inactivation of area MT of monkeys suggested that cortical feedback from higher-order areas improves discrimination between figure and background by V1, V2, and V3 neurons (Hupe et al. 1998). We cannot rule out, however, a contribution of intrinsic connections within V1 to this late disinhibitory or facilitatory response phase. A recent study (Bringuier et al. 1999) of intracellular recordings from orientation-selective neurons in area 17 of cats showed subthreshold synaptic depolarizing responses to stimuli flashed outside the CRF, and the delay of depolarization increased linearly with the stimulus distance from the RF center. It was proposed that these signals spread along slowly conducting horizontal connections within primary visual cortex. Facilitatory surround effects have been reported by several authors (Ito and Gilbert 1999; Kapadia et al. 1995; Nelson and Frost 1985; Polat and Sagi 1993), but the facilitation reported in those studies exists only in a small restricted region colinear with the target. Moreover, the facilitation of cell responses by colinear lines starts at the onset of cell responses (Ito and Gilbert 1999).

Masks composed of semicircles and circles made no significant difference in suppressing neuronal responses, whereas in human orientation discrimination, semicircles produced stronger interference than circles. This suggests that, from the point of view of single V1 neurons, only large differences between contextual elements are able to be differentiated through contextual modulation mechanisms.

Similarity between foveal and parafoveal vision

The comparisons between the influence of contextual stimuli on human orientation discrimination and on neuronal activity in V1 of monkeys were made at comparable parafoveal locations, but there is no reason why the results should not apply to foveal vision as well. If scaled by the cortical magnification factor, human vernier acuity was reported to be as good in the periphery as it is centrally (Levi et al. 1985). What we observed here in masking of human orientation discrimination in the parafovea is comparable with what we observed in the fovea before (Wehrhahn et al. 1996), except at a different scale. As for V1 neurons, the main difference between foveal and parafoveal neurons is the size of the RF. The RF size in the fovea is too small to handle in awake fixating monkeys.

Orientation tuning and orientation discrimination

We demonstrated in this study the concordance between V1 neuronal responses and human perception in terms of contextual influence. Our data also showed that, although contextual lines suppress significantly neuronal responses to a line in the RF center, they do not systematically change the bandwidth of orientation tuning functions of V1 cells (Fig. 3). Recently McAdams and Maunsell (1999) reported that attention enhanced the responses of V4 and V1 neurons, but the orientation selectivity, measured by the width of the tuning curve, was not systematically altered by attention. Their data, together with ours, support the notion that the mean firing rate of cells could be more important than the tuning width in coding orientation information. This does not necessarily mean that the tuning width is not important at all. As was already shown in many studies, as far as single cells are concerned, the factors which affect the discriminative capability of cells include the mean firing rate, the response variance, as well as the tuning width (Bradley et al. 1985, 1987; Britten and Newsome 1998; Heggelund and Albus 1978; Scobey and Gabor 1989; Snowden et al. 1992; Vogels and Orban 1990). According to these studies, what matters in discriminative capability of the cells is the slope of the flanks of tuning curves.
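The relation between response amplitude, tuning-curve slope, and discriminability can be illustrated with a toy calculation. The Python sketch below is not an analysis from this study: it assumes a Gaussian tuning curve with no baseline, Poisson-like variability (variance equal to the mean response), and a discriminability index d' equal to the response difference for a small orientation step divided by the response standard deviation. All function names and parameter values are hypothetical.

```python
import math


def tuning(theta, amp, pref=0.0, sigma=20.0):
    """Assumed Gaussian orientation tuning curve (mean response)."""
    return amp * math.exp(-((theta - pref) ** 2) / (2 * sigma ** 2))


def slope(theta, amp, pref=0.0, sigma=20.0):
    """Derivative of the tuning curve with respect to orientation."""
    return -(theta - pref) / sigma ** 2 * tuning(theta, amp, pref, sigma)


def dprime(theta, dtheta, amp, pref=0.0, sigma=20.0):
    """Approximate d' for a small orientation step dtheta at orientation theta.

    Assumption: response variance equals the mean response (Poisson-like).
    """
    sd = math.sqrt(tuning(theta, amp, pref, sigma))
    return abs(slope(theta, amp, pref, sigma)) * dtheta / sd


# Halve the amplitude (suppression) while keeping the tuning width fixed,
# evaluated on the flank of the tuning curve (theta = sigma).
unmasked = dprime(theta=20.0, dtheta=1.0, amp=50.0)
masked = dprime(theta=20.0, dtheta=1.0, amp=25.0)
print(masked / unmasked)
```

Under these assumptions, scaling the amplitude by a factor k with unchanged tuning width scales the flank slope by k and d' by the square root of k, so suppression alone lowers single-cell discriminability even though the bandwidth is unaltered.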

Although theoretically the discriminative capability of single cells is reported to be comparable with the perceptual threshold in discrimination tasks (refer to the studies cited in the preceding text), individual V1 recording sites characterized in this study are insufficient for reliably encoding the orientation of stimuli. Rather, psychophysical performance must rely on the responses of a large ensemble of cells as proposed earlier (Britten and Newsome 1998; Gilbert and Wiesel 1990; Vogels 1990, 1991; Westheimer et al. 1976). One important reason is that, as pointed out by Vogels and Orban (1991), orientation discrimination in perception is robust to changes of stimulus parameters such as spatial frequency, stimulus contrast, and position (Burbeck and Regan 1983; Paradiso et al. 1989; Regan and Beverley 1985), whereas responses of individual V1 neurons are sensitive to changes of these stimulus parameters, and a change in cell activity could be the result of a change in any of the stimulus features. The ambiguity of coding at the single-cell level can be resolved only by using the output of an ensemble of cells. The main result of this study is that, for several of the conditions analyzed, contextual patterns interfere with the responses of most V1 cells in a way comparable to the impairment observed in orientation discrimination of human observers. This implies that orientation-selective neurons in V1 carry signals relevant for perception. However, the neuronal signals indicating segmentation processes in perception (psychophysical experiments of different masking area, Figs. 7A and 8A) have not been observed in these neurons.


    ACKNOWLEDGMENTS

We thank G. Westheimer for numerous suggestions and comments, U. Ilg and S. Treue for help with animals, M. Repnow for assistance in computer programming, and T. Wiegand-Grewe and Ute Großhennig for technical support.

This work was supported by the Deutsche Forschungsgemeinschaft (Schwerpunktprogramm "Physiologie und Theorie neuraler Netze"). K. Kirschfeld and N. Logothetis generously provided additional support.


    FOOTNOTES

Address for reprint requests: C. Wehrhahn, Max-Planck-Institut für biologische Kybernetik, Spemannstrasse 38, 72076 Tubingen, Germany.

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received 22 April 1999; accepted in final form 4 October 1999.


    REFERENCES

0022-3077/00 $5.00 Copyright © 2000 The American Physiological Society