1Zanvyl Krieger Mind/Brain Institute, 2Department of Biomedical Engineering and 3Department of Neuroscience, Johns Hopkins University, Baltimore, Maryland 21218
ABSTRACT
Pasupathy, Anitha and Charles E. Connor. Responses to Contour Features in Macaque Area V4. J. Neurophysiol. 82: 2490-2502, 1999. The ventral pathway in visual cortex is responsible for the perception of shape. Area V4 is an important intermediate stage in this pathway, and provides the major input to the final stages in inferotemporal cortex. The role of V4 in processing shape information is not yet clear. We studied V4 responses to contour features (angles and curves), which many theorists have proposed as intermediate shape primitives. We used a large parametric set of contour features to test the responses of 152 V4 cells in two awake macaque monkeys. Most cells responded better to contour features than to edges or bars, and about one-third exhibited systematic tuning for contour features. In particular, many cells were selective for contour feature orientation, responding to angles and curves pointing in a particular direction. There was a strong bias toward convex (as opposed to concave) features, implying a neural basis for the well-known perceptual dominance of convexity. Our results suggest that V4 processes information about contour features as a step toward complex shape recognition.
INTRODUCTION
Visual shape information is processed in the ventral cortical pathway, which runs from V1 to V2, V4, and finally into various subregions of inferotemporal (IT) cortex (Felleman and Van Essen 1991; Ungerleider and Mishkin 1982). At lower levels in this pathway (V1 and V2), shape is represented at least partly in terms of local orientation (Baizer et al. 1977; Burkhalter and Van Essen 1986; Hubel and Livingstone 1987; Hubel and Wiesel 1959, 1965, 1968). At the final stages in IT, cells are often selective for complex objects like faces and hands (Desimone et al. 1984; Gross et al. 1972; Perrett et al. 1982; Tanaka et al. 1991). To understand how lower-level orientation signals are transformed into complex object representations, it is important to study shape processing at intermediate stages like area V4.
Only a few studies have addressed shape processing in area V4. Desimone and Schein (1987) showed that many V4 cells are tuned for orientation, width, and length of bar stimuli and for orientation and spatial frequency of gratings, as in V1 and V2. Kobatake and Tanaka (1994) found that some V4 cells respond better to complex shapes than to simple bar stimuli. Gallant and colleagues (1993, 1996) demonstrated selectivity for curvilinear as well as linear gratings. These studies indicate that V4 encodes both orientation and higher-level shape information. The exact nature of the higher-level information remains to be determined.
A primary goal in the study of shape processing at intermediate levels like area V4 is to identify the shape primitives or basic features represented at those levels. Many shape processing theories invoke contour features (angles and curves) as intermediate shape primitives (Attneave 1954; Biederman 1987; Dickinson et al. 1992; Milner 1974; Poggio and Edelman 1990; Ullman 1989). Contour features constitute a simple geometric step beyond individual oriented edges (simpler, in some sense, than rectangular bars, which comprise 4 edges and 4 right angles in a specific arrangement). They are ubiquitous visual elements with high information content (Attneave 1954), their presence can be derived by combining individual edge orientation signals (Milner 1974), and they form natural parts for constructing more complex representations. Psychological findings imply the existence of specialized mechanisms for perception of contour features (Andrews et al. 1973; Chen and Levi 1996; Fahle 1997; Heeley and Buchanan-Smith 1996; Regan et al. 1996; Treisman and Gormican 1988; Watt and Andrews 1982; Wilson et al. 1997; Wolfe et al. 1992).
Physiologists have studied responses to contour features at earlier stages in the ventral pathway (V1 and V2) (Dobbins et al. 1987; Hammond and Andrews 1978; Hegde and Van Essen 1997; Heggelund and Hohmann 1975; Hubel and Wiesel 1965; Versavel et al. 1990). It has been proposed that contour feature extraction is the ultimate purpose of endstopping (i.e., preference for terminated edges or lines) (Hubel and Livingstone 1987; Hubel and Wiesel 1965). For these reasons, we chose to study responses to contour features in area V4.
We designed a large parametric set of contour feature stimuli, illustrated in Figs. 1 and 2A. Each stimulus consisted of a single contour feature (angle or curve) or straight edge centered on the receptive field (RF) of the cell under study. Outside the RF, the stimulus edges continued and stimulus color gradually faded into the background gray, as if a spotlight was illuminating one portion of a larger object. In this way, a single contour feature could be presented essentially in isolation. This allowed us to examine whether some cells in V4 that might appear to be selective for more complex stimuli are actually sensitive to individual corners or curve segments. We found that a substantial fraction of V4 cells exhibit such lower-order specificity, suggesting that in some cases responses to complex shapes can be understood in terms of their constituent contour features.
METHODS
Single-cell recording
We recorded spike activity from isolated V4 cells in the lower
parafoveal representation on the surface of the prelunate gyrus and
adjoining banks of the lunate and superior temporal sulci. Recording
locations were initially based on skull landmarks and then adjusted on
the basis of response properties, retinotopy, and inferred positions of
the sulci. Other technical details have been described previously
(Connor et al. 1997). All animal procedures conformed to
National Institutes of Health and USDA guidelines and were carried out
under an institutionally approved animal protocol.
Preliminary tests
Stimuli were presented on a computer monitor while the animal maintained fixation (on a small white dot) within a 0.5° radius window. Continuous fixation for 4.5 s was rewarded with a drop of juice. Each isolated cell was initially characterized by handplotting with colored rectangular bars and ellipses to find the approximate RF center and optimum bar orientation.
Color and width tuning were tested by presenting optimally oriented
bars (with rounded endcaps) at the handplotted RF center in eight
colors and five widths. The colors were red, green, blue, yellow, cyan,
magenta, white, and black. All colors were adjusted to an approximate
luminance of 20 cd/m², except for blue (15 cd/m²) and black, and presented against a
background gray of 2.5 cd/m². The widths were
0.025, 0.05, 0.075, 0.1, and 0.125 times the average V4 RF diameter at
the handplotted eccentricity [based on the relation between RF
diameter and eccentricity reported by Gattass et al.
(1988); their data suggest that average diameter equals
approximately 1° + 0.625 × eccentricity]. The bar stimuli were
flashed for 500 ms each and separated by 250-ms interstimulus intervals. During each trial a sequence of five stimuli was presented. Stimuli were presented in random order until each stimulus had been
presented a total of three times. When cells were unresponsive to bar
stimuli the optimum color was determined by handplotting.
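The scaling rule above can be made concrete with a short sketch. It uses only the approximate relation quoted from Gattass et al. (1988) and the five width fractions listed in the text; the function names are hypothetical and are not taken from the original analysis software.

```python
# Illustrative sketch only: function names are assumptions, not the authors' code.

# Width fractions of the average RF diameter, as listed in the text.
WIDTH_FRACTIONS = (0.025, 0.05, 0.075, 0.1, 0.125)


def average_rf_diameter(eccentricity_deg):
    """Approximate average V4 RF diameter (deg): 1 deg + 0.625 x eccentricity."""
    return 1.0 + 0.625 * eccentricity_deg


def bar_widths(eccentricity_deg):
    """Bar widths (deg) tested at a given handplotted eccentricity."""
    diameter = average_rf_diameter(eccentricity_deg)
    return [fraction * diameter for fraction in WIDTH_FRACTIONS]


# Example: a cell at 4 deg eccentricity has an assumed average RF diameter of 3.5 deg.
example_widths = bar_widths(4.0)
```

For instance, at 4° eccentricity the narrowest bar would be 0.025 × 3.5° ≈ 0.09° wide and the widest about 0.44°.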
When cells were responsive to bars, orientation tuning was tested using optimum color and width values derived from the previous test. Bar length was set to 0.5 times the average RF diameter and bar endcaps were rounded. Twelve orientations (15° intervals) were tested (5 repetitions each).
The RF was then plotted more precisely with small bar stimuli flashed at locations in a square grid covering a circular area with a diameter of 2.0 times the handplotted RF diameter and a spacing of 0.125 times the handplotted diameter. The bars were of optimum color, width, and orientation, with length equal to 0.25 times the handplotted diameter. Bars were flashed for 250 ms each with 500-ms interstimulus intervals. The grid locations were sampled once each in random order. The response plot was smoothed by means of local spatial averaging. The RF center was estimated by calculating the center of mass for all responses >50% of the maximum (75% for highly asymmetric plots). Many cells failed to respond in this test; in these cases, handplotting was used to estimate the RF center. Handplotting also was used in some cases where the remaining recording time for the day was limited.
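A minimal sketch of the RF center estimate follows. It assumes that the center of mass is weighted by response rate and that the grid responses have already been smoothed; both points are assumptions, since the text does not specify the weighting or the smoothing kernel. The function name `rf_center` is hypothetical.

```python
# Illustrative sketch only: response-weighted center of mass over grid locations
# whose (already smoothed) response exceeds a fraction of the maximum.

def rf_center(samples, threshold_fraction=0.5):
    """samples: iterable of (x, y, response). Returns (center_x, center_y).

    threshold_fraction is 0.5 by default; the text uses 0.75 for highly
    asymmetric plots.
    """
    samples = list(samples)
    peak = max(response for _, _, response in samples)
    if peak <= 0:
        raise ValueError("no positive responses; fall back to handplotting")
    kept = [(x, y, r) for x, y, r in samples if r > threshold_fraction * peak]
    total = sum(r for _, _, r in kept)
    center_x = sum(x * r for x, _, r in kept) / total
    center_y = sum(y * r for _, y, r in kept) / total
    return center_x, center_y


# Toy grid: two strong, equal responses and one weak response that is discarded.
example_center = rf_center([(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (0.0, 1.0, 2.0)])
```

In the toy grid, the weak response at (0, 1) falls below half-maximum and is ignored, so the estimated center lies midway between the two strong locations.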
Contour feature test
Figure 1 shows four example contour feature stimuli. The small white dot represents the fixation point, and the dashed circle (which was not part of the actual display) represents the estimated RF. In Fig. 1, A-C, the contour features are 90° sharp angles pointing toward the right. The 90° angle is rendered in white as a projecting convex corner (A), an outline (B), or a concave indentation (C). Smooth curve stimuli were B-spline approximations to the angles. Figure 1D shows a 90° curve pointing to the right, with the B-spline control points indicated by diamonds. In all cases, stimulus color and brightness were constant within the RF and then gradually faded into the background gray over a distance equal to the RF radius, giving the impression of a spotlight illuminating one corner of a larger object. In this way, a single contour feature could be presented in isolation.
The stimuli were scaled according to average V4 RF diameter at the
cell's eccentricity (dashed circles in Fig. 1), based on Gattass et al. (1988) (see preceding text). Scaling with
eccentricity in this manner ensures a generally consistent relationship
between stimulus size, RF size, and acuity. In any case, stimulus size is not a major concern in this experiment because the stimuli consist
of individual edges and corners, which have no real size, and the rest
of the stimulus fades gradually into the background.
The full set of contour feature stimuli is shown in Fig. 2A. Stimuli were presented in the optimum color for the cell under study (shown here as white). Each stimulus consisted of a single contour feature defined by a sharp luminance/color boundary or a line of width equal to 1/16 the average V4 RF diameter at the cell's eccentricity. Stimulus luminance was constant within the RF (20 cd/m², except for blue and black), then gradually faded into the background gray (2.5 cd/m²) over a distance of 0.5 times the average RF diameter (the full extent of fading is not shown in Fig. 2A). The stimulus set had four dimensions (the first 3 plotted horizontally and the 4th vertically):
CONVEXITY. The stimuli were rendered as convex projections (Fig. 2A, left), concave indentations (right), or outlines (middle). Convexity/concavity was defined by considering the stimulus (shown in white here) to be the figure, based on its smaller size relative to the homogeneous gray background (see Fig. 1).
CURVATURE. The stimuli were either sharp angles (on the left within each block of Fig. 2A) or smooth curved B-spline approximations to the angles (on the right within each block; see Fig. 1 for details of B-spline construction).
ACUTENESS. The angles and their corresponding curves had three levels of acuteness (45, 90, and 135°), with the straight edges (180°) representing the limit at the obtuse end of the scale for both types of stimuli.
CONTOUR FEATURE ORIENTATION. The features point in eight directions: upward (90°) in the top row, upper left (135°) in the second row, etc. Contour feature orientation is a circular dimension, and has been arbitrarily split in Fig. 2 between 90° (top) and 45° (bottom). For most cells, stimuli were presented at the eight orientations shown in Fig. 2. In cases where the preliminary bar orientation test revealed a strong tuning peak, the orientations of all the stimuli were rotated so that there would be straight edge stimuli at the preferred orientation.
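The four dimensions listed above define a factorial stimulus space. The sketch below enumerates the 144 contour features (the 180° straight edges and lines are left out) for the default set of eight orientations; the variable names and the string labels are illustrative choices, not identifiers from the original experiment.

```python
# Illustrative enumeration of the contour feature stimulus space.
from itertools import product

CONVEXITY = ("convex", "outline", "concave")
CURVATURE = ("sharp", "smooth")
ACUTENESS_DEG = (45, 90, 135)                # 180 deg straight edges excluded here
ORIENTATION_DEG = tuple(range(0, 360, 45))   # eight default pointing directions

# One tuple per contour feature: (convexity, curvature, acuteness, orientation).
contour_features = list(
    product(CONVEXITY, CURVATURE, ACUTENESS_DEG, ORIENTATION_DEG)
)

n_contour_features = len(contour_features)   # 3 x 2 x 3 x 8
```

For cells with a strong bar orientation preference, the same enumeration would apply with every orientation shifted by a common offset, as described in the text.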
Stimuli were flashed for 500 ms each and separated by interstimulus intervals of 250 ms. A sequence of five stimuli was presented in each trial. The entire stimulus set was presented in random order without replacement five times, except in one case where only three repetitions were completed.
Position test
Some cells were tested with a subset of the stimuli at five positions: at the RF center, and offset to the right, left, top, and bottom. The offsets were 0.175 times the average RF diameter, so that the total span in the horizontal and vertical directions was 0.35 times the average RF diameter. The selected stimuli included the contour feature evoking the strongest response and at least one other contour feature that contained the same component orientations (or a similar range of orientations for smooth curves) but elicited a weak response (see Fig. 12 for examples).
Data analysis
Response rates were calculated by counting spike occurrences within a 500-ms window beginning at stimulus onset. Background rate was derived in a similar way from null stimulus periods interspersed randomly among stimulus presentations in all tests. Background rates were typically low (average = 1.9 spikes/s), and analyses with and without background subtraction yielded similar results. The results presented here are based on subtraction of average background rate from the response rates for each individual repetition of each stimulus.
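The rate calculation can be sketched as follows. The half-open window boundaries and the function names are assumptions made for illustration; the text specifies only the 500-ms window starting at stimulus onset and the subtraction of the average background rate.

```python
# Illustrative sketch only: spike times are in seconds; names are hypothetical.

def response_rate(spike_times_s, onset_s, window_s=0.5):
    """Spikes/s within [onset, onset + window) for one stimulus presentation."""
    n_spikes = sum(1 for t in spike_times_s if onset_s <= t < onset_s + window_s)
    return n_spikes / window_s


def subtract_background(rates, background_rates):
    """Subtract the average null-period rate from each repetition's rate."""
    background = sum(background_rates) / len(background_rates)
    return [rate - background for rate in rates]


# Two spikes fall inside the 500-ms window, giving 4 spikes/s.
example_rate = response_rate([0.1, 0.2, 0.6, 1.2], onset_s=0.0)
example_corrected = subtract_background([4.0, 10.0], [2.0, 2.0])
```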
We used quantitative indices of tuning strength and breadth to assess what kind of information, if any, cells might convey about contour features. We first averaged responses to each stimulus across repetitions to get a 3 (convexity) × 2 (curvature) × 3 (acuteness) × 8 (contour feature orientation) response function. (The 180° acuteness single edge and line stimuli were excluded from this analysis so as not to confound tuning for angles and curves with tuning for edge orientation. Exclusion of the edge stimuli made little difference; see RESULTS.) We next applied a peak-finding algorithm that identified compact, contiguous regions of the four-dimensional response function in which all stimuli evoked responses greater than half the maximum response (>HM). The region with the largest summed >HM response was designated as the primary peak. (The >HM sum was based on just the portions of the response rates above the half-maximum cutoff.)
Primary peak strength was defined as the primary peak's >HM sum divided by total >HM responses across all stimuli. A cell with a single large peak would have a primary peak strength of 1.0, whereas a cell with many separate small peaks would have a primary peak strength closer to 0. In contrast to some measures of tuning strength, like those based on the difference between maximum and minimum values, primary peak strength reflects specifically unimodal tuning. This was important for our data because multimodal tuning was likely to represent sensitivity to other dimensions such as edge orientation. Primary peak size was defined as the fraction of stimuli (of a total of 144, excluding single straight edges and lines) included within the >HM primary peak. This index is analogous to peak width at half height in a one-dimensional response function.
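The peak analysis can be illustrated with a simple flood-fill sketch. The text does not spell out the adjacency rule, so the sketch assumes that two stimuli are neighbors when they differ by one step along a single dimension, with contour feature orientation treated as circular; the "compact" criterion is not modeled. All names are hypothetical.

```python
# Illustrative sketch only: the adjacency rule and wraparound are assumptions.
from itertools import product

SHAPE = (3, 2, 3, 8)                      # convexity, curvature, acuteness, orientation
CIRCULAR = (False, False, False, True)    # only orientation wraps around


def neighbors(idx):
    """Yield indices one step away from idx along a single dimension."""
    for dim, size in enumerate(SHAPE):
        for step in (-1, 1):
            j = idx[dim] + step
            if CIRCULAR[dim]:
                j %= size
            elif not 0 <= j < size:
                continue
            yield idx[:dim] + (j,) + idx[dim + 1:]


def primary_peak(resp):
    """resp: dict mapping 4-D index -> mean response. Returns (region, strength, size)."""
    half_max = max(resp.values()) / 2.0
    # Keep only the portion of each response above the half-maximum cutoff.
    above = {idx: r - half_max for idx, r in resp.items() if r > half_max}
    if not above:
        raise ValueError("no responses above half-maximum")
    seen, best = set(), []
    for start in above:
        if start in seen:
            continue
        region, stack = [], [start]
        seen.add(start)
        while stack:                       # flood fill one contiguous region
            current = stack.pop()
            region.append(current)
            for nb in neighbors(current):
                if nb in above and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        if sum(above[i] for i in region) > sum(above[i] for i in best):
            best = region
    strength = sum(above[i] for i in best) / sum(above.values())
    size = len(best) / len(resp)
    return best, strength, size


# Toy response function: one two-stimulus peak (joined across the orientation
# wraparound) and one isolated secondary peak.
toy = {idx: 0.0 for idx in product(*(range(n) for n in SHAPE))}
toy[(0, 0, 0, 0)] = 10.0
toy[(0, 0, 0, 7)] = 8.0
toy[(2, 1, 2, 4)] = 6.0
toy_region, toy_strength, toy_size = primary_peak(toy)
```

In the toy case the half-maximum cutoff is 5, so the above-cutoff portions are 5, 3, and 1; the primary peak holds 8 of the 9 units (strength ≈ 0.89) and covers 2 of 144 stimuli (size ≈ 0.014).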
For those cells with clear overall tuning (based on high primary peak strength and low primary peak size values), we also characterized responses in each of the four stimulus dimensions separately. For each dimension, we generated a one-dimensional response function by summing across the other three dimensions. (The sums included all responses to individual stimulus repetitions that exceeded background, with no thresholding at half-maximum as in the peak determination.) The summed values were normalized by dividing by their average, rather than their maximum, so that response variation could be compared visually in terms of peak values (stronger tuning corresponds to higher peaks). A similar procedure was used to generate edge orientation tuning functions (for all cells), collapsing across the three convexity values to get four summed values that were again normalized by dividing by their average.
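A one-dimensional tuning function of this kind might be computed as in the sketch below. It assumes the input holds background-subtracted rates for individual repetitions and interprets "exceeded background" as a rate greater than zero after subtraction; the function name is hypothetical.

```python
# Illustrative sketch only: collapse a multi-dimensional response set onto one dimension.

def marginal_tuning(reps, dim, n_levels):
    """reps: dict mapping index tuple -> list of background-subtracted rates
    (one per repetition). Returns the tuning function along `dim`, normalized
    so that its average equals 1."""
    sums = [0.0] * n_levels
    for idx, rates in reps.items():
        # Only repetitions that exceeded background contribute to the sum.
        sums[idx[dim]] += sum(rate for rate in rates if rate > 0)
    mean = sum(sums) / n_levels
    if mean == 0:
        raise ValueError("no responses above background")
    return [s / mean for s in sums]


# Toy two-dimensional example with two levels per dimension.
toy_reps = {(0, 0): [2.0, -1.0], (0, 1): [1.0], (1, 0): [3.0], (1, 1): [0.0]}
tuning_dim0 = marginal_tuning(toy_reps, dim=0, n_levels=2)
tuning_dim1 = marginal_tuning(toy_reps, dim=1, n_levels=2)
```

Dividing by the average rather than the maximum means that a flat function sits at 1.0 everywhere, and sharper tuning shows up as a higher peak.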
Significance of response variation was measured with randomization ANOVA (Edgington and Bland 1993; Manly 1991). Randomization tests rely less on assumptions about sampling, and they can be used to test the significance of derived measures like tuning indices (Manly 1991; cf. Connor et al. 1997; Gallant et al. 1996). A main effect F ratio was calculated for the original data. Then the response rates for individual stimulus repetitions were randomly permuted across the dimension in question (but within the other 3 dimensions), and the test statistic was recalculated. This procedure was repeated 10,000 times to yield a distribution of values expected on the basis of the null hypothesis (that the dimension in question had no bearing on response rates). The level of significance (P) was the fraction of randomly generated values greater than or equal to the original value.
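The randomization procedure can be sketched as below. As a simplifying assumption, the test statistic here is a one-way F ratio computed across the levels of the dimension in question (pooling over the other dimensions), which may differ from the exact main-effect F ratio used in the original analysis. The permutation step follows the description in the text: repetition values are shuffled across levels of the tested dimension but stay within each combination of the other dimensions. All names are hypothetical.

```python
# Illustrative sketch only: one-way F ratio used as a stand-in test statistic.
import random
from collections import defaultdict


def f_ratio(data, dim):
    """data: dict mapping index tuple -> list of repetition rates."""
    groups = defaultdict(list)
    for idx, reps in data.items():
        groups[idx[dim]].extend(reps)
    values = [v for g in groups.values() for v in g]
    grand_mean = sum(values) / len(values)
    means = {level: sum(g) / len(g) for level, g in groups.items()}
    ss_between = sum(len(g) * (means[lv] - grand_mean) ** 2 for lv, g in groups.items())
    ss_within = sum((v - means[lv]) ** 2 for lv, g in groups.items() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def permute_within_strata(data, dim, rng):
    """Shuffle repetition values across `dim`, within the other dimensions."""
    strata = defaultdict(list)
    for idx in data:
        strata[idx[:dim] + idx[dim + 1:]].append(idx)
    shuffled = {}
    for cells in strata.values():
        pooled = [v for idx in cells for v in data[idx]]
        rng.shuffle(pooled)
        position = 0
        for idx in cells:
            n = len(data[idx])
            shuffled[idx] = pooled[position:position + n]
            position += n
    return shuffled


def randomization_p(data, dim, n_perm=10000, seed=0):
    """Fraction of permuted F ratios greater than or equal to the observed one."""
    rng = random.Random(seed)
    observed = f_ratio(data, dim)
    hits = sum(
        f_ratio(permute_within_strata(data, dim, rng), dim) >= observed
        for _ in range(n_perm)
    )
    return hits / n_perm
```

With 10,000 permutations, as in the text, the smallest nonzero significance level that can be resolved is 0.0001.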
Randomization was also used to test whether contour feature orientation tuning functions were consistent across different values of other dimensions, e.g., acuteness. This was done by first calculating the correlation coefficient between the two contour feature orientation functions in question. Then the pairing of values between the two functions was randomly permuted and the correlation coefficient recalculated 10,000 times. This procedure yielded a distribution of values expected on the basis of the null hypothesis (that there was no underlying correlation between the 2 functions). The level of significance was the fraction of randomized correlation values greater than or equal to the original value.
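A minimal sketch of this correlation test is given below, assuming Pearson's correlation coefficient and a one-sided comparison as described in the text; the function names are hypothetical.

```python
# Illustrative sketch only: permutation test for correlation between two tuning functions.
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant function")
    return cov / math.sqrt(var_x * var_y)


def correlation_p(x, y, n_perm=10000, seed=0):
    """Return (observed r, fraction of permuted r values >= observed r)."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # randomly re-pair the two functions
        if pearson(x, shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm
```

Here `x` and `y` would be the eight-point contour feature orientation functions measured at, for example, two different acuteness values.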
Another statistical question was whether the distribution of contour
feature orientation tuning peaks was significantly nonuniform. This was
assessed with a Monte Carlo version of Kuiper's test, which is a
circular Kolmogorov-type analysis (Mardia 1972). The Kuiper's test statistic is the sum of the maximum positive and negative deviations of the observed cumulative distribution function from the hypothetical function (which in our case was the uniform distribution function). In each Monte Carlo simulation, a random function (with equivalent number of observations and discretization) was generated (under the assumption of uniformity), and the Kuiper's statistic was calculated. This produced a distribution of values expected on the basis of the null hypothesis (that the underlying distribution was uniform). The level of significance was the fraction of 10⁶ randomly generated Kuiper's values greater than or equal to the observed value.
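The Monte Carlo version of Kuiper's test can be sketched for binned data as follows. The sketch assumes that peak orientations are tallied in equally spaced bins and that the cumulative functions are compared at the bin boundaries; these details, the function names, and the small default number of simulations (the text uses 10⁶) are assumptions for illustration.

```python
# Illustrative sketch only: Kuiper's statistic for counts in equally spaced circular bins.
import random


def kuiper_statistic(counts):
    """Sum of the largest positive and negative deviations of the observed
    cumulative distribution from the uniform cumulative distribution."""
    n, k = sum(counts), len(counts)
    d_plus = d_minus = 0.0
    cumulative = 0
    for i, count in enumerate(counts, start=1):
        cumulative += count
        deviation = cumulative / n - i / k
        d_plus = max(d_plus, deviation)
        d_minus = max(d_minus, -deviation)
    return d_plus + d_minus


def kuiper_p(counts, n_sim=10000, seed=0):
    """Return (observed statistic, Monte Carlo significance level)."""
    rng = random.Random(seed)
    n, k = sum(counts), len(counts)
    observed = kuiper_statistic(counts)
    hits = 0
    for _ in range(n_sim):
        simulated = [0] * k
        for _ in range(n):             # same number of observations, same bins
            simulated[rng.randrange(k)] += 1
        # Small tolerance guards against floating-point ties.
        if kuiper_statistic(simulated) >= observed - 1e-12:
            hits += 1
    return observed, hits / n_sim
```

Because the statistic sums the largest deviations in both directions, it does not depend on where the circular dimension is cut, which is why it suits contour feature orientation.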
RESULTS
Contour feature tuning
We used the stimulus set shown in Fig. 2A to test the
responses of 152 V4 neurons with RF eccentricities ranging from 0.1 to
7.8°. Isolated contour features were generally more effective than
edges or bars in driving V4 responses. For the large majority of cells
(138/152 or 91%), the most effective stimulus in our test was a
contour feature rather than a straight edge or line. On average, the
strongest edge/line response was only about half the strongest contour
feature response (average ratio 0.56). This must at least partially
reflect the high degree of endstopping in V4 reported previously
(Desimone and Schein 1987). But within a subsample of 61 cells tested with bar stimuli of length equal to half the estimated RF
diameter the majority still exhibited stronger responses to contour
features. These cells were tested with bars of optimum color and width
at 12 orientations (15° intervals) in a preliminary characterization
of orientation tuning (see METHODS). Other cells were not
tested in this way either because preliminary handplotting and
color/width tests disclosed little or no response to bars, handplotting
indicated an absence of bar orientation tuning, or the remaining
recording time was too short for extensive preliminary tests. Even in
this subsample, which was to some degree preselected for stronger bar
responses, 74% of the cells (45/61) had a higher maximum response in
the contour feature test than in the oriented bar test.
Many cells exhibited clear, unimodal tuning for a particular range of contour features. An example is shown in Fig. 2B. In this plot, average firing rate based on five stimulus repetitions is represented by the background gray level surrounding each stimulus icon. Response rates range from 0 (light gray) to 42 ± 2.3 (SE) spikes/s (black). This cell responded best to convex features oriented in the 135-180° range. Responses were stronger for sharp (vs. smooth) and acute (vs. obtuse) features. The results cannot be explained in terms of standard orientation tuning for individual edges, since many of the least effective stimuli (including the straight edges) contain the same edge orientations as the most effective stimuli. Another example of contour feature tuning is shown in Fig. 2C. This cell responded best to convex and outline smooth curve features oriented in the 315-0° range.
A contrasting result is presented in Fig. 2D. The response pattern for this cell reflects standard orientation tuning. The cell responded to a variety of sharp angle and smooth curve outline stimuli (middle) containing edges oriented near 75°. There is no clear single peak as in the other examples, and thus no indication of contour feature tuning.
To quantify contour feature tuning, we determined for each cell the primary peak in the 3 (convexity) × 2 (curvature) × 3 (acuteness, excluding the 180° single straight edges and lines) × 8 (contour feature orientation) stimulus space. A peak was defined as a contiguous set of stimuli evoking responses greater than half-maximum (see METHODS). In Fig. 2, B-D, the stimuli falling within the primary peak are indicated by asterisks. The peaks appear discontinuous because two of the dimensions (curvature and acuteness) are plotted recursively. The primary peaks were characterized by two indices, primary peak strength and primary peak size. Primary peak strength represents the fraction of response strength above the half-maximum level contained within the primary peak (see METHODS). Primary peak strength was high for the cells exhibiting contour feature tuning in Fig. 2, B (1.0) and C (0.83), but low for the cell exhibiting standard orientation tuning in Fig. 2D (0.37). Primary peak size represents the fraction of stimulus space covered by the primary peak (analogous to width at half-height for a 1-dimensional tuning function). Primary peak sizes for the cells in Fig. 2, B-D, were 0.028, 0.042, and 0.014, respectively.
Figure 3 shows the primary peak index
values for the entire sample of 152 V4 cells. Each cell is represented
by a dot. Primary peak size is indicated by position with respect to
the x axis and primary peak strength by position with
respect to the y axis. Cells with strong, focused tuning
peaks in contour feature space (as in Fig. 2, B and
C) fall near the upper left. At the extreme upper left,
several cells with primary peak strength = 1.0 and primary peak
size corresponding to just a few stimuli are superimposed. One of
these narrowly tuned cells is shown in Fig. 4. Cells with multiple small peaks and no
apparent contour feature tuning (e.g., Fig. 2D) fall
near the bottom of the plot. The shaded box marks the
region of cells chosen for more detailed analysis (see following text).
The cutoffs (primary peak strength = 0.7 and primary peak
size = 0.15) are necessarily arbitrary, since the distribution is
continuous, but the selected cells all had a single predominant and
relatively focused tuning peak in contour feature space. The subsample
comprises 50 cells (33% of the entire sample; 2 additional cells
falling within the specified range were excluded because they responded
best to a single straight edge or line). For these cells, on average,
the strongest edge/line response was only about one-tenth of the
strongest contour feature response (average ratio = 0.11). The
peak analysis presented here excluded the edge/line stimuli so as not
to confound contour feature tuning with edge orientation tuning, but
inclusion of the edge/line stimuli made little difference: No further
cells appeared in the shaded region, and four cells that were in the
shaded region fell slightly below the primary peak strength cutoff of
0.7.
Contour feature orientation
Contour feature orientation tuning functions for individual cells are shown in Fig. 5. The tuning functions were derived by summing across the other three dimensions and normalizing. Contour feature orientation is plotted in the circular dimension and normalized response rate is plotted in the radial dimension. The inner ring in each plot corresponds to the normalized average response; successively larger rings correspond to twice the average and three times average. Where necessary to avoid truncation, the scale was compressed so as to include four or five times average. The plots are arranged in rows according to which contour feature orientation produced the strongest responses. The distribution of contour feature orientation peaks is uneven but not significantly different from a uniform distribution (Kuiper's test, P = 0.44). Shading denotes significant (P < 0.05) tuning based on randomization ANOVA. Tuning in this dimension was significant in all but one case, and response variance was higher than in any other dimension for most cells (38/50; 76%).
Contour feature orientation tuning was typically consistent across other dimensions. For example, the cell in Fig. 6 responded well to features oriented at 30° (and to a lesser extent 75°) across all three acuteness values and all three convexity values (see also Fig. 2, B and C). Analysis showed that this cell's contour feature orientation tuning functions were significantly correlated across all pairings of acuteness values, two pairings of convexity (convex/outline and concave/outline) and across curvature. The response patterns for most cells showed significant correlations across at least two values of acuteness (38/50; 76%), convexity (37/50; 74%), and curvature (36/50; 72%). This consistency argues against explanations of contour feature tuning in terms of lower level factors like contrast direction, spatial frequency and component edge orientation, since these factors change across acuteness and convexity but cells continue to respond specifically to contour features pointing in a particular direction (see DISCUSSION).
Convexity
Convexity response functions for individual cells are shown in Fig. 7 (for the definition of convexity, see Fig. 1 and METHODS). In each graph, the horizontal axis represents the three convexity values (in the arbitrary order convex/outline/concave) and the vertical axis represents normalized response summed across the other three dimensions. Cells with significant (P < 0.05) response variation in the convexity dimension are plotted in Fig. 7, bottom row; nonsignificant cases are plotted in Fig. 7, top row. Response variation across convexity was significant in all but two cases. Cells are plotted in Fig. 7, left, middle, and right columns, according to whether they responded best to convex, outline, or concave features. Most cells responded best to either convex (23/50) or outline (21/50) features. This bias against concave features is exemplified by the cell in Fig. 8, which responded well to convex and outline features oriented at 15° but not at all to the corresponding concave features (see also Fig. 2, B and C). The convexity bias is interesting in light of psychological studies showing that convex features are more perceptually significant than concave features (see DISCUSSION).
Acuteness
Acuteness tuning functions for individual cells are shown in Fig. 9. In each graph, the horizontal axis represents acuteness, and the vertical axis represents normalized response summed across the other three dimensions. Cells with significant (P < 0.05) acuteness tuning are plotted in Fig. 9, bottom row; nonsignificant cases are plotted in Fig. 9, top row. Tuning was significant for 72% (36/50) of the cells in our sample. Cells are plotted left, middle, and right columns according to whether they responded best to the 45, 90, or 135° acuteness levels. The majority of cells responded best to 45° features. However, this acuteness bias was much less pronounced than the convexity bias described above, in that the average response variance associated with acuteness was approximately one-fifth that associated with convexity (compare the steepness of the functions in Figs. 7 and 9).
Curvature
Curvature results for individual cells are shown in Fig. 10. In each graph, the horizontal axis represents sharp angle versus smooth B-spline curve stimuli, and the vertical axis represents normalized response summed across the other three dimensions. Cells with significant differences between sharp and smooth stimuli are plotted in Fig. 10, bottom row. Cells responding better to sharp stimuli are plotted in left column, and cells responding better to smooth stimuli are plotted in right column. Curvature was the dimension of least influence in our data; only 56% of cells (28/50) exhibited significantly different responses to sharp and smooth stimuli, and the average response variance associated with curvature was lower than for any other dimension. Thus many cells responded in a similar fashion to both sharp angles and their B-spline curve counterparts, as can be seen to some extent in Figs. 2B, 4, and 8. On the other hand, the responses of some cells were clearly biased toward either sharp or smooth stimuli (e.g., Fig. 2C). These results do not necessarily imply anything about curvature representation in general. They only serve to contrast responses to angles with sharp corners and angles smoothed using the specific B-spline procedure shown in Fig. 1.
Edge orientation
Standard orientation tuning was measured by analyzing responses to the single edge/line (180° acuteness) stimuli. Normalized tuning functions were created by collapsing across convexity to yield four values. (As discussed in METHODS, when preliminary tests revealed an orientation tuning peak for bar stimuli the stimulus set was rotated so that one of the four edge orientations coincided with that peak.) ANOVA indicated significant edge/line orientation tuning in 57% of the entire sample (87/152) and 46% of the subsample from Fig. 3 (23/50). These percentages would presumably be higher if the cells had been tested with optimum bar stimuli rather than continuous edges and lines.
Since edge orientation is a standard tuning dimension, it provides a
useful comparison with tuning for contour feature-related dimensions.
The most relevant comparison is with contour feature orientation, which
was the dimension of strongest tuning and is the most analogous to edge
orientation. In Fig. 11, orientation tuning for edges and contour features is compared in terms of the
differences between maximum and minimum values in the respective tuning
functions. In Fig. 11A, the tuning index for both edges and
contour features is (maximum - minimum)/(maximum). (Thus a value
of 0.75 indicates that the largest response difference was 75% of the
maximum value in the tuning function.) Each cell is plotted with
respect to its edge orientation index on the x axis and its
contour feature orientation index on the y axis. Different
symbols indicate cells with strong contour feature tuning
from the subsample in Fig. 3, cells that showed significant edge
orientation tuning, and the intersection of these two groups
(i.e., cells from the Fig. 3 subsample that also showed
significant edge orientation tuning). Cells that belonged to neither
group are not shown. Tuning strength in the two domains is roughly
comparable, with the majority of cells showing index values >0.5. Edge
orientation tuning is stronger than contour feature orientation tuning
for the majority of cells overall (as indicated by the preponderance of
cells to the right of the dashed line), though not for the majority
of cells in the Fig. 3 subsample.
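The tuning index used in Fig. 11A, (maximum - minimum)/(maximum), can be illustrated with a minimal Python sketch. The function name `tuning_index` and the example values are hypothetical and chosen only for illustration; the sketch assumes a tuning function of non-negative normalized responses and is not the analysis code used in the study.

```python
from typing import Sequence


def tuning_index(tuning_function: Sequence[float]) -> float:
    """Return (maximum - minimum) / maximum for a tuning function.

    Assumes non-negative values (e.g., normalized responses) with a
    positive maximum; both assumptions are made here for illustration.
    """
    peak = max(tuning_function)
    trough = min(tuning_function)
    if trough < 0 or peak <= 0:
        raise ValueError("expected non-negative responses with a positive maximum")
    return (peak - trough) / peak


# Hypothetical four-point edge orientation tuning function (made-up values).
example_edge_tuning = [1.0, 0.6, 0.25, 0.5]
example_index = tuning_index(example_edge_tuning)
print(example_index)
```

For these made-up values the index is 0.75, i.e., the largest response difference is 75% of the maximum value in the tuning function. Because each function is normalized by its own maximum, the index says nothing about absolute response rates.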
The analysis in Fig. 11A ignores absolute differences
between edge and contour feature tuning, since the two functions are normalized separately. For the Fig. 3 subsample cells, which had much
lower responses to edge stimuli, this means that high edge index values
may actually reflect relatively low response rate differences. This
would explain why some cells without significant edge orientation
tuning still have edge index values around 1.0. To provide a more
direct comparison, in Fig. 11B the index values are scaled
according to the maximum response to an individual stimulus within the
relevant category (contour features for the contour feature orientation
index, edges/lines for the edge orientation index; see legend for
details). This greatly reduces the edge orientation index for many
cells, especially those from the Fig. 3 subsample. Thus edge
orientation tuning appears strong when the edge/line stimuli are
considered separately, but less striking when considered relative to
the typically higher responses to contour features. Orientation tuning
for bars would probably be much stronger in an absolute sense, but bars
were not included in our stimulus set. The most that can be said from
the present results is that some cells show strong contour feature
orientation tuning, others show strong edge orientation tuning, and
tuning strength for these two groups in their respective domains is
roughly comparable. By extension, tuning for convexity, acuteness and curvature may be somewhat weaker than standard orientation tuning.
Position
A critical question in studies of shape representation is whether apparent tuning for complex stimuli actually depends on changes in the position of simpler components. In our study, for example, a particular contour feature might evoke a stronger response simply because it included an edge close to a particularly responsive region in the RF. This seems unlikely to explain the data presented in the preceding text because in every case the component edges in the optimum stimuli appeared in other stimuli at nearly the same positions but failed to evoke strong responses. As a further control, however, we tested 24 cells of the 50 in the Fig. 3 subsample with a selected subset of the contour feature stimuli presented at multiple positions. In each case, the selected stimuli included one optimum contour feature and at least one other contour feature that contained the same edge orientations but failed to strongly activate the cell. Example results for three cells are shown in Fig. 12. In each plot, the stimuli are shown at left, with the optimum feature at the top followed by three other features containing the same component orientations (or ranges of orientations in the smooth curve case). The other columns show the responses to these stimuli when presented at the center of the RF and offset to the right, top, left, and bottom. (In A and C, the 2nd row stimuli evoked moderate responses because they fell within the flanking regions of the contour feature orientation peaks for the cells.) The separation between the right/left and top/bottom positions was 0.35 times the average RF diameter (see METHODS). Larger displacements were found to drastically reduce responses overall, rendering the test less meaningful. Even in the examples shown in Fig. 12, some displacements produced lower responses to the optimum stimulus. 
But the important point is that there were no positions at which a previously ineffective stimulus evoked responses comparable to those evoked by the optimum stimulus at the center position.
Results for all 24 cells are presented in Fig. 13. In each plot, normalized
responses are shown for the optimum stimulus and for a nonoptimum stimulus
containing the same edge orientations. Responses at the five different positions are
represented by bar graphs at corresponding locations. In some cases,
displacements from the center position strongly increased or decreased
the response to the optimum stimulus, but the maximum response (across
positions) to the nonoptimum stimulus never equaled the maximum
response to the optimum stimulus.
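The criterion applied in this position control can be stated compactly. The following Python sketch is an illustration only: the function name `passes_position_control`, the position labels, and the response values are all hypothetical and are not taken from the recorded data. It checks whether the best response to the nonoptimum stimulus, across the five tested positions, stays below the best response to the optimum stimulus.

```python
from typing import Mapping

# The five tested positions: RF center plus four offsets (labels are assumed).
POSITIONS = ("center", "right", "top", "left", "bottom")


def passes_position_control(
    optimum: Mapping[str, float], nonoptimum: Mapping[str, float]
) -> bool:
    """True if the nonoptimum stimulus never matches the optimum stimulus.

    Compares the maximum normalized response across positions for each
    stimulus, mirroring the comparison described for Fig. 13.
    """
    best_optimum = max(optimum[p] for p in POSITIONS)
    best_nonoptimum = max(nonoptimum[p] for p in POSITIONS)
    return best_nonoptimum < best_optimum


# Made-up normalized responses for one hypothetical cell.
example_optimum = {"center": 1.0, "right": 0.7, "top": 0.4, "left": 0.9, "bottom": 0.5}
example_nonoptimum = {"center": 0.2, "right": 0.3, "top": 0.1, "left": 0.25, "bottom": 0.15}
print(passes_position_control(example_optimum, example_nonoptimum))
```

A cell whose apparent contour feature tuning merely reflected the position of a component edge would be expected to fail this check at some displacement, because the nonoptimum stimulus would then place that edge over the responsive region.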
DISCUSSION |
---|
We have shown that many cells in area V4 exhibit systematic tuning for contour features, i.e., angles and curves. There is no simple explanation for contour feature tuning in terms of lower-level factors such as edge orientation, spatial frequency, and contrast direction. The dimensions of greatest response variation are contour feature orientation (the direction in which angles or curves are pointed) and convexity (whether the angle/curve is rendered as a convex projection, an outline, or a concave indentation). There is a strong bias toward convex (and outline) features and against concave features, consistent with psychological findings (see following text). Altogether, the results suggest that contour features are extracted as intermediate level shape primitives, as a step toward complex shape recognition.
Lower-level factors
It is important to consider whether apparent tuning for contour features might simply reflect standard tuning for lower-level factors such as edge orientation, spatial frequency, and contrast direction. Standard tuning for edge orientation fails to explain the present data on several grounds. In every case of contour feature tuning, the component edge orientations contained in the optimum features also appeared in many other stimuli that failed to evoke strong responses (e.g., Fig. 2B); tuning for contour feature orientation typically remained consistent across acuteness despite changes in component edge orientations (e.g., Fig. 6); and almost all cells responded better to contour features than to any individual edges or lines. Spatial frequency tuning fails to explain the data for similar reasons: similar spatial frequencies appeared in optimum and nonoptimum stimuli, and tuning for contour feature orientation typically remained consistent across acuteness and convexity despite substantial changes in spatial frequency content (particularly between the outline and convex/concave features; see Fig. 6). Selectivity for color/luminance contrast direction is likewise inadequate to explain the response patterns, again because the same contrast edges are shared by both optimum and nonoptimum stimuli and tuning is consistent across different convexity values despite changes in contrast direction. Finally, the response patterns cannot be explained by differential surround stimulation. Although it is true that surround stimulation varied with stimulus type, tuning for contour feature orientation remained consistent across convexity despite the associated changes in surround stimulation (e.g., Figs. 2, 6, and 8).
Endstopping is another standard response characteristic that might be
invoked to explain the present results. In fact, it has been proposed
that the ultimate function of endstopping is to derive information
about contour features (Hubel and Livingstone 1987;
Hubel and Wiesel 1965), and our findings support that
hypothesis. Endstopping by itself does not predict the contour feature
tuning patterns described here, since the same endstopped edges or
lines were typically contained in both optimum and nonoptimum stimuli (see, e.g., the outline stimuli in Fig. 8). However, contour feature tuning can be explained specifically in terms of
combinations of endstopped orientation signals (and other
lower-level information). For example, the tuning pattern in Fig. 8
could be explained as activation by the combination of an edge oriented
near 70° (counterclockwise from horizontal) and endstopped at the top
plus an edge oriented near 160° and endstopped at the right, with a
preference for a specific contrast direction (brighter toward the left
and darker toward the right). (Weak activation by individual component
orientations is apparent.) The end result would be a signal related to
the presence of a sharp corner pointing to the right. Theorists have proposed that contour feature information is derived by combining endstopped orientation signals precisely in this manner (Hummel and Biederman 1992; Milner 1974).
A simpler, related explanation might be that cells are tuned for a single edge/bar orientation, again with endstopping in just one direction. This could explain contour feature orientation tuning for acute (45°) angles because acute angles contain two closely apposed edges of similar orientation and have a relatively narrow width, so that responses might reflect tuning for either edges or bars at a nearby orientation. However, this mechanism would not explain why contour feature orientation tuning remains consistent for the 90 or 135° angles (as it does for the majority of tuned cells; see RESULTS and Figs. 2, 4, 6, and 8). These more obtuse contour features contain dissimilar edge orientations that substantially overlap with edge orientations contained by contour features pointing in other (nonoptimum) directions. Moreover, they have no real "bar" orientation, i.e., no oriented section of relatively narrow, relatively constant width. Thus, consistency of contour feature orientation tuning across different levels of acuteness implies a slightly more complex mechanism.
Convexity
Our data reveal a strong response bias toward convex features and against concave features. Convex features are defined here as angles and curves in which the figure projects into the background (see Fig. 1). More precisely, the region inside the angle (or curve) is continuous with the smaller image region, i.e., the figure (which in our experiment was filled with the optimum color for the cell), whereas the region outside the angle is continuous with the large homogeneous field that covers the rest of the screen, i.e., the ground (which in our experiment was dark gray).
Because the figure was always rendered in the optimum color against a dark gray background, we do not know which cue (figure/ground organization or color contrast direction) was critical for the convexity bias. The two alternatives are illustrated in Fig. 14, for the case where white is the optimum color. The stimuli observed or predicted to evoke the strongest responses are indicated by checks. One possibility (top row) is that cells responded because the figure was convex (i.e., occupied the interior of the angle). In this case, even if contrast direction were reversed (C and D), cells would continue to show a bias toward convex figures, responding better to C (assuming they respond at all under the contrast reversed condition). The other possibility (bottom row) is that cells responded because the optimum color (white) was convex (occupied the angle interior). In this case, if contrast direction were reversed, cells might respond best to D, where an angle of the white background projects into the dark concave figure. Note, however, that the cells in question (those responding to white) would then be representing the background not the figure. Cells representing the figure (those responding to dark gray in conditions C and D) should always display a bias toward convex stimuli.
Whichever cue is critical for driving cells, a likely mechanism to
explain the convexity bias is surround inhibition. The fadeout portions
of the concave stimuli occupy more of the RF surround (see Fig. 1),
which is known to be silently inhibitory in area V4 (Desimone et
al. 1985). The situation would be similar for concave features
within any real life object large enough to exceed V4 RF borders.
Concave features are by their nature surrounded by other portions of
the object and thus more subject to surround inhibition. The V4
response bias against concave features might be less pronounced for
smaller shapes that fit within V4 classical RFs.
The neurophysiological convexity bias that we found in V4 parallels
psychological results showing that convex features are perceptually
dominant. Human observers favor figure/ground interpretations that
emphasize convex projections over concave indentations (Kanizsa and Gerbino 1976). Convex features are more determinative than concave features in judgments of shape similarity
(Subirana-Vilanova and Richards 1996). These results are
consistent with our finding that convex features are more strongly
represented in visual cortex. Hoffman and Richards (1984)
predicted on theoretical grounds that segmentation of
complex objects into parts (for the purpose of shape recognition)
should occur along boundaries of maximum concavity, producing convex
parts. This "curvature minima" rule is supported by psychophysical
results showing that human observers are more likely to recognize parts
from a previously viewed object if they are convex, i.e., segmented at
points of concavity (Braunstein et al. 1989). Our
results suggest that the Hoffman and Richards minima rule is
instantiated in the neural circuitry of the ventral visual pathway.
Implications for cortical shape processing
Most current theories of shape processing are based on the idea of
feature extraction, i.e., the identification of object parts. There are
alternatives to feature extraction, including template matching and
Fourier decomposition, but feature- or part-based mechanisms are better
adapted to the real-world difficulties of three-dimensional viewpoint
transformations, partial occlusion and plastic deformation
(Hoffman and Richards 1984). Moreover, physiological
results in higher level extrastriate cortex (e.g., Desimone et al. 1984;
Gross et al. 1972; Perrett et al. 1982; Tanaka et al. 1991) seem more
compatible with feature-based theories. The simplistic notion of
all-or-nothing feature detectors arranged in a hierarchy leading up to
"grandmother cells" has been justly criticized, but more reasonable
models can be constructed on the basis of broadly tuned feature
filters, feeding into higher-level units that are themselves broadly
tuned and represent complex shapes through population coding
(Barlow 1972; Poggio 1990).
A key question for feature-based theories is the nature of the
elementary features or shape primitives on which recognition is based.
The first-level feature in most models is local edge orientation, a
choice dictated by the physiology of early stages in visual cortex
(Baizer et al. 1977; Burkhalter and Van Essen 1986; Hubel and Livingstone 1987;
Hubel and Wiesel 1959, 1965, 1968). The choice of intermediate-level
features is not so constrained by physiology. Two general types of
intermediate features have been considered, one relating to object
boundaries and the other to solid volumes. Boundary-related features
include two-dimensional angles and curves of the type studied here and
homologous three-dimensional surface features (sharp corners, curved
surface patches, and indentations). Solid or volumetric primitives
(also referred to as generalized cones or geons) are defined by the
orientation and shape of their medial axes along with various
cross-section attributes. Some theories postulate a progression from
local orientation to contour features to volumetric primitives to
complete shape descriptions (aggregates of volumetric primitives), with
each stage based on inputs from the preceding stage (e.g.,
Biederman 1987; Dickinson et al. 1992).
According to other models, final shape representations could be based
directly on contour features (Poggio and Edelman 1990).
A third scheme involves direct progression from local orientation to
volumetric primitives, with no intermediate description in terms of
contour features (e.g., Marr and Nishihara 1978).
Distinguishing between these theoretical alternatives requires
physiological data about what kinds of shape information are represented at various stages in the ventral cortical pathway. Our
finding of systematic tuning for contour features in area V4 argues for
the importance of boundary-related primitives and reinforces previous
evidence for extraction of angles and curves at earlier levels. Most
previous studies have focused on angles and curves pointing in the two
directions orthogonal to the optimum bar orientation (i.e., stimuli
that represent deformations of the optimum bar stimulus). These studies
have shown that endstopped cells in cat area 17 respond well to small
radius curves (Dobbins et al. 1987; Heggelund and
Hohmann 1975; Versavel et al. 1990), as
predicted by Hubel and Wiesel (1965). Moreover, some
cells respond better to curves pointing in one direction or the other, which has been explained in terms of even- versus odd-symmetric RF
substructure (Dobbins et al. 1987). There is little
evidence for selective responses to angles, though Hammond and
Andrews (1978) showed that for a few cells in cat area 18 small
differences in bar orientation tuning in the two halves of the RF
resulted in better responses to obtuse angles than to any straight line stimulus, and Hubel and Wiesel (1965) provided examples
of endstopped cells responding to angles containing their optimum edge
orientation. Tuning for angles and curves in monkey area V2 has been
reported in abstract form (Hegde and Van Essen 1997).
Our findings extend this line of research by showing systematic tuning
for contour feature-related dimensions, especially contour feature
orientation, at an intermediate level in the primate ventral pathway
(V4). In addition, our data are consistent with reports of V4
selectivity for complex stimuli that contain angle and curve elements
(Gallant et al. 1996; Kobatake and Tanaka 1994).
Our findings also relate to psychological studies suggesting the
existence of specialized mechanisms for angle and curve perception. Three different groups have recently shown that angle perception acuity
is higher than that predicted by line orientation acuity (Chen
and Levi 1996; Heeley and Buchanan-Smith 1996; Regan et al. 1996). Observers are highly sensitive to
the presence of curvature (Andrews et al. 1973; Wilson et al. 1997), and curvature appears to be a basic
feature for visual search (Treisman and Gormican 1988; Wolfe et al. 1992). Moreover, there is no transfer of
perceptual learning between curvature hyperacuity tasks and other
hyperacuity tasks like orientation and vernier discrimination
(Fahle 1997; Watt and Andrews 1982). On
the basis of these results, psychologists have postulated specific
neural mechanisms for detecting angles and curves. Our data provide
convergent evidence for the existence of such neural mechanisms.
Contour feature extraction would support an efficient and
flexible population code that could represent a variety of shapes with
a limited number of units. A triangle could be represented by the
activity of cells tuned for acute angles pointing in three specific
directions, and the same cells could participate in the representation
of any number of shapes containing similar acute angles. Contour
feature signals from intermediate areas like V4 could be combined at
subsequent processing stages to create selectivity for more complex
patterns, of the sort that has been observed in IT cortex. Our
demonstration of systematic tuning for contour features in V4 provides
preliminary evidence for such a mechanism. Further studies are under
way to investigate how contour feature tuning relates to complex shape
responses in V4 and higher levels in the ventral pathway
(Pasupathy and Connor 1998).
ACKNOWLEDGMENTS |
---|
Technical help was provided by S. Patterson and H. Dong. We thank A. J. Bastian, S. L. Brincat, D. A. Hinkle, S. S. Hsiao, K. O. Johnson, V. B. Mountcastle, G. F. Poggio, R. von der Heydt, and M. A. Steinmetz for comments on earlier versions of the manuscript.
This work was supported by the Lucille P. Markey Charitable Trust.
FOOTNOTES |
---|
Address for reprint requests: C. E. Connor, Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, 338 Krieger Hall, 3400 N. Charles St., Baltimore, MD 21218.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received 12 March 1999; accepted in final form 2 August 1999.
REFERENCES |
---|