Modeling of Auditory Spatial Receptive Fields With Spherical Approximation Functions

Rick L. Jenison1, 2, Richard A. Reale2, 3, Joseph E. Hind2, 3, and John F. Brugge2, 3

1 Department of Psychology, 2 Department of Physiology, and the 3 Waisman Center, University of Wisconsin, Madison, Wisconsin 53706

    ABSTRACT

Jenison, Rick L., Richard A. Reale, Joseph E. Hind, and John F. Brugge. Modeling of auditory spatial receptive fields with spherical approximation functions. J. Neurophysiol. 80: 2645-2656, 1998. A spherical approximation technique is presented that affords a mathematical characterization of a virtual space receptive field (VSRF) based on first-spike latency in the auditory cortex of cat. Parameterizing directional sensitivity in this fashion is much akin to the use of difference-of-Gaussian (DOG) functions for modeling neural responses in visual cortex. Artificial neural networks and approximation techniques typically have been applied to problems conforming to a multidimensional Cartesian input space. The problem with using classical planar Gaussians is that radial symmetry and consistency on the plane actually translate into directionally dependent distortion on spherical surfaces. An alternative set of spherical basis functions, the von Mises basis function (VMBF), is used to eliminate spherical approximation distortion. Unlike the Fourier transform or spherical harmonic expansions, the VMBFs are nonorthogonal, and hence require some form of gradient-descent search for optimal estimation of parameters in the modeling of the VSRF. The optimization equations required to solve this problem are presented. Three descriptive classes of VSRF (contralateral, frontal, and ipsilateral) approximations are investigated, together with an examination of the residual error after parameter optimization. The use of the analytic receptive field model in computational models of population coding of sound direction is discussed, together with the importance of quantifying receptive field gradients. Because spatial hearing is by its very nature three dimensional or, more precisely, two dimensional (directional) on the sphere, we find that spatial receptive field models are best developed on the sphere.

    INTRODUCTION

Ever since the pioneering work of Hubel and Wiesel (1962, 1977) on cortical functional architecture and visual receptive field organization, it has been known that the direction, size, and orientation of the visual stimulus are critical spatial variables characterizing neural response selectivities. The visual receptive field of a single neuron is defined by the retinal receptor area within which a stimulus affects the response of the cell. These retinal locations correspond to points on a hemisphere with its center located at the center of the entrance pupil. Generally hemispheric receptive field locations are mapped to two-dimensional directions in the external world employing a graticule with units of polar angle and eccentricity. These measurements of spatial visual acuity suggested a two-dimensional framework within which mathematical models of cortical receptive fields could be constructed from difference-of-Gaussians (DOGs) or Gabor functions (Daugman 1980; Marcelja 1980; Marr 1982; Rodieck 1965).

The auditory receptive field of a single neuron may be defined by the directions in the external world from which an auditory stimulus affects the response of the cell. These directions correspond to points on a sphere with its center located at the center of the head. Recently, we (Jenison and Fissell 1995, 1996) proposed a natural and straightforward approach to the problem of modeling spherical surfaces, and we describe its application to auditory spatial receptive fields. The method employs spherical, rather than Cartesian, approximation functions and is capable of modeling both scalar (e.g., response rate and latency) and vector (e.g., response pattern) single-cell receptive fields as well as population response metrics.

Virtual space receptive fields

Spatial sensitivity of cat auditory cortical neurons to free-field sound sources, measured as discharge rate, response latency, and response pattern, has been recognized for many years (Eisenman 1974; Imig et al. 1990; Middlebrooks 1998; Middlebrooks and Pettigrew 1981; Middlebrooks et al. 1994; Rajan et al. 1990a,b; Samson et al. 1993; Sovijarvi and Hyvarinen 1974). In those studies, the limitations of the delivery system often restricted the spatial resolution to tens of degrees and usually confined the sample to a single latitude or longitude. Recently, we employed a virtual-space paradigm (Chen et al. 1994, 1995; Reale et al. 1996) that alleviates these physical limitations to study the spatial receptive fields of single cells in the cat's primary auditory (AI) cortex (Brugge et al. 1994, 1996, 1997a,b). Receptive fields obtained under these conditions were designated as virtual space receptive fields (VSRFs), and their extents were measured typically with a resolution of <5° in both azimuth and elevation.

Under barbiturate anesthesia, AI neurons generally respond to any effective acoustic stimulus (virtual or otherwise) with one or a few action potentials (Brugge et al. 1969, 1994, 1996; Imig et al. 1990; Phillips and Irvine 1981; Rajan et al. 1990a). Although the response in terms of number of spikes is small, the probability of response to a directional stimulus is often systematically, albeit not uniquely, related to direction. The most common finding is that response strength is graded, sometimes nonmonotonically, along the azimuthal dimension (Brugge et al. 1997b; Imig et al. 1990; Rajan et al. 1990a,b; Reale et al. 1996; Samson et al. 1993). Furthermore, when a transient stimulus is delivered from virtual acoustic space, the response latency of most cells is tightly locked to the stimulus onset and therefore can serve as another metric on which to delineate a receptive field. When the internal structure of a VSRF was examined at a high spatial resolution, there appeared to be systematic gradients of response latency for a sizable proportion of AI neurons (Brugge et al. 1996).

Spherical-function approximation

Large receptive fields with graded and overlapping response profiles are ubiquitous in biological systems (McIlwain 1976). Although VSRFs are often quite broad, the gradient structure within the field affords systematic variability that could be exploited by some form of population decoding (Heiligenberg 1987; Zohary 1992). In this study, we report on an approach to characterizing the continuous structure of the VSRF by using techniques common to functional approximation theory in mathematics (Jenison 1995, 1996; Jenison and Fissell 1995, 1996). Functional approximation affords not only a mathematically descriptive form of characterizing VSRFs but also reflects a different way of thinking about cortical receptive fields. The field in many instances may not be bounded in the binary sense of the cell either responding or not responding nor in the threshold sense of responding at a particular output level. Rather, it may be better characterized by a graded continuum, and this continuum may in fact have important consequences for the acuity of sound localization resulting from encoding with broad sensitivity.

Artificial neural networks and approximation techniques have been applied typically to problems conforming to a multidimensional Cartesian input space. For example, linear combinations of Gaussians or Gabor functions have been used to model the receptive field surfaces of visual cortex cells (Daugman 1980; Marcelja 1980; Marr 1982; Rodieck 1965). However, approximation problems that exist on a sphere, such as directional response measures, are most accurately framed within a spherical or polar coordinate system rather than projected implicitly onto a two-dimensional Cartesian system. Spherical data arise in many areas of science, such as sensory physiology, geophysics, and meteorology. Consequently, projection geometry has been the focus of cartographic interest for problems involving a transformation of the spherical earth into a planar map. A well-known problem to cartographers is that area, direction, and shape cannot be preserved simultaneously in these projections; hence some degree of spatial distortion must be incurred, unless an alternative basis is used.

The von Mises basis function (VMBF) is such an alternative. Just as a mixture of Gaussians or Gabor functions has been used to model visual cortex responses on the plane, so can a mixture of VMBFs be used to construct a continuous forward mapping from spherical coordinates to neural response (e.g., the VSRF). This function is based on a spherical probability density function that was used to model line directions distributed unimodally with rotational symmetry. The density function is well known in the spherical inferential statistics literature and is commonly referred to as either the von Mises-Arnold-Fisher or Fisher distribution (Fisher et al. 1987; Mardia 1972). The kernel form of the probability density function was first introduced by Arnold (1941) in his unpublished doctoral dissertation. Because of the spherical nature of the VMBF approach, this approximation is likely to be the best "tool" for functionally modeling directional response properties. In this approach, a sufficient number of basis functions is chosen, such that the systematic internal fine structure of the spatial receptive field is characterized by the approximation and the unsystematic noise is isolated as residuals. This is commonly accomplished by keeping the number of basis functions used in the approximation small, in most cases in the neighborhood of 9-35. After asymptotic learning of a spatial receptive field, a dense test set of novel directions is used to interrogate the approximation for how well it generalized the systematic organization.

    METHODS

VMBFs belong to the general class of radial basis function approximation (Powell 1987) in which a linear combination of weighted nonlinear functions is used to map an input surface to an output surface. Gaussians are often employed in approximating physiological data (Girosi et al. 1995). We extended the theory of using radially symmetric functions on the plane to that of using rotationally symmetric functions on the sphere (Jenison 1995, 1996; Jenison and Fissell 1995, 1996). The expression for the VMBF, dropping the constant of proportionality and elevational weighting factor from the probability density function, is
$$\mathrm{VMBF}(\theta,\phi,\alpha,\beta,\kappa) = \exp\{\kappa[\sin\phi\,\sin\beta\,\cos(\theta-\alpha) + \cos\phi\,\cos\beta]\} \qquad (1)$$
where the variables θ and φ correspond to any sample direction in azimuth and elevation. For each VMBF, the parameters α and β correspond to a centroid in azimuth and elevation, and the parameter κ is a concentration parameter. Application of the VMBF requires an azimuthal range in radians from 0 to 2π and an elevational range from 0 to π. Any sample direction (θ, φ) on the sphere will induce an output from a VMBF that depends on the angular separation between the sample direction and the centroid of the VMBF (α, β). The concentration parameter κ controls the function's shape: the larger the value, the narrower the function width after transformation by the expansive exponential function. The parameters α, β, and κ for a fixed number of basis functions were "learned" adaptively with a sum-of-squared-error cost function (see Appendix A). Although other spherical functions have been proposed for approximation on the sphere (e.g., the thin-plate pseudo-spline) (Wahba 1990), the VMBF serves as a convenient spherical analog of the well-known multidimensional Gaussian on a plane. It resembles a bump on a sphere and behaves in a similar fashion to the planar Gaussian, with the centroid corresponding to the mean and κ⁻¹ corresponding to the SD. It differs from the thin-plate spline in that it has a parameter for controlling the width or concentration of the basis function, which allows the VMBF to focus resolution where needed.
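As a concrete illustration of Eq. 1, the following Python sketch evaluates a single VMBF at an arbitrary sample direction. The function name and the example parameter values are illustrative choices of ours, not quantities taken from the experiments.

import numpy as np

def vmbf(theta, phi, alpha, beta, kappa):
    """von Mises basis function (Eq. 1).

    theta, phi  : sample direction in azimuth (0..2*pi) and elevation (0..pi), radians
    alpha, beta : centroid of the basis function in azimuth and elevation, radians
    kappa       : concentration; larger kappa gives a narrower bump
    """
    return np.exp(kappa * (np.sin(phi) * np.sin(beta) * np.cos(theta - alpha)
                           + np.cos(phi) * np.cos(beta)))

# The output is largest when the sample direction coincides with the centroid
# and falls off with increasing angular separation.
print(vmbf(0.0, np.pi / 2, alpha=0.0, beta=np.pi / 2, kappa=3.0))        # on-center
print(vmbf(np.pi / 4, np.pi / 2, alpha=0.0, beta=np.pi / 2, kappa=3.0))  # 45 degrees away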

The VSRF of a single neuron is composed of the direction-dependent discharge patterns that are evoked in response to simulation of a particular free-field sound source. Therefore the receptive field can be viewed as a multivariate output mapped from a spherically specified input. This output surface can be approximated by a linear summation of a number of appropriately weighted VMBFs
$$\widehat{\mathrm{VSRF}}_{l}(\theta,\phi) = \sum_{j=1}^{J} w_{lj}\,\mathrm{VMBF}(\theta,\phi,\alpha_{j},\beta_{j},\kappa_{j}) \qquad (2)$$

For this multivariate output, the subscript l in Eq. 2 indexes the lth variate in the output vector, VMBF(θ, φ, α_j, β_j, κ_j) is the output of the jth VMBF, and w_lj is the least-squares weight for the jth function. It is often the case that one basis function is used as an offset in the approximation, akin to an intercept in linear regression. This is accomplished by clamping κ to zero in one basis function, effectively generating a constant uniform sphere. The full set of approximation parameters (α, β, κ, and w) for a fixed number of basis functions can be adaptively learned by applying a gradient-descent method to the desired cost function for the training set of P input-output pairs, where the pth observed VSRF(θ_p, φ_p) is modeled as
$$\mathrm{VSRF}(\theta_{p},\phi_{p}) = \widehat{\mathrm{VSRF}}(\theta_{p},\phi_{p}) + \varepsilon_{p} \qquad (3)$$
such that the error or noise term ε_p is independent with zero mean and unit variance but not necessarily identically distributed. For this case we require the approximation sum-of-squared-error to be minimized.
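A minimal Python sketch of the forward model in Eqs. 2 and 3 follows, with made-up parameter values: a small set of weighted VMBFs, including a κ = 0 offset term, is summed to produce the modeled response at a sampled direction. Nothing here is fitted to data; it only shows how the pieces combine.

import numpy as np

def vmbf(theta, phi, alpha, beta, kappa):
    # von Mises basis function (Eq. 1)
    return np.exp(kappa * (np.sin(phi) * np.sin(beta) * np.cos(theta - alpha)
                           + np.cos(phi) * np.cos(beta)))

def vsrf_hat(theta, phi, weights, alphas, betas, kappas):
    # Eq. 2: weighted sum of J basis functions evaluated at (theta, phi)
    return sum(w * vmbf(theta, phi, a, b, k)
               for w, a, b, k in zip(weights, alphas, betas, kappas))

# Toy DOG-like surface on the sphere (cf. Fig. 1A): two VMBFs sharing a centroid,
# the broader one weighted negatively, plus a kappa = 0 basis acting as the offset.
weights = [1.0, -0.5, 10.0]
alphas  = [0.0,  0.0,  0.0]
betas   = [np.pi / 2, np.pi / 2, 0.0]
kappas  = [8.0,  2.0,  0.0]

print(vsrf_hat(0.0, np.pi / 2, weights, alphas, betas, kappas))

In the analyses reported below, the weights and basis parameters are instead estimated from the sampled latencies by the procedure described in Appendix A.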


FIG. 1. A: virtual space receptive field (VSRF) approximated from 2 von Mises basis functions (VMBFs) and rendered as a solid-body spherical surface. The two basis functions have equal centroids, but the one with larger κ⁻¹ is weighted negatively, and the one with smaller κ⁻¹ is weighted positively. This summation is analogous to the classic difference-of-Gaussian (DOG). The spherical surfaces in A are mapped onto the Cartesian plane in B. By comparison, a hypothetical construction of a two-dimensional DOG function that might be used to model a simple cell in visual cortex is shown in C.

Our recorded population of AI cells consists of hundreds of receptive fields in which direction was sampled at a high spatial resolution but in which the "pattern of discharge" was not estimated with a similar confidence. This situation occurs because, in the empirically measured receptive field, each unique direction is tested only one time, although there are well over a thousand unique directions. Under this paradigm, the only quantitative response metric that can be reliably extracted is "response latency." The times of action potentials elicited in response to a virtual space stimulus were measured with a resolution of 1 µs, and the first-spike latency was taken as the response latency. In the case of first-spike latency, the index l in Eq. 2 is restricted to 1, and the VSRF can be viewed as a scalar spherical surface (e.g., a deformed sphere). Figure 1A shows a hypothetical VMBF-approximated VSRF rendered as a deformed spherical surface, shown together with its two constituent VMBFs. The placement of the centroids (α and β) for the two VMBFs, as well as their concentration values (κ), are typically derived from sampled data with the methods presented in Appendix A; in this example, however, the two basis functions are shown with identical centers, with a larger-κ⁻¹, negatively weighted (inverted) VMBF added to a smaller-κ⁻¹, positively weighted (bump) VMBF, which is analogous to the classic DOG. These same spherical surfaces are graphed on the Cartesian plane in Fig. 1B. By comparison, a hypothetical construction of a two-dimensional DOG function that might be used to model a simple cell in visual cortex is shown in Fig. 1C.

Clearly, the edge discontinuities apparent in the Cartesian approximation (Fig. 1C) are eliminated with the VMBF approximation technique (Fig. 1B). Just as importantly, but less obviously, VMBFs afford global interpolation, in contrast with the more typical local methods of interpolation. The benefit of the global spherical approximation is that all of the basis functions contribute to the surface approximation on the sphere, which by definition is continuous in all directions. There is no possibility of unconstrained extrapolation, as can occur with planar approximation. The benefits of spherical approximation over planar approximation are described in greater detail in the original derivation of the technique (Jenison 1995; Jenison and Fissell 1995).

Map projection

That any projection of the sphere onto a two-dimensional surface introduces distortion was the motivating force for the development of VMBF approximation for receptive field modeling. Although the approximation itself gains from remaining on the sphere, it still must be rendered visually in two dimensions, and the projection chosen should introduce minimal distortion in the frontal hemisphere. The quartic-authalic equal-area map is one such projection (Fig. 2A). This projection is often used in distribution mapping of spherical statistical variables. In this choice of projection, scale is true along the Equator and is constant along any parallel and between any pair of parallels equidistant from the Equator (Snyder 1987). The projection is not conformal, that is, it does not preserve shape. Furthermore, because no map preserves correct scale throughout, it is useful to determine the extent to which scale varies on any particular map. The deformation of an infinitely small circle on the sphere into an ellipse by the transformation to the plane (i.e., Tissot's indicatrix) is a common graphic tool for evaluating the characteristics of various types of distortion. A small circle anywhere on the sphere projects as an ellipse on any map projection. In the equal-area map projection (Fig. 2B), the ellipses of distortion show considerable variability in their ellipticity, although they have uniform area. If the projection were conformal, each ellipse would be a circle and there would be no angular deformation, so that shape, but not size, is preserved. Hence conformal projections are typically used for navigational charts and topographical maps. Finally, it is important to draw the distinction between the operation of spherical function approximation, which is free of the interpolation distortion that would be introduced by planar approximation, and the map projections used for final visualization. The mathematical representation of the VSRF is distortion free, as is evident from the matched edges at +180° and -180° when projected onto two dimensions for visualization. This matching would not occur with the employment of planar basis functions.
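For readers who wish to reproduce the display, a short Python sketch of the forward quartic-authalic equations follows; the formulas are the standard ones given in cartographic references such as Snyder (1987), and the helper name and graticule spacing are our own choices.

import numpy as np

def quartic_authalic(lon, lat, radius=1.0):
    """Forward quartic-authalic (equal-area) projection.

    lon : longitude (azimuth), radians
    lat : latitude (elevation relative to the horizon), radians
    Returns planar (x, y) map coordinates.
    """
    x = radius * lon * np.cos(lat) / np.cos(lat / 2.0)
    y = 2.0 * radius * np.sin(lat / 2.0)
    return x, y

# Project a coarse graticule of the kind drawn in Fig. 2A.
for lat_deg in range(-90, 91, 45):
    row = [quartic_authalic(np.radians(lon_deg), np.radians(lat_deg))
           for lon_deg in range(-180, 181, 90)]
    print([(round(x, 2), round(y, 2)) for x, y in row])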


FIG. 2. Quartic-authalic projection, illustrating the coordinate system (A) and uniform placement of Tissot indicatrices (B). The Quartic-authalic projection is an equal-area projection, but neither conformal nor equidistant. The circles are of identical radius on the surface of the sphere. The frontal hemisphere is highlighted as a reference for subsequent maps.

    RESULTS

Generally, the size of a VSRF in AI is dependent on the intensity of the simulated sound source, although with near threshold intensity the receptive field of most cells is confined to one quadrant of acoustic space (Brugge et al. 1994, 1996). Measurements on confined VSRFs showed the majority to be positioned in the frontal hemisphere, either contralateral or ipsilateral to the location of the neuron under study. Regardless of size, these VSRFs exhibit gradients in response latency, and, for a sizable proportion of cells, an orderly pattern of gradients could be discerned by the eye (Brugge et al. 1996). The purpose of our functional modeling approach is to quantify the systematic aspects of these spatial gradients.

VMBF Approximated VSRFs

The VMBF approximation method was applied to our database consisting of hundreds of empirically measured VSRFs obtained from the studies of Brugge and colleagues (1994, 1996). In the first set of examples, a single stimulus presentation is used at each virtual space direction. At most directions, an action potential is evoked; however, in some cases, a response fails to be observed within a 5- to 50-ms window after stimulus onset. Figure 3 shows deformed spherical renderings of a receptive field surface as a function of sound direction when 1) empirical values and 2) modeled values of response latency are employed, akin to the schematized combination of basis functions shown in Fig. 1A. In this rendering, radii to the surface are reversed in length such that longer radii correspond to shorter latencies. By using this reversal, the visualization also communicates the strength of response as a function of direction, which is negatively correlated with latency (Brugge et al. 1996). The summation of spherically continuous basis functions always manifests itself as a smoothing of the surface structure because VMBF interpolation is a global outcome. It was generally the case, however, that, as more VMBF functions were added to the approximation, more of the fine structure (higher spatial frequency) of the VSRF emerged. It is often difficult to visually examine global relationships of the receptive field with the deformed spherical rendering; hence the quartic-authalic projection will be the primary means for displaying results. This affords a view of responses induced by virtual sounds behind the animal, at the expense, however, of distorting the spatial pattern of responses shown in Fig. 3.


FIG. 3. (A) Empirical values and (B) VMBF modeled values of response latency shown as a solid-body sphere akin to receptive fields composed in Fig. 1A. Latencies are reversed in length such that long latencies appear as short radii. The front hemisphere is toward the reader. The dotted line denotes 0° elevation in the rear hemisphere.

The initial and final placements of the 10 VMBFs that were employed for this approximation are shown in Fig. 4. The final centroids (α and β) and concentrations (κ) result from the optimization strategy discussed in Appendix A. The asymptotic solution is straightforward to interpret for this particular VSRF. Three VMBFs are positioned near the intersection of the midline and the horizon, which sculpts the region of shortest latency, whereas several others are positioned on the side and toward the back, producing asymmetric (midline) gradients in latency. Nevertheless, it is important to realize that all basis functions contribute to the spherical surface of the receptive field. In this example, it took 580 iterations (i.e., successive presentations of the P input-output teaching pairs) to optimize the parameters, at which point the root-mean-squared (RMS) error was <0.6. Figure 5 illustrates the monotonic descent of RMS error as parameter optimization proceeded.


FIG. 4. A: uniform initial placement of the 10 VMBFs on the unit sphere. Each contour, in actuality, is circular on the surface of the sphere. The radius of each circle is proportional to κ⁻¹, where the initial value of the concentration is 3. B: final positions of VMBFs; again the radius is proportional to κ⁻¹.


FIG. 5. Gradient descent of root-mean-squared (RMS) error as a function of iterations that optimize the parameters of the VMBF model (10 basis functions).

It can be seen, in this example, that asymptotic error is reached quickly after ~150 iterations. It should be noted that, although asymptotic parameter solutions are typically nonunique, it was our experience that the resulting surface is remarkably consistent across random starting conditions. In other words, there appear to be multiple solutions to the approximation problem that are equally good, at least in a least-squares sense.

Receptive fields were analyzed to show the effects of systematic increases in the number of basis functions on the approximated field structure. Three examples (1 VSRF each from the frontal, contralateral, and ipsilateral populations) were chosen to demonstrate these effects in detail. In these examples, the numbers of basis functions compared are 10, 20, and 30 VMBFs. The response latency data for both empirically measured and VMBF-modeled exemplar VSRFs are shown with the equal-area projection method described in METHODS. Contralateral acoustic space is represented by positive azimuthal angles shown to the right of the midline on each map. The characteristic frequencies of the three neurons are 16.8, 15.5, and 30.5 kHz, respectively, recorded at 40, 40, and 50 dB above their respective thresholds. A color code is used to denote response latency at each location. White squares correspond to unsampled directions, and black squares correspond to directions at which the neuron failed to respond.

Frontal VSRF

Frontal VSRFs (e.g., Fig. 4) often occur when stimulation of each ear alone is excitatory and the binaural interaction is facilitative; the stimulus directions to which the neuron is most sensitive tend to be positioned about the median plane. The VMBF approximation to this frontal VSRF is shown in Fig. 6, A-C, for the three conditions of 10, 20, and 30 basis functions.


FIG. 6. VMBF approximation to frontal VSRF. A: 10 VMBFs; B: 20 VMBFs; C: 30 VMBFs; D: isolatency contours (solid white lines) from the 30 VMBF model projected onto the empirically measured VSRF. Black dots correspond to sampled directions that failed to produce a response (action potential). The color legend denotes discrete latency levels from short (yellow) to long (light blue).

The approximated fields are clearly representative of the empirical data (Fig. 6D) in every condition, although increasing detail is evident as the number of basis functions was increased. Examination of the change in RMS error is not indicative of this visual improvement because the spatial area where the error is most improved is limited to a small fraction of the receptive field. Because spatial receptive fields of most AI cells are typically large, this observation was not uncommon in our analysis.

The interaction between the acoustic axis induced by pinna filtering and the characteristic frequency of an AI neuron affects the central position of the VSRF (Brugge et al. 1996). A quantitative measurement of this center can be obtained from the gradients and global minimum of the VMBF approximations. The center for this VSRF is located at -20.6° azimuth and 31.6° elevation. There was not a significant change in the calculated center across these three conditions. Visually, the central position is delineated by overlaying isolatency contours derived from the modeled VSRF (e.g., with 30 VMBFs) on the empirically measured VSRF (Fig. 6D). Because the interval between isolatency contour levels was constant, the spatial proximity between contours reflects the slope of the gradient along different directions in the field. The effect of changes in the number of basis functions is greatest where the gradients are largest. This is because sharp gradients can be constructed by taking the difference between oppositely directed (weighted) basis functions to approximate the gradients in the data, as demonstrated in Fig. 1.
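One way to make this center estimate operational is to interrogate the fitted model on a dense grid of directions and take the direction at which the predicted latency is smallest. The Python sketch below does exactly that; the parameter values are placeholders standing in for a fitted model, not the parameters of the neuron shown in Fig. 6.

import numpy as np

def vmbf(theta, phi, alpha, beta, kappa):
    # von Mises basis function (Eq. 1)
    return np.exp(kappa * (np.sin(phi) * np.sin(beta) * np.cos(theta - alpha)
                           + np.cos(phi) * np.cos(beta)))

def predicted_latency(theta, phi, params):
    # params: iterable of (w, alpha, beta, kappa) tuples from a fitted model (Eq. 2)
    return sum(w * vmbf(theta, phi, a, b, k) for w, a, b, k in params)

# Placeholder parameters: a constant offset plus one negatively weighted bump.
params = [(25.0, 0.0, 0.0, 0.0), (-4.0, 0.3, 1.2, 6.0)]

theta, phi = np.meshgrid(np.linspace(0, 2 * np.pi, 361),
                         np.linspace(0, np.pi, 181))
latency = predicted_latency(theta, phi, params)
i = np.unravel_index(np.argmin(latency), latency.shape)
print("field center: azimuth %.1f deg, elevation %.1f deg"
      % (np.degrees(theta[i]), np.degrees(phi[i])))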


FIG. 7. VMBF approximation to contralateral VSRF. A: 10 VMBFs; B: 20 VMBFs; C: 30 VMBFs; D: isolatency contours (solid white lines) from the 30 VMBF model projected onto the empirically measured VSRF. Black dots correspond to sampled directions that failed to produce a response (action potential). The color legend denotes discrete latency levels from short (yellow) to long (light blue).

Contralateral VSRF

The largest proportion of AI cells with confined receptive fields can be classified as hemifield neurons (Brugge et al. 1994, 1996). The VSRF shown in Fig. 7 belongs to the majority of exemplars in this class because its receptive field was lateralized to the contralateral frontal quadrant at a lower intensity level. The data in Fig. 7 were obtained at a suprathreshold intensity, where the receptive field occupied almost all of the acoustic space. Nevertheless, there is an obvious asymmetry in the distribution of response latency, with the shortest latencies occupying the contralateral frontal quadrant.

Approximating this VSRF with only 10 basis functions (Fig. 7A) captures this laterality and results in smooth increasing gradients of latency along paths radiating from the center of the field estimated to be at 38.2° azimuth and 18.6° elevation. Although this central position remains rather constant with increasing numbers of basis functions (i.e., Fig. 7, B and C), the same cannot be said for the organization of gradients. For example, several additional foci corresponding to the shorter latencies in the range are revealed when more basis functions are added to the approximation. Isolatency contours from the 30 basis function approximation demarcate these regions in the empirical measurements (see Fig. 7D).

This empirical VSRF evidences a region near -90° azimuth and -15° elevation where the neuron failed to respond (Fig. 7D, black squares) at several adjacent directions. Such regions are common in cortical receptive fields and may result from binaural interactions or monaural intensity effects (e.g., nonmonotonicity or head shadow effects). Regardless of their genesis, these regions of undefined latency are always "filled in" with the use of spherical interpolation. Importantly, however, because of the spherical nature of the VMBF approximation, this interpolation is constrained by all other measurements comprising the receptive field. This is one of the strengths of this approximation technique. The VMBFs tend toward smoothing the measurements; hence the "filling in" of gaps in the receptive field surface is dependent on the trajectories, or rate of change, of neighboring measurements.

Ipsilateral VSRF

The VSRF shown in Fig. 8 was also selected from the population of neurons classified as hemifield. Its receptive field was lateralized to the ipsilateral frontal quadrant at a lower intensity level, although the data in Fig. 8D were obtained at a suprathreshold intensity where the receptive field extended over much of acoustic space. Compared with Fig. 7D, there is again an asymmetry in the distribution of response latency, in which the shortest latencies now occupy the ipsilateral frontal quadrant. Furthermore, this exemplar also contains a small region of virtual space (near -90° azimuth and -15° elevation) in which directions failed to evoke a response from the cell. However, unlike the juxtaposition of effective directions that marked receptive fields in the frontal hemisphere in Figs. 6D and 7D, this cell's VSRF exhibits a more fractured appearance, characterized by ridges of stochastically undefined latencies. This characteristic is also commonly observed in our population of AI cells.


FIG. 8. VMBF approximation to ipsilateral VSRF. A: 10 VMBFs; B: 20 VMBFs; C: 30 VMBFs; D: isolatency contours (solid white lines) from the 30 VMBF model projected onto the empirically measured VSRF. Black dots correspond to sampled directions that failed to produce a response (action potential). The color legend denotes discrete latency levels from short (yellow) to long (light blue).

Regardless of the number of basis functions employed, modeled VSRFs provide smooth interpolation over the region of undefined response latency. This interpolated region contains the longest latencies in the field because its nearest neighbors with empirical measurements are similarly biased toward longer response latencies. The approximated fields also show several modal regions of short latencies, one near the midline (0° azimuth) and the others progressively more ipsilateral to the midline.

Magnitude and form of VSRF residuals

The goal of VMBF approximation is to estimate parameters that account for systematic variability and to partition out unsystematic variability. We focused on constraining the magnitude to which the model may be fit in a least-squares sense. This was accomplished by controlling the number of basis functions. Figure 9 shows the magnitude of RMS error following the fit of each exemplar VSRF with different numbers of basis functions. The trend is one of decreasing final RMS error as the number of basis functions is increased from 10 to 30, as expected. It is perhaps surprising that the differences are not larger. One reason is that the majority of the variation is in the frontal hemisphere, and, because of the adaptive gradient-descent algorithm, basis functions are positioned to optimally account for this variability regardless of the number of basis functions, at the expense of variation in the rear hemisphere. Second, the system of equations is greatly overconstrained, which is a good thing, both in terms of the amount of data and because of the spherical nature of this approximation technique: no boundaries exist, so all basis functions in the approximation constrain the surface in all directions. A frequency distribution of final RMS error from VMBF approximation (35 basis functions) of 158 cortical neurons is shown in Fig. 10. The mean RMS error is 1.62 ms for this distribution, so the approximation error of the exemplar neurons falls below this measure of central tendency. However, they do fall near the mode of the distribution. Although the RMS levels of the ipsilateral neuron are the largest for the three exemplar neurons, this is not a common observation across the total sample of neurons modeled, for which the mean ipsilateral and contralateral approximation errors are nearly equivalent (1.38 and 1.33 ms, respectively). The frontal neurons on average tend to have greater approximation error (1.78 ms). It is important to note that, fundamentally, the best metric for goodness-of-fit is not necessarily the magnitude of the approximation error but rather the degree to which the residual error is unsystematic. Physiological systems are by nature stochastic, so we should expect significant variation in the final RMS error across cortical neurons.


FIG. 9. Final RMS error for each class of VSRF (frontal, contralateral, and ipsilateral) for each order of VMBF approximation (10, 20, and 30).


FIG. 10. Frequency distribution of final RMS error for 158 VMBF approximated VSRFs. Mean RMS error is 1.62 ms.

The method of approximation described can be categorized as nonparametric in the sense that approximation error is minimized with a cost function that makes no assumption about the nature of the error distribution. Other techniques for parameter estimation assume specific forms of the error distribution, for example, that the error distribution is Gaussian or Poisson. In these cases, a parametric approach such as maximum likelihood (ML) estimation might be employed rather than least squares.1 Occasionally, assessment of the residual error is performed to evaluate the shape of the resulting distribution. With this specific goal in mind, response-latency residuals were examined after VMBF modeling. Figure 11 compares the spatial distribution of residuals from the 30 VMBF approximation of the frontal VSRF (A, top panel) with that resulting from a synthesized normal probability distribution (B, bottom panel) with a mean of zero and the same SD (0.588 ms) as that computed from the VMBF approximation. A visual comparison between the two distributions, both shown as quartic-authalic projections, suggests that the residuals resemble a normal distribution, although there is one small clump of extreme residuals (Fig. 11, yellow pixels). To test this hypothesis more quantitatively, a formal composite (mean and SD unspecified) test of normality was used. Because normality serves as the null hypothesis in such tests, it is difficult to prove that an empirical sample was generated by a normal distribution. One test of normality with strong power, proposed by Filliben (1975), uses the normal probability plot correlation coefficient. The approach examines the correlation between ordered quantiles of the normal distribution and the ordered observed residuals. If the sample of residuals were generated by the hypothesized normal distribution, the resulting plot should be linear. The Pearson correlation coefficient r_xy is a simple and straightforward measure of linearity between two variables and serves in this case as the test statistic for normality. Figure 12 illustrates the use of the normal probability plot. Probabilities are plotted with respect to the observed residuals, such that a sample generated by a normal distribution would tend to fall along the line that passes through the 25th and 75th percentiles. The residuals resulting from the VMBF approximation tend to fall along the linear dotted line from ~0.10 to 0.90, with the tail regions stretching out relative to the normal distribution.2 Using the sampling distribution of r determined by Filliben (1975) for testing the null hypothesis of normality, there is insufficient evidence to contradict the hypothesis of normality (r = 0.989, P > 0.10).
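A compact version of this residual check can be written with SciPy's probability-plot routine, which returns the probability-plot correlation coefficient used as the test statistic. The synthetic residuals below merely stand in for the VMBF residuals (same SD as reported above); the critical value for the test must still be taken from Filliben's (1975) tables.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
residuals = rng.normal(loc=0.0, scale=0.588, size=1600)  # stand-in for model residuals

# Ordered residuals against normal order-statistic medians; r is the
# normal probability plot correlation coefficient.
(_, _), (slope, intercept, r) = stats.probplot(residuals, dist="norm")
print("probability-plot correlation r =", round(r, 4))
# Compare r with Filliben's (1975) critical value for this sample size;
# an r below the critical value rejects the hypothesis of normality.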


FIG. 11. A: VMBF residuals color coded and quartic-authalic projected for the frontal VSRF. B: Gaussian variates with same standard deviation as A.


FIG. 12. Normal-probability plot. Dotted line represents normal density function. Dots represent residual samples from VMBF approximation.

The parametric shape of the error distribution is often neglected in models of neural responses. Although nonparametric approaches have been used for sensory coding analysis (e.g., Gruner and Johnson 1998; Middlebrooks et al. 1994), the structure and magnitude of the variability about the modeled response may be important for deriving exact analytic expressions for stochastic models of neural coding (Paradiso 1988).

Multiple stimulus repetitions

Global interpolation by linear combination of spherical basis functions, in effect, fills in the missing data that may cause the fractured appearance in the measured receptive fields for single-stimulus presentation. The functional approximations described so far were from neurons that may be responsive throughout virtual auditory space. Because the amount of time for reliable contact with a single neuron is limited, it was necessary to restrict the number of repetitions to that of a single presentation at each sampled direction. However, by restricting the sampling to a fraction of virtual space, the number of repetitions can be increased, thus increasing the likelihood that a response is obtained for each of the sampled directions. Figure 13A shows response latencies sampled with a single presentation from a restricted region of virtual space from -20° to +90° azimuth and -35° to +68° elevation, again employing the quartic-authalic projection for purposes of visualization. This neuron shows the fractured pattern that is also evident in the neuron shown in Fig. 8. Ten basis functions were used to model responses within this restricted region of virtual space (Fig. 13B), again demonstrating interpolation of the missing data consistent with prior results. The same neuron was also presented with 15 stimulus repetitions, such that the probability of observing an action potential, given this number of presentations, was nearly certain. This is confirmed empirically by Fig. 13C, where all locations within the restricted region give evidence of an observed response. In Fig. 13C, the latency of the first spike observed in the repeated sequence of stimuli at each direction is color coded. The same order of approximation (10 basis functions) was also used to model the 15-repetition data, and the result is shown in Fig. 13D. Although subtle differences can be observed between Fig. 13, C and D, the approximations under different sampling conditions are remarkably similar. In particular, the trimodal character of the approximated VSRF is clearly evident in both cases. The characteristics of the empirical data that fill in under the condition of multiple repetitions confirm the covariation of probability of response and response latency that occurs in many of the cortical neurons described by Brugge et al. (1996). This is evident from the longer-latency light blue squares in Fig. 13C, which in general replace the white squares observed in Fig. 13A.


FIG. 13. A: Empirically measured single-presentation VSRF at each sampled direction. White pixels correspond to sampled directions that failed to produce a response (action potential). Gray pixels correspond to regions of virtual space not sampled. B: VMBF approximation with single-stimulus responses. C: first-spike latency from 15-repetition VSRF. D: VMBF approximation with 15-repetition responses. The color legend denotes discrete latency levels from short (yellow) to long (light blue).

    DISCUSSION

A spherical approximation technique was presented that affords mathematical modeling of VSRFs. Parameterizing directional sensitivity in this fashion is much akin to the use of DOG functions for modeling responses in visual cortex. In the vision literature, the most analogous approach to that described in this paper is the work of Hawken and Parker (1987). Their models extended the classical DOG approximations to a linear combination of weighted Gaussians that included more than the standard two basis functions, where the centers were allowed to be spatially offset. For the case of VSRFs, the focus is on accurately accounting for the receptive field surface, which typically requires more than two basis functions to faithfully capture the structure of the spatial receptive field.

The problem with using classical planar Gaussians is that radial symmetry and consistency on the sphere actually translate into directionally dependent distortion on planar projections, as shown in Fig. 1. In this regard, an important observation is that the von Mises functions, which are radially symmetric on the sphere, are no longer radially symmetric in the Cartesian coordinate system. Conversely, if the receptive fields were radially symmetric in Cartesian coordinates, they would not be radially symmetric on the sphere. The degree of distortion depends on how broad the receptive field is, its distance from each of the poles, and its distance from the artificial edge of the Cartesian map. The more focused the receptive field becomes, the smaller the distortion. If we wish to represent the relative positions of ensembles of spatial receptive fields, it is particularly important to consider spherical functions whose distances and shapes are distortion free on the sphere. This point becomes urgent when considering population coding of sound direction via an ensemble of auditory cortical neurons (Jenison 1998; Jenison et al. 1997).

The VMBF functions are nonorthogonal and hence require some form of gradient-descent search for purposes of estimating parameters. As discussed previously, other methods for parameter estimation may be considered, such as ML estimation. One promising technique for performing ML estimation of the VMBF parameters is the expectation-maximization algorithm (Dempster et al. 1977), which addresses incomplete-data problems relevant to the undefined latencies (i.e., response gaps) described in this paper. There are several clusters of outliers in the residual analysis (see Fig. 11) where the residuals appear to deviate from normality. These clusters generally occur in regions near the undefined latencies of the receptive field and are most likely artifacts of the least-squares approach.

Evaluating the goodness of statistical fits is of fundamental importance in many areas of statistics. Akaike's (1974, 1977) information criterion, known as the AIC, provides a useful tool for evaluating the complexity of statistical models. Rissanen's (1983, 1986) minimum description length criterion offers an alternative to the AIC for evaluating the descriptive cost or complexity of the model. This approach adds an additional term to the approximation cost function (see Eq. A1), which penalizes the complexity of the approximation function. Here complexity is characterized as one-half the number of parameters times the log of the sample size. This approach was used previously in auditory neurophysiology to estimate recovery functions of the auditory nerve (Mark and Miller 1992; Miller and Mark 1992). For a doubling in the number of parameters, we would expect a doubling in complexity for equal sample sizes. An examination of Fig. 9 indicates that the proportional reduction in RMS error is much less than this; hence, from the perspective of parsimony, adding more basis functions is not justified. Of course this is a rather qualitative evaluation of stochastic complexity, and the utility of the complexity metric ultimately lies in its formal adaptation into the estimation cost function.
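Written in the notation of Eq. A1 for the univariate case, with k the number of free parameters and P the number of observations, the penalized cost would take a form along the following lines (a sketch of the MDL-style criterion, not an expression given in the original analysis):

$$E_{\mathrm{MDL}} = \frac{1}{2}\sum_{p=1}^{P}\bigl(\mathrm{VSRF}(\theta_{p},\phi_{p}) - \widehat{\mathrm{VSRF}}(\theta_{p},\phi_{p})\bigr)^{2} + \frac{k}{2}\,\log P$$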

This paper examined coding of neural response only in terms of first-spike latency; however, the approximation approach is by no means wedded to this particular response variable. The approximation approach adapts easily to average discharge rate and response patterns. The use of first-spike latency, however, brings with it the question as to how the information related to the onset of the stimulus is conveyed in the approximation. This question is particularly relevant when an entire population of VMBF approximated cortical neurons is employed to investigate possible forms of population coding of sound direction. One possible resolution is that relative, not absolute, latencies serve as the code. One possible timing mark might be that of a periodic biological clock. In the absence of a periodic marker, relative latencies in an ensemble of cells could be decoded with respect to time relative to the initial spike in the ensemble. In this decoding scheme all responses would contribute to a common estimate of a single ensemble delay from stimulus onset. Under this scenario, a more central neural observer would measure all events relative to the first observed action potential. All events are therefore simply shifted in time by the common unknown delay parameter. This hypothetical decoding mechanism removes the necessity of a stimulus onset marking and establishes the functional equivalency of spike latency and discharge rate as information-bearing sources.

Because spatial hearing is by its very nature three dimensional or, more precisely, two dimensional (directional) on the sphere, mathematical approximations are best developed on the sphere. Available literature on spherical approximation is remarkably small, which ultimately motivated the effort to extend existing planar approximation theory to that of the sphere. We operate in a three-dimensional auditory world, and auditory localization problems demand spherical solutions.

    ACKNOWLEDGEMENTS

  This work was supported by NIH grants DC-02804, DC-00116, and HD-03352.

    APPENDIX

VMBF parameter optimization

In this case, the training pattern is the observed response latency (first-spike latency), measured for a known sound-source direction, VSRF(θ_p, φ_p). For this univariate model (i.e., L = 1), we require the approximation sum-of-squared error (E) to be minimized across all P observations and the single output. The general multivariate case, including L > 1 (e.g., spike-train patterns), is
$$E = \frac{1}{2}\sum_{p=1}^{P}\sum_{l=1}^{L}\bigl(\mathrm{VSRF}_{l}(\theta_{p},\phi_{p}) - \widehat{\mathrm{VSRF}}_{l}(\theta_{p},\phi_{p})\bigr)^{2} \qquad (A1)$$
Parameter values are learned through successive presentation of the P input-output teaching pairs and application of the specific update rules for each parameter of the VMBF. The amount of change made to each parameter after successive presentations is based on the error gradient (i.e., the negative partial derivative of the error with respect to that parameter), which was derived analytically by using the chain rule for partial derivatives (see Jenison 1995, 1996; Jenison and Fissell 1995, 1996). Over the course of iterative training, the centroid (α, β) of each basis function will move on the surface of the sphere, and the concentration or width (κ) of the basis function will change, progressively minimizing the cost function (Eq. A1). The cost function is minimized, at least locally, when the error gradients are equal to zero.

The gradient-descent rules for estimating the von Mises basis parameters in a least-squares sense have the following form
$$\frac{\partial E}{\partial w_{lj}} = -\sum_{p=1}^{P}\bigl(\mathrm{VSRF}_{l}(\theta_{p},\phi_{p}) - \widehat{\mathrm{VSRF}}_{l}(\theta_{p},\phi_{p})\bigr)\,\mathrm{VMBF}(\theta_{p},\phi_{p},\alpha_{j},\beta_{j},\kappa_{j}) \qquad (A2)$$

$$\frac{\partial E}{\partial \alpha_{j}} = -\sum_{p=1}^{P}\sum_{l=1}^{L}\bigl[\mathrm{VSRF}_{l}(\theta_{p},\phi_{p}) - \widehat{\mathrm{VSRF}}_{l}(\theta_{p},\phi_{p})\bigr]\,w_{lj}\,\mathrm{VMBF}(\theta_{p},\phi_{p},\alpha_{j},\beta_{j},\kappa_{j})\,\kappa_{j}\bigl[\sin\phi_{p}\,\sin\beta_{j}\,\sin(\theta_{p}-\alpha_{j})\bigr] \qquad (A3)$$

$$\frac{\partial E}{\partial \beta_{j}} = -\sum_{p=1}^{P}\sum_{l=1}^{L}\bigl[\mathrm{VSRF}_{l}(\theta_{p},\phi_{p}) - \widehat{\mathrm{VSRF}}_{l}(\theta_{p},\phi_{p})\bigr]\,w_{lj}\,\mathrm{VMBF}(\theta_{p},\phi_{p},\alpha_{j},\beta_{j},\kappa_{j})\,\kappa_{j}\bigl[\sin\phi_{p}\,\cos\beta_{j}\,\cos(\theta_{p}-\alpha_{j}) - \cos\phi_{p}\,\sin\beta_{j}\bigr] \qquad (A4)$$

$$\frac{\partial E}{\partial \kappa_{j}} = -\sum_{p=1}^{P}\sum_{l=1}^{L}\bigl[\mathrm{VSRF}_{l}(\theta_{p},\phi_{p}) - \widehat{\mathrm{VSRF}}_{l}(\theta_{p},\phi_{p})\bigr]\,w_{lj}\,\mathrm{VMBF}(\theta_{p},\phi_{p},\alpha_{j},\beta_{j},\kappa_{j})\,\bigl[\sin\phi_{p}\,\sin\beta_{j}\,\cos(\theta_{p}-\alpha_{j}) + \cos\phi_{p}\,\cos\beta_{j}\bigr] \qquad (A5)$$
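For readers who prefer code to symbols, the following Python sketch computes the cost (Eq. A1) and the four gradients (Eqs. A2-A5) for the univariate (L = 1) case. The variable names are our own, and the vectorized broadcasting is simply one convenient way to organize the sums.

import numpy as np

def vmbf(theta, phi, alpha, beta, kappa):
    # von Mises basis function (Eq. 1)
    return np.exp(kappa * (np.sin(phi) * np.sin(beta) * np.cos(theta - alpha)
                           + np.cos(phi) * np.cos(beta)))

def cost_and_gradients(theta, phi, target, w, alpha, beta, kappa):
    """Sum-of-squared-error cost (Eq. A1) and its gradients (Eqs. A2-A5), L = 1.

    theta, phi, target    : length-P arrays of sampled directions and observed latencies
    w, alpha, beta, kappa : length-J arrays of basis-function parameters
    """
    # P x J matrix of basis outputs and the model prediction (Eq. 2)
    B = vmbf(theta[:, None], phi[:, None], alpha, beta, kappa)
    err = target - B @ w                                   # residuals, length P
    E = 0.5 * np.sum(err ** 2)                             # Eq. A1

    dE_dw = -(B.T @ err)                                   # Eq. A2
    common = -(err[:, None] * w * B)                       # shared factor in Eqs. A3-A5
    dE_dalpha = np.sum(common * kappa * np.sin(phi[:, None]) * np.sin(beta)
                       * np.sin(theta[:, None] - alpha), axis=0)          # Eq. A3
    dE_dbeta = np.sum(common * kappa * (np.sin(phi[:, None]) * np.cos(beta)
                       * np.cos(theta[:, None] - alpha)
                       - np.cos(phi[:, None]) * np.sin(beta)), axis=0)    # Eq. A4
    dE_dkappa = np.sum(common * (np.sin(phi[:, None]) * np.sin(beta)
                       * np.cos(theta[:, None] - alpha)
                       + np.cos(phi[:, None]) * np.cos(beta)), axis=0)    # Eq. A5
    return E, dE_dw, dE_dalpha, dE_dbeta, dE_dkappa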
The analytic gradients derived can be utilized in more advanced optimization algorithms, such as constrained nonlinear programming techniques. A nonlinear program is a problem that can be put into the form whereby a cost function is minimized subject to functional constraints, for example, in the present case, where α and β are restricted to 0 to 2π and 0 to π, respectively. Constrained optimization using sequential quadratic programming (SQP) is available in several commercial optimization packages, including the Matlab optimization toolbox (Branch and Grace 1996), which is the framework that we used. The SQP method exploits the derived analytic gradients (the so-called Jacobians) to further derive a matrix whose components are the second partial derivatives of the cost function, known as the Hessian matrix. These methods serve to accelerate the minimization process by eliminating interference among parameter dimensions, which can slow down the search for the cost-function minimum.

The SQP method was used to minimize the cost function subject to the constraints -10 < w < 10, 0 ≤ α < 2π, 0 ≤ β ≤ π, and 0 ≤ κ < 100. Iterative optimization of the parameters to minimize the specified cost function was performed for several descriptive classes of VSRFs. The parameter optimization problem for a 10 VMBF model entails 10 sets of [w, α, β, κ] plus the aforementioned offset, for a total of 41 parameters. For the cases considered in this paper there are in the neighborhood of 1,600 observations of response latency as a function of sound-source direction, with the greatest density in the frontal hemisphere, where samples are taken at 4.5° intervals in azimuth and elevation. The rear hemisphere doubles the sampling interval in elevation. The samples range from -180° to 180° azimuth and -36° to 90° elevation.
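As a rough analog of the procedure just described, the sketch below runs a bound-constrained minimization of the Eq. A1 cost with SciPy's SLSQP routine standing in for the Matlab SQP solver; the synthetic data, the small model order, and the starting values are all ours and are meant only to show the mechanics.

import numpy as np
from scipy.optimize import minimize

# Synthetic stand-in data: P sampled directions (radians) and observed latencies (ms).
rng = np.random.default_rng(1)
theta = rng.uniform(0, 2 * np.pi, 400)
phi = rng.uniform(0, np.pi, 400)
latency = (25.0 - 5.0 * np.exp(3.0 * (np.sin(phi) * np.cos(theta) - 1.0))
           + rng.normal(0.0, 0.5, 400))

J = 3  # deliberately small model for illustration

def unpack(x):
    return np.split(x, 4)        # w, alpha, beta, kappa, each of length J

def sse(x):
    w, alpha, beta, kappa = unpack(x)
    B = np.exp(kappa * (np.sin(phi[:, None]) * np.sin(beta)
                        * np.cos(theta[:, None] - alpha)
                        + np.cos(phi[:, None]) * np.cos(beta)))
    return 0.5 * np.sum((latency - B @ w) ** 2)      # Eq. A1

# Bounds mirror the constraints above, listed as w, alpha, beta, kappa blocks.
bounds = ([(-10, 10)] * J + [(0, 2 * np.pi)] * J
          + [(0, np.pi)] * J + [(0, 100)] * J)
x0 = np.concatenate([np.zeros(J), np.linspace(0.5, 5.5, J),
                     np.full(J, np.pi / 2), np.full(J, 3.0)])

result = minimize(sse, x0, method="SLSQP", bounds=bounds)
print("converged:", result.success, " final cost:", round(result.fun, 2))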

    FOOTNOTES

1   Under the assumption of a normal error distribution with constant variance, ML estimation and least squares are mathematically equivalent (Freund 1992).

2   Examples of probability distributions in which the tails separate and stretch relative to the normal are Student's t-distribution with small degrees of freedom and the Cauchy distribution.

  Address for reprint requests: R. L. Jenison, Dept. of Psychology, University of Wisconsin, 1202 West Johnson St., Madison, WI 53706.

  Received 22 January 1998; accepted in final form 4 August 1998.
