1 Center for Advanced Research of Spatial Information, Hunter College, City University of New York, New York, NY.
2 Office of Policy and Planning, New York City Department of Health and Mental Hygiene, New York, NY.
Received for publication May 31, 2002; accepted for publication December 20, 2002.
ABSTRACT
arboviruses; geographic information system; space-time clustering; West Nile virus
Abbreviations: DYCAST, Dynamic Continuous-Area Space-Time; MAUP, modifiable areal unit problem.
INTRODUCTION
West Nile virus is a mosquito-borne flavivirus belonging to the Japanese encephalitis virus serocomplex. In New York City, Culex pipiens is the primary vector of West Nile virus, as evidenced by its high competence to transmit the virus, its abundance, and its strong ornithophilic habits (4–6). The primary hosts are native bird species that lack West Nile virus immunity, particularly the American crow and blue jay. Pigeons, although abundant in New York City, have very low infection rates. Humans are dead-end hosts, and their infections are most likely incidental and the result of a spillover effect. Spillover effects occur when the transmission cycle between mosquitoes and birds intensifies (a process termed the amplification cycle), or when the avian pool is reduced and species of mosquitoes that are opportunistic feeders act as a bridge vector for transmitting West Nile virus to humans (4, 7).
In New York City, during the 1999 season, remediation and control efforts were based on positive West Nile virus human infections, and in the 2000 season, efforts were based on laboratory confirmation of West Nile virus in mosquitoes and birds, which had a delay time of up to 2 weeks. Recently, the New York City Department of Health implemented an in-house laboratory for faster positive mosquito identification. Although these approaches provide a definite confirmation of West Nile virus, they lack timeliness, because positive results in mosquitoes may not appear until West Nile virus activity has substantially intensified. In addition, they identify West Nile virus activity only at discrete point locations, where mosquito traps were positioned and dead birds were found. An important issue for the New York City Department of Health was how to use data points to identify the areal extent of West Nile virus activity in a timely fashion. To address this issue, the New York City Department of Health in cooperation with the Center for Advanced Research of Spatial Information (CARSI) Laboratory of Hunter College, City University of New York, embarked on an effort in January 2001 to develop an area-based system to identify areas of West Nile virus activity that could lead to human infection, for targeting remediation and control efforts. The system needed to be both prospective and dynamic, while providing the New York City Department of Health with the geographic extent of West Nile virus activity.
A widely used methodology for monitoring West Nile virus activity relies on dead crow densities (8, 9) or the number of dead crows per unit area. This approach has a number of limitations: 1) It does not ascribe statistical meaning to the results, leading to the selection of an arbitrary cutoff (critical) density for judging high-risk areas for West Nile virus; 2) density is highly susceptible to reporting bias; 3) density calculations for areas that vary in size, shape, and scale rely on a false assumption of uniformity of crow densities throughout the region, resulting in the aggregation problem of the modifiable areal unit problem (MAUP) (10); 4) densities calculated using kernel functions are subject to edge effects (11), although corrections for edges are available when the process is assumed to be stationary and isotropic (12); and 5) density measures ignore the pathology and ecology of the West Nile virus transmission cycle.
Other methodologies used to model infectious diseases generally factor in only the temporal and not the spatial component of epidemics (13). Those methods that do account for space often use single unpartitioned areas or nonoverlapping spatial units specifically designed for unrelated purposes (e.g., census tracts). The latter can lead to an artificial split of clustered data. In addition, these studies are often retrospective and require the use of controls, the selection of which may be biased when knowledge of the ecology of an infectious disease is incomplete (14). There is limited research on prospective methods that account for both space and time localities (15–17).
Knox (18, 19) proposed a method that allows for statistical testing of the interaction of incidents of infectious disease in space and time that does not suffer from the limitation of density measures that use an arbitrary critical value of density for determining risk localities. The Knox statistic is calculated by pairing all possible data points (e.g., location in space and time of the death of birds) within a clearly defined geographic area and temporal window and testing them against assigned values of what is "close" in space and time. The number of close space-time data pairs is compared with what would be expected if there were no interaction of space and time, and a probability of nonrandom space-time interaction is determined. When the probability is less than 0.05, the likelihood of space-time interaction is significant.
The Knox test is widely used in epidemiologic studies (20–24). However, many of these studies are retrospective with the intent of evaluating infectious etiology from disease incidence in single unpartitioned areas. Only recently has the Knox test been implemented prospectively for local regions (15–17). Rogerson (17) used the Knox test on data updated prospectively with a cumulative sum method; however, this method did not account for dynamic spatial phenomena that exhibit spatial movement in time (12). In the case of West Nile virus in New York City, prospective monitoring is necessary for targeting remediation and control efforts, and dynamic monitoring is essential for tracking the changing spatiality of viral activity. Dynamic monitoring reduces false-positive risk areas where viral activity has subsided, it keeps remediation and control efforts focused on the current viral "hot spots," and it can be used to monitor the efficacy of remediation and control initiatives.
The Dynamic Continuous-Area Space-Time (DYCAST) system was developed to identify and prospectively monitor high-risk areas for West Nile virus, and it was used to assist in guiding the remediation and control efforts of the New York City Department of Health. The DYCAST model is prospective and dynamic, and it relies on the Knox test to statistically assess the significance of space-time interaction and, hence, risk. Reporting bias is decreased using this approach, because the decision to identify a high-risk area is not dependent on a cumulative density measure but on a statistical assessment of the significance of space-time interaction. In fact, significant space-time interaction can occur in low-density situations, and nonsignificant interaction can occur in high-density situations. The Knox analysis is implemented as an interpolation function to create a surface of probabilities over a grid of 1,400 cells overlaying New York City. Each grid cell is assigned a probability based on a Knox analysis of dead birds within a 1.5-mile (2.41-km) radius of its centroid (spatial domain) and a 21-day moving window (temporal domain) preceding each day's run. The DYCAST model was calibrated using year 2000 data on dead bird and human West Nile virus incidence reports and was implemented and tested operationally with year 2001 surveillance data.
MATERIALS AND METHODS
Methodology
The major assumptions of the model used in this study are as follows: 1) West Nile virus is a continuous phenomenon across space; 2) humans are infected at their place of residence; 3) nonrandom space-time interaction of bird deaths is attributed to West Nile virus infection; and 4) each dead bird has an equal opportunity of being reported.
Knox test
The Knox method is used to test for no interaction of incidents in space and time within a clearly defined spatial and temporal domain. Closeness in space and time is based on a set of criteria (e.g., ecology), and pairs of data points are tested as to which of four categories they fall into: close in space only; close in time only; close in space and time; or close in neither space nor time. Knox (19) suggests the construction of a 2 x 2 contingency table as shown in figure 1. T(o11) is the test statistic or the actual number of pairs found close in space and time, and it is calculated as
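$$T(o_{11}) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} s_{ij}\, t_{ij},$$
where n is the number of data points, and s_ij and t_ij equal 1 when the ith-jth pair is close in space and in time, respectively, and 0 otherwise.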
The Knox statistic compares the observed number of pairs close in space and time with the expected number of pairs close in space and time under a random process, given that s = o11 + o21 pairs were found close in space and that t = o11 + o12 pairs were found close in time (19). The expected number of pairs is calculated as
$$E[T(o_{11})] = \frac{s\, t}{N},$$
where N is the total number of pairs examined (the sum of the four cells of the contingency table).
The variance of the Knox statistic was developed by Barton and David (26).
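The pair classification behind the Knox contingency table can be sketched in a few lines of Python. This is a minimal illustration, not the DYCAST implementation: the function name `knox_table`, the `(x, y, day)` event layout, planar coordinates in miles, and the use of inclusive thresholds (`<=`) are all assumptions made for the example.

```python
from itertools import combinations
from math import hypot


def knox_table(events, crit_dist, crit_time):
    """Classify every pair of events into the 2 x 2 Knox table.

    events: list of (x, y, day) tuples (hypothetical layout; x and y in
    miles on a planar grid, day as a number).
    Returns the four cell counts and the expected value of o11.
    """
    o11 = o12 = o21 = o22 = 0
    for (x1, y1, d1), (x2, y2, d2) in combinations(events, 2):
        close_space = hypot(x1 - x2, y1 - y2) <= crit_dist  # assumed inclusive
        close_time = abs(d1 - d2) <= crit_time              # assumed inclusive
        if close_space and close_time:
            o11 += 1  # close in space and time (the test statistic)
        elif close_time:
            o12 += 1  # close in time only
        elif close_space:
            o21 += 1  # close in space only
        else:
            o22 += 1  # close in neither
    n_pairs = o11 + o12 + o21 + o22
    s = o11 + o21  # pairs close in space
    t = o11 + o12  # pairs close in time
    expected = s * t / n_pairs if n_pairs else 0.0
    return {"o11": o11, "o12": o12, "o21": o21, "o22": o22,
            "expected": expected}
```

For five events forming two small space-time clusters, for example, `knox_table(events, 0.25, 3)` returns the observed count of close pairs alongside the count expected if space and time were independent; an observed count well above the expected one is what the significance tests then evaluate.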
Model calibration
Residential location and the presumed date of West Nile virus infection in humans were used as the basis for model calibration, because they were considered to be the most reliable indicators of amplified West Nile virus activity. A Knox test was performed on all dead birds, except pigeons and unknown species, found within a 1.5-mile (2.41 km) spatial domain of the residential location of each human case and within a 21-day temporal domain prior to the case's presumed date of West Nile virus infection. The presumed date of human infection was estimated as 7 days prior to the reported date of onset of symptoms, which is above the mean incubation period of hospitalized patients, 5.3 days (27), and within the range of human infection, 3–15 days (28). The 1.5-mile (2.41 km) buffer represented local areas of relatively high risk based on twice the feeding distance of C. pipiens, 0.68 miles (1.09 km) (29). The 21-day window accounted for two infectious cycles in birds (e.g., infected birds die within 7 days) and the possibility of a spillover effect. Statistical significance was evaluated at different combinations of critical space-time parameters within the spatial and temporal domains.
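The selection of dead birds for one human case can be illustrated as follows. This is a sketch under stated assumptions: the record layout (dicts with `x`, `y`, `found`, and `species`), the function names, planar coordinates in miles, and the treatment of the window boundaries (start day included, presumed infection date excluded) are hypothetical choices, not details given in the text.

```python
from datetime import date, timedelta
from math import hypot

# Species excluded from the calibration test, as described in the text.
EXCLUDED_SPECIES = {"pigeon", "unknown"}


def presumed_infection_date(onset, lag_days=7):
    """Presumed infection date: 7 days before reported symptom onset."""
    return onset - timedelta(days=lag_days)


def birds_in_domain(birds, center_xy, end_date,
                    radius_miles=1.5, window_days=21):
    """Keep birds inside the spatial and temporal domain of one case.

    birds: list of dicts with keys x, y (miles), found (date), species
    (a hypothetical record layout). The window is assumed to run from
    end_date - window_days (inclusive) up to end_date (exclusive).
    """
    start = end_date - timedelta(days=window_days)
    cx, cy = center_xy
    return [
        b for b in birds
        if b["species"] not in EXCLUDED_SPECIES
        and hypot(b["x"] - cx, b["y"] - cy) <= radius_miles
        and start <= b["found"] < end_date
    ]
```

A case with onset on August 15 would thus have a presumed infection date of August 8, and only non-excluded birds found within 1.5 miles during the preceding 21 days would enter the Knox test.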
Selection of critical parameters
A challenging task in the Knox methodology is the selection of critical parameters. Because of the uncertainty in their statistical significance, studies will usually set a "range" and systematically perform the Knox test over the span of the range (23, 24). However, the inference of space-time interaction, based on these ranges over the same data set, results in multiple testing (14, 24). Additionally, a purely statistical decision for space-time parameter selection disregards factors inherent to the nature of the phenomena that are being studied.
In the DYCAST model, the selection of values for the critical parameters happens in the calibration phase and is based in part on ecologic considerations. Critical distance (or measure of space) does not exceed 0.75 miles (1.2 km), reflecting the limited mobility of ill birds and avoiding a distance close to the spatial domain that would reduce the test to one of temporal clustering. Critical time lies between 2 and 7 days, reflecting the period within which infected birds experience limited mobility and die. The Knox test was therefore run for the distances of 0.25 miles (0.4 km), 0.5 miles (0.8 km), and 0.75 miles (1.2 km) and the times of 3 and 6 days, producing six combinations of critical parameters. The actual number of close pairs of dead birds was counted, and the expected number of close pairs and variance (30) were calculated. The probability of significance was assessed using 1) Poisson (if one or more of the Knox contingency table cells had less than five pairs) or chi-square (if all the Knox contingency table cells had five or more pairs) distribution (19, 20, 30), 2) a normal approximation (30), and 3) 1,000 Monte Carlo random permutations as adapted by Mantel (31) at the p = 0.05 level. The critical parameter combination that resulted in significance of space-time interaction of bird deaths within the spatial and temporal domains (1.5-mile buffer, 21 days prior to presumed infection date) for the greatest number of the year 2000 human cases was chosen as the optimal critical parameter combination and set for all future city-wide analyses.
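Of the three significance assessments listed above, the Monte Carlo permutation approach is the simplest to sketch. The example below permutes the dates among the fixed locations and counts how often the permuted number of close pairs reaches the observed one. It is an illustration only: the function names, the `(k + 1) / (n + 1)` p-value estimator, the inclusive thresholds, and the fixed random seed are assumptions, not details taken from the original analysis.

```python
import random
from itertools import combinations
from math import hypot

# The six critical-parameter combinations named in the text.
CRITICAL_DISTANCES = (0.25, 0.5, 0.75)  # miles
CRITICAL_TIMES = (3, 6)                 # days


def count_close_pairs(xy, days, crit_dist, crit_time):
    """Number of pairs close in both space and time."""
    return sum(
        1
        for i, j in combinations(range(len(xy)), 2)
        if hypot(xy[i][0] - xy[j][0], xy[i][1] - xy[j][1]) <= crit_dist
        and abs(days[i] - days[j]) <= crit_time
    )


def knox_monte_carlo_p(xy, days, crit_dist, crit_time,
                       n_perm=1000, seed=0):
    """Permutation p-value for the Knox statistic.

    Dates are shuffled across the fixed locations, which breaks any
    space-time interaction while preserving both marginal patterns.
    The (k + 1) / (n_perm + 1) estimator is an assumed convention.
    """
    rng = random.Random(seed)
    observed = count_close_pairs(xy, days, crit_dist, crit_time)
    shuffled = list(days)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_close_pairs(xy, shuffled, crit_dist, crit_time) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)
```

In a calibration run, this function would be called once per human case for each pairing of `CRITICAL_DISTANCES` and `CRITICAL_TIMES`, and the combination yielding p < 0.05 for the most cases would be retained.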
Spatial design
The city-wide spatial design of the model was used for both the retrospective calibration phase in the year 2000 and for prospective implementation in the year 2001. The spatial design consists of laying a 0.5-mile grid across New York City, running the Knox test on each cell centroid (n = 1,400) as if it were the center of a high-risk area, and assigning to the cell the resulting probability. The localized Knox test uses a 1.5-mile radius as its spatial domain and a 21-day window as its temporal domain. The rationale for using a 1.5-mile-radius domain to evaluate the probability for a 0.5-mile grid cell is that West Nile virus activity is a continuous phenomenon and, therefore, should be modeled as a continuous surface rather than as a collection of discrete adjoining regions. The creation of surfaces has traditionally been accomplished using a function that interpolates the value of a cell based on its neighborhood (32). The grid's centroid spacing of 0.5 miles (0.8 km) (0.707 miles diagonally) was selected to be less than half the feature of interest (e.g., the 1.5-mile range of C. pipiens) to avoid information loss during the interpolation process (33). Similar grid size selection procedures are widely used in remote sensing (34). Treating West Nile virus activity as a surface also avoids the MAUP associated with the arbitrary partitioning of space.
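The overlapping-neighborhood design can be sketched as a loop over grid centroids, each of which draws on all events within the 1.5-mile radius rather than only those inside its own 0.5-mile cell. The code below is a simplified illustration: the bounding-box grid, the function names, and the `cell_test` callback (standing in for a localized Knox test that returns a probability) are assumptions, and the events are assumed to be already restricted to the 21-day window.

```python
from math import ceil, hypot


def grid_centroids(x_min, y_min, x_max, y_max, spacing=0.5):
    """Centroids of square cells covering a bounding box (units: miles)."""
    nx = ceil((x_max - x_min) / spacing)
    ny = ceil((y_max - y_min) / spacing)
    return [
        (x_min + (i + 0.5) * spacing, y_min + (j + 0.5) * spacing)
        for j in range(ny)
        for i in range(nx)
    ]


def risk_surface(centroids, events, cell_test, radius=1.5):
    """Assign each cell the result of a test on its neighborhood.

    events: list of (x, y, day) tuples, assumed already filtered to the
    temporal domain. cell_test: hypothetical callable that takes the
    local events and returns the value to store for the cell.
    """
    surface = {}
    for cx, cy in centroids:
        local = [e for e in events if hypot(e[0] - cx, e[1] - cy) <= radius]
        surface[(cx, cy)] = cell_test(local)
    return surface
```

Because neighboring centroids are 0.5 miles apart while each neighborhood has a 1.5-mile radius, adjacent cells share most of their events, which is what produces a smooth surface instead of abrupt changes at cell boundaries.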
RESULTS
Model implementation
The results of this study demonstrate that the DYCAST model was successful in identifying areas of high risk for West Nile virus at least 13 days prior to the onset of illness in five of the seven human cases in the year 2001. One of the missed cases appeared in an area that was indicated by the model 3 days after the person's onset of illness; however, this case occurred after dead bird surveillance was virtually halted on September 11, 2001, because of the terrorist attack on the World Trade Center buildings in New York City.
A high-risk area for West Nile virus was identified on July 2nd in northern Staten Island, 25 days prior to the first human case of West Nile virus with onset of illness on July 26th (figure 2). South of this location, another area of significant space-time interaction of dead birds appeared 17 days prior to the second human case with onset of illness on August 5th (figure 3). In northeastern Queens, an area of high risk for West Nile virus appeared on July 9th, 38 days prior to the third human case with onset of illness on August 16th (figure 4). Subsequently, significant space-time interaction of dead birds was observed in southwestern Brooklyn on August 20th, 13 days prior to the fourth human case with onset of illness on September 2nd (figure 5). This area of high risk for West Nile virus in Brooklyn was an expansion of an area first observed on August 4th, which persisted and expanded until September 20th. In northern Queens, significant space-time interaction of dead birds was seen from August 3rd through August 28th, 35 days prior to the fifth human case with onset of illness September 7th.
When bird surveillance did resume in early October, the model demonstrated significant space-time interaction on October 8th in lower Manhattan, just 2 days after the seventh and last human case of West Nile virus with onset of illness on October 6th (figure 6). This same area of high risk expanded and remained significant until October 31st, the last date for which we conducted the analysis.
Sensitivity analysis
A sensitivity analysis was performed on data for September 1, 2001, because this date reflected the peak of the season's West Nile virus activity. The results of the analysis using the critical parameter combinations of 0.25 miles (0.4 km), 0.5 miles (0.8 km), and 0.75 miles (1.2 km) with 3 and 6 days are summarized in the table. This table shows the number of cells that were identified as being high-risk areas on the basis of the dead bird interaction effects with the respective parameters, at the p = 0.05 and the p = 0.1 significance levels.
In summary, the DYCAST system was stable over the range of ecologically constrained parameters, and the most constrained parameters (0.25 miles (0.4 km) and 3 days) were the best for initiating remediation and control activities, because they minimized false negatives.
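The tally used in the sensitivity analysis, counting the cells flagged as high risk for each parameter combination and significance level, can be sketched as below. The function name, the dictionary layout, and the use of `None` for cells that were not tested are assumptions for illustration; no actual results from the study are reproduced.

```python
def count_high_risk_cells(surfaces, alphas=(0.05, 0.1)):
    """Count cells with p below each significance level.

    surfaces: dict mapping (crit_dist_miles, crit_time_days) to an
    iterable of per-cell p-values, with None for untested cells
    (a hypothetical layout).
    """
    return {
        params: {
            alpha: sum(1 for p in p_values if p is not None and p < alpha)
            for alpha in alphas
        }
        for params, p_values in surfaces.items()
    }
```

Comparing the counts across the six combinations shows how sensitive the flagged area is to the choice of critical distance and time.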
DISCUSSION
The innovations of the DYCAST system include the use of a statistical measure to assess areas of high risk for West Nile virus rather than the selection of an arbitrary "critical" density, and the use of data regarding the ecology of the host and vectors in the parameter calibration phase, which attunes the model to the real-world phenomenon. In addition, the DYCAST system did not suffer from MAUP inconsistencies, because it was designed as a continuous area system that used a neighborhood interpolation technique in accordance with centroid spacing selection principles. Moreover, the identification of areas of high risk for West Nile virus in locations of low population density (i.e., Staten Island) demonstrates no visible evidence of reporting bias when a minimum bird threshold was calculated and implemented. Edge effects were reduced to cases where an edge spuriously caused the number of dead bird data points not to reach the appropriate 25-bird threshold. Finally, the dynamic nature of the system provided timely identification and incorporation of temporal effects.
Despite the success of the DYCAST system, certain issues require further research. Two points arising from the calibration results include the issue of thresholding and the validity of significance tests. Another important issue that should be addressed is reporting bias in areas of varied socioeconomic characteristics. Finally, the model could be optimized further by incorporating additional data, such as the location of mosquito breeding grounds, environmental conditions, high-risk populations, and feedback mechanisms for further calibration on positive bird and mosquito results and control activities.