1 Harvard Medical School, Boston, MA.
2 Harvard Pilgrim Health Care, Boston, MA.
3 Harvard Vanguard Medical Associates, Boston, MA.
4 Eastern Massachusetts Prevention Epicenter, Centers for Disease Control and Prevention, Boston, MA.
5 Center for Education and Research in Therapeutics, HMO Research Network, Boston, MA.
6 Brigham and Women's Hospital, Boston, MA.
7 University of Sydney School of Public Health, Sydney, Australia.
Received for publication October 28, 2003; accepted for publication November 3, 2003.
We thank Professor Waller for his thoughtful and supportive remarks (1) on our article (2), especially for clarifying and elaborating on several points. We have further thoughts along the lines suggested by his comments.
Waller's comments about the timeliness of reporting are important. Delays in reporting would adversely affect the performance of our model. Before discussing this further, we note that the data in our example are unlikely to have suffered from such delays. The case reporting in the example is part of a real-time integrated patient management and billing system; data are often entered by the health-care provider as care is delivered. Such systems are becoming more common in patient care organizations (3) and should greatly lessen concerns about the timeliness of reporting in surveillance systems that rely on such data.
More generally, however, even data from electronic medical records are prone to many flaws. Examples include patients who cannot be geocoded or are incorrectly assigned to a particular geographic area, physicians who routinely use incorrect codes, and systematic reporting patterns. These flaws are far from catastrophic as long as they affect the historical and surveillance data in a consistent fashion; in most regards, their effect on the modeling will then be relatively benign. While such problems add to the "noise" in the data, they do not systematically cause specific kinds of false signals or prevent detection of true signals, although the added noise does decrease sensitivity and specificity to some degree. If the flaws in the data vary over time, however, the modeling will not usually be able to accommodate the change, and results will be dramatically poorer. For example, when a reporting unit has periods of inaccurate coding interspersed with periods of more precise reporting, surveillance in areas covered by that unit will be very poor. Frustratingly, surveillance can be complicated even when systems improve over time: a dramatic improvement in geocoding, for example, will have unpredictable and possibly undesirable effects if not accommodated in the modeling.
The point about varying geographic intensity of coverage is also well taken. Clearly, any system will have better test characteristics for events that occur where the density of coverage is greatest. In the example data, coverage is generally greatest in areas near the clinics that participate. We expect that detailed simulations will be the most effective way to assess the performance of surveillance systems, and these should include evaluation of the impact of varying spatial coverage.
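As a rough illustration (not an analysis from our article, and with entirely assumed parameters), the sketch below shows the kind of simulation we have in mind: coverage is assumed to decay with distance from a clinic, daily baseline counts are Poisson, an outbreak of fixed size is thinned by the local coverage fraction, and a crude three-standard-deviation alarm rule is applied to estimate how detection probability varies across areas.

```python
# Hypothetical sketch: how spatial variation in coverage affects detection.
# All parameters (decay rate, baseline rate, outbreak size, alarm rule) are
# assumptions chosen only for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_areas = 1000, 20
distance_from_clinic = np.linspace(0.0, 10.0, n_areas)   # arbitrary units
coverage = np.exp(-0.2 * distance_from_clinic)            # assumed capture fraction
baseline_rate = 5.0 * coverage                            # expected daily observed counts

true_excess = 10                                           # outbreak cases actually occurring
threshold = baseline_rate + 3 * np.sqrt(baseline_rate)    # crude alarm rule

detections = np.zeros(n_areas)
for _ in range(n_sims):
    # observed count = baseline noise + outbreak cases thinned by local coverage
    observed = rng.poisson(baseline_rate) + rng.binomial(true_excess, coverage)
    detections += observed > threshold

print("Estimated detection probability by area:")
print(np.round(detections / n_sims, 2))
```

Even this toy version makes the expected pattern explicit: areas far from a participating clinic, with lower coverage, have markedly lower probability of triggering an alarm for the same true outbreak.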
Waller shows great insight regarding confidentiality and the value of this approach in addressing privacy concerns. In a recent project (3), our ability to accept simple summaries of data compiled by zip code, day, and syndrome was valuable in mitigating legitimate concerns about these matters.
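To make concrete the kind of summary involved, the sketch below reduces line-level encounter records to counts by zip code, day, and syndrome; the field names and values are invented for illustration, and only the aggregated table, with no patient-level identifiers, would leave the care organization.

```python
# Minimal sketch of the privacy-preserving aggregation described above.
# The records and field names are hypothetical.
import pandas as pd

encounters = pd.DataFrame(
    {
        "zip": ["02115", "02115", "02215", "02215"],
        "date": ["2003-10-01", "2003-10-01", "2003-10-01", "2003-10-02"],
        "syndrome": ["respiratory", "respiratory", "GI", "respiratory"],
    }
)

# Only counts by zip code, day, and syndrome are shared.
summary = (
    encounters.groupby(["zip", "date", "syndrome"])
    .size()
    .reset_index(name="count")
)
print(summary)
```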
Waller discusses adding higher-level random effects to account for correlation in areas larger than the basic spatial unit. This practice is often necessary when correct standard errors are needed, but we are unsure whether such effects would be valuable in the current type of application. Since standard errors are not required here, neighborhood-level effects would be useful mainly if the shrinkage properties of the random effects in such models differed meaningfully from those in models with only basic-level random effects. On the other hand, there is little need to seek parsimony in these models; we do not recommend removing predictors on the basis of statistical tests.
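A toy numerical illustration of the shrinkage question follows, with made-up counts and an arbitrary shrinkage weight rather than estimated variance components: basic-level random effects shrink each area's estimate toward a global mean, whereas an added neighborhood-level effect shrinks it toward the mean of its larger neighborhood, and the practical question is whether the two sets of shrunken estimates differ enough to matter.

```python
# Hypothetical illustration of shrinkage toward a global mean versus a
# neighborhood mean.  Counts, groupings, and the weight w are invented.
import numpy as np

counts = np.array([2, 3, 12, 1, 9, 8])       # hypothetical area counts
neighborhood = np.array([0, 0, 0, 1, 1, 1])  # two larger neighborhoods
w = 0.5                                       # assumed shrinkage weight

global_mean = counts.mean()
shrunk_to_global = w * counts + (1 - w) * global_mean

nbhd_means = np.array([counts[neighborhood == k].mean() for k in (0, 1)])
shrunk_to_nbhd = w * counts + (1 - w) * nbhd_means[neighborhood]

print("Shrink toward global mean:      ", np.round(shrunk_to_global, 1))
print("Shrink toward neighborhood mean:", np.round(shrunk_to_nbhd, 1))
```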
We again thank Professor Waller for his insights.