1 Department of Biostatistics, University of Michigan, Ann Arbor, MI.
2 Department of Urology, University of Michigan, Ann Arbor, MI.
3 Institute for Social Research, University of Michigan, Ann Arbor, MI.
Received for publication September 17, 2001; accepted for publication June 19, 2002.
ABSTRACT |
---|
bias (epidemiology); Blacks; imputation; nonresponse; urology
Abbreviations: AUABS, American Urological Association Bother Score; AUASS, American Urological Association Symptoms Score; MCAR, missing completely at random; PSA, prostate-specific antigen.
INTRODUCTION |
---|
The Flint Men's Health Study is an ongoing population-based random survey of African-American men in Flint, Michigan. Major aims of the study are to define age-adjusted normal ranges for PSA levels in African-American men without prostate cancer and to define the prevalence of lower urinary tract symptoms. To our knowledge, it is the first population-based study to evaluate urinary symptoms among African-American men. Details on the study design and preliminary results have been published elsewhere (6–8).
Data collection was conducted in two stages: first an interview questionnaire and second, for eligible men, a follow-up clinical examination that included a biopsy in a selected subset of men. There was a high response rate at the interview stage (86 percent) but a lower response rate (52 percent) at the clinical examination stage. The low response rate in the clinical examination stage was viewed as a potential problem for interpretation of some of the study's results. For example, a specific quantity of interest is the prevalence of lower urinary tract symptoms, as measured by the American Urological Association Symptoms Score (AUASS). Results for this measure in the clinical examination stage showed a trend towards increasing AUASS with age but a surprising decrease in the highest age group (8). This finding may have been a consequence of nonresponse bias.
It is common to have missing data in epidemiologic studies. Gaps in the data can lead to biased assessment and misleading interpretation of the data collected. Statistical theory associated with missing data has been characterized elsewhere (9), and this is an active area of research. One generally applicable approach to analyzing data with missing values is multiple imputation (10, 11). Use of multiple imputation for assistance in the analysis of public health and epidemiologic data is becoming more common (12–16). With the multiple imputation method, missing values are "filled in" in an appropriate way consistent with the observed data. The procedure is repeated multiple times to create multiple complete data sets. Each data set is then analyzed separately, and the results are combined to obtain statistical inferences about quantities of interest. The quantities of interest can be measures of association, such as regression coefficients or odds ratios, or summary statistics for a particular variable, such as the mean value or 95th percentile. The theory and implementation of multiple imputation and the conditions for its validity have been described in a paper by Rubin (11) and in many subsequent articles.
In the Flint Men's Health Study, values for clinical examination variables for the men who did not have a clinical examination can be regarded as missing data. Similarly, biopsy results for men who did not undergo biopsy are missing data. For each missing data point, a missing value can be conceptualized as the value that would have been obtained if that variable had been measured. If the missing values were known, the data set would be complete, and most analyses would be relatively straightforward to perform. Since this is a population-based sample, an inference could then be validly generalized to the population from which the sample was chosen. We undertook multiple imputation of missing data in the Flint Men's Health Study to correct for potential nonresponse bias in our interpretation of the normal range of PSA levels and self-administered AUASS.
MATERIALS AND METHODS |
---|
Study variables
PSA level, self-administered AUASS, and biopsy result were the primary variables of interest in this analysis. However, the multiple imputation procedure must simultaneously consider other variables that might be associated with the three primary variables and variables that might be associated with the probability of participating at the clinical stage or the biopsy stage of the study. For example, questionnaire AUASS clearly will be strongly correlated with self-administered AUASS and therefore will be very useful for imputing the missing self-administered AUASS data. We performed an investigation of factors associated with nonparticipation in the clinical phase of the study (6, 7). We found that age, family history of prostate cancer, alcohol use, and AUASS were associated with participation in the clinical stage of the study.
Table 1 describes the variables used in the multiple imputation analysis, including the type of variable, its range, and the number of observed data points. The 48 observed biopsies (11 positive, 37 negative) in the study have been augmented by biopsy results from nine additional men who reported having a negative biopsy during the past year on the questionnaire. In our analyses, a person was regarded as having an indication for biopsy if the PSA level was greater than 4, the digital rectal examination was not normal, or the ultrasound results were viewed as suspicious.
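The biopsy-indication rule stated above can be written as a small predicate. This is only a sketch of the rule as worded in the text; the function and argument names are hypothetical, and the PSA threshold is taken as the bare number 4 given above.

```python
def biopsy_indicated(psa: float, dre_normal: bool, ultrasound_suspicious: bool) -> bool:
    """Return True if at least one of the three criteria in the text is met.

    psa                   : PSA level (same units as the threshold in the text)
    dre_normal            : True if the digital rectal examination was normal
    ultrasound_suspicious : True if the ultrasound results were viewed as suspicious
    """
    return psa > 4.0 or (not dre_normal) or ultrasound_suspicious


if __name__ == "__main__":
    print(biopsy_indicated(2.3, True, False))   # no criterion met
    print(biopsy_indicated(2.3, False, False))  # abnormal digital rectal examination
```

Note that the comparison is strict, so a PSA level of exactly 4 does not by itself trigger the indication.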
Endpoints 1–4 above are generalizable to a population of African-American men aged 40–79 years who have had no previous urologic surgery, have had no prior diagnosis of prostate cancer, and would have no evidence of prostate cancer from a biopsy. Endpoint 5 is generalizable to a population of African-American men who have had no prior urologic surgery and have no prior known diagnosis of prostate cancer.
Multiple imputation procedure
In multiple imputation, the missing values are drawn from an appropriate distribution that characterizes the conditional relation of the imputed variables to other variables. Of course, the imputed values for any subject are not real values and have no interpretation. They are used simply as statistical tools with which to effectively use other nonmissing variables from that subject to make an inference about the quantity of interest. The procedure of drawing missing values from the distribution is repeated M times. Because the missing values are drawn from a distribution, there will be a range of values imputed for each missing value, with this variation appropriately reflecting the uncertainty about that value. After imputation, each of the M completed data sets is analyzed separately, and the results are combined. The Bayesian theoretical underpinnings of the method (11) require a statistical model for the joint probability distribution of all of the variables. On the basis of this model, each missing value is drawn from an appropriate distribution. Let $Q$ denote the quantity of interest, for example, the mean AUASS; let $X_{\mathrm{obs}}$ denote the observed data and $X_{\mathrm{mis}}$ the missing data; and let $\theta$ denote the parameters in the model for $X = (X_{\mathrm{obs}}, X_{\mathrm{mis}})$. Inference about $Q$ can be represented by the posterior distribution of $Q$ given $X_{\mathrm{obs}}$. Technical details on the theory dictating how the values of $X_{\mathrm{mis}}$ are imputed are given by Little and Rubin (9).
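For each of the completed data sets, the inference for $Q$ can be summarized by an estimate and a standard error; denote these $\hat{Q}_m$ and $SE_m$, $m = 1, \ldots, M$. Then the final estimate for $Q$ is

$$\bar{Q} = \frac{1}{M} \sum_{m=1}^{M} \hat{Q}_m,$$

and the final variance is a sum of between- and within-imputation variances given by

$$V = \left(1 + \frac{1}{M}\right) \frac{1}{M-1} \sum_{m=1}^{M} \left(\hat{Q}_m - \bar{Q}\right)^2 + \frac{1}{M} \sum_{m=1}^{M} SE_m^2. \qquad (1)$$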
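The combining step described above (the standard multiple-imputation combining rules) can be sketched in a few lines of Python. The function name and the example numbers are hypothetical and are not results from the study; the sketch only shows how M estimates and standard errors are pooled into one estimate and one total variance.

```python
import math
from statistics import mean, variance


def combine_imputations(estimates, std_errors):
    """Pool M completed-data estimates and standard errors.

    Returns (pooled estimate, total variance), where the total variance is
    the within-imputation variance plus (1 + 1/M) times the
    between-imputation variance.
    """
    M = len(estimates)
    if M < 2 or len(std_errors) != M:
        raise ValueError("need M >= 2 estimates with matching standard errors")
    q_bar = mean(estimates)
    within = mean([se ** 2 for se in std_errors])
    between = variance(estimates)  # sample variance, divisor M - 1
    total = within + (1.0 + 1.0 / M) * between
    return q_bar, total


if __name__ == "__main__":
    # Hypothetical estimates of a mean score from M = 5 completed data sets.
    q, v = combine_imputations([7.1, 7.4, 6.9, 7.3, 7.0], [0.40, 0.42, 0.39, 0.41, 0.40])
    print(q, math.sqrt(v))
```

The between-imputation term grows when the completed data sets disagree, which is how the pooled standard error reflects uncertainty about the missing values.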
The implementation of multiple imputation in this study is based on the procedure described by Raghunathan et al. (17) using software called IVEware (18). Multiple imputation requires that the joint distribution of all of the variables which have missing values be specified; this is difficult when the number of variables is large and the variables are of continuous, categorical, and mixed types. In this procedure, the joint distribution is approximated by a series of conditional distributions of each variable conditioned on all other variables. The conditional distribution of each continuous variable is given by a standard linear regression model. Logistic regression is used for binary variables. For count variables, Poisson regression is used. For categorical variables with more than two response categories, a polytomous regression model is used. A mixed variable is one which is either zero or positive, and if it is positive it has a continuous distribution. For imputing a mixed variable, first the zero/nonzero status is imputed using logistic regression and then, conditional on its being nonzero, normal linear regression is used to impute a continuous value.
The IVEware procedure iterates through all of the variables. For each variable to be imputed, the regression model is fitted to the current values of observed and imputed data, with that variable designated as the response variable and the others included as the covariates. The model fitting gives estimates and the covariance matrix for the parameters $\theta$. A value of $\theta$ is drawn from a multivariate normal approximation to the posterior distribution of $\theta$. Then a new value of the missing data is drawn from the model specified by the regression equation, using the explanatory variables and the drawn value of $\theta$. This is repeated with each variable considered as the response variable, in sequence. The whole sequence is repeated pM times. With every pth set of sequences, the imputed values are saved, giving M completed data sets. The value of p needs to be large enough that the imputed values are essentially uncorrelated, and M is typically 5 or greater, with larger values being preferred if there is a substantial amount of missing data. For these analyses, we used p = 20 and M = 20.
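The iteration described above can be illustrated with a deliberately small sketch for two continuous variables, each with some missing values. This is not IVEware and not the authors' code. The function names are hypothetical, starting values are simple mean fills, each regression is fitted on the rows where the response was observed, and parameters are drawn from a simple approximate posterior for a one-covariate linear model. All of these are assumptions made for illustration.

```python
import math
import random


def fill_with_mean(values):
    """Starting values: replace None by the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def redraw_missing(y, x, y_missing, rng):
    """Redraw the originally missing entries of y from a regression on x.

    y, x are the current completed values; y_missing flags the rows where
    y was originally missing. The covariate is centered so that intercept
    and slope estimates are uncorrelated and can be drawn separately.
    """
    rows = [i for i, miss in enumerate(y_missing) if not miss]
    n = len(rows)
    x_bar = sum(x[i] for i in rows) / n
    y_bar = sum(y[i] for i in rows) / n
    sxx = sum((x[i] - x_bar) ** 2 for i in rows)
    slope = sum((x[i] - x_bar) * (y[i] - y_bar) for i in rows) / sxx
    rss = sum((y[i] - y_bar - slope * (x[i] - x_bar)) ** 2 for i in rows)

    # Draw the parameters: residual variance first, then the coefficients.
    sigma2 = rss / rng.gammavariate((n - 2) / 2.0, 2.0)  # rss / chi-square(n - 2)
    a = rng.gauss(y_bar, math.sqrt(sigma2 / n))
    b = rng.gauss(slope, math.sqrt(sigma2 / sxx))

    out = list(y)
    for i, miss in enumerate(y_missing):
        if miss:
            out[i] = a + b * (x[i] - x_bar) + rng.gauss(0.0, math.sqrt(sigma2))
    return out


def sequential_imputation(y1, y2, M=5, p=20, seed=0):
    """Cycle through both variables p*M times, saving every p-th cycle."""
    rng = random.Random(seed)
    miss1 = [v is None for v in y1]
    miss2 = [v is None for v in y2]
    cur1, cur2 = fill_with_mean(y1), fill_with_mean(y2)
    completed = []
    for cycle in range(1, p * M + 1):
        cur1 = redraw_missing(cur1, cur2, miss1, rng)
        cur2 = redraw_missing(cur2, cur1, miss2, rng)
        if cycle % p == 0:
            completed.append((list(cur1), list(cur2)))
    return completed


if __name__ == "__main__":
    y1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, None, 8.0, None, 10.0]
    y2 = [2.1, 3.9, None, 8.2, 9.8, 12.1, 14.0, None, 18.1, 20.2]
    data_sets = sequential_imputation(y1, y2, M=5, p=20, seed=1)
    print(len(data_sets), "completed data sets")
```

Each saved pair is one completed data set; an analysis would be run on each and the results pooled. A real implementation would also handle binary, count, categorical, and mixed variables with the corresponding regression models.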
For binary variables for which there is a substantial amount of missing data, specifically the "biopsy result" and "indicator for biopsy needed" variables, two refinements of the algorithm are used. An importance sampling scheme is used to draw $\theta$ from its posterior distribution (12), and the number of explanatory variables is limited to the best six.
Within each age category, the final variance of the multiple imputation estimate is a sum of the within- and between-imputation variance components. For the mean AUASS, the within-imputation standard error is $SD/\sqrt{n}$, and for a percentage, the within-imputation standard error is given by $\sqrt{p(1-p)/n}$, where $p$ is the estimated proportion and $n$ is the sample size within each age group. The within-imputation standard error for the 95th percentile of PSA is calculated from the formula

$$SE = \frac{1}{f(\hat{x}_P)} \sqrt{\frac{P(1-P)}{n}}, \qquad (2)$$

where $P = 0.95$, $\hat{x}_P$ is the estimated percentile, and $f(\cdot)$ denotes the probability density function, which was estimated from the data.
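As a rough illustration of the percentile standard error, the sketch below uses the usual large-sample form: the square root of P(1 - P)/n divided by the density at the estimated percentile. The text does not say how the density was estimated, so the Gaussian kernel estimate with a rule-of-thumb bandwidth used here is an assumption, as are the function name and the simple order-statistic percentile.

```python
import math
from statistics import stdev


def percentile_se(values, P=0.95):
    """Return (estimated P-th quantile, approximate standard error).

    The density at the quantile is estimated with a Gaussian kernel and a
    rule-of-thumb bandwidth; this choice is an assumption of the sketch.
    """
    xs = sorted(values)
    n = len(xs)
    k = min(n - 1, max(0, math.ceil(P * n) - 1))
    q = xs[k]  # simple order-statistic estimate of the quantile

    h = 1.06 * stdev(xs) * n ** (-0.2)  # rule-of-thumb bandwidth
    density = sum(math.exp(-0.5 * ((q - x) / h) ** 2) for x in xs)
    density /= n * h * math.sqrt(2.0 * math.pi)

    se = math.sqrt(P * (1.0 - P) / n) / density
    return q, se


if __name__ == "__main__":
    # Hypothetical values, not study data.
    sample = [0.4, 0.6, 0.7, 0.9, 1.0, 1.1, 1.3, 1.6, 2.0, 2.4, 3.1, 4.8]
    print(percentile_se(sample))
```

Because the density in the upper tail is small and hard to estimate, the resulting standard error can be sensitive to the bandwidth choice, especially in small age groups.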
Survey weights
The design of the Flint Men's Health Study included oversampling of older individuals. The sampling weights for the four age categories 40–49, 50–59, 60–69, and 70–79 years were 4.45, 2.96, 2.34, and 1.00, respectively. The weights reflect age-specific adjustments for the population-based sampling probabilities and minor corrections for the small differences in age-specific rates of response to the interview questionnaire, which were slightly lower for men aged 60–79 years. These weights are important because they must be taken into account in any overall estimates and measures of uncertainty. For example, the overall average AUASS is a weighted average of the age-category-specific mean AUASS values. The standard errors accounting for the sampling weights were computed using the Taylor series approach when estimating weighted means and proportions and the "jackknife" approach when estimating coefficients in the regression models (19) as implemented in the IVEware software. For the overall 95th percentile, a weighted mean and variance were calculated to approximate the density in equation 2.
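The weighted overall mean can be sketched as follows. The four weights are those given in the text; the group sizes and group means in the example are hypothetical placeholders, not study results, and the function name is an assumption. The sketch covers only the point estimate, not the Taylor series or jackknife standard errors.

```python
# Sampling weights by age category, as given in the text.
WEIGHTS = {"40-49": 4.45, "50-59": 2.96, "60-69": 2.34, "70-79": 1.00}


def weighted_overall_mean(group_means, group_sizes, weights=WEIGHTS):
    """Weighted average of age-specific means.

    Every man in age group g carries weight weights[g], so the group's
    total weight is weights[g] * group_sizes[g].
    """
    numerator = sum(weights[g] * group_sizes[g] * group_means[g] for g in weights)
    denominator = sum(weights[g] * group_sizes[g] for g in weights)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical sample sizes and mean scores for illustration only.
    sizes = {"40-49": 100, "50-59": 100, "60-69": 100, "70-79": 100}
    means = {"40-49": 1.0, "50-59": 2.0, "60-69": 3.0, "70-79": 4.0}
    print(weighted_overall_mean(means, sizes))
```

With equal group sizes, the youngest group dominates the overall mean because its weight is more than four times that of the oldest group, which is the intended correction for oversampling older men.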
RESULTS |
---|
DISCUSSION |
---|
In this paper, we implemented multiple imputation using IVEware software. There are a number of programs available for multiple imputation, some of which were reviewed by Horton and Lipsitz (20) and can be found on the World Wide Web site http://www.multiple-imputation.com. There is no single software product that can handle all designs and data types. IVEware was particularly convenient for our data because it can be applied when there is a large number of continuous and categorical variables, it can accommodate general patterns of missingness, and it has a procedure for incorporating the survey sampling weights into the final analysis.
The terminology of missing data (9) classifies the missingness pattern into three types: missing completely at random (MCAR), missing at random, and nonignorable missingness. The MCAR pattern occurs when the probability of missingness does not depend on the value of the data. In the context of the Flint Men's Health Study, this would occur if the men who did not participate at the clinical examination stage were a random sample of all eligible men from the questionnaire stage. When MCAR does occur, analysis of the observed data is valid, although it may not be efficient and techniques such as multiple imputation could improve the efficiency. From table 2, it is clear that MCAR is not valid for the Flint Men's Health Study.
The pattern "missing at random" occurs when the probability of missingness is dependent on the observed data but not on the unobserved data. Under these conditions, simple analysis of the observed data can give an invalid statistical inference. For example, the proportion biopsy-positive from the observed data is clearly not a valid estimate of the population proportion. However, likelihood-based or model-based methods will give a valid inference provided that the models are correctly specified and the variables that determine the probability of missingness are included in the model. Multiple imputation is an example of a model-based method.
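A small simulation can make the "missing at random" argument concrete. The setup below is entirely hypothetical: a score x is observed for everyone, a second score y is observed only for responders, and the probability of responding depends only on x. The sketch compares the mean of y among responders with a simple regression-based estimate that predicts y from x for nonresponders. This is a single deterministic fill-in used only to show the bias correction; it is not multiple imputation and does not give valid standard errors.

```python
import random


def simulate(n=20000, seed=1):
    """Return (full-data mean, responder-only mean, model-based mean) of y."""
    rng = random.Random(seed)
    x = [rng.gauss(10.0, 3.0) for _ in range(n)]   # observed for everyone
    y = [xi + rng.gauss(0.0, 1.0) for xi in x]     # observed only for responders
    # Missing at random: response probability depends only on the observed x.
    responded = [rng.random() < (0.8 if xi < 10.0 else 0.3) for xi in x]

    obs = [(xi, yi) for xi, yi, r in zip(x, y, responded) if r]
    m = len(obs)
    responder_mean = sum(yi for _, yi in obs) / m

    # Least-squares regression of y on x among responders.
    x_bar = sum(xi for xi, _ in obs) / m
    sxx = sum((xi - x_bar) ** 2 for xi, _ in obs)
    slope = sum((xi - x_bar) * (yi - responder_mean) for xi, yi in obs) / sxx
    intercept = responder_mean - slope * x_bar

    # Keep observed y; predict y from x where it is missing.
    filled = [yi if r else intercept + slope * xi
              for xi, yi, r in zip(x, y, responded)]
    model_based_mean = sum(filled) / n
    full_data_mean = sum(y) / n
    return full_data_mean, responder_mean, model_based_mean


if __name__ == "__main__":
    full, responders, model_based = simulate()
    print(full, responders, model_based)
```

In this setup the responder-only mean is pulled toward low values of x, while the model-based estimate stays close to the full-data mean because the variable that drives missingness is included in the model.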
Nonignorable missingness is where the probability of missingness depends on the value of the unobserved data. In this case, likelihood-based methods are not valid unless a model for the missingness mechanism is also specified. In general, there is no information in the observed data with which to check the appropriateness of the missingness model.
Previous analysis (6, 7) of data from the Flint Men's Health Study indicated that the probability of nonparticipation in the clinical examination stage was associated with the questionnaire variables age, AUASS, and number of alcoholic drinks per day, among other factors. There was a lower participation rate among older men and those who had urologic symptoms or were already seeing a doctor on a regular basis. A possible interpretation is that younger and healthier men participated at a higher rate because they viewed the clinical examination as beneficial, whereas older participants tended to be already under the care of a doctor and saw less benefit in the clinical examination. Because factors associated with participation in the clinical examination were obtained at the questionnaire stage and included in the model, it is plausible that the missingness mechanism is "missing at random," for which multiple imputation is valid.
As a sensitivity analysis, we have presented the results from 12 different imputation models. Numerically the results from the models differ, but with a few exceptions, the differences are slight, giving some reassurance as to the robustness of the conclusions. If one wants to select a best model, one should include in it variables that are associated with the quantities of main interest and variables that are associated with the probability of missingness. Standard model assessment and checking procedures can be used to choose a specific functional form for the model. However, it may not be possible to decide on the functional form of the best model from the observed data; in these situations, it is important to perform a sensitivity analysis to assess the robustness of the conclusions. The results for model 12 for the percentage biopsy-positive are different because they are based on an imputation model that excludes the "indication for biopsy" variable. Similarly, in model 9, the definition of "indication for biopsy" was changed. This variable is known to be strongly associated with the probability of having a missing biopsy result. This illustrates the importance of including in the imputation model variables that are known to be associated with the missingness process (21).
In the current study, multiple imputation was applied to improve efficiency for estimating appropriate age-specific reference ranges. The Flint Men's Health Study was established to determine the age-specific reference ranges for PSA among African-American men, but the sample has also been used to evaluate the association between aging and lower urinary tract symptoms. The rationale for applying community-based cohorts to the derivation of normative reference ranges is minimization of the selection bias inherent in the use of clinical samples. It is apparent from table 2 that nonresponse bias affected the interpretation of AUASS from the clinical stage. Following multiple imputation, there is a clear increasing trend in average AUASS with age. The various imputation models show small differences in the average AUASS, but they all show increasing trends with age. The differences in average AUASS are all within the uncertainty limits of one another, so this is not a cause of great concern. The fact that all models showed increasing AUASS with age suggests that this conclusion is quite robust.
The results for the 95th percentile of PSA are quite consistent across imputation models. A similarly designed study of White men in Olmsted County, Minnesota (22), estimated the 95th percentile of PSA level to be 2.0, 2.9, 5.8, and 6.2 in persons aged 40–49, 50–59, 60–69, and 70–79 years, respectively. These numbers are quite similar to our results, and thus we do not find evidence for a difference in the normal range of PSA values between White and African-American men.
In summary, multiple imputation is an attractive and feasible approach for analyzing data sets with missing values and for assessing and potentially correcting for nonresponse bias. This analysis demonstrates the utility of multiple imputation in a large population-based study of prostate disease.
ACKNOWLEDGMENTS |
---|
NOTES |
---|
REFERENCES |
---|