1 Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT
2 Department of Medical Education, Griffin Hospital, Derby, CT
Correspondence to Dr. Lisa Calvocoressi, Department of Medical Education, Griffin Hospital, 130 Division Street, Derby, CT 06418 (e-mail: lisa.calvocoressi{at}yale.edu).
Received for publication February 9, 2005. Accepted for publication July 8, 2005.
ABSTRACT |
---|
age factors; epidemiologic methods; mammography; mass screening; risk assessment
INTRODUCTION |
---|
Across studies, predictors of adherence to regular screening include demographic variables such as younger age (4–6), White race (7–9), higher income and more education (7–11), and being married (7, 11, 12). Additional predictors include having health insurance (8, 9, 13), having a regular health-care provider (12–14), receiving a provider's recommendation or reminder notice to obtain a mammogram (4, 6, 12, 14–17), having better self-reported health (14, 18), having a family history of breast cancer (8, 13, 19, 20), participating in regular physical exercise (21, 22), not smoking (12, 16), having knowledge of breast screening guidelines (4, 6, 20, 23), and having a past history of screening (24–26). Psychosocial predictors of adherence include a belief that mammography is beneficial (4, 18, 26), confidence in one's ability to obtain a mammogram (26), and perceived control over the effects of breast cancer (4). High perceived susceptibility to breast cancer has predicted adherence in most studies (6, 13, 15) but has been found to adversely affect adherence in some investigations (27–30). In addition, past mammography experiences marked by embarrassment, pain, or anxiety have been implicated as screening barriers (18, 23, 31). Notwithstanding the many predictors, however, adherence to screening guidelines is suboptimal (32, 33). A comprehensive review between 1990 and 2001 found that only 46 percent of eligible women were screening according to established guidelines (34). Further work is thus needed to understand how predictors of adherence may be utilized to develop effective strategies to promote this screening behavior.
Effective utilization of predictors to promote adherence may be compromised because knowledge of these predictors is confined to their individual and independent influences on adherence, derived from standard bivariate and multivariate analyses. Identifying a manageable number of pertinent predictors and investigating whether, and for whom, certain combinations of predictors are associated with adherence may further our understanding of factors motivating adherence and enhance our ability to develop effective intervention strategies. Recursive partitioning, a nonparametric technique used with increasing frequency in clinical research and epidemiologic studies (35), is well suited to these tasks. Also known as tree analysis, recursive partitioning can distinguish a subset of important variables from a larger pool of "candidate predictors" and can classify subjects into well-separated subgroups with respect to the outcome of interest across these predictor variables (36–38). The results of a recursive partitioning analysis are presented in the form of an inverted "tree" in which the sample has been repeatedly split into binary subgroups based on examining all candidate variables and, for each split, selecting the predictor that best partitions the sample into relatively homogeneous subgroups based on the outcome. When tested on learning samples, tree analysis has produced models with better predictive accuracy than parametric methods (36, 38, 39). For a detailed explication of the recursive partitioning method and its statistical underpinnings, we refer the interested reader to reports by Breiman et al. (36) and Zhang and Singer (37).
We examined data from the prospective cohort study, Race Differences in the Screening Mammography Process, which included a broad range of potential predictors of adherence to screening guidelines. By use of recursive partitioning, we sought 1) to identify a subset of variables that predicted adherence to screening guidelines and 2) to delineate subgroups across these predictors that differed in the proportions of adherent women. In keeping with the exploratory nature of recursive partitioning, we allowed the forms of relations among predictors and outcome to become manifest (40) and did not specify hypotheses concerning these relations in advance.
MATERIALS AND METHODS |
---|
Participants completed a baseline telephone interview on average 1.5 months after the index screening (standard deviation, 0.85 month; range, 1–6 months), as well as a follow-up telephone interview on average 29.4 months thereafter (standard deviation, 1.42 months; range, 27–41 months). The baseline questionnaire included sociodemographic, health history, medical care, behavioral, and psychosocial factors. The follow-up questionnaire included information on mammograms obtained subsequent to the index screening as well as other variables. Approvals of institutional review boards of this institution and of participating hospitals were obtained to conduct the study. Oral consent for participation in the study interviews was obtained following prospective respondents' review of a study information sheet and discussion with a trained study interviewer.
Of the 1,451 women who completed baseline interviews (73 percent participation), 1,249 participated in follow-up interviews (86 percent). Twenty women who completed a follow-up interview were excluded because they provided insufficient information to determine their adherence to guidelines during follow-up, or because they were diagnosed with cancer following study entry and did not subsequently adhere to a regular screening schedule. The total number of women thus available for this analysis was 1,229 (39.4 percent African Americans and 60.6 percent Whites).
Variables
We assessed the outcome, adherence to mammography screening guidelines, at follow-up based on American Cancer Society guidelines in effect in 1996, at the onset of this study's data collection period: annual screening of women aged 50 years or more; one screening of women aged 40–49 years every 1–2 years (41). We thus considered women aged 50–79 years adherent if they obtained at least two screenings within 26 months (2 years + 2 months) of the index examination. We considered women aged 40–49 years adherent if they obtained at least one screening within 26 months (2 years + 2 months) of that examination.
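The adherence definition above can be written as a small rule. The following is only a sketch: the function name `is_adherent` and its parameter names are hypothetical, and it assumes that age refers to the woman's age group at study entry and that the count of screenings already excludes the index examination.

```python
def is_adherent(age_years: int, screenings_within_26_months: int) -> bool:
    """Apply the study's adherence definition (hypothetical helper).

    Women aged 50-79 years need at least two screenings, and women aged
    40-49 years at least one, within 26 months of the index examination.
    """
    if 50 <= age_years <= 79:
        required = 2
    elif 40 <= age_years <= 49:
        required = 1
    else:
        # The study definition covers ages 40-79 years only.
        raise ValueError("age outside the 40-79 year range covered by the definition")
    return screenings_within_26_months >= required


print(is_adherent(45, 1))  # one screening suffices at ages 40-49
print(is_adherent(62, 1))  # two screenings are required at ages 50-79
```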
For the tree analysis, we selected 22 candidate predictors to examine in relation to adherence to screening guidelines, based on factors associated with adherence in the published literature (4–26, 31). These included the following: 1) sociodemographics (race/ethnicity, age, marital status, education, annual family income); 2) health-care factors (full annual mammography insurance coverage, having the same (i.e., usual) health-care provider over the past year, receipt of a health-care provider's recommendation to obtain a mammogram subsequent to the index screening, receipt of a reminder notice to obtain a mammogram subsequent to that screening); 3) health status and behaviors (self-rated health, history of breast cancer in a first- or second-degree relative, participation in any form of aerobic exercise at least once a week, pack-years of smoking); 4) mammography-related factors (knowledge of mammography screening guidelines, past history of adherence to mammography screening guidelines prior to the index screening); and 5) psychosocial factors (perceived susceptibility to developing breast cancer in one's lifetime, perceived usefulness of mammograms for detecting breast abnormalities, embarrassment experienced during the index screening, pain experienced during index screening compared with expectations, anxiety experienced during index screening compared with expectations, confidence in one's ability to make arrangements to obtain a future mammogram, perceived control over cancer recovery). With the exception of receipt of a provider's recommendation and receipt of a reminder notice to obtain a mammogram subsequent to the index screening, which were assessed at follow-up, information on the candidate predictors was obtained during the baseline interview. Specific coding of each predictor is shown in table 1. For the logistic regression analysis shown in that table, we combined categories that included few subjects (e.g., somewhat/a little/not at all useful/don't know). For the recursive partitioning analysis, classification and regression tree (CART) software automatically dichotomized variables with multiple categories.
To form the classification tree, CART repeatedly partitioned or split the study population into binary subgroups (i.e., nodes). To determine which variable to use for each split, CART examined all possible binary splits of the sample by each candidate predictor. CART then selected the predictor (and its particular dichotomization) that split the sample into the most homogeneous binary subgroups based on adherence status (yes/no). The Gini impurity criterion (38) that measures and ranks the extent to which each split departs from complete homogeneity (i.e., where all subjects in one branch of the split have the outcome under study and all subjects in the other branch do not have the outcome) was used for this purpose. For each split, CART selected the variable with the lowest impurity score. For variables with multiple categories, we elected to have CART examine every possible binary combination of those categories to determine the best split. This approach may provide evidence of a relation between predictor and outcome that is not linear (e.g., if the predictor is split into two categories: low/high vs. moderate).
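The split search described above can be illustrated with a short sketch. It is not the CART software itself: it assumes a binary outcome, uses the standard two-class Gini impurity 2p(1 − p), and scores each candidate split by the size-weighted impurity of its two branches. The function names and the category counts are hypothetical and chosen only to show how a low/high versus moderate grouping can emerge.

```python
from itertools import combinations


def gini(n_yes, n_total):
    """Gini impurity for a binary outcome: 1 - p^2 - (1 - p)^2 = 2p(1 - p)."""
    if n_total == 0:
        return 0.0
    p = n_yes / n_total
    return 2.0 * p * (1.0 - p)


def best_binary_split(counts):
    """Find the two-way grouping of categories with the lowest impurity.

    counts maps category -> (n_adherent, n_total). Returns the categories
    in one branch and the size-weighted Gini impurity of the split.
    """
    cats = sorted(counts)
    if len(cats) < 2:
        raise ValueError("need at least two categories to split")
    n_all = sum(total for _, total in counts.values())
    # Fix the first category in the left branch so mirror-image splits
    # are not examined twice; the left branch never holds every category.
    first, rest = cats[0], cats[1:]
    best_left, best_score = None, float("inf")
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = (first,) + extra
            yes_l = sum(counts[c][0] for c in left)
            tot_l = sum(counts[c][1] for c in left)
            yes_r = sum(counts[c][0] for c in cats if c not in left)
            tot_r = n_all - tot_l
            score = (tot_l * gini(yes_l, tot_l) + tot_r * gini(yes_r, tot_r)) / n_all
            if score < best_score:
                best_left, best_score = left, score
    return best_left, best_score


# Hypothetical counts (adherent, total) for a three-level predictor.
toy = {"low": (30, 100), "moderate": (70, 100), "high": (30, 100)}
left, score = best_binary_split(toy)
print(left, round(score, 3))  # groups 'high' with 'low', leaving 'moderate' alone
```

Enumerating every grouping grows exponentially with the number of categories, which is manageable here only because the predictors have few levels.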
CART fully partitioned or "grew" the tree until the default lower limit of 10 subjects in a node was reached. CART then derived a number of smaller "pruned" trees, based on a 10-fold cross-validation procedure (36, 38) that identified from among trees of a given size those with the lowest misclassification of subjects on the outcome variable. CART gives the investigator the option of choosing the tree for study with the overall lowest cross-validated misclassification of subjects regardless of size, or the smallest tree with cross-validated misclassification within one standard error of that tree. To obtain a tree of manageable size, we chose the latter. At the bottom of the pruned tree are "terminal" nodes that represent relatively well-separated and homogeneous subgroups across the predictors included in the tree. CART provided the percentages of adherent and nonadherent women in each of these subgroups. We calculated 95 percent confidence intervals for the percentages of adherent women.
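The choice between the two pruned trees can be sketched as follows. This is an illustration of the one-standard-error rule under stated assumptions, not the CART implementation: the function name is hypothetical, and the tree sizes, cross-validated misclassification rates, and standard errors are invented for the example.

```python
def one_se_choice(trees):
    """Pick the smallest tree within one standard error of the best tree.

    trees is a list of (n_terminal_nodes, cv_error, cv_standard_error).
    """
    # Tree with the overall lowest cross-validated misclassification.
    best = min(trees, key=lambda t: t[1])
    threshold = best[1] + best[2]
    # Keep every tree whose error does not exceed that threshold,
    # then take the one with the fewest terminal nodes.
    eligible = [t for t in trees if t[1] <= threshold]
    return min(eligible, key=lambda t: t[0])


# Hypothetical pruning sequence: (terminal nodes, CV error, standard error).
candidates = [
    (2, 0.48, 0.02),
    (5, 0.43, 0.02),
    (8, 0.41, 0.02),
    (15, 0.40, 0.02),
    (30, 0.44, 0.02),
]
print(one_se_choice(candidates))  # the 8-node tree, not the 15-node minimum
```

The rule trades a small, statistically indistinguishable loss in cross-validated accuracy for a tree that is easier to read and to act on.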
RESULTS |
---|
In the recursive partitioning analysis (figure 1), the total sample (n = 1,229) comprised the "root" node at the top of the classification tree. CART selected age to split this sample into adherent and nonadherent subgroups and dichotomized the four age categories into the following: 1) a group (n = 786) that included the age categories 50–59, 60–69, and 70–79 years (i.e., ages 50–79 years) with 44.0 percent adherence and 2) a group (n = 443) that included women aged 40–49 years with 66.8 percent adherence.
Shown to the right in figure 1 are the three subgroups CART identified for women aged 40–49 years. Group 8 (n = 327; 74.3 percent adherence) was defined by a single predictor, receipt of a health-care provider's recommendation to obtain a mammogram. Group 7 (n = 37; 67 percent adherence) included women who did not receive a recommendation and believed that they were moderately susceptible to breast cancer. Group 6 (n = 79; 35.4 percent adherence) included women without a recommendation who reported high or low susceptibility.
Tables 2 and 3 summarize the results of the classification tree for each age group. Among women aged 50–79 years (table 2), those in group 5 were most adherent (57.1 percent). Moreover, the 95 percent confidence interval (52.1 percent, 62.1 percent) did not overlap with that of any other group. In that group, women reported a history of adherence to screening guidelines, had higher family incomes, considered mammography very useful, and believed that their susceptibility to breast cancer was low or moderate. Among women aged 40–49 years (table 3), those who received a provider's recommendation to obtain a screening showed the highest adherence (74.3 percent, 95 percent confidence interval: 69.6 percent, 79.0 percent).
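The reported interval for the group with a provider's recommendation can be checked by hand. The text does not state which interval method was used, so the sketch below assumes a normal-approximation (Wald) interval; the function name is hypothetical. With the group size (n = 327) and adherence (74.3 percent) given in the text, this assumption reproduces the published bounds.

```python
from math import sqrt


def wald_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


# Women aged 40-49 years who received a provider's recommendation.
low, high = wald_ci(0.743, 327)
print(f"{100 * low:.1f} percent, {100 * high:.1f} percent")  # 69.6 percent, 79.0 percent
```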
DISCUSSION |
---|
Notwithstanding the above exceptions, most of the variables examined in this analysis were significant individual predictors of adherence to mammography screening guidelines, thus confirming previous work (4–23, 26). However, several investigators have called for research that goes beyond identifying individual predictors of cancer screening to acknowledging and uncovering the multiplicity of variables, and their interrelations, that impact this health behavior (42). In addition, investigators have suggested that cancer screening might be more effectively promoted if interventions were targeted to homogeneous population subgroups with distinctive patterns of cancer screening (43), and if individually tailored messages based on a limited number of salient variables could be developed (44). By homing in on a subset of important predictors of adherence from a larger candidate pool and by partitioning the study population into subgroups across these predictors, tree analysis may have a role in meeting these research challenges.
In this analysis, CART identified six predictors of adherence, thus identifying a manageable subset of variables on which to intervene. These predictors included demographics and other characteristics that could be used to target population subgroups (i.e., age, income, and history of adherence) and psychosocial variables that might be used to develop tailored interventions (e.g., perceived usefulness of mammography). Because the study population included a broad age range (ages 40–79 years), as well as a large proportion of African-American women in addition to White women, the results of this analysis may apply across these demographic characteristics. However, sampling was limited to urban, hospital-based mammography facilities in Connecticut, and the results may not generalize to women receiving screenings in other settings.
In this sample, age was the demographic characteristic that split the entire sample (i.e., 40–49 and 50–79 years). Moreover, of the six predictors included in the tree, CART selected largely different ones for each age group. For example, although generally regarded as among the strongest predictors of mammography screening (16, 45), a health-care provider's recommendation to obtain a mammogram predicted adherence in the classification tree only among women aged 40–49 years. Among women aged 50–79 years, history of adherence to screening guidelines instead predicted adherence at follow-up. This may suggest that older women were more likely than younger women to have made a habit of screening (46) and were less dependent on a provider's external cue. Additionally, usefulness of mammography and annual income were selected as predictors of adherence only for women aged 50–79 years. One variable selected by CART for both age categories was perceived susceptibility to breast cancer, but the dichotomization of this categorical variable differed by age group. Lower adherence was associated with high perceived susceptibility in women aged 50–79 years and with high or low perceived susceptibility in women aged 40–49 years, suggesting a curvilinear relation between these variables in younger women. By limiting their samples to women aged 50 years or older or not examining potential age-modifying effects (7, 12, 13, 15, 16, 18, 24–26), prior studies may have overlooked these and other differences in predictors of influence for younger and older women.
This recursive partitioning analysis suggests age-specific intervention strategies. In women aged 40–49 years, for example, adherence substantially differed among those who did not receive a provider's recommendation to obtain a mammogram, according to their perceptions of susceptibility to breast cancer. While this finding provides insights into the relation between these variables, for intervention purposes it is key that the most adherent women in this age group were those who received a provider's recommendation to obtain a mammogram, regardless of their perceptions of susceptibility. Therefore, the most feasible and potentially efficacious intervention to promote screening among women aged 40–49 years may involve a single variable, that is, mobilizing providers to recommend mammograms.
In women aged 50–79 years, on the other hand, the classification tree suggests several possible intervention strategies. First, among women in this age group who do not have a history of adherence to mammography screening guidelines, a relatively small percentage adhered to guidelines in the 2 years following the index screening. Through review of medical records or by history obtained during the primary care examination, women without a history of adherence could be identified and targeted for intervention. The tree also indicated that relatively few low-income women, regardless of screening history, adhered to screening guidelines during follow-up. As with women without a history of adherence, this finding suggests that low-income women as a group may require particular attention. Among older women with a history of adherence and higher incomes, the tree identified two psychosocial variables that impacted adherence and could be used to develop tailored messages: perceived usefulness of mammography and perceived susceptibility to breast cancer. However, whereas higher perceived usefulness was associated with better adherence, high perceived susceptibility was associated with lower adherence. This finding is consistent with a few prior studies, including our logistic regression analysis, one of a few to prospectively assess the relation between these variables (27). It suggests that interventions designed to increase perceptions of susceptibility may not effectively promote adherence and that caution should be exercised when intervening on this variable.
That the psychosocial variables identified by the tree were important predictors of adherence only among women with a history of adherence and higher incomes supports the view that some theories and models of behavior change that include these variables (e.g., the Health Belief Model (47)) were developed using, and may be most relevant to, middle-class women (48, 49). Among low-income women and those without a history of adherence, the tree did not select additional predictors that might have aided in refining interventions for specific segments of these target groups. Additional variables that may have effectively partitioned these groups were not included in this data set.
Furthermore, although this study population included a substantial proportion of African-American women and although race was a significant predictor of adherence in the unadjusted logit analysis, CART did not include this variable in the classification tree. This suggests that, compared with other candidate predictors, race had relatively less impact and that intervening on other features of the study population (e.g., age and income), regardless of race, may more effectively promote adherence to screening guidelines. This is in agreement with prior research, in which White (in relation to African-American) race was an independent predictor of adherence in some (7, 9, 13, 22), but not all (15, 24), studies, whereas income has predicted adherence quite consistently (4, 5, 7, 9–13, 24).
However, the investigator may still wish to examine whether predictors of adherence differ for particular subgroups (e.g., race). Because the sample was initially split on age and not race, predictors of adherence for each racial group could not be identified in this analysis. With the automated CART procedure, the variable selected to split the total sample will influence subsequent partitioning and determine the terminal subgroups in a given tree. It is possible that other potentially important individual predictors and relations among these predictors may have been overlooked in this process. To assist the investigator in identifying variables that differentiate the outcome nearly as well as the chosen split, CART provides information on "competing splits" for each partitioning of the data. Furthermore, with CART, the investigator can perform separate analyses by each subgroup of interest (e.g., by race). Alternatively, one can use recursive partitioning software, such as RTREE (37), that allows the investigator to force variables of interest into the model.
Despite some limitations, recursive partitioning can perform several functions with relative ease, compared with traditional statistical methods. For example, although recursive partitioning dichotomizes variables with multiple categories and may obscure more complex curvilinear effects, it can easily discern U-shaped relations between ordinal predictors (e.g., perceived susceptibility among younger women) and the outcome of interest. In addition, because the output of a recursive partitioning analysis provides a visual overview of the data, structures in the data that may be less apparent with traditional analyses may become manifest. For example, the striking age difference in predictors of importance to adherence could only have been observed with traditional analyses through the laborious testing of multiple two-way interactions between age and each candidate predictor. Moreover, the subgroups across predictors within each age group, so-called "local interactions" (50), could only have been identified through higher order interactions that may not be detected due to insufficient statistical power and may also be difficult to interpret. Thus, recursive partitioning is a potentially useful technique that may provide a more complete understanding of predictors of adherence to mammography screening guidelines and that may aid in developing more effective interventions.
ACKNOWLEDGMENTS |
---|
The authors wish to thank the following hospitals in Connecticut that allowed access to their patients and medical records: Bridgeport Hospital, Lawrence and Memorial Hospital, St. Francis Hospital and Medical Center, Waterbury Hospital, and Yale-New Haven Hospital. They also wish to thank Lisa Schlenk, project coordinator, for her invaluable assistance.
Conflict of interest: none declared.