1 Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD.
2 Basic Research Program, Science Applications International Corporation, Frederick, MD.
3 Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL.
4 Department of Epidemiology, School of Public Health, University of California, Los Angeles, Los Angeles, CA.
5 Department of Molecular Microbiology and Immunology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD.
6 Department of Infectious Diseases and Microbiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA.
7 Laboratory of Genomic Diversity, National Cancer Institute-Frederick, Frederick, MD.
Received for publication January 22, 2003; accepted for publication July 31, 2003.
ABSTRACT |
---|
acquired immunodeficiency syndrome; chemokines; cytokines; epidemiologic methods; HIV-1; HLA antigens; receptors, chemokine
Abbreviations: AIDS, acquired immunodeficiency syndrome; CCR2, C-C chemokine receptor 2; CCR5, C-C chemokine receptor 5; CCR5P, C-C chemokine receptor 5 promoter; CI, confidence interval; HIV-1, human immunodeficiency virus type 1; HLA, human leukocyte antigen; IL10, interleukin 10; MACS, Multicenter AIDS Cohort Study; PF, prevented fraction; RH, relative hazard; SDF1, stromal-derived factor.
INTRODUCTION |
---|
Initial studies examining the influence of AIDS restriction genes on the progression of AIDS focused on the isolated effect of polymorphisms for the primary coreceptors, chemokines, cytokines, and human leukocyte antigen (HLA). A mutation of the C-C chemokine receptor 5 (CCR5) allele (CCR5-Δ32) was identified that essentially prevented infection among persons who were homozygous for the mutation (8–11) and delayed progression among those who were heterozygous for it (8, 11). Other effects found included: delayed progression among persons carrying a mutation in the C-C chemokine receptor 2 (CCR2) allele (CCR2-64I) (12); rapid progression among persons who were homozygous for the P1 haplotype of the CCR5 promoter (CCR5P) allele (CCR5P1/P1) (13, 14); delayed progression among persons with a mutation for stromal-derived factor (SDF1) (SDF1 3'A/3'A) (15); rapid progression among persons who had homozygous alleles at one, two, or three HLA class I loci (16); and rapid progression among persons with a polymorphism for interleukin 10 (IL10) (IL10 +/5'A or 5'A/5'A) (17).
However, the prognosis of HIV-1-infected persons is likely to involve interactions of several host genes, virus genes, and other nongenetic influences. In general, genetic studies have examined the effects of AIDS restriction genes separately, although a few studies (13, 15, 17, 18) have considered the interaction of two or three genes at a time. The goal of this study was to examine the overall influence of described AIDS restriction genes on progression to AIDS among participants in the Multicenter AIDS Cohort Study (MACS) who had HIV-1 seroconversion documented during follow-up. Our approach to this question was based on regression trees, which are directly suitable for incorporating interactions among many variables. To obtain a single overall measure of the influence of genetic factors, we estimated the prevented fraction of AIDS cases, defined as the proportion of potential AIDS cases prevented by AIDS restriction genes relative to a population without a protective genotype.
MATERIALS AND METHODS |
---|
The primary outcome of interest was time from HIV-1 seroconversion to the development of an AIDS-defining illness, based on category C clinical conditions listed in the Centers for Disease Control and Prevention's 1993 case definition (i.e., the immunologic criterion of a CD4-positive cell count less than 200 cells per µl was not included in our case definition) (20). Continuous surveillance data were available for each MACS participant with respect to the development of clinically defined AIDS, death, or loss to follow-up.
The exposures of interest were AIDS restriction genes known to influence the development of AIDS, based on the seminal genetic studies published prior to calendar year 2002. For each of the 525 seroconverters, we determined genetic status regarding CCR5 (wild-type +/+ and +/Δ32, since Δ32/Δ32 protects against HIV-1 infection), CCR2 (+/+, +/64I, and 64I/64I), CCR5P (P1/P1, P1/~P1, and ~P1/~P1, where ~P1 represents P2, P3, or P4), SDF1 (+/+, +/3'A, and 3'A/3'A), IL10 (5'A/5'A, +/5'A, and +/+), and HLA homozygosity (number of the HLA-A, B, and C loci with identical (i.e., homozygous) alleles: 2 or 3, 1, or 0). Human genomic DNA was extracted from Epstein-Barr virus-immortalized B cell lines and stored blood specimens, and specific segments were amplified by polymerase chain reaction for the determination of genotypes as previously described (8, 12, 13, 15–17). For HLA (16), a panel of primers specific for the HLA-A, B, and C loci was used to identify homozygous alleles. Even though, in principle, the HLA-A, B, and C loci are separate loci, we used here the combination put forward by Carrington et al. (16), and hereafter we refer to six AIDS restriction genes: CCR5, CCR2, CCR5P, SDF1, IL10, and HLA.
The study protocols were reviewed and approved by the institutional review boards of the study sites. All participants provided written informed consent.
Statistical methods
The outcome for this analysis was time to AIDS from HIV-1 seroconversion. Since participants in the MACS are followed at 6-month intervals, exact dates of seroconversion were unknown. Furthermore, there were persons who missed semiannual visits between their last negative and first positive tests. To be in consonance with the demonstrated downward trend of HIV-1 incidence over time (21), we assigned a seroconversion date for these persons at one third of the time between the last HIV-1-seronegative study visit and the first HIV-1-seropositive study visit (22). To appropriately incorporate in the analysis those subjects with long times to the first positive visit, we used the time from the assigned seroconversion date to the first HIV-1-seropositive visit as the time of entry into the observed risk sets (i.e., staggered/late entries) (23). Censored observations (i.e., AIDS-free at the last time seen) resulted from losses to follow-up, deaths unrelated to HIV-1, and freedom from AIDS at the date of analysis, which was preset as December 31, 1995 (i.e., before the introduction of highly active antiretroviral therapy).
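The date assignment and late-entry rule described above can be sketched as follows. This is a minimal illustration, not the study's code: the function name `assign_seroconversion`, the rounding to whole days, and the 365.25-day year are assumptions made for the example.

```python
from datetime import date, timedelta


def assign_seroconversion(last_negative: date, first_positive: date,
                          fraction: float = 1 / 3):
    """Return (assigned seroconversion date, entry time in years).

    `fraction` places the assigned date within the interval between the
    last seronegative and first seropositive visits (one third, per the
    text). The entry time is the delay from the assigned date to the
    first seropositive visit, i.e. when the person joins the risk set.
    """
    gap_days = (first_positive - last_negative).days
    if gap_days <= 0:
        raise ValueError("first_positive must be after last_negative")
    # Rounding to whole days is an assumption of this sketch.
    seroconversion = last_negative + timedelta(days=round(gap_days * fraction))
    entry_years = (first_positive - seroconversion).days / 365.25
    return seroconversion, entry_years


# Hypothetical visit dates, one year apart.
sero_date, entry = assign_seroconversion(date(1990, 1, 1), date(1991, 1, 1))
```

In a survival model that supports left truncation, `entry` would be supplied as the entry time and the time from `sero_date` to AIDS or censoring as the exit time.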
To quantify the protective effect of each of the AIDS restriction genes, we first computed univariate relative hazards using Cox proportional hazards models (24). Persons with the genotype associated with the highest susceptibility to AIDS on the basis of previous reports (8, 11–17) were selected as the reference group. The multivariable analysis consisted of the construction of a regression tree in two stages to incorporate the known relations of the AIDS restriction genes.
The first stage of the regression tree incorporated only CCR2, CCR5P, and CCR5, which are structurally related and constitute the CCR2.CCR5P.CCR5 superlocus (13, 25). These were the first AIDS restriction genes to be described, and their joint effects have been replicated in the literature (13, 14). Therefore, the first split of the regression tree was defined by the CCR2.CCR5P.CCR5 superlocus, which has four haplotypes (i.e., [+.P1.+], [+.P1.Δ32], [64I.P1.+], and [+.~P1.+]) corresponding to nine potential genotypes for HIV-1-infected persons (those with CCR5-Δ32/Δ32 are protected against HIV-1 infection). A full Cox regression model yielded the relative hazards for all genotypes relative to the genotype associated with the greatest susceptibility to AIDS: [+.P1.+]/[+.P1.+]. To combine genotypes associated with similar risks of AIDS, we used a recursive amalgamation algorithm (26, 27) whereby two categories were joined if they yielded the lowest deviance (i.e., goodness-of-fit likelihood ratio statistic relative to the model with the two categories separated) among pairs with deviances below 1.32, which corresponds to the 75th percentile of the chi-squared distribution with 1 df.
For each resulting node of the superlocus, the second stage of the regression tree analysis consisted of identifying subsequent branches of the tree defined by the three remaining AIDS restriction genes (i.e., SDF1, IL10, and HLA homozygosity) using standard binary recursive partitioning methodology (26–30). Specifically, for the three genotypes (x1, x2, x3) of each AIDS restriction gene (e.g., for SDF1, x1 = 3'A/3'A, x2 = +/3'A, and x3 = +/+), we fitted two Cox regression models to identify the nature of the association as dominant (i.e., including as a covariate an indicator for x1 or x2) or recessive (i.e., including as a covariate an indicator for x1). We used the likelihood ratio statistic as the dissimilarity measure, and a node was split if the resulting nodes contained more than 10 persons and if the largest likelihood ratio statistic was above 2.07, which corresponds to the 85th percentile of the chi-squared distribution with 1 df.
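The two cutoffs (1.32 and 2.07) and the dominant/recessive covariate coding can be checked with a short sketch. It relies on the fact that a chi-squared variable with 1 df is the square of a standard normal variable. The function names and the `should_split` helper are hypothetical, and the likelihood ratio statistic is assumed to come from Cox models fitted elsewhere.

```python
from statistics import NormalDist


def chi2_1df_quantile(p: float) -> float:
    """Quantile of the chi-squared distribution with 1 df.

    If Z is standard normal, Z**2 is chi-squared with 1 df, so the p-th
    quantile is the square of the normal quantile at (1 + p) / 2.
    """
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    return z * z


def dominant_indicator(genotype: str, x1: str, x2: str) -> int:
    """1 if the genotype is x1 or x2 (dominant coding), else 0."""
    return 1 if genotype in (x1, x2) else 0


def recessive_indicator(genotype: str, x1: str) -> int:
    """1 only if the genotype is x1 (recessive coding), else 0."""
    return 1 if genotype == x1 else 0


def should_split(lr_statistic: float, n_left: int, n_right: int,
                 min_size: int = 10, percentile: float = 0.85) -> bool:
    """Splitting rule from the text: both child nodes contain more than
    `min_size` persons and the statistic exceeds the chi-squared cutoff."""
    return (min(n_left, n_right) > min_size
            and lr_statistic > chi2_1df_quantile(percentile))


amalgamation_cutoff = chi2_1df_quantile(0.75)  # approximately 1.32
split_cutoff = chi2_1df_quantile(0.85)         # approximately 2.07
```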
Subsequent splits for newly defined nodes were determined in the same way, including the possible determination of an association as codominant (i.e., including as a covariate an indicator for x1 for nodes defined by x1 or x2, or including as a covariate an indicator for x2 for nodes defined by x2 or x3). Once the full tree was derived, we fitted a Cox regression model to the full data to obtain relative hazards using the described reference group. We identified final nodes by combining nonreference nodes with similar relative hazards using the amalgamation procedures described. Kaplan-Meier survival curves for the final nodes and relative hazards (RHi) were computed with the reference group coded by i = 0 (i.e., RH0 = 1).
A multivariate relative hazard ($RH_M$) was computed as a weighted average of the $RH_i$ for all nonreference final nodes, with weights determined by the percentage of seroconverters ($p_i$) at each final node:

$$RH_M = \frac{\sum_{i>0} p_i \, RH_i}{\sum_{i>0} p_i}.$$

On the basis of both the $RH_M$ and the total percentage with protective genotypes (i.e., $\sum_{i>0} p_i$), we computed the multivariate prevented fraction ($PF_M$) as

$$PF_M = \Big(\sum_{i>0} p_i\Big) \times (1 - RH_M) = (1 - p_0) \times (1 - RH_M).$$
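As a worked illustration of the weighted relative hazard and the prevented fraction defined above, the following sketch uses made-up node proportions and relative hazards. They are not the study's estimates, and the function names are hypothetical.

```python
def multivariate_rh(p, rh):
    """Weighted average of relative hazards over nonreference nodes (i > 0).

    p  : proportion of seroconverters at each nonreference final node
    rh : relative hazard of each node versus the reference node
    """
    return sum(pi * rhi for pi, rhi in zip(p, rh)) / sum(p)


def prevented_fraction(p, rh):
    """PF_M = (sum of p_i for i > 0) * (1 - RH_M) = (1 - p_0) * (1 - RH_M)."""
    return sum(p) * (1.0 - multivariate_rh(p, rh))


# Hypothetical example: two protective nodes covering 75 percent of the
# cohort, so the reference node has p_0 = 0.25.
p_nodes = [0.50, 0.25]
rh_nodes = [0.80, 0.40]
rh_m = multivariate_rh(p_nodes, rh_nodes)     # (0.40 + 0.10) / 0.75
pf_m = prevented_fraction(p_nodes, rh_nodes)  # 0.75 * (1 - rh_m) = 0.25
```

Because the weights cancel, the prevented fraction also equals the sum of $p_i(1 - RH_i)$ over the nonreference nodes, which is how a node with a small $p_i$ contributes little even when its relative hazard is low.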
The PFM is interpreted as the proportion of all potential AIDS cases that were prevented as a result of the studied AIDS restriction genes. An added feature of our regression tree approach (in contrast to ordinary multiple regression) is that the joint effects of the AIDS restriction genes will never yield a prevented fraction above 1. In addition, if the AIDS restriction genes are indeed protective (i.e., RHM < 1), the prevented fraction will be above zero; otherwise, it can take negative values. Negative values of the prevented fraction will be indicative of no protection.
A confidence interval for the $PF_M$ was obtained using the delta method for the log of $RH_M$ as a function of $\beta_i$ for $i > 0$. Specifically,

$$\mathrm{Var}[\log RH_M] = \sum_{i>0} \sum_{j>0} d_i \, d_j \, c_{ij},$$

where $d_i = p_i \exp(\beta_i) / ((1 - p_0) \, RH_M)$ and $c_{ij}$ is the covariance between $\beta_i$ and $\beta_j$ obtained from a Cox regression on the final nodes. A 95 percent confidence interval for $PF_M$ is given by

$$(1 - p_0) \times \left(1 - RH_M \exp\left(\pm 1.96 \, [\mathrm{Var}(\log RH_M)]^{1/2}\right)\right).$$
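The delta-method interval can be sketched in a few lines, assuming the Cox coefficients are the log relative hazards of the nonreference final nodes and that their covariance matrix is available from the fitted model. The inputs below are invented for illustration and the function name is hypothetical.

```python
import math


def pf_confidence_interval(p, beta, cov, z=1.96):
    """Return (PF_M, lower, upper) using the delta method on log RH_M.

    p    : proportions at nonreference nodes (their sum equals 1 - p_0)
    beta : Cox coefficients (log relative hazards) for those nodes
    cov  : covariance matrix of beta, as a list of lists
    """
    p_protective = sum(p)
    rh_m = sum(pi * math.exp(bi) for pi, bi in zip(p, beta)) / p_protective
    # d_i is the partial derivative of log RH_M with respect to beta_i.
    d = [pi * math.exp(bi) / (p_protective * rh_m) for pi, bi in zip(p, beta)]
    n = len(d)
    var_log_rh = sum(d[i] * d[j] * cov[i][j]
                     for i in range(n) for j in range(n))
    half_width = z * math.sqrt(var_log_rh)
    pf = p_protective * (1.0 - rh_m)
    # A larger RH_M means a smaller PF_M, so +half_width gives the lower bound.
    lower = p_protective * (1.0 - rh_m * math.exp(half_width))
    upper = p_protective * (1.0 - rh_m * math.exp(-half_width))
    return pf, lower, upper


# Hypothetical inputs: two nodes with uncorrelated coefficient estimates.
pf, lo, hi = pf_confidence_interval(
    p=[0.50, 0.25],
    beta=[math.log(0.80), math.log(0.40)],
    cov=[[0.04, 0.0], [0.0, 0.09]],
)
```

The interval is asymmetric around the point estimate because it is symmetric on the log scale of $RH_M$.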
To assess the robustness of the delta method when applied to our data, we repeated the second stage of the analysis on 100 bootstrap samples (i.e., random sample with replacement from the 525 seroconverters), allowing the tree to vary with each bootstrap for all nonreference nodes of the superlocus. We compared the third and 98th of the 100 ordered PFMs with the 95 percent confidence interval obtained using the delta method in the original sample of the 525 seroconverters.
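The resampling and the choice of the third and 98th ordered values can be sketched as below. Re-deriving the tree and the prevented fraction on each bootstrap sample is not shown; `bootstrap_sample` and `percentile_bounds` are hypothetical helper names.

```python
import random


def bootstrap_sample(items, rng):
    """Draw a random sample with replacement, the same size as `items`."""
    return [rng.choice(items) for _ in items]


def percentile_bounds(estimates, lower_rank=3, upper_rank=98):
    """Return the ordered estimates at the given 1-based ranks.

    With 100 bootstrap estimates, ranks 3 and 98 leave two values in each
    tail, approximating a 95 percent interval.
    """
    if len(estimates) < upper_rank:
        raise ValueError("not enough estimates for the requested ranks")
    ordered = sorted(estimates)
    return ordered[lower_rank - 1], ordered[upper_rank - 1]


rng = random.Random(2003)          # fixed seed so the sketch is repeatable
person_ids = list(range(525))      # stand-in identifiers for the cohort
resampled = bootstrap_sample(person_ids, rng)
```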
To determine the prevented fraction at different times since seroconversion and to allow for departures from the proportional hazards assumption over the full time span, we performed the final analysis in strata of years of follow-up defined by 0.0–4.0, 4.1–8.0, and 8.1–12.0 years, with the use of staggered entries (23) for the last two strata. The 95 percent confidence interval for each stratum was computed using the delta method as described above.
RESULTS |
---|
DISCUSSION |
---|
Previous studies have addressed the issue of the interaction of genetic polymorphisms. Martin et al. (13) reported a 32 percent decreased hazard among persons with protective genotypes for CCR5 or CCR2 in combination with SDF1 and a 52 percent elevated hazard among persons who were homozygous for the CCR5P1 haplotype, in comparison with the reference group of all other combinations of protective genotypes. Winkler et al. (15) observed prolonged survival among persons with protective genotypes for SDF1 and CCR5 or CCR2 in comparison with those with protective genotypes for SDF1 alone. Finally, Shin et al. (17) identified a 38 percent reduced hazard of AIDS among persons with protective genotypes for CCR5 or CCR2 in combination with IL10 as compared with those with protective genotypes for CCR5 or CCR2 only. In a meta-analysis, Ioannidis et al. (18) reported relative hazards of 0.74, 0.76, and 0.66 for persons with protective genotypes for CCR5, CCR2, and both, respectively, compared with persons with wild-type CCR5 and CCR2 alleles. The corresponding relative hazards in our data were 0.74, 0.75, and 0.59, confirming the conclusions from the meta-analysis.
Prior studies have also indicated that the protective genotypes affect different stages of HIV-1 infection. Martin et al. (13) reported an earlier effect (0–5 years) for CCR5P, and Shin et al. (17) reported a later effect (>5 years) for IL10. Winkler et al. (15) reported stronger associations with later endpoints (i.e., the 1987 definition of AIDS and death) for the homozygous SDF1 mutation, which is suggestive of later effects. The overall influence of genetic factors on the development of AIDS in our study was observed early in the course of infection. Of all potential AIDS cases that would have occurred within the first 4 years of infection (i.e., among rapid progressors), 51 percent were prevented by the genotype of the study participants. The overall prevented fraction beyond 4 years was substantially reduced and was not statistically significant (p > 0.05). Consistent with the previous findings for IL10 (17), the node with the strongest effect beyond 4 years was node 3, which was also the only node that included protective genotypes for IL10. The SDF1-3'A/3'A genotype was only identified in 3.5 percent of seroconverters, which explains why it did not influence the overall prevented fraction in later years.
In additional analyses, we allowed the composition of the regression tree to vary in the three time periods described above, but the conclusions were unchanged (data not shown). It is also possible that our follow-up of seroconverters was not long enough to capture AIDS cases, given the median follow-up time of 6.2 years; however, 25 percent of participants were followed for 8.6 years or longer, with a maximum of 11.5 years. A more likely explanation is the emergence of HIV-1 virus populations several years after infection that were capable of overcoming the resistance provided by the combined early effect of the AIDS restriction genes.
We used the delta method to compute the standard error of the prevented fraction. It is easily derived from Cox regression analysis on the final nodes of the regression tree and does not require intensive computational methods (e.g., bootstrap methods). Furthermore, the 95 percent confidence interval obtained from the 100 bootstrap samples (95 percent CI: 0.035, 0.476) was close to the 95 percent confidence interval of the observed tree prior to amalgamation (95 percent CI: 0.074, 0.467) obtained using the delta method, with only a slightly reduced lower bound, indicating that the variability introduced by allowing the tree to vary was not substantial. The reduced precision resulting from the use of only 100 bootstrap samples may also explain the slight discrepancy. Nevertheless, the interpretation of the prevented fraction and confidence interval obtained using the delta method is conditional on the final observed tree. It would be useful to validate the prevented fraction using a different but similarly HIV-1-infected population.
While we had complete genetic data on 96 percent of the participants, 23 persons had missing data on HLA class I loci, and we were concerned that this explained why HLA was not included in our final tree. Therefore, we imputed the missing HLA data for these persons based on the observed HLA data in persons with the same data on the other five AIDS restriction genes. With the use of standard multiple imputation methods (32), the magnitude of the univariate relative hazards for HLA and the composition of the final regression tree were unchanged (data not shown). Thus, the null results for HLA are not likely to be explained by the missing data. We used here a composite of the HLA data, and it is possible that using specific alleles (e.g., B57 and B35) would have resulted in a refined tree.
An additional concern was the inclusion of seroconverters with longer lag times (i.e., 1 year) between the last HIV-1-negative and first HIV-1-positive visits (n = 85). The analysis appropriately accounted for the staggered/late entries of these persons, though we were also interested in the effect of confining the analysis to those with shorter lag times and thus more well-defined dates of seroconversion. The resulting prevented fraction was 0.339 (95 percent CI: 0.120, 0.505), suggesting that the analysis based on the full data set incorporated some random error, resulting in a prevented fraction closer to zero. Since the estimates were relatively close, we chose to present the results for the complete data set.
As the field of host genetics and AIDS evolves, evaluation of the interactions between identified AIDS restriction genes becomes increasingly complex. Our approach provides a more comprehensive analysis resulting in the estimation of an easily interpretable summary measure: the prevented fraction. We also introduced an easily implemented method of calculating a confidence interval for this measure. This measure is particularly relevant for this field, since it incorporates both the strength of the association and the prevalence of a particular combination of AIDS restriction genes. Note that the prevented fraction is only generalizable to populations with a similar prevalence of restriction genes. Nevertheless, using the relative hazards reported here, which are expected to be internally valid, one can estimate the prevented fraction for a population with a different prevalence of AIDS restriction genes.
In summary, we have presented a novel use of multivariable methods for examining the influence of genetic factors on the progression of HIV-1 infection to AIDS in a well-characterized cohort of HIV-1 seroconverters. As additional AIDS restriction genes are identified, the prevented fraction can be expected to increase. Despite the considerable proportion of cases averted as a result of AIDS restriction genes, the majority of potential cases (70 percent) were not affected. This highlights the need to continue searching for additional genetic modifiers of the survival of HIV-1-infected persons.
ACKNOWLEDGMENTS |
---|
The authors acknowledge the assistance of Richard Skolasky in data management and preliminary analysis.
The Multicenter AIDS Cohort Study (http://www.statepi.jhsph.edu/macs/macs.html) includes the following research centers: Baltimore, Maryland, Bloomberg School of Public Health, Johns Hopkins University: Joseph B. Margolick (Principal Investigator), Haroutune Armenian, Barbara Crain, Adrian Dobs, Homayoon Farzadegan, Nancy Kass, Shenghan Lai, Justin McArthur, and Steffanie Strathdee; Chicago, Illinois, Howard Brown Health Center, Feinberg School of Medicine, Northwestern University, and Cook County Bureau of Health Services: John P. Phair (Principal Investigator), Joan S. Chmiel (Co-Principal Investigator), Sheila Badri, Bruce Cohen, Craig Conover, Maurice O'Gorman, Frank Pallela, Daina Variakojis, and Steven M. Wolinsky; Los Angeles, California, Schools of Public Health and Medicine, University of California, Los Angeles: Roger Detels and Beth Jamieson (Principal Investigators), Barbara R. Visscher (Co-Principal Investigator), Anthony Butch, John Fahey, Otoniel Martínez-Maza, Eric N. Miller, John Oishi, Paul Satz, Elyse Singer, Harry Vinters, Otto Yang, and Stephen Young; Pittsburgh, Pennsylvania, Graduate School of Public Health, University of Pittsburgh: Charles R. Rinaldo (Principal Investigator), Lawrence Kingsley (Co-Principal Investigator), James T. Becker, Phalguni Gupta, John Mellors, Sharon Riddler, and Anthony Silvestre; Baltimore, Maryland, Data Coordinating Center, Johns Hopkins Bloomberg School of Public Health: Alvaro Muñoz (Principal Investigator), Lisa P. Jacobson (Co-Principal Investigator), Stephen R. Cole, Haitao Chu, Janet Schollenberger, Eric Seaberg, Michael Silverberg, and Sol Su; Bethesda, Maryland, National Institute of Allergy and Infectious Diseases: Carolyn Williams; National Cancer Institute: Sandra Melnick, Stephen J. O'Brien, and Michael W. Smith.
The content of this publication does not necessarily reflect the views or policies of the US Department of Health and Human Services, nor does the mention of trade names, commercial products, or organizations imply endorsement by the US government.
NOTES |
---|
REFERENCES |
---|