1 Division of Clinical Epidemiology, Department of Medicine, University of Texas Health Science Center, San Antonio, Texas
2 Department of Cellular and Structural Biology, University of Texas Health Science Center, San Antonio, Texas
3 Department of Genetics, Southwest Foundation for Biomedical Research, San Antonio, Texas
ABSTRACT |
---|
Type 2 diabetes, a common multifactorial metabolic disease, is caused by both environmental and genetic factors. The incidence of type 2 diabetes continues to rise and increasingly affects individuals of all ages across all ethnic groups (1, 2). Individuals from certain ethnic groups including Mexican Americans have an increased propensity toward developing type 2 diabetes. Furthermore, not only is the risk of diabetes two- to threefold higher in Mexican Americans than in non-Hispanic whites, but diabetes is more severe and more deadly in Mexican Americans than in non-Hispanic whites (3, 4).
Genetic studies consistently indicate that diabetes is familial in nature. However, despite >20 genome-wide linkage analyses of type 2 diabetes, the genes influencing susceptibility to the common forms of type 2 diabetes remain largely unknown (5–21). Moreover, the increased risk and severity of type 2 diabetes in Mexican Americans may indicate an increased genetic susceptibility (22, 23). Previously, in the San Antonio Family Diabetes Study, an extended pedigree study composed of Mexican-American families ascertained through a single proband with diabetes, Duggirala et al. (24) reported evidence of linkage to diabetes on 10q and suggestive evidence of linkage to diabetes on 3p and 9p. This earlier study was based on a single clinical examination of 440 individuals from 27 extended pedigrees with genotypic information from 379 microsatellite markers across the genome (24). Subsequently, the study has been expanded to include two additional examination cycles during which previous participants were invited to return, existing pedigrees were expanded, and new pedigrees were added. In addition, the Center for Inherited Disease Research (CIDR) has recently completed genotyping of 382 highly polymorphic markers distributed throughout the autosomes at 10-cM intervals on over 96% of the 906 individuals who have participated in at least one clinical examination. Therefore, our objective was to perform a genome-wide linkage analysis of type 2 diabetes in the updated San Antonio Family Diabetes/Gallbladder Study (SAFDGS).
RESEARCH DESIGN AND METHODS |
---|
Of the original 579 participants, 500 individuals returned for the first follow-up examination of the SAFDGS population that occurred between 1996 and 1999. In addition, 67 family members who were unable to participate in the baseline SAFDGS were recruited and participated in the first recall of the SAFDGS. Finally, between 1998 and 2001, a second follow-up examination occurred (known as the San Antonio Family Gallbladder Study), during which 265 new participants were recruited including both additional family members of the 31 original families as well as 8 additional families. In summary, there are 911 SAFDGS participants from 39 families who have participated in at least one of three clinic examinations. Of these 911 participants, 5 people have been excluded either because of chromosomal abnormalities or because of irresolvable pedigree uncertainties; hence, there are a total of 906 SAFDGS participants with phenotypic information available. The institutional review board of the University of Texas Health Science Center at San Antonio approved the study, and all subjects gave informed consent.
SAFDGS baseline and follow-up examinations.
Each examination consisted of a standardized medical examination that included interviews, anthropometry, a fasting venipuncture, and an oral glucose tolerance test. Trained interviewers obtained information on medical history and medication use. Examinations occurred in the morning, after participants had fasted for the 12 h before their examination. Measurements of BMI, waist circumference, fasting plasma glucose, and plasma glucose 2 h after a standardized oral glucose load have been previously described in detail (26).
Two definitions of diabetes were used. The first, referred to throughout this study as diabetes (fasting), required a fasting plasma glucose ≥7.0 mmol/l (126 mg/dl) (27), whereas the second, referred to as diabetes (World Health Organization [WHO]), required either a fasting plasma glucose ≥7.0 mmol/l (126 mg/dl) or a 2-h glucose after an oral glucose tolerance test ≥11.1 mmol/l (200 mg/dl) (28). For each definition, participants who did not meet these criteria but who self-reported physician-diagnosed diabetes and who reported current therapy with either oral antidiabetic agents or insulin were also considered to have diabetes in this investigation. For each definition, information from a participant's most recent examination was used to define his or her diabetes status, with the caveat that individuals diagnosed with diabetes who no longer had diabetes when later examined were considered missing. Finally, age of diabetes diagnosis was modeled as a proxy for age of diabetes onset: for previously diagnosed diabetic participants, self-reported age of diagnosis was used as the age of onset; for diabetic participants initially diagnosed at a SAFDGS examination, the participant's reported age at that examination was used as the age of onset.
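The two definitions can be written out as a small classification rule. The following is a minimal sketch, not the study's actual code: the function name `classify_diabetes`, its arguments, and the choice to return `None` when a needed glucose value is missing are assumptions made for illustration. Only the glucose thresholds and the treated self-report rule come from the text.

```python
from typing import Optional

FASTING_THRESHOLD_MMOL_L = 7.0    # 126 mg/dl, from the text
TWO_HOUR_THRESHOLD_MMOL_L = 11.1  # 200 mg/dl, from the text


def classify_diabetes(
    fasting: Optional[float],
    two_hour: Optional[float],
    treated_self_report: bool,
    definition: str = "fasting",
) -> Optional[bool]:
    """Return True/False for diabetes status, or None if it cannot be determined.

    `treated_self_report` is True when the participant reports physician-diagnosed
    diabetes AND current therapy with oral agents or insulin.
    Returning None for missing glucose values is an assumption of this sketch.
    """
    if treated_self_report:
        return True
    meets_fasting = fasting is not None and fasting >= FASTING_THRESHOLD_MMOL_L
    if definition == "fasting":
        return None if fasting is None else meets_fasting
    if definition == "who":
        meets_two_hour = two_hour is not None and two_hour >= TWO_HOUR_THRESHOLD_MMOL_L
        if meets_fasting or meets_two_hour:
            return True
        # Neither available value crosses a threshold; a missing value leaves status unknown.
        if fasting is None or two_hour is None:
            return None
        return False
    raise ValueError(f"unknown definition: {definition!r}")


# A participant with normal fasting glucose but an elevated 2-h value
# is diabetic under the WHO definition only.
print(classify_diabetes(6.5, 11.1, False, "fasting"))  # False
print(classify_diabetes(6.5, 11.1, False, "who"))      # True
```

The sketch covers a single examination; the rule that uses the most recent examination and sets reverted cases to missing would sit on top of it.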
Genotyping and calculation of multipoint identities by descent.
A complete genomic scan using 382 highly polymorphic markers distributed throughout the autosomes at 10-cM intervals has been completed by the CIDR on 872 of the 906 participants. Across the 39 families, genotype data were available for five generations in 5 families, four generations in 19 families, three generations in 11 families, and two generations in 4 families. Genotype data were cleaned for both Mendelian and spurious double-recombinant errors using SimWalk2 (29). SimWalk2 provides a probability of error for each individual for each genotype based on the genotypes of other family members, the allele frequencies, and the marker map. For Mendelian error cleaning, genotypes were blanked at descending error probability thresholds, in increments of 0.1, until no more inconsistencies existed, resulting in the elimination of 1,247 of 364,934 genotypes, a blanking rate of 0.34%. For double-recombinant cleaning, all genotypes with an error probability of ≥0.25 were blanked, resulting in the elimination of an additional 436 genotypes, a blanking rate of 0.12%. MultiMap/CRI-MAP (30, 31) was used to construct sex-averaged marker maps using the cleaned genotype data. Allele frequencies were estimated by maximum likelihood methods implemented in SOLAR (sequential oligogenic linkage analysis routines) (32), and matrices of multipoint identity-by-descent sharing probabilities were estimated using Markov chain Monte Carlo methods implemented by LOKI (33).
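The reported blanking rates follow directly from the genotype counts. As a quick arithmetic check (the constant and function names are illustrative only):

```python
TOTAL_GENOTYPES = 364_934
MENDELIAN_BLANKED = 1_247
DOUBLE_RECOMBINANT_BLANKED = 436


def blanking_rate_percent(blanked: int, total: int = TOTAL_GENOTYPES) -> float:
    """Percentage of all genotypes that were blanked."""
    return 100.0 * blanked / total


print(f"{blanking_rate_percent(MENDELIAN_BLANKED):.2f}%")           # 0.34%
print(f"{blanking_rate_percent(DOUBLE_RECOMBINANT_BLANKED):.2f}%")  # 0.12%
```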
In addition, in instances in which a statistically significant linkage signal was identified, additional microsatellite markers were typed in the 705 individuals for whom we have large quantities of DNA because of the lymphoblastoid cell lines available for these individuals. The locations of the highly polymorphic microsatellite markers used were determined from the UCSC Human Genome Browser July 2003 assembly. PCR primer sequences for each marker were obtained from the genome database. Forward primers were labeled with fluorescent dyes. Multiplex PCR was performed using 240 ng of high-molecular weight genomic DNA isolated from lymphoblastoid cell lines. The resulting PCR products were run on an ABI 3100 Avant Genetic Analyzer, and alleles were called using the GeneMapper analysis software.
Statistical analyses.
Using a variance decomposition approach implemented in SOLAR (32), we performed genetic analysis on the discrete trait diabetes, using a liability threshold model (6). This approach assumes that an individual belongs to a specific disease class if an underlying genetically determined risk or liability exceeds a certain threshold on a normally distributed liability curve. The liability is assumed to have an underlying multivariate normal distribution. The correlation in liability between individuals i and j is given by:
$$\rho_{ij} = \pi_{mij}\,h^2_m + 2\phi_{ij}\,h^2 + I_{ij}\,e^2$$

where $\rho_{ij}$ is the correlation in liability to disease between individuals i and j; $\pi_{mij}$ is the proportion of alleles that individuals i and j share identical by descent at a marker, m, linked to a quantitative trait locus; $h^2_m$ is the heritability attributed to the quantitative trait locus near the marker locus; $\phi_{ij}$ is the kinship coefficient for individuals i and j; $h^2$ is the heritability attributed to residual additive polygenic effects; $I_{ij}$ is an identity matrix (which equals 1 if i equals j and 0 if i does not equal j); and $e^2$ is equal to $1 - (h^2_m + h^2)$. Maximum likelihood techniques were used to estimate variance components and covariate effects simultaneously.
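To make the decomposition concrete, the sketch below evaluates the liability correlation for one pair of relatives. It assumes the standard variance-components form, in which the polygenic term is weighted by twice the kinship coefficient. The function name and the numeric heritabilities are hypothetical and are not estimates from this study.

```python
def liability_correlation(
    pi_m: float,          # proportion of alleles shared identical by descent at the marker
    phi: float,           # kinship coefficient for the pair
    h2_marker: float,     # heritability attributed to the locus near the marker
    h2_polygenic: float,  # residual additive polygenic heritability
    same_individual: bool = False,
) -> float:
    """Correlation in liability: pi_m * h2_m + 2 * phi * h2 + I_ij * e2."""
    e2 = 1.0 - (h2_marker + h2_polygenic)
    identity = 1.0 if same_individual else 0.0
    return pi_m * h2_marker + 2.0 * phi * h2_polygenic + identity * e2


# Full siblings sharing half their alleles at the marker (illustrative values).
print(liability_correlation(pi_m=0.5, phi=0.25, h2_marker=0.3, h2_polygenic=0.4))
# An individual with themselves (pi_m = 1, phi = 0.5): the terms sum to 1.
print(liability_correlation(1.0, 0.5, 0.3, 0.4, same_individual=True))
```

For the sibling pair, the locus contributes 0.5 × 0.3 and the polygenic background contributes 2 × 0.25 × 0.4, giving a correlation of about 0.35.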
In addition to performing genetic analysis on the discrete trait diabetes, we used SAS to model age of diabetes diagnosis as a proxy for age of diabetes onset with a Cox proportional hazards model (34). In the Cox proportional hazards models, self-reported age of diagnosis was used as the time of the event for previously diagnosed diabetic participants; for diabetic participants initially diagnosed at a SAFDGS examination, the participant's reported age at that examination was used as the time of the event; and finally, nondiabetic participants were censored at their most recent SAFDGS examination age. In addition, participants were left censored at their age 1 day before they entered the study or, in the case of an individual who entered with diabetes, 1 day before their age of reported diabetes onset. Moreover, the left censoring age (with a quadratic term) was adjusted for in the Cox proportional hazards model. Finally, using SOLAR, we performed standard multipoint variance components linkage analysis on the Martingale residual from the Cox proportional hazards model, a quantitative trait (34). In summary, Martingale residuals obtained from Cox proportional hazards models were used to model age of diabetes onset while accounting for age of entry into the study. Additional covariates of interest were incorporated into the SOLAR model.
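A Martingale residual is the observed event indicator minus the cumulative hazard the fitted model expects over the follow-up period, so it is at most 1 for someone who develops the disease and at most 0 for someone who is censored. The sketch below illustrates only that definition. The function name is hypothetical, the cumulative-hazard inputs are made-up numbers rather than fitted values, and representing entry into the study by subtracting the hazard accumulated before entry is an assumption of this sketch rather than the procedure used in SAS.

```python
def martingale_residual(
    event: bool,
    cum_hazard_entry: float,
    cum_hazard_exit: float,
) -> float:
    """Observed events minus expected events over the follow-up window.

    cum_hazard_entry / cum_hazard_exit: model-based cumulative hazard at the
    age of study entry and at the age of diagnosis or censoring.
    """
    if cum_hazard_exit < cum_hazard_entry:
        raise ValueError("cumulative hazard cannot decrease with age")
    expected_events = cum_hazard_exit - cum_hazard_entry
    observed_events = 1.0 if event else 0.0
    return observed_events - expected_events


# Early diagnosis: little hazard accumulated, so the residual is close to +1.
print(round(martingale_residual(True, 0.01, 0.05), 2))   # 0.96
# Older participant who stays nondiabetic: the residual is negative.
print(round(martingale_residual(False, 0.90, 1.61), 2))  # -0.71
```

Large positive residuals mark participants diagnosed earlier than the model expects; strongly negative residuals mark participants who remained free of diabetes despite a high expected risk.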
Available covariates considered in the analyses included age (with a quadratic term) at a participant's most recent examination, sex, decade of birth (to account for secular trends in the prevalence of diabetes), and waist circumference at a participant's most recent examination. In each case, a P value of 0.05 was used as a nominal value for retention in the model. Finally, because it is well known that variance components-based linkage analyses can have elevated false-positive rates in analysis of traits that are not multivariate-normally distributed (35, 36), for each multipoint linkage analysis completed on the discrete trait diabetes as well as the Martingale residuals for age of diabetes diagnosis, we have used a simulation-based approach implemented in SOLAR to adjust the nominal logarithm of odds (LOD) score (35). The LOD score distribution under the null hypothesis of no linkage was determined based on 10,000 replicates assuming a fully informative marker locus, and these empirical LOD scores were regressed on the theoretical LOD score distribution to obtain a correction constant, which was subsequently used to adjust the observed LOD score. Specifically, a large number of fully informative markers were simulated (n = 10,000) that were not linked to the trait of interest. For each marker, identity-by-descent information was calculated, and linkage analysis was conducted. The correction constant was obtained by comparing the distribution of the observed simulated LOD scores with the theoretical distribution and was used to adjust the observed LOD scores to empirical LOD scores. Throughout the study, empirical LOD scores are presented, because these are more reliable than conventional LOD scores.
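The empirical adjustment can be pictured as a quantile comparison followed by a rescaling. The sketch below is one plausible reading of that description and is not SOLAR's implementation. It assumes the asymptotic null LOD distribution is a 50:50 mixture of a point mass at zero and a scaled chi-square with 1 degree of freedom, fits a slope through the origin of theoretical quantiles on sorted simulated LODs, and multiplies the nominal LOD by that slope. All function names are hypothetical.

```python
import math
from statistics import NormalDist

TWO_LN_10 = 2.0 * math.log(10.0)  # converts a LOD score to a likelihood-ratio chi-square


def theoretical_null_lod_quantile(u: float) -> float:
    """Quantile of the assumed null LOD distribution (0 with probability 1/2,
    otherwise chi-square(1 df) / (2 ln 10)). Its CDF at LOD x is Phi(sqrt(2 ln(10) x))."""
    if u <= 0.5:
        return 0.0
    z = NormalDist().inv_cdf(u)
    return z * z / TWO_LN_10


def correction_constant(simulated_null_lods: list) -> float:
    """Slope through the origin of theoretical quantiles on sorted simulated LODs."""
    n = len(simulated_null_lods)
    observed = sorted(simulated_null_lods)
    expected = [theoretical_null_lod_quantile((k + 0.5) / n) for k in range(n)]
    sum_oo = sum(o * o for o in observed)
    if sum_oo == 0.0:
        raise ValueError("all simulated LOD scores are zero")
    return sum(o * e for o, e in zip(observed, expected)) / sum_oo


def empirical_lod(nominal_lod: float, constant: float) -> float:
    """Rescale a nominal LOD score by the correction constant."""
    return constant * nominal_lod


# Toy check: null LODs inflated to exactly twice their theoretical values
# should give a constant of 0.5, halving any nominal LOD.
inflated = [2.0 * theoretical_null_lod_quantile((k + 0.5) / 1000) for k in range(1000)]
c = correction_constant(inflated)
print(round(c, 3), round(empirical_lod(4.0, c), 3))  # 0.5 2.0
```

A constant below 1 indicates that nominal LOD scores are inflated under the null for that trait, which is the situation the adjustment is meant to guard against for non-normal traits.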
Chromosomal regions with an empirical LOD score >3 were considered to provide significant evidence of linkage; regions with an empirical LOD score >2 were considered to provide suggestive evidence for linkage; and regions with an empirical LOD score of ≥1.18 (P < 0.01) were considered "potentially interesting," as previously described by Vionnet et al. (37).
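The correspondence between a LOD score of 1.18 and P < 0.01 can be checked with the usual asymptotic conversion, in which 2 ln(10) × LOD is referred to a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom. The sketch below applies that conversion and the three evidence tiers. The function names are hypothetical, and treating the lowest tier as "at least 1.18" is an assumption of this sketch.

```python
import math


def lod_to_p_value(lod: float) -> float:
    """Asymptotic one-sided P value for a LOD score under the mixture null."""
    if lod <= 0.0:
        return 1.0
    # 0.5 * P(chi2_1 > 2 ln(10) * LOD) = 0.5 * erfc(sqrt(LOD * ln 10))
    return 0.5 * math.erfc(math.sqrt(lod * math.log(10.0)))


def classify_lod(lod: float) -> str:
    """Evidence tier for an empirical LOD score, following the thresholds in the text."""
    if lod > 3.0:
        return "significant"
    if lod > 2.0:
        return "suggestive"
    if lod >= 1.18:  # assumption: threshold read as "at least 1.18"
        return "potentially interesting"
    return "none"


for lod in (1.18, 2.0, 3.0):
    print(lod, f"{lod_to_p_value(lod):.5f}", classify_lod(lod))
```

Under this conversion a LOD of 1.18 falls just below P = 0.01, and a LOD of 3 corresponds to a P value of roughly 0.0001.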
RESULTS |
---|
In addition to multipoint linkage analysis, two-point linkage analysis with each marker was completed for diabetes (fasting and WHO) adjusted only for age and age² as well as the (fasting and WHO) Martingale residual traits (Table 3). Furthermore, to follow up the linkage signal on 3p, we typed 10 additional microsatellite markers in the 705 individuals for whom we have ample DNA through lymphoblastoid cell lines. The markers encompassed a region of 3p from 49.2 to 81.9 Mb. Two-point empirical LODs for the 10 newly typed microsatellite markers on chromosome 3 in the region of interest are found in Table 3. Moreover, multipoint analysis including these additional markers resulted in a linkage profile that was similar to the original but slightly narrowed. The peak linkage signal for the (fasting) Martingale residual trait was at 112 cM (empirical LOD of 3.55) using the new map.
DISCUSSION |
---|
For the diabetes (fasting) age of onset model, the 1.5-LOD support interval (95% CI) around our peak spanned ~20 cM from marker D3S4542 to marker D3S4529 before typing the 10 additional microsatellite markers. However, with the 10 additional microsatellite markers, the 1.5-LOD support interval was narrowed and spans an ~10-cM region from marker D3S1296 to marker D3S3633 that harbors ~40 identified (known and predicted) genes.
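A 1.5-LOD support interval is the contiguous stretch around the peak in which the LOD score stays within 1.5 units of its maximum. The sketch below finds that stretch on a grid of map positions. The function name is hypothetical, and the LOD profile is invented for illustration and is not the profile observed in this study.

```python
def lod_support_interval(positions_cm, lods, drop=1.5):
    """Return (left, right) map positions bounding the contiguous region
    around the peak where LOD >= peak LOD - drop."""
    if not lods or len(positions_cm) != len(lods):
        raise ValueError("positions and LOD scores must be non-empty and the same length")
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions_cm[left], positions_cm[right]


# Made-up multipoint profile sampled every 5 cM.
positions = [95, 100, 105, 110, 115, 120, 125]
profile = [0.8, 1.9, 2.6, 3.5, 2.4, 1.7, 0.9]
print(lod_support_interval(positions, profile))  # (105, 115)
```

Because the interval is read off a grid, its width depends on marker density, which is one reason typing additional markers in a region can narrow it.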
The San Antonio Family Diabetes Study has previously reported evidence for linkage to type 2 diabetes and nominal evidence for linkage to fasting-specific insulin in the 3p region implicated in this expanded SAFDGS. Specifically, Duggirala et al. (24) in 1999 reported a multipoint LOD of 2.56 for diabetes near marker GATA128C02 (Marshfield map, 112 cM) in 440 San Antonio Family Diabetes Study participants who were examined at baseline and in 2001 reported a multipoint LOD of 1.00 for fasting-specific insulin in 310 nondiabetic San Antonio Family Diabetes Study participants at the same locus (38). Moreover, the San Antonio Family Heart Study, which is composed of a similar study population, reported a LOD of 3.07 for serum insulin in 391 nondiabetic individuals on chromosome 3p in the region flanked by D3S1600 (86.0 cM, Marshfield map) and D3S1285 (91.0 cM, Marshfield map) (39). Similar to the SAFDGS, the San Antonio Family Heart Study is an extended pedigree study of low-income Mexican Americans in San Antonio, Texas. Differences between the two studies include ascertainment without regard to health status of the proband in the San Antonio Family Heart Study as well as a younger mean age at baseline in the San Antonio Family Heart Study.
Moreover, previously in the San Antonio Family Diabetes Study, Duggirala et al. (24) not only reported suggestive evidence of linkage to diabetes on 3p but reported significant evidence of linkage on 10q as well as suggestive evidence of linkage on 9p. Contrary to the earlier findings concerning 10q that have been replicated in a number of other study populations examining diabetes and related phenotypes (16, 37, 40–42), in the expanded SAFDGS, we do not find evidence of linkage to diabetes on 10q; however, we do replicate the earlier findings on both 3p and 9p. Explanations for discrepancies between previous and current findings on 10q include 1) changes in how diabetes is defined (i.e., the new fasting glucose threshold, 126 vs. 140 mg/dl), 2) changes in the study population (i.e., new pedigrees have been added and old pedigrees have been expanded) that could introduce further heterogeneity, 3) collection of phenotypic information at three versus a single time point and the subsequent aging of the study population, 4) the new set of microsatellite markers, and 5) secular trends in the prevalence of diabetes (1). Therefore, to help identify which explanation was most plausible, we completed a genome-wide linkage analysis on the 579 individuals at the San Antonio Family Diabetes Study baseline examination using the new definitions of diabetes and the new genotypic information from CIDR. Results from this analysis replicate the previous 10q findings (i.e., Martingale residual [fasting], LOD = 3.00; Martingale residual [WHO], LOD = 2.57). Hence, neither the new definition of the phenotype nor the new marker set accounts for the loss of the 10q signal. Its disappearance, therefore, must relate to the expanded population, aging of our population, or secular trends in the incidence of diabetes. We attempted to control for the latter by controlling for birth decade.
Different genes are likely to influence diabetes susceptibility at different ages; hence, the aging of our population over the three visits may have influenced the observed linkage signals. Moreover, because the gene pool is relatively stable, the rapid increase in the prevalence of diabetes in recent years reflects environmental changes that may call into play additional diabetes susceptibility genes through gene-environment interaction.
Other studies from outside San Antonio, Texas, have also reported evidence for linkage to type 2 diabetes or type 2 diabetes–related traits on chromosome 3p in the region implicated in this study. Our 3p region has previously been linked to type 2 diabetes in a Chinese population (maximum reported LOD of 2.27) (21), in a family study focused on identifying genes associated with maturity-onset diabetes of the young (maximum reported heterogeneity LOD of 1.81) (8), and in an ordered subset analysis of a Japanese population in which families were rank ordered based on average maximal BMI of sib-pairs and maximal BMI was adjusted for (maximum reported LOD of 2.42) (17). Additionally, our 3p region has been identified as having slight evidence for linkage to type 2 diabetes in several studies (43, 44), to fasting glucose in the Framingham Offspring study (16), and to acute insulin response in the Pima Indians (45). Finally, four additional studies have reported evidence of linkage to type 2 diabetes near our region, 50 cM toward the p-terminal end of chromosome 3 (12, 13, 20, 46).
Ideally, in an extended pedigree study of a discrete trait, complete information on who acquired the trait and their age of onset would be available. Therefore, when available, we used serial data to define diabetes status. This enabled us to exclude individuals likely misclassified who were identified as having diabetes at one examination but who at a subsequent examination no longer met the criteria for having diabetes (1% of the population for either definition). In addition, because we used information across three examinations that occurred over 10 years, we were able to identify a greater number of individuals as having diabetes (i.e., those who were initially nondiabetic but who developed diabetes at a later examination) as well as to determine more accurately their age of diabetes onset. Finally, by combining information across the three examinations, we maximized the number of individuals included in the analyses. One limitation of combining information collected over a 10-year period is that it tends to magnify the effect of secular trends. To help overcome this limitation, we adjusted for birth decade to help account for secular trends in the incidence of diabetes.
In comparison with modeling the discrete trait diabetes, which includes information only on whether or not an individual has diabetes (0, 1), the Martingale residual can be viewed as a continuous measure that incorporates information based on a participant's age at entry into the study, whether they develop diabetes during the study, age of diabetes diagnosis if they develop diabetes, and age at the end of the study if they do not develop diabetes. For example, in our study, a participant who enters the study at age 21 and develops diabetes (fasting) at age 25 has a Martingale residual of 0.96, whereas a participant who enters the study at age 64 and is 72 when the study ends and still does not have diabetes has a Martingale residual of −0.71. Therefore, use of the Martingale residual allows us to model age of diabetes diagnosis because it assigns a value to individuals based on their age of diabetes diagnosis if they have diabetes or based on their attained age without developing diabetes if they do not have diabetes. In summary, for each definition of diabetes (fasting and WHO), the LOD for the Martingale residual is considerably higher than the LOD for the discrete trait; however, this is not unexpected because the Martingale residual is more precise, incorporating additional information.
We examined two definitions of diabetes because a definition based on fasting glucose alone avoids the problems associated with including 2-h glucose levels, which increases the number of individuals with missing information on diabetes (because of missing 2-h values) and has lower repeatability. However, it is reassuring that evidence for linkage to diabetes (fasting) is similar to evidence for linkage to diabetes (WHO). The slightly higher LOD for the Martingale residual for diabetes (fasting) relative to the Martingale residual for diabetes (WHO) (3.76 vs. 3.29) may be explained by the lower percentage of people with missing data as well as the higher repeatability of the diabetes (fasting) definition. Moreover, although adjusting for birth decade to help account for secular trends in the incidence of diabetes only slightly attenuated the signals, additionally adjusting for waist circumference further reduced the LODs by 24.5 and 29.0% for the diabetes age of onset model for the fasting and WHO definitions, respectively. Interestingly, there is no evidence for linkage to waist circumference in the SAFDGS in this region on chromosome 3 (K.J.H., unpublished data).
A number of genes in the region of linkage on chromosome 3p encode proteins that appear to have biological relevance to the pathophysiology of diabetes. Their function points to a role in either insulin secretion or insulin action. Succinyl-CoA synthetase (SUCLG2, MIM 603922) is expressed in the mitochondria and in the cytosol of pancreatic islet cells as well as numerous other tissues (47). Alterations in SUCLG2 could affect key functional steps in mitochondrial metabolism leading to insulin secretion (47–50). Forkhead box P1 (MIM 605515) is a novel member of a subfamily of winged helix/forkhead/HNF3 transcription factors (51). Mutations in two related members, FOXP3 and Foxo1 (mouse homolog of FOXO1), have been shown to be involved in the pathogenesis of diabetes (52, 53). Glycogen-branching enzyme (MIM 607839) plays a role in glycogen synthesis and storage, a critical factor in energy homeostasis (54). Glycogen-branching enzyme is an attractive candidate because there is substantial evidence to suggest that impaired glycogen synthesis (nonoxidative glucose disposal) could be a primary metabolic defect in pre-diabetic individuals (55).
In summary, our genome-wide multipoint linkage scan for type 2 diabetes in a Mexican-American population found evidence for linkage on chromosome 3p. Moreover, this finding replicates prior findings of linkage to type 2 diabetes and fasting-specific insulin in the San Antonio Family Diabetes Study, as well as three independent studies that have reported evidence for linkage to type 2 diabetes in this region and a single independent study that has reported evidence for linkage to serum insulin in this region.
ACKNOWLEDGMENTS |
---|
Genotyping services were provided by the CIDR. Yasmin Ench, Korri Weldon, and Jeanette Hamlington provided excellent technical assistance. We thank the participants of the SAFDGS families for their support and cooperation.
FOOTNOTES |
---|
Address correspondence and reprint requests to Kelly J. Hunt, PhD, Division of Clinical Epidemiology, Department of Medicine, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Dr., San Antonio, TX 78229-3900. E-mail: huntk{at}uthscsa.edu
Received for publication January 14, 2005 and accepted in revised form June 3, 2005
CIDR, Center for Inherited Disease Research; LOD, logarithm of odds; SAFDGS, San Antonio Family Diabetes/Gallbladder Study; WHO, World Health Organization