* Department of Evolutionary Biology, Uppsala University, Uppsala, Sweden
Molecular and Computational Biology Program, University of Southern California
Correspondence: E-mail: hans.ellegren{at}ebc.uu.se.
Abstract
Key Words: microsatellite, male-bias, germ line mutation, polymorphism, polymerase chain reaction (PCR)
Introduction
The basic requirement for studies of the latter kind is the possibility of analyzing large numbers of germline transmissions of a particular marker in a given parent. Microsatellite genotyping of sperm samples can meet this requirement as it offers a virtually unlimited supply of gametes. Holtkemper et al. (2001) used a small-pool polymerase chain reaction (PCR) approach for microsatellite mutation detection in sperm, where DNA corresponding to up to 40 cells was genotyped at the same time. This approach is efficient in terms of throughput, but it may be disadvantageous for other reasons. For instance, the stutter bands typically generated in vitro during microsatellite PCR amplification may make the identification of mutant alleles in pools difficult, in particular when it comes to deletion mutations.
In this study we use single-molecule genotyping (Li et al. 1988; Arnheim, Li, and Cui 1990; Zhang et al. 1994) of DNA prepared from sperm cells to examine the rate and pattern of microsatellite mutation in individual men. Specifically, we study mutation events in the highly mutable tetranucleotide repeat marker D21S1245 (Talbot et al. 1995) in eight males of different ages. These men were chosen to represent a "young" (18–23 years) or an "old" (46–56 years) group of males, to allow a test of the influence of paternal age on the germline mutation rate.
Materials and Methods
A nested PCR approach was applied with two internal primers in the second PCR. The first PCR contained 10 mM Tris-HCl (pH = 8.3), 50 mM KCl, 1.75 mM MgCl2, 0.2 mM of each dNTP, 2.0 pmol each of primers D21S1245C (5'-GCTGAATTCAGTTTGCTGG-3') and D21S1245D (5'-TGAAAAACAGAGAAGGAGGG-3'), and 0.5 U Taq polymerase (Applied Biosystems or Promega), in a total volume of 20 µl. Amplification was initiated with 5 min at 94°C, followed by 30 cycles of 30 s at 95°C, 30 s at 50°C, and 30 s at 72°C. The final extension step was extended to 5 min. For the second PCR, 3.0 µl from the first amplification was used as template. Conditions were as above with the exception that 5.0 pmol each of primers D21S1245A and D21S1245B (Talbot et al. 1995) was used, and that 35 cycles with an annealing temperature of 60°C were run. One of the nested primers was fluorescence-labeled to allow subsequent detection.
The PCR products were analyzed on either a Beckman Coulter 8-capillary instrument or an ABI377 sequencing instrument. Each fragment was run together with a high-density internal size standard. To reduce the risk of contamination, the first PCR was set up in a hood with a positive flow of sterile filtered air. The hood was decontaminated with UV light before each use, and filter tips were consistently used throughout the pre-PCR steps. The second PCR was set up in a different room.
DNA Sequencing
The PCR products of individual microsatellite alleles were cloned into the pGEM-T vector (Promega) and heat-shock transformed into E. coli JM109 cells using the manufacturer's protocol (Promega). To ensure that clones contained the appropriate fragment, and not a stutter artifact band from PCR amplification, insert size was screened by clone PCR amplification using the original primers and comparison of insert length with that obtained in amplification from genomic DNA. Only clones containing inserts with the same length as the control were used as templates for sequencing. Clones to be sequenced were amplified using a TempliPhi kit (Amersham), and cycle sequencing reactions were performed with BigDye chemistry and analyzed on an ABI377 sequencing instrument. Sequences are available in GenBank under Accession numbers AY193847–AY193862.
Scoring and Interpretation
Amplifications were performed in 96-well microtiter plates with one positive control (genomic DNA from the male in question) and one negative control. If contamination was indicated, the results from all amplifications from a particular microtiter plate were discarded. The positive control served as a reference for unmutated alleles, and deviations from these reference lengths were regarded as potential mutations. The second PCR and the subsequent fragment analysis were always repeated for potential mutations.
The stutter (artifact) bands typically seen in microsatellite DNA amplification and the fact that two molecules could be present as templates in a PCR reaction required a criterion for a deviant fragment (a fragment differing in size from the two alleles of a heterozygote individual) to be interpreted as a mutation event. For instance, the simultaneous amplification of an unmutated allele and a mutant one repeat unit shorter than the particular unmutated allele could potentially be difficult to distinguish from one unmutated allele showing extensive stuttering. We therefore applied the following criterion: when a reaction showing a deviant fragment also showed a fragment of the size of the unmutated allele, the deviant fragment was interpreted as a mutation only if the peak height at the unmutated position was at most 33% of the peak height of the potential mutant. Although this threshold is arbitrary, we consider the criterion to be conservative for our purposes, as it seems highly unlikely that a truly unmutated allele would also show an artifact band of much higher peak height at another position.
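The peak-height criterion described above can be written as a small predicate. This is a minimal sketch and not the authors' scoring software; the function name, argument names, and the example peak heights are hypothetical, and only the 33% threshold comes from the text.

```python
def is_mutation_call(deviant_peak_height, unmutated_peak_height, max_ratio=0.33):
    """Return True if a deviant fragment passes the peak-height criterion.

    deviant_peak_height: peak height of the potential mutant fragment.
    unmutated_peak_height: peak height at the size of the unmutated allele
        in the same reaction (0 if no such peak is present).
    max_ratio: threshold from the text (33%); the parameter name is ours.
    """
    if deviant_peak_height <= 0:
        raise ValueError("deviant_peak_height must be positive")
    # Accept the call only when the unmutated-size peak is small relative
    # to the deviant peak, i.e. the deviant peak is not plausibly stutter.
    return unmutated_peak_height <= max_ratio * deviant_peak_height


# Hypothetical peak heights in arbitrary fluorescence units.
clear_mutant = is_mutation_call(1000, 300)   # unmutated peak is 30% of mutant
ambiguous = is_mutation_call(1000, 400)      # unmutated peak is 40% of mutant
lone_mutant = is_mutation_call(1000, 0)      # no peak at the unmutated size
```

Under these made-up values the first and third reactions would be scored as mutations and the second would not.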
The origin of mutant alleles was considered to be the allele that was closest in size to new length variants (cf. Beck, Double, and Cockburn 2003). Potential mutant alleles differing considerably in size from unmutated alleles may represent PCR artefacts, and we therefore excluded five deviant fragments which differed in size by 10 or more repeat units from progenitor alleles.
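The progenitor assignment and the size-based exclusion can be sketched as follows. This is an illustration under stated assumptions: fragment sizes are taken to be in base pairs, one repeat unit is taken to be 4 bp (tetranucleotide), and the allele sizes in the example are invented; none of these names or values come from the study.

```python
REPEAT_UNIT_BP = 4  # assumption: one tetranucleotide repeat unit = 4 bp


def assign_progenitor(mutant_bp, allele_sizes_bp, max_units=10):
    """Assign a deviant fragment to the closest parental allele.

    Returns (progenitor_bp, change_in_repeat_units), or None when the
    fragment differs from its closest allele by max_units or more repeat
    units and is therefore excluded as a possible PCR artifact.
    Ties are resolved in favour of the first allele listed.
    """
    progenitor = min(allele_sizes_bp, key=lambda size: abs(mutant_bp - size))
    change_units = (mutant_bp - progenitor) / REPEAT_UNIT_BP
    if abs(change_units) >= max_units:
        return None
    return progenitor, change_units


# Hypothetical heterozygote with alleles of 240 and 268 bp.
alleles = (240, 268)
contraction = assign_progenitor(236, alleles)  # one unit shorter than 240
expansion = assign_progenitor(272, alleles)    # one unit longer than 268
excluded = assign_progenitor(200, alleles)     # ten units from 240
```

A negative change denotes a contraction and a positive change an expansion.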
The exact volume of template DNA to be used in each PCR reaction (corresponding to one genome equivalent per reaction) was empirically determined by making dilution series. One genome equivalent is reached when 63% of the PCR reactions show amplification (1 − e^−1 ≈ 0.63 is the Poisson probability of at least one template when the mean is one). However, the dilution approach implies that not all reactions will contain just one template molecule, so occasionally there will be two or more templates per aliquot, and in other cases there will be none. The likelihood for any given reaction to carry zero, one, two, or more DNA molecules is constant and can be described by the Poisson distribution with mean one. Obviously, the number of cases with two or more template molecules present in the same reaction has to be taken into account when estimating the total number of genomes analyzed (i.e., to be able to derive mutation rate estimates). Among the reactions that show only one visible allele (Nv), a certain number in fact represent the amplification product of two alleles of the same length (Ns). Assuming that there is no bias in the appearance of the two alleles, Ns can be estimated by the number of reactions that contain two different alleles (Nd) (i.e., Ns = Nd). Subtracting Ns (estimated by Nd) from Nv leaves the number of reactions that contain only one allele. To this number must be added the number of alleles found in reactions containing two different alleles (2 × Nd) (Nd multiplied by 2, because there are two alleles in each of these reactions) and the number of alleles in reactions containing two alleles of the same length (2 × Ns) (estimated by 2 × Nd). The total number of alleles analyzed (Ntot) can then be approximated by:
$$N_{\mathrm{tot}} \approx (N_v - N_d) + 2N_d + 2N_d = N_v + 3N_d \qquad (1)$$
We used equation (1) to estimate the total number of template genomes present in each microtiter plate separately (a master mix including template DNA was made for one plate at the time). This approach requires all individuals be heterozygotes, which was the case in our study. It should be noted that we ignore the fact that some reactions will actually contain more than two template molecules. However, the proportion of such cases should be less than 7% when using a template concentration corresponding to one genome equivalent per reaction, so it should only have a marginal effect on the total estimate.
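The Poisson reasoning behind the dilution approach can be checked numerically. The sketch below assumes a mean of exactly one template per reaction, as stated in the text; the function and variable names are ours.

```python
import math


def poisson_pmf(k, mean=1.0):
    """Probability of exactly k template molecules in a reaction."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


p_zero = poisson_pmf(0)
p_one = poisson_pmf(1)
p_two = poisson_pmf(2)
p_exactly_three = poisson_pmf(3)

# Fraction of reactions expected to show amplification (at least one template).
p_positive = 1.0 - p_zero

# Fraction of reactions expected to hold more than two templates,
# which the allele-count estimate ignores.
p_three_or_more = 1.0 - (p_zero + p_one + p_two)

print(f"P(>=1) = {p_positive:.3f}")
print(f"P(=3)  = {p_exactly_three:.3f}")
print(f"P(>=3) = {p_three_or_more:.3f}")
```

With a mean of exactly one, this gives about 0.63 for at least one template, about 0.061 for exactly three templates, and about 0.080 for three or more. In practice the realized mean of a dilution will deviate somewhat from one, so these values are only a guide.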
For all eight men, approximately equal numbers of the two alleles were amplified (data not shown), thus giving no indication of segregation distortion or selective amplification of certain alleles (implying Nd = Ns).
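Under the assumption Ns = Nd, the per-plate allele count described above reduces to a one-line calculation, and a mutation rate follows by dividing the number of mutations by that count. The sketch below is illustrative only; the function names and the plate counts in the example are hypothetical and are not data from the study.

```python
def estimate_total_alleles(n_single_visible, n_two_different):
    """Estimate the number of template alleles analyzed on one plate.

    n_single_visible: reactions showing one visible allele (Nv).
    n_two_different: reactions showing two different alleles (Nd).
    Assumes Ns = Nd, i.e. reactions holding two alleles of the same length
    are as frequent as reactions holding two different alleles, and ignores
    reactions with more than two templates.
    """
    if n_two_different > n_single_visible:
        raise ValueError("Nd cannot exceed Nv under the Ns = Nd assumption")
    n_same_length = n_two_different                 # Ns estimated by Nd
    n_truly_single = n_single_visible - n_same_length
    return n_truly_single + 2 * n_two_different + 2 * n_same_length


def mutation_rate(n_mutations, n_total_alleles):
    """Mutations per allele analyzed."""
    return n_mutations / n_total_alleles


# Hypothetical plate: 50 single-allele reactions, 10 two-allele reactions.
n_tot = estimate_total_alleles(50, 10)   # 40 + 20 + 20
rate = mutation_rate(2, n_tot)           # 2 hypothetical mutations
```

Summing the per-plate estimates for one man would give the denominator for his overall rate, since the text applies the estimate to each plate separately.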
Results
Allelic Lineages and their Mutation Rate
To gain further insight into the mutation process at D21S1245, we sequenced all 16 alleles carried by the eight men included in the study. Although only 12 different allele sizes were scored, sequence data revealed all 16 alleles to represent unique variants. The repeat structure proved to be extremely complex, with a main (GAAA)n tetranucleotide repeat followed by a number of A- and G-containing repeat derivatives, generally iterated only a few times (fig. 3). In some alleles, however, a second (GAAA)n sequence toward the 3' end of the complex showed almost as many repeats as the main block. Importantly, the overall repeat structure differed significantly between alleles, and it was clear that three major allelic lineages could be defined. The first lineage ("lineage 1"), represented by six different alleles (11 through 16), was characterized by a diagnostic 22-bp insertion. Alleles of the second ("lineage 2"; 21 through 26) and the third ("lineage 3"; 31 through 33) lineages displayed a specific repeat structure in the very 3' end of the complex. Lineage 3 differed from lineage 2 by a 22-bp deletion (different from the above-mentioned 22-bp insertion). For one allele, designated 4, lineage affiliation was unclear. This allele may evidence a mutation event based on recombination, which could have formed a new allele originating from lineage 2 in the 5' end and lineage 1 in the 3' end through crossing-over. Alternatively, it may represent an evolutionary transition state between lineages 1 and 2.
Not only did mutation rates differ between allelic lineages; the lineages also showed contrasting ratios of contraction to expansion mutations. Although contractions were about two and four times as common as expansions in lineages 2 (33/14) and 3 (47/11), respectively, expansions (40) were about as common as contractions (39) in lineage 1. The cause of this variation is unknown, and indeed intriguing; the observation represents a previously unknown form of mutation heterogeneity within a microsatellite locus.
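The contraction and expansion counts quoted above can be turned into ratios and compared with an even split. The sketch below is an illustration and not an analysis reported in the study: the two-sided exact binomial test is our choice, and it treats every mutation event as independent.

```python
from math import comb

# (contractions, expansions) per lineage, taken from the counts in the text.
counts = {
    "lineage 1": (39, 40),
    "lineage 2": (33, 14),
    "lineage 3": (47, 11),
}


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial P value: sum of all outcome probabilities
    that are no larger than the probability of the observed outcome."""
    pmf = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= threshold))


results = {}
for name, (contractions, expansions) in counts.items():
    ratio = contractions / expansions
    p_value = binomial_two_sided_p(contractions, contractions + expansions)
    results[name] = (ratio, p_value)
    print(f"{name}: contraction/expansion = {ratio:.2f}, P = {p_value:.2g}")
```

The ratios reproduce the "about two" and "about four" figures for lineages 2 and 3 and a ratio close to one for lineage 1.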
From the sequences of the individual alleles, we finally addressed whether the length of the longest uninterrupted repeat stretch was correlated with mutation rate; no such relationship was found (r² = 0.012, P = 0.96).
Discussion
Mutation Rate Variation Between Age Groups and Allelic Lineages
The rate of human mutation, and its determinants, is of considerable general interest as it has a bearing on health risks (Crow 2000) and human evolution (Eyre-Walker and Keightley 1999). It is often assumed that a significant number of all point mutations in the germ line arise because of replication errors in connection with cell division. It follows that the number of germ line cell divisions should be an important factor in governing the mutation rate. The germ line mutation rate of humans is male-biased (Hurst and Ellegren 1998; Makova and Li 2002), and this is compatible with the contrasting number of cell divisions in spermatogenesis and oogenesis. Along the same line of thinking, it has been postulated that the male mutation rate should be positively correlated with paternal age as men continue to produce sperm cells throughout adulthood (Crow 2000). However, empirically testing this hypothesis is difficult. An indirect test of the male age effect has been made by studying the age of fathers in cases of a spontaneous dominant mutation leading to hereditary disease (e.g., achondroplasia, Apert's syndrome, neurofibromatosis). This approach has supported the idea of an increase in the mutation rate with age (but see Tiemann-Boege et al. 2002), although the precise form of the relationship seems to vary among diseases (Risch et al. 1987; Vogel and Motulsky 1997; Crow 2000). However, mutations at one site or at a limited number of hot spot sites, which is the case for several diseases, may not be representative of the overall pattern of mutation in the human genome. For instance, achondroplasia is caused exclusively by recurrent mutations at a hypermethylated cytosine site in the paternal germ line only (Wilkin et al. 1998).
Although length mutations caused by replication slippage at microsatellite loci represent a different class of mutation from point mutations (nucleotide substitutions), it is reasonable to assume that the number of germ line cell divisions should affect the mutation rate of the former class as well (mitotic replication events likely representing the predominant source of mutation). This assumption is supported by the observation that the human microsatellite mutation rate shows a male-bias similar to the rate for point mutations (Ellegren 2000a; Xu, Peng, and Fang 2000). We therefore expected that the group of older males would display a higher microsatellite mutation rate in sperm than the group of younger males. Contrary to that expectation, however, young men had a higher mean mutation rate at D21S1245 than old men. Does this challenge the long-held view of a positive relationship between paternal age and mutation rate?
To be able to compare the mutation rates of different age groups, it is of course important that factors other than age not strongly influence the mutation rate or, alternatively, that they can be controlled for. For this study, sperm donors within the different age categories had been carefully selected to give similarly sized sets of alleles in the two groups, as it is known that the microsatellite mutation rate often varies between alleles according to a repeat length effect (Ellegren 2000b). It was thus not expected that any mutation rate variation related to repeat length would mask a possible male age effect. (Incidentally, during the course of the study we were unable to demonstrate a strict relationship between allele length and mutation rate at D21S1245.) However, it is possible that rate variation between allelic lineages would mask such an effect. One important finding of this study was the identification of three phylogenetically well-defined allelic lineages and the variation in mutation rate seen between alleles of these lineages. Biased representation of alleles from the three allelic lineages may explain, at least in part, the observation of a higher mutation rate in young men than in older men. For instance, the group of young men had two alleles from lineage 3, which had the highest mutation rate, and two alleles from lineage 2, which had the lowest rate. For the group of old males there was only one allele from lineage 3 but four alleles from lineage 2. An unbiased assessment of the effect of age on the mutation rate at D21S1245 should ideally be based on the same alleles studied in men of different ages, chiefly because the mutation rate also varied considerably within allelic groups. Notably, the alleles with the lowest and the highest estimated mutation rate in the whole sample were both from lineage 1. It is thus possible that cis-acting elements have a significant effect on the mutation rate of individual alleles.
In summary, we find it premature to conclude from our data that there is no increase in the germ line mutation rate at D21S1245 with age. Nevertheless, there is no indication that there is a strong age effect. Maybe the rates of point mutation and microsatellite mutation respond differentially to an increase in male age.
Additional Heterogeneity in the Mutation Process Between Allelic Groups
There are several examples of microsatellite loci where the degree of polymorphism has been found to vary between allelic groups (Jin et al. 1996). Allelic groups may be defined by the presence or absence of interruptions within perfect repeat arrays, and the general trend is that interruptions lower the degree of polymorphism. Variation in polymorphism levels is likely to translate into underlying mutation rate variation, although the evidence for this possibility is only indirect. D21S1245 is a particularly complex microsatellite locus, but the observed variation in mutation rate between allelic lineages seems not to be related to the overall degree of complexity at this locus, at least not in a simple way (cf. fig. 3). Moreover, a high mutation rate cannot be explained by the presence of large repeat numbers in the second (GAAA)n repeat toward the 3' end. In any case, our data provide the first direct support for significant mutation rate variation between allelic lineages within a microsatellite locus. Perhaps allelic variation in adjacent sequences affects the relative instability of D21S1245, similar to the situation at some minisatellite loci (Monckton et al. 1994). Talbot et al. (1995) found no effect on D21S1245 mutability of markers at 2 cM distance; however, the possibility that there are cis-acting alleles at a much closer distance cannot be excluded.
Mechanisms of Mutation
Although length mutations compatible with replication slippage can broadly explain length polymorphism at D21S1245, a closer inspection of the repeat structures of individual alleles (fig. 3) suggests that other mutations also contribute to the polymorphism seen at this locus. In our set of alleles there are nine sites of interruptions within arrays of (GAAA)n (in two cases arrays of [GAGAAA]n) which are represented by perfect repeats in other alleles. Interestingly, all nine cases constitute A > G or G > A substitutions. Assuming the direction of mutation has always been from a perfect repeat to an imperfect, six cases of A > G and three cases of G > A substitution can be identified. This non-random pattern of mutation represents an extreme transition:transversion bias (i.e., there is not a single C or T among nine mutation events) and would either suggest a strong strand bias for point mutation to A or G influenced by sequence context, or a mechanism of mutation that is different from point mutation and is affected by the G- and A-rich nature of this repeat locus. The latter could involve gene conversion or other recombination-like events, and perhaps replication slippage as well (Palsbøll, Berube, and Jørgensen 1999).
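To give a feel for how extreme nine transitions out of nine substitutions is, the probability of that outcome can be computed under a simple null model. This is a back-of-the-envelope sketch and not a calculation from the study: it assumes the nine events are independent, and the two per-event transition probabilities used (1/3 for equal rates to each of the three alternative bases, 2/3 for roughly two transitions per transversion) are illustrative assumptions.

```python
def prob_all_transitions(n_events, p_transition):
    """Probability that all n independent substitutions are transitions."""
    if not 0.0 <= p_transition <= 1.0:
        raise ValueError("p_transition must lie between 0 and 1")
    return p_transition ** n_events


N_EVENTS = 9  # nine A > G or G > A interruptions described in the text

# Null 1: each base mutates to any of the three other bases with equal
# probability, so one of three possible changes is a transition.
p_uniform = prob_all_transitions(N_EVENTS, 1 / 3)

# Null 2 (assumed): transitions are about twice as frequent as transversions.
p_biased = prob_all_transitions(N_EVENTS, 2 / 3)

print(f"uniform null: {p_uniform:.2e}")
print(f"2:1 transition bias: {p_biased:.3f}")
```

Under the uniform assumption the probability is 1/19683, and under the assumed 2:1 bias it is 512/19683, so the observed pattern is unlikely under either null model, which is consistent with the sequence-context or non-point-mutation explanations considered above.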
Literature Cited
Arnheim, N., H. H. Li, and X. F. Cui. 1990. PCR analysis of DNA sequences in single cells: single sperm gene mapping and genetic disease diagnosis. Genomics 8:415-419.
Beck, N. R., M. C. Double, and A. Cockburn. 2003. Microsatellite evolution at two hypervariable loci revealed by extensive avian pedigrees. Mol. Biol. Evol. 20:54-61.
Brinkmann, B., M. Klintschar, F. Neuhuber, J. Huhne, and B. Rolf. 1998. Mutation rate in human microsatellites: influence of the structure and length of the tandem repeat. Am. J. Hum. Genet. 62:1408-1415.
Crow, J. F. 2000. The origins, patterns and implications of human spontaneous mutation. Nat. Rev. Genet. 1:40-47.
Ellegren, H. 2000a. Heterogeneous mutation processes in human microsatellite DNA sequences. Nat. Genet. 24:400-402.
Ellegren, H. 2000b. Microsatellite mutations in the germline: implications for evolutionary inference. Trends Genet. 16:551-558.
Eyre-Walker, A., and P. D. Keightley. 1999. High genomic deleterious mutation rates in hominids. Nature 397:344-347.
Holtkemper, U., B. Rolf, C. Hohoff, P. Forster, and B. Brinkmann. 2001. Mutation rates at two human Y-chromosomal microsatellite loci using small pool PCR techniques. Hum. Mol. Genet. 10:629-633.
Huang, Q. Y., F. H. Xu, H. Shen, H. Y. Deng, Y. J. Liu, Y. Z. Liu, J. L. Li, R. R. Recker, and H. W. Deng. 2002. Mutation patterns at dinucleotide microsatellite loci in humans. Am. J. Hum. Genet. 70:625-634.
Hurst, L. D., and H. Ellegren. 1998. Sex biases in the mutation rate. Trends Genet. 14:446-452.
Jin, L., C. Macaubas, J. Hallmayer, A. Kimura, and E. Mignot. 1996. Mutation rate varies among alleles at a microsatellite locus: phylogenetic evidence. Proc. Natl. Acad. Sci. USA 93:15285-15288.
Li, H. H., U. B. Gyllensten, X. F. Cui, R. K. Saiki, H. A. Erlich, and N. Arnheim. 1988. Amplification and analysis of DNA sequences in single human sperm and diploid cells. Nature 335:414-417.
Makova, K. D., and W. H. Li. 2002. Strong male-driven evolution of DNA sequences in humans and apes. Nature 416:624-626.
Monckton, D. G., R. Neumann, T. Guram, N. Fretwell, K. Tamaki, A. MacLeod, and A. J. Jeffreys. 1994. Minisatellite mutation rate variation associated with a flanking DNA sequence polymorphism. Nat. Genet. 8:162-170.
Palsbøll, P. J., M. Berube, and H. Jørgensen. 1999. Multiple levels of single-strand slippage at cetacean tri- and tetranucleotide repeat microsatellite loci. Genetics 151:285-296.
Primmer, C., N. Saino, A. Møller, and H. Ellegren. 1998. Unraveling the processes of microsatellite evolution through analysis of germline mutations in barn swallows (Hirundo rustica). Mol. Biol. Evol. 15:1047-1054.
Risch, N., E. W. Reich, M. M. Wishnick, and J. G. McCarthy. 1987. Spontaneous mutation and parental age in humans. Am. J. Hum. Genet. 41:218-248.
Schlötterer, C. 2000. Evolutionary dynamics of microsatellite DNA. Chromosoma 109:365-371.
Schlötterer, C., R. Ritter, B. Harr, and G. Brem. 1998. High mutation rate of a long microsatellite allele in Drosophila melanogaster provides evidence for allele-specific mutation rates. Mol. Biol. Evol. 15:1269-1274.
Talbot, C. C., Jr., D. Avramopoulos, S. Gerken, A. Chakravarti, J. A. Armour, N. Matsunami, R. White, and S. E. Antonarakis. 1995. The tetranucleotide repeat polymorphism D21S1245 demonstrates hypermutability in germline and somatic cells. Hum. Mol. Genet. 4:1193-1199.
Tiemann-Boege, I., W. Navidi, R. Grewal, D. Cohn, B. Eskenazi, A. J. Wyrobek, and N. Arnheim. 2002. The observed human sperm mutation frequency cannot explain the achondroplasia paternal age effect. Proc. Natl. Acad. Sci. USA 99:14952-14957.
Weber, J. L., and C. Wong. 1993. Mutation of human short tandem repeats. Hum. Mol. Genet. 2:1123-1128.
Wilkin, D. J., J. K. Szabo, and R. Cameron, et al. (12 co-authors). 1998. Mutations in fibroblast growth-factor receptor 3 in sporadic cases of achondroplasia occur exclusively on the paternally derived chromosome. Am. J. Hum. Genet. 63:711-716.
Vogel, F., and A. G. Motulsky. 1997. Human genetics: problems and approaches. Springer-Verlag, Berlin.
Xu, X., M. Peng, and Z. Fang. 2000. The direction of microsatellite mutations is dependent upon allele length. Nat. Genet. 24:396-399.
Zhang, L., E. P. Leeflang, J. Yu, and N. Arnheim. 1994. Studying human mutations by sperm typing: instability of CAG trinucleotide repeats in the human androgen receptor gene. Nat. Genet. 7:531-535.