Laboratory of Genomic Diversity, National Cancer Institute-Frederick, Frederick, Maryland
Microsatellites, repetitive simple sequences of 1-6 nt in length, are abundant in eukaryotic genomes (Weber 1990a). Because of extensive variability in the number of repeat units for any one locus among members of a population, microsatellite loci exhibit high polymorphism. Microsatellite variation has become a useful class of genetic markers for numerous species, applied to questions of genetic identification, parentage, kinship, and population variability (Jarne and Lagoda 1996; Goldstein and Pollock 1997).
The high level of polymorphism at microsatellite loci is believed to result from slipped-strand mispairing during DNA replication (Levinson and Gutman 1987; Weber 1990b; Weber and Wong 1993; Kruglyak et al. 1998), most commonly causing the gain or loss of one or more repeat units. This mutation mechanism would be expected to generate allelic homoplasy, i.e., comigrating allele size fragments which are not identical by descent or in DNA sequence among different individuals.
Because mutational slippage does not conform to the infinite-alleles model (Kimura and Crow 1964), a stepwise mutation model (SMM) (Ohta and Kimura 1973; Chakraborty and Nei 1982) has been invoked to explain allele frequency distributions and population variability at microsatellite loci (Valdez, Slatkin, and Freimer 1993; Kimmel et al. 1996). A distinguishing feature of the two models is that homoplasy is not considered under the infinite-alleles model, while it is an expectation of the stepwise mutation model (Viard et al. 1998). The SMM (but not the infinite-alleles model) makes the assumption that differences in allelic length are due to variation in the number of repeat units and not to insertions and deletions in the flanking sequences.
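The contrast between the two models can be illustrated with a small enumeration. The sketch below is only a toy under stated assumptions: every mutation adds or removes exactly one repeat unit, both directions are equally possible, and there is no constraint on allele size. The function name `smm_end_states` and the starting value of 15 repeats are hypothetical choices made for illustration, not part of the study.

```python
from itertools import product

def smm_end_states(start_repeats, n_mutations):
    """Enumerate every history of n single-step mutations (+1 or -1 repeat).

    Assumption (illustrative only): strict one-step, unbounded stepwise model.
    Returns the list of histories and the repeat count each history ends at.
    """
    histories = list(product((+1, -1), repeat=n_mutations))
    end_states = [start_repeats + sum(history) for history in histories]
    return histories, end_states

histories, end_states = smm_end_states(start_repeats=15, n_mutations=4)

# Under the infinite-alleles model each mutation creates a novel allele,
# so every distinct history would be a distinguishable allele.
n_histories = len(histories)

# Under the stepwise model, many histories converge on the same repeat count;
# those alleles comigrate without being identical by descent.
distinct_sizes = sorted(set(end_states))
back_to_start = end_states.count(15)

print(n_histories, distinct_sizes, back_to_start)
```

With four mutations there are 16 distinct histories but only five possible allele sizes, which is the sense in which homoplasy is an expectation of the SMM and not of the infinite-alleles model.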
Allelic size homoplasy, a condition under which comigrating alleles represent different sequence motifs, has been demonstrated through sequence analysis of electromorphs of compound or imperfect repeats of primates (Blanquer-Maumont and Crouau-Roy 1995), several fish species (Angers and Bernatchez 1997), invertebrates (Viard et al. 1998), and the fungus Candida albicans (Metzgar et al. 1998). Insertion and deletion events in genomic regions flanking microsatellites confer allelic size homoplasy in several marine turtle species (FitzSimmons et al. 1995) and humans (Homo sapiens) (Grimaldi and Crouau-Roy 1997). The occurrence of allelic size homoplasy is informative, since ignorance of its extent in a population may confound forensic, phylogenetic, or population diversity assessment. A few studies of microsatellite size homoplasy in natural populations have been reported (Estoup et al. 1995; Angers and Bernatchez 1997; Ortí, Pearse, and Avise 1997; Viard et al. 1998), but none of these studies assess mammalian populations.
A sampling of 277 free-ranging puma (Puma concolor) individuals from throughout the geographic distribution of this species (from the Canadian Yukon to the Patagonia region of Chile and Argentina) was analyzed for genetic diversity (Culver et al. 2000). In that phylogeographic study, relationships of 31 recognized puma subspecies were evaluated based on mitochondrial DNA (mtDNA) haplotypes and allele distributions of 10 feline microsatellite loci isolated from a genomic library of the domestic cat Felis catus (Menotti-Raymond and O'Brien 1995). Here, nucleotide sequences from 64 pumas homozygous for various length alleles at each of the 10 dinucleotide loci were determined. Microsatellite allele sequences were compared with each other and with the homologous domestic cat microsatellite locus in order to assess locus sequence structure between Felidae species and to estimate the incidence of allelic size homoplasy in pumas. Figure 1A and B presents sequence alignments for 76 different homozygous puma alleles. (Ten pumas were homozygous at more than one locus.) These results have bearing on the application of microsatellite locus genotypes in diversity and phylogeographic assessment of wide-ranging species of mammals.
Mutational differences in the structures of repeat units between cat and puma microsatellite homologs resulted in interrupted repeats of the long repeat arrays (fig. 1B). Such interruptions (particularly as observed in FCA008 and FCA262) can reduce the length of the repeat, which precedes the "decay" of the repeat region, a process suggested to lead to "death" of the microsatellite (Kruglyak et al. 1998; Taylor, Sanny, and Breden 1999). The sum of these differences between homologous puma and domestic cat microsatellite locus structures raises questions about the efficacy of these loci in phylogenetic comparisons among species or in genus-level comparisons of distantly related mammalian genera such as Puma versus Felis, estimated at 10 Myr divergence (Johnson and O'Brien 1997). In addition, compound repeat structures among four loci in the cat and four loci in the puma also contribute substantially to size homoplasy. In sum, approximately 80% of comigrating alleles between these two species reflect size homoplasy of sequentially distinctive alleles.
Within pumas, no differences in 874 bp of flanking DNA were observed across 10 loci, including 76 individuals homozygous for microsatellite alleles (fig. 1A and B). Thus, microsatellite estimates of population diversity and diversification were derived exclusively from the differences in repeat structural motifs. Similar invariance in microsatellite flanking regions has been observed in fish (Angers and Bernatchez 1997).
Three of the four interrupted repeat loci in pumas (FCA262, FCA096, and FCA008; fig. 1B) exhibited allele size homoplasy whereby identical size alleles displayed distinctive repeat structure compositions in different parts of the pumas' geographical range. Locus FCA262, a simple interrupted repeat in the puma, demonstrated size homoplasy among puma electromorphs containing 19 dinucleotide repeats (fig. 1B). Electromorphs at this locus had two distinctly different repeat structures: one electromorph with a pattern of (CA)6TA(CA)8, and a second electromorph of (CA)15 in the variable regions of the repeat. A North American subspecies (MIS [Mississippi]) contained the (CA)15 form, a South American subspecies (ACR [Brazil]) contained the (CA)6TA(CA)8 form, and a more centrally located subspecies (STA [Texas/Mexico]) contained both forms. Within the STA region, the individual from farther north (Texas) contained the northern form of the electromorph, and a puma from farther south (Mexico) contained the southern form.
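The FCA262 example can be made concrete by expanding the two repeat notations into nucleotide strings. The following is a minimal sketch, assuming only the variable-region notation given in the text; the helper `expand_repeat` and its regular expressions are hypothetical, and the invariant part of the repeat and the flanking sequence are deliberately left out.

```python
import re

# One token is either a counted motif such as "(CA)6" or a literal run such as "TA".
TOKEN = re.compile(r"\(([ACGT]+)\)(\d+)|([ACGT]+)")
VALID = re.compile(r"(?:\([ACGT]+\)\d+|[ACGT])+")

def expand_repeat(notation):
    """Expand repeat notation, e.g. '(CA)6TA(CA)8', into its nucleotide string."""
    if not VALID.fullmatch(notation):
        raise ValueError(f"unrecognized repeat notation: {notation!r}")
    parts = []
    for motif, count, literal in TOKEN.findall(notation):
        # A counted motif is repeated; a literal run is copied as written.
        parts.append(motif * int(count) if motif else literal)
    return "".join(parts)

interrupted = expand_repeat("(CA)6TA(CA)8")   # southern form in the text
uninterrupted = expand_repeat("(CA)15")       # northern form in the text

same_size = len(interrupted) == len(uninterrupted)
same_sequence = interrupted == uninterrupted

print(len(interrupted), len(uninterrupted), same_size, same_sequence)
```

Both variable regions span 15 dinucleotide units (30 nt), so the two alleles would comigrate on a gel even though they differ in sequence at the interrupting TA.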
Locus FCA096 consisted of a compound repeat that contained both uninterrupted and interrupted repeat structures within the puma (fig. 1B). This locus exhibited size homoplasy for the allele containing 37 dinucleotide repeats, and both electromorphs occurred in the STA subspecies. A compound uninterrupted allele at this locus (size 33) which does not exhibit size homoplasy appears to be regional, occurring only in two adjoining central South American subspecies (ACR and CAB).
The FCA008 locus, a simple uninterrupted repeat in the domestic cat, exhibited a complex compound interrupted repeat in the puma (fig. 1B). Three pairs of compound repeat alleles exhibited size homoplasy, those containing 20, 36, and 39 dinucleotide repeats. Different repeat regions of FCA008 (labeled A, C, and E in fig. 1B) occurred in alleles found throughout all geographical areas. However, only alleles from south of Honduras contained dimers in repeat regions B and D, resulting in a cline of repeat complexity at this locus. The other compound imperfect repeat locus (FCA166) contained a combination of dinucleotide and tetranucleotide repeats but exhibited no geographical component. Simple uninterrupted repeat loci (SU, fig. 1A) showed no apparent size homoplasy, although back mutation of repeat motif number represents a type of homoplasy not detectable by sequence analysis.
Of 1,168 possible pairwise allele comparisons (the sum of allele size pairwise combinations for each of 10 loci; fig. 1A and B), homoplasy was observed in 32 pairwise comparisons made between individuals (2.7%). Since alleles were sequenced using technology diagnostic for sequence polymorphism, each sequence was considered to represent two identical alleles. Thus, 97.3% of allele size comigrations within the puma species would represent authentic sequence identities.
The incidence of allele size homoplasy for a puma specieswide comparison is estimated as the product of included homoplastic allele sequence frequencies. Thus, for a microsatellite size allele of frequency p which includes two homoplastic sequences of equal frequency in our sequenced sample (as is the case for FCA262, FCA096, and FCA008; fig. 1B), the estimated homoplasy frequency equals (p/2)². The observed allele frequencies across 277 sampled pumas (Culver et al. 2000) for size alleles which display homoplasy are as follows: FCA262 electromorph size 136 = 0.63; FCA096 electromorph size 171 = 0.16; FCA008 electromorph sizes 86, 106, and 124 = 0.04, 0.63, and 0.03, respectively. The estimated homoplasy incidences per locus are 9.92% for FCA262, 0.64% for FCA096, and 9.98% for FCA008. For the 10 sampled microsatellite loci, the average size homoplasy incidence is (9.92% + 0.64% + 9.98%)/10 = 2.05%. In a subset of the four compound loci (FCA008, FCA096, FCA166, and FCA262) where it was possible to observe size homoplasy, 32 of 562 pairwise comparisons (5.7%) exhibited size homoplasy. Thus, for comparisons within the puma, a species which exhibits considerable genomic diversity, the incidence of allele size homoplasy was rather low (2.1%-5.7%).
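The per-locus arithmetic above can be reproduced in a few lines. This is a sketch rather than the authors' procedure: it uses the reported frequencies, applies the (p/2)² estimate to each homoplastic size allele, and assumes that a locus with several homoplastic size alleles (FCA008) is handled by summing their contributions, which is what the reported 9.98% implies. The variable and function names are hypothetical.

```python
# Reported frequencies of size alleles that display homoplasy (from the text).
homoplastic_freqs = {
    "FCA262": [0.63],              # electromorph size 136
    "FCA096": [0.16],              # electromorph size 171
    "FCA008": [0.04, 0.63, 0.03],  # electromorph sizes 86, 106, 124
}
N_LOCI = 10  # all sampled loci; the other seven contribute zero

def locus_homoplasy(size_allele_freqs):
    """Sum of (p/2)^2 over the homoplastic size alleles of one locus.

    Assumes each size allele hides two sequences of equal frequency,
    as stated in the text; summing across alleles is an assumption.
    """
    return sum((p / 2) ** 2 for p in size_allele_freqs)

per_locus = {locus: locus_homoplasy(ps) for locus, ps in homoplastic_freqs.items()}
average = sum(per_locus.values()) / N_LOCI

for locus, value in per_locus.items():
    print(f"{locus}: {value:.4%}")
print(f"average over {N_LOCI} loci: {average:.4%}")
```

The unrounded FCA008 value is 9.985%, which the text reports as 9.98%; the average across the 10 loci comes to about 2.05% either way.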
Alternatively, Taylor, Sanny, and Breden (1999) suggested that simple uninterrupted microsatellite homoplasy could be estimated by assuming that compound microsatellites have the same mutation rate and pattern as simple uninterrupted microsatellites. From our data (fig. 1B), dividing the number of allele sequences by the number of allele sizes (18/13 = 1.4), we estimate that for every microsatellite allele size, there are approximately 1.4 sequences, a somewhat smaller estimate than the 1.66 alleles estimated by Taylor, Sanny, and Breden (1999), but larger than what was actually observed.
In our sample, we observed five pairs of comigrating homoplastic alleles. Two pairs originated from North America, one pair originated from North America/South America, and two pairs originated from Central America/South America (fig. 1B). One pair (FCA096-171 nt) of alleles originated from the same subspecies, and four pairs (FCA262-136 nt, FCA008-106 nt, FCA008-124 nt, and FCA008-86 nt) were from widely different geographical areas. The overall incidence of allele size identity across all pumas was approximately 29%, but the estimates were even higher among geographically adjacent populations (61%). Although comigrating alleles were more common within adjoining or closely adjoining subspecies, the comigrating alleles from distant geographic regions were more likely to exhibit size homoplasy, perhaps because spatial isolation over time increases the incidence of compound interrupted repeat allele formation. The pattern of compound repeat distribution over space was clinal in at least two cases (FCA008 and FCA262), providing additional information relevant to population assessment in phylogeographic analysis (see also Angers and Bernatchez 1997; Taylor, Sanny, and Breden 1999).
Models used to explain the mutational process at microsatellites rest on the assumption that differences between alleles are due entirely to changes in the number of repeat units (Tautz 1989; Weber 1990a). A number of genetic distance measures describing evolution at microsatellite loci rely on the same assumption (Goldstein et al. 1995; Slatkin 1995). This data set examines this assumption at the inter- and intraspecies levels. Among examined compound repeats, 74% of inferred mutational events within the puma were reflected by allele length differences. In contrast, the vast majority of mutational events between puma and cat alleles were not reflected in differences in allele length. The exact magnitude of interspecies changes could not be determined with any confidence because of extensive molecular signature changes within some of the repeat structures between the two species. These observations affirm the utility and power of microsatellites in population and phylogenetic analyses within adequately sampled species (Culver et al. 2000). In contrast, the high incidence of size homoplasy between species identifies a potential bias that limits the efficacy of microsatellites in comparisons of distantly related species. As illustrated here and elsewhere, allele length comigrations of homologous loci from evolutionarily divergent species exhibit size homoplasy so frequently as to invalidate available phylogenetic models (Estoup et al. 1995; Garza, Slatkin, and Freimer 1995; Angers and Bernatchez 1997; Primmer and Ellegren 1998; Viard et al. 1998; Colson and Goldstein 1999).
We thank Victor David, Stanley Cevario, and Janice Martenson for expert technical assistance in this project. We also appreciate Gavin Huttley, Michael Smith, and J. Claiborne Stephens for beneficial discussions on these data. The content of this paper does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
Footnotes
Charles F. Aquadro, Reviewing Editor
1 Keywords: size homoplasy, puma microsatellites, microsatellite sequences
2 Address for correspondence and reprints: Stephen J. O'Brien, Laboratory of Genomic Diversity, National Cancer Institute, Building 560, Room 21-105, Frederick, Maryland 21702. E-mail: obrien@mail.ncifcrf.gov
Literature Cited
Angers, B., L. Bernatchez. 1997. Complex evolution of a Salmonid microsatellite locus and its consequences in inferring allelic divergence from size information. Mol. Biol. Evol. 14:230-238
Blanquer-Maumont, A., B. Crouau-Roy. 1995. Polymorphism, monomorphism, and sequences in conserved microsatellites in primate species. J. Mol. Evol. 41:492-497
Chakraborty, R., M. Nei. 1982. Genetic differentiation of quantitative characters between populations of species: I. Mutation and random genetic drift. Genet. Res. Camb. 39:303-314
Colson, I., D. B. Goldstein. 1999. Evidence for complex mutations at microsatellite loci in Drosophila. Genetics. 2:617-629
Culver, M., W. E. Johnson, J. Pecon-Slattery, S. J. O'Brien. 2000. Genomic ancestry of the American pumas (Puma concolor). J. Hered. 91:186-197
Estoup, A., C. Taillez, J.-M. Cornuet, M. Solignac. 1995. Size homoplasy and mutational processes of interrupted microsatellites in two bee species, Apis mellifera and Bombus terrestris (Apidae). Mol. Biol. Evol. 12:1074-1084
FitzSimmons, N. N., C. Moritz, S. S. Moore. 1995. Conservation and dynamics of microsatellite loci over 300 million years of marine turtle evolution. Mol. Biol. Evol. 12:432-440
Garza, J. C., M. Slatkin, N. B. Freimer. 1995. Microsatellite allele frequencies in humans and chimpanzees, with implications for constraints on allele size. Mol. Biol. Evol. 12:594-603
Goldstein, D. B., A. R. Linares, L. L. Cavalli-Sforza, M. W. Feldman. 1995. An evaluation of genetic distances for use with microsatellite loci. Genetics. 139:463-471
Goldstein, D. B., D. D. Pollock. 1997. Launching microsatellites: a review of mutation processes and methods of phylogenetic inference. J. Hered. 88:335-342
Grimaldi, M. C., B. Crouau-Roy. 1997. Microsatellite allelic homoplasy due to variable flanking sequences. J. Mol. Evol. 44:336-340
Jarne, P., P. J. L. Lagoda. 1996. Microsatellites, from molecules to populations and back. TREE. 11:424-429
Johnson, W. E., S. J. O'Brien. 1997. Phylogenetic reconstruction of the Felidae using 16S rRNA and NADH-5 mitochondrial genes. J. Mol. Evol. 44:S98-S116
Kimmel, M., R. Chakraborty, D. N. Stivers, R. Deka. 1996. Dynamics of repeat polymorphisms under a forward-backward mutation model: within- and between-population variability at microsatellite loci. Genetics. 143:549-555
Kimura, M., J. F. Crow. 1964. The number of alleles that can be maintained in a finite population. Genetics. 49:725-738
Kruglyak, S., R. T. Durrett, M. D. Schug, C. F. Aquadro. 1998. Equilibrium distributions of microsatellite repeat length resulting from a balance between slippage events and point mutations. Proc. Natl. Acad. Sci. USA. 95:10774-10778
Levinson, G., G. Gutman. 1987. Slipped strand mispairing: a major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4:203-221
Menotti-Raymond, M., V. A. David, L. A. Lyons, A. A. Schäffer, J. F. Tomlin, M. K. Hutton, S. J. O'Brien. 1999. A genetic linkage map of microsatellites in the domestic cat (Felis catus). Genomics. 57:9-23
Menotti-Raymond, M. A., S. J. O'Brien. 1995. Evolutionary conservation of ten microsatellite loci in four species of Felidae. J. Hered. 86:319-322
Metzgar, D., D. Field, R. Haubrich, C. Wills. 1998. Sequence analysis of a compound coding-region microsatellite in Candida albicans resolves homoplasies and provides a high-resolution tool for genotyping. FEMS Immunol. Med. Microbiol. 20:103-109
Ohta, T., M. Kimura. 1973. A model of mutation appropriate to estimate the number of electrophoretically detectable alleles in a finite population. Genet. Res. 22:201-204
Ortí, G., D. E. Pearse, J. C. Avise. 1997. Phylogenetic assessment of length variation at a microsatellite locus. Proc. Natl. Acad. Sci. USA. 94:10745-10749
Primmer, C. R., H. Ellegren. 1998. Patterns of molecular evolution in avian microsatellites. Mol. Biol. Evol. 15:997-1008
Slatkin, M. 1995. A measure of population subdivision based on microsatellite allele frequencies. Genetics. 139:457-462
Tautz, D. 1989. Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Res. 17:6463-6471
Taylor, J. S., J. S. P. Sanny, F. Breden. 1999. Microsatellite allele size homoplasy in the guppy (Poecilia reticulata). J. Mol. Evol. 48:245-247
Valdez, A. M., M. Slatkin, N. B. Freimer. 1993. Allele frequencies at microsatellite loci: the stepwise mutation model revisited. Genetics. 133:737-749
Viard, F., P. Franck, M.-P. Dubois, A. Estoup, P. Jarne. 1998. Variation of microsatellite size homoplasy across electromorphs, loci, and populations in three invertebrate species. J. Mol. Evol. 47:42-51
Weber, J. L. 1990a. Human DNA polymorphisms based on length variations in simple-sequence tandem repeats. Pp. 159-181 in K. E. Davies and S. M. Tilghman, eds. Genome analysis, Vol. 1. Genetic and physical mapping. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Weber, J. L. 1990b. Informativeness of human (dC-dA)n polymorphisms. Genomics. 7:524-530
Weber, J. L., C. Wong. 1993. Mutation of human short tandem repeats. Hum. Mol. Gen. 2:1123-1128