Parallel Patterns of Sequence Variation Within and Between Populations at Three Loci of Arabidopsis thaliana

Helmi Kuittinen*,†, David Salguero* and Montserrat Aguadé*

*Departament de Genètica, Universitat de Barcelona, Spain;
†Department of Biology, University of Oulu, Finland

Evolutionary factors influencing naturally occurring diversity in the model species Arabidopsis thaliana are not fully understood. Nucleotide variation at several nuclear loci has been studied in samples consisting of worldwide accessions, usually one individual per population. Some loci seem to be highly dimorphic (e.g., Innan et al. 1996; Aguadé 2001), whereas for other loci there is no evidence of diverged haplotype groups (Kuittinen and Aguadé 2000). In contrast, variants at AFLP polymorphic sites do not commonly occur at intermediate frequencies (Miyashita, Kawabe, and Innan 1999). It is not yet clear whether the detected dimorphism is due to past population structure, some form of balancing selection, other selective forces, or random drift in a selfing species. In inbreeders, background selection (Charlesworth, Morgan, and Charlesworth 1993), adaptive sweeps (Hedrick 1980), and local adaptation (Allard, Jain, and Workman 1968), as well as drift and founder effects (Schoen and Brown 1991), are thought to play a major role in shaping variation.

The population structure and dynamics of A. thaliana have not yet been fully characterized either. As a weedy inbreeder, A. thaliana may reproduce as a metapopulation with recurrent extinction and recolonization of sites (Bergelson et al. 1998; Innan and Stephan 2000). Studies using isozymes or microsatellite markers (Abbot and Gomez 1989; Todokoro, Terauchi, and Kawano 1995; Kuittinen, Mattila, and Savolainen 1997) and RFLPs (Bergelson et al. 1998) have shown that A. thaliana populations are sometimes polymorphic but often consist of only one multilocus genotype. Weak or no correlation between geographical and genotypic distance (e.g., Innan et al. 1996; Scharbel, Haubold, and Mitchell-Olds 2000) suggests a recent spread of A. thaliana all over the world, but this could also be due to metapopulation dynamics (Wakeley and Aliacar 2001). Both the worldwide expansion of the species and purifying selection have been invoked to explain the excess of singleton variants detected in some genes (e.g., Kawabe et al. 1997; Purugganan and Suddith 1998; Innan and Stephan 2000).

Studies of sequence polymorphism at several loci in natural populations should help to discern the roles of different evolutionary factors. In A. thaliana, as in most inbreeding species, some outcrossing occurs (Abbot and Gomez 1989). Indeed, worldwide surveys of nucleotide variation in particular gene regions have revealed historical recombination in this species (as summarized in Kuittinen and Aguadé 2000). Thus, even in this inbreeding species, selection would affect only specific genes or regions of the genome, whereas demographic events, such as bottlenecks, should affect all loci in a similar way.

This is the first study of nucleotide variation in natural populations of A. thaliana based on sequence comparison. We have studied variation in three genes (CHI, FAH1, and F3H) previously analyzed in a worldwide sample. In both FAH1 and F3H, two distinct haplotypes were present at intermediate frequencies, resulting in positive Tajima's D values (Aguadé 2001); no dimorphism was, however, detected in CHI (Kuittinen and Aguadé 2000). First, we partition variation within and between populations and compare it with variation found in worldwide samples and with patterns found in other inbreeding and outcrossing species. Second, we compare two loci that show dimorphism with a locus without dimorphism.

We studied samples from four randomly chosen natural populations (Granollers, Lulep Istjak, Wilp, and Eden), from one population that was known to be polymorphic based on microsatellite variation (Ruds Vedby), and from one population that was monomorphic for 12 isozyme loci (Tvärminne) (Kuittinen, Mattila, and Savolainen 1997). For each population, 6–10 individuals were studied (fig. 1). For CHI, FAH1, and F3H, fragments 1.9, 1.4, and 1.2 kb long were sequenced, respectively, and the sequences were analyzed as described previously (Kuittinen and Aguadé 2000; Aguadé 2001). We calculated total (π_T), within-population (π_S), and between-population (π_T − π_S) nucleotide diversities (Charlesworth 1998, formulas 1a and b). We used both silent and nonsynonymous sites for calculations. Newly determined sequences will appear in the EMBL sequence database under accession numbers AJ492461–AJ492503, AJ492832–AJ492869, and AJ493128–AJ493163.
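The partition of diversity described above can be illustrated with a minimal Python sketch. It assumes aligned sequences of equal length, computes π_T as the mean pairwise difference per site over the pooled sample, and takes π_S as the unweighted mean of the within-population values. This weighting is a simplification and may not match formulas 1a and b of Charlesworth (1998) exactly; the function names and the toy data are hypothetical.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        return 0.0
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def partition_diversity(populations):
    """Return (pi_T, pi_S, pi_T - pi_S) for a dict: population name -> sequences.

    Assumption: pi_S is the unweighted mean of within-population diversities,
    and pi_T is the diversity of the pooled sample.
    """
    pooled = [s for seqs in populations.values() for s in seqs]
    pi_t = pairwise_diversity(pooled)
    pi_s = sum(pairwise_diversity(s) for s in populations.values()) / len(populations)
    return pi_t, pi_s, pi_t - pi_s


# Toy data (not from the study): two monomorphic populations fixed for
# different haplotypes, so all diversity lies between populations.
toy = {
    "pop1": ["AAAA", "AAAA"],
    "pop2": ["AATT", "AATT"],
}
pi_t, pi_s, pi_between = partition_diversity(toy)
```

In this toy case π_S is zero and π_T − π_S equals π_T, which mirrors the situation of a locus that is monomorphic within every sampled population.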



Fig. 1. Haplotypes at the CHI, FAH1, and F3H loci. A, One-locus haplotypes and their frequencies in Granollers (GN), Lulep Istjak (LU), Wilp (WP), Eden (ED), Tvärminne (TV), and Ruds Vedby (RV). For the origins of the populations see Kuittinen and Aguadé (2000). Nucleotides are numbered as in Kuittinen and Aguadé (2000) and Aguadé (2001). A dot indicates the same nucleotide as in the first sequence. In FAH1, a double bar above polymorphic sites 574–580 indicates a complex mutational event. Hap., haplotype; e, exon; d, deletion. B, Haplotypes across the three loci; n, number of sequences.

 
The division of haplotypes into two major classes previously found in FAH1 and F3H in a worldwide sample could also be seen in this study (fig. 1). As in the worldwide sample, CHI showed no dimorphism, and it exhibited a high number of both polymorphic sites and haplotypes. CHI was polymorphic in three of the studied populations, whereas F3H and FAH1 were polymorphic in two and one populations, respectively. Overall diversity was higher in F3H (π_T = 0.0079, standard deviation [SD] = 0.00065) than in FAH1 (π_T = 0.0045, SD = 0.00070) and CHI (π_T = 0.0036, SD = 0.00015). These values are close to those previously found in worldwide samples (fig. 2).



Fig. 2. Levels of nucleotide variation. A, Nucleotide diversity in the worldwide sample (Ww) and within different populations at the CHI (black columns), FAH1 (gray columns), and F3H (white columns) regions. B, Total (π_T), within-population (π_S), and between-population (π_T − π_S) nucleotide diversity at the three loci in all populations and in the set of randomly selected populations (GN, LU, WP, and ED). Abbreviations of the population names are as in figure 1.

 
Three of the four randomly chosen populations were monomorphic at all loci studied (see fig. 1). One of the randomly selected populations (Granollers) and the population that was known to be polymorphic based on microsatellites (Ruds Vedby) were polymorphic for two and three genes, respectively. Furthermore, the population that was known to be monomorphic for isozymes (Tvärminne) (Kuittinen, Mattila, and Savolainen 1997) harbored two CHI haplotypes (fig. 1), but it was monomorphic at the other two genes. At the polymorphic loci, the number of different haplotypes in Granollers and Ruds Vedby varied between three and four. In some cases, the two major haplotypes of FAH1 and F3H were present in a single population (fig. 1). All three loci were scored for 28 individuals. Some one-locus haplotypes but none of the three-locus haplotypes were shared between populations (fig. 1).

In the four randomly chosen populations, the average within-population diversity (π_S) was 0.00078, 0, and 0.00152 at CHI, FAH1, and F3H, respectively. The diversity between populations (π_T − π_S) accordingly accounted for 79%, 100%, and 81% of the total diversity (fig. 2). Variation between populations was thus the largest component of total diversity in the randomly chosen populations. However, diversity within polymorphic populations (π_S) for CHI, FAH1, and F3H was on average of the same order (0.00234, 0.00579, and 0.00584, respectively) as in the worldwide collection. Similar results were obtained when only variation at silent sites (noncoding sites and synonymous sites in coding regions) was considered (results not shown).
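The relation between the percentages and the diversity values quoted above can be made explicit with a short Python sketch. It assumes the between-population share is (π_T − π_S)/π_T; the function names are hypothetical, and the totals it returns are back-calculated from rounded percentages, so they are rough implied values and not figures reported in the text.

```python
def between_fraction(pi_t, pi_s):
    """Share of total diversity found between populations: (pi_T - pi_S) / pi_T."""
    if pi_t <= 0:
        raise ValueError("pi_t must be positive")
    return (pi_t - pi_s) / pi_t


def implied_total(pi_s, fraction_between):
    """Total diversity implied by pi_S and a between-population share below 1."""
    if not 0 <= fraction_between < 1:
        raise ValueError("fraction_between must be in [0, 1)")
    return pi_s / (1.0 - fraction_between)


# FAH1 in the randomly chosen populations: pi_S = 0, so the between-population
# share is 100% for any positive total (0.001 here is an arbitrary placeholder).
fah1_share = between_fraction(0.001, 0.0)

# Back-calculated totals for CHI and F3H in the randomly chosen populations
# (approximate, because the published percentages are rounded).
chi_total = implied_total(0.00078, 0.79)
f3h_total = implied_total(0.00152, 0.81)
```

The FAH1 case shows why a locus that is monomorphic within every sampled population necessarily places all of its diversity between populations.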

Most genetic diversity was detected between populations, as previously found in A. thaliana (Kuittinen, Mattila, and Savolainen 1997; Bergelson et al. 1998). There were also large differences in diversity between populations, some populations being monomorphic and others polymorphic. These results are in concordance with a large survey of isozyme variation, in which a larger fraction of the diversity was between populations and the level of diversity varied more between populations in inbreeders than in outbreeders (Schoen and Brown 1991). A similar pattern was found when nucleotide variation at the ADH locus was studied in a number of outcrossing and inbreeding Leavenworthia species (Liu, Zhang, and Charlesworth 1998). Thus, the overall pattern observed in A. thaliana would be typical of inbreeding species.

Several factors can contribute to the patterns of variability in inbreeding species. Inbreeding reduces effective population size (Pollak 1987), and background selection (Charlesworth, Morgan, and Charlesworth 1993) and bottlenecks in extinction-recolonization cycles (Schoen and Brown 1991; Pannell and Charlesworth 1999) are expected to be more prominent in inbreeders than in outbreeders. These factors should result in low diversity across the entire genome. Selective sweeps may contribute to the reduction in diversity within populations (Hedrick 1980), and local selection can result in both low diversity within populations and high divergence between populations (Charlesworth, Nordborg, and Charlesworth 1997). Selective sweeps and local selection are expected to affect only a subset of genes. Indeed, selection at a very small scale may occur in A. thaliana (Stratton and Bennington 1996).

The dimorphic pattern of variation observed at FAH1 and F3H was suggestive of balancing selection (Aguadé 2001). If balancing selection maintains polymorphism within demes, there should be a peak in the between-allele component of diversity close to the selected site, and most of the variation should exist within demes (Charlesworth, Nordborg, and Charlesworth 1997). Thus, if the divergent haplotypes at FAH1 and F3H were maintained within populations by some form of balancing selection, one would expect both haplotypes at intermediate frequencies within populations and thus higher levels of polymorphism within populations in these genes than at CHI. Consequently, the level of divergence among populations (π_T − π_S) should be higher for CHI than for FAH1 and F3H. However, nucleotide diversity was divided between and within populations similarly in all three genes. Thus, it would seem that the bottlenecks (and founder effects) associated with the origin of local populations of A. thaliana, and not selection, have shaped variation within populations in these genes. Indeed, considering that local populations of this species are young, the data presented here would not support the view that balancing selection is maintaining dimorphism in FAH1 and F3H within demes; however, they would not preclude other forms of selection acting on those genes at a larger geographical scale. The high between-population diversity and low within-population diversity may thus be due to the metapopulation dynamics of A. thaliana (see Pannell and Charlesworth 2000). The species-wide diversity of A. thaliana is high despite the decrease in diversity predicted by typical metapopulation models (Pannell and Charlesworth 1999) and may reflect the high actual number of A. thaliana individuals or populations. The parameters needed to describe the metapopulation dynamics in A. thaliana, such as extinction and recolonization rates of colonies and migration rates and patterns, are not known and would need further, more detailed studies.

In CHI, in which there is no evidence for dimorphism, the average number of differences between nonidentical haplotypes in the set of three polymorphic populations (7.4) is of the same order as that between nonidentical haplotypes within any single population (6 in Tvärminne, 9.3 in Granollers, and 4.8 in Ruds Vedby; average 6.7). Thus, in polymorphic populations, haplotypes present in any given population are not more closely related among themselves than to those present in other populations, if multiple copies of the same haplotype are excluded. Genealogical trees, based on both the current data and the data of Eurasian ecotypes (Kuittinen and Aguadé 2000; Aguadé 2001), reconstructed for each locus separately or for the three loci jointly, indicated no geographic structure in the data (results not shown). There was no relationship between geographic and genetic distance. This observation is consistent with data from several nuclear loci (e.g., Innan et al. 1996; Bergelson et al. 1998). For the three genes studied here, polymorphic populations were highly variable, which was due to the high haplotype diversity and nonrelatedness of the haplotypes within populations. This result confirms the importance of using single-seed accessions, as opposed to bulk-seed accessions, in studies that require a homogeneous genetic background, e.g., when correlations between naturally occurring phenotypic variation and molecular variation at candidate genes are analyzed. On the other hand, it suggests that association studies can also be conducted within populations.
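The comparison of haplotype differences described above can be sketched in Python. The sketch assumes aligned haplotypes of equal length and collapses identical copies before counting pairwise differences, which is one reading of comparing only nonidentical haplotypes. The function name and the toy haplotypes are hypothetical; only the three within-population values are taken from the text.

```python
from itertools import combinations


def mean_differences(haplotypes):
    """Average number of differing sites between distinct (nonidentical) haplotypes."""
    distinct = sorted(set(haplotypes))  # drop repeated copies of a haplotype
    pairs = list(combinations(distinct, 2))
    if not pairs:
        return 0.0
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


# Toy haplotypes (not from the study); the repeated "ACGT" is counted once.
toy_mean = mean_differences(["ACGT", "ACGT", "AGGA", "TCGT"])

# Within-population values quoted in the text for CHI.
within = {"Tvärminne": 6, "Granollers": 9.3, "Ruds Vedby": 4.8}
mean_within = sum(within.values()) / len(within)
```

Averaging the three quoted within-population values reproduces the 6.7 given in the text, which is then compared with the 7.4 obtained across the three polymorphic populations.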

Acknowledgements

We thank Outi Savolainen, Peter van Tienderen, and Jon Ågren for sharing A. thaliana seeds and Serveis Cientifico-Técnics from Universitat de Barcelona for sequencing facilities. This study was supported by postdoctoral fellowships from the Environmental and Natural Resources Research Council of Finland (grant 41815) and the European Science Foundation to H.K., and by grants PB97-0918 from Dirección General de Investigación Científica y Técnica, Spain, and 1999SGR-25 from Comissió Interdepartamental de Recerca i Tecnologia, Generalitat de Catalunya to M.A.

Footnotes

Wolfgang Stephan, Reviewing Editor

Keywords: CHI, FAH1, F3H, nucleotide diversity, balancing selection, natural variation

Address for correspondence and reprints: Helmi Kuittinen, Department of Biology, University of Oulu, P.O. Box 3000, Oulu 90014, Finland. helmi.kuittinen@oulu.fi

References

    Abbot R. J., M. F. Gomez, 1989 Population genetic structure and outcrossing rate of Arabidopsis thaliana (L.) Heynh. Heredity 62:411-418

    Aguadé M., 2001 Nucleotide sequence variation at two genes of the phenylpropanoid pathway, the FAH1 and F3H genes, in Arabidopsis thaliana. Mol. Biol. Evol. 18:1-9

    Allard R. W., S. K. Jain, P. L. Workman, 1968 The genetics of inbreeding populations. Adv. Genet. 14:55-131

    Bergelson J., E. Stahl, S. Dudek, M. Kreitman, 1998 Genetic variation within and among populations of Arabidopsis thaliana. Genetics 148:1311-1323

    Charlesworth B., 1998 Measures of divergence between populations and the effects of forces that reduce variability. Mol. Biol. Evol. 15:538-543

    Charlesworth B., M. T. Morgan, D. Charlesworth, 1993 The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303

    Charlesworth B., M. Nordborg, D. Charlesworth, 1997 The effects of local selection, balanced polymorphism and background selection on equilibrium patterns of genetic diversity in subdivided populations. Genet. Res. Camb. 70:155-174

    Hedrick P., 1980 Hitch-hiking: a comparison of linkage and partial selfing. Genetics 94:791-808

    Innan H., W. Stephan, 2000 The coalescent in an exponentially growing metapopulation and its application to Arabidopsis thaliana. Genetics 155:2015-2019

    Innan H., F. Tajima, R. Terauchi, N. T. Miyashita, 1996 Intragenic recombination in the Adh locus of the wild plant Arabidopsis thaliana. Genetics 143:1761-1770

    Kawabe A., H. Innan, R. Terauchi, N. T. Miyashita, 1997 Nucleotide polymorphism in the acidic chitinase locus (ChiA) region of the wild plant Arabidopsis thaliana. Mol. Biol. Evol. 14:1303-1315

    Kuittinen H., M. Aguadé, 2000 Nucleotide variation at the chalcone isomerase locus in Arabidopsis thaliana. Genetics 155:863-872

    Kuittinen H., A. Mattila, O. Savolainen, 1997 Genetic variation at marker loci and in quantitative traits in natural populations of Arabidopsis thaliana. Heredity 79:144-152

    Liu F., L. Zhang, D. Charlesworth, 1998 Genetic diversity in Leavenworthia populations with different inbreeding levels. Proc. R. Soc. Lond. B 265:293-301

    Miyashita N. T., A. Kawabe, H. Innan, 1999 DNA variation in the wild plant Arabidopsis thaliana revealed by amplified fragment length polymorphism analysis. Genetics 152:1723-1731

    Pannell J. R., B. Charlesworth, 1999 Neutral genetic diversity in a metapopulation with recurrent local extinction and recolonization. Evolution 53:664-676

    Pannell J. R., B. Charlesworth, 2000 Effects of metapopulation processes on measures of genetic diversity. Phil. Trans. R. Soc. Lond. B 355:1851-1864

    Pollak E., 1987 On the theory of partially inbreeding finite populations. Genetics 117:353-360

    Purugganan M. D., J. I. Suddith, 1998 Molecular population genetics of the Arabidopsis cauliflower regulatory gene: non-neutral evolution and naturally occurring variation in floral homeotic function. Proc. Natl. Acad. Sci. USA 95:8130-8134

    Scharbel T. F., B. Haubold, T. Mitchell-Olds, 2000 Genetic isolation by distance in Arabidopsis thaliana: biogeography and postglacial colonization of Europe. Mol. Ecol. 9:2109-2118

    Schoen D. J., A. H. D. Brown, 1991 Intraspecific variation in population gene diversity and effective population size correlates with the mating system in plants. Proc. Natl. Acad. Sci. USA 88:4494-4497

    Stratton D. A., C. C. Bennington, 1996 Measuring spatial variation in natural selection using randomly-sown seeds of Arabidopsis thaliana. J. Evol. Biol. 9:215-228

    Todokoro S., R. Terauchi, S. Kawano, 1995 Microsatellite polymorphisms in natural populations of Arabidopsis thaliana in Japan. Jpn. J. Genet. 70:543-554

    Wakeley J., N. Aliacar, 2001 Gene genealogies in a metapopulation. Genetics 159:893-905

Accepted for publication July 20, 2002.