Trichome Distribution in Arabidopsis thaliana and its Close Relative Arabidopsis lyrata: Molecular Analysis of the Candidate Gene GLABROUS1

Marie-Theres Hauser, Bettina Harr and Christian Schlötterer

Zentrum für angewandte Genetik, Universität für Bodenkultur, Wien, Vienna, Austria;
Institut für Tierzucht und Genetik, Veterinärmedizinische Universität Wien, Vienna, Austria


    Abstract
 TOP
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 Conclusions
 Acknowledgements
 References
 
GLABROUS1 (GL1) belongs to the large family of MYB transcription factors and is known to play a central role in trichome initiation. We studied trichome distribution and the molecular variation of GL1 in 28 A. thaliana accessions. Trichome density on rosette leaves was highly variable among those accessions. On the molecular level, we detected substantial sequence variation in a 3-kb fragment which included the complete coding region of the GL1 locus (π = 0.01). Phylogenetic analysis of GL1 indicates the presence of two diverged clades among the 28 accessions. Using ANOVA, we show that the phenotypic variation in trichome density cannot be explained by the sequence divergence between the two phylogenetic lineages. Sequence analysis of wild-type Arabidopsis thaliana and Arabidopsis lyrata accessions indicates that all amino acid substitutions are located outside of the conserved helix-turn-helix DNA-binding domains R2 and R3. Using plants of A. thaliana and A. lyrata with either naturally occurring or ethyl methane sulfonate–induced glabrous phenotypes, we demonstrate that the last 14 C-terminal amino acids of the GL1 gene have no major impact on the initiation of trichomes.


    Introduction
Many phenotypic traits of interest to evolutionary biologists, breeders, and human geneticists are quantitative in nature. The observed variation for these traits can be partitioned into an environmental and a genetic component (Lynch and Walsh 1998). While the identification of the genes underlying quantitative traits is of great interest, analyses are complicated by the fact that the number of contributing genes is often not known. Furthermore, alleles at these loci may have different effects, depending on the effects of variants segregating at other loci (i.e., epistasis).

Recently, it has been suggested that for the identification of genes that contribute to variation in complex traits, association-based methods could be more powerful than linkage studies (Risch and Merikangas 1996). The principle underlying association tests is that the joint distribution of genotypes and phenotypes in population samples is studied (Long and Langley 1999). At its most extreme, an association study is based on every polymorphic site in the genome, including the polymorphism on which the quantitative trait is based. Until further technological advances are made, a realistic association study relies on the detection of polymorphic DNA markers that are in linkage disequilibrium with the trait-causing site. Alternatively, association studies could take advantage of the recent progress made in genomics and developmental genetics and use the available set of candidate genes (Long et al. 1998; Long and Langley 1999).

Arabidopsis trichomes are single epidermal cells on leaves, stems, petioles, and sepals. Among Arabidopsis thaliana accessions, the density of trichomes has been shown to vary in a quantitative manner. Genetic analyses have identified several candidate loci which have been demonstrated to affect trichome number in A. thaliana. Two genes, GLABROUS1 (GL1) and TRANSPARENT TESTA GLABRA (TTG), are essential for trichome initiation, and null mutants produce completely glabrous plants (Koornneef, Dellaert, and van der Veen 1982; Oppenheimer et al. 1991). Mutations in GL3 cause a reduced number of trichomes and also affect the branching pattern of the trichomes (Koornneef, Dellaert, and van der Veen 1982; Payne, Zhang, and Lloyd 2000). CAPRICE (CPC) and TRIPTYCHON (TRY) have also been shown to influence trichome number (Hülskamp, Miséra, and Jürgens 1994; Wada et al. 1997). Furthermore, in a QTL study, one locus, called REDUCED TRICHOME NUMBER (RTN), could be associated with a smaller number of trichomes in the accession Landsberg erecta when compared with Columbia plants (Larkin et al. 1996).

Most of these candidate genes have already been cloned. This provides the opportunity to test naturally occurring molecular variation at the candidate genes for an association with the trichome phenotype (e.g., trichome density). Thus, the analysis of natural populations may complement classical mutational analyses to understand gene function and provide insight into the molecular changes required for adaptation.

Here, we studied the GL1 gene, which belongs to the group of R2R3-MYB transcription factors. This class of proteins is characterized by two repeat motifs, which are most similar to the second and third repeats of the three animal MYB DNA-binding domains. We analyzed the molecular variation of a 3-kb genomic sequence containing the GL1 locus. Phenotypic analysis of trichome number and distribution in the same set of A. thaliana accessions was used to search for an association with the molecular variation at the GL1 locus.


    Materials and Methods
Plant Material
Arabidopsis thaliana seeds were obtained from Ken Feldmann (Ws), Philip Benfey (Col), Gerd Jürgens (Nd-0), and the Nottingham Stock Centre (all others; table 1). M. Hülskamp provided two unpublished alleles, gl1-65 and gl1-323, which were isolated from ethyl methane sulfonate (EMS)-mutagenized Landsberg erecta (Ler) seeds. Arabidopsis lyrata seeds (Karhumäki, Russia, and Plech, Germany) were kindly provided by Outi Savolainen and Marcus Koch. The A. lyrata plant from Russia was phenotypically glabrous. Seeds were germinated after at least 2 days of stratification at 4°C on sterile nutrient agar plates (Benfey et al. 1993). Fourteen days after germination, the seedlings were transferred to soil and cultivated in the same growth chamber under 16 h light at 22°C.


 
Table 1 Phenotypic Analysis of Trichome Distribution in Arabidopsis thaliana Accessions

 
Phenotypic Analysis
Trichome numbers were counted on late rosette leaves, cauline leaves, and stems. Trichome numbers on the adaxial surface of the rosette leaves were normalized for the leaf area to account for differences in leaf sizes between accessions; trichome densities are given per mm². Trichome numbers were determined for one complete leaf for each plant. For each accession, one to seven plants were analyzed.

DNA Sequencing
Genomic DNA was isolated from leaves by a modified CTAB method as described in Hauser et al. (1998). The GL1 sequence was determined by direct sequencing of PCR products. GL1 was amplified in two pieces using the PCR primers Gl1AF (5'-TTA TAG CCA TGA TTA CAC AAA G-3') and Gl1R (5'-CCA TGA TCC GAA GAG ACT AT-3'), as well as the primers Gl1F (5'-ATA TTG AGT ACT GCC TTT AG-3') and Gl1HR (5'-ATG TAT GTT TAC ATT TCG AGT GC-3'). A BLAST search was performed with all PCR primers to verify that only the GL1 locus was amplified. All PCR primers had homology to published GL1 sequences only. PCR conditions were as follows: a 3-min denaturing step at 94°C was followed by 35 cycles, each consisting of 55°C for 15 s, 72°C for 1.5 min, and 94°C for 15 s. The two sequences overlapped by 519 bp. The orthologous GL1 sequence from A. lyrata was amplified using the PCR primers Gl1AF and Gl1R and the primers Gl1BF (5'-CCA CAA GCT CCT CGG CAA TAG-3') and Gl1IR (5'-CTA CGC GGA AGA TAT CAA CAC AAC-3'). PCR products were purified using the QIAquick PCR purification kit (Qiagen, Hilden, Germany). Sequencing primers were located approximately 550 bp apart. Each accession was sequenced in both directions on an ABI automated sequencer using the BigDye terminator sequencing chemistry. The DNA sequences are available from GenBank (accession numbers AF263690–AF263721).

Data Analysis
Sequence contigs were generated with the AutoAssembler software (Perkin Elmer), and all mutations were individually verified using the original electropherograms. Sequences were aligned using CLUSTAL W (Thompson, Higgins, and Gibson 1994) and manually adjusted. Because of the large indels between A. thaliana and A. lyrata sequences, homologous regions were identified with a dot matrix. A table of polymorphic sites was generated using the SITES software (Hey and Wakeley 1997). Tests of neutrality, recombination, and linkage disequilibrium were performed using DnaSP 3.14 (Rozas and Rozas 1999). Indel polymorphisms were excluded for tests of neutrality, recombination analysis, and estimates of nucleotide diversity (π). Complex mutations consisting of an insertion and deletion were treated as single events. Similarly, microsatellite polymorphisms were treated as a single indel at each microsatellite stretch. For all tests requiring an outgroup, the hairy A. lyrata individual was used.
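As an illustration of the nucleotide diversity estimate described above, the following minimal sketch computes π as the average number of pairwise differences per site, skipping every alignment column that contains a gap so that indels are excluded. It is not the DnaSP implementation; the function name `nucleotide_diversity` and the toy alignment are hypothetical and only serve to show the calculation.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site; gap-containing columns are skipped."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    # Keep only columns in which no sequence carries a gap (indels excluded).
    cols = [i for i in range(length) if all(s[i] != "-" for s in seqs)]
    if not cols or len(seqs) < 2:
        return 0.0
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    total_diffs = sum(
        sum(1 for i in cols if a[i] != b[i])
        for a, b in combinations(seqs, 2)
    )
    return total_diffs / (n_pairs * len(cols))


# Hypothetical toy alignment (not GL1 data): four sequences, one gapped column.
toy = ["ACGTAC-T", "ACGTTCGT", "ACCTTCGT", "ACGTACGT"]
pi_toy = nucleotide_diversity(toy)
```

In this toy case seven gap-free columns and six sequence pairs are compared, so the result is the total number of pairwise differences divided by 42.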

Phylogenetic reconstruction was carried out with the PUZZLE 4.0 software (Strimmer and von Haeseler 1996) using the Tamura-Nei model of sequence evolution (Tamura and Nei 1993) and eight categories of rate heterogeneity. Treeview (Page 1996) was employed for the graphical representation of the tree. DNA Slider software was used (10,000 replicates) to test for heterogeneity in the distribution of base substitutions (McDonald 1996, 1998). In particular, we used the three most powerful test statistics: the runs test (runs), the mean sliding G statistic (average G), and the Kolmogorov-Smirnov statistic (K-S) (McDonald 1998).

ANOVA was carried out with the SPSS software package. Trichome densities were ln-transformed to achieve normality. To evaluate the statistical significance of differences in trichome densities among A. thaliana accessions, we used trichome density measurements of one to seven plants per accession for a one-way ANOVA. The measurements of trichome densities were averaged for each accession to determine the influence of molecular variation on trichome density. Since A. thaliana has a low rate of effective recombination, sites in the 3-kb fragment of the GL1 locus are not independent of each other. To account for the correlation of molecular variation, we used a cladistic grouping of the A. thaliana accessions (see table 2 and fig. 3). Three different grouping levels were considered in addition to the ANOVA based on individual accessions. It should be noted that our cladistic analysis differs from the one proposed by Templeton, Boerwinkle, and Sing (1987) in the way we defined cladistic groups.
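The one-way ANOVA on ln-transformed densities can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study; the helper names `ln_transform` and `one_way_anova_f` and all numbers are hypothetical. The sketch returns only the F statistic and its degrees of freedom; obtaining a P value would additionally require the F distribution (for example via a statistics library).

```python
import math


def ln_transform(groups):
    """Natural-log transform every measurement (all values must be > 0)."""
    return [[math.log(x) for x in g] for g in groups]


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual measurements around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical densities (trichomes per mm^2) for three made-up accessions.
densities = [[0.9, 1.1, 1.0], [1.8, 2.1, 2.4], [0.3, 0.4, 0.35]]
f_stat, df_b, df_w = one_way_anova_f(ln_transform(densities))
```

Groups may differ in size, which matches a design with one to seven plants per accession; the calculation fails with a division by zero if there is no within-group variation at all.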


 
Table 2 ANOVA on Trichome Density Based on Various Cladistic Groupings

 


 
Fig. 3.—Maximum-likelihood tree of all 26 Arabidopsis thaliana ecotypes using the full GL1 sequence. Significance for the observed branchings is given in percentage of supporting quartets (Strimmer and von Haeseler 1996). Glabrous individuals are marked with an asterisk. The vertical bars indicate the cladistic grouping used for the ANOVA analysis to test for an association between genotype and trichome density. Light gray corresponds to level 1, dark gray to level 2, and black to level 3 (see table 2)

 

    Results
Naturally Occurring Nucleotide Variation at the GL1 Locus
Approximately 3 kb of the GL1 locus were sequenced in 28 A. thaliana accessions. The sequences included all three exons, together with approximately 0.2 kb of the flanking 5' region and 1.4 kb of the 3' region. Our molecular analysis revealed high sequence variability, particularly in the 3' half of the GL1 locus. Twenty-six accessions were selected without a priori knowledge of the trichome phenotype. Two additional accessions (Br-0, Mir-0) known to be glabrous were also included in the analysis. To obtain an unbiased estimate of naturally occurring variation at the GL1 locus, we limited the description of molecular variation to the 26 wild-type accessions. A total of 28 indels and 107 polymorphic sites were detected, of which 81 sites were phylogenetically informative and 26 sites were singletons (see table 3). Indels at positions 101–106, 118–125, 324–325, and 2352–2353 are caused by length variation of dinucleotide microsatellites. The only indel in the coding region was detected in the third exon and resulted in the loss of a single serine (Ms-0, Kas-1) at amino acid 148. The estimate of the specieswide nucleotide polymorphism (π) was 0.0109, which is higher than the estimate for most A. thaliana genes (Purugganan and Suddith 1999). Of 229 codons, 12 were polymorphic, with 6 replacement and 6 synonymous sites. All polymorphisms detected in the coding region were confined to the third exon (positions 1422–1850). The accession Can had two replacements, G130V and D132E. Three replacements, R149C, S222G, and I224V, were found in the accessions Di-0, Ler, Col, Te-0, Bla-1, Ba-1, Sha, and Cond. Furthermore, the accessions Di-0 and Ler had an additional replacement (P176R).


 
Table 3 Nucleotide Polymorphism in 26 Arabidopsis thaliana Accessions

 
Distribution of Polymorphic Sites at the GL1 Locus
To investigate the distribution of the polymorphic sites at the GL1 locus, we performed a sliding-window analysis. Figure 1 shows that almost the entire polymorphism is observed in the 3' half of the sequence. Such a distribution of polymorphisms could be caused either by altered selective constraints in the different parts of the sequence or by balancing selection. To discriminate between these two hypotheses, we performed a cross-species comparison using the GL1 ortholog of A. lyrata. A sliding-window analysis of the genetic distances between A. thaliana and A. lyrata showed no evidence for an elevated divergence in the 3' region (fig. 2). On the contrary, the highest peak of sequence divergence was observed in the second intron around position 1300 in the alignment (fig. 2). To test for statistical significance of the discrepancy between the within- and between-species comparisons, we used several test statistics implemented in the DNA Slider software. Under a range (0–50) of recombination parameters (R = 4Nr; McDonald 1996), we detected with all three tests a significant heterogeneity in the distribution of polymorphic sites at the GL1 locus (table 4), confirming the results of the sliding-window analysis. Hence, the distribution of polymorphic sites at the GL1 locus in A. thaliana deviates significantly from the neutral expectations for a single panmictic population.
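A sliding-window scan of this kind can be sketched as below, using the window size (200 bp) and step (10 bp) given in the figure legends as defaults. The code is an illustrative assumption about how such a scan may be computed, not the software used in the study; the function names and the toy alignment are hypothetical. Columns containing gaps are ignored within each window.

```python
from itertools import combinations


def per_site_diversity(seqs):
    """Mean pairwise difference at each column; None for columns with a gap."""
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    values = []
    for column in zip(*seqs):
        if "-" in column:
            values.append(None)
        else:
            diffs = sum(1 for a, b in combinations(column, 2) if a != b)
            values.append(diffs / n_pairs)
    return values


def sliding_window_pi(seqs, window=200, step=10):
    """Return a list of (window midpoint, pi) pairs along the alignment."""
    site_pi = per_site_diversity(seqs)
    results = []
    for start in range(0, len(site_pi) - window + 1, step):
        usable = [v for v in site_pi[start:start + window] if v is not None]
        pi = sum(usable) / len(usable) if usable else float("nan")
        results.append((start + window // 2, pi))
    return results


# Hypothetical toy alignment with all variation at the 3' end.
toy = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT"]
scan = sliding_window_pi(toy, window=4, step=2)
```

With the toy data, only the last window contains variable columns, which mimics in miniature the pattern of polymorphism concentrated in one half of a sequence.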



 
Fig. 1.—Sliding-window plot of average pairwise nucleotide differences (π). The black line indicates the analysis with all 26 Arabidopsis thaliana ecotypes, and the two gray lines show the result for a separate analysis of the two clades (see fig. 2). Window size was 200 bp, and windows were moved in 10-bp steps. The organization of the GL1 locus is given below the plot. Boxes indicate the three exons, and lines depict noncoding regions

 


 
Fig. 2.—Sliding-window analysis of divergence between Arabidopsis lyrata and Arabidopsis thaliana (black line). For comparison, the gray line indicates the nucleotide diversity among all 26 A. thaliana accessions. Window size was 200 bp, and windows were moved in 10-bp steps. The organization of the GL1 locus is given below the plot. Boxes indicate the three exons, and lines depict noncoding regions

 

 
Table 4 P Values for Different Tests of Heterogeneity in the Ratio of Polymorphism to Divergence

 
Phylogenetic analysis grouped the GL1 sequences into two different clades (fig. 3). One clade (clade B) with eight accessions is separated with high statistical support from the remaining 20 sequences (clade A). Within both clades, the genetic distances are short compared with the between-clades distances (fig. 3). The two A. thaliana glabrous accessions were assigned to clade A. Similar to previous studies (Hanfstingl et al. 1994; Innan et al. 1996; Kawabe et al. 1997; Kawabe and Miyashita 1999; Aguade 2001), we did not detect an association between geographic origin of the accessions and their grouping to the two clades at the GL1 locus. This apparently random distribution of sequence variants indicates either a high migration rate or that these two clades were already present before the habitat expansion of A. thaliana.

To test whether the presence of two diverged clades could explain the observed deviation from neutrality, we repeated the sliding-window analysis for the two GL1 clades independently. The pairwise nucleotide differences among the members of a clade were found to be very similar over the entire sequence (fig. 1). Similarly, the test statistics implemented in DNA Slider were not significant for either of the two clades (table 4). These results would be consistent with the hypothesis of balancing selection acting on the 3' region of the GL1 locus. To further test to what extent the observed sequence variability at the GL1 locus is influenced by natural selection, we used several neutrality tests. While Tajima's D for all 26 A. thaliana GL1 sequences was positive, as expected under balancing selection (Tajima 1989), it was very small and not significant (D = 0.39, P > 0.1). Similarly, Fu and Li's (1993) test statistics were also not significant (D = 0.41, P > 0.1; F = 0.43, P > 0.1). A McDonald-Kreitman test (McDonald and Kreitman 1991) based on the A. lyrata accession with trichomes and 26 wild-type A. thaliana accessions did not reveal a significant deviation from neutral expectations (G = 0.03, P = 0.87). In summary, the evidence for a deviation from neutral evolution of the GL1 locus is not very strong and is based only on an observed heterogeneity in the ratio of fixed to polymorphic sites.
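For readers unfamiliar with Tajima's D, the statistic compares the mean number of pairwise differences with the number of segregating sites scaled by a harmonic sum; an excess of intermediate-frequency variants gives positive values. The sketch below follows the standard formula of Tajima (1989) and is not the DnaSP code; the function name and the toy alignments are hypothetical, and gap-containing columns are skipped as an assumption matching the exclusion of indels in this study.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for aligned sequences; returns None without segregating sites."""
    n = len(seqs)
    if n < 4:
        raise ValueError("Tajima's D requires at least four sequences")
    # Ignore gap-containing columns, then keep only segregating sites.
    columns = [col for col in zip(*seqs) if "-" not in col]
    segregating = [col for col in columns if len(set(col)) > 1]
    s = len(segregating)
    if s == 0:
        return None
    n_pairs = n * (n - 1) / 2
    # k: mean number of pairwise differences over all sequence pairs.
    k = sum(
        sum(1 for a, b in combinations(col, 2) if a != b) for col in segregating
    ) / n_pairs
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy data: two deeply split allele classes vs. rare variants only.
balanced = ["AA", "AA", "TT", "TT"]
rare_only = ["AA", "AA", "AA", "TT"]
d_balanced = tajimas_d(balanced)
d_rare = tajimas_d(rare_only)
```

The two toy alignments illustrate the sign convention: variants at intermediate frequency push D above zero, whereas variants confined to a single sequence push it below zero.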

Linkage Disequilibrium at the GL1 Locus
Significant linkage disequilibrium was detected at the GL1 locus, with most significant pairwise comparisons in the 3' region (positions 1601–3998; fig. 4). Out of 86 base substitutions and 21 indels in the 3' region, 44 substitutions and 17 indels were fixed between the assigned clades. These 61 polymorphisms were distributed over the entire 3' region. Any recombination event between the clades would result in shared mutations between the two clades. Because the large number of fixed differences between the two clades are distributed over the entire 3' region, very little or no recombination has occurred between them for a long time. The distribution of polymorphism in the 5' region shows the opposite trend. While in the 3' region about 51% of the base substitutions were fixed between the clades, none of the 21 polymorphic sites in the 5' region are fixed between the two clades. Only a single indel polymorphism (positions 327–330) was consistent with the assignment to the two clades for 23 of 26 accessions, resulting in a significant pairwise linkage disequilibrium with the 44 substitutions fixed between clades A and B (D = 0.723, P < 0.001; fig. 4).
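Pairwise linkage disequilibrium between two biallelic sites can be illustrated with the short sketch below. It treats each accession as a single haplotype, which is an assumption made here for simplicity, and reports the raw coefficient D together with the commonly used normalizations D' and r². The function name and the toy data are hypothetical; the sketch does not reproduce the significance testing performed in DnaSP.

```python
def pairwise_ld(site_a, site_b):
    """Return (D, D_prime, r2) for two biallelic sites scored per haplotype."""
    if len(site_a) != len(site_b):
        raise ValueError("both sites must be scored in the same haplotypes")
    n = len(site_a)
    ref_a, ref_b = site_a[0], site_b[0]  # arbitrary reference alleles
    p_a = sum(x == ref_a for x in site_a) / n
    p_b = sum(y == ref_b for y in site_b) / n
    p_ab = sum(x == ref_a and y == ref_b for x, y in zip(site_a, site_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b
    # D' scales D by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max
    r2 = d * d / denom
    return d, d_prime, r2


# Hypothetical toy sites: complete association vs. no association.
linked = pairwise_ld("AAAATTTT", "CCCCGGGG")
unlinked = pairwise_ld("AATT", "CGCG")
```

Sites that are fixed between two clades behave like the first toy pair, with every haplotype carrying one of only two allele combinations.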



 
Fig. 4.—Pairwise linkage disequilibrium. Only significant pairwise comparisons are shown. Gray boxes indicate P values smaller than 0.001, and black boxes indicate significance after Bonferroni correction for multiple testing. With the exception of the indel polymorphism at positions 327–330, no indels were included in the analysis

 
Trichome Density and Sequence Variation at the GL1 Locus
We used four different approaches to correlate molecular and phenotypic variation. First, we tested whether the remarkable difference between the GL1 sequences was reflected in the trichome phenotype. Second, we sequenced the GL1 locus in additional accessions, which showed a strong reduction in trichome numbers. Third, we also analyzed the phenotype and genotype of two EMS-induced GL1 mutations (kindly donated by M. Hülskamp). Finally, we studied the GL1 locus in a closely related species, A. lyrata, comparing a glabrous and a wild-type plant.

Analysis of 26 Worldwide Wild-Type Accessions
We counted the numbers of trichomes on different parts of the plant in most of the accessions sequenced. Substantial variation was detected in the distribution and density of trichomes between these accessions (table 1). An ANOVA indicated that much of the observed phenotypic variation could be attributed to differences among accessions (P < 0.001). The highest trichome density was observed in the accession Can, with an average of 3.84 trichomes/mm², which is more than twice as high as the trichome density observed in Cond, the accession with the second-highest trichome density. The lowest trichome density was observed in Gr-1, with 0.3 trichomes/mm².

While it would be highly desirable to test all polymorphic sites at the GL1 locus for an association between phenotype and molecular variation, the low effective recombination rate of A. thaliana significantly complicates the interpretation of such an analysis, as the polymorphic sites are highly correlated. Thus, we used a cladistic approach to group the A. thaliana accessions in our study according to their shared history. Three different cladistic levels were analyzed by ANOVA to test for a correlation with trichome density, but none of the analyses were statistically significant (table 2). Interestingly, the extreme sequence divergence in the 3' region of the GL1 locus, dividing the GL1 sequences into two clades, could not be associated with trichome density. In addition, no pattern emerged for the other plant organs (table 1).

Analysis of Glabrous Plants
We used four natural accessions which were described as glabrous in the Nottingham stock center seed catalog. In two of these accessions (Wil-2 and Est), PCR amplification of GL1 failed completely despite the use of several primer combinations. Hence, we conclude that the glabrous phenotype may be caused by a large deletion encompassing the entire GL1 locus. The remaining two natural GL1 mutants (Br-0, Mir-0) contained a single-base-pair deletion in the second exon at the beginning of the R3 motif (table 5). The resulting frameshift caused a premature stop after amino acid 93. Phenotypically, these accessions were identical to gl1-1, Wil-2, and Est, which have lost the GL1 gene. Hence, no trichomes could be detected on those plants.


 
Table 5 GL1 Mutations Causing a Glabrous Phenotype

 
We also analyzed two EMS-induced GL1 mutants. gl1-65 had a severe reduction in the number of trichomes, and only a small number of trichomes could be detected on the late rosette leaf margins, the cauline leaves, and the stem. The glabrous phenotype was caused by a G100R replacement. The second EMS-induced mutant, gl1-323, had a more severe phenotype with fewer trichomes on the cauline leaves and no trichomes on the stem. Sequence analysis indicated that a stop codon was introduced at amino acid 182, which resulted in a truncation of the predicted protein by 46 amino acids. A summary of the mutations leading to a glabrous phenotype is given in table 5.

Analysis of A. lyrata
For comparison, we also sequenced GL1 in two A. lyrata individuals, one showing a glabrous phenotype. Interestingly, both individuals carried a 7-bp insertion in the third exon causing a frameshift at amino acid 215 and a stop codon 11 codons downstream. Because this mutation occurred independently of the trichome phenotype, it probably has no implications for the functionality of GL1. The glabrous plant carried an additional insertion of 4 bp at the beginning of the third exon in the R3 domain. This mutation leads to a frameshift of 10 codons and a truncation of 81 amino acids of the predicted GL1 gene product. We regard this mutation as the possible cause of the glabrous phenotype. An important conclusion from this naturally occurring glabrous phenotype is that, as in A. thaliana, GL1 is also required for trichome initiation in A. lyrata.

Cross-species comparisons indicated seven amino acid replacements between A. thaliana and A. lyrata. No amino acid replacement occurred in the conserved myb R2 (amino acids 14–66, nucleotide positions 206–491 in our alignment) and R3 (amino acids 67–117, nucleotide positions 492–1507) domains. We detected most amino acid replacements (86%) in the 110 terminal amino acids, suggesting only a limited requirement for the C-terminus.


    Discussion
The GL1 Polymorphism Is Not Compatible with Balancing Selection
The distribution of pairwise differences between the 26 analyzed GL1 alleles is very similar to that of previously described balancing polymorphisms (fig. 1). Because this phenomenon (balancing selection) is known to result in longer coalescence times than those resulting under neutrality (Kaplan, Hudson, and Langley 1989; Stahl et al. 1999), genomic regions linked to the selected site are expected to accumulate more mutations. Thus, if the two allelic classes observed at the GL1 locus were generated by balancing selection, they should differ more than neutrally evolving sequences. Considering that the size of the sequenced stretch, which shows elevated levels of sequence polymorphism, is determined by recombination, no significant linkage disequilibrium should be detected between polymorphisms within and outside the region with elevated sequence polymorphism (assuming balancing selection). However, our data indicate that an indel polymorphism at positions 327–330 does not follow this prediction and displays significant linkage disequilibrium with polymorphisms in the 3' region, which shows an elevated level of sequence polymorphism. Combining the evidence for linkage disequilibrium and the frequent occurrence of diverged sequence clades at several A. thaliana genes (Hanfstingl et al. 1994; Kawabe et al. 1997; Purugganan and Suddith 1998, 1999; Kawabe and Miyashita 1999; Stahl et al. 1999; Aguade 2001), we conclude that the distribution of polymorphic sites at the GL1 locus is most likely not caused by balancing selection.

Whether a simple historical explanation, such as a secondary contact of diverged A. thaliana populations (Innan et al. 1996), fits the pattern of nucleotide polymorphism better cannot be decided without more data, but it should be borne in mind as an alternative explanation.

Functional Analysis of Molecular Variation at GL1
One of the major goals of this study was to link phenotypic and genotypic variation to learn more about the function of the GL1 locus. Our analysis revealed substantial phenotypic and molecular variation. Most of the molecular variation could be attributed to the presence of two diverged sequence clades. Despite this extensive molecular divergence between the two clades, our ANOVA did not detect a significant effect of the molecularly defined clades on the measured trichome densities. Further ANOVA analyses also relying on a cladistic grouping of GL1 sequences did not show a statistically significant effect.

In principle, several different explanations for the lack of a statistically significant result could be invoked. The first explanation would be that an association between molecular variation and trichome density exists, but the statistical power to detect the association was too low. This hypothesis is supported by the large within-accession variance in trichome density observed in this study as well as in a previous report (Larkin et al. 1996). A second explanation assumes an influence of additional genes. For example, the co-evolution of GL1 and genes interacting with GL1 could result in a nonsignificant ANOVA result. Assuming that each sequence variant is associated with a corresponding, co-evolved allele of one or several interacting genes, we would not expect to see an effect of the two sequence clades on trichome distribution. Recently, a similar scenario was tested for two genes (APETALA3 and PISTILLATA) involved in flower development which are known to interact with each other. The authors of the study detected significant linkage disequilibrium within genes (largely due to two allelic classes) but no significant intergenic association between polymorphic sites (Purugganan and Suddith 1999). Hence, their results are not consistent with a co-evolution of the two classes of allelic variants. One further possibility is that the observed sequence variation at GL1 has an effect on trichome density, but a large effect of variation at another locus obscures the association of the phenotype with GL1 variation. Finally, GL1 may be an important gene for initiation of trichome formation, but it has no influence on trichome densities. We used a one-way ANOVA for three different cladistic groupings to test whether the partitioning of molecular variation at GL1 could explain the variation in trichome number among accessions (table 1). While we did not observe a significant ANOVA, this could be simply due to the lack of statistical power. Nevertheless, a closer inspection of the three different cladistic levels indicates that the proportion of variance that could be explained by the grouping of accessions into different clades increases with the number of groups. This observation further corroborates the conclusion that shared polymorphic sites, as indicated by the cladistic grouping, fail to explain the observed variance in trichome density among accessions. Further evidence against the involvement of GL1 in trichome density comes from accessions with the identical sequence at GL1. Although these accessions (Es-0, As-0, Mt-0, RLD, and Tsu-0) lack sequence variation at GL1, they differ significantly in trichome density (ANOVA, P = 0.04). This observation is also consistent with a recent QTL analysis which identified one gene, not mapping to the GL1 locus, with a significant effect on trichome density (Larkin et al. 1996).

The combined analysis of molecular variation in glabrous plants and the closely related species A. lyrata provides some insight into the functionally important regions of the GL1 gene. One important observation was that those mutations which resulted in a predicted truncation of the GL1 protein differed in the extent to which various plant organs were affected. The most extreme mutation, which did not affect trichome density, was a C-terminal insertion of 7 bp in A. lyrata. The resulting frameshift caused a replacement of 10 amino acids and a deletion of six terminal amino acids. A stop codon at amino acid position 202 in A. thaliana resulted in a weak phenotype in which mainly late rosette leaves had a reduced trichome formation (Esch, Oppenheimer, and Marks 1994). The stop codon at position 181 resulted in a lack of trichomes on the rosette leaves and the stem. Cauline leaves of this mutant had a reduced number of trichomes. A frameshift mutation located in the highly conserved R3 domain (exon 3) which resulted in a stop codon at position 104, however, was completely glabrous.

Hence, the emerging picture is that the effect on trichome density becomes weaker the closer the truncation is located to the C-terminal end of the protein. Furthermore, late rosette leaves are generally more affected than the stem and cauline leaves. In particular, cauline leaves showed some robustness toward truncations of the GL1 protein.

Interestingly, a single amino acid replacement (G100R, nucleotide position 1454) observed in the EMS-induced mutant gl1-65 in the R3 domain, which is conserved among many members of the MYB protein family, was sufficient to produce a phenotype comparable to a deletion of 48 amino acids in the C-terminal region of the protein.


    Conclusions
 
Our study confirmed previous results that many genes in A. thaliana group into two diverged sequence clades. In principle, this provides an excellent framework to test for correlation between genotype and phenotype. Either it is possible to show that the sequence divergence is associated with a different phenotype or, alternatively, a large fraction of molecular variation can be excluded as a determinant for phenotypic variation. Finally, two divergent clades will greatly facilitate the identification of epistatic interactions. Hence, despite the problem of adequate sampling (Savolainen et al. 2000), the analysis of naturally occurring variation in A. thaliana will be an important approach for further investigation of genotype-phenotype interactions.


    Acknowledgements
 
We are grateful to the Arabidopsis stock center for providing seeds and to M. Hülskamp for discussion, communicating unpublished results, and providing seed material. O. Savolainen and M. Koch provided A. lyrata seed material. K. Schmid, M. Hülskamp, and D. Charlesworth provided helpful comments on the manuscript. We are grateful to M. Nordborg for his continued interest in the manuscript and many helpful comments at various stages of the project. This work was supported by grants from the Fonds zur Förderung der wissenschaftlichen Forschung (FWF) to M.-T.H. and C.S.


    Footnotes
 
Antony Dean, Reviewing Editor

1 Keywords: trichome, Arabidopsis thaliana, Arabidopsis lyrata, balancing selection, GLABROUS1.

2 Address for correspondence and reprints: Christian Schlötterer, Institut für Tierzucht und Genetik, Veterinärmedizinische Universität Wien, Josef-Baumann Gasse 1, A-1210 Vienna, Austria. christian.schloetterer@vu-wien.ac.at.


    References
 

    Aguade M., 2001 Nucleotide sequence variation at two genes of the phenylpropanoid pathway, the FAH1 and F3H genes, in Arabidopsis thaliana Mol. Biol. Evol 18:1-9

    Benfey P. N., P. J. Linstead, K. Roberts, J. W. Schiefelbein, M. T. Hauser, R. A. Aeschbacher, 1993 Root development in Arabidopsis: four mutants with dramatically altered root morphogenesis Development 119:57-70

    Esch J. J., D. G. Oppenheimer, M. D. Marks, 1994 Characterization of a weak allele of the GL1 gene of Arabidopsis thaliana Plant Mol. Biol 24:203-207

    Fu Y.-X., W.-H. Li, 1993 Statistical tests of neutrality of mutations Genetics 133:693-709

    Hanfstingl U., A. Berry, E. A. Kellogg, J. T. Costa, W. Rüdiger, F. M. Ausubel, 1994 Haplotypic divergence coupled with lack of diversity at the Arabidopsis thaliana Alcohol Dehydrogenase locus: roles for both balancing and directional selection? Genetics 138:811-828

    Hauser M. T., F. Adhami, M. Dorner, E. Fuchs, J. Glössl, 1998 Generation of co-dominant PCR-based markers by duplex analysis on high resolution gels Plant J 16:117-125

    Hey J., J. Wakeley, 1997 A coalescent estimator of the population recombination rate Genetics 145:833-846

    Hülskamp M., S. Miséra, G. Jürgens, 1994 Genetic dissection of trichome cell development in Arabidopsis Cell 76:555-566

    Innan H., F. Tajima, R. Terauchi, N. T. Miyashita, 1996 Intragenic recombination in the Adh locus of the wild plant Arabidopsis thaliana Genetics 143:1761-1770

    Kaplan N. L., R. R. Hudson, C. H. Langley, 1989 The "hitchhiking effect" revisited Genetics 123:887-899

    Kawabe A., H. Innan, R. Terauchi, N. T. Miyashita, 1997 Nucleotide polymorphism in the acidic chitinase locus (ChiA) region of the wild plant Arabidopsis thaliana Mol. Biol. Evol 14:1303-1315

    Kawabe A., N. T. Miyashita, 1999 DNA variation in the basic chitinase locus (ChiB) region of the wild plant Arabidopsis thaliana Genetics 153:1445-1453

    Koornneef M., L. W. M. Dellaert, J. H. van der Veen, 1982 EMS- and radiation-induced mutation frequencies at individual loci in Arabidopsis thaliana (L.) Heynh. Mutat. Res 93:109-123

    Larkin J. C., N. Young, M. Prigge, M. D. Marks, 1996 The control of trichome spacing and number in Arabidopsis Development 122:997-1005

    Long A. D., C. H. Langley, 1999 The power of association studies to detect the contribution of candidate genetic loci to variation in complex traits Genome Res 9:720-731

    Long A. D., R. F. Lyman, C. H. Langley, T. F. Mackay, 1998 Two sites in the Delta gene region contribute to naturally occurring variation in bristle number in Drosophila melanogaster Genetics 149:999-1017

    Lynch M., B. Walsh, 1998 Genetics and analysis of quantitative traits Sinauer, Sunderland, Mass

    Marks M. D., K. A. Feldman, 1989 Trichome development in Arabidopsis thaliana. I. T-DNA tagging of the GLABROUS1 gene Plant Cell 1:1043-1050

    McDonald J. H., 1996 Detecting non-neutral heterogeneity across a region of DNA sequence in the ratio of polymorphism to divergence Mol. Biol. Evol 13:253-260

    McDonald J. H., 1998 Improved tests for heterogeneity across a region of DNA sequence in the ratio of polymorphism to divergence Mol. Biol. Evol 15:377-384

    McDonald J. H., M. Kreitman, 1991 Adaptive protein evolution at the Adh locus in Drosophila Nature 351:652-654

    Nei M., 1987 Molecular evolutionary genetics. Columbia University Press, New York

    Oppenheimer D. G., P. L. Herman, S. Sivakumaran, J. Esch, M. D. Marks, 1991 A myb gene required for leaf trichome differentiation in Arabidopsis is expressed in stipules Cell 67:483-493

    Page R. D. M., 1996 TREEVIEW: an application to display phylogenetic trees on personal computers Comput. Appl. Biosci 12:357-358

    Payne C. T., F. Zhang, A. M. Lloyd, 2000 GL3 encodes a bHLH protein that regulates trichome development in Arabidopsis through interaction with GL1 and TTG1 Genetics 156:1349-1362

    Purugganan M. D., J. I. Suddith, 1998 Molecular population genetics of the Arabidopsis CAULIFLOWER regulatory gene: nonneutral evolution and naturally occurring variation in floral homeotic function Proc. Natl. Acad. Sci. USA 95:8130-8134

    Purugganan M. D., J. I. Suddith, 1999 Molecular population genetics of floral homeotic loci: departures from the equilibrium-neutral model at the APETALA3 and PISTILLATA genes of Arabidopsis thaliana Genetics 151:839-848

    Risch N., K. Merikangas, 1996 The future of genetic studies of complex human diseases Science 273:1516-1517

    Rozas J., R. Rozas, 1999 DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis Bioinformatics 15:174-175

    Savolainen O., C. H. Langley, B. P. Lazzaro, H. Fréville, 2000 Contrasting patterns of nucleotide polymorphism at the Alcohol Dehydrogenase locus in the outcrossing Arabidopsis lyrata and the selfing Arabidopsis thaliana Mol. Biol. Evol 17:645-655

    Stahl E. A., G. Dwyer, R. Mauricio, M. Kreitman, J. Bergelson, 1999 Dynamics of disease resistance polymorphism at the Rpm1 locus of Arabidopsis Nature 400:667-671

    Strimmer K., A. von Haeseler, 1996 Quartet puzzling: a quartet maximum-likelihood method for reconstructing tree topologies Mol. Biol. Evol 13:964-969

    Tajima F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism Genetics 123:585-595

    Tamura K., M. Nei, 1993 Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees Mol. Biol. Evol 10:512-526

    Templeton A. R., E. Boerwinkle, C. F. Sing, 1987 A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. I. Basic theory and an analysis of alcohol dehydrogenase activity in Drosophila Genetics 117:343-351

    Thompson J. D., D. G. Higgins, T. J. Gibson, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice Nucleic Acids Res 22:4673-4680

    Wada T., T. Tachibana, Y. Shimura, K. Okada, 1997 Epidermal cell differentiation in Arabidopsis determined by a Myb homolog, CPC Science 277:1113-1116

    Watterson G. A., 1975 On the number of segregating sites in genetical models without recombination Theor. Pop. Biol 7:256-276

Accepted for publication May 23, 2001.