*Centre for Animal Conservation Genetics, Faculty of Resource Science and Management,
and
Graduate Research College, Southern Cross University, Lismore, New South Wales, Australia
Abstract
In this study, the evolutionary histories of the variable second exon of RT1.Ba and its adjoining intron b are compared across a number of species and subspecies of the Australian Rattus. Three lineages are identified in the second intron across a range of Rattus species. Two of these lineages, separated by the insertion of a probable rodent short interspersed nucleotide element and by point mutations outside the indel region, are both found in each of the major clades of the endemic Australian Rattus. This pattern of ancestral polymorphism is reflected in the adjoining exon 2 sequences, although phylogenetic constraints confirm that the clustering is not identical to that of the associated intron sequences. In addition, the coding sequences show evidence of the retention of ancestral polymorphism, with identical exon sequences found in two divergent species, and some indication of gene conversion detected for the exon sequences.
Introduction
The major histocompatibility complex (Mhc) is a multigene family identified across a range of vertebrate species. Many Mhc loci show high levels of polymorphism, which are thought to derive in part from the retention of ancestral polymorphisms (Klein 1987), leaving some Mhc alleles with similarity to those in otherwise distant species. Gene conversion or intragenic segmental transfer has also been postulated as a means of increasing the level of polymorphism of Mhc alleles (Hedrick et al. 1991). In addition, an observed increase in the number of nonsynonymous over synonymous changes in regions associated with peptide binding suggests that balancing selection is involved in the maintenance of polymorphism of Mhc alleles (Hughes and Nei 1989).
These processes of selection, the retention of ancestral polymorphisms, and gene conversion have uncoupled the phylogenetic history of the coding regions of Mhc loci from that of the remaining genome and from that of the species in which they are found. The strength of the evolutionary forces shaping the phylogeny of the Mhc coding regions makes the Mhc a good model for examining the interplay between the evolution of an exon and that of its adjoining intron.
Empirical studies present examples from both extremes of linkage for Mhc loci. Cosegregations of exon and intron segments have been identified in several class II Mhc loci (e.g., Ammer et al. 1992), and such tight linkage has led to the suggestion that intron polymorphisms, which often include microsatellite or retroposon variation, can be used as markers for haplotype differences (Ellegren, Davies, and Andersson 1993). In contrast, poor correlation between the evolutionary history of an exon and intron of mouse H-2 Ab was found (Lu et al. 1996) and attributed to intraexonic recombination. Between these extremes, an intron repeat in DRB of artiodactyls was found to evolve with the β-sheet-encoding region of an exon but not with the α-helix-encoding region (Schwaiger et al. 1993).
In this study, the evolutionary history of the second intron of RT1.Ba was examined across a number of species and subspecies of Australian Rattus and compared with that of its adjoining exon to determine the degree to which their evolutionary histories have become uncoupled.
Materials and Methods
Tissues were obtained from six species and eight subspecies of Australian Rattus, namely, R. colletti, R. fuscipes assimilis, R. fuscipes coracius, R. fuscipes fuscipes, R. fuscipes greyii, R. leucopus cooktownensis, R. leucopus leucopus, R. sp., R. tunneyi culmorum, R. tunneyi tunneyi, and R. villosissimus. These samples provide representatives for six of the eight species of endemic Australian rats. Genomic DNA (gDNA) was isolated using a standard proteinase K digestion and phenol/chloroform extraction procedure (Bothwell, Yancopoulos, and Alt 1990). Exon 2 was amplified by the polymerase chain reaction (PCR), using primers and procedures described previously (Seddon and Baverstock 1998). Intron b was amplified under similar conditions, but with the primers RT1.Ba 612C (5'-ATGAAGAGGTCAAATTCAACTCCAGCTA-3') and RT1.Ba 1263NC (5'-GCTGACCCAGCAGCACAGGAGACTT-3').
Because this study compared the evolutionary history of the second exon and its adjoining intron, it was essential that the chromosomal pair of the exon and intron sequences could be deduced without doubt. To simplify this procedure, individuals were chosen on the basis of homozygosity at the intron using temperature gradient gel electrophoresis (TGGE). Consequently, exon sequences determined for these individuals, whether present in homozygous or heterozygous form, could be assigned unequivocally to their coevolving introns.
Both strands of the PCR products were sequenced manually using the ThermoSequenase Cycle Sequencing kit (Amersham Life Science) or were sequenced using the Applied Biosystems 373A autosequencer. Homozygous individuals were sequenced directly. For heterozygous samples, the two homoduplex bands on TGGE were cut separately from the gel and reamplified by PCR (Meyer et al. 1991) prior to sequencing.
The sequences were aligned by eye with assistance, for the intron sequences, from CLUSTAL W (Thompson, Higgins, and Gibson 1994). Tajima-Nei distances were calculated to account for a nucleotide base frequency bias (A, 33.4%; C, 19.6%; G, 22.3%; T, 24.7%) in the exon sequences, and Jukes-Cantor distances were calculated for the intron sequences. To calculate synonymous and nonsynonymous distances, peptide-binding regions (PBRs) of the exons were identified by comparison with the postulated PBR sites for other species (Brown et al. 1988). Phylogenetic analyses were performed using the neighbor-joining distance algorithm in MEGA (Saitou and Nei 1987) and using a maximum-likelihood approach in DNAML of the PHYLIP package (Felsenstein 1993).
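As an illustration of the Jukes-Cantor correction used for the intron sequences, the following minimal Python sketch computes the proportion of differing sites between two aligned sequences, skipping any site with a gap or ambiguous base (complete omission of such sites), and then applies the standard formula d = -(3/4) ln(1 - 4p/3). The function names and the toy sequences are assumptions made for this example; the sketch does not reproduce the MEGA implementation used in the study.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or a non-ACGT symbol are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy aligned sequences (hypothetical, not data from this study).
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTAT"
toy_p = p_distance(toy_a, toy_b)
toy_d = jukes_cantor(toy_p)
print(f"p = {toy_p:.3f}, Jukes-Cantor d = {toy_d:.4f}")
```

The Tajima-Nei distance applied to the exon sequences follows the same logic but replaces the fixed 3/4 factor with a term estimated from the observed base frequencies, which is why it was preferred where base composition was biased.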
The strength of association between the intron and exon sequences was tested by applying topological constraints during phylogeny reconstruction. The intron sequences were used to construct a maximum-likelihood tree under the constraints of the exon phylogeny and of two species phylogenies compatible with that of Baverstock, Adams, and Watts (1986). One species phylogeny is the most parsimonious tree when constrained to the phylogeny of Baverstock, Adams, and Watts (1986), and the second places fuscipes and leucopus as sister taxa and the sordidus group and tunneyi as sister taxa. The degree of congruence between the unconstrained trees and the constrained trees was assessed in a maximum-likelihood analysis by the change in log likelihood (ΔlnL) of the tree, with the level of significance assigned by the method of Kishino and Hasegawa (1989) as implemented in PHYLIP (Felsenstein 1993).
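The general idea behind this kind of comparison can be sketched in a few lines of Python. The sketch assumes that per-site log likelihoods are available for two competing topologies; it sums the per-site differences to obtain the change in log likelihood, estimates the standard error of that sum from the variance of the per-site differences, and reports their ratio as a z score. The function name and the input values are hypothetical, and this is an outline of the approach rather than the PHYLIP code.

```python
import math


def kishino_hasegawa(site_lnl_a, site_lnl_b):
    """Compare two tree topologies from their per-site log likelihoods.

    Returns (delta, se, z): the total log-likelihood difference (A minus B),
    its estimated standard error, and the ratio of the two.
    """
    n = len(site_lnl_a)
    if n != len(site_lnl_b):
        raise ValueError("both trees must be scored on the same sites")
    if n < 2:
        raise ValueError("at least two sites are required")
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    delta = sum(diffs)
    mean = delta / n
    # Variance of the summed difference, estimated from per-site differences.
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    se = math.sqrt(variance)
    if se == 0.0:
        z = 0.0 if delta == 0.0 else math.copysign(math.inf, delta)
    else:
        z = delta / se
    return delta, se, z


# Hypothetical per-site log likelihoods for an unconstrained tree (A)
# and a constrained tree (B); these are not values from the study.
unconstrained = [-1.0, -2.0, -3.0, -4.0]
constrained = [-1.5, -2.0, -3.5, -5.0]
delta, se, z = kishino_hasegawa(unconstrained, constrained)
print(f"delta lnL = {delta:.2f}, SE = {se:.3f}, z = {z:.2f}")
```

Under a normal approximation, an absolute z score above about 1.96 would indicate that the two topologies differ significantly at the 5% level.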
Results and Discussion
Exon
Twenty-one nucleotide sequences of the second exon of RT1.Ba were recovered from 18 individuals (accession numbers AF145094–AF145102), and of these, 12 sequences have previously been presented (Seddon and Baverstock 1998; Seddon 1998). Of the 249 sites, 56 (22.5%) are variable and 32 are informative under parsimony. As anticipated for variable Mhc loci, there is a high level of nucleotide divergence, with Tajima-Nei distances reaching 11.7%.
Identical exon 2 nucleotide sequences were found in two divergent species, R. t. culmorum (R132A) and R. colletti, two species whose distributions are widely separated and distinct. Although the reporting of shared motifs is relatively common, the sharing of a complete exon sequence between two species has been reported infrequently, for example, in Mhc-E for two closely related chimpanzee species (Suarez et al. 1997). Convergence and reticulate evolution cannot be dismissed as explanations; however, this is the first example of a shared Mhc exon sequence in two otherwise divergent species and is strong evidence of the retention of ancestral polymorphisms.
Such an evolutionary pattern is further supported by the presence of a codon deletion at positions 222–224 (corresponding to residue 80 of the first domain) in two sequences of R. t. culmorum (this study), in 17 exon 2 alleles of R. f. greyii (Seddon 1998), in R. colletti (Seddon and Baverstock 1998), and in R. norvegicus (Holmdahl et al. 1993). This codon deletion indicates an ancestry shared since the divergence of these species.
The influence of balancing selection on these exons is suggested by an increase in the level of nonsynonymous changes at sites postulated to be involved in peptide binding. The 12 PBR codons showed more nonsynonymous than synonymous changes (mean pn = 0.1723, mean ps = 0.1233), but for the 70 codons outside the PBR, there were fewer nonsynonymous than synonymous changes (mean pn = 0.0380, mean ps = 0.0652).
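The contrast between the two classes of codons can be made explicit by taking the ratio of the reported mean proportions, as in the short Python sketch below. The function name is an assumption made for this example, and the ratio of the means is only an illustrative summary; it is not a formal test of selection and is not a statistic reported in the study.

```python
def pn_ps_ratio(pn, ps):
    """Ratio of nonsynonymous to synonymous proportions of difference."""
    if ps <= 0:
        raise ValueError("ps must be positive to form a ratio")
    return pn / ps


# Mean values reported in the text for the two classes of codons.
pbr_ratio = pn_ps_ratio(0.1723, 0.1233)       # 12 PBR codons
non_pbr_ratio = pn_ps_ratio(0.0380, 0.0652)   # 70 codons outside the PBR

print(f"PBR pn/ps = {pbr_ratio:.3f}")
print(f"non-PBR pn/ps = {non_pbr_ratio:.3f}")
```

The ratio is about 1.40 for the PBR codons and about 0.58 for the remaining codons, so nonsynonymous changes exceed synonymous changes only at the sites postulated to bind peptide.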
The phylogenetic relationships of the exon 2 sequences (fig. 1A) clearly differ from the expected species phylogeny. This discrepancy is likely due to the retention of ancestral polymorphisms and the presence of balancing selection acting on the exon 2 sequences of RT1.Ba. However, the small number of nucleotides used to construct the tree may result in an inaccurate phylogeny, with few branches that have strong bootstrap support.
The evolutionary patterns of the exon have substantially influenced the evolution of its adjoining intron. The nucleotide divergence of the intron sequences approximates the high levels found for the Mhc exon sequences. The presence of balancing selection acting on the PBR of the second exon, together with hitchhiking, can be postulated to have increased the divergence in the adjoining intron sequences beyond neutral expectations (Hudson, Kreitman, and Aguade 1987; Hudson 1990). However, such an explanation requires that the recombination rate between these regions has been fairly small. Even a recombination rate of 10⁻⁵ per generation has been shown to be sufficient to substantially decrease the influence of balancing selection on the linked neutral sites (Takahata and Satta 1998).
The most interesting feature of the intron sequences is the pattern of large insertion/deletion events which give three lineages. The sequence of R. norvegicus (X07551) is designated lineage 1. In comparison with this sequence, all Australian rat sequences (lineages 2 and 3) show a deletion of 171 nt (positions 487–657). A large indel of approximately 107 nt separates lineages 2 and 3 (positions 334–440), with the proposed insertion defining lineage 2 in some species and subspecies of Australian rats. This insertion has a long run of thymidine bases, resulting in slippage of the Taq sequencing enzyme and leading to difficulty in verifying the exact number of thymidine bases. The insertion, defining lineage 2, shows a 93% similarity to a rodent short interspersed nucleotide element (SINE) (Pascale et al. 1993), identified using a FASTA search of GenBank sequences. The subdivision of the intron lineages on the basis of a probable rodent SINE is reminiscent of a similar lineage split reported in an intron of H-2 Ab in an extensive array of Mus species (Lu et al. 1996).
In addition to the insertion/deletion events, there are changes outside these regions which further define the lineages. A neighbor-joining tree calculated using Jukes-Cantor distances with complete omission of sites containing gaps (thus effectively removing insertion and deletion events) clearly shows the division of the 19 sequences into three lineages (fig. 1B).
Importantly, neither lineage 2 nor lineage 3 occurs in a species-specific manner. Both lineages are represented across a range of species and subspecies. For example, two individuals of R. f. greyii were sequenced, one showing the insertion characteristic of lineage 2 and the other not. This pattern is repeated for paired individuals of each of R. l. cooktownensis, R. t. culmorum, and R. villosissimus. The retention of ancestral polymorphisms and the presence of balancing selection, which led to a discrepancy between the exon 2 phylogeny and the expected species phylogeny, have clearly influenced this intron phylogeny.
There is remarkable similarity in the grouping of the exon sequences by intron-defined lineages, although the phylogenies of the intron and exon sequences have, to some extent, become uncoupled. The individuals with lineage 2 introns are joined in a clade with 94% bootstrap support in the intron neighbor-joining tree (fig. 1B) and are again united in a phylogenetic tree constructed from the exon sequences (fig. 1A). However, the linkage is not complete. In the exon tree, this clade also contains R. f. greyii (KA3) and one exon of the heterozygote R. l. cooktownensis (D71B), supported by a moderate (68%) bootstrap value. Further support for the absence of complete linkage between the exon and intron sequences is shown by two individuals of R. l. cooktownensis (D72 and D71B) that share an identical exon sequence, yet one has a lineage 2 intron and the other has a lineage 3 intron. The application of topological constraints during the re-creation of phylogenetic relationships allows an assessment of the strength of hitchhiking between the intron and exon sequences. Such an analysis shows that an intron tree constrained to either the exon phylogeny (ΔlnL = -589.45) or one of two species phylogenies (ΔlnL = -527.37, -527.69) is significantly different from the unconstrained tree (lnL = -2,080.40).
The patterns of evolution acting on the exon, such as balancing selection and the retention of ancestral polymorphisms, have clearly influenced the evolutionary history of the intron. Therefore, in this example of RT1.Ba in species and subspecies of endemic Australian Rattus, a low level of recombination can be postulated for this locus, allowing the footprints of the evolutionary forces acting on the coding region to be detected in the adjacent noncoding region.
Furthermore, the presence of two clear lineages in the intron of the Australian Rattus species and subspecies gives some indication of the speciation history of this group of endemic rats. The large size of the insertion into the intron suggests that it represents a single evolutionary event. Therefore, the presence of both lineages in each major clade of the Australian rats (fig. 3) indicates that the insertion predates the rapid divergence of Australian Rattus that followed their movement into Australia during the Pleistocene (Taylor, Calaby, and Smith 1983). The origin of the insertion cannot be determined without further study of the Asian species of Rattus. In addition, because the continuation of ancestral polymorphisms through speciation events requires a sufficiently large population to prevent the loss of variation through genetic drift or bottlenecks (Vincek et al. 1997), we can infer that each speciation event within the divergence of the Australian Rattus lineage occurred with a relatively large founding population.
Acknowledgements
We thank the South Australian Museum for providing tissue from their collection, and Martin Elphinstone, Fiona Harriss, and the South Australian National Parks and Wildlife Service for the collection of samples for this study. Financial support was provided by the Australian Research Council and the School of Resource Science and Management, Southern Cross University.
Footnotes
Eleptherios Zouros, Reviewing Editor
1 Present address: School of Biological Sciences, University of East Anglia, Norwich, England.
2 Keywords: MHC, RT1.Ba, Australian Rattus.
3 Address for correspondence and reprints: J. M. Seddon, School of Biological Sciences, University of East Anglia, Norwich NR4 7TJ, United Kingdom. E-mail: j.seddon{at}uea.ac.uk
Literature Cited
Ammer, H., F. Schwaiger, C. Kammerbauer, M. Gomolka, A. Arriens, S. Lazary, and J. T. Epplen. 1992. Exonic polymorphism versus intronic simple repeat hypervariability in MHC-DRB genes. Immunogenetics 32:332–340.
Baverstock, P. R., M. Adams, and C. H. S. Watts. 1986. Biochemical differentiation among karyotypic forms of Australian Rattus. Genetica 71:11–22.
Bothwell, A., G. D. Yancopoulos, and F. W. Alt. 1990. Methods for cloning and analysis of eukaryotic genes. Jones and Bartlett, Boston.
Brown, J. H., T. Jardetzky, M. A. Saper, B. Samraoui, P. J. Bjorkman, and D. C. Wiley. 1988. A hypothetical model of the foreign antigen binding site of class II histocompatibility molecules. Nature 332:845–850.
Edwards, S. V., and P. W. Hedrick. 1998. Evolution and ecology of MHC molecules: from genomics to sexual selection. TREE 13:305–311.
Ellegren, H., C. J. Davies, and L. Andersson. 1993. Strong association between polymorphisms in an intronic microsatellite and in the coding sequence of the BoLA-DRB3 gene: implications for microsatellite stability and PCR-based DRB3 typing. Anim. Genet. 24:269–275.
Felsenstein, J. 1993. PHYLIP (phylogeny inference package). Distributed by the author, Department of Genetics, University of Washington, Seattle.
Hedrick, P. W., W. Klitz, W. P. Robinson, M. K. Kuhner, and G. Thomson. 1991. Population genetics of HLA. Pp. 248–271 in R. K. Selander, A. G. Clark, and T. S. Whittam, eds. Evolution at the molecular level. Sinauer, Sunderland, Mass.
Holmdahl, R., M. Karlsson, K. Gustafsson, and H. Hedrich. 1993. Structural polymorphism of six rat RT1.Ba genes. Immunogenetics 38:381.
Hudson, R. 1990. Gene genealogies and the coalescent process. Oxf. Surv. Evol. Biol. 7:1–44.
Hudson, R. R., M. Kreitman, and M. Aguade. 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116:153–159.
Hughes, A. L., and M. Nei. 1989. Nucleotide substitution at major histocompatibility complex class II loci: evidence for overdominant selection. Proc. Natl. Acad. Sci. USA 86:958–962.
Kishino, H., and M. Hasegawa. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data and the branching order of the Hominoidea. J. Mol. Evol. 29:170–179.
Klein, J. 1987. Origin of the major histocompatibility complex polymorphism: the trans-species hypothesis. Hum. Immunol. 19:155–162.
Lu, C.-C., Y. Ye, J. X. She, F. Bonhomme, and E. K. Wakeland. 1996. Evolutionary origins of retroposon lineages of Mhc class II Ab alleles. Immunogenetics 43:115–124.
Meyer, C. G., E. Tannich, J. Harders, K. Henco, and R. D. Horstmann. 1991. Direct sequencing of variable HLA gene segments after in vitro amplification and allele separation by temperature-gradient gel electrophoresis. J. Immunol. Methods 142:251–256.
Pascale, E., C. Liu, E. Valle, K. Usdin, and A. V. Furano. 1993. The evolution of long interspersed repeated DNA (L1, LINE 1) as revealed by the analysis of an ancient rodent L1 DNA family. J. Mol. Evol. 36:9–20.
Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425.
Schwaiger, F., E. Weyers, C. Epplen, G. Ruff, A. M. Crawford, and J. T. Epplen. 1993. The paradox of MHC-DRB exon/intron evolution: α-helix and β-sheet encoding regions diverge while hypervariable intronic simple repeats co-evolve with β-sheet codons. J. Mol. Evol. 37:260–272.
Seddon, J. M. 1998. Genetic variation on islands: Mhc polymorphism in the Australian bush rat. Ph.D. thesis, Southern Cross University, Lismore, New South Wales, Australia.
Seddon, J. M., and P. R. Baverstock. 1998. Eight rat RT1.Ba sequences. Immunogenetics 48:161–162.
Suarez, B., P. Morales, M. J. Castro, V. Fernandez-Soria, M. J. Recio, M. Perez-Blas, M. Alvarez, N. Diaz-Campos, and A. Arnaiz-Villena. 1997. Mhc-E polymorphism in Pongidae primates: the same allele is found in two different species. Tissue Antigens 50:695–698.
Takahata, N., and Y. Satta. 1998. Footprints of intragenic recombination at HLA loci. Immunogenetics 47:430–441.
Taylor, J. M., J. H. Calaby, and S. C. Smith. 1983. Native Rattus, land bridges, and the Australian region. J. Mammal. 64:463–475.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.
Vincek, V., C. O'hUigin, Y. Satta, N. Takahata, P. T. Boag, P. R. Grant, B. R. Grant, and J. Klein. 1997. How large was the founding population of Darwin's finches? Proc. R. Soc. Lond. B Biol. Sci. 264:111–118.