*Department of Biology, University of Rochester;
and
Department of Ecology and Evolutionary Biology, University of Arizona
Abstract |
---|
Introduction |
---|
Several experimental systems have revealed that the incidences of certain replication errors differ between the complementary DNA strands and that error rates are usually, but not always, higher on the lagging strand (Trinh and Sinden 1991; Veaute and Fuchs 1993; Roberts et al. 1994; Izuta, Roberts, and Kunkel 1995; Rosche, Trinh, and Sinden 1995; Thomas et al. 1996; Fijalkowska et al. 1998; Miret, Pessoa-Brandao, and Lahue 1998; Rosche, Ripley, and Sinden 1998). For bacteria, evidence for strand-specific mutational biases comes from the large-scale base composition patterns uncovered in complete genomic sequences. While equal frequencies of G and C, as well as A and T, are expected under no-strand-bias conditions, most bacterial chromosomes exhibit deviations from these intrastrand parities, which often switch direction at the origin and terminus of replication (Lobry 1996; Mrazek and Karlin 1998). However, such deviations are also affected by strand-biased gene distribution coupled with (1) selection on amino acid content and/or synonymous codon usage and (2) strand-specific mutational biases introduced by transcription through transcription-coupled repair and the preferential deamination of the coding strand (Francino and Ochman 1997, 1999; Mrazek and Karlin 1998).
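As an illustration of what a deviation from intrastrand parity looks like numerically, the following minimal sketch computes the commonly used skew indices (G − C)/(G + C) and (A − T)/(A + T) for one strand. The function name `parity_skews` and the example sequence are hypothetical and are not taken from the analyses cited above; both indices are 0 when the parity rules hold.

```python
from collections import Counter

def parity_skews(seq):
    """Return (GC skew, AT skew) for a single DNA strand.

    Both values are 0.0 under intrastrand parity (G = C and A = T).
    Characters other than A, C, G, T are ignored.
    """
    counts = Counter(seq.upper())
    g, c = counts["G"], counts["C"]
    a, t = counts["A"], counts["T"]
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    return gc_skew, at_skew

# Hypothetical example: an excess of G over C, with A and T balanced.
print(parity_skews("GGGCAATT"))
```

In bacterial genomes such indices are typically computed in sliding windows along the chromosome, so that a change of sign can be located relative to the origin and terminus.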
Unlike bacterial replication, DNA replication of eukaryotic chromosomes initiates at multiple origins, with an average spacing of 50–100 kb (Brewer and Fangman 1993). Eukaryotic chromosomes do not show large-scale deviations from intrastrand parity in base composition (Karlin, Campbell, and Mrazek 1998), although small local deviations are observable in the human β-globin region (Smithies et al. 1981; Bulmer 1991). This small-scale patterning might result from different mutational biases on the leading and lagging strands, but only if replicon size in germ line cells were restricted to a few kilobases (Bulmer 1991). Previous analyses of substitutional patterns in this region among primate species have yielded conflicting results regarding the existence of differences between the strands (Wu and Maeda 1987, but see Bulmer 1991). These conflicts were due, in part, to the lack of information about the actual positions of replication origins, which precluded the assignment of mutations to the leading or lagging strand. However, present knowledge of the location of a replication origin within the β-globin region (Kitsberg et al. 1993; Aladjem et al. 1998) enables the analysis of sequences for which the direction of replication can be established, thus facilitating the interpretation of such results.
To resolve whether mutational asymmetries exist between the leading and lagging strands of DNA, we compared the patterns of substitutions among primate species at noncoding sequences adjacent to the only origin of replication identified within the 200-kb β-globin region. The close proximity of the analyzed sequences to this replication origin minimizes the probability that these sequences are replicated by forks coming from alternative origins, even in the face of dense origin recruitment in the germ line and in early embryogenesis, when the spacing between initiation sites is reduced to <10 kb (DePamphilis 1996). During these periods of rapid cell division, replication may initiate at many sites due to permissive conditions such as high concentrations of initiation proteins, but strong origins, like the one identified in the β-globin region, are still most likely to be used (DePamphilis 1996). Furthermore, several sequence motifs typical of eukaryotic replication origins are conserved at the same locations in the orthologous sequences from other primates, indicating that replication is very likely to initiate at the same site in all species considered.
Materials And Methods |
---|
In order to detect reversals in the substitutional patterns associated with the shift from lagging strand to leading strand, we compared the ratios of complementary substitutions (e.g., C→A vs. G→T) at either side (5' and 3') of the origin of replication. For every pair of complementary substitutions, we calculated the following ratio:

$$\frac{(f_{ij}/f_{kl})_{5'}}{(f_{ij}/f_{kl})_{3'}}$$

where fij and fkl are frequencies of complementary substitutions. If no strand bias existed, complementary substitutions would occur at equal frequencies within each region, and this ratio would be equal to 1. But under strand bias, complementary substitution frequencies would deviate from equality within each region, and the deviations would switch direction at the replication origin, yielding values of the ratio not equal to 1. To evaluate the statistical significance of the observed values, we calculated the natural logarithm of the ratio, whose distribution is approximately normal and whose standard error can be obtained (Sokal and Rohlf 1995). Within each region, fij and fkl were also compared by means of χ² tests. To evaluate the effects of neighboring nucleotides on the substitutional pattern, we obtained trinucleotide frequencies at each side of the origin with the aid of the GCG package (Devereux, Haberli, and Smithies 1984).
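The test on the log-transformed ratio can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the ratio can be treated as an odds ratio computed from raw substitution counts, and it uses the standard large-sample standard error of a log odds ratio, the square root of the summed reciprocal counts. The function name `log_ratio_test` and the counts are hypothetical; if the frequencies are normalized by base composition, the standard error would need to be adjusted accordingly.

```python
import math

def log_ratio_test(n_ij_5, n_kl_5, n_ij_3, n_kl_3):
    """Test whether the ratio of complementary-substitution ratios differs from 1.

    Arguments are raw counts of substitutions i->j and k->l (the complementary
    pair) in the 5' and 3' regions. All counts must be positive.
    """
    ratio = (n_ij_5 / n_kl_5) / (n_ij_3 / n_kl_3)
    ln_ratio = math.log(ratio)
    # Large-sample standard error of a log odds ratio (assumed applicable here).
    se = math.sqrt(1 / n_ij_5 + 1 / n_kl_5 + 1 / n_ij_3 + 1 / n_kl_3)
    z = ln_ratio / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return {"ratio": ratio, "ln_ratio": ln_ratio, "se": se, "z": z, "p": p}

# Hypothetical counts in which the excess reverses across the origin.
result = log_ratio_test(12, 6, 5, 10)
print(result["ratio"], result["p"])
```

With counts as small as those in a single substitution class, the normal approximation is rough, so a result near the significance threshold should be read with caution.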
Results And Discussion |
---|
The (TG)n repeat is located at the center of the 5' region (nucleotides 678–731 in the 5' alignment) in Old World monkeys (OWMs) and hominoids, with n varying from 10 in the chimpanzee to 27 in the gorilla. The orthologous segment could not be recovered from New World monkeys (NWMs). The sequence and length variability of this segment suggests that it evolves differently from the rest of the region due to such processes as slipped-strand mispairing.
The NWMs harbor an insertion in this region of 634 bp (nucleotides 774–1407 in the 3' alignment) consisting of a complete Alu element immediately preceded by the 3' untranslated region (UTR) of L1. The joint L1-Alu insertion is flanked by perfect direct repeats (AGGACT). This arrangement suggests read-through transcription of an active L1 element into the adjacent Alu, coupled with retrotransposition by the L1-encoded enzymes into the 3' region and extensive truncation of the 5' end during cDNA synthesis, sparing only the 3' UTR of L1 (H. Malik, personal communication). Therefore, this example can be added to the list of 3' flanking DNA transductions mediated by L1 that have been detected in vivo (Miki et al. 1992; Holmes et al. 1994; McNaughton et al. 1997; Rozmahel et al. 1997) and represents the first case in which an Alu element has been transposed by the L1 machinery. The L1-Alu insertion occurred sometime between the split of NWMs from OWMs and hominoids 35 MYA and the split of Saimiri from the Callitrichinae (Callithrix and Saguinus) 17.5 MYA (Gingerich 1984; Fleagle 1988; Schneider et al. 1993). Because of the possibility of recombinational events between repetitive elements, we omitted the L1-Alu insertion from the estimation of the substitutional pattern in the 3' region.
Patterns of Substitution Around the β-Globin Origin of Replication
The minimal numbers of substitutions detected were 125 in the region 5' and 220 in the region 3' of the β-globin origin. Table 1 presents the numbers for every type of nucleotide substitution in either region, and figure 3 compares the frequencies (fij) for each pair of complementary substitutions. The hypothesis of unequal distribution of mutations between the leading and lagging strands predicts that complementary substitution frequencies will differ within each region and that deviations from equality will manifest opposite directions 5' and 3' of the replication origin. However, figure 3 shows that in most cases, complementary substitution frequencies are of similar magnitudes and/or that deviations from equality do not switch direction at the origin. Only two pairs of complementary transversions, T→A versus A→T and C→A versus G→T, behave in a manner similar to the strand-bias predictions.
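The qualitative prediction described above, that deviations from equality point in opposite directions on the two sides of the origin, can be expressed as a simple sign check. The sketch below is illustrative only: the function name `switches_direction` and the frequencies are hypothetical, and the check says nothing about statistical significance.

```python
import math

def switches_direction(f_ij_5, f_kl_5, f_ij_3, f_kl_3):
    """Return True if the excess of one substitution over its complement
    reverses between the 5' and 3' regions.

    Each argument is a positive substitution frequency. The log ratio is
    positive when i->j exceeds k->l and negative in the opposite case, so
    a reversal corresponds to log ratios of opposite sign.
    """
    dev_5 = math.log(f_ij_5 / f_kl_5)
    dev_3 = math.log(f_ij_3 / f_kl_3)
    return dev_5 * dev_3 < 0

# Hypothetical frequencies: the excess reverses in the first case only.
print(switches_direction(0.08, 0.04, 0.03, 0.06))  # True
print(switches_direction(0.08, 0.04, 0.06, 0.03))  # False
```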
Neighboring-Nucleotide Effects on the Substitutional Pattern
Because the pattern of substitutions around the β-globin origin of replication does not behave as expected from a strand-biased mutation input model, we investigated whether neighboring-nucleotide effects could explain the few individual cases in which deviations from symmetry do exist. The 5' and 3' neighboring nucleotides are known to influence the probability of specific substitutions (Hess, Blake, and Blake 1994), and, since nucleotides are likely to be exposed to different neighbors in each strand, substitutional asymmetries may arise. Based on the trinucleotide frequencies in the regions analyzed, we computed the average likelihood of each nucleotide substitution according to the relative rates reported by Hess, Blake, and Blake (1994) for every substitution in the environment of each of the 16 neighbor pairs. As shown in figure 4, the influence of neighboring nucleotides does not account for much of the variation in substitution frequencies or for the limited component of asymmetry around the origin. Indeed, the average likelihoods of substitution, given the neighboring environment, are similar for complementary transversion pairs within and between the 5' and 3' regions and do not covary with the observed fij (Spearman's coefficient rs = 0.374, P = 0.148).
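One way to read the averaging step described above is as a mean of context-specific relative rates, weighted by how often each of the 16 neighbor pairs flanks the substituted base in the region. The sketch below follows that reading; it is an assumption about the computation, not a reproduction of it. The function names are hypothetical, and the rate table must be supplied by the user (the values in the example are placeholders, not the rates of Hess, Blake, and Blake).

```python
from collections import Counter

BASES = "ACGT"

def trinucleotide_counts(seq):
    """Count overlapping trinucleotides, skipping any with ambiguous bases."""
    seq = seq.upper()
    triplets = (seq[i:i + 3] for i in range(len(seq) - 2))
    return Counter(t for t in triplets if set(t) <= set(BASES))

def average_likelihood(seq, base, rates):
    """Average relative rate of one substitution type at `base` in `seq`.

    `rates` maps each (5' neighbor, 3' neighbor) pair to the relative rate
    of the substitution in that context. Each context is weighted by the
    number of trinucleotides in which `base` occupies the central position.
    """
    weighted_sum = 0.0
    total = 0
    for triplet, n in trinucleotide_counts(seq).items():
        if triplet[1] == base:
            weighted_sum += n * rates[(triplet[0], triplet[2])]
            total += n
    return weighted_sum / total if total else float("nan")

# Placeholder rates: 1.0 in every context except A_G, set to 3.0.
rates = {(left, right): 1.0 for left in BASES for right in BASES}
rates[("A", "G")] = 3.0
print(average_likelihood("ACGTCT", "C", rates))
```

Computing this average for each member of a complementary pair, on each side of the origin, gives the values that would then be compared with the observed substitution frequencies.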
Origin Specificity in Multicellular Eukaryotes
Although site-specific origins of replication in multicellular eukaryotes have been difficult to localize, about 20 sites of replication initiation have now been identified. In particular, the β-globin origin of replication considered in the present study has been both biochemically (Kitsberg et al. 1993) and genetically (Aladjem et al. 1998) defined and shown to operate in both erythroid and nonerythroid cells. Moreover, although replication initiates at apparently random sites in the rapidly dividing early embryos of Drosophila and Xenopus (Shinomiya and Ina 1991; Hyrien, Maric, and Mechali 1995), this is not known to be the case in mammals, whose early embryogenesis differs from that of flies and frogs in several respects. First, mammalian embryos divide at a much slower pace. For humans, the cleavage rate of embryos during their first week is only one per day (England 1994). Furthermore, whereas no zygotic transcription takes place in Drosophila and Xenopus, transcription has been shown to begin immediately after zygotic formation in mouse embryos. In addition, consistent with the idea that active transcription and higher-order chromatin structure are involved in establishing origin specificity (DePamphilis 1996; Françon, Maiorano, and Mechali 1999), replication in mouse embryos requires specific origin and enhancer sequences (Wirak et al. 1985; Martinez-Salas, Cupo, and DePamphilis 1988).
The β-globin origin can initiate replication in different cell types independently of the transcriptional status of the β-globin gene cluster (Kitsberg et al. 1993; Aladjem et al. 1995), as well as upon experimental introduction at ectopic locations in the genomes of simian cell lines (Aladjem et al. 1998). Therefore, recruitment of the β-globin origin does not seem to require a specific chromatin organization, suggesting that this origin may function under varied chromosomal environments. Hence, we assume in our analysis that the β-globin origin is being preferentially used during the evolutionarily relevant cell divisions taking place in the germ line and in early embryos. Nevertheless, the possibility remains that the lack of strand biases around the β-globin origin is due to the utilization of alternative origins during such cell divisions. In particular, the β-globin replication initiation region may contain, in addition to the replication origin considered in this analysis, one or more secondary sites at which replication initiates with low efficiency (Aladjem et al. 1998). However, the frequency of firing of these putative origins would probably be too low to affect sequence evolution, unless relative origin efficiencies change in the germ line and/or in early embryos. Moreover, none of the potential origins lies upstream of the 5' region analyzed in this study, and only sites with barely detectable activities reside downstream of the 3' region, so the leading/lagging strand assignments considered here are not likely to vary. Therefore, the similarity of the patterns of substitution around the β-globin origin is likely to reflect the lack of strand biases in the input of mutations in the chromosomes of primates.
Acknowledgements |
---|
Footnotes |
---|
1 Keywords: sequence evolution, substitutional asymmetry, strand bias, replication origin, primate globin genes
2 Address for correspondence and reprints: Howard Ochman, Department of Ecology and Evolutionary Biology, 233 Life Sciences South, University of Arizona, Tucson, Arizona 85721. E-mail: hochman{at}u.arizona.edu
Literature Cited |
---|
Aladjem, M. I., M. Groudine, L. L. Brody, E. S. Dieken, R. E. Keith Fournier, G. M. Wahl, and E. M. Epner. 1995. Participation of the human β-globin locus control region in initiation of DNA replication. Science 270:815–819.
Aladjem, M. I., L. W. Rodewald, J. L. Kolman, and G. M. Wahl. 1998. Genetic dissection of a mammalian replicator in the human β-globin locus. Science 281:1005–1009.
Brewer, B. J., and W. L. Fangman. 1993. Initiation at closely spaced replication origins in a yeast chromosome. Science 262:1728–1731.
Bulmer, M. 1991. Strand symmetry of mutation rates in the β-globin region. J. Mol. Evol. 33:305–310.
Burgers, P. M. 1998. Eukaryotic DNA polymerases in DNA replication and DNA repair. Chromosoma 107:218–227.
DePamphilis, M. L. 1996. Origins of DNA replication. Pp. 45–86 in M. L. DePamphilis, ed. DNA replication in eukaryotic cells. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Devereux, J., P. Haberli, and O. Smithies. 1984. A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12:387–395.
England, M. A. 1994. The human. Pp. 207–220 in J. Bard, ed. Embryos. Color atlas of development. Wolfe, London.
Fijalkowska, I. J., P. Jonczyk, M. M. Tkaczyk, M. Bialoskorska, and R. M. Schaaper. 1998. Unequal fidelity of leading strand and lagging strand DNA replication on the Escherichia coli chromosome. Proc. Natl. Acad. Sci. USA 95:10020–10025.
Fleagle, J. G. 1988. Primate adaptation and evolution. Academic Press, San Diego.
Francino, M. P., and H. Ochman. 1997. Strand asymmetries in DNA evolution. Trends Genet. 13:240–245.
Francino, M. P., and H. Ochman. 1999. A comparative genomics approach to DNA asymmetry. Ann. N.Y. Acad. Sci. 870:428–431.
Françon, P., D. Maiorano, and M. Mechali. 1999. Initiation of DNA replication in eukaryotes: questioning the origin. FEBS Lett. 452:87–91.
Gingerich, P. D. 1984. Primate evolution: evidence from the fossil record, comparative morphology, and molecular biology. Yearb. Phys. Anthropol. 27:57–72.
Gojobori, T., W.-H. Li, and D. Graur. 1982. Patterns of nucleotide substitution in pseudogenes and functional genes. J. Mol. Evol. 18:360–369.
Hess, S. T., J. D. Blake, and R. D. Blake. 1994. Wide variations in neighbor-dependent substitution rates. J. Mol. Biol. 236:1022–1033.
Holmes, S. E., B. A. Dombroski, C. M. Krebs, C. D. Boehm, and H. H. Kazazian Jr. 1994. A new retrotransposable human L1 element from locus LRE2 on chromosome 1q produces a chimaeric insertion. Nat. Genet. 7:143–148.
Hyrien, O., C. Maric, and M. Mechali. 1995. Transition in specification of embryonic metazoan DNA replication origins. Science 270:994–997.
Imanishi, T., and T. Gojobori. 1992. Patterns of nucleotide substitutions inferred from the phylogenies of the class I major histocompatibility complex genes. J. Mol. Evol. 35:196–204.
Izuta, S., J. D. Roberts, and T. A. Kunkel. 1995. Replication error rates for G.dGTP, T.dGTP, and A.dGTP mispairs and evidence for differential proofreading by leading and lagging strand DNA replication complexes in human cells. J. Biol. Chem. 270:2595–2600.
Karlin, S., A. M. Campbell, and J. Mrazek. 1998. Comparative DNA analysis across diverse genomes. Annu. Rev. Genet. 32:185–225.
Kitsberg, D., S. Selig, I. Keshet, and H. Cedar. 1993. Replication structure of the human β-globin gene domain. Nature 366:588–590.
Kornberg, A., and T. Baker. 1992. DNA replication. Freeman, New York.
Kunkel, T. A. 1992. Biological asymmetries and the fidelity of eukaryotic DNA replication. Bioessays 14:303–308.
Lobry, J. R. 1996. Asymmetric substitution patterns in the two DNA strands of bacteria. Mol. Biol. Evol. 13:660–665.
McNaughton, J. C., G. Hughes, W. A. Jones, P. A. Stockwell, H. J. Klamut, and G. B. Petersen. 1997. The evolution of an intron: analysis of a long, deletion-prone intron in the human dystrophin gene. Genomics 40:294–304.
Maddison, W. P., and D. R. Maddison. 1992. MacClade, Version 3.0. Analysis of phylogeny and character evolution. Sinauer, Sunderland, Mass.
Martinez-Salas, E., D. Y. Cupo, and M. L. DePamphilis. 1988. The need for enhancers is acquired upon formation of a diploid nucleus during early mouse development. Genes Devel. 2:1115–1126.
Miki, Y., I. Nishisho, A. Horii, Y. Miyoshi, J. Utsonomiya, K. W. Kinzler, B. Vogelstein, and Y. Nakamura. 1992. Disruption of the APC gene by a retrotransposal insertion of L1 sequence in a colon cancer. Cancer Res. 52:643–645.
Miret, J. J., L. Pessoa-Brandao, and R. S. Lahue. 1998. Orientation-dependent and sequence-specific expansions of CTG/CAG trinucleotide repeats in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 95:12438–12443.
Mrazek, J., and S. Karlin. 1998. Strand compositional asymmetry in bacterial and large viral genomes. Proc. Natl. Acad. Sci. USA 95:3720–3725.
Perrin-Pecontal, P., M. Gouy, V.-M. Nigon, and G. Trabuchet. 1992. Evolution of the primate β-globin gene region: nucleotide sequence of the δ-β-globin intergenic region of gorilla and phylogenetic relationships between African apes and man. J. Mol. Evol. 34:17–30.
Roberts, J. D., S. Izuta, D. C. Thomas, and T. A. Kunkel. 1994. Mispair-, site-, and strand-specific error rates during simian virus 40 origin-dependent replication in vitro with excess deoxythymidine triphosphate. J. Biol. Chem. 269:1711–1717.
Rosche, W. A., L. S. Ripley, and R. R. Sinden. 1998. Primer-template misalignments during leading strand DNA synthesis account for the most frequent spontaneous mutations in a quasipalindromic region in Escherichia coli. J. Mol. Biol. 284:633–646.
Rosche, W. A., T. Q. Trinh, and R. R. Sinden. 1995. Differential DNA secondary structure-mediated deletion mutation in the leading and lagging strands. J. Bacteriol. 177:4385–4391.
Rozmahel, R., H. H. Heng, A. M. Duncan, X. M. Shi, J. M. Rommens, and L. C. Tsui. 1997. Amplification of CFTR exon 9 sequences to multiple locations in the human genome. Genomics 45:554–561.
Schneider, H., M. P. C. Schneider, I. Sampaio, M. L. Harada, M. Stanhope, J. Czelusniak, and M. Goodman. 1993. Molecular phylogeny of the New World monkeys (Platyrrhini, primates). Mol. Phylogenet. Evol. 2:225–242.
Shinomiya, T., and S. Ina. 1991. Analysis of chromosomal replicons in early embryos of Drosophila melanogaster by two-dimensional gel electrophoresis. Nucleic Acids Res. 19:3935–3941.
Smithies, O., W. R. Engels, J. R. Devereux, J. L. Slightom, and S.-H. Shen. 1981. Base substitutions, length differences, and DNA strand asymmetries in the human Gγ and Aγ fetal globin gene region. Cell 26:345–353.
Sokal, R. R., and F. J. Rohlf. 1995. Biometry. Freeman, New York.
Sueoka, N. 1995. Intrastrand parity rules of DNA base composition and usage biases of synonymous codons. J. Mol. Evol. 40:318–325.
Thomas, D. C., D. L. Svoboda, J. M. Vos, and T. A. Kunkel. 1996. Strand specificity of mutagenic bypass replication of DNA containing psoralen monoadducts in a human cell extract. Mol. Cell. Biol. 16:2537–2544.
Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The CLUSTAL-X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876–4882.
Trinh, T. Q., and R. R. Sinden. 1991. Preferential DNA secondary structure mutagenesis in the lagging strand of replication in E. coli. Nature 352:544–547.
Veaute, X., and R. P. P. Fuchs. 1993. Greater susceptibility to mutations in lagging strand of DNA replication in Escherichia coli than in leading strand. Science 261:598–600.
Wirak, D. O., L. E. Chalifour, P. M. Wassarman, W. J. Muller, J. A. Hassell, and M. L. DePamphilis. 1985. Sequence-dependent DNA replication in preimplantation mouse embryos. Mol. Cell. Biol. 160:73–84.
Wu, C.-I., and N. Maeda. 1987. Inequality in mutation rates of the two strands of DNA. Nature 327:169–170.