Intervening Sequences in Paralogous Genes: A Comparative Genomic Approach to Study the Evolution of X Chromosome Introns

Barbara Cardazzo*, Luca Bargelloni†, Luisa Toffolatti* and Tomaso Patarnello*,†

* Dipartimento di Biologia
† Facoltà di Medicina Veterinaria, Campus di Agripolis, Università di Padova, Padova, Italy

Correspondence: E-mail: tomaso.patarnello@unipd.it.


    Abstract
 
The enlargement of genome size and the decrease in genome compactness, with an increase in the number and size of introns, is a general pattern during the evolution of eukaryotes. Among the possible mechanisms for modifying intron size, it has been suggested that the insertion of transposable elements might have an important role in driving intron evolution. The analysis of large portions of the human genome demonstrated that a relatively recent (50 to 100 MYA) accumulation of transposable elements appears to be biased, favoring a preferential insertion of LINE1 transposons into sex chromosomes rather than into autosomes. In the present work, the effect of chromosomal location on the increase in size of introns was evaluated with a comparative analysis performed on pairs of human paralogous genes, one located on the X chromosome and the second on an autosome. A phylogenetic analysis was also performed on the X-encoded proteins and their paralogs to confirm orthology-paralogy relationships and to estimate the approximate time of gene duplication. Statistical analysis of total intron length for each pair of paralogous genes provided no evidence for a larger size of introns in the gene copies located on the X chromosome. On the contrary, introns of autosomal genes were found to be significantly longer than introns of their X-linked paralogs. Likewise, LINE1 elements were not significantly more frequent in X-chromosome introns, whereas the frequency of SINE elements showed a marginally significant bias toward autosomal introns.

Key Words: intron size • X chromosome • genome evolution • repeat elements


    Introduction
 
The complete sequence of the human genome (Lander et al. 2001) has made available for the first time the sequence of large introns, raising new questions on the significance of the extreme increase in intron size that occurred during the evolution of eukaryotes (Vinogradov 1999). Whole-genome analyses in several species of protists, plants, fungi, and animals indicate that the increase in genome size is paralleled by a general decrease in genome compactness and that an increase in the number and size of introns is a general pattern during the evolution of eukaryotic genomes (Patthy 1999). With regard to vertebrate taxa, however, the evolution of intron size remains largely unexplored. The issue of the evolutionary dynamics of intron length is not merely speculative and appears especially relevant in light of the increasing evidence for an active function of intronic sequences in the modulation of gene expression, alternative splicing, and chromatin structure (Hardison 2000; Duret 2001).

Among the possible mechanisms for modifying intron size, insertion of transposable elements (TEs) might have an important role in driving intron evolution. Incidentally, the activity of TEs has been proposed as a major driving force toward size change not only of intron sequence but also of the entire genome, to explain the so-called C-value paradox (for a detailed discussion on alternative hypotheses on the C-value enigma, see Petrov 2001). The human genome, for instance, contains a large number of repeated sequences that account for nearly 50% of the total genome (Lander et al. 2001). Analysis of age distribution in the human genome indicates that interspersed sequences have not accumulated at a constant rate through evolutionary time and that accumulation has slowed down in the past 50 million years (Myr). TEs are divided into four types, three of which transpose through an RNA intermediate (long interspersed elements [LINEs], short interspersed elements [SINEs], and LTR retrotransposons) and account for the vast majority of interspersed repeats in the human genome. The fourth type transposes directly as DNA (DNA transposons) and represents only 3% of the genome (Kidwell and Lisch 2001; Lander et al. 2001). Similar to other noncoding sequences, intronic regions in the human genome are indeed rich in TEs, with tens to hundreds of interspersed repeats per Mb (Smit 1999), depending on the class of TEs. Furthermore, repeat density in human introns (number of repeats per intron length unit) appears to be positively correlated with intron size (Vinogradov 2002), suggesting either that integration of transposons occurs more frequently into longer introns or that some introns became longer than others because of the insertion of repeated elements. In an earlier study focused on the evolution of intron 7 in the Duchenne Muscular Dystrophy (DMD) gene, it was reported that recognizable inserted sequences accounted for more than 40% of the 110-kb intron sequence (McNaughton et al. 1997). The human DMD gene is the longest human gene described so far. Spanning over 2.5 Mb, it is located on the X chromosome and comprises 79 exons and 78 introns. Dating LINE (L1) and SINE (Alu) expansions in intron 7 indicated that this intron has approximately doubled its size within the past 130 Myr, mainly as a consequence of repeat element insertion (McNaughton et al. 1997). A recent study on the genomic structure of the human DMD gene and its paralog utrophin (UTRN), which originated from a duplication event that occurred in the vertebrate ancestor (Roberts and Bobrow 1998), confirmed that gradual accumulation of repeated elements, especially in the past 130 Myr, might be regarded as a convincing hypothesis to explain intron expansion at least in these two genes (Pozzoli et al. 2002). Several waves of retrotransposon expansion have characterized the mammalian genome within the past 150 Myr (McNaughton et al. 1997). The expansion of DNA transposons and older LINE elements predated the eutherian radiation, whereas SINE and younger LINE families invaded mammalian genomes in the past 80 to 40 Myr. The analysis of several Mb of the human genome demonstrated that the more recent accumulation of TEs is biased, favoring a preferential insertion of transposons into sex chromosomes rather than into autosomes (Bailey et al. 2000; Erlandsson, Wilson, and Paabo 2000). In fact, long elements (LINE and LTR) were found at much higher frequency in genomic sequences of the X chromosome than in non-X chromosomes. This bias was particularly evident in the case of LINE1, the frequency of which is double in the X chromosome (26.50% compared with 13.43%) (Bailey et al. 2000).

In light of the above evidence, the present work was aimed at evaluating the effect of chromosomal location on the insertion of TEs in intronic regions. We initially considered the case of the X-chromosome DMD gene, the introns of which are much longer than the corresponding introns of its autosomal homolog UTRN. If the increase in intron size were associated with the accumulation of repeat elements, then we would expect the difference in intron length between X-linked and autosomal genes to be essentially due to an increased frequency of mobile elements. To this end, pairs of human paralogous genes, one located on the X chromosome and the other located on an autosome, were first selected based on sequence similarity scores. Then the orthologs of these sequences were identified in the genomes of mouse (Mus musculus), pufferfish (Fugu rubripes), and fruit fly (Drosophila melanogaster). Only genes on the human X chromosome having their ortholog on the murine X chromosome were then selected, to have genes that presumably were on the ancestral mammalian X chromosome. Phylogenetic analyses of all selected genes were then performed to establish relationships of homology and to infer the time of gene duplication. For these genes, length and structure of introns were compared to test whether the preferential insertion of TEs might have been responsible for a bias in intron size.


    Materials and Methods
 
A first selection of genes was carried out using the "Proteome analysis @EBI," a database that provides a list of protein sequences, divided according to the chromosomal location of the corresponding genes, based on the predicted proteomes of fully sequenced organisms (http://www.ebi.ac.uk/proteome/home.html [Apweiler et al. 2001]). A second step was to identify, among the proteins included in the X chromosome Gene Set of the Proteome analysis @EBI database, those that were also present in a second database (Proteome Database, http://www.incyte.com/sequence/proteome/index.shtml [Hodges et al. 2002]), which contains a compilation of all available information for each of the listed proteins and, in particular, annotated results from extensive similarity searches against all available proteomes. The combined use of the above databases provided a set of human protein-coding genes located on the X chromosome, as well as their putative closest homologs (based on protein sequence similarity) from the human, murine, and fruit fly proteomes, respectively. A cut-off value of four paralogs (one on the X chromosome) in a single vertebrate species was enforced based on the hypothesis that two rounds of whole-genome duplications occurred in an early vertebrate ancestor. In addition, all vertebrate putative paralogs should have the same putative homolog in the fruit fly genome, based on similarity scores. When two or more fruit fly putative homologs showed close values of sequence similarity (less than 5% difference), all of these fruit fly genes were included in the phylogenetic analysis, and a single fruit fly homolog was then selected. The chromosomal position of murine homologs was determined using the human-mouse homology map (http://www.ncbi.nlm.nih.gov/Homology/). Only genes on the human X chromosome having their ortholog on the murine X chromosome were then selected, to have genes that presumably were on the ancestral mammalian X chromosome.
Genomic sequence data for the selected human and mouse genes were found at the Genome Browser site, which contains a working draft of both the human genome and the mouse genome (http://genome.ucsc.edu/; human: April 2001 version, mouse: February 2002 draft version). Information retrieved from the Human and Mouse Genome Browsers was compared with curated sequences and descriptive information about single genetic loci from LocusLink at the NCBI (http://www.ncbi.nlm.nih.gov/LocusLink/ [Pruitt and Maglott 2001]).

Putative homologs from the pufferfish Fugu rubripes were then obtained by searching the fugu proteome predicted from the complete sequence of the fugu genome, which is available at http://www.jgi.doe.gov/fugu/ (version 1.0). Database searches were performed with the BlastP program at http://bahama.jgi-psf.org/fugu/bin/blast.fugu.cgi using default parameters.

Each group of homologous protein sequences was aligned using the program ClustalX (Thompson et al. 1997). Based on the multiple alignment obtained from ClustalX, pairwise amino acid distances were calculated applying a Poisson correction for multiple hits. Positions with gaps were excluded either in pairwise comparisons or in multiple comparisons. For each homology group, a Neighbor-Joining (NJ) approach was used to reconstruct phylogenetic relationships between sequences based on the obtained distance matrix. Node robustness was assessed using 1,000 bootstrap replicates. All phylogenetic analyses were performed using the program MEGA version 2.1 (Kumar et al. 2001).
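The Poisson correction mentioned above converts the observed proportion p of differing amino acid sites into an estimated number of substitutions per site, d = -ln(1 - p). The following is a minimal Python sketch of that calculation, not the MEGA implementation. It assumes that gapped positions are dropped pair by pair (pairwise deletion), and the function names and toy sequences are hypothetical.

```python
import math

GAP = "-"  # gap symbol assumed in the aligned sequences


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected amino acid distance between two aligned sequences.

    Columns where either sequence has a gap are skipped (pairwise deletion,
    an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != GAP and b != GAP]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    # p is the observed proportion of sites that differ.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)


def distance_matrix(alignment):
    """Pairwise distances for a dict mapping sequence name to aligned sequence."""
    names = sorted(alignment)
    return {
        (x, y): poisson_distance(alignment[x], alignment[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Hypothetical toy alignment, for illustration only.
toy = {"human_X": "MKTAYIAK", "human_A": "MKTSYIAR", "fugu": "MRT-YLAR"}
matrix = distance_matrix(toy)
```

A matrix of this kind is the input that a Neighbor-Joining procedure would take; tree building and bootstrapping are left out of the sketch.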

Based on the phylogenetic information, homology groups were further selected to obtain a final set of genes, which included only paralogs whose duplication predated the divergence between the human and mouse lineages. For each gene in this data set, number and length of introns were calculated based on genomic sequence data retrieved from human and mouse genome browsers. Moreover, a search and classification of repeated elements in intervening sequences was performed using the program RepeatMasker (http://repeatmasker.genome.washington.edu/cgi-bin/RepeatMasker), under default settings.
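As an illustration of how intron number, total intron length, and repeat content can be derived from annotation coordinates, the sketch below works from a list of coding-exon intervals and a list of repeat intervals. It is a hedged example rather than the procedure used in the study: the 0-based half-open coordinate convention, the function names, and the toy coordinates are all assumptions, and repeat intervals are assumed to lie entirely within introns.

```python
def intron_lengths(exons):
    """Lengths of the introns lying between consecutive coding exons.

    `exons` is a list of (start, end) pairs in 0-based, half-open coordinates
    (an assumed convention for this sketch).
    """
    ordered = sorted(exons)
    lengths = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start < prev_end:
            raise ValueError("exons overlap")
        lengths.append(next_start - prev_end)
    return lengths


def covered_bases(intervals):
    """Number of bases covered by a set of intervals, merging any overlaps."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def repeat_percentage(total_intron_length, repeat_intervals):
    """Percentage of intron sequence occupied by annotated repeats."""
    if total_intron_length <= 0:
        raise ValueError("total intron length must be positive")
    return 100.0 * covered_bases(repeat_intervals) / total_intron_length


# Hypothetical coordinates, for illustration only.
exons = [(100, 200), (500, 650), (1650, 1700)]
repeats = [(250, 400), (350, 450), (700, 1000)]
introns = intron_lengths(exons)
total_length = sum(introns)
percent_repeats = repeat_percentage(total_length, repeats)
```

Merging overlapping repeat intervals before summing avoids counting the same base twice when two annotations overlap.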

Size of intervening sequences (total intron length) appeared not to be normally distributed when applying a Kolmogorov-Smirnov test. For this reason, log-transformed values for total intron length (which followed a normal distribution) were used. A paired sample test was applied to compare intron size between pairs of paralogous genes. The frequency of mobile elements (SINE and LINE) in the intervening sequences of X-linked and autosomal genes, respectively, was compared after arcsine transformation of percentage values for each class of elements. As above, a paired sample test was performed after testing for normal distribution of transformed frequency values. To assess the correlation between the difference in size between human X-linked and autosomal paralogs and the difference in size between the corresponding pair of genes in the mouse genome, a sign-correlation test was applied.
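The transformations and the paired comparison described above can be sketched in a few lines of Python. This sketch rests on two assumptions that the text does not spell out: that the paired sample test is a paired t-test, and that the arcsine transformation is the usual arcsine of the square root of the proportion. The function names and the toy intron lengths are hypothetical. Only the t statistic and its degrees of freedom are computed; the P value would be read from a t distribution.

```python
import math
import statistics


def log_lengths(lengths):
    """Log10-transform total intron lengths (in bp)."""
    return [math.log10(v) for v in lengths]


def arcsine_transform(percent):
    """Arcsine square-root transform of a percentage (0 to 100), in radians."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must lie between 0 and 100")
    return math.asin(math.sqrt(percent / 100.0))


def paired_t_statistic(x, y):
    """t statistic and degrees of freedom for paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_d == 0:
        raise ValueError("all paired differences are identical")
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical total intron lengths (bp) for three paralogous pairs.
x_linked = [1_000, 10_000, 100_000]
autosomal = [10_000, 1_000_000, 1_000_000]
t_value, dof = paired_t_statistic(log_lengths(x_linked), log_lengths(autosomal))
```

A negative t value in this layout means that the X-linked copies tend to have shorter total intron length than their autosomal paralogs.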


    Results and Discussion
 
According to the Proteome analysis @EBI database, a total of 585 human proteins were encoded by genes located on the X chromosome. For 435 proteins, a specific entry was present in the Proteome database, which provides detailed results from extensive comparisons among proteomes of model organisms. On the basis of sequence information, protein-coding genes were selected under the following criteria: (1) for each gene located on the human X chromosome, the corresponding protein should have at least one paralog located on any human autosome; (2) all human paralogs should have the same putative homolog in D. melanogaster (as judged based on similarity values); and (3) in dubious cases (two or more fruit fly genes showing close similarity values), all Drosophila putative homologous genes were included in the phylogenetic analysis.

A further criterion was the chromosomal location of the murine orthologs of the selected human genes. In fact, we were particularly interested in investigating the expansion of LINE1 elements that occurred in the past 100 Myr in the X chromosome, since it appears that the probability of a LINE1 insertion is double on the X chromosome as compared with any autosome (Bailey et al. 2000). For this reason, it was important to select genes that were already on the X chromosome before the eutherian radiation. Thus, among all human genes located on the X chromosome, we chose only those genes that, on the basis of the human-mouse Homology Map (http://www.ncbi.nlm.nih.gov/Homology/), have their putative murine ortholog on the mouse X chromosome.

At the end of this selection, 47 pairs of human paralogous genes were found to match all the criteria described above. For each selected pair, all available sequence information was retrieved from the Human Genome Browser (see Materials and Methods), with the exception of five gene pairs for which the complete genomic sequence was not available. Another five pairs were excluded because one or both genes did not contain introns. For each of the remaining 37 pairs of genes, a similarity search on the fugu proteome database was carried out to identify putative protein homologs of the teleost species F. rubripes (a complete list of the accession numbers for each fugu sequence is available in table 1 of Supplementary Material online). Putative protein homologs from the pufferfish genome were included to perform a phylogenetic analysis for each group of paralogy. The rationale for such a phylogenetic analysis was twofold. The first goal was to confirm relationships of orthology-paralogy for each group of genes, because homology among protein sequences is usually best defined when, in addition to calculating pairwise similarity scores, phylogenetic relationships are reconstructed in the form of a phylogenetic tree. Second, the construction of a gene tree was carried out to determine whether the gene duplication event that gave origin to the putative human paralogs in each paralogy group occurred before the separation of the ray-finned fish (Actinopterygii) and the lobe-finned fish (Sarcopterygii) (450 MYA) or whether the duplication event took place only in the tetrapod lineage. The results of the phylogenetic analysis can be summarized as follows. For five groups of genes, the phylogenetic position of the fruit fly gene(s) in the tree was not compatible with the criterion of a duplication having occurred after the separation between arthropod and vertebrate ancestors. For the remaining 32 groups of sequences, different tree topologies were obtained.
In 14 groups of paralogy, one pufferfish ortholog was present for each of the two human paralogs. Four groups of genes showed more than two fugu proteins, whereas the remaining 14 groups displayed either only one fugu ortholog or a single fugu ortholog with a second fugu protein branching out earlier than any other sequence homolog. Despite these differences, all observed tree topologies could be reconciled with the tree topology depicted in figure 1; therefore, the origin of all selected human paralogs (32) could be traced back to gene duplication in the common ancestor of teleosts and tetrapods. No recently duplicated genes (after the tetrapod-teleost split but before human and mouse divergence) were observed in our study. The hypothesis of whole-genome duplications in an early vertebrate ancestor as a major driving force of vertebrate genome evolution is currently the subject of animated discussions (Meyer and Schartl 1999). Although limited in terms of selection criteria and number of genes, the results of the present paper appear in agreement with a recent study based on large-scale genome analyses (McLysaght, Hokamp, and Wolfe 2002), which supported the contention that many of the gene families in vertebrates were formed or expanded by a large-scale DNA duplication in an early vertebrate. In a recent paper, however, Friedman and Hughes (2003) have questioned the existence of a peak of gene duplication before the amniote-amphibian separation and suggested that most duplications occur as tandem duplications within a single chromosome and that in the case of recent events, duplicated copies are more often found still on the same chromosome. 
Therefore, the bias toward early duplicated genes observed in our study might also be explained as a consequence of our selection of paralogous genes on distinct chromosomes, thereby excluding the majority of recently duplicated genes, since most of them are expected to be found on a single chromosome (Friedman and Hughes 2003).


 
Table 1 The 32 Selected Chromosome X Genes (X Genes) and Their Autosomal Paralogs (A Genes).

 


 
FIG. 1. Tree topology of the analyzed genes as expected under the hypothesis of a gene duplication that occurred before the divergence of teleosts and tetrapods. human = Homo sapiens; mouse = Mus musculus; fugu = Fugu rubripes; fruitfly = Drosophila melanogaster.

 
The final gene set (32 paralogous pairs) was characterized in terms of total genomic size, coding sequence size, and exon-intron structure (table 1). To increase comparability, only introns delimited by coding exons were considered. The exon-intron structure was similar for each human paralogous pair, suggesting that most introns predated the duplication event that produced the paralogous genes, and that intron positions were already determined early in the evolution of vertebrates. Indeed, a comparison of exon-intron structure between human-mouse orthologs revealed minor differences (data not shown; Mural et al. 2002), and similar evidence was obtained when comparing mammalian and fish orthologous genes (Venkatesh, Ning, and Brenner 1999).

Among the 32 selected human paralog pairs, two genes showed extremely large intervening sequences (more than 1 Mb): the DMD gene, located on the X chromosome, and the GPC6 gene, located on chromosome 13. For this reason, all the statistical analyses were performed either including or excluding these two genes and their corresponding paralogous copies (UTRN and GPC4, respectively). Irrespective of the inclusion or exclusion of these large genes, statistical analysis of total intron length for each pair of paralogous genes demonstrated that autosomal gene copies contain significantly larger introns (P = 0.023). In particular, 18 comparisons out of 30 (or 19 out of 32 if including DMD/UTRN and GPC4/GPC6) revealed that the intronic regions of autosomal genes are larger than the corresponding intervening sequences of their X-chromosome paralogs. This result is in contrast to the initial hypothesis of larger introns in X-chromosome genes compared with their autosomal paralogs. Different lines of evidence contributed to the formulation of that hypothesis: first, the difference in size between the DMD and the UTRN genes (McNaughton et al. 1997; Pozzoli et al. 2002); second, the possible inverse correlation between intron length and recombination rate, in consideration of the lower recombination frequency in sex chromosomes (Comeron and Kreitman 2000; Duret 2001); third, the relevant impact of TEs on introns (Duret 2001), coupled with the much higher rate of insertion of longer mobile elements into the X chromosome compared with autosomal genomic regions (Bailey et al. 2000; Erlandsson, Wilson, and Paabo 2000), especially with regard to LINE1 retrotransposons.

On the basis of the results obtained in the present study, the extreme difference in length between the DMD and the UTRN genes is not a general trend for pairs of sex chromosome-autosome paralogs. Likewise, intervening sequences in sex-linked genes show no evidence of the supposed effect of lower recombination in sex chromosomes. Indeed, the correlation between intron length and recombination rate was established for introns less than 100 bp in length, whereas for larger introns, there was no significant correlation (Carvalho and Clark 1999; Duret 2001), and almost all introns examined in this paper were considerably larger than 100 bp.

Finally, the massive insertion of LINE1 elements that occurred preferentially in the X chromosome had apparently no major influence on differences in intron size of paralogous genes. On the contrary, statistically significant evidence was found for longer intervening sequences in autosomal genes. To further evaluate this result, intronic regions of all selected genes were analyzed for the presence of mobile elements. Percent values of total repeat sequences, as well as values of the four principal classes of transposons, are reported in table 2 for each gene. Statistical analysis of the frequency of the different classes showed a marginally significant enrichment in SINE elements (P = 0.05) in the introns of autosomal genes compared with their X-linked paralogs. This evidence is in agreement with the results of a previous study in which both intergenic and intragenic noncoding regions were considered (Bailey et al. 2000). In the same study, Bailey and colleagues demonstrated a highly significant enrichment in LINE1 and LTR elements in X chromosome noncoding sequences, with LINE1s having a frequency nearly double that of autosomal sequences. Statistical analysis of composition in LINE1 in the intronic sequences of our data set showed a slightly higher frequency of this class of mobile elements in the X-linked genes, but the difference is not statistically significant (P > 0.66; when excluding DMD/UTRN and GPC4/GPC6, P > 0.73). A likely explanation for the observed discrepancy is that intergenic regions could be more frequently the target of insertion of the long LINE1 and LTR elements (3,000 to 5,000 bp) than intron sequences as a consequence of more relaxed functional constraints in size and composition of intergenic regions (Smit 1999). These functional constraints might be less important for the insertion of the short SINE (200 to 300 bp) sequences. 
On the other hand, a lower frequency of SINE elements was found in the introns of X-linked genes compared with their autosomal paralogs, in parallel with the observed bias toward longer autosomal introns (see above). The difference in SINE content is only marginally significant, however, and further evidence is required to support the hypothesis of a role of SINEs in shaping intron size.


 
Table 2 Mobile Elements in the 32 Pairs of Human Paralogs (Chromosome X Genes and Autosomal Genes).

 
An additional reason for the discrepancy with the results of Bailey and colleagues (2000) could be the heterogeneous distribution of the LINE1 elements on the X chromosome, in association with different cytogenetic bands (Bailey et al. 2000). In that case, the authors concluded that the overabundance of LINE1 elements in some X chromosome regions and the distribution pattern of these elements were related to a functional role in X inactivation. Although this hypothesis has recently been called into question (Chureau et al. 2002), a similar heterogeneity was observed in the present work for the genes located in the q23 (facl4 and trpc5) and p11.21 (alas2 and pfkfb1) cytogenetic bands. The intervening sequences of these four genes revealed percent values of LINE1 elements two times to over 40 times higher than those of their autosomal paralogs. Quite remarkably, in this case, all four X-linked copies had longer intervening sequences than their paralogous copies. Therefore, it is possible that a recent expansion of LINE1 elements did play a role in determining intron length, but limited to specific genomic regions.

A closer examination of single comparisons between paralogous genes also revealed that in several cases, introns were 10 times larger in one paralog compared with the other (table 1). These large differences were often correlated with one gene showing extremely short introns with a low proportion or even absence of repeated elements in the shortest intervening sequences (table 2). This might reflect different constraints on intron size and composition between paralogous genes. It is possible, however, that once a relative difference in size between introns of duplicated genes is established by chance, if TEs transpose more frequently into large introns, such insertional bias might increase the initial size difference. As mentioned in the introduction, some introns might be longer either because longer introns more frequently accommodate new TEs or because functional constraints (e.g., size or composition) are more relaxed, allowing a larger number of TEs to be inserted. Some evidence in favor of the latter hypothesis comes from the analysis of intron size in the corresponding mouse paralogous pairs (28 gene pairs [table 3]). In 25 out of 28 comparisons, the sign of the difference between sex-linked and autosomal copies was the same as in the homologous human pair (sign-correlation test P < 0.0001). After inheriting introns of similar size from their common ancestor, the human and mouse genomes have evolved separately since the divergence of the two species 80 to 100 MYA. Moreover, a comparative study on the whole mouse chromosome 16 (Mural et al. 2002) as well as the analysis of introns of human-mouse orthologs examined in the present study (data not shown) indicate that the majority of TEs present in the intergenic and intronic regions of human and mouse genomes are different. Despite 100 Myr of independent evolution and although most of the TE insertions occurred after the eutherian radiation, we observed a strong correlation in intron size between orthologous genes.
This evidence suggests the presence of parallel functional constraints on intron evolution that have independently maintained the intron length at a similar size in different genomes.
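The agreement in sign reported above (25 of 28 mouse pairs matching the human pairs) can be checked against chance with an exact binomial calculation. The sketch below assumes a two-sided sign test with a null probability of 0.5 for agreement, which may differ in detail from the sign-correlation test applied in the study; the function name is hypothetical.

```python
from math import comb


def sign_test_p_value(concordant, total):
    """Two-sided exact binomial P value for `concordant` agreements out of
    `total` comparisons, under the null hypothesis that agreement and
    disagreement are equally likely (an assumption of this sketch)."""
    if total <= 0 or not 0 <= concordant <= total:
        raise ValueError("need 0 <= concordant <= total and total > 0")
    k = max(concordant, total - concordant)
    # Probability of observing k or more agreements in one tail.
    tail = sum(comb(total, i) for i in range(k, total + 1)) / 2 ** total
    return min(1.0, 2 * tail)


# Counts taken from the text: 25 concordant signs out of 28 comparisons.
p_value = sign_test_p_value(25, 28)
```

Under these assumptions the P value is about 2.7e-05, which is in line with the reported P < 0.0001.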


 
Table 3 Comparison of Human and Mouse Genes.

 
In conclusion, although it cannot be excluded that major expansions of TEs might have affected intron size, our comparative analysis of homologous genes indicates that the effect of these expansions was modulated by other factors, such as regional chromosomal location and functional constraints.


    Acknowledgements
 
We thank Andrea Pilastro for his help with statistical analyses. This work was partially supported by a grant from the European Union (FINGER, QLG2-CT-1999–00920) to T.P.


    Footnotes
 
Axel Meyer, Associate Editor


    Literature Cited
 

    Apweiler, R., M. Biswas, and W. Fleischmann, et al. (11 co-authors). 2001. Proteome Analysis Database: online application of InterPro and CluSTr for the functional classification of proteins in whole genomes. Nucleic Acids Res. 29:44-48.

    Bailey, J. A., L. Carrel, A. Chakravarti, and E. E. Eichler. 2000. Molecular evidence for a relationship between LINE-1 elements and X chromosome inactivation: the Lyon repeat hypothesis. Proc. Natl. Acad. Sci. USA 97:6634-6639.

    Carvalho, A. B., and A. G. Clark. 1999. Intron size and natural selection. Nature 401:344.

    Chureau, C., M. Prissette, A. Bourdet, V. Barbe, L. Cattolico, L. Jones, A. Eggen, P. Avner, and L. Duret. 2002. Comparative sequence analysis of the X-inactivation center region in mouse, human, and bovine. Genome Res. 12:894-908.

    Comeron, J. M., and M. Kreitman. 2000. The correlation between intron length and recombination in Drosophila: dynamic equilibrium between mutational and selective forces. Genetics 156:1175-1190.

    Duret, L. 2001. Why do genes have introns? Recombination might add a new piece to the puzzle. Trends Genet. 17:172-175.

    Erlandsson, R., J. F. Wilson, and S. Paabo. 2000. Sex chromosomal transposable element accumulation and male-driven substitutional evolution in humans. Mol. Biol. Evol. 17:804-812.

    Friedman, R., and A. L. Hughes. 2003. The temporal distribution of gene duplication events in a set of highly conserved human gene families. Mol. Biol. Evol. 20:154-161.

    Hardison, R. C. 2000. Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet. 16:369-372.

    Hodges, P. E., P. M. Carrico, J. D. Hogan, K. E. O'Neill, J. J. Owen, M. Mangan, B. P. Davis, J. E. Brooks, and J. I. Garrels. 2002. Annotating the human proteome: the Human Proteome Survey Database (HumanPSD™) and an in-depth target database for G protein-coupled receptors (GPCR-PD™) from Incyte Genomics. Nucleic Acids Res. 30:137-141.

    Kidwell, M. G., and D. M. Lisch. 2001. Transposable elements, parasitic DNA and genome evolution. Evolution 55:1-24.

    Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244-1245.

    Lander, E. S., L. M. Linton, and B. Birren, et al. (271 co-authors). 2001. Initial sequencing and analysis of the human genome. Nature 409:860-921.

    McLysaght, A., K. Hokamp, and K. H. Wolfe. 2002. Extensive genomic duplication during early chordate evolution. Nat. Genet. 31:200-204.

    McNaughton, J. C., G. Hughes, W. A. Jones, P. A. Stockwell, H. J. Klamut, and G. B. Petersen. 1997. The evolution of an intron: analysis of a long, deletion-prone intron in the human dystrophin gene. Genomics 40:294-304.

    Meyer, A., and M. Schartl. 1999. Gene and genome duplication in vertebrates: the one-to-four (-to-eight in fish) rule and the evolution of novel gene functions. Curr. Opin. Cell Biol. 11:699-704.

    Mural, R. J., M. D. Adams, and E. W. Myers, et al. (180 co-authors). 2002. A comparison of whole genome shotgun-derived mouse chromosome 16 and the human genome. Science 296:1661-1671.

    Patthy, L. 1999. Genome evolution and the evolution of exon-shuffling—a review. Gene 238:103-114.

    Petrov, D. A. 2001. Evolution of genome size: new approaches to an old problem. Trends Genet. 17:23-28.

    Pozzoli, U., M. Sironi, R. Cagliani, G. P. Comi, A. Bardoni, and N. Bresolin. 2002. Comparative analysis of the human dystrophin and utrophin gene structures. Genetics 160:793-798.

    Pruitt, K. D., and D. R. Maglott. 2001. RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Res. 29:137-140.

    Roberts, R. G., and M. Bobrow. 1998. Dystrophins in vertebrates and invertebrates. Hum. Mol. Genet. 7:589-595.

    Smit, A. F. A. 1999. Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr. Opin. Genet. Dev. 9:657-663.

    Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876-4882.

    Venkatesh, B., Y. Ning, and S. Brenner. 1999. Late changes in spliceosomal introns define clades in vertebrate evolution. Proc. Natl. Acad. Sci. USA 96:10267-10271.

    Vinogradov, A. E. 1999. Intron-genome size relationship on a large evolutionary scale. J. Mol. Evol. 49:376-384.

    Vinogradov, A. E. 2002. Growth and decline of introns. Trends Genet. 18:232-236.

Accepted for publication July 7, 2003.




