* Department of Biochemistry and Molecular Biology, and the Kidney Institute, University of Kansas Medical Center, Kansas City, Kansas
Department of Chemistry/Physics, Northwest Missouri State University, Maryville, Missouri
Correspondence: E-mail: jcalvet{at}kumc.edu.
Abstract |
---|
Key Words: polycystic kidney disease, intron, G-triplet, splicing
Introduction |
---|
Mechanisms for intron removal by pre-mRNA splicing require the precise identification of intron-exon borders (Reed 2000). Sequence elements within introns that are involved in the splicing mechanism are conserved donor (5') and acceptor (3') splice sites and an internal branch point sequence containing a conserved adenosine residue. Splice-site recognition and splicing efficiency are also regulated by intron and exon splicing enhancer sequences (McCullough and Berget 2000; Cartegni, Chew, and Krainer 2002). In general, spliceosome formation is thought to first involve "exon definition," whereby upstream acceptor and downstream donor splice sites flanking an exon are identified by the splicing machinery before branch-site recognition and subsequent splicing of the upstream intron (Berget 1995). This mechanism is thought to apply to the majority of splicing events, which typically involve small exons flanked by much larger introns. In contrast, small introns may be recognized by a process of "intron definition" (Talerico and Berget 1994), and their splicing may be facilitated by the presence of multiple G-triplet sequences within these introns (McCullough and Berget 1997). The removal of the last intron of some pre-mRNAs may involve a mechanism that defines the last exon by coupling splicing with polyadenylation (Berget 1995; Cooke and Alwine 2002).
Autosomal dominant polycystic kidney disease (ADPKD) is one of the most common, potentially lethal inherited diseases worldwide, with a frequency of 1 in 200 to 1,000 in the population (Calvet and Grantham 2001; Peters and Breuning 2001; Qian, Harris, and Torres 2001; Igarashi and Somlo 2002). The disease is characterized by the growth of innumerable, large, fluid-filled cysts arising from kidney tubules, which can ultimately lead to renal failure in midlife. In some patients, the disease is also characterized by extrarenal manifestations, such as liver and pancreatic cysts, and cerebral and aortic aneurysms. Mutations in the PKD1 gene account for 85% to 90% of ADPKD cases. The human PKD1 gene, which encodes a very large multimembrane-spanning receptor termed polycystin-1, is greater than 50 kb in length and has 46 exons (The European PKD Consortium 1994; Burn et al. 1995; Harris et al. 1995; Hughes et al. 1995; The International PKD Consortium 1995). PKD1 sequences have been characterized from human, mouse, rat, dog, and pufferfish (Lohning, Nowicka, and Frischauf 1997; Sandford et al. 1997; Xu et al. 2001; Dackowski et al. 2002). PKD1 orthologs have also been identified in sea urchin and nematode (Moy et al. 1996; Barr and Sternberg 1999). We previously reported a very high sequence conservation in the amino acid sequence flanking the last intron in the PKD1 gene, intron 45 (Parnell et al. 1998). We subsequently examined the genomic sequence and found striking nucleotide sequence conservation of this intron across the four known mammalian PKD1 genes.
Materials and Methods |
---|
Sequence Analysis
PKD1 and α-globin-2 sequences were obtained from GenBank as follows: rat PKD1 cDNA accession number AF277452; dog PKD1 genomic accession number AF483210; human PKD1 genomic accession number L39891; mouse α-globin-2 genomic accession number AY016021; and human α-globin-2 genomic accession number J00153. Multiple sequence alignments were carried out using ClustalW version 1.8. For the α-globin-2/intron-2 alignment, the human and mouse introns were anchored with flanking exon sequences. Pairwise identities for the four PKD1 exons (exon 46 coding) and for intron 45 were determined using Blast version 2 (NCBI), and all six pairwise combinations were averaged. Pairwise identities for introns 43 and 44 were calculated directly from the ClustalW alignments (fig. 1). Translations were carried out using 6 Frame Translation (Baylor College of Medicine Search Launcher; http://searchlauncher.bcm.tmc.edu/seq-util/seq-util.html). RNA secondary structure predictions were carried out using RNAfold (Vienna RNA Package; http://www.tbi.univie.ac.at/cgi-bin/RNAfold.cgi).
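The averaging of all six pairwise combinations can be illustrated with a minimal sketch. It assumes the four sequences are already aligned to equal length and that columns containing a gap in either sequence are skipped. That gap rule is an assumption made for illustration and may differ from how Blast or ClustalW report identity. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gapped columns are ignored rather than counted as mismatches.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def mean_pairwise_identity(aligned):
    """Average percent identity over every unordered pair of sequences."""
    pairs = list(combinations(aligned, 2))
    total = sum(percent_identity(aligned[p], aligned[q]) for p, q in pairs)
    return total / len(pairs)


# Toy aligned sequences (hypothetical, not PKD1 data).
toy = {
    "human": "GTGAGC-GGG",
    "mouse": "GTGAGCAGGG",
    "rat": "GTGAGCAGGA",
    "dog": "GTGGGC-GGG",
}
print(round(mean_pairwise_identity(toy), 1))
```

With four species, `combinations` yields exactly six pairs, which matches the number of comparisons averaged in the text.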
Results |
---|
To further analyze the sequence conservation over the 3' terminal region of the PKD1 gene, we compared the last four exons (exons 43 to 46) and the last three introns (introns 43 to 45) of the PKD1 gene by carrying out a ClustalW alignment for these sequences from the four mammalian species (see Supplementary Material online). Figure 2 summarizes the degree of sequence identity for the four exons and three introns. As shown, all three introns are small (<100 bp). The percent identities are the averages of all six pairwise sequence comparisons. As expected, the coding regions are at least 80% identical, whereas introns 43 and 44 are only 57% and 54% identical, respectively. In contrast, the most conserved region in this greater than 1,400-bp sequence is intron 45, with an average identity of 94% across all four species.
Analysis of the intron 45 pre-mRNA sequences from the four mammalian species using RNAfold shows that the intron is capable of forming a very stable secondary structure (fig. 3). The length of the intron is within the approximately 110-nucleotide, naked RNA "window" that would allow intramolecular base pairing to occur (Eperon et al. 1988), making it very likely that such a structure would form in the nascent RNA transcript immediately after being synthesized, before the binding of hnRNP and spliceosomal proteins. It is notable that the one variable region near the middle of the intron (see fig. 1, top) occurs in a loop and thus does not alter the base-paired structure. In addition, the two single base-pair transitions in the human intron, which lie upstream (A→G) and downstream (C→T) of the variable region, are within base-paired stems, preserving the secondary structure. Furthermore, the likely branch-point sequence (Zhuang, Goldstein, and Weiner 1989) lies in another loop, which would make it accessible for interaction with the splicing factor SF1 (Liu et al. 2001) and with U2 snRNA (Calvet, Meyer, and Pederson 1982; Zhuang, Goldstein, and Weiner 1989; Pascolo and Seraphin 1997). Thus, intron 45 may adopt a specific secondary structure conformation that is required to facilitate its splicing.
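Whether a given region falls in a loop or a stem can be checked from a predicted structure written in dot-bracket notation, the text format that RNAfold produces. The following is a minimal sketch using only the standard library. The helper names, the toy hairpin, and the `CUAAC` motif are hypothetical stand-ins and are not the intron 45 sequence or its predicted fold.

```python
def pair_table(structure):
    """For each position, return the index of its pairing partner, or None."""
    table = [None] * len(structure)
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            table[i], table[j] = j, i
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return table


def is_unpaired(structure, start, end):
    """True if every position in the half-open range [start, end) is unpaired."""
    table = pair_table(structure)
    return all(table[i] is None for i in range(start, end))


# Toy hairpin (hypothetical): a 4-bp stem closing a 6-nt loop.
toy_seq = "GGGCCUAACGGCCC"
toy_fold = "((((......))))"

motif = "CUAAC"  # hypothetical stand-in for a branch-point-like motif
start = toy_seq.find(motif)
print(start, is_unpaired(toy_fold, start, start + len(motif)))
```

The same check could be applied to the variable region or to the positions of the two transitions, given the dot-bracket string for each species.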
Discussion |
---|
The very high sequence conservation of intron 45 suggests that it has been selected for, and therefore may have a unique function. There are many possibilities, and several are worthy of comment. One function may be that the intron (and perhaps the flanking conserved sequence) has a role in regulating transcription. Although no known promoter lies close to intron 45, it is well established that enhancers can function over quite long distances. We tested intron 45 and its flanking sequences for enhancer activity by transient transfection in HEK293T cells using the PKD1 promoter cloned upstream of a luciferase reporter (Rodova et al. 2002) but were unable to demonstrate an effect on transcriptional activity (data not shown). It is also possible that the unique tail-to-tail organization of the PKD1 and TSC2 genes, which are directly abutted at their 3' ends (The European PKD Consortium 1994; Olsson et al. 1996), requires that transcription from the TSC2 gene be efficiently terminated so as to not extensively overlap with and therefore interfere with PKD1 transcription. If so, the intron 45 sequence, which lies only approximately 1,500 bp downstream of the 3' end of the TSC2 gene, may function as a termination signal. Arguing against this possibility is that the pufferfish PKD1 and TSC2 genes are also in a tail-to-tail organization (Sandford et al. 1996) and that RNA polymerase II terminators usually have a run of T residues (Kerppola and Kane 1991), which is not found in or around intron 45. Alternatively, the PKD1 and TSC2 genes may share a locus control region, or there may be a transcriptional insulator between the promoters of these two genes, which may include intron 45.
Intron 45 may contain within it, or may be part of, a transcription unit that produces a small RNA. Inspection of the sequence shows that there is an RNA polymerase III box A motif (Geiduschek and Tocchini-Valentini 1988) near the 5' end of the intron (fig. 1, top). It is also possible that a small RNA is processed from the intron after its excision by splicing, as has been shown for a number of modification guide snoRNAs, which are encoded within introns (Seraphin 1993; Tycowski, Shu, and Steitz 1993). In general, however, snoRNAs are encoded in genes having some functional connection with the process of translation, making this possibility less likely. We have also considered that the intron has a protein-coding function, such that if the intron is retained by alternative splicing, it would encode a short insertion in the polycystin-1 C-tail. We feel that this can be ruled out because, whereas the human intron 45 sequence has the required length and open reading frame to encode an in-frame 30-amino acid insertion, those of the other species do not.
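The two conditions named above for a retained intron to encode an in-frame insertion (a length that is a multiple of three and an open reading frame) can be written as a short check. This is a sketch under stated assumptions: the `phase` parameter, the treatment of codons that span the exon boundaries (they are skipped), and the toy sequences are all hypothetical and are not taken from the PKD1 analysis.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def retained_intron_is_in_frame(intron, phase=0):
    """True if retaining the intron adds whole codons and no in-frame stop.

    phase: number of nucleotides (0, 1, or 2) of an incomplete codon carried
    over from the upstream exon. Assumption: codons split across an exon
    boundary are not examined, because the exon sequence is not supplied.
    """
    if phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1, or 2")
    seq = intron.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        return False  # retention would shift the downstream reading frame
    offset = (3 - phase) % 3  # first complete codon inside the intron
    for i in range(offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False
    return True


# Toy sequences (hypothetical, not intron 45).
print(retained_intron_is_in_frame("GTGAGCGGGCTCCAG"))   # 15 nt, no stop codon
print(retained_intron_is_in_frame("GTAAGCTGAGGGCAG"))   # 15 nt, in-frame TGA
print(retained_intron_is_in_frame("GTGAGCGGGCTCAG"))    # 14 nt, wrong length
```

A 30-amino acid insertion corresponds to a 90-nucleotide intron that passes both tests in the reading frame set by the upstream exon.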
Intron 45 appears to conform to the class of small vertebrate introns that are spliced by a mechanism utilizing intron definition (McCullough and Berget 1997, 2000). These introns often contain multiple G-triplets, which serve as intron enhancer sequences. It has been shown that the presence of G-triplets can overcome "weak" acceptor splice sites. Intron 45 has a very C-rich acceptor site, with no contiguous T residues, and therefore would be considered to have a "weak" acceptor splice site by these criteria. The G-triplet density (number of G residues in G-triplets divided by intron length) of intron 2 of α-globin-2 (fig. 1, bottom) is 18% to 20%, and the G-triplet density of intron 45 is 20% to 21%. In contrast, the comparable pufferfish intron has a G-triplet density of only 7%. Furthermore, the pufferfish intron has a T-rich polypyrimidine tract with multiple runs of contiguous T residues, suggesting that it is spliced by a different mechanism. The presence of G-triplets in intron 45 does not, however, explain the very high sequence conservation, because the "prototypic" G-triplet-containing intron, α-globin-2 intron 2, shows very little sequence conservation (fig. 1, bottom).
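The G-triplet density defined above (number of G residues in G-triplets divided by intron length) can be computed with a short function. This sketch assumes that every G in a run of three or more consecutive G residues counts toward the numerator, so a run of four Gs contributes four. The text does not spell out how longer runs were scored, so that rule is an assumption, and the function name and toy sequence are hypothetical.

```python
import re


def g_triplet_density(intron):
    """Fraction of intron positions that are G residues within runs of >= 3 Gs.

    Assumption: all Gs in a run of three or more consecutive Gs are counted.
    """
    seq = intron.upper()
    if not seq:
        raise ValueError("empty sequence")
    g_in_triplets = sum(len(m.group()) for m in re.finditer(r"G{3,}", seq))
    return g_in_triplets / len(seq)


# Toy sequence (hypothetical, not intron 45): one GGG run and one GGGG run.
toy_intron = "GTGAGGGCTCGGGGACCCAG"
print(g_triplet_density(toy_intron))
```

Multiplying the result by 100 gives a percentage that can be compared with the densities quoted for intron 45 and for the pufferfish intron.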
Another possibility is that the conserved intron 45 sequence is required for the regulation of its own splicing. It has been shown that hnRNA contains regions of intramolecular base pairing in vivo (Calvet and Pederson 1977, 1978, 1979a, 1979b) and that secondary structure can play a role in the regulation of splicing (Balvay, Libri, and Fiszman 1993; Charpentier and Rosbash 1996). The predicted secondary structure of the intron 45 transcript (fig. 3) places the variable region of the intron within a loop. Furthermore, the two nonloop nucleotide substitutions preserve the base-paired stems, suggesting that the intron adopts a highly specific stem-loop conformation in which the stems are functionally important. The predicted structure also places the putative branch-point sequence within another loop, where it should be accessible for interaction with splicing factors such as SF1 and U2 snRNA, which both require single-stranded RNA for binding. The conservation of intron 45 suggests that the sequence has a function in addition to (or other than) intron definition, since other small G-triplet introns do not show sequence conservation. The tertiary structure of intron 45 may be important for defining it as the last intron, perhaps by creating a unique platform for factors that couple its splicing with polyadenylation (Cooke and Alwine 2002).
Other examples of conserved introns or intron regions exist (Bourbon and Amalric 1990; Aruscavage and Bass 2000; Sun et al. 2000; Yatsuki et al. 2000); however, in no case is the reason for this conservation understood. Further insight into whether PKD1 intron 45 has a special function could be obtained by testing the effect of germline mutation of the intron in transgenic mice, and by searching for otherwise unexplained intron 45 mutations in ADPKD patients.
Acknowledgements |
---|
Footnotes |
---|
Literature Cited |
---|
Aruscavage, P. J., and B. L. Bass. 2000. A phylogenetic analysis reveals an unusual sequence conservation within introns involved in RNA editing. RNA 6:257-269.
Balvay, L., D. Libri, and M. Y. Fiszman. 1993. Pre-mRNA secondary structure and the regulation of splicing. Bioessays 15:165-169.
Barr, M. M., and P. W. Sternberg. 1999. A polycystic kidney-disease gene homologue required for male mating behaviour in C. elegans. Nature 401:386-389.
Berget, S. M. 1995. Exon recognition in vertebrate splicing. J. Biol. Chem. 270:2411-2414.
Bourbon, H. M., and F. Amalric. 1990. Nucleolin gene organization in rodents: highly conserved sequences within three of the 13 introns. Gene 88:187-196.
Burn, T. C., T. D. Connors, and W. R. Dackowski, et al. (16 co-authors). 1995. Analysis of the genomic sequence for the autosomal dominant polycystic kidney disease (PKD1) gene predicts the presence of a leucine-rich repeat. Hum. Mol. Genet. 4:575-582.
Calvet, J. P., and J. J. Grantham. 2001. The genetics and physiology of polycystic kidney disease. Semin. Nephrol. 21:107-123.
Calvet, J. P., L. M. Meyer, and T. Pederson. 1982. Small nuclear RNA U2 is base-paired to heterogeneous nuclear RNA. Science 217:456-458.
Calvet, J. P., and T. Pederson. 1977. Secondary structure of heterogeneous nuclear RNA: two classes of double-stranded RNA in native ribonucleoprotein. Proc. Natl. Acad. Sci. USA 74:3705-3709.
Calvet, J. P., and T. Pederson. 1978. Nucleoprotein organization of inverted repeat DNA transcripts in heterogeneous nuclear RNA-ribonucleoprotein particles from HeLa cells. J. Mol. Biol. 122:361-378.
Calvet, J. P., and T. Pederson. 1979a. Heterogeneous nuclear RNA double-stranded regions probed in living HeLa cells by crosslinking with the psoralen derivative aminomethyltrioxsalen. Proc. Natl. Acad. Sci. USA 76:755-759.
Calvet, J. P., and T. Pederson. 1979b. Photochemical cross-linking of secondary structure in HeLa cell heterogeneous nuclear RNA in situ. Nucleic Acids Res. 6:1993-2001.
Cartegni, L., S. L. Chew, and A. R. Krainer. 2002. Listening to silence and understanding nonsense: exonic mutations that affect splicing. Nat. Rev. Genet. 3:285-298.
Charpentier, B., and M. Rosbash. 1996. Intramolecular structure in yeast introns aids the early steps of in vitro spliceosome assembly. RNA 2:509-522.
Chua, K., and R. Reed. 2001. An upstream AG determines whether a downstream AG is selected during catalytic step II of splicing. Mol. Cell. Biol. 21:1509-1514.
Cooke, C., and J. C. Alwine. 2002. Characterization of specific protein-RNA complexes associated with the coupling of polyadenylation and last-intron removal. Mol. Cell. Biol. 22:4579-4586.
Dackowski, W. R., H. F. Luderer, P. Manavalan, N. O. Bukanov, R. J. Russo, B. L. Roberts, K. W. Klinger, and O. Ibraghimov-Beskrovnaya. 2002. Canine PKD1 is a single-copy gene: genomic organization and comparative analysis. Genomics 80:105-112.
Eperon, L. P., I. R. Graham, A. D. Griffiths, and I. C. Eperon. 1988. Effects of RNA secondary structure on alternative splicing of pre-mRNA: Is folding limited to a region behind the transcribing RNA polymerase? Cell 54:393-401.
The European Polycystic Kidney Disease Consortium. 1994. The polycystic kidney disease 1 gene encodes a 14 kb transcript and lies within a duplicated region on chromosome 16. Cell 77:881-894.
Fedorov, A., X. Cao, S. Saxonov, S. J. de Souza, S. W. Roy, and W. Gilbert. 2001. Intron distribution difference for 276 ancient and 131 modern genes suggests the existence of ancient introns. Proc. Natl. Acad. Sci. USA 98:13177-13182.
Geiduschek, E. P., and G. P. Tocchini-Valentini. 1988. Transcription by RNA polymerase III. Annu. Rev. Biochem. 57:873-914.
Harris, P. C., C. J. Ward, B. Peral, and J. Hughes. 1995. Polycystic kidney disease 1: identification and analysis of the primary defect. J. Am. Soc. Nephrol. 6:1125-1133.
Hawkins, J. D. 1988. A survey on intron and exon lengths. Nucleic Acids Res. 16:9893-9908.
Hughes, J., C. J. Ward, B. Peral, R. Aspinwall, K. Clark, J. L. San Millan, V. Gamble, and P. C. Harris. 1995. The polycystic kidney disease 1 (PKD1) gene encodes a novel protein with multiple cell recognition domains. Nat. Genet. 10:151-160.
Igarashi, P., and S. Somlo. 2002. Genetics and pathogenesis of polycystic kidney disease. J. Am. Soc. Nephrol. 13:2384-2398.
The International Polycystic Kidney Disease Consortium. 1995. Polycystic kidney disease: the complete structure of the PKD1 gene and its protein. Cell 81:289-298.
Kerppola, T. K., and C. M. Kane. 1991. RNA polymerase: regulation of transcript elongation and termination. FASEB J. 5:2833-2842.
Kolkman, J. A., and W. P. Stemmer. 2001. Directed evolution of proteins by exon shuffling. Nat. Biotechnol. 19:423-428.
Liu, Z., I. Luyten, M. J. Bottomley, A. C. Messias, S. Houngninou-Molango, R. Sprangers, K. Zanier, A. Kramer, and M. Sattler. 2001. Structural basis for recognition of the intron branch site RNA by splicing factor 1. Science 294:1098-1102.
Logsdon, J. M., Jr., A. Stoltzfus, and W. F. Doolittle. 1998. Molecular evolution: recent cases of spliceosomal intron gain? Curr. Biol. 8:R560-563.
Lohning, C., U. Nowicka, and A. M. Frischauf. 1997. The mouse homolog of PKD1: sequence analysis and alternative splicing. Mamm. Genome 8:307-311.
Maniatis, T., and B. Tasic. 2002. Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature 418:236-243.
McCullough, A. J., and S. M. Berget. 1997. G triplets located throughout a class of small vertebrate introns enforce intron borders and regulate splice site selection. Mol. Cell. Biol. 17:4562-4571.
McCullough, A. J., and S. M. Berget. 2000. An intronic splicing enhancer binds U1 snRNPs to enhance splicing and select 5' splice sites. Mol. Cell. Biol. 20:9225-9235.
Moy, G. W., L. M. Mendoza, J. R. Schulz, W. J. Swanson, C. G. Glabe, and V. D. Vacquier. 1996. The sea urchin sperm receptor for egg jelly is a modular protein with extensive homology to the human polycystic kidney disease protein, PKD1. J. Cell Biol. 133:809-817.
Olsson, P. G., C. Lohning, S. Horsley, L. Kearney, P. C. Harris, and A. Frischauf. 1996. The mouse homologue of the polycystic kidney disease gene (Pkd1) is a single-copy gene. Genomics 34:233-235.
Parnell, S. C., B. S. Magenheimer, R. L. Maser, C. A. Rankin, A. Smine, T. Okamoto, and J. P. Calvet. 1998. The polycystic kidney disease-1 protein, polycystin-1, binds and activates heterotrimeric G-proteins in vitro. Biochem. Biophys. Res. Commun. 251:625-631.
Pascolo, E., and B. Seraphin. 1997. The branchpoint residue is recognized during commitment complex formation before being bulged out of the U2 snRNA-pre-mRNA duplex. Mol. Cell. Biol. 17:3469-3476.
Peters, D. J., and M. H. Breuning. 2001. Autosomal dominant polycystic kidney disease: modification of disease progression. Lancet 358:1439-1444.
Qian, Q., P. C. Harris, and V. E. Torres. 2001. Treatment prospects for autosomal-dominant polycystic kidney disease. Kidney Int. 59:2005-2022.
Reed, R. 2000. Mechanisms of fidelity in pre-mRNA splicing. Curr. Opin. Cell Biol. 12:340-345.
Rodova, M., M. R. Islam, R. L. Maser, and J. P. Calvet. 2002. The polycystic kidney disease-1 promoter is a target of the beta-catenin/T-cell factor pathway. J. Biol. Chem. 277:29577-29583.
Roy, S. W., B. P. Lewis, A. Fedorov, and W. Gilbert. 2001. Footprints of primordial introns on the eukaryotic genome. Trends Genet. 17:496-501.
Sandford, R., B. Sgotto, and S. Aparicio, et al. (12 co-authors). 1997. Comparative analysis of the polycystic kidney disease 1 (PKD1) gene reveals an integral membrane glycoprotein with multiple evolutionary conserved domains. Hum. Mol. Genet. 6:1483-1489.
Sandford, R., B. Sgotto, T. Burn, and S. Brenner. 1996. The tuberin (TSC2), autosomal dominant polycystic kidney disease (PKD1), and somatostatin type V receptor (SSTR5) genes form a synteny group in the Fugu genome. Genomics 38:84-86.
Seraphin, B. 1993. How many intronic snRNAs? Trends Biochem. Sci. 18:330-331.
Sun, L., Y. Li, A. K. McCullough, T. G. Wood, R. S. Lloyd, B. Adams, J. R. Gurnon, and J. L. Van Etten. 2000. Intron conservation in a UV-specific DNA repair gene encoded by chlorella viruses. J. Mol. Evol. 50:82-92.
Talerico, M., and S. M. Berget. 1994. Intron definition in splicing of small Drosophila introns. Mol. Cell. Biol. 14:3434-3445.
Tycowski, K. T., M. D. Shu, and J. A. Steitz. 1993. A small nucleolar RNA is processed from an intron of the human gene encoding ribosomal protein S3. Genes Dev. 7:1176-1190.
Tycowski, K. T., M. D. Shu, and J. A. Steitz. 1996. A mammalian gene with introns instead of exons generating stable RNA products. Nature 379:464-466.
van Haasteren, G., S. Li, S. Ryser, and W. Schlegel. 2000. Essential contribution of intron sequences to Ca(2+)-dependent activation of c-fos transcription in pituitary cells. Neuroendocrinology 72:368-378.
Xu, H., J. Shen, C. L. Walker, and E. Kleymenova. 2001. Tissue-specific expression and splicing of the rat polycystic kidney disease 1 gene. DNA Seq. 12:361-366.
Yatsuki, H., H. Watanabe, and M. Hattori, et al. (14 co-authors). 2000. Sequence-based structural features between Kvlqt1 and Tapa1 on mouse chromosome 7F4/F5 corresponding to the Beckwith-Wiedemann syndrome region on human 11p15.5: long-stretches of unusually well conserved intronic sequences of Kvlqt1 between mouse and human. DNA Res. 7:195-206.
Zhuang, Y. A., A. M. Goldstein, and A. M. Weiner. 1989. UACUAAC is the preferred branch site for mammalian mRNA splicing. Proc. Natl. Acad. Sci. USA 86:2752-2756.