1 Department of Genetics, North Carolina State University, Raleigh, North Carolina 27695-7614
2 Department of Biology, University of Minnesota Duluth, Duluth, Minnesota 55812
ABSTRACT |
---|
endogenous retrovirus; retroviral insertion; hibernation; promoter; genome
INTRODUCTION |
---|
PTL hydrolyzes triacylglycerols in a sequential manner producing 2-monoacylglycerols and free fatty acids (reviewed in Refs. 5 and 17). The PTL gene is part of a larger gene family that includes the genes for hepatic lipase and lipoprotein lipase (reviewed in Ref. 16). The seasonally expressed PTL mRNA found in WAT from thirteen-lined ground squirrels (3) was 500 bases longer than the PTL mRNA expressed in heart (2). Sequence analysis of cDNA clones attributed this difference in length to divergent 5'-untranslated regions (5'-UTRs; Ref. 3). The 5'-UTRs of the WAT PTL cDNAs contained distinct tracts of retroviral-like elements that were not found in the 5'-UTR of the heart PTL cDNA sequence. Bauer et al. (3) proposed two potential explanations for these differences at the genomic level, both of which required a retroviral insertion event. The first scenario involved the use of different transcriptional start sites and/or alternative splicing of the PTL transcript from a single gene. The second scenario suggested that the initial PTL gene was duplicated via nonhomologous recombination. Insertion of a retrovirus into the promoter of one of these two duplicate genes provided "novel" regulatory elements that allowed expression of this gene in WAT. Expression of this latter gene resulted in the chimeric mRNA that was found in WAT (3).
In this study, we seek to provide insight regarding the sequence organization and number of PTL genes that are present in the thirteen-lined ground squirrel genome. First, we examine PTL mRNA levels in several hibernating ground squirrel tissues using RT-PCR, a more sensitive method than Northern blot analysis, to determine how broadly this gene is expressed. Next, we isolate PTL cDNAs from a pancreas cDNA library and address whether the abundant PTL message found in this tissue resembles the form of PTL mRNA expressed in the heart or WAT. The structure and organization of the ground squirrel PTL gene(s) is then addressed using Southern blot analysis and sequencing of recombinant lambda clones from a thirteen-lined ground squirrel genomic library. Further analysis of the retroviral sequence, in combination with the Southern blot data, has led us to the conclusion that the sequences present in both WAT and heart PTL cDNAs are products of a single gene and that the retroviral insertion is a relatively recent event.
MATERIALS AND METHODS |
---|
RNA isolation.
Total RNA was isolated from ground squirrel tissues as previously described in Andrews et al. (2) using a modification of the method developed by Chomczynski and Sacchi (8). Before extraction, however, WAT homogenates from the abdominal WAT pad were centrifuged at 3,000 g for 10 min, and the lipid layer was removed. Extraction then proceeded on the WAT homogenate as described in Andrews et al. (2). RNA integrity was checked by separating total RNAs on 1.2% agarose gels containing 3% formaldehyde followed by staining with ethidium bromide.
RT-PCR.
First-strand cDNA was generated from 2 µg total RNA using the SuperScript first-strand synthesis system for RT-PCR (Invitrogen-Life Technologies) with oligo(dT) as the primer. After synthesis, however, the RNase H treatment recommended by the protocol was not performed. Products of the first-strand cDNA were diluted twofold with water before proceeding to PCR.
PCR was performed using 1 µl cDNA template from the diluted first-strand reaction and 2 U Taq DNA polymerase (Invitrogen-Life Technologies) in the presence of 1.5 mM MgCl2. Reactions using PTL and β-actin primers were done separately for 28 and 20 cycles, respectively, in HotStart storage and reaction tubes (Molecular BioProducts). After the hot start, denaturation was performed at 94°C for 30 s, annealing at 55°C for 45 s, and extension at 72°C for 1 min. The primer pair used to measure relative PTL mRNA levels was 5' CAGATGTCAACACCCGCTTC 3' and 5' GTGGCCAATGACATGGAC 3'. This pair fell within the coding region of the message and spanned at least one intron based on alignment with the nucleotide sequence for the human PTL gene (accession no. AH003527; Ref. 29). The primer pair used to measure β-actin mRNA levels was 5' GACAGGATGCAGAAGGAG 3' and 5' ACATCTGCTGGAAGGTGG 3'. β-actin primers served as a control for RNA integrity and amplification in the PCR.
Two negative controls were also performed. For each RNA sample, an RT reaction was performed that replaced the SuperScript II RT enzyme with water. A portion of this reaction, as described above, was then used as template for PCR. In addition, a PCR reaction was carried out that substituted water for template. All control reactions were negative for DNA or other contamination (data not shown). PCR results were viewed on 5% acrylamide, 1x TBE gels.
Genomic Southern blot analysis.
Frozen thirteen-lined ground squirrel liver was ground into fragments with a mortar and pestle and immersed in digestion buffer (100 mM NaCl, 10 mM Tris·HCl, pH 8.0, 25 mM EDTA, pH 8.0, 0.5% SDS, and 0.1 mg/ml proteinase K) for incubation overnight at 50°C with shaking. Genomic DNA was then isolated from this mixture using a cesium chloride (CsCl) gradient as described in Curtis and Haselkorn (9). Sixty micrograms of this genomic DNA was digested separately with each of five restriction enzymes: BamHI, EcoRI, HindIII, PstI, and XbaI. Triplicate Southern blots containing 20 µg of each digest were made according to standard methods (27) using Hybond-XL nylon membrane (Amersham). Probe hybridization and washes were performed according to the manufacturer's suggestions for this membrane.
For detection of PTL gene fragments, the probes were generated via PCR using three different primer pairs and labeled by random priming. Primer pair I (5' CCAATGATAGAGGATGGC 3' and 5' GTTGGGAAGTTGTGTCGG 3') was unique to the 5'-UTR of the WAT PTL cDNA clone 22A4, bases 319–603 (Fig. 3; accession no. AF177403; Ref. 3). Primer pair II (5' ATTGCTATAGAGAGAGCC 3' and 5' ATGGCAGATCCGTCAGGC 3') was unique to the 5'-UTR of the heart PTL cDNA clone 29H4, bases 1–365 (Fig. 3; accession no. AF027293; Ref. 2). Primer pair III (5' CAGATGTCAACACCCGCTTC 3' and 5' CTTATCCCCAGTGTTCAG 3') was unique to the coding region present in both PTL cDNAs, bases 512–1413 of heart PTL and bases 1088–1989 of WAT PTL clone 22A4 (Fig. 3). The PCR reactions were carried out for 35 cycles using the same conditions that were described earlier. Before radiolabeling, products of the three reactions were gel purified to confirm the expected size of each fragment. 32P-labeling was accomplished using the Rediprime II system (Amersham).
Screening the genomic library for the PTL gene.
Approximately 325,000 plaques from the unamplified ground squirrel genomic library were screened for the presence of PTL. Recombinant phages were plated on NZY Top Agar with XL1 Blue MRA E. coli cells (Stratagene) resuspended in 10 mM MgSO4. Lifts of the plates were performed using Magna nylon (Osmonics). Filters were treated with 0.5 M NaOH plus 1.5 M NaCl, followed by 0.5 M Tris·HCl, pH 8.0, plus 1.5 M NaCl, and were then rinsed in 0.2 M Tris·HCl, pH 7.5, plus 2x SSC. UV cross-linking was performed at 120,000 µJ/cm2 for 30 s. To test for the presence of PTL sequence, the filters were probed with a 32P-end-labeled oligonucleotide. To remove bias from the library screen that could direct the discovery of PTL genomic clones that encoded mRNAs expressed solely in the heart or WAT, the oligonucleotide probe contained the complement of a portion of the PTL open-reading frame (ORF) (5' AGCAGCAGTGCCAGCGACCAGACCAGCAGCATCATG 3') that included the start codon (underlined) at its 3' end.
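Because the probe is written as the complement of the coding strand, reading it in the mRNA sense requires reverse-complementing it. The following minimal Python sketch does this for the oligonucleotide given above; the helper name `reverse_complement` is illustrative and is not part of the original methods.

```python
# Illustrative helper (not from the original methods): reverse-complement the
# screening oligonucleotide to view the PTL coding strand it anneals to.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


PROBE = "AGCAGCAGTGCCAGCGACCAGACCAGCAGCATCATG"  # oligo as given in the text

sense = reverse_complement(PROBE)
print(sense)              # coding-strand sequence targeted by the probe
print(sense.find("ATG"))  # 0-based index of the first ATG in that sequence
```

In the coding-strand orientation the first ATG lies at the 5' end of the sequence, which corresponds to the 3' end of the probe as described.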
A secondary screen was performed on several potentially positive plaques. From this secondary screen, five single positive plaques were cored and placed in 1 ml SM buffer with chloroform. DNA was isolated from each of these phage stocks, and Southern blots were used to confirm that each of the recombinant phage contained parts of the PTL gene (data not shown). Large-scale liquid lysates were made from each of the five phage stocks using standard methods (27). DNA was isolated from the large-scale lysates using either a Qiagen Lambda Kit, following the manufacturer's instructions, or a CsCl step gradient (27). The DNA isolated from two of the stocks was partially sequenced. These two DNA preparations were selected because they contained sequence found in both the heart and WAT 5'-UTRs, as well as sequence found at the start of the PTL coding region.
DNA sequencing.
Positive cDNA and lambda library clones were sequenced using ABI Prism 377 automated cycle sequencers (PE Applied Biosystems). Multiple sequences for a particular cDNA clone were aligned and analyzed using MacVector and AssemblyLIGN software (Oxford Molecular Group). Multiple sequences for both lambda library clones were aligned and analyzed using SeqMan II, version 4.05 (DNASTAR). The consensus sequences that were generated using both programs were edited manually to resolve discrepancies. Finished sequences were compared with known sequences entered into the National Center for Biotechnology Information (NCBI) database using the BLAST tool (4).
Long terminal repeat analysis.
Two programs, ModelInspector Release 4.7.4 (Genomatix Software, GEMS Launcher 3.1, accessed via http://www.genomatix.de/; Ref. 10) and MatInspector Release 5.2 (Genomatix Software; accessed via http://www.genomatix.de/; Ref. 23), were utilized to identify the putative long terminal repeats (LTRs) and the consensus elements within them. For each consensus element, the core sequence represents the four most highly conserved, contiguous bases in the defining matrix used by the MatInspector program.
Comparative genomic analysis.
Human and rat PTL sequences were obtained from the UCSC Bioinformatics Site Genome Browser (http://genome.ucsc.edu/). For the human sequence, the human genome browser assembly date was April 2003. GenBank accession numbers for the physical map contig and the clone fragment ID were NT_030059 and AL731653.1, respectively. For the rat sequence, the rat genome browser assembly date was January 2003. The clone fragment ID number was RNOR01013788, and the bactig was kaxw_ghoa. The mouse PTL sequence was obtained from the Ensembl web site (http://www.ensembl.org/) using mouse genome browser version 12.3.1 (March 3, 2003). The Ensembl gene ID was ENSMUSG00000042344.
All nucleotide and amino acid alignments were performed with Clustal W (http://clustalw.genome.ad.jp).
Data deposition.
The new sequences reported in this paper have been deposited in the GenBank database (accession nos. AF395870 and AY071823).
RESULTS |
---|
Thirteen-lined ground squirrel PTL cDNA analysis.
Earlier experiments using Northern blot analysis demonstrated that the size of the PTL mRNA in WAT was 2.3 kb, whereas the size of the PTL mRNA in heart, pancreas, and testes was 1.8 kb (3). Sequencing of two full-length PTL cDNAs isolated from a WAT cDNA library, 7G5 and 22A4 (accession nos. AF177402 and AF177403, respectively), revealed that the 5'-UTR of the message found in WAT differed from the 5'-UTR of the message found in heart (3). Starting just three bases upstream of the start codon, these two regions from heart and WAT shared no similarity. When the 5'-UTR sequences of the two WAT cDNA clones were compared with known sequences in the GenBank database using the BLAST tool (http://www.ncbi.nlm.nih.gov/blast/), it was found that this region contained segments of retroviral-like elements (3). Ultimately, this scenario led us to examine exactly how many PTL genes were present in the thirteen-lined ground squirrel genome. Before this question was addressed, the sequence of the PTL message expressed in its traditional location, the pancreas, was determined to learn whether it more closely resembled the mRNA in heart or those found in WAT.
Two full-length PTL cDNA clones were isolated from a thirteen-lined ground squirrel pancreas cDNA library that was created using poly(A)+ mRNA from hibernating and active ground squirrel pancreases. The consensus sequence obtained for these pancreatic clones (accession no. AF395870) showed only minor differences when compared with the heart PTL cDNA sequence (accession no. AF027293; Ref. 2). These differences included two base changes, one found in the 5'-UTR and the other found in the 3'-UTR, which denote probable allelic or individual variations in the PTL message. The pancreatic PTL cDNAs were also missing the first eight bases found in the 5'-UTR of the heart cDNA clone. This difference, however, was likely due to the reverse transcription process used to generate the cDNAs and does not represent a true difference in length of the two messages. Thus, based on the size of the two messages seen on Northern blots (2, 3) and based on the sequences of the full-length cDNAs, it was concluded that the PTL mRNAs found in the pancreas and the heart were identical.
Thirteen-lined ground squirrel PTL gene: Genomic library analysis.
The dissimilarity in PTL mRNAs from WAT (2.3 kb; 5'-UTR retroviral sequence) and that from heart and pancreas (1.8 kb; no detectable retroviral sequence) prompted us to investigate the possibility of multiple PTL genes. To address this question, our first approach was the construction and screening of a thirteen-lined ground squirrel genomic library. Approximately 325,000 plaques from the unamplified library were screened for the presence of PTL using an oligonucleotide probe whose sequence was complementary to the 5'-most bases of the PTL coding region. A secondary screen was then performed to isolate clones that contained sequence from the three regions of interest: 5'-UTR of WAT, 5'-UTR of heart/pancreas, and the 5' portion of the PTL coding sequence. Two clones were isolated, labeled P2 and P3, that contained these three regions. The main difference between these two clones was that recombinant clone P3 contained the entire region studied, whereas the 5' end of clone P2 ended in exon 2 (Fig. 2A). These genomic library clones were sequenced on both strands using primers generated for sequencing the heart and WAT PTL cDNAs. Additional primers, to fill gaps in the sequence, were created based on genomic clone sequencing runs. PTL gene sequencing results (accession no. AY071823) are summarized in Fig. 2A.
All sequences present in the cDNAs were found in their original orientation in the genomic sequence with one exception. The first 52 bases in the 7G5 WAT cDNA were found in a reverse and complementary fashion in the genomic sequence. This type of inversion, present in the cDNA, was most probably the result of an error that occurred during first-strand cDNA synthesis of 7G5. Also, although not a change in orientation, the first portion of the heart PTL cDNA clone was found at two places in the genomic sequence. This cDNA sequence comprised all of exon 1 and was repeated again at the start of exon 5. This repeated sequence is part of a larger direct repeat of retroviral origin as indicated by the bold horizontal arrows in Fig. 2A. The significance of this larger direct repeat will be discussed shortly.
Thirteen-lined ground squirrel PTL gene: Splice-junction analysis.
In the thirteen-lined ground squirrel genome, the presence of a single PTL gene requires that alternative splicing would occur to produce the two unique PTL cDNAs found in WAT. In addition, if alternative promoters are not used for regulation of this gene, then alternative splicing would also be required to produce the PTL cDNA form found in heart and pancreas. To examine the potential for alternative splicing of this ground squirrel gene, splice junction sequences for the first six exons were analyzed (Fig. 2B). The consensus sequence for the 5' intron splice sites was -2NG↓GUPuPuGN+6 [notation is based on that used in Goldstrohm et al. (11), where the arrow marks the exon-intron junction, and Pu denotes purine]. This sequence closely resembled the consensus sequence for mammalian 5' intron splice sites, -2AG↓GUPuAGU+6 (the underlined positions are the most highly conserved residues; reviewed in Ref. 11). The 3' intron splice sites for this gene, however, were less well conserved. The consensus sequence derived from these sites was -4(U/A)C(C/A)(C/G)↓AU+2 and differed somewhat from the mammalian consensus -4NPyAG↓PuN+2 (the notation is as previously described, Py denotes pyrimidine; reviewed in Ref. 11).
The lessened degree of conservation of the 3' splice sites could enable alternative splicing of a primary PTL mRNA transcript and would address the issue of three unique PTL cDNAs. Exons 4 and 6, which contained sequence shared by both WAT clones, had 3' splice sites that were identical to the mammalian consensus. Exon 2 and the first part of exon 3, on the other hand, were unique to specific WAT clones and had 3' splice sites that deviated considerably from the consensus. Less efficient processing of these latter splice sites could explain the variation in cDNA products observed in WAT. Similarly, the 3' splice site for exon 5 failed to conform to the mammalian consensus. If transcription in heart and pancreas started at exon 1 instead of exon 5, then this lessened degree of conservation could enable splicing of the primary transcript to produce the heart/pancreatic PTL cDNA.
Thirteen-lined ground squirrel PTL gene: Genomic Southern blot analysis.
A second approach was taken to investigate the possibility of multiple PTL genes in the thirteen-lined ground squirrel genome. This approach involved the analysis of Southern blots containing ground squirrel genomic DNA. Three identical thirteen-lined ground squirrel genomic Southern blots were probed with 32P-labeled sequences complementary to the 5'-UTR of WAT PTL mRNA (probe I), the 5'-UTR of heart/pancreatic PTL mRNA (probe II), and the PTL coding region shared by all three PTL messages (probe III), respectively (Fig. 3A). The results of this experiment are shown in Fig. 3B.
In blot I of Fig. 3B, multiple DNA fragments hybridized to the probe complementary to the 5'-UTR of the WAT PTL mRNA. This number of fragments was more than would be expected based on the restriction site analysis of the ground squirrel PTL gene (Fig. 2A). Sequence complementary to probe I fell entirely within exon 3 of the genomic sequence. Regardless of which enzyme was used to cut the DNA, only a single band would be expected to hybridize with this probe. The presence of multiple bands in each lane on the Southern blot I (Fig. 3B) indicates that sequence complementary to this probe was present in the thirteen-lined ground squirrel genome at places other than simply upstream of the PTL gene. Because this probe sequence is retroviral in nature, the presence of multiple bands suggests that the retrovirus from which this probe sequence was derived had inserted itself at multiple sites in the ground squirrel genome. The ground squirrel PTL gene represents only one of these sites of insertion.
In blot II of Fig. 3B, a similar result was seen. Multiple DNA fragments also hybridized to the probe that was complementary to the 5'-UTR of heart/pancreatic PTL mRNA. A BlastN analysis of this probe sequence uncovered no similarity to any potentially repetitive DNA sequence. Thus each lane on the blot would be expected to contain only two bands based on the restriction site analysis of the ground squirrel PTL genomic sequence (Fig. 2A). Sequence complementary to probe II was present in exon 1 and again in exon 5. Furthermore, the first 113 base pairs (bp) of this sequence was part of a direct repeat (Fig. 2A). The nature of this direct repeat was later determined to be of retroviral origin (Fig. 5). Given this new context, the result obtained on Southern blot II (Fig. 3B) could be explained by retroviral insertion at multiple sites in the ground squirrel genome as described earlier for blot I results.
BamHI and EcoRI sites, on the other hand, were present in the probe III cDNA sequence (Fig. 3A). As a result, at least two bands would be predicted for each of these lanes (blot III, Fig. 3B). Two bands were seen in the EcoRI lane, whereas only one band was seen in the BamHI lane. Because BamHI is not affected by mammalian CpG methylation, this mechanism could not be used to explain the presence of a single band in this lane. Revisiting the earlier alignment of the probe III cDNA sequence with the human PTL gene, however, does present one possible explanation. This cut site lies just five bases from an intron-exon splice junction. If this splice junction was not completely conserved between the two species, then the BamHI restriction site seen in the cDNA sequence could have been abolished by a splice site in the ground squirrel genomic DNA sequence. The faint bands migrating at the top of the BamHI and PstI lanes were not included in this analysis as they likely represent uncut genomic DNA.
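The band-count reasoning used for these blots can be sketched in a few lines of Python: a probe that contains n recognition sites for an enzyme spans n + 1 restriction fragments, so at least n + 1 bands are expected for a single-copy target. The recognition sequences below are the standard ones for the five enzymes used in this study; the function names and the toy probe are hypothetical and are not the PTL sequence. The count is only a lower bound, since a cDNA probe hybridizes to genomic DNA that contains introns, and a site that spans a splice junction in the cDNA will not exist in the genome.

```python
# Hypothetical sketch: count restriction sites inside a probe sequence and
# derive the minimum number of bands expected on a Southern blot.
SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
    "XbaI": "TCTAGA",
}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of a recognition site (palindromic, so one strand suffices)."""
    seq = seq.upper()
    return sum(seq.startswith(site, i) for i in range(len(seq) - len(site) + 1))


def min_expected_bands(probe: str) -> dict:
    """A probe cut n times by an enzyme spans n + 1 fragments."""
    return {name: count_sites(probe, site) + 1 for name, site in SITES.items()}


# Made-up toy probe (NOT the PTL sequence): one BamHI and one EcoRI site.
toy_probe = "ATGCCGGATCCTTAGCAGAATTCGGCTA"
bands = min_expected_bands(toy_probe)
print(bands)
```

For the toy probe, BamHI and EcoRI each predict at least two bands and the remaining enzymes one, mirroring the expectation stated for probe III.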
In summary, Southern blot analysis indicates that while the 5'-UTR sequences from heart/pancreatic and WAT PTL cDNAs are found at multiple sites throughout the thirteen-lined ground squirrel genome, the PTL coding region sequence is much more limited in scope. In addition, although the existence of multiple PTL genes cannot be excluded on the basis of Southern blots alone, the limited banding patterns seen on the PTL coding region blot suggest the presence of a single PTL gene.
Thirteen-lined ground squirrel PTL gene: Retroviral sequence analysis.
Previous sequence analysis of WAT PTL cDNAs revealed portions of retroviral sequence in their 5'-UTRs (3). These retroviral elements were present in a conserved linear order in the cDNAs (3) and in the PTL gene exons (Fig. 2A). This conservation of order suggested that the elements present in the cDNAs derived from a retrovirus that had integrated into the thirteen-lined ground squirrel genome upstream of the coding region in the PTL gene. To test the validity of this hypothesis, a BlastX analysis was performed on the ground squirrel genomic sequence. As part of this analysis, direct comparisons were made between the translated ground squirrel sequence and the Gag, Pol, and Env polyproteins encoded by four different full-length γ-retroviruses: porcine endogenous retrovirus (P-ERV; accession no. AF038600; Ref. 1), gibbon ape leukemia virus (GALV; accession no. U60065; Ref. 21), Mus dunni endogenous virus (MDEV; accession no. AF053745; Ref. 38), and Friend murine leukemia virus (FMLV; accession no. M93134; Ref. 24). To obtain a more accurate alignment of the translated ground squirrel sequence with each of these four γ-retroviruses, the low-complexity option for the BlastX analysis was deselected.
Overall, the length of the ground squirrel retroviral sequence (8,569 nt) was consistent with the length of complete retroviral genomes (Table 1). Additionally, alignment could be shown between the translated retroviral region of the ground squirrel sequence and, on average, 98% of the amino acids encoded by each of the three retroviral genes found in the four γ-retroviruses. Percentage amino acid identities for these alignments ranged from 39 to 64%, with the highest degree of similarity seen in the Pol polyprotein region and the lowest seen in part of the Env polyprotein.
The boundaries of each half of the direct repeat, which will now be referred to as the putative 5'- and 3'-LTRs, respectively, were determined based on the locations of the tRNA primer-binding site (PBS) and the polypurine tract (PPT) (Fig. 5). The PBS lies immediately downstream of the 5'-LTR and acts as the priming site for minus-strand DNA synthesis of the retrovirus (reviewed in Ref. 32). This PBS is complementary to 18 bases at the 3' end of a host-encoded tRNA. Although the tRNA for proline, and to a lesser extent glutamine, acts as the typical tRNA primer for mammalian C-type retroviruses (reviewed in Ref. 32), the PBS (bases 487–504) found in this ground squirrel genomic sequence shared 100% complementarity to the 3' terminus of a human glycine tRNA (accession no. K00208; Ref. 12). The PPT, on the other hand, lies immediately upstream of the 3'-LTR and provides the priming site for plus-strand DNA synthesis of the retrovirus (reviewed in Ref. 32). In this ground squirrel sequence, a near perfect PPT (17/18 nucleotides) was present from bases 8244–8261. These sequences generally range from 7 to 18 bases in length (22). Last, bordering the integrated provirus was a 4-bp direct repeat (ATTC). Formation of this direct repeat is a consequence of the viral DNA insertion event (as reviewed in Ref. 6).
Within the putative LTRs, boundaries for the U3, R, and U5 regions were determined by the definition of these regions. As reviewed in Vogt (35), the transcription start site establishes the boundary between the U3 and the R regions, and the polyadenylation site marks the boundary between the R and the U5 regions. In the ground squirrel genomic sequence, the transcription start sites were located at the beginning of exons 1 and 5, 36 bases downstream of their respective TATA boxes (Fig. 5). Conversely, the polyadenylation signal spanned bases 409–414 and bases 8582–8587 in the 5'- and the 3'-LTRs, respectively. Relative to these locations, the R-U5 boundary was marked 21 bases downstream of this consensus element (Fig. 5). Two observations provided the foundation for this determination. The first, supplied by Chen and Barker (7), was that most R regions ended with the dinucleotide CA. An alignment with the R regions from other mammalian type C retroviruses was used to select the appropriate CA (7). The second, as reviewed in Petropoulos (22), was that the polyadenylation tract was commonly found 15–20 bases downstream of the polyadenylation signal. The proposed boundary met both of these guidelines.
Thirteen-lined ground squirrel PTL gene: Comparative genomic analysis.
To examine further the possibility that a retrovirus inserted into the promoter region of a single functional PTL gene, a comparison was made between the ground squirrel PTL gene, with the retroviral sequence removed, and the rat, mouse, and human PTL genes. As seen in Fig. 6, the 3' end of exon 5 in the ground squirrel gene aligned with exon 1 in the rat, mouse, and human genes and extended 220 bases upstream of their +1 sites. Within these 220 bases was a perfectly conserved TATA box located 24 bases from the transcriptional start site for rat, mouse, and human PTLs. Other areas of high sequence identity were also found within this region and could represent transcription factor binding sites important for the regulation of this gene in an intact promoter. With the retroviral sequence removed for this comparison, the upstream boundary for exon 5 in Fig. 6 is the retroviral insertion site (ATTC). This insertion site fell within a region of low sequence identity among the four species. Rat PTL, however, did share three of the four bases in the insertion sequence with the ground squirrel gene. A look at the sequence downstream of exon 5 showed that sequence identity was high near splice site junctions, through portions of the intron, and throughout the next exon.
The program RepeatMasker2 identified the retroviral sequence present in the ground squirrel PTL gene as an ERV_class I (data not shown). This classification was consistent with our earlier analysis, which placed it in the mammalian type C, or γ-retrovirus, category (37). No ERV_class I elements were found in the 5 kb upstream of the rat, mouse, and human PTL coding regions.
DISCUSSION |
---|
The presence of retroviral sequence proximal to the PTL gene provides a potential mechanism for directing seasonal expression of PTL in thirteen-lined ground squirrels. Novel expression of a pancreatic enzyme mediated by retroviral insertion has been observed previously (33). Parotid-specific expression of the human salivary amylase gene (AMY1C) is driven by a retroviral-like sequence present in its proximal promoter (33). Similarly, in mouse, androgen-responsiveness of the sex-limited promoter (Slp) gene was conferred by retroviral insertion upstream of its promoter (31). More recently, documentation of endogenous retroviral elements providing alternative promoters or enhancers for neighboring genes has been provided for the pleiotrophin gene (28), the endothelin B receptor and the apolipoprotein C-I genes (19), and the Mid1 gene in humans (15). It is likely that more such examples of retroviral-mediated gene regulation will be uncovered for the following reasons: 1) 38.5% and 46% of the mouse and human genomes, respectively, are recognized as having been derived from the insertion of transposable elements, and 2) nearly 10% of mouse and 9% of human insertions are classified as LTR elements (36). For the human genome, if all classes of transposable elements are considered as a whole, then this number could total more than 1,000 genes (14).
Alignment of the ground squirrel PTL gene, after removal of its retroviral sequence, with the promoter regions of the rat, mouse, and human PTL genes illustrated that integration of the provirus occurred just over 200 bases upstream of the original transcription start site. Insertion of the retrovirus occurred in a region of low sequence identity, but disrupted normal promoter function as evidenced by inclusion of what appears to be the original TATA box in the 5'-UTR of the PTL cDNA sequence isolated from heart and pancreas. Although disruptive to normal promoter function, maintenance of the retroviral sequence at this location within the ground squirrel genome suggests that the inserted sequence was not deleterious to overall PTL gene function. Transposable element-derived sequences that are deleterious to gene function are likely to be removed from the genome by selection (14). Jordan et al. (14) presented this hypothesis based on their observation that the percentage of transposable element-derived sequences in human promoters increased as one moved farther upstream from the transcription start site. We propose that insertion of the retrovirus into the promoter region of this ground squirrel gene enabled novel expression of PTL mRNA in a broad range of tissues during hibernation (Fig. 1A). The product of this chimeric mRNA conferred a selective advantage to the organism in the form of low-temperature lipolysis during hibernation (2, 30) and, as a result, has enabled the retroviral sequence to be maintained in the ground squirrel lineage.
Analysis of the retroviral sequence contained in the thirteen-lined ground squirrel PTL gene suggests that the insertion event occurred in relatively recent history. The four bases (ATTC) that represented the original target site, and that were duplicated upon integration of the provirus, are perfectly conserved in the ground squirrel genomic sequence. The gag, pol, and env genes encoded ORFs that are also largely intact (Fig. 4). Along these lines, the longest, uninterrupted ORF was present in the env gene and represented over 500 amino acids. In addition, the LTRs were less than 1% divergent. Working under the assumption that the LTRs are identical at the time of insertion, several researchers have used the percent LTR divergence to date the time of insertion (13, 18, 25). To provide an estimate for the time of our ground squirrel retroviral insertion, we use the synonymous nucleotide substitution rate [0.013 substitutions per site per million years (Myr) or 1.3%/Myr] for the nuclear gene lecithin:cholesterol acyltransferase for the Marmota/Sciurus dichotomy (26). This rate returns an insertion date of 300,000 years ago (0.8%/1.3%, divided by two for divergence from a common sequence) placing it well within the Spermophilus lineage (20).
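The dating arithmetic can be written out as a short worked example. This is a minimal sketch using only the values quoted in the text (0.8% LTR divergence and a rate of 1.3%/Myr); the variable and function names are illustrative. The factor of two reflects that both LTRs accumulate substitutions independently from a common starting sequence.

```python
# Minimal sketch of the LTR-divergence dating arithmetic described above.
# Values are taken from the text; names are illustrative.
ltr_divergence_pct = 0.8  # observed divergence between the two LTRs (%)
rate_pct_per_myr = 1.3    # synonymous substitution rate (% per Myr)


def insertion_age_myr(divergence_pct: float, rate_pct_per_myr: float) -> float:
    """Age = divergence / (2 * rate); both LTRs diverge from one ancestor."""
    return divergence_pct / (2.0 * rate_pct_per_myr)


age_myr = insertion_age_myr(ltr_divergence_pct, rate_pct_per_myr)
print(f"{age_myr:.3f} Myr (~{age_myr * 1e6:,.0f} years)")
```

The result rounds to roughly 0.3 Myr, in line with the estimate of 300,000 years given in the text.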
ACKNOWLEDGMENTS |
---|
This work was supported by US Army Research Office Grant DAAD19-01-1-0014 and Augmentation Awards for Science and Engineering Training DAAG55-97-1-0175 and by North Carolina Biotechnology Center Grant 9805-ARG-0038.
FOOTNOTES |
---|
Address for reprint requests and other correspondence: M. T. Andrews, Dept. of Biochemistry and Molecular Biology, Univ. of Minnesota School of Medicine, 1035 University Drive, Duluth, MN 55812 (E-mail: mandrews{at}d.umn.edu).
10.1152/physiolgenomics.00167.2002.