©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Rat Phospholipid-hydroperoxide Glutathione Peroxidase
cDNA CLONING AND IDENTIFICATION OF MULTIPLE TRANSCRIPTION AND TRANSLATION START SITES (*)

(Received for publication, July 28, 1995; and in revised form, August 29, 1995)

Thimmalapura R. Pushpa-Rekha, Andrea L. Burdsall, Lisa M. Oleksa, Guy M. Chisolm, Donna M. Driscoll (§)

From the Department of Cell Biology, Cleveland Clinic Foundation, Cleveland, Ohio 44195

ABSTRACT

Phospholipid-hydroperoxide glutathione peroxidase (PhGPx) is a selenoenzyme that reduces hydroperoxides of phospholipid, cholesterol, and cholesteryl ester. Previous studies suggested that both the mitochondrial and nonmitochondrial forms of PhGPx are 170 amino acids long. In this study, we isolated a full-length cDNA clone encoding rat testis PhGPx. Based on sequence analysis, the cDNA encodes a protein of 197 amino acids, with translation initiating at AUG^61. The additional 27 amino acids at the N terminus contain the features of a mitochondrial targeting sequence. In vitro translation of the full-length PhGPx mRNA initiated predominantly at AUG^61. However, translation initiated at AUG^142 when AUG^61 was deleted. An RNase protection assay was used to map the 5'-ends of PhGPx mRNAs in rat tissues. We identified two major windows of transcription initiation that are tissue-specific. Rat testis predominantly expresses larger transcripts that encode the 197-amino acid protein containing the potential mitochondrial targeting signal. The predominant smaller transcripts in somatic tissues lack AUG^61 and encode a 170-amino acid protein, which may represent the nonmitochondrial forms of PhGPx. Our results suggest that the use of alternative transcription and translation start sites determines the subcellular localization of PhGPx in different tissues.


INTRODUCTION

PhGPx (^1) is a unique selenoenzyme that reduces phospholipid, cholesterol, and cholesteryl ester hydroperoxides at the expense of glutathione (1, 2, 3). PhGPx can also reduce 7beta-hydroperoxycholesterol, which our laboratory has shown to be the principal cytotoxin in oxidized lipoproteins (4). These lipid hydroperoxides are resistant to the action of the classical glutathione peroxidase, which reduces hydrogen peroxide and free fatty acid hydroperoxides (5). It has been proposed that while glutathione peroxidase is important in removing cytosolic hydroperoxides, PhGPx catalyzes the reductive inactivation of lipid hydroperoxides in membranes and lipoproteins and thus protects cellular membranes against oxidative damage (5).

PhGPx has been purified to homogeneity from pig heart, liver, and brain (6, 7) , rat testis cytosol and mitochondria(8) , and human liver(9) . Biochemical analyses of the purified protein have shown that PhGPx is a 170-amino acid protein of 20 kDa(1, 10) . cDNA clones that encode PhGPx have been isolated from pig heart and blastocyst and from human testis(10, 11, 12) . Although structurally and functionally distinct, PhGPx and glutathione peroxidase share 40% homology at the amino acid level. Several of the active-site residues of glutathione peroxidase are conserved in PhGPx, which suggests that the two enzymes have similar mechanisms of action(10, 11, 13) . Like other selenoproteins, the selenocysteine (Sec) in PhGPx is encoded by a UGA codon, which normally functions as a translation stop codon.

PhGPx is widely expressed, and its enzymatic activity has been detected in all rat tissues examined (14) as well as in several human tumor cell lines(15) . The highest levels of activity are present in rat testis, where PhGPx expression is regulated by gonadotropins in maturing spermatogenic cells(16) . PhGPx has been detected in cytoplasm, mitochondria, and plasma and nuclear membranes, but the structural basis for this subcellular localization has not been determined. Enzymological and immunochemical data suggest that PhGPx is not an integral membrane protein(1, 8) . The cytosolic and membrane-associated forms of the enzyme from rat testis appear to be identical based on their molecular masses, cross-reactivity with antisera, and protein fragmentation pattern(8) . However, the molecular mass of pig PhGPx, predicted from the nucleotide sequence, differed from that determined by laser desorption spectroscopy(10) . It has been suggested that this difference in the molecular mass may be due to a post-translational modification that is necessary for the association of the protein with membranes(10) . Recently, it was demonstrated in rat testis mitochondria that PhGPx is localized in the intermembrane space, possibly at the contact sites of the two membranes(17) . Since PhGPx is a nuclear encoded gene product, this suggests that the protein may be synthesized as a larger precursor containing a mitochondrial targeting signal, which may be cleaved after import into mitochondria.

In this study, we isolated a full-length cDNA clone encoding rat PhGPx and identified two windows of transcription start sites that are tissue-specific. Our results suggest that the predominant full-length transcripts in testis direct the synthesis of a 197-amino acid protein containing a potential mitochondrial targeting signal. Somatic tissues primarily express shorter transcripts that encode a 170-amino acid protein, which may represent the cytosolic and membrane-associated forms of the protein.


EXPERIMENTAL PROCEDURES

Plasmids

Plasmid RP-1 contains the full-length rat PhGPx cDNA (nucleotides 1-871) in the EcoRI/EcoRV site of the vector pcDNAI/Amp (Invitrogen). Plasmid RP-2 is a 5'-deletion mutant of rat PhGPx (nucleotides 131-871). Plasmids CRP-1 and CRP-2 are mutant plasmids in which Sec was changed to Cys in RP-1 and RP-2, respectively.

Oligonucleotides

Oligonucleotides were purchased from Genosys Biotechnologies, Inc. (The Woodlands, TX). The nucleotide positions refer to our nomenclature for the rat PhGPx cDNA (see Fig. 1). The degenerate sequences for oligonucleotides LO-1 and LO-2 were designed based on the published pig PhGPx protein sequence (13). The underlined regions indicate differences from the wild-type rat PhGPx sequence. Oligonucleotides LO-1, LO-2, PHG-1, JS-3, and JS-4 were used for cDNA synthesis and PCR amplification. Oligonucleotides PHG-1 and PHG-2 were used for primer extension and mutagenesis, respectively. LO-1, ATGCATGAATTCTCAGCCAAA/TGAC/TAT (nucleotides 180-206); LO-2, GAA/GAAA/GGAC/TCTGCCGTGCTACCTCTAGCATGC (complementary to nucleotides 628-654); JS-3, GACTCGAGTCGACATCGA(T) (oligo(dT) 3'-primer); JS-4, GACTCGAGTCGACATCG (3'-anchor primer); PHG-1, ACCAACGTGGCCTCGCAATGAGGC (complementary to nucleotides 259-282); and PHG-2, GCCTCGCAATGTGGCAAAACC (nucleotides 268-288).


Figure 1: Nucleotide and deduced amino acid sequences of the full-length rat testis PhGPx cDNA. The two potential translation start sites, AUG^61 and AUG^142, are underlined. TGA, which codes for selenocysteine, is indicated by an asterisk. The sequences in the 3'-UTR that may be necessary for Sec incorporation (ATGA, AAA, and UGG) are in lower-case letters. The potential polyadenylation signal AATAAA at nucleotides 852-857 is double-underlined.



Strategy for Cloning

Total RNA (5 µg) from rat testis was used for cDNA synthesis using oligonucleotide JS-3 as the 3'-primer (18). PCR amplification of cDNA was performed using oligonucleotides LO-1 and JS-4 as described (19). Amplified products were purified by phenol extraction and Sephadex G-50 spin column chromatography. The products were blunt-ended with Klenow DNA polymerase, digested with EcoRI, and subcloned into the EcoRI/EcoRV site of pcDNAI/Amp to generate a partial cDNA clone (18). To isolate the 5'-end of the cDNA, the 5'-rapid amplification of cDNA ends (RACE) kit from CLONTECH was used. cDNA was synthesized using LO-2 as the primer, and the ampliFinder anchor was ligated to the 5'-end of the cDNA. PCR amplification was performed using the PHG-1 and ampliFinder anchor primers according to the manufacturer's instructions (CLONTECH). PCR products were digested with EcoRI and subcloned into the EcoRI site of the partial clone to obtain the full-length cDNA clone, RP-1.

Mutagenesis

To obtain the deletion mutant RP-2, the HindIII/XbaI fragment of RP-1 was digested with BglI to remove nucleotides 1-134 and was subcloned into pcDNAI/Amp. Site-directed mutagenesis of RP-1 and RP-2 was performed with the PHG-2 oligonucleotide using the Kunkel method exactly as described in (18) . Clones containing mutations were identified directly by double-stranded DNA sequencing of miniprep DNA using Sequenase (U. S. Biochemical Corp.).

In Vitro Translation

Synthetic RNAs were transcribed from linear plasmid DNA using T7 RNA polymerase in the presence or absence of a 5'-methyl cap (Boehringer Mannheim). The RNAs (300 ng) were translated in a rabbit reticulocyte lysate translation system (Life Technologies, Inc.) in the presence of [^35S]methionine using the conditions for capped and uncapped RNAs as described by the manufacturer. Translation products were separated by SDS-polyacrylamide gel electrophoresis (14% gel) and analyzed by fluorography and autoradiography as described (18). Canine pancreatic microsomal membranes (Promega) were used for the cotranslational processing experiments as described by the manufacturer.

Northern Blotting

Total RNA was isolated from rat tissues using guanidine isothiocyanate extraction and ultracentrifugation through a cesium chloride gradient (18). Total RNAs (20 µg) were fractionated through 2.2 M formaldehyde, 1.5% agarose gels and transferred to nylon-based membranes. The PhGPx cDNA insert was isolated from plasmid RP-1 by purification in low melting agarose and was ^32P-labeled using the random priming method (18). Filters were hybridized overnight with the ^32P-labeled probe and washed at high stringency (18).

RNase Protection Assay

A fragment corresponding to nucleotides 1-192 of the rat PhGPx cDNA was isolated from plasmid RP-1 by digestion with EcoRI. The fragment was subcloned in the pGEM-3Zf(+) vector (Promega) and used for preparing an antisense RNA probe (18). Plasmid DNA was linearized with HindIII and transcribed in vitro using T7 RNA polymerase (Life Technologies, Inc.) and [alpha-^32P]UTP (50 µCi, 800 Ci/mmol). The RNA was purified by digestion with RNase-free DNase I, phenol/chloroform extraction, and Sephadex G-50 spin column chromatography (18). RNase protection assays were performed using the RPA II kit from Ambion Inc. as described by the manufacturer. Briefly, total RNA was incubated with the ^32P-labeled RNA (300 pg/3.4 fmol/10^5 cpm) overnight at 45 °C in hybridization buffer. RNAs were digested with a 1:100 dilution of an RNase A/RNase T1 mixture for 30 min at 37 °C and precipitated. The products were separated by electrophoresis on an 8 M urea, 5% acrylamide sequencing gel and analyzed by autoradiography. The size of the protected fragments was determined by comparison with DNA sequence ladders and ^32P-labeled φX174 replicative form DNA/HaeIII fragments.

Computer Analysis

DNA sequence analysis and protein secondary structure predictions were performed using the computer program GeneWorks. RNA secondary structure predictions were generated by MulFold(20) .


RESULTS

cDNA Cloning and Sequence Analysis of Rat PhGPx

To obtain a full-length cDNA clone that encodes rat PhGPx, cDNA was synthesized from total rat testis RNA using an oligo(dT) anchor primer. The cDNAs were amplified by PCR using a 3'-anchor primer and a degenerate oligonucleotide based on the published protein sequence of pig PhGPx (13). The PCR products were cloned into the plasmid pcDNAI/Amp, and a partial cDNA clone of 590 nucleotides was isolated. To determine the size of the full-length transcript, we performed primer extension analysis to map the 5'-end of PhGPx mRNA in rat testis. The results indicated that the isolated cDNA clone lacked 280 nucleotides of 5'-sequence (data not shown). The missing 5'-region was obtained using RACE technology to yield a full-length cDNA clone of 871 nucleotides. Three independent cDNA clones were analyzed by DNA sequencing to eliminate the possibility of artifacts due to misincorporation of nucleotides during PCR. The cDNA sequence and the deduced amino acid sequence of rat PhGPx are shown in Fig. 1.

Analysis of the nucleotide sequence revealed an open reading frame of 197 amino acids, with translation initiating at the AUG codon at nucleotides 61-63 (AUG^61). As shown in Fig. 2, PhGPx is highly conserved across species, and rat PhGPx shares 93 and 91% identity at the amino acid level with pig and human PhGPx, respectively. Tyr, which was shown to be phosphorylated in pig PhGPx in vivo (13), is conserved in the rat sequence. The amino acid sequence of rat PhGPx also shares 40% homology with rat glutathione peroxidase. Several of the active-site residues of glutathione peroxidase are conserved at homologous positions in rat PhGPx, including Sec, Gly, Gln, and the triplet Trp-Asn-Phe. As in other selenoproteins, Sec in rat PhGPx is encoded by an in-frame UGA codon (TGA in the cDNA sequence; shown by the asterisk in Fig. 1).


Figure 2: Homology of rat, pig, and human PhGPx. The deduced amino acid sequences of rat (this work), pig (28), and human (14) PhGPx were aligned using the computer program GeneWorks. The residue numbers refer to the rat testis PhGPx sequence, with translation initiating at AUG^61 (Fig. 1), and the conserved residues are boxed.



The 3'-untranslated region (UTR) of rat PhGPx is 217 nucleotides long, and the sequence shares 80% homology with the 3'-UTRs of human and pig PhGPx, which is a high degree of conservation for a noncoding region. In the pig blastocyst PhGPx cDNA, two polyadenylation signals were found in the 3'-UTR, indicating that the distal polyadenylation signal is utilized in this tissue (11). However, our three independent rat testis cDNA clones terminated after the proximal polyadenylation signal at nucleotides 852-857 (Fig. 1). cDNAs isolated from pig heart (13) and human testis (12) also contained only the proximal polyadenylation signal.
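The coordinates reported so far can be cross-checked with simple arithmetic. The sketch below is only a consistency check: it assumes that a single stop codon directly follows the 197 sense codons and precedes the 3'-UTR, which the text does not state explicitly, and all names in it are illustrative.

```python
# Consistency check of the cDNA coordinates reported in the text.
# Assumption (not stated in the text): exactly one stop codon follows the
# 197 sense codons and immediately precedes the 3'-UTR.

CDNA_LENGTH = 871           # full-length cDNA, nucleotides 1-871
FIRST_AUG = 61              # first AUG occupies nucleotides 61-63
LONG_FORM_RESIDUES = 197    # protein encoded from the first AUG
N_TERMINAL_EXTENSION = 27   # residues absent from the 170-amino acid form
UTR3_LENGTH = 217           # 3'-untranslated region


def cdna_layout():
    """Derive the remaining coordinates from the reported numbers."""
    utr5_length = FIRST_AUG - 1                    # nucleotides 1-60
    coding_length = 3 * (LONG_FORM_RESIDUES + 1)   # sense codons plus one stop
    coding_end = FIRST_AUG + coding_length - 1
    second_aug = FIRST_AUG + 3 * N_TERMINAL_EXTENSION
    return {
        "utr5_length": utr5_length,
        "coding_end": coding_end,
        "second_aug": second_aug,
        "short_form_residues": LONG_FORM_RESIDUES - N_TERMINAL_EXTENSION,
        "total_length": utr5_length + coding_length + UTR3_LENGTH,
    }


if __name__ == "__main__":
    for name, value in cdna_layout().items():
        print(f"{name}: {value}")
```

Under this assumption the three segments sum to the reported 871 nucleotides, and the second in-frame AUG falls 81 nucleotides (27 codons) downstream of the first.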

The 3'-UTRs of other eukaryotic selenoprotein mRNAs are necessary for the cotranslational insertion of Sec at the UGA codon, which is normally a translation stop codon (21, 22). The decoding of UGA as Sec requires a stable stem-loop structure as well as specific nucleotide sequences in the 3'-UTR (21, 22). The 3'-UTR of rat PhGPx is predicted to form a stable stem-loop structure with a high negative free energy (ΔG = -62.4 kcal) based on computer analysis with the program MulFold (data not shown). The rat (this study) and pig (10) PhGPx 3'-UTRs also contain the motifs required for Sec incorporation in other mammalian selenoproteins, including AUGA, AAA, and UGA/G (shown in lower-case letters in Fig. 1). In contrast to other selenoproteins that contain the sequence AAA in the terminal loop of the stem-loop structure (21, 22), this sequence is located in a single-stranded bulge region in rat (this study) and pig (10) PhGPx.

PhGPx mRNA Contains Two Potential Translation Start Sites

The cDNAs of rat, human, and pig PhGPx encode a 23-kDa protein of 197 amino acids if translation initiates at the first AUG codon in the mRNA (Fig. 1 and Fig. 2). However, biochemical analyses of the purified protein, including molecular mass, amino acid composition, and N-terminal sequencing, have shown that PhGPx is a 19.6-kDa protein of 170 amino acids (6, 10, 13). Based on these observations, it has been proposed that the second AUG codon in the mRNA (AUG^142 in rat PhGPx) is the putative translation start site (11). Kozak (23) has proposed a ribosome scanning model in which translation initiates at the first AUG codon in the mRNA that is present in an optimal context. A consensus initiation sequence has been identified as GCCGCCA/GCCAUGG. Both AUG^61 and AUG^142 are located in a Kozak ribosome initiation site, with 9/13 and 10/13 nucleotides of the consensus sequence, respectively (Fig. 1).
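The 9/13 and 10/13 figures are counts of positions that agree with the 13-nucleotide consensus. A minimal sketch of such a count is given below; it assumes that all 13 positions, including the AUG itself, are scored, which may differ from how the counts above were obtained. The function name and the test sequences are hypothetical and are not taken from Fig. 1.

```python
# Position-by-position comparison with the consensus GCCGCC(A/G)CCAUGG.
# Each entry lists the nucleotides accepted at that position.
KOZAK_CONSENSUS = ["G", "C", "C", "G", "C", "C", "AG",
                   "C", "C", "A", "U", "G", "G"]


def kozak_matches(context):
    """Count positions of a 13-nucleotide context that match the consensus.

    The context must span positions -9 to +4 relative to the A of the AUG.
    DNA input is accepted; T is read as U.
    """
    seq = context.upper().replace("T", "U")
    if len(seq) != len(KOZAK_CONSENSUS):
        raise ValueError("context must be exactly 13 nucleotides long")
    return sum(base in allowed for base, allowed in zip(seq, KOZAK_CONSENSUS))


if __name__ == "__main__":
    # Hypothetical contexts for illustration only.
    for context in ("GCCGCCACCAUGG", "AAAAAAAAAAUGA"):
        print(context, f"{kozak_matches(context)}/13")
```

A perfect consensus scores 13/13, so a score of 9 or 10 indicates a reasonably favorable but not optimal context.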

To identify the translation start site of rat PhGPx, the full-length cDNA (construct RP-1) was transcribed in vitro, and the synthetic RNA was translated in a rabbit reticulocyte lysate system. When the translation products were analyzed by SDS-polyacrylamide gel electrophoresis, only a truncated protein was detected due to premature termination of translation at UGA, which encodes Sec (Fig. 3, first lane). This is consistent with previous studies that showed that selenocysteine incorporation is inefficient in reticulocyte lysate (21). To avoid the premature termination, we used site-directed mutagenesis to convert UGA to UGU, which encodes cysteine (construct CRP-1). We also constructed a deletion mutant of RP-1 by deleting 134 nucleotides from the 5'-end (construct RP-2) as well as the cysteine mutant of this deletion construct (construct CRP-2). These two deletion mutants lacked AUG^61, but contained AUG^142. As shown in Fig. 3, in vitro translation of the CRP-1 RNA produced predominantly a protein of 24 kDa (second lane), whereas a 21-kDa protein was obtained when the CRP-2 deletion mutant was translated (fourth lane). We also observed a minor protein of 21 kDa in the CRP-1 translation assays, which may represent a low level of initiation at AUG^142 in the full-length transcript. The 3-kDa size difference between the CRP-1 and CRP-2 translation products is equivalent to the predicted molecular mass of amino acids 1-27. These results suggest that translation initiates predominantly at AUG^61 in the full-length PhGPx mRNA and that AUG^142 can function as an efficient translation start site in the absence of AUG^61. Previous studies in other systems have shown that the choice of translation start sites can be influenced by the secondary structure of the 5'-UTR or by capping of the mRNA (23, 24). The sequence upstream of AUG is highly GC-rich (70%) and has the potential to form a stable secondary structure (ΔG = -50.6 kcal) based on computer analysis with the program MulFold. However, we found that initiation of translation at AUG in vitro was not affected by heating of the RNA (65 °C, 10 min) prior to translation. There was also no difference in the pattern of translation initiation when capped or uncapped RNAs were translated for 30 or 60 min (data not shown).
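The logic of these constructs can be illustrated with a toy model of first-AUG initiation. The sketch below is not a model of the reticulocyte lysate: it simply starts at the first AUG, reads the standard genetic code, and treats UGA as a stop codon. The short RNA sequences are invented placeholders that mimic the layout of the wild-type, cysteine-mutant, and 5'-deleted constructs; they are not PhGPx sequences.

```python
from itertools import product

# Standard genetic code, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_from_first_aug(mrna):
    """Translate from the first AUG until a stop codon or the end of the RNA.

    UGA is read as a stop codon, as in a system that does not insert Sec.
    """
    seq = mrna.upper().replace("T", "U")
    start = seq.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy RNAs: leader, first AUG, two codons, second in-frame AUG,
# one codon, the Sec position (UGA or the UGU mutant), one codon, UAA stop.
WILD_TYPE = "CCGCC" "AUG" "GCU" "CUG" "AUG" "GGC" "UGA" "UUC" "UAA" "GG"
CYS_MUTANT = "CCGCC" "AUG" "GCU" "CUG" "AUG" "GGC" "UGU" "UUC" "UAA" "GG"
CYS_MUTANT_DELETED = CYS_MUTANT[8:]  # 5'-deletion that removes the first AUG

if __name__ == "__main__":
    for name, rna in (("wild type", WILD_TYPE),
                      ("Cys mutant", CYS_MUTANT),
                      ("Cys mutant, 5'-deleted", CYS_MUTANT_DELETED)):
        print(f"{name}: {translate_from_first_aug(rna)}")
```

In this toy, the wild-type RNA gives a truncated product, the cysteine mutant gives the full-length product, and the 5'-deleted mutant gives a shorter product that starts at the second AUG, which parallels the pattern described for RP-1, CRP-1, and CRP-2.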


Figure 3: In vitro translation of PhGPx mRNAs. Synthetic RNAs were transcribed from plasmids containing the full-length PhGPx cDNA (RP-1) and its cysteine mutant (CRP-1) or the deletion construct lacking nucleotides 1-134 (RP-2) and its cysteine mutant (CRP-2). RNAs were translated in a rabbit reticulocyte lysate system in the presence of [^35S]methionine. Translation products were analyzed by SDS-polyacrylamide gel electrophoresis, followed by fluorography and autoradiography. The large and small arrows indicate the protein products due to translation initiating at AUG^61 and AUG^142, respectively. The positions of the molecular mass markers (in kilodaltons) are indicated.



Function of the N-terminal 27-Amino Acid Sequence

The 27 amino acids at the N terminus of PhGPx are highly conserved across species (Fig. 2), which suggests a functional role for this sequence. Because PhGPx has been shown to be associated with both plasma and nuclear membranes(16) , the N-terminal sequence may represent a signal peptide that is cotranslationally cleaved. To test this hypothesis, we performed in vitro translation experiments in the presence of canine pancreatic microsomal membranes. The microsomal membranes efficiently cleaved the signal peptide of a control protein, beta-lactamase. However, the full-length 24-kDa PhGPx translation product was not cleaved to a shorter form when translation was performed in the presence of microsomal membranes (data not shown), indicating that the N-terminal 27-amino acid sequence of PhGPx may not represent a signal peptide.

PhGPx has also been localized to the intermembrane space in mitochondria isolated from rat testis (17). Most nuclear-encoded mitochondrial proteins are synthesized as larger precursors containing N-terminal presequences that target the protein for mitochondrial import. In the case of proteins localized to the intermembrane space, a bipartite targeting signal is required (25). After cleavage by a matrix protease to remove the N-terminal basic domain, the remaining hydrophobic sequence directs the protein to the intermembrane space, where cleavage by a membrane-associated peptidase yields the mature protein. As shown in Fig. 4, the N-terminal sequence of PhGPx has the features of such a mitochondrial targeting signal. The 27-amino acid length conforms to the typical size of leader sequences, which are 10-70 amino acids long. The PhGPx sequence is leucine-rich (7/27 amino acids), contains three hydroxyl amino acids, and lacks acidic residues. Three basic amino acids in the N-terminal portion of the sequence are followed by a hydrophobic region. The sequence is also predicted to form an amphiphilic alpha-helical structure (26), with the positively charged and hydrophobic amino acids on opposite faces of the helix (data not shown). In addition, PhGPx also contains the sequence Arg^5-Leu^6-Ser^7-Arg^8-Leu^9-Leu^10, which is identical to the sequence preceding the proposed cleavage site in rat mitochondrial aldehyde dehydrogenase (27).
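The compositional features listed above (length, leucine content, basic, acidic, and hydroxylated residues) are straightforward to tabulate for any candidate presequence. The sketch below is one way to do so. The residue classes are assumptions (Lys/Arg as basic, Asp/Glu as acidic, Ser/Thr as hydroxylated), and the example peptide is an invented placeholder built to have the composition described in the text; it is not the sequence shown in Fig. 4.

```python
# Tabulate simple compositional features of a candidate presequence.
# Residue classes are assumptions made for this sketch.
BASIC = set("KR")
ACIDIC = set("DE")
HYDROXYLATED = set("ST")


def presequence_features(peptide, motif="RLSRLL"):
    """Return counts of the residue classes and the 1-based motif position."""
    seq = peptide.upper()
    index = seq.find(motif)
    return {
        "length": len(seq),
        "leucine": seq.count("L"),
        "basic": sum(residue in BASIC for residue in seq),
        "acidic": sum(residue in ACIDIC for residue in seq),
        "hydroxylated": sum(residue in HYDROXYLATED for residue in seq),
        "motif_start": index + 1 if index != -1 else None,
    }


# Hypothetical 27-residue placeholder, NOT the rat PhGPx sequence.
EXAMPLE_PEPTIDE = "MSWGRLSRLLKPALLCGALAAPGLAGT"

if __name__ == "__main__":
    for name, value in presequence_features(EXAMPLE_PEPTIDE).items():
        print(f"{name}: {value}")
```

Counts of this kind only flag a sequence as compatible with a targeting signal; they do not substitute for the import and cleavage experiments that would be needed to demonstrate function.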


Figure 4: Analysis of the N-terminal 27 amino acids of PhGPx. The deduced amino acid sequences of the N-terminal 30 amino acids of rat (this work), pig(28) , and human (14) PhGPx were aligned, and the identical residues between the species are boxed. The basic and hydrophobic residues in the sequence are represented in the line diagram. The hexapeptide sequence in the mitochondrial aldehyde dehydrogenase (DH) (38) that is homologous to residues 5-10 in the rat PhGPx sequence is shown. The arrow indicates the proposed site of cleavage of the leader sequence in mitochondrial aldehyde dehydrogenase.



Multiple Transcription Start Sites

Our results suggest that translation of the full-length PhGPx mRNA initiates at AUG^61 to produce a precursor protein containing a potential mitochondrial targeting sequence. The fact that PhGPx has also been detected in the cytosol and in nuclear and plasma membranes suggests that some forms of the protein may be synthesized without the N-terminal 27 amino acids. This could be achieved in vivo by the generation of different mRNA species that encode the cytosolic and mitochondrial forms of the protein. To test this hypothesis, we performed Northern blot analysis of total RNA isolated from various rat tissues. As shown in Fig. 5, PhGPx mRNA was detected as a single species of similar size in all tissues examined, including intestine, liver, spleen, kidney, lung, heart, cerebellum, cerebral cortex, and testis. Since small differences in mRNA size may not be detected by this method, we also developed an RNase protection assay to map the 5'-ends of the PhGPx transcripts. A ^32P-labeled antisense RNA probe, which was complementary to nucleotides 1-192 of the full-length rat PhGPx cDNA, was hybridized to total RNA from rat tissues. After digestion with RNases A and T1, the protected fragments were separated on a sequencing gel.


Figure 5: Expression of PhGPx mRNA in the rat tissues. Total RNAs (20 µg) from rat intestine, liver, spleen, kidney, lung, heart, cerebellum, cerebral cortex, and testis were analyzed by Northern blotting. Filters were hybridized with the rat PhGPx cDNA and washed under high stringency as described under "Experimental Procedures."



As shown in Fig. 6A, multiple protected bands were detected in all rat tissues, suggesting that transcription of the PhGPx gene initiates at multiple sites in vivo. Based on the sizes of the protected bands, we identified two major windows of transcription initiation, which differed between testis and somatic tissues (Fig. 6B). In testis, the major window of transcription initiation lies between nucleotides 1 and 27, which is upstream of AUG^61 (Fig. 6A). These transcripts would encode a 197-amino acid protein that contains the putative mitochondrial targeting signal. A second, minor window of transcription initiation between nucleotides 87 and 102 was also detected in this tissue. Translation of these shorter mRNAs would initiate at AUG^142 to produce a 170-amino acid protein. In addition, two to three protected bands that mapped to nucleotides 134-138 were consistently observed in testis RNA (Fig. 6A, asterisk). It is not known whether these transcripts are translated in vivo since they contain a very short 5'-UTR of only three to seven nucleotides. The pattern was reversed in somatic tissues. In kidney, spleen, lung, cerebral cortex, intestine, liver, and heart, transcription initiated predominantly at a second window between nucleotides 71 and 91 (Fig. 6A). A minor band corresponding to initiation at the first window of transcription was also detected in intestine (Fig. 6A, arrow) and in kidney and cerebral cortex after long exposure of the autoradiogram (data not shown).
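The mapping from protected fragment size to transcription start site follows directly from the probe design: because the probe is complementary to nucleotides 1-192, a transcript that begins at nucleotide s protects a fragment of 192 - s + 1 nucleotides. The sketch below works through that relationship. It assumes that the protected fragment contains no vector-derived probe sequence and that RNase trimming is exact, and it places the second in-frame AUG at nucleotide 142 (27 codons downstream of nucleotide 61). The function names and the example fragment lengths are hypothetical.

```python
# Relating protected fragment length to the 5'-end of a transcript.
PROBE_END = 192     # probe is complementary to cDNA nucleotides 1-192
FIRST_AUG = 61      # first AUG at nucleotides 61-63
SECOND_AUG = 142    # assumed position, 27 codons downstream of the first AUG


def transcript_start(fragment_length):
    """Return the cDNA position of the 5'-end for a protected fragment."""
    if not 1 <= fragment_length <= PROBE_END:
        raise ValueError("fragment length must be between 1 and 192")
    return PROBE_END - fragment_length + 1


def predicted_product(start):
    """Predict the product under simple first-AUG initiation."""
    if start <= FIRST_AUG:
        return "197-amino acid form"
    if start <= SECOND_AUG:
        return "170-amino acid form"
    return "no product predicted from either AUG"


if __name__ == "__main__":
    # Hypothetical fragment lengths chosen to fall in the reported windows.
    for length in (192, 170, 122, 102):
        start = transcript_start(length)
        print(f"{length}-nt fragment -> start at nucleotide {start}: "
              f"{predicted_product(start)}")
```

Fragments longer than 132 nucleotides therefore correspond to transcripts that retain the first AUG, whereas shorter fragments correspond to transcripts from which only the downstream AUG can be used.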


Figure 6: Identification of the PhGPx transcription start sites in rat testis and somatic tissues. An RNase protection assay was used to map the 5'-ends of PhGPx mRNAs in rat tissues. Synthetic RNAs of CRP-1 and CRP-2 (200 pg) were used as controls. Total RNAs from testis (2.5 µg), lung (8 µg), and other somatic tissues (10 µg) were hybridized to a ^32P-labeled RNA that was complementary to nucleotides 1-192 of the rat testis PhGPx cDNA. After RNase digestion, the products were analyzed on a sequencing gel as described under "Experimental Procedures." The size of the protected bands was determined by comparison with DNA sequence ladders or ^32P-labeled φX174 replicative form DNA/HaeIII fragments. A, the two transcription initiation windows in testis are indicated in the left panel by brackets, and the asterisk indicates transcripts that initiate between nucleotides 134 and 138. In the right panel, the arrow and the bracket indicate the minor full-length protected band and the second window of transcription initiation, respectively, in somatic tissues. B, the two windows of transcription start sites in testis and somatic tissues are shown.



Several experiments were performed to confirm that the different protected bands corresponded to true transcription start sites. Multiple RNA samples isolated from several animals and processed separately gave identical results, which suggests that the shorter transcripts were not generated by RNase degradation. RNase protection assays were performed over a wide range of temperatures (45-55 °C) and RNase digestion conditions to eliminate the possibility of nonspecific hybridization to other mRNAs. In addition, the CRP-1 and CRP-2 synthetic RNAs generated discrete protected fragments of the appropriate sizes (Fig. 6A), which suggests that the shorter transcripts in the somatic tissues were not generated by nonspecific cleavage by RNases A and T1. The presence of multiple transcription start sites was also confirmed by an independent method using RACE technology to map the 5'-ends of the mRNAs (data not shown).


DISCUSSION

We have demonstrated that multiple transcription start sites generate two populations of PhGPx mRNAs that have different translation start sites. Mapping of the transcription start sites localized two major windows of transcription initiation, one upstream of AUG^61, which was predominantly used in testis, and another located between AUG^61 and AUG^142, which was used primarily in somatic tissues. The presence of multiple transcription start sites is characteristic of housekeeping genes that lack a TATA box. PhGPx has been shown to be a single copy gene in humans (28). Sequence analysis of a pig genomic clone revealed that the PhGPx promoter lacks a TATA box, is GC-rich, and contains a potential Sp1-binding site (10). These features are similar to those of the promoter of the rat aspartate aminotransferase gene, which also contains multiple transcription start sites (29). The genomic clone for rat PhGPx has not been isolated, but the high conservation of the pig, human, and rat cDNAs suggests that the structure of the gene may be conserved across species. Thus, it is likely that the rat PhGPx promoter also lacks a TATA box.

The predominant form of PhGPx mRNA in rat testis was 100 nucleotides longer at the 5'-terminus than the mRNAs from somatic tissues. Consistent with our results is the finding that the 5'-sequence of the pig parathyroid gland cDNA (10) starts from a region that corresponds to the transcription window downstream of AUG^61, while human testis (12) and pig blastocyst (11) cDNAs start upstream of AUG^61. Although these 5'-sequences were derived from the longest cDNA clones that were obtained, it is not known whether they represent the major form of the mRNA in these tissues. In an analogous system, the rat farnesyl-pyrophosphate synthetase gene promoter contains testis-specific transcription start sites that are located 25-100 nucleotides upstream of the somatic start sites (30). The somatic start sites are clustered into two groups that are preceded by TATA boxes. In contrast, the testis-specific start sites were spread over a region of 90 nucleotides with no obvious initiation sequence. Thus, the somatic and testis sites were apparently controlled by overlapping promoters with different properties. Testis-specific transcription start sites have also been detected in genes coding for angiotensin-converting enzyme (31), proenkephalin (32), cytochrome c (33), and alpha-tubulin (34). As in the case of PhGPx, these proteins are expressed in other organs.

Based on sequence analysis, the rat testis PhGPx cDNA contains two potential translation start sites, AUG^61 and AUG^142, which were also present in the PhGPx cDNA clones isolated from human testis and pig blastocyst (11, 12). Esworthy et al. (12) recently proposed that translation of the human testis PhGPx mRNA may initiate at the upstream AUG codon, although no functional evidence was provided. We have shown that both AUG^61 and AUG^142 in rat PhGPx occur in a favorable context for translation initiation and that both function as efficient translation start sites in vitro when they are the first AUG codon in the mRNA. This conforms with the ribosome scanning model for translation initiation proposed by Kozak (23). In addition to the predominant 24-kDa protein, translation of the full-length PhGPx mRNA in vitro also produced a minor protein of 21 kDa, which appeared to initiate at AUG^142 based on its molecular mass. A low level of initiation at the second AUG codon in the PhGPx mRNA may be due to leaky scanning of the ribosome or to internal entry of the 40 S ribosomal subunit, as has been proposed for other mRNAs (35). Studies using artificial bicistronic mRNAs showed that translation initiation at the 5'-proximal or distal AUG codon was dependent on the cell type and the concentration of eukaryotic initiation factor 4F (36). Alternatively, the 21-kDa protein may be generated by minor degradation of the full-length synthetic RNA from the 5'-end during translation. Although our in vitro translation experiments demonstrated that AUG^61 was predominantly used as the translation initiation codon in the full-length transcript, the possibility of preferential initiation of translation at AUG^142 in vivo cannot be ruled out.

The N-terminal 27-amino acid sequence of PhGPx is highly conserved across species and contains the known features of a mitochondrial targeting sequence. This is consistent with the fact that PhGPx has been localized to the intermembrane space in rat testis mitochondria (17). Like PhGPx, other proteins targeted to this region contain a bipartite targeting signal with basic residues at the N terminus, followed by an uninterrupted hydrophobic stretch of 20 amino acids (25). The eukaryotic mitochondrial intermembrane space proteins, such as cytochromes b2 and c1 of the bc1 complex, are synthesized as cytosolic precursors containing such bipartite presequences (25). The positively charged N-terminal sequence is cleaved by a mitochondrial processing peptidase after import of the protein into the matrix. The remaining hydrophobic stretch directs the export of the protein back across the inner membrane, where it is cleaved to the mature protein by a membrane-associated peptidase. Our results suggest that PhGPx may be synthesized in rat testis as a 197-amino acid precursor protein that is cleaved to the mature form during import into the mitochondrial intermembrane space. PhGPx contains the sequence RLSRLL, which is identical to the sequence in rat aldehyde dehydrogenase that is cleaved by the matrix processing peptidase (27). This sequence is also similar to a pentapeptide sequence (SRLLK) in the leader sequence of another mitochondrial matrix protein, yeast KAD2 (37).

Taken together, our results suggest that a single PhGPx gene encodes both cytosolic and mitochondrial forms of the protein through differential transcription and translation start sites. This type of mechanism has been reported for several nuclear genes that encode cytosolic and mitochondrial proteins in yeast, Neurospora crassa, and mammals (38, 39, 40, 41, 42, 43, 44, 45). Our model is consistent with current knowledge on the subcellular distribution of PhGPx in rat tissues. Studies by Ursini and co-workers (16) have shown that the majority of PhGPx activity is localized in the mitochondria in rat testis, whereas the predominant form of PhGPx in rat liver is cytosolic. However, the subcellular distribution of PhGPx in other tissues has not been analyzed in detail. Although PhGPx was detected in both the soluble and membrane fractions from various rat tissues, the membrane fractions included nuclear and plasma membranes as well as mitochondria (14). Our results suggest that PhGPx will primarily be nonmitochondrial in somatic tissues, since the predominant transcripts in these tissues lack the upstream AUG.
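The model can be summarized in a few lines of Python. The sketch below is a simplification under a strict first-AUG rule, not a description of the RNase protection analysis. The coordinate arguments are hypothetical positions on a shared reference, and the average residue mass of about 0.110 kDa is a rule-of-thumb assumption used only to compare the 27-residue difference with the roughly 3-kDa gap between the 24- and 21-kDa in vitro products.

```python
FULL_LENGTH_AA = 197     # product initiating at the upstream AUG (from the text)
SHORT_FORM_AA = 170      # product initiating at the downstream AUG (from the text)
AVG_RESIDUE_KDA = 0.110  # assumption: rule-of-thumb average residue mass


def presequence_length():
    """Residues present only in the longer product."""
    return FULL_LENGTH_AA - SHORT_FORM_AA


def predicted_product(five_prime_end, upstream_aug):
    """Predicted product length (amino acids) under a first-AUG rule.

    Both arguments are hypothetical 0-based positions on one reference
    sequence. A transcript that starts downstream of the upstream AUG
    lacks that codon, so the downstream AUG is its first start site.
    """
    if five_prime_end <= upstream_aug:
        return FULL_LENGTH_AA
    return SHORT_FORM_AA


def presequence_mass_kda():
    """Approximate mass contributed by the extra N-terminal residues."""
    return presequence_length() * AVG_RESIDUE_KDA


spacing_nt = presequence_length() * 3  # nucleotides between the two AUG codons
```

With these inputs the two AUG codons are 81 nucleotides apart, and the extra 27 residues account for about 3 kDa, which agrees with the size difference between the two in vitro translation products.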

Although the physiological significance of the high-level expression of PhGPx in testis mitochondria is not understood, this enzyme may play an important role in protecting mitochondrial DNA against oxidative damage. This hypothesis is consistent with the greater susceptibility of mitochondrial DNA than nuclear DNA to oxidative damage, which has been attributed to the lack of histones protecting the mitochondrial DNA, the lack of DNA repair enzymes in mitochondria, and the proximity of mitochondrial DNA to oxidants generated during oxidative phosphorylation (46).


FOOTNOTES

*
This work was supported by Grant HL29582 from the National Institutes of Health (to G. M. C.), a postdoctoral fellowship from the American Heart Association, Northeast Ohio Affiliate (to T. R. P.-R.), and an Established Investigator award from the American Heart Association (to D. M. D.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number U37427.

§
Established Investigator of the American Heart Association. To whom correspondence should be addressed: Dept. of Cell Biology, Cleveland Clinic Foundation, 9500 Euclid Ave., NC-10, Cleveland, OH 44195. Tel.: 216-445-9758; Fax: 216-444-9404.

(^1)
The abbreviations used are: PhGPx, phospholipid-hydroperoxide glutathione peroxidase; Sec, selenocysteine; PCR, polymerase chain reaction; RACE, 5'-rapid amplification of cDNA ends; UTR, untranslated region.


ACKNOWLEDGEMENTS

We thank Dr. Anuradha Mehta and Dr. Scott Colles for advice and helpful discussions and Jim Lang for photography.


REFERENCES

  1. Ursini, F., Maiorino, M., and Gregolin, C. (1985) Biochim. Biophys. Acta 839, 62-70
  2. Thomas, J. P., Geiger, P. G., Maiorino, M., Ursini, F., and Girotti, A. W. (1990) Biochim. Biophys. Acta 1045, 252-260
  3. Thomas, J. P., Maiorino, M., Ursini, F., and Girotti, A. W. (1990) J. Biol. Chem. 265, 454-461
  4. Chisolm, G. M., Ma, G., Irwin, K. C., Martin, L. L., Gunderson, K. G., Linberg, L. F., Morel, D. W., and DiCorleto, P. E. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 11452-11456
  5. Flohe, L. (1989) Coenzyme and Cofactors, Wiley-Interscience, New York
  6. Maiorino, M., Gregolin, C., and Ursini, F. (1990) Methods Enzymol. 186, 448-457
  7. Ursini, F., Maiorino, M., Valente, M., Ferri, L., and Gregolin, C. (1982) Biochim. Biophys. Acta 710, 197-211
  8. Roveri, A., Maiorino, M., Nissi, C., and Ursini, F. (1994) Biochim. Biophys. Acta 1208, 211-221
  9. Chambers, J. S., Lambert, N., and Williamson, G. (1995) Int. J. Biochem. 26, 1279-1286
  10. Brigelius-Flohe, R., Aumann, K., Blocker, H., Gross, G., Kiess, M., Kloppel, K., Maiorino, M., Roveri, A., Schuckelt, R., Ursini, F., Wingender, E., and Flohe, L. (1994) J. Biol. Chem. 269, 7342-7348
  11. Sunde, R. A., Dyer, J. A., Moran, T. V., Evenson, J. K., and Sugimoto, M. (1993) Biochim. Biophys. Acta 193, 905-911
  12. Esworthy, R. S., Doan, K., Doroshow, J. H., and Chu, F. F. (1994) Gene (Amst.) 144, 317-318
  13. Schuckelt, R., Brigelius-Flohe, R., Maiorino, M., Roveri, A., Reumkens, J., Strassburger, W., Ursini, F., Wolf, B., and Flohe, L. (1991) Free Radical Res. Commun. 14, 343-361
  14. Roveri, A., Maiorino, M., and Ursini, F. (1994) Methods Enzymol. 233, 202-212
  15. Maiorino, M., Chu, F. F., Ursini, F., Davis, K. J. A., Doroshow, J. H., and Esworthy, R. S. (1991) J. Biol. Chem. 266, 7728-7732
  16. Roveri, A., Casasco, A., Maiorino, M., Dalan, D., Calligaro, A., and Ursini, F. (1992) J. Biol. Chem. 267, 6142-6146
  17. Godeas, C., Sandri, G., and Panfili, E. (1994) Biochim. Biophys. Acta 1191, 147-150
  18. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  19. Driscoll, D. M., Lakhe-Reddy, S., Oleksa, L. M., and Martinez, D. (1993) Mol. Cell. Biol. 13, 7288-7294
  20. Jaeger, J. A., Turner, D. H., and Zuker, M. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 7706-7710
  21. Berry, M. J., Banu, L., Chen, Y., Mandel, S. J., Kieffer, J. D., Harney, J. W., and Larsen, P. (1991) Nature 353, 273-276
  22. Shen, Q., Chu, F., and Newburger, P. (1993) J. Biol. Chem. 268, 11463-11468
  23. Kozak, M. (1989) J. Cell Biol. 108, 229-241
  24. Kozak, M. (1990) Annu. Rev. Cell Biol. 8, 197-225
  25. Hartl, F.-U., Ostermann, J., Guiard, B., and Neupert, W. (1987) Cell 51, 1027-1037
  26. von Heijne, G. (1986) EMBO J. 5, 1335-1342
  27. Farris, J., Guan, K.-L., and Weiner, H. (1988) Biochem. Biophys. Res. Commun. 150, 1083-1087
  28. Chu, F. F. (1994) Cytogenet. Cell Genet. 66, 96-98
  29. Toussaint, C., Bousquet-Lemercier, B., Garlatti, M., Hanoune, J., and Barouki, R. (1994) J. Biol. Chem. 269, 13318-13324
  30. Teruya, J., Kutsunai, S., Spear, D. H., Edwards, P., and Clarke, C. (1990) Mol. Cell. Biol. 10, 2315-2326
  31. Howard, T., Shai, S., Langford, K., Martin, B., and Bernstein, K. (1990) Mol. Cell. Biol. 10, 4294-4302
  32. Kilpatrick, D. L., Zinn, S. A., Fitzgerald, M., Higuchi, H., Sabol, S., and Meyerhardt, J. (1990) Mol. Cell. Biol. 10, 3717-3726
  33. Hake, L. E., and Hecht, N. B. (1993) J. Biol. Chem. 268, 4788-4797
  34. Villasante, A., Wang, D., Dobner, P., Dolph, P., Lewis, S., and Cowan, N. (1986) Mol. Cell. Biol. 6, 2409-2419
  35. Kozak, M. (1991) J. Cell Biol. 115, 887-903
  36. Tahara, S. M., Dietlin, T. A., Dever, T. E., Merrick, W. C., and Worrilow, L. M. (1991) J. Biol. Chem. 266, 3594-3601
  37. Cooper, A. J., and Friedberg, E. C. (1992) Gene (Amst.) 114, 145-148
  38. Natsoulis, G., Hilger, F., and Fink, G. R. (1986) Cell 46, 235-243
  39. Chatton, B., Walter, P., Ebel, J.-P., Lacroute, F., and Fasiolo, F. (1988) J. Biol. Chem. 263, 52-57
  40. Kubelik, A. R., Turcq, B., and Lambowitz, A. M. (1991) Mol. Cell. Biol. 11, 4022-4035
  41. Ellis, S. R., Hopper, A. K., and Martin, N. C. (1989) Mol. Cell. Biol. 9, 1611-1620
  42. Dihanich, M. E., Najarian, D., Clark, R., Gillman, E., Martin, N. C., and Hopper, A. K. (1987) Mol. Cell. Biol. 7, 177-184
  43. Tropschug, M., Nicholson, D. W., Hartl, F.-U., Kohler, H., Pfanner, N., Wachter, E., and Neupert, W. (1988) J. Biol. Chem. 263, 14433-14440
  44. Beltzer, J. P., Morris, S. R., and Kohlhaw, G. B. (1988) J. Biol. Chem. 263, 368-374
  45. Oda, T., Funai, T., and Ichiyama, A. (1990) J. Biol. Chem. 265, 7513-7519
  46. Shigenaga, M. K., Hagen, T. M., and Ames, B. N. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 10771-10778
