(Received for publication, July 28, 1995; and in revised form, August 29, 1995)
Phospholipid-hydroperoxide glutathione peroxidase (PhGPx) is a
selenoenzyme that reduces hydroperoxides of phospholipid, cholesterol,
and cholesteryl ester. Previous studies suggested that both the
mitochondrial and nonmitochondrial forms of PhGPx are 170 amino
acids long. In this study, we isolated a full-length cDNA clone
encoding rat testis PhGPx. Based on sequence analysis, the cDNA encodes
a protein of 197 amino acids, with translation initiating at AUG61.
The additional 27 amino acids at the N terminus contain the features
of a mitochondrial targeting sequence. In vitro translation of the
full-length PhGPx mRNA initiated predominantly at AUG61. However,
translation initiated at AUG142 when AUG61 was deleted. An RNase
protection assay was used to map the 5`-ends of PhGPx mRNAs in rat
tissues. We identified two major windows of transcription initiation
that are tissue-specific. Rat testis predominantly expresses larger
transcripts that encode the 197-amino acid protein containing the
potential mitochondrial targeting signal. The predominant smaller
transcripts in somatic tissues lack AUG61 and encode a
170-amino acid protein, which may represent the nonmitochondrial forms
of PhGPx. Our results suggest that the use of alternative transcription
and translation start sites determines the subcellular localization of
PhGPx in different tissues.
PhGPx is a unique selenoenzyme that reduces
phospholipid, cholesterol, and cholesteryl ester hydroperoxides at the
expense of glutathione (1, 2, 3). PhGPx can also reduce
7β-hydroperoxycholesterol, which our laboratory has shown to be the
principal cytotoxin in oxidized lipoproteins (4). These lipid
hydroperoxides are resistant to
the action of the classical glutathione peroxidase, which reduces
hydrogen peroxide and free fatty acid hydroperoxides(5) . It
has been proposed that while glutathione peroxidase is important in
removing cytosolic hydroperoxides, PhGPx catalyzes the reductive
inactivation of lipid hydroperoxides in membranes and lipoproteins and
thus protects cellular membranes against oxidative damage(5) .
PhGPx has been purified to homogeneity from pig heart, liver, and
brain (6, 7) , rat testis cytosol and
mitochondria(8) , and human liver(9) . Biochemical
analyses of the purified protein have shown that PhGPx is a 170-amino
acid protein of 20 kDa(1, 10) . cDNA clones that
encode PhGPx have been isolated from pig heart and blastocyst and from
human testis(10, 11, 12) . Although
structurally and functionally distinct, PhGPx and glutathione
peroxidase share 40% homology at the amino acid level. Several of the
active-site residues of glutathione peroxidase are conserved in PhGPx,
which suggests that the two enzymes have similar mechanisms of
action(10, 11, 13) . Like other
selenoproteins, the selenocysteine (Sec) in PhGPx is encoded by a UGA
codon, which normally functions as a translation stop codon.
PhGPx is widely expressed, and its enzymatic activity has been detected in all rat tissues examined (14) as well as in several human tumor cell lines(15) . The highest levels of activity are present in rat testis, where PhGPx expression is regulated by gonadotropins in maturing spermatogenic cells(16) . PhGPx has been detected in cytoplasm, mitochondria, and plasma and nuclear membranes, but the structural basis for this subcellular localization has not been determined. Enzymological and immunochemical data suggest that PhGPx is not an integral membrane protein(1, 8) . The cytosolic and membrane-associated forms of the enzyme from rat testis appear to be identical based on their molecular masses, cross-reactivity with antisera, and protein fragmentation pattern(8) . However, the molecular mass of pig PhGPx, predicted from the nucleotide sequence, differed from that determined by laser desorption spectroscopy(10) . It has been suggested that this difference in the molecular mass may be due to a post-translational modification that is necessary for the association of the protein with membranes(10) . Recently, it was demonstrated in rat testis mitochondria that PhGPx is localized in the intermembrane space, possibly at the contact sites of the two membranes(17) . Since PhGPx is a nuclear encoded gene product, this suggests that the protein may be synthesized as a larger precursor containing a mitochondrial targeting signal, which may be cleaved after import into mitochondria.
In this study, we isolated a full-length cDNA clone encoding rat PhGPx and identified two windows of transcription start sites that are tissue-specific. Our results suggest that the predominant full-length transcripts in testis direct the synthesis of a 197-amino acid protein containing a potential mitochondrial targeting signal. Somatic tissues primarily express shorter transcripts that encode a 170-amino acid protein, which may represent the cytosolic and membrane-associated forms of the protein.
Figure 1: Nucleotide and deduced amino acid sequences of the
full-length rat testis PhGPx cDNA. The two potential translation start
sites, AUG61 and AUG142, are underlined. TGA, which codes for
selenocysteine, is indicated by an asterisk. The sequences in the
3'-UTR that may be necessary for Sec incorporation (ATGA, AAA, and
UGG) are in lower-case letters. The potential polyadenylation signal
AATAAA at nucleotides 852-857 is double-underlined.
Analysis of the nucleotide sequence revealed an open
reading frame of 197 amino acids, with translation initiating at the
AUG codon at nucleotides 61-63 (AUG61). As shown in Fig. 2, PhGPx is
highly conserved across species, and rat PhGPx shares 93 and 91%
identity at the amino acid level with pig and human PhGPx,
respectively. Tyr, which was shown to be phosphorylated in pig PhGPx
in vivo (13), is conserved in the rat sequence. The amino acid
sequence of rat PhGPx also shares 40% homology with rat glutathione
peroxidase. Several of the active-site residues of glutathione
peroxidase are conserved at homologous positions in rat PhGPx,
including Sec, Gly, Gln, and the triplet Trp-Asn-Phe. As in other
selenoproteins, Sec in rat PhGPx is encoded by an in-frame UGA codon
(TGA in the cDNA sequence; shown by the asterisk in Fig. 1).
Figure 2: Homology of rat, pig, and human PhGPx. The
deduced amino acid sequences of rat (this work), pig (28), and
human (14) PhGPx were aligned using the computer program
GeneWorks. The residue numbers refer to the rat testis PhGPx sequence,
with translation initiating at AUG61 (Fig. 1), and
the conserved residues are boxed.
The 3`-untranslated region (UTR) of rat PhGPx is 217 nucleotides long, and the sequence shares 80% homology with the 3`-UTRs of human and pig PhGPx, which is a high degree of conservation for a noncoding region. In the pig blastocyst PhGPx cDNA, two polyadenylation signals were found in the 3`-UTR, indicating that the distal polyadenylation signal is utilized in this tissue(11) . However, our three independent rat testis cDNA clones terminated after the proximal polyadenylation signal at nucleotides 852-857 (Fig. 1). cDNAs isolated from pig heart (13) and human testis (12) also contained only the proximal polyadenylation signal.
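The coordinates quoted for the open reading frame, the 3'-UTR, and the polyadenylation signal can be cross-checked with simple arithmetic. The sketch below is illustrative only. It assumes that the 3'-UTR begins immediately after the stop codon and that the 217-nucleotide figure excludes any poly(A) tail; the function name `cdna_layout` is hypothetical and is not part of any software used in this work.

```python
# Illustrative coordinate check for the rat testis PhGPx cDNA.
# Values taken from the text: ORF starts at nt 61, encodes 197 amino acids,
# the 3'-UTR is 217 nt long, and AATAAA lies at nt 852-857.
# Assumption (not stated in the text): the 3'-UTR starts right after the
# stop codon and does not include a poly(A) tail.

ORF_START = 61
N_CODONS = 197
UTR3_LENGTH = 217
POLYA_SIGNAL = (852, 857)


def cdna_layout(orf_start=ORF_START, n_codons=N_CODONS, utr3_length=UTR3_LENGTH):
    """Return 1-based inclusive coordinates of coding region, stop codon, and 3'-UTR."""
    coding_end = orf_start + 3 * n_codons - 1      # last nucleotide of codon 197
    stop = (coding_end + 1, coding_end + 3)        # stop codon triplet
    utr3 = (stop[1] + 1, stop[1] + utr3_length)    # 3'-UTR span
    return {"coding": (orf_start, coding_end), "stop": stop, "utr3": utr3}


layout = cdna_layout()
signal_in_utr = layout["utr3"][0] <= POLYA_SIGNAL[0] and POLYA_SIGNAL[1] <= layout["utr3"][1]
print(layout, signal_in_utr)
```

Under these assumptions the polyadenylation signal at nucleotides 852-857 falls inside the computed 3'-UTR, a short distance from its 3'-end, which is consistent with the clones terminating after the proximal signal.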
The 3`-UTRs of
other eukaryotic selenoprotein mRNAs are necessary for the
cotranslational insertion of Sec at the UGA codon, which is normally a
translation stop codon(21, 22) . The decoding of UGA
as Sec requires a stable stem-loop structure as well as specific
nucleotide sequences in the 3`-UTR(21, 22) . The
3'-UTR of rat PhGPx is predicted to form a stable stem-loop structure
with a high negative free energy (ΔG = -62.4
kcal) based on computer analysis with the program MulFold (data not
shown). The rat (this study) and pig (10) PhGPx 3`-UTRs also
contain the motifs required for Sec incorporation in other mammalian
selenoproteins, including AUGA, AAA, and UGA/G (shown in lower-case
letters in Fig. 1). In contrast to other selenoproteins
that contain the sequence AAA in the terminal loop of the stem-loop
structure(21, 22) , this sequence is located in a
single-stranded bulge region in rat (this study) and pig (10) PhGPx.
To identify the translation start site of rat
PhGPx, the full-length cDNA (construct RP-1) was transcribed in
vitro, and the synthetic RNA was translated in a rabbit
reticulocyte lysate system. When the translation products were analyzed
by SDS-polyacrylamide gel electrophoresis, only a truncated protein was
detected due to premature termination of translation at UGA, which
encodes Sec (Fig. 3, first lane). This is consistent with previous
studies that showed that selenocysteine incorporation is inefficient in
reticulocyte lysate (21). To avoid the premature termination, we used
site-directed mutagenesis to convert UGA to UGU, which
encodes cysteine (construct CRP-1). We also constructed a deletion
mutant of RP-1 by deleting 134 nucleotides from the 5'-end (construct
RP-2) as well as the cysteine mutant of this deletion construct
(construct CRP-2). These two deletion mutants lacked AUG61, but
contained AUG142. As shown in Fig. 3, in
vitro translation of the CRP-1 RNA produced predominantly a
protein of 24 kDa (second lane), whereas a 21-kDa protein was
obtained when the CRP-2 deletion mutant was translated (fourth
lane). We also observed a minor protein of 21 kDa in the CRP-1
translation assays, which may represent a low level of initiation at
AUG142 in the full-length transcript. The 3-kDa size
difference between the CRP-1 and CRP-2 translation products is
equivalent to the predicted molecular mass of amino acids 1-27.
These results suggest that translation initiates predominantly at
AUG61 in the full-length PhGPx mRNA and that AUG142 can function as an
efficient translation start site in the absence of AUG61. Previous
studies in other systems have
shown that the choice of translation start sites can be influenced by
the secondary structure of the 5'-UTR or by capping of the
mRNA (23, 24). The sequence upstream of AUG is highly GC-rich (70%)
and has the potential to form a stable secondary structure
(ΔG = -50.6 kcal) based on computer analysis with the program MulFold.
However, we found that initiation of translation at AUG in vitro was
not affected by heating of the RNA (65 °C, 10 min) prior
to translation. There was also no difference in the pattern of
translation initiation when capped or uncapped RNAs were translated for
30 or 60 min (data not shown).
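The statement that the 3-kDa difference matches the predicted mass of amino acids 1-27 can be illustrated with a rough estimate. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value taken from this study; the function name `approx_mass_kda` is hypothetical.

```python
# Rough molecular-mass estimate from chain length.
# Assumption: an average residue mass of ~110 Da (a rule of thumb);
# the actual mass depends on amino acid composition.

AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues):
    """Approximate polypeptide mass in kDa for a chain of n_residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


full_length = approx_mass_kda(197)   # product initiated at the first AUG
short_form = approx_mass_kda(170)    # product initiated at the second AUG
leader = approx_mass_kda(27)         # N-terminal extension, residues 1-27

print(f"197 aa: {full_length:.1f} kDa, 170 aa: {short_form:.1f} kDa, "
      f"difference: {full_length - short_form:.1f} kDa")
```

With this approximation the absolute values come out somewhat below the apparent sizes on the gel (24 and 21 kDa), which is not unusual for SDS-polyacrylamide gel electrophoresis, but the difference of about 3 kDa agrees with the observed shift.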
Figure 3: In vitro translation of PhGPx mRNAs.
Synthetic RNAs were transcribed from plasmids containing the
full-length PhGPx cDNA (RP-1) and its cysteine mutant (CRP-1) or the
deletion construct lacking nucleotides 1-134 (RP-2) and its
cysteine mutant (CRP-2). RNAs were translated in a rabbit reticulocyte
lysate system in the presence of [35S]methionine.
Translation products were analyzed by SDS-polyacrylamide gel
electrophoresis, followed by fluorography and autoradiography. The large and small arrows indicate the protein products
due to translation initiating at AUG61 and AUG142, respectively.
The positions of the molecular mass
markers (in kilodaltons) are indicated.
PhGPx has also been localized to the intermembrane space in
mitochondria isolated from rat testis(17) . Most nuclear coded
mitochondrial proteins are synthesized as larger precursors containing
N-terminal presequences that target the protein for mitochondrial
import. In the case of proteins localized to the intermembrane space, a
bipartite targeting signal is required(25) . After cleavage by
a matrix protease to remove the N-terminal basic domain, the remaining
hydrophobic sequence directs the protein to the intermembrane space,
where cleavage by a membrane-associated peptidase yields the mature
protein. As shown in Fig. 4, the N-terminal sequence of PhGPx
has the features of such a mitochondrial targeting signal. The 27-amino
acid length conforms to the typical size of leader sequences, which are
10-70 amino acids long. The PhGPx sequence is leucine-rich (7/27
amino acids), contains three hydroxyl amino acids, and lacks acidic
residues. Three basic amino acids in the N-terminal portion of the
sequence are followed by a hydrophobic region. The sequence is also
predicted to form an amphiphilic α-helical structure (26),
with the positively charged and hydrophobic amino acids on opposite
faces of the helix (data not shown). In addition, PhGPx contains
the sequence Arg-Leu-Ser-Arg-Leu-Leu,
which is identical to the sequence preceding the proposed cleavage site
in rat mitochondrial aldehyde dehydrogenase (27).
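The compositional features listed above (leucine content, hydroxylated residues, basic residues, absence of acidic residues) are easy to tabulate for any candidate presequence. The following is a minimal sketch, not a targeting-signal predictor; the function name `presequence_features` is hypothetical, and because the full 27-residue leader is shown only in Fig. 4 the example uses the hexapeptide Arg-Leu-Ser-Arg-Leu-Leu (RLSRLL) quoted in the text.

```python
# Tabulate simple compositional features of a putative mitochondrial presequence.
# This only counts residue classes; it does not assess helix amphiphilicity
# or cleavage-site motifs.


def presequence_features(peptide):
    """Count residue classes in a peptide given in one-letter code."""
    peptide = peptide.upper()
    return {
        "length": len(peptide),
        "leucine": peptide.count("L"),
        "basic": sum(peptide.count(aa) for aa in "RK"),      # Arg, Lys
        "hydroxyl": sum(peptide.count(aa) for aa in "ST"),   # Ser, Thr
        "acidic": sum(peptide.count(aa) for aa in "DE"),     # Asp, Glu
    }


features = presequence_features("RLSRLL")
print(features)
```

Applied to the complete N-terminal sequence, the same counts would be expected to reproduce the figures given in the text (7 leucines in 27 residues, three hydroxyl amino acids, no acidic residues).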
Figure 4: Analysis of the N-terminal 27 amino acids of PhGPx. The deduced amino acid sequences of the N-terminal 30 amino acids of rat (this work), pig(28) , and human (14) PhGPx were aligned, and the identical residues between the species are boxed. The basic and hydrophobic residues in the sequence are represented in the line diagram. The hexapeptide sequence in the mitochondrial aldehyde dehydrogenase (DH) (38) that is homologous to residues 5-10 in the rat PhGPx sequence is shown. The arrow indicates the proposed site of cleavage of the leader sequence in mitochondrial aldehyde dehydrogenase.
Figure 5: Expression of PhGPx mRNA in the rat tissues. Total RNAs (20 µg) from rat intestine, liver, spleen, kidney, lung, heart, cerebellum, cerebral cortex, and testis were analyzed by Northern blotting. Filters were hybridized with the rat PhGPx cDNA and washed under high stringency as described under ``Experimental Procedures.''
As shown in Fig. 6A, multiple protected bands were detected in all
rat tissues, suggesting that transcription of the PhGPx gene initiates
at multiple sites in vivo. Based on the sizes of the protected
bands, we identified two major windows of transcription initiation,
which differed between testis and somatic tissues (Fig. 6B). In testis, the major window of transcription
initiation lies between nucleotides 1 and 27, which is upstream of
AUG61 (Fig. 6A). These transcripts would
encode a 197-amino acid protein that contains the putative
mitochondrial targeting signal. A second, minor window of transcription
initiation between nucleotides 87 and 102 was also detected in this
tissue. Translation of these shorter mRNAs would initiate at
AUG142 to produce a 170-amino acid protein. In addition, two
to three protected bands that mapped to nucleotides 134-138 were
consistently observed in testis RNA (Fig. 6A, asterisk). It is not known whether these transcripts are
translated in vivo since they contain a very short 5`-UTR of
only three to seven nucleotides. The pattern was reversed in somatic
tissues. In kidney, spleen, lung, cerebral cortex, intestine, liver,
and heart, transcription initiated predominantly at a second window
between nucleotides 71 and 91 (Fig. 6A). A minor band
corresponding to initiation at the first window of transcription was
also detected in intestine (Fig. 6A, arrow)
and in kidney and cerebral cortex after long exposure of the
autoradiogram (data not shown).
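The relationship between a transcript's 5'-end and the protein it is expected to encode can be expressed as a short first-AUG rule. The sketch below is a simplification under stated assumptions: the first in-frame start codon lies at nucleotides 61-63 (from the text), the second is placed at nucleotide 142 (derived here as 61 + 27 × 3, not quoted from the text), and initiation always occurs at the first complete AUG downstream of the 5'-end. The function name `predict_product` is hypothetical.

```python
# First-AUG prediction of the PhGPx product from a transcript's 5'-end.
# Assumptions: start codons at nt 61 (stated in the text) and nt 142
# (derived as 61 + 27 codons * 3); strict scanning with no leaky
# initiation or internal ribosome entry.

FIRST_AUG = 61
SECOND_AUG = FIRST_AUG + 27 * 3   # 142, derived value
FULL_LENGTH_AA = 197


def predict_product(start_nt):
    """Return (AUG position, 5'-UTR length, protein length) or None."""
    for aug in (FIRST_AUG, SECOND_AUG):
        if start_nt <= aug:                       # this AUG is fully present
            utr_length = aug - start_nt
            codons_skipped = (aug - FIRST_AUG) // 3
            return aug, utr_length, FULL_LENGTH_AA - codons_skipped
    return None                                   # neither start codon retained


# Boundaries of the initiation windows reported in the text.
for start in (1, 27, 71, 91, 87, 102):
    print(start, predict_product(start))
```

Under this rule, transcripts starting in the testis window (nucleotides 1-27) are predicted to encode the 197-amino acid protein, whereas those starting in the somatic window (nucleotides 71-91) or the minor testis window (nucleotides 87-102) are predicted to encode the 170-amino acid protein.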
Figure 6:
Identification of the PhGPx transcription
start sites in rat testis and somatic tissues. An RNase protection
assay was used to map the 5`-ends of PhGPx mRNAs in rat tissues.
Synthetic RNAs of CRP-1 and CRP-2 (200 pg) were used as controls. Total
RNAs from testis (2.5 µg), lung (8 µg), and other somatic
tissues (10 µg) were hybridized to a 32P-labeled RNA
that was complementary to nucleotides 1-192 of the rat testis
PhGPx cDNA. After RNase digestion, the products were analyzed on a
sequencing gel as described under "Experimental
Procedures." The size of the protected bands was determined by
comparison with DNA sequence ladders or 32P-labeled
φX174 replicative form DNA/HaeIII fragments. A,
the two transcription initiation windows in testis are indicated in the left panel by brackets, and the asterisk indicates transcripts that initiate between nucleotides 134 and
138. In the right panel, the arrow and the bracket indicate the minor full-length protected band and the
second window of transcription initiation, respectively, in somatic
tissues. B, the two windows of transcription start sites in
testis and somatic tissues are shown.
Several experiments were performed to confirm that the different protected bands corresponded to true transcription start sites. Multiple RNA samples isolated from several animals and processed separately gave identical results, which suggests that the shorter transcripts were not generated by RNase degradation. RNase protection assays were performed over a wide range of temperatures (45-55 °C) and RNase digestion conditions to eliminate the possibility of nonspecific hybridization to other mRNAs. In addition, the CRP-1 and CRP-2 synthetic RNAs generated discrete protected fragments of the appropriate sizes (Fig. 6A), which suggests that the shorter transcripts in the somatic tissues were not generated by nonspecific cleavage by RNases A and T1. The presence of multiple transcription start sites was also confirmed by an independent method using RACE technology to map the 5`-ends of the mRNAs (data not shown).
We have demonstrated that multiple transcription start sites
generate two populations of PhGPx mRNAs that have different translation
start sites. Mapping of the transcription start sites localized two
major windows of transcription initiation, one upstream of
AUG61, which was predominantly used in testis, and another
located between AUG61 and AUG142, which was used
primarily in somatic tissues. The presence of multiple transcription
start sites is characteristic of housekeeping genes that lack a TATA
box. PhGPx has been shown to be a single copy gene in
humans(28) . Sequence analysis of a pig genomic clone revealed
that the PhGPx promoter lacks a TATA box, is GC-rich, and contains a
potential Sp1-binding site(10) . These features are similar to
those of the promoter of the rat aspartate aminotransferase gene, which
also contains multiple transcription start sites(29) . The
genomic clone for rat PhGPx has not been isolated, but the high
conservation of the pig, human, and rat cDNAs suggests that the
structure of the gene may be conserved across species. Thus, it is
likely that the rat PhGPx promoter also lacks a TATA box.
The
predominant form of PhGPx mRNA in rat testis was 100 nucleotides
longer at the 5`-terminus than the mRNAs from somatic tissues.
Consistent with our results is the finding that the 5`-sequence of the
pig parathyroid gland cDNA (10) starts from a region that
corresponds to the transcription window downstream of AUG61,
while human testis (12) and pig blastocyst (11) cDNAs
start upstream of AUG61. Although these 5'-sequences were
derived from the longest cDNA clones that were obtained, it is not
known whether they represent the major form of the mRNA in these
tissues. In an analogous system, the rat farnesyl-pyrophosphate
synthetase gene promoter contains testis-specific transcription start
sites that are located 25-100 nucleotides upstream of the somatic
start sites(30) . The somatic start sites are clustered into
two groups that are preceded by TATA boxes. In contrast, the
testis-specific start sites were spread over a region of 90 nucleotides
with no obvious initiation sequence. Thus, the somatic and testis sites
were apparently controlled by overlapping promoters with different
properties. Testis-specific transcription start sites have also been
detected in genes coding angiotensin-converting enzyme (31) ,
proenkephalin(32) , cytochrome c(33) , and
-tubulin (34) . As in the case of PhGPx, these proteins
are expressed in other organs.
Based on sequence analysis, the rat
testis PhGPx cDNA contains two potential translation start sites,
AUG61 and AUG142, which were also present in the
PhGPx cDNA clones isolated from human testis and pig
blastocyst (11, 12). Esworthy et al. (12) recently proposed that translation of the human
testis PhGPx mRNA may initiate at the upstream AUG codon, although no
functional evidence was provided. We have shown that both AUG61
and AUG142 in rat PhGPx occur in a favorable context
for translation initiation and that both function as efficient
translation start sites in vitro when they are the first AUG
codon in the mRNA. This conforms with the ribosome scanning model for
translation initiation proposed by Kozak(23) . In addition to
the predominant 24-kDa protein, translation of the full-length PhGPx
mRNA in vitro also produced a minor protein of 21 kDa, which
appeared to initiate at AUG142 based on its molecular mass.
A low level of initiation at the second AUG codon in the PhGPx mRNA may
be due to leaky scanning of the ribosome or to internal entry of the 40
S ribosomal subunit, as has been proposed for other mRNAs(35) .
Studies using artificial bicistronic mRNAs showed that translation
initiation at the 5`-proximal or distal AUG codon was dependent on the
cell type and the concentration of eukaryotic initiation factor 4F (36) . Alternatively, the 21-kDa protein may be generated by
minor degradation of the full-length synthetic RNA from the 5`-end
during translation. Although our in vitro translation
experiments demonstrated that AUG61 was predominantly used
as the translation initiation codon in the full-length transcript, the
possibility of preferential initiation of translation at AUG142
in vivo cannot be ruled out.
The N-terminal 27-amino
acid sequence of PhGPx is highly conserved across species and contains
the known features of a mitochondrial targeting sequence. This is
consistent with the fact that PhGPx has been localized to the
intermembrane space in rat testis mitochondria (17) . Like
PhGPx, other proteins targeted to this region contain a bipartite
targeting signal with basic residues at the N terminus, followed by an
uninterrupted hydrophobic stretch of 20 amino acids (25) .
The eukaryotic mitochondrial intermembrane space proteins, such as
cytochromes b2 and c1 of the bc1 complex, are synthesized as cytosolic
precursors containing such bipartite presequences (25). The
positively charged N-terminal sequence is cleaved by a mitochondrial
processing peptidase after import of the protein into the matrix. The
remaining hydrophobic stretch directs the export of the protein back
across the inner membrane, where it is cleaved to the mature protein by
a membrane-associated peptidase. Our results suggest that PhGPx may be
synthesized in rat testis as a 197-amino acid precursor protein that is
cleaved to the mature form during import into the mitochondrial
intermembrane space. PhGPx contains the sequence
RLSRLL, which is identical to the sequence in rat
aldehyde dehydrogenase that is cleaved by the matrix processing
peptidase(27) . This sequence is also similar to a pentapeptide
sequence (SRLLK) in the leader sequence of another mitochondrial matrix
protein, yeast KAD2(37) .
Taken together, our results
suggest that a single PhGPx gene encodes both cytosolic and
mitochondrial forms of the protein through differential transcription
and translation start sites. This type of mechanism has been reported
for several nuclear genes that encode cytosolic and mitochondrial
proteins in yeast, Neurospora crassa, and
mammals(38, 39, 40, 41, 42, 43, 44, 45) .
Our model is consistent with current knowledge on the subcellular
distribution of PhGPx in rat tissues. Studies by Ursini and co-workers (16) have shown that the majority of PhGPx activity is
localized in the mitochondria in rat testis, whereas the predominant
form of PhGPx in rat liver is cytosolic. However, the subcellular
distribution of PhGPx in other tissues has not been analyzed in detail.
Although PhGPx was detected in both the soluble and membrane fractions
from various rat tissues, the membrane fractions included nuclear and
plasma membranes as well as mitochondria(14) . Our results
suggest that PhGPx will primarily be nonmitochondrial in somatic
tissues since the predominant transcripts in these tissues lack
AUG61.
Although the physiological significance of high level expression of PhGPx in testis mitochondria is not understood, this enzyme may play an important role in protecting mitochondrial DNA against oxidative damage. This hypothesis is supported by the fact that mitochondrial DNA is more susceptible to oxidative damage than nuclear DNA due to a lack of histones protecting the mitochondrial DNA, a lack of DNA repair enzymes in mitochondria, and the proximity of mitochondrial DNA to oxidants generated during oxidative phosphorylation(46) .
The nucleotide sequence reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number U37427.