(Received for publication, April 28, 1995)
From the
G4 nucleic acids are four-stranded helical structures that are
formed in vitro by nucleic acids that contain guanine tracts.
These structures anneal readily under physiological conditions and are
unusually stable once formed. G4 nucleic acids are thought to
participate in telomere function, retroviral genome dimerization,
chromosome alignment during homologue pairing, and mitotic
recombination, although the in vivo demonstration of these
structures in any of these situations has not yet been achieved. Here
we purify and characterize an activity from yeast, G4p1, which has a
high and specific affinity for G4 nucleic acids. G4p1 prefers
substrates containing multiple G4 domains, has an equal affinity for
parallel and antiparallel G4 structures, and binds equivalently to RNA
and DNA in G4 form. The Keq for G4p1 binding to a G4 DNA oligomer is
5.0 × 10^[…] M^-1, under near
physiological conditions. G4p1 was purified and shown to derive from a
42-kDa protein (p42). We have cloned and sequenced the gene encoding
p42 and show it to encode a novel protein with a region significantly
homologous to bacterial methionyl-tRNA synthetase dimerization domains.
We have reconstituted the G4p1 binding activity with recombinant p42
and present evidence that G4p1 is a homodimer of p42.
Nucleic acids containing guanine tracts will associate in vitro into four-stranded right-handed helices, stabilized by guanine base quartets (for review see Refs. 1-5). These helices exist in parallel and cis or trans antiparallel conformations, as revealed by NMR and x-ray analysis. These quadruplex structures are called the G4 nucleic acids. G4 structures, once formed, are exceptionally stable, particularly in the presence of certain alkali metal cations, such as potassium, which efficiently bind guanine carbonyl oxygens lining the axial cavity of the helix. G4 helices can form at appreciable rates, particularly into the monomeric or dimeric antiparallel conformers, under moderate physiological conditions.
Do G4 structures arise in vivo? This is presently an unanswered question. The simple sequence organization of telomeres, specifically the guanine tracts, is strongly conserved phylogenetically (6), indicating that telomere function depends in some way on a property unique to these tracts. Oligomers containing telomere repeat sequences readily form G4 structures in vitro (7, 8, 9), and a G4 DNA annealing activity, which greatly accelerates the formation of G4 structures by these oligomers, was identified in an Oxytricha telomere binding protein (10). There is similar in vitro evidence supporting a role for G4 structures in the dimerization of retroviral genomes during virion assembly (11, 12).
Sep1/Kem1 is a complex yeast enzyme which, among other activities, possesses a nucleolytic activity specific for DNA oligomers containing G4 domains (13). Null alleles of the gene encoding this enzyme lead, in part, to mitotic loss of chromosomes and meiotic arrest at pachytene (14). These defects may be due to an inability of the mutant cell to properly process G4 structures that may arise during the course of chromosomal replication, recombination, or meiotic pairing. Also, a domain in the 3′-untranslated region of the insulin-like growth factor II mRNA has been shown in vitro to exist in a G4 structure (15). This transcript undergoes a specific cleavage reaction in vivo at a site just 5′ to this domain, suggesting a role for G4 structures in targeting mRNA-processing events.
To explore the possible biological functions of G4 nucleic acids, we analyzed a yeast extract for activities specific to these structures. We have identified an activity, G4p1, with a high and specific affinity for G4 nucleic acids. The activity was purified and shown to derive from a 42-kDa protein (p42). The gene encoding p42 was isolated and sequenced and shown to encode a novel protein with a domain significantly homologous to bacterial methionyl-tRNA synthetase dimerization domains. G4p1 is probably a homodimer of p42.
Figure 1: G4 oligomers used in mobility shift and competition experiments. Guanines in boldface are those that participate in quartet formation. G indicates tetrameric parallel (p) structures, and G′ indicates dimeric antiparallel (ap) structures (nomenclature according to Sen and Gilbert (2)).
Competition experiments were carried out as above, with
saturating amounts of GL(G) probe. Ten-fold serial
dilutions of competitor nucleic acids were premixed with extract or
G4p1 fractions in binding buffer just prior to adding the probe.
Relative mass amounts of competitor to probe ranged from 0.1 to 1000.
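The competitor series described above (10-fold steps spanning competitor:probe mass ratios of 0.1 to 1000) can be tabulated with a few lines of Python. This is only an illustrative sketch: the function name `competitor_series` is hypothetical, and the 320-pg probe mass is taken from the legend to Fig. 2 as an example input.

```python
def competitor_series(probe_mass_pg, min_exp=-1, max_exp=3):
    """Return (competitor:probe mass ratio, competitor mass in pg) pairs.

    Ratios run in 10-fold steps from 10**min_exp to 10**max_exp,
    i.e. 0.1 to 1000 with the default arguments.
    """
    series = []
    for k in range(min_exp, max_exp + 1):
        ratio = 10.0 ** k
        series.append((ratio, ratio * probe_mass_pg))
    return series


# Example input: 320 pg of probe, as in the Fig. 2 legend.
for ratio, mass_pg in competitor_series(320.0):
    print(f"ratio {ratio:g}: {mass_pg:g} pg competitor")
```

Expressing the series as exponents of 10 rather than by repeated multiplication keeps the ratios free of accumulated rounding error.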
SDS-PAGE was carried out by standard procedures (18). Ten-µl aliquots of G4p1 fractions were loaded on 10% SDS-PAGE gels, and protein bands were visualized by staining twice with silver (Bio-Rad).
Partially purified G4p1 (16 ml) from the
previous step was loaded at 25 ml/h on a 20-ml S Sepharose FF
(Pharmacia Biotech Inc.) bed, which had been equilibrated in buffer B
and packed in a 2.5 × 10-cm EconoColumn (Bio-Rad). The bed was
washed with 100 ml of buffer B, and proteins were eluted with an 80-ml
linear gradient of increasing NaCl concentration (0-600
mM). One-ml fractions were collected and analyzed by mobility
shift assay for G4p1 and by SDS-PAGE. Fractions 48-61 (12 ml),
containing G4p1, were pooled and dialyzed for 12 h against 1200 ml of
buffer C (100 mM BisTris-Cl, pH 6.0, 1 mM EDTA, 0.05%
Triton X-100, 5 mM BME, 10% glycerol).
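For readers who want to relate fraction numbers to salt concentration, the following Python sketch computes the nominal NaCl concentration at the midpoint of a fraction in a linear gradient. It rests on assumptions not stated in the text: that fraction 1 begins exactly at the start of the gradient, and that column dead volume and mixing can be ignored. The function name `nominal_gradient_mM` is hypothetical, and the values it returns are nominal estimates, not measured conductivities.

```python
def nominal_gradient_mM(fraction, fraction_ml=1.0, gradient_ml=80.0,
                        start_mM=0.0, end_mM=600.0):
    """Nominal salt concentration (mM) at the midpoint of a fraction.

    ASSUMPTION: fraction 1 starts at the beginning of the linear gradient;
    dead volume and mixing in the column are ignored.
    """
    midpoint_ml = (fraction - 0.5) * fraction_ml
    if not 0.0 <= midpoint_ml <= gradient_ml:
        raise ValueError("fraction lies outside the gradient")
    return start_mM + (end_mM - start_mM) * midpoint_ml / gradient_ml


# Nominal values for the first and last pooled fractions (48 and 61).
for n in (48, 61):
    print(f"fraction {n}: {nominal_gradient_mM(n):.2f} mM NaCl (nominal)")
```

The same function applies to any linear gradient by changing `gradient_ml`, `start_mM`, and `end_mM`. It does not apply to the convex exponential gradient used for the affinity column.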
Partially purified
G4p1 (12 ml) from the previous step was loaded at 25 ml/h on a
DEAE-Sepharose FF (Pharmacia) bed, which had been equilibrated in
buffer C and packed in a 2.5 × 10-cm EconoColumn (Bio-Rad). The
bed was washed with 100 ml of buffer C, and proteins were eluted with
simultaneous 60-ml linear gradients of decreasing BisTris-Cl, pH 6.0,
concentration (100-0 mM) and increasing
NaPO4, pH 7.3, concentration (0-200 mM).
One-ml fractions were collected and analyzed by mobility shift assay
for G4p1 and by SDS-PAGE. Fractions 38-49 (12 ml), containing
G4p1, were pooled and dialyzed for 12 h against 1200 ml of buffer D (20
mM NaHepes, pH 7.3, 1 mM EDTA, 50 mM KCl,
0.05% Triton X-100, 5 mM BME, 10% glycerol).
A 1-ml G4 DNA
affinity bed was prepared as described previously (17).
Essentially, oligomer GL (Fig. 1), with biotin incorporated next
to the 3′ end, was synthesized at 1 µmol scale and converted into
G4 form. The G4 DNA (3.95 mg) was bound to 2 ml of 50%
streptavidin-agarose (Sigma) in 10 mM Tris, pH 7.6, 1 mM EDTA, 100 mM KCl. The affinity matrix was washed twice in
buffer D, 100 mM NaCl, packed in a 0.7 × 5-cm
EconoColumn (Bio-Rad), and equilibrated at 4 °C.
Partially purified G4p1 (12 ml) from the previous step was loaded at 2.3 ml/h on the G4 DNA affinity column. The column was washed with 5 ml of buffer D, 100 mM NaCl, and bound proteins were eluted with a 6-ml convex exponential gradient of increasing NaCl concentration (100-500 mM). Sixty 0.1-ml fractions were collected and analyzed by mobility shift assay for G4p1 and by SDS-PAGE. Fractions 33-42 (1 ml), containing G4p1, were pooled.
A 3.5-kb SalI partial/SmaI fragment from pSK85 was
subcloned into SmaI/SalI-cleaved pBluescript-KS+
(Stratagene). This construct was cleaved with AseI, and the
ends were made blunt by treatment with T4 DNA polymerase in the
presence of deoxyribonucleotides. Subsequent cleavage with NotI yielded a 2.5-kb fragment, which was inserted into SmaI/NotI cleaved pGEX-5X-1. The polypeptide produced
by this construct should be identical in amino acid sequence to p85
after removal of the glutathione S-transferase domain, except
that the N-terminal residues MTKLFSKVKESIEGIKMPSTLTI are replaced by
GIPEFP.
Recombinant proteins were prepared on a large scale in the Escherichia coli host strain LE392, according to the recommended procedure (24). Yields of 35 µg of GSTp42 and 75 µg of GSTp85 per 100 ml of culture were obtained. The glutathione S-transferase domains were removed from p42 and p85 by proteolysis with factor Xa, according to the recommended procedure (24).
Identification of a G4 Nucleic Acid Binding Protein, G4p1, in Yeast-We have tested extracts for activities specific for G4 quadruplex DNA. Previously, we reported on the characterization of G4p2, a protein present in a yeast whole-cell extract that binds specifically to nucleic acids containing G4 domains (17). Here we describe another such binding activity, G4p1, found in the same extract preparation. Fig. 2A shows a mobility shift experiment identifying both the G4p1 and G4p2 binding activities. G4p1 binds to a parallel G4 DNA probe in the presence of excess double-stranded DNA (Fig. 1 lists the oligonucleotides used). Fig. 2B shows mobility shift competition experiments with single-stranded competitors and with competitors containing G4 structures. Single-stranded GL and denatured salmon sperm DNA do not compete, but molecules containing G4 regions compete effectively, identifying the G4 domain of the probe as necessary and sufficient for stable binding to G4p1. Mobility shift experiments with extracts from other yeast strains showed that G4p1 is not restricted to SK1 (data not shown). We proceeded to purify G4p1 in order to characterize it more fully.
Figure 2: Identification of G4 nucleic acid specific binding activities in a yeast extract. A, 5-fold serial dilutions of yeast extract were analyzed by mobility shift assay with 320-pg probe as described. Extract dilutions are indicated above the lanes. One µl of undiluted extract (25 µg) was analyzed in the leftmost lane. The G4 DNA probe and the two major G4 DNA binding activities, G4p1 and G4p2, are indicated. B, the binding specificity of G4p1 was analyzed by mobility shift competition assays, with 320-pg probe and with 10-fold serial dilutions of unlabeled G4 and single-stranded competitor DNAs (see Fig. 1) as described. One µl of a 20-fold dilution of extract (1.25 µg) was analyzed. Relative mass of competitor over probe is indicated above the lanes. GL(SS) indicates single-stranded GL oligomer, and sal DNA(SS) indicates single-stranded salmon sperm DNA.
Figure 3: Purification of G4p1. A, fractions from the G4 affinity column were analyzed by mobility shift assay with 64-pg probe, as described. One µl of a 20-fold dilution of each fraction was analyzed. The input fraction (IN) consists of pooled and concentrated DEAE-Sepharose fractions 38-49. G4p1 is indicated. B, SDS-PAGE of the same fractions as in A. In each case, 10 µl of undiluted fraction was analyzed. p42, p85, and the molecular weights of markers are indicated. The gel was stained twice with silver.
Figure 4: Binding specificity of purified G4p1 for G4 nucleic acids. Purified G4p1 was analyzed by mobility shift competition assays, with 2 ng of probe, and with serial dilutions of unlabeled competitor nucleic acids (see Fig. 1) as described. Bound probe was measured by exposing dried gels to an imaging plate (Fuji). One µl of a 20-fold dilution of purified G4p1 (500 pg; prep 2) was analyzed. Competitor nucleic acids are indicated. GL(DS) indicates GL oligomer annealed to a complementary sequence.
Fig. 5 shows that
the equilibrium constant for G4p1 binding to parallel G4 DNA
(GL(G)) is 5.0 × 10^[…] M^-1, in 50 mM KCl at pH 7.3
and at room temperature. This value is comparable with those for other
known protein-DNA interactions.
Figure 5: Equilibrium constant for G4p1 binding to GL(G). Mobility shift assays with 1.25 fmol of probe and with 2-fold serial dilutions of G4p1, beginning with 40 fmol, were carried out as described. Free probe was measured by exposing the dried gel to an imaging plate (Fuji). Total, bound, and free G4p1 were quantitated as described under "Materials and Methods." Free G4p1 concentration at 50% probe occupancy is indicated and is equivalent to 1/Keq.
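The statement that the free G4p1 concentration at 50% probe occupancy equals 1/Keq follows from a simple 1:1 binding isotherm, in which the fraction of probe bound is Keq[P]/(1 + Keq[P]) for free protein concentration [P]. The Python sketch below illustrates the estimate under that assumption. The function names and the titration values are hypothetical and are not data from this work; the interpolation scheme (linear in log concentration) is likewise only one reasonable choice.

```python
import math


def fraction_bound(keq, free_protein):
    """1:1 binding isotherm: fraction of probe bound at [free protein] in M."""
    return keq * free_protein / (1.0 + keq * free_protein)


def keq_from_half_occupancy(free_conc, occupancy):
    """Estimate Keq (M^-1) as 1 / [free protein] at 50% probe occupancy.

    free_conc: free protein concentrations in M, in ascending order.
    occupancy: matching fractions of probe bound (0 to 1).
    Interpolates linearly in log10(concentration) between the two
    titration points that bracket 50% occupancy.
    """
    points = list(zip(free_conc, occupancy))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi:
            if f_hi == f_lo:
                return 1.0 / c_lo
            t = (0.5 - f_lo) / (f_hi - f_lo)
            log_c = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 1.0 / (10.0 ** log_c)
    raise ValueError("titration does not bracket 50% occupancy")


# Hypothetical 2-fold dilution series generated from an assumed Keq.
assumed_keq = 1.5e9  # M^-1, illustrative value only
concs = [0.125e-9 * 2 ** i for i in range(6)]  # 0.125 nM to 4 nM
occ = [fraction_bound(assumed_keq, c) for c in concs]
print(f"estimated Keq: {keq_from_half_occupancy(concs, occ):.3g} M^-1")
```

With 2-fold dilution steps the interpolated estimate recovers the assumed constant to within a few percent; a nonlinear fit of the whole isotherm would use all the points rather than only the two that bracket half occupancy.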
Figure 6:
Sequence analysis of p42. A,
nucleotide sequence of the gene encoding p42 and its deduced amino acid
sequence. α-Helices I and II, identified by GCG's
Robson-Garnier secondary structure prediction programs, and a putative
dimerization domain, are enclosed in brackets. B, a partial
restriction map of the gene encoding p42.
Significant homology was detected between the C-terminal domains of bacterial methionyl-tRNA synthetases and an internal region of p42. Sequence alignments with GCG's GAP program show that a 36% identity exists between residues 522-616 of the Thermus aquaticus methionyl-tRNA synthetase (SwissProt accession number P23395) and residues 208-304 of p42. Matches of comparable quality were found to the Bacillus stearothermophilus (SwissProt accession number P23920) and E. coli (SwissProt accession number P00959) enzymes. These matches lie outside the catalytic core of the methionyl-tRNA synthetases, in a region identified as the dimerization domain of these enzymes, which exist, in bacteria, as homodimers (25). This region of p42, then, may be a dimerization domain.
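Percent identity figures such as those quoted above are computed from a pairwise alignment. The Python sketch below shows one common convention: identical residues divided by the number of aligned columns in which neither sequence has a gap. This is an assumption for illustration and is not necessarily the exact denominator reported by the GAP program. The function name and the toy sequences are hypothetical and are unrelated to p42.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of a pairwise alignment.

    Both inputs must be the aligned strings (same length, gaps as `gap`).
    Columns where either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment, for illustration only.
toy_a = "MKT-LLVAG"
toy_b = "MRTALL-AG"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Because conventions differ on whether gapped columns count toward the denominator, identity values from different programs are comparable only when the same convention is used.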
The C-terminal segment of human endothelial monocyte-activating polypeptide (EMAP) II (GenBank accession number U10117), residues 152-311, is 50% identical to residues 205-364 of p42. The specific function of this region of EMAP II is unknown; however, by inference from the matches noted above, EMAP II may also dimerize. EMAP II is secreted by certain human and mouse cultured tumor cells and can induce an acute host inflammatory response (23). The N-terminal segments of these proteins are responsible for their inflammation-inducing properties.
An expressed sequence tag derived from rice callus (DDBJ accession number D23020) encodes a protein fragment 120 amino acids in length with a 61% identity to residues 185-302 of p42. The remainder of the cDNA sequence is required to establish whether it encodes a homologue of p42.
An inspection of the amino acid sequence of p42 reveals two
potentially interesting structural elements. Residues 127-160 (I)
and 166-199 (II) are predicted to exist as α-helices by
GCG's Robson-Garnier secondary structure prediction program.
Helix I is lysine rich at both ends, and helix II is N-terminally rich
in charged residues and alanine and C-terminally rich in glutamine.
Neither helix is strongly amphipathic.
Figure 7: Reconstitution of G4p1 with recombinant p42. Mobility shift assays were carried out as described, with GSTp42 and GSTp85 fusion proteins purified from E. coli. Lane 1 contains no protein. Lanes 2 and 5 contain 1 µl (70 ng) of GSTp42, and GSTp42 treated partially with factor Xa, respectively. Lanes 4 and 7 contain 1 µl (150 ng) of GSTp85, and GSTp85 treated partially with factor Xa, respectively. Lanes 3 and 6 contain mixtures (0.5 µl each) of GSTp42 and GSTp85, and GSTp42 and GSTp85 treated partially with factor Xa, respectively. Lane 8 contains 1 µl of a 20-fold dilution of pooled fractions, containing G4p1, obtained from the G4 affinity column. The binding activities assigned to GSTp42, dimers of p42, and monomers of p42 are indicated.
G4p1, isolated from yeast, displays a selective affinity for nucleic acids in the G4 quadruplex form. We demonstrate this with competition mobility shift experiments at saturating levels of G4 probe and excess double-stranded DNA using specific nucleic acid competitors containing single-stranded, double-stranded, or G4 regions. G4p1 binds G4 domains in both RNA and DNA with equivalent affinity. G4p1 appears to be a dimeric protein with two identical 42-kDa (p42) subunits. p42 is a novel protein of unknown cellular function, with an internal region that shows significant homology to bacterial methionyl-tRNA synthetase dimerization domains.
Methionyl-tRNA synthetases of eubacteria contain a C-terminal domain that can be removed by proteolysis without affecting the catalytic properties of these enzymes (25). Removal of this domain converts these usually homodimeric proteins into monomers, indicating that this domain is a module that mediates homodimerization. The discovery of domains homologous to this module in p42, a yeast protein, in a secreted mammalian cytokine, and in a protein from a higher plant raises interesting questions concerning its evolutionary history.
As described above, a second protein, p85, copurified with p42
throughout the purification procedure. These proteins also
co-fractionate on a gel filtration column with apparent molecular
weights of 200-240 kDa, indicating they may be complexed. As
described in this work, however, we have reconstituted the G4p1
activity with recombinant p42 alone. An interesting possibility is that
p42 and p85 are in fact associated in a complex and dissociate upon
binding of the G4 DNA probe by p42. This is presently under
investigation. We have identified p85 as the yeast cytosolic
glutamyl-tRNA synthetase. If we can verify that p85 and p42
are complexed, this would implicate a G4 nucleic acid component in the
translation process.
G4p1 has several features in common with
another G4 nucleic acid binding activity, G4p2, which we have described
previously (17). Like G4p2, G4p1 binds parallel and
antiparallel G4 nucleic acids, prefers substrates containing more than
one G4 domain, and binds RNA as well as DNA in G4 form. Also, the
affinities displayed by these activities for these substrates are
similar. These common binding properties may indicate that these
activities interact with similar substrates in vivo. We have
speculated that G4p2 functions as a gene regulating factor, controlled
by protein kinases, by interacting with regulatory DNA sequences, with
mRNAs, or with structural RNAs involved in translation, via G4
structures. On the other hand, p42 shows little sequence similarity to
the polypeptide responsible for the G4p2 activity, and there is no
obvious common linear motif that might be responsible for their G4
nucleic acid binding properties. α-Helix I and α-helix II,
which lie just N-terminal to the putative dimerization module, may be
the structural elements that bind G4 nucleic acids. Further studies on
the biochemistry and genetics of these proteins are required to
illuminate their cellular function.
The nucleotide sequence(s) reported in this paper has been submitted to GenBank.