©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Universal Minicircle Sequence Binding Protein, a CCHC-type Zinc Finger Protein That Binds the Universal Minicircle Sequence of Trypanosomatids
PURIFICATION AND CHARACTERIZATION (*)

(Received for publication, May 18, 1995)

Yehuda Tzfati, Hagai Abeliovich, Dana Avrahami, and Joseph Shlomai (§)

From the Department of Parasitology and Department of Cellular Biochemistry, The Hebrew University, Hadassah Medical School, Jerusalem 91010, Israel

ABSTRACT

Replication of kinetoplast DNA minicircles of trypanosomatids initiates at a conserved 12-nucleotide sequence, termed the universal minicircle sequence (UMS, 5'-GGGGTTGGTGTA-3'). A single-stranded nucleic acid binding protein that binds specifically to this origin-associated sequence was purified to apparent homogeneity from Crithidia fasciculata cell extracts. This UMS-binding protein (UMSBP) is a dimer of 27.4 kDa with a 13.7-kDa protomer. UMSBP binds single-stranded DNA as well as single-stranded RNA but not double-stranded or four-stranded DNA structures. Stoichiometry analysis indicates the binding of UMSBP as a protein dimer to the UMS site. The five CCHC-type zinc finger motifs of UMSBP, predicted from its cDNA sequence, are similar to the CCHC motifs found in retroviral Gag polyproteins. The remarkable conservation of this motif in a family of proteins found in eukaryotic organisms from yeast and protozoa to mammals is discussed.


INTRODUCTION

Kinetoplast DNA (kDNA) (^1) is a unique extrachromosomal DNA network found in the single mitochondrion of parasitic flagellated protozoa of the family Trypanosomatidae. In Crithidia fasciculata, kDNA consists of about 5,000 DNA minicircles (2.5 kilobase pairs each) and about 50 DNA maxicircles (37 kilobase pairs each) interlocked topologically to form a huge DNA network (for review, see (1, 2, 3)). Minicircles are heterogeneous in their nucleotide sequence but contain two short sequences, 70-100 base pairs apart, that are conserved in all the species studied so far: the dodecamer sequence known as the universal minicircle sequence (UMS), 5'-GGGGTTGGTGTA-3', and the hexamer sequence 5'-ACGCCC-3'.

On the basis of in vivo observations, Englund and co-workers (4, 5, 6, 7, 8) have described the replication of kDNA minicircles as a process in which individual minicircles are detached from the central zone of the disc-shaped network, replicated, and reattached to the periphery of the disc. The network increases in size until it doubles and then divides and segregates into two daughter networks. Extensive studies of minicircle replication intermediates (9, 10, 11, 12, 13, 14, 15, 16) have suggested that replication begins at the UMS site with synthesis of an RNA primer and proceeds by continuous elongation of the leading light strand (L-strand). A single gap of 6-10 nucleotides remains in the newly synthesized light strand at the UMS site (13) and is repaired only after replication of the minicircles and their reattachment to the network have been completed (8). Discontinuous synthesis of the lagging heavy strand (H-strand) starts when its origin, containing the conserved hexamer sequence, is exposed by the advancing replication fork. Highly gapped and nicked nascent H-strands are generated.

We have previously reported on the recognition of UMS by a unique sequence-specific single-stranded DNA binding protein from C. fasciculata (17) and on the isolation and analysis of the UMSBP-encoding cDNA (18). The amino acid sequence of the polypeptide, predicted from the cDNA, is 116 residues long and contains five Cys-X2-Cys-X4-His-X4-Cys (CCHC)-type zinc finger motifs. CCHC-type zinc finger motifs have been found in one or two copies in the retroviral nucleocapsid proteins and their Gag precursors (19) and in proteins of plant viruses and of eukaryotic cells. It has been suggested that this type of zinc finger is involved in binding of single-stranded nucleic acids (20). UMSBP belongs to a distinct group of cellular proteins that contain several (5-9) adjacent CCHC motifs, including cellular nucleic acid binding protein (CNBP) from human and mouse, which binds a G-rich single-stranded sequence of a sterol regulatory element (21); hexamer binding protein (HEXBP) from Leishmania major, which binds a G-rich single-stranded repeated sequence found in the 5'-untranslated region of the gene encoding GP63 (22); byr3 from Schizosaccharomyces pombe (23); and CnjB from Tetrahymena thermophila (24). The structure of the CCHC motifs of the HIV-1 Gag polyprotein was determined using NMR spectroscopy (25, 26, 27, 28). These studies have revealed a very compact and well-defined structure, stabilized by coordination of the three cysteines and the histidine residue to the zinc ion and by extensive internal hydrogen bonding.

Here we describe the purification to apparent homogeneity of UMSBP from C. fasciculata cell extracts and the physical characteristics and specific nucleic acid binding properties of the protein. Finally we discuss the sequence and structure conservation of CCHC-type zinc finger motifs from UMSBP and other cellular proteins in reference to the structure and nucleic acid binding properties of the homologous retroviral motif.


EXPERIMENTAL PROCEDURES

Nucleic Acids, Nucleotides, Proteins, and Resins

Synthetic deoxyoligonucleotides were prepared by an Applied Biosystems oligonucleotide synthesizer at the Bletterman Laboratory of the Interdepartmental Division, Faculty of Medicine, the Hebrew University of Jerusalem. Synthetic ribo-oligonucleotide (5'-GGGGUUGGUGUA-3') was prepared by New England Biolabs. Poly(dI-dC)·poly(dI-dC) was purchased from Boehringer Mannheim. Phenyl-Sepharose was purchased from Sigma; hydroxyapatite was from Bio-Rad; and chromatofocusing resin (PBE

Cell Growth

Two hundred liters of C. fasciculata culture was grown at 29 °C in a 250-liter industrial fermentor with 100 rpm stirring and air flow rate of 0.6 volumes/medium volume/min. Growth medium, optimized for C. fasciculata (Dr. S. Braun, the Department of Biological Chemistry, The Hebrew University of Jerusalem) contained (per liter): 18 g of N-(Z)-amine B (Sheffield Products), 4.5 g of yeast extract (Biolife), 4.5 g of NaCl, 9 g of glucose, 0.2 ml of antifoam reagent polypropylene glycol (P-2000, Sigma), 100 mg of streptomycin sulfate, 10^5 units of penicillin (Teva, Israel), and 20 mg of hemin (Sigma). Cells were harvested during logarithmic growth phase (0.5-1.0 × 10^8 cells/ml) by centrifugation at 14,000 × g in a Sharples centrifuge, and washed with 50 mM Tris-Cl, pH 7.5, and 100 mg/ml sucrose (enzyme grade, Schwartz/Mann). Cell paste was frozen in liquid nitrogen and stored at -75 °C. 7 g of this cell paste was used for the preparation of UMSBP described here.

Electrophoretic Mobility Shift Analysis

Analyses were carried out as described previously (17). The 10-µl standard reaction mixture contained: 25 mM Tris-Cl, pH 7.5, 2 mM MgCl2, 1 mM dithiothreitol, 20% (v/v) glycerol, 10 µg of bovine serum albumin, 0.5 µg of poly(dI-dC)·poly(dI-dC), and 0.2 ng of 5'-32P-labeled 12-mer UMS H-strand (UMS-H12; GGGGTTGGTGTA). UMSBP was added to the amounts indicated. Reaction mixtures were incubated at 0-30 °C for 15 min and electrophoresed in an 8% native polyacrylamide gel (1:32, bisacrylamide/acrylamide) in TAE buffer (6.7 mM Tris acetate, 3.3 mM sodium acetate, 1 mM EDTA, pH 7.5). Electrophoresis was conducted at 4 °C and 16 V/cm for 1.25 h. Gels were dried and exposed to x-ray films (Agfa Curix RP2 or Kodak X-Omat AR). Protein-DNA complexes were quantified either by excision of the radioactive bands and counting them in a scintillation counter or by exposing the dried gels to an imaging plate (BAS-IIIs, Fuji) and analyzing it by a Bio Imaging Analyzer (model BAS1000, Fuji). One unit of UMSBP is defined as the amount of protein required for binding of 1 fmol of UMS H-strand DNA probe under the standard mobility-shift assay conditions (17).

SDS-polyacrylamide Gel Electrophoresis

Protein samples were mixed with loading buffer to final concentrations of: 50 mM Tris-Cl, pH 6.85, 4% (w/v) SDS, 2% (v/v) beta-mercaptoethanol, 10% (v/v) glycerol, and 10 mM EDTA. Samples were incubated at 45 °C for 30 min, loaded alongside protein molecular weight markers ("Rainbow" prestained low molecular weight, Amersham Corp.; SDS-17S and SDS-70L, Sigma) onto a polyacrylamide gel (16.5% T, 3% C separating gel; 10% T, 3% C spacer gel; 4% T, 3% C stacking gel), electrophoresed following the procedure of Schägger and von Jagow (29), and silver-stained.

Gel Filtration

UMSBP (Fraction VIIb) was filtered through a G3000 SW HPLC gel filtration column (0.75 × 60 cm; LKB) that was immersed in ice. The column was equilibrated and washed with 25 mM Tris-Cl, pH 7.5, 50 mM (NH4)2SO4, 2 mM MgCl2, and 2 mM beta-mercaptoethanol. Fractions of 200 µl were collected at a flow rate of 0.5 ml/min.

Glycerol Gradient Sedimentation

2.38 × 10^4 units of UMSBP (Fraction VIIb) were centrifuged with 10 units of Escherichia coli DNA polymerase I (5.6 S, 109 kDa), 150 µg of human hemoglobin (4.13 S, 64.5 kDa), 10 units (32 µg) of horseradish peroxidase (3.85 S, 40 kDa), and 150 µg of horse cytochrome C (2.1 S, 12 kDa). The 20-µl sample was layered onto a 5-ml 10-30% (v/v) linear glycerol gradient containing 25 mM Tris-Cl, pH 7.5, 100 mM KCl, 2 mM MgCl2, and 1 mM DTT, and was centrifuged at 49,000 rpm and 2 °C for 39 h in a Kontron TST 55 rotor. 155-µl fractions were collected dropwise from the bottom of the tube and assayed for the presence of the different proteins as follows: UMSBP was assayed by the standard mobility-shift assay; cytochrome C and hemoglobin following A; DNA polymerase I by nick-translation assay modified from the method of Richardson et al. (30); and peroxidase activity following the increase at A resulting from the oxidation of pyrogallol to purpurogallin, as recommended by the manufacturer (Sigma).

Measurements of Equilibrium Binding Constants

Measurements of equilibrium binding constants for the interactions of purified UMSBP with various DNA and RNA probes were carried out as described by Fried and Crothers (31) and Liu-Johnson et al. (32). Experiments were carried out under the standard mobility-shift assay conditions, by serial dilution of both UMSBP and the oligonucleotide probe while keeping their molar ratios constant. Reactions were incubated for 15 min at 30 °C. Quantification of protein-DNA complexes was carried out using a Bio Imaging Analyzer, as described above. Data were analyzed as described by Liu-Johnson et al. (32), plotting (1 - r)(α - r)/r versus 1/[DNA], where r is the fraction of the DNA that is in the protein-DNA complex band, [DNA] is the total concentration of the DNA added, and α is the unknown but constant ratio of active protein to total DNA. The equation is solved by searching for an α value that will yield the best line passing through the origin. The reciprocal of the slope yields the binding constant (K).
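The analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' software: it assumes a simple 1:1 equilibrium, P + D ⇌ PD, for which (1 - r)(α - r)/r = (1/K)(1/[DNA]); the function names and the synthetic dilution series are hypothetical. Because the least-squares intercept is linear in α, the sketch solves for the α that gives a zero intercept directly rather than scanning candidate values.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def transformed(r_values, alpha):
    """y = (1 - r)(alpha - r)/r for each measured bound fraction r."""
    return [(1 - r) * (alpha - r) / r for r in r_values]

def fit_binding_constant(dna_conc, r_values):
    """Find alpha giving a zero intercept, then K = 1/slope."""
    xs = [1.0 / d for d in dna_conc]
    # The intercept is linear in alpha, so two evaluations locate its zero.
    _, b0 = linear_fit(xs, transformed(r_values, 0.0))
    _, b1 = linear_fit(xs, transformed(r_values, 1.0))
    alpha = -b0 / (b1 - b0)
    slope, _ = linear_fit(xs, transformed(r_values, alpha))
    return 1.0 / slope, alpha

def simulate_r(K, alpha, d):
    """Bound fraction r for a 1:1 equilibrium (synthetic data only)."""
    kd = K * d
    b = kd * (1 + alpha) + 1
    return (b - math.sqrt(b * b - 4 * kd * kd * alpha)) / (2 * kd)

# Hypothetical dilution series (molar), generated from assumed K and alpha.
true_K, true_alpha = 2.5e9, 1.5
dna = [2e-9 / 2 ** i for i in range(6)]
r_obs = [simulate_r(true_K, true_alpha, d) for d in dna]

K_est, alpha_est = fit_binding_constant(dna, r_obs)
print(f"K = {K_est:.3g} M^-1, alpha = {alpha_est:.3g}")
```

With noise-free synthetic input the fit recovers the K and α used to generate it; with real gel data the residual scatter about the fitted line would indicate how well the 1:1 model holds.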

Protein Assays

Protein was determined following the method of Bradford(33) , using bovine serum albumin as a protein standard. For the stoichiometry experiment, UMSBP concentration was quantified from the amino acid analysis of the protein, carried out at the amino acid analysis laboratory, The Weizmann Institute of Science, Rehovot, Israel.

Protein Sequence Analysis

Protein sequences were analyzed using the Genetics Computer Group software package (1991), Madison, Wisconsin.


RESULTS

Purification of Crithidia fasciculata UMSBP

The electrophoretic mobility shift assay was used to monitor and quantify the specific binding of UMSBP to the UMS H-strand(17) . C. fasciculata UMSBP was purified approximately 5,000-fold over the cleared cell lysate (Fraction I) to apparent homogeneity, with an overall yield of about 5% (Table 1). The procedure yields 18.7 µg of pure UMSBP from 7 g of C. fasciculata wet cell paste (2 10 cells).



The cleared cell lysate was fractionated by ammonium sulfate and then subjected to hydrophobic chromatography on phenyl-Sepharose and adsorption chromatography on hydroxyapatite. The following chromatofocusing step separated two forms of UMSBP (Table 1, Fractions VIa and VIb) with estimated pI values of 7.25 and 6.25, respectively, and apparent polypeptide masses of 12.6 and 13.7 kDa (under denaturing and reducing conditions) (Fig. 1). Partial sequencing of the shorter polypeptide chain by Edman degradation revealed the absence of 11 amino acid residues at the protein N terminus compared with the sequence of the cDNA ORF (18). The identity of the longer polypeptide was verified by peptide mapping of the two polypeptide chains (not shown). Only the longer polypeptide was observed and co-chromatographed with UMS-binding activity at the purification steps prior to the chromatofocusing (Fraction VI). We presume that the shorter polypeptide, which has a significantly lower binding affinity to UMS DNA (not shown), is a degradation product of the full-length protein, formed at this stage of the procedure. The final chromatography on hydroxyapatite was carried out separately for the two forms of the protein, recovering approximately 7% (Fraction VIIa) and 5% (Fraction VIIb) of the overall UMS-binding activity measured in the cell lysates. However, a minor fraction of the shorter polypeptide is still present in the final UMSBP preparation (Fraction VIIb). Apparently homogeneous UMSBP preparations were stable for at least one year at -75 °C, in the presence of 2 mM Mg2+ ions.


Figure 1: SDS-polyacrylamide gel electrophoresis of UMSBP. A sample of 120 ng of protein Fraction VIIb and molecular weight markers were electrophoresed in an SDS-polyacrylamide gel composed of: 16.5% T, 3% C separating gel; 10% T, 3% C spacer gel; and 4% T, 3% C stacking gel, following the procedure of Schägger and von Jagow (29) as described under "Experimental Procedures." Protein markers were prestained ovalbumin (46 kDa), carbonic anhydrase (30 kDa), trypsin inhibitor (21.5 kDa), and aprotinin (6.5 kDa) (Rainbow, Amersham Corp.), and 17-, 14.4-, 10.6-, and 8.2-kDa myoglobin fragments (SDS-17S, Sigma).



Physical Properties: Molecular Weight and Subunit Structure

Some of the physical properties of C. fasciculata UMSBP are summarized in Table 2. The purified protein migrates in SDS-polyacrylamide gels as a polypeptide band of 13.7 kDa (Fig. 1). This 13.7-kDa polypeptide chain co-fractionated with the specific UMS binding activity upon chromatography on phenyl-Sepharose (Fraction III) and hydroxyapatite (Fraction V), chromatofocusing (Fraction VI), G3000 SW HPLC gel filtration (Fig. 2), and glycerol gradient sedimentation (Fig. 3). The apparent protomer mass, as measured by SDS-polyacrylamide gel electrophoresis, is similar to the one calculated on the basis of the UMSBP-encoding cDNA (18). Gel filtration data yielded a Stokes radius of 28 Å, as calculated by the method of Siegel and Monty (34) (Fig. 2). A sedimentation coefficient of 2.37 S was measured in a 10-30% (v/v) glycerol gradient (Fig. 3), following the method of Martin and Ames (35). The apparent native mass of the protein, calculated from the experimental sedimentation coefficient and Stokes radius (assuming a partial specific volume of 0.725 ml/g) (34), is estimated as 27.4 kDa. The frictional coefficient (f/f0), calculated by the method of Siegel and Monty (34), is 1.35, indicating an axial ratio of approximately 1:7.
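The native-mass estimate can be reproduced approximately with the Siegel and Monty relation M = 6πηN·a·s / (1 - v̄ρ). The sketch below is illustrative only: the solvent viscosity and density are assumed to be those of water at 20 °C (the text does not state the values used), and the function and constant names are hypothetical.

```python
import math

AVOGADRO = 6.02214e23           # molecules per mole
VISCOSITY_WATER_20C = 0.01002   # poise (g cm^-1 s^-1); assumed, not from the text
DENSITY_WATER_20C = 0.9982      # g/ml; assumed, not from the text

def native_mass(stokes_radius_angstrom, s_svedberg, partial_specific_volume):
    """Molecular mass (Da) from Stokes radius and sedimentation coefficient."""
    a_cm = stokes_radius_angstrom * 1e-8   # Angstrom -> cm
    s_sec = s_svedberg * 1e-13             # Svedberg -> seconds
    buoyancy = 1.0 - partial_specific_volume * DENSITY_WATER_20C
    return (6.0 * math.pi * VISCOSITY_WATER_20C * AVOGADRO
            * a_cm * s_sec / buoyancy)

# Values reported in the text: 28 Angstrom, 2.37 S, 0.725 ml/g, 13.7-kDa protomer.
mass_da = native_mass(28.0, 2.37, 0.725)
protomers = mass_da / 13700.0
print(f"native mass ~ {mass_da / 1000:.1f} kDa, ~ {protomers:.2f} protomers")
```

Under these assumptions the result falls close to the 27.4 kDa reported above and corresponds to about two 13.7-kDa protomers; small differences would be expected from the exact viscosity and density values chosen.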




Figure 2: Gel filtration of UMSBP. UMSBP and protein size markers were filtered through a G3000 SW HPLC column as described under "Experimental Procedures." Protein markers and their Stokes radii were bovine serum albumin (35.5 Å), bovine pancreas chymotrypsinogen (22.4 Å), bovine lactalbumin (20.1 Å), and horse cytochrome C (16.4 Å). V0 was determined using bovine thyroglobulin (669 kDa). UMSBP was detected by the standard mobility-shift assay and protein markers by A. The Stokes radius of UMSBP was interpolated from the linear plot of (-log K) versus the known Stokes radii of the protein markers, as described by Siegel and Monty (34).




Figure 3: Glycerol gradient sedimentation of UMSBP. UMSBP and protein size markers were centrifuged in a 10-30% (v/v) glycerol gradient and assayed as described under "Experimental Procedures." Protein markers and their sedimentation coefficients (s) were E. coli DNA polymerase I (5.6 S), human hemoglobin (4.13 S), horseradish peroxidase (3.85 S), and horse cytochrome C (2.1 S). The sedimentation coefficient was interpolated from the linear plot of the s values of the markers, as described by Martin and Ames (35).



These data suggest that the native C. fasciculata UMSBP is a homodimer with a protomer mass of 13.7 kDa. On the basis of the protein activity in cell extracts, the apparent molecular weight, and the specific activity of the pure protein, we estimate the presence of approximately 12,000 UMSBP molecules/Crithidia cell.

Characteristics of the Binding Reaction

Generation of UMSBP·DNA complexes, as monitored by the mobility shift assay, is greatly enhanced in the presence of bovine serum albumin at a concentration of 1 mg/ml. Chelation of divalent cations by EDTA prevents the formation of protein-DNA complexes, indicating the possible involvement of metal ions in the putative zinc fingers. Optimal binding activity is observed at 60 mM KCl. Higher concentrations of monovalent ions are inhibitory, with 50% inhibition observed in the presence of 200 mM KCl or 150 mM NaCl. Glycerol concentrations up to 50% (v/v) enhance complex stability upon electrophoresis in native polyacrylamide gels. The optimal electrophoresis temperature is in the range of 2-5 °C. At temperatures higher than 8 °C, dissociation of protein-DNA complexes during electrophoresis is fast and results in smeared bands. Raising the polyacrylamide gel concentration from 4 to 8% stabilizes protein-DNA complexes and reduces their dissociation, presumably due to a caging effect (36, 37).

Structural Features of the DNA Ligand

We have previously shown (17) the specific binding of UMSBP to a G-rich single-stranded sequence. The equilibrium binding constant measured for the interaction of UMSBP with a 12-mer oligonucleotide comprising the H-strand sequence of UMS is 2.5 × 10^9 M^-1 (Fig. 4). A similar (2.6 × 10^9 M^-1) equilibrium binding constant was measured with a 40-mer oligonucleotide that consists of the H-strand of UMS and its flanking sequences at the origin region of the C. fasciculata kDNA minicircle (Fig. 4). This observation indicates that neighboring sequences flanking UMS at the minicircle origin site, in their single-stranded conformation, have no significant effect on the binding of UMSBP.


Figure 4: Determination of the equilibrium binding constants of UMSBP·DNA complexes. Samples containing serial dilutions of both UMSBP and DNA ligand (at a constant molar ratio) were analyzed by the mobility shift assay, following the procedure of Fried and Crothers (31) and Liu-Johnson et al. (32), as described under "Experimental Procedures." The DNA substrates used were UMS-H12, a 12-mer oligonucleotide representing the H-strand of UMS; UMS-H40, a 40-mer oligonucleotide containing the H-strand of UMS and flanking sequences from C. fasciculata kDNA minicircle; TEL-12, a 12-mer oligonucleotide (5'-GGGGTTGGGGTT-3') that contains two telomeric repeats from T. thermophila. Data were analyzed by plotting (1 - r)(α - r)/r versus 1/[DNA] and adjusting α to obtain a y intercept value of 0, where r is the fraction of DNA radioactivity that is in the band representing the protein-DNA complexes, [DNA] is the total concentration of DNA in the reaction, and α is the unknown but constant molar ratio of active protein to total DNA. The reciprocals of the slopes yield K = 2.5 × 10^9 M^-1 for UMSBP interaction with UMS-H12 (), K = 2.6 × 10^9 M^-1 with UMS-H40 (•), and K = 4.1 × 10^9 M^-1 with TEL-12 (▲) (TEL-12 concentrations account for only the fraction of monomeric molecules, as the G-quartets are not bound by UMSBP (see below, Fig. 5)).




Figure 5: UMSBP binds a single-stranded but not a four-stranded DNA structure. 0.1 ng (23 fmol) of 5'-[32P]-labeled TEL-12 (5'-GGGGTTGGGGTT-3'), which contains two telomeric repeats of T. thermophila, were incubated with 2.6 (lane a), 8.1 (lane b), and 24.2 (lane c) units of UMSBP (Fraction VIIb) and analyzed on a native polyacrylamide gel under the standard mobility-shift assay conditions. Lane d contains no UMSBP. Indicated are: UMSBP·DNA complexes, free monomeric DNA molecules, and free DNA molecules that had been dimerized by forming a G4 structure.



G-rich sequences similar to UMS, such as those of eukaryotic telomere termini, the retroviral RNA genome dimerization site, gene regulatory elements, and immunoglobulin switch regions, form special four-stranded (quadruplex) DNA structures in vitro. These structures, known as G-quartets or G4-DNA, are stabilized by Hoogsteen base pairing (38, 39, 40, 41, 42). Several recently discovered proteins bind specifically to these special conformations (43, 44). Considering the specific binding of UMSBP to a G-rich ligand that may potentially form a four-stranded structure, we have explored the possibility that such a conformation is recognized by the protein. Since we could not detect stable quadruplexes formed in vitro by the 12-mer UMS H-strand oligonucleotide, we have used for this purpose a similar oligonucleotide containing the repeated Tetrahymena telomeric sequence 5'-GGGGTTGGGGTT-3'. UMSBP binds tightly to this telomeric sequence (Fig. 4 and Ref. 17). This oligonucleotide adopts two different DNA conformations that migrate as two different bands upon electrophoresis in a native polyacrylamide gel. The lower mobility band corresponds to the quadruplex structure, which is composed of two oligonucleotide molecules in a fold-back conformation (38, 45), while the higher mobility band represents the monomeric structure. Mobility-shift analyses (Fig. 5) clearly demonstrate that UMSBP binds only the higher mobility monomeric molecules, but not the lower mobility four-stranded dimers.

Stoichiometry of UMS DNA and UMSBP in the Protein-DNA Complex

On the basis of its encoding cDNA (18), UMSBP contains five CCHC-type zinc finger motifs. Thus, the native UMSBP homodimer isolated from C. fasciculata may contain a total of 10 such zinc finger structures. The functional role of each of these potential zinc fingers is yet to be studied. However, understanding the specific UMSBP·UMS interactions requires an accurate estimation of the stoichiometry of the protein and DNA reactants in this specific nucleoprotein complex.

No binding of a dimeric DNA ligand (such as G4-DNA) by UMSBP could be observed (Fig. 5). However, we have further explored the possibility that a single UMSBP molecule may simultaneously bind more than one UMS site. To address this question, we have used two DNA ligands that contain the 12-mer UMS sequence but differ in their length. The oligonucleotide UMS-H12 contains only the 12-mer H-strand of UMS, while the 40-mer UMS-H40 contains the UMS 12-mer and its flanking sequence at the minicircle H-strand. Whereas both DNA ligands are tightly bound by UMSBP (equilibrium binding constants measured for the two protein-DNA interactions were almost identical (Fig. 4)), the two protein-DNA complexes differ in their electrophoretic mobility in native polyacrylamide gels. If UMSBP binds only one UMS site, then two types of protein-DNA complexes could be expected: UMSBP·(UMS-H12) and UMSBP·(UMS-H40). However, if the complex contains two UMS elements, then three types of complexes may be expected: UMSBP·(UMS-H12)2, UMSBP·(UMS-H40)2, as well as UMSBP·(UMS-H12)1·(UMS-H40)1. Fig. 6 describes the results of such an experiment in which the oligonucleotides UMS-H12 and UMS-H40 were mixed together at various molar ratios as indicated, heat denatured in order to disrupt any pre-existing higher order structures, and used as radioactive probes in an electrophoretic mobility shift experiment with UMSBP. Reciprocal titration of one species of UMS DNA over the other at the various molar ratios yields only two types of protein-DNA complexes. No additional species of protein-DNA complexes could be detected, indicating that only one DNA molecule is present in the UMSBP·UMS complex.
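The counting argument behind this mixing experiment can be made explicit with a short enumeration. The sketch below is only an illustration of the combinatorics: it assumes that binding sites on one protein are equivalent, so complexes differ only in which ligands they carry, and the function name is hypothetical.

```python
from itertools import combinations_with_replacement

def expected_complexes(ligands, sites_per_protein):
    """Distinct ligand compositions for a protein with the given number of
    equivalent DNA-binding sites, all of them occupied."""
    return list(combinations_with_replacement(ligands, sites_per_protein))

ligands = ["UMS-H12", "UMS-H40"]
one_site = expected_complexes(ligands, 1)   # one DNA molecule per complex
two_site = expected_complexes(ligands, 2)   # two DNA molecules per complex

for label, species in (("one site", one_site), ("two sites", two_site)):
    print(label, len(species), species)
```

A one-site model predicts two complex species and a two-site model predicts three, including the mixed UMS-H12/UMS-H40 species; observing only two bands across all mixing ratios is therefore what the one-site model predicts.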


Figure 6: Stoichiometry of UMS elements bound in the UMSBP·UMS complex. UMSBP (Fraction VIb, 26 units) was incubated under the standard mobility-shift assay conditions in a series of binding reaction mixtures containing 46 fmol total of 5'-32P-labeled UMS-H12 and 5'-32P-labeled UMS-H40 at the following UMS-H12/UMS-H40 molar ratios: only UMS-H12, 7:1, 3:1, 1.7:1, 1:1, 1:1.7, 1:3, 1:7, only UMS-H40 (lanes a-i, respectively). Reaction products were electrophoresed in a native 5% polyacrylamide gel at 4 °C and 16 V/cm for 2 h. Indicated are: UMSBP·(UMS-H40) and UMSBP·(UMS-H12) complexes; free UMS-H12 and UMS-H40 oligonucleotides.



To determine the precise number of UMSBP monomers that bind a single UMS element in the complex, we have conducted a mobility-shift electrophoresis analysis of the protein-DNA complexes using 35S-labeled UMSBP and 5'-32P-labeled UMS DNA. We have measured a value of 2.1 UMSBP monomers/UMS site (Fig. 7), indicating the apparent binding of two UMSBP monomers to each UMS binding site and suggesting that UMSBP binds to DNA as a protein dimer.


Figure 7: Stoichiometry of the protein monomers in the UMSBP·UMS DNA complex. UMSBP was prepared by the specific proteolytic cleavage of UMSBP-glutathione S-transferase fusion protein, expressed in E. coli (H. Abeliovich and J. Shlomai, manuscript in preparation) in the presence of [35S]cysteine. 9.3 pmol (monomers) of 35S-labeled UMSBP were incubated in the presence of 0, 0.55, 0.82, 1.24, 1.83, 2.75, and 4.12 pmol of either unlabeled or 5'-32P-labeled UMS-H12 DNA, under the standard mobility shift assay conditions. Reaction mixtures were electrophoresed under the standard mobility shift assay conditions (except that the electrophoresis TAE buffer was at pH 8.0). The amount of protein and DNA molecules in the UMSBP·UMS complexes was determined using a PhosphorImager. The slope (in the range of 0-2.75 pmol of DNA) yields a ratio of 2.1 UMSBP monomers/UMS site.



Conservation of the CCHC-Type Zinc Finger Motif

The amino acid sequence of UMSBP, as predicted from the cDNA (18), contains five CCHC-type zinc finger motifs. The sequence conservation of this motif in a growing group of cellular proteins including UMSBP is demonstrated in Fig. 8. The 35 CCHC motifs of UMSBP and HEXBP (from flagellated protozoa), CnjB (from ciliated protozoa), byr3 (from yeast), and CNBP (from mammals) were found to be remarkably conserved. Apart from the conservation of the cysteines and histidine, which apparently function in the coordination of a zinc ion, glycine residues are conserved at positions 5 and 8 of the motif. The presence of these glycine residues may reflect a requirement for small, sterically nondemanding residues at these positions of the compact motif structure, as suggested for the retroviral motifs (20). The conservation of a proline residue at position 15 may reflect a structural conservation of a turn in the backbone of the motif. An aromatic residue (tyrosine or phenylalanine) is conserved at position 2, a hydrophobic residue at position 10, and serine or alanine at position 11. The same residues at the same positions were found by South and Summers (25) to form a hydrophobic cleft in the HIV-1 Gag motif within which the DNA ligand binds. The aromatic and alanine residues, at the same conserved positions (2 and 11) in the retroviral motif, form specific hydrogen bonds with a guanine base of the DNA ligand. However, another basic residue (lysine) that is present immediately before the first cysteine of the Gag motif and forms a specific contact with the guanine base is not conserved in the CCHC motifs of these cellular proteins. Instead, a basic residue that might serve the same function is conserved at position 3 of their motifs. Another basic residue is conserved at position 12. The side chain of an arginine residue at this position of the Gag motif was found to form a nonspecific electrostatic interaction with the phosphodiester backbone of the ligand. Finally, an acidic residue (aspartate or glutamate) is conserved at position 13.
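The Cys-X2-Cys-X4-His-X4-Cys spacing that defines the motif, and the position numbering used above (the first cysteine is position 1), can be illustrated with a simple pattern search. The sketch below is an assumption-laden illustration, not the alignment procedure used for Fig. 8: the peptide string is invented for demonstration and is not the UMSBP sequence, and the function name is hypothetical.

```python
import re

# Cys-X2-Cys-X4-His-X4-Cys; the lookahead allows overlapping candidates.
CCHC = re.compile(r"(?=(C..C....H....C))")

def find_cchc_motifs(seq):
    """Return (1-based start, motif) for every CCHC-spaced match in seq."""
    return [(m.start() + 1, m.group(1)) for m in CCHC.finditer(seq)]

# Invented peptide for illustration only (not a real protein sequence).
example = "MSAVTCYKCGEAGHMSRECPKAASTCFNCGKTGHLARDCPE"
motifs = find_cchc_motifs(example)

for start, motif in motifs:
    # Motif position n corresponds to string index n - 1.
    print(start, motif,
          "pos2:", motif[1], "pos5:", motif[4], "pos8:", motif[7])
```

In this made-up example both matches carry an aromatic residue at position 2 and glycine at positions 5 and 8, which is the pattern of conservation described in the text. Position 15 lies one residue past the last cysteine, so checking it would require extending the captured window.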


Figure 8: Conservation of CCHC-type zinc finger motifs. 35 CCHC motifs from 5 cellular proteins are compared. In a, alignment of the amino acid sequence of the 35 CCHC motifs of C. fasciculata UMSBP (18), T. thermophila CnjB (24), human and mouse CNBP (21), L. major HEXBP (22), and S. pombe Byr3 (23). Shaded background denotes conserved amino acids, and dark background indicates the cysteine and histidine residues of the CCHC motif. In b, a summary of the amino acid conservation found at each of the positions of the CCHC motifs is shown.



Overall, we have found a high degree of conservation in 13 out of the 15 positions of the CCHC motifs of this family of eukaryotic cellular proteins. This remarkable conservation can be explained in light of the functions found by South and Summers for the same residues at the same positions in the HIV-1 Gag motif(25) .

Binding of UMSBP to Single-stranded RNA

It has been shown that in addition to its RNA ligand, the retroviral CCHC-type zinc finger can bind a single-stranded DNA analog (25). Considering the remarkable homology between the CCHC motifs of the retroviral Gag and those of UMSBP, it was expected that UMSBP would bind the single-stranded RNA analog of UMS. To explore this possibility, we have used the 12-mer ribo-oligonucleotide analog of the G-rich strand of UMS (rUMS, 5'-GGGGUUGGUGUA-3') in a mobility-shift assay. Fig. 9 demonstrates that UMSBP indeed binds the RNA analog of the UMS H-strand. Binding competition experiments, using an RNA competitor of a higher sequence complexity than that of UMS (Fig. 10), clearly demonstrate the sequence specificity of UMSBP interactions with the single-stranded RNA ligand. The equilibrium binding constant measured for this protein-RNA interaction (K = 2.0 × 10^10 M^-1; Fig. 10, inset) is 8-fold higher than the value measured with the DNA ligand (Fig. 4). Inasmuch as the CCHC-type zinc finger motif is conserved in the group of cellular proteins from yeast and protozoa to mammals, the presumption is strong that other proteins in this group also bind both types of single-stranded nucleic acids. Whether this RNA binding activity of UMSBP, which is apparently an intrinsic property of the CCHC-type zinc finger motifs, has any physiological significance is yet to be determined.


Figure 9: Binding of UMSBP to an RNA analog of the G-rich strand of UMS. 0.2 ng (46 fmol) of 5'-32P-labeled 12-mer deoxyoligonucleotide, comprising the G-rich sequence of UMS (lanes d-f), or its RNA analog 5'-GGGGUUGGUGUA-3' (lanes a-c) were incubated with 20 (lanes a and d) and 60 (lanes b and e) units of UMSBP (Fraction VIIb) and analyzed on a native polyacrylamide gel under the standard mobility-shift assay conditions. Lanes c and f contain no UMSBP.




Figure 10: Binding specificity of UMSBP-rUMS interaction. 40 units of UMSBP (Fraction VIIb) were incubated under the standard mobility shift assay conditions with 0.09 ng (21 fmol) of the 5'-32P-labeled UMS RNA analog (rUMS) and increasing concentrations of unlabeled rUMS (■) or nonspecific RNA competitor (0.3-7.4 kilobase pairs of RNA transcripts, RNA molecular weight markers, Boehringer Mannheim) (•). Reaction mixtures were analyzed in a native polyacrylamide gel and quantified as described under "Experimental Procedures." The inset describes measurement of the equilibrium binding constant for the interaction of UMSBP and rUMS (K = 2.0 × 10^10 M^-1), calculated as described under "Experimental Procedures" and in the legend to Fig. 4. [RNA] is the total concentration of RNA in the reaction.




DISCUSSION

During the S-phase of the trypanosomatid cell cycle, two highly interlocked kDNA catenanes, one composed of minicircles and the other of maxicircles, replicate at the same time and at the same location. Thus, replication and assembly of the two types of topologically linked kDNA circles require strict coordination between their replication mechanisms. Recently, two copies of an 11-mer sequence identical to UMS (apart from its 3`-terminal residue) were found in the maxicircle variable region of Trypanosoma brucei (46, 47). The presence of this origin-associated sequence in both minicircles and maxicircles may provide a clue for understanding this coordination at the replication initiation step. A specific origin-binding protein that interacts with the origin-associated UMS is a likely candidate to function in the process of replication initiation and may play a role in a mechanism that coordinates kDNA minicircle and maxicircle replication. It is within this context that we searched for and isolated a UMS-binding protein from C. fasciculata cell extracts. Since the 3`-terminal residue of UMS is insignificant for specific binding by UMSBP (17), we expect that both the 12-mer UMS of the minicircles and the homologous 11-mer sequence of the maxicircles would be equally bound by the protein. The conservation of UMSBP binding sites in both maxicircles and minicircles supports a possible role for UMSBP in coordinating the replication of the two types of circles. Since UMS resides within a duplex DNA molecule, binding of UMSBP requires the melting of this sequence. We have recently found that UMSBP binds to native DNA minicircles and that the origin-associated UMS element resides within an unwound or otherwise sharply distorted DNA structure. (^2) We have shown here that UMSBP can bind a UMS RNA analog, as implied by the remarkable homology of the CCHC motifs from UMSBP and the retroviral Gag polyproteins. Whether UMS is indeed transcribed in the trypanosomatid cell, and whether a UMS RNA ligand is actually available for binding by UMSBP, is as yet unknown. Further investigation is required to determine the in vivo binding target and the biological function of UMSBP.

G-rich sequences similar to the UMS, such as those of telomeres (38, 39), the HIV-1 RNA genome dimerization site (40, 41), the IgG switch region (43, 48), and others (44, 49, 50), form special four-stranded structures in vitro, known as quadruplexes or G-quartets. Several DNA-binding proteins were recently found to interact specifically with a G-quartet structure (43, 44, 51, 52). Although UMSBP binds exclusively to the single-stranded nucleic acid conformation (Fig. 5 and (17)), it may participate in regulation of quadruplex formation through its high affinity binding to the single-stranded conformation of quadruplex-forming sequences.

Local melting of the DNA double helix occurs during various cellular activities such as replication, recombination, and transcription. Single-stranded DNA and RNA binding proteins may play important roles in such cellular processes. UMSBP contains Cys-X(2)-Cys-X(4)-His-X(4)-Cys-type zinc finger motifs, typical of proteins that bind exclusively to single-stranded G-rich nucleic acid ligands (20). It belongs to a distinct group of cellular proteins, including Leishmania HEXBP (22), human CNBP (21), yeast byr3 (23), and Tetrahymena CnjB (24), that contain several adjacent CCHC motifs. Comparison of the CCHC-type motifs of these proteins (Fig. 8) reveals a remarkably high degree of conservation in 13 out of the 15 positions of this motif. Most of the conservation can be explained in light of the functions found by South and Summers (25) for the same residues at the same positions of the HIV-1 Gag motif. On the basis of these data, we suggest that the CCHC zinc finger motif is strictly conserved not only in its primary amino acid sequence and structure, but also in its mechanism of single-stranded nucleic acid binding. The observation that UMSBP is able to bind an RNA analog of the G-rich strand of UMS (Fig. 9 and Fig. 10) supports this notion. Whether the proteins of this well defined group share biological functions other than binding to single-stranded nucleic acids is yet to be explored.
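The Cys-X(2)-Cys-X(4)-His-X(4)-Cys spacing described above maps directly onto a regular expression. The following Python sketch is an illustration under stated assumptions, not the procedure used in this work: the helper name find_cchc is hypothetical, and the demonstration sequence is invented for the example and is not the UMSBP sequence.

```python
import re

# Cys-X(2)-Cys-X(4)-His-X(4)-Cys written as a regular expression.
# The lookahead wrapper lets overlapping candidates be reported as well.
CCHC = re.compile(r"(?=(C.{2}C.{4}H.{4}C))")

def find_cchc(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched segment) for each CCHC-type motif."""
    return [(m.start() + 1, m.group(1)) for m in CCHC.finditer(sequence.upper())]

# Hypothetical test sequence (NOT UMSBP): two motifs separated by a short linker.
demo = "MSAKCYNCGQSGHLARDCPSGRTCFKCGEVGHFSRECP"
for start, segment in find_cchc(demo):
    print(start, segment)
```

A scan of this kind only tests the spacing of the Cys and His residues; it says nothing about the conservation at the intervening positions or about zinc or nucleic acid binding.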


FOOTNOTES

*
This work was supported, in part, by grants from the United States/Israel Binational Science Foundation (BSF 89-00190), the Israel Science Foundation, administered by the Israel Academy of Sciences and Humanities (117/92), and the Israeli Ministry of Health (2400). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked ``advertisement'' in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
To whom correspondence should be addressed. Tel.: 972-2-758089; Fax: 972-2-432527/784010; SHLOMAI@MD2.HUJI.AC.IL.

(^1)
The abbreviations used are: kDNA, kinetoplast DNA; UMS, universal minicircle sequence; UMSBP, universal minicircle sequence binding protein; CNBP, cellular nucleic acid binding protein; HEXBP, hexamer binding protein; H-strand, heavy strand; CCHC, Cys-X(2)-Cys-X(4)-His-X(4)-Cys; HIV-1, human immunodeficiency virus, type 1; HPLC, high performance liquid chromatography.

(^2)
D. Avrahami, Y. Tzfati, and J. Shlomai, manuscript in preparation.


ACKNOWLEDGEMENTS

We thank Ran Avrahami for help and advice with the statistical analyses.


REFERENCES

  1. Ray, D. S. (1987) Plasmid 17, 177-190
  2. Ryan, K. A., Shapiro, T. A., Rauch, C. A., and Englund, P. T. (1988) Annu. Rev. Microbiol. 42, 334-358
  3. Simpson, L. (1987) Annu. Rev. Microbiol. 41, 363-382
  4. Englund, P. T. (1979) J. Biol. Chem. 254, 4895-4900
  5. Englund, P. T. (1987) Cell 14, 157-168
  6. Ferguson, M., Torri, A. F., Ward, D. C., and Englund, P. T. (1992) Cell 70, 621-629
  7. Perez-Morga, D. L., and Englund, P. T. (1993) Cell 74, 703-711
  8. Perez-Morga, D., and Englund, P. T. (1993) J. Cell Biol. 123, 1069-1079
  9. Sheline, C., Melendy, T., and Ray, D. S. (1989) Mol. Cell. Biol. 9, 169-176
  10. Sheline, C., and Ray, D. S. (1989) Mol. Biochem. Parasitol. 37, 151-158
  11. Birkenmeyer, L., and Ray, D. S. (1986) J. Biol. Chem. 261, 2362-2368
  12. Kitchin, P. A., Klein, V. A., Fein, B. I., and Englund, P. T. (1984) J. Biol. Chem. 259, 15532-15539
  13. Ntambi, J. M., and Englund, P. T. (1985) J. Biol. Chem. 260, 5574-5579
  14. Ntambi, J. M., Shapiro, T. A., Ryan, K. A., and Englund, P. T. (1986) J. Biol. Chem. 261, 11890-11895
  15. Ryan, K. A., Shapiro, T. A., Rauch, C. A., Griffith, J. D., and Englund, P. T. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 5844-5848
  16. Ryan, K. A., and Englund, P. T. (1989) J. Biol. Chem. 264, 823-830
  17. Tzfati, Y., Abeliovich, H., Kapeller, I., and Shlomai, J. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 6891-6895
  18. Abeliovich, H., Tzfati, Y., and Shlomai, J. (1993) Mol. Cell. Biol. 13, 7766-7773
  19. Henderson, L. E., Copeland, T. D., Sowder, R. C., Smythers, G. W., and Oroszlan, S. (1981) J. Biol. Chem. 256, 8400-8406
  20. Summers, M. F. (1991) J. Cell. Biochem. 45, 41-48
  21. Rajavashisth, T. B., Taylor, A. K., Andalibi, A., Svenson, K. L., and Lusis, A. J. (1989) Science 245, 640-643
  22. Webb, J. R., and McMaster, W. R. (1993) J. Biol. Chem. 268, 13994-14002
  23. Hao-Peng, X., et al. (1992) Mol. Biol. Cell 3, 721-734
  24. Taylor, F. M., and Martindale, D. W. (1993) Nucleic Acids Res. 21, 4610-4614
  25. South, T. L., and Summers, M. F. (1993) Protein Science 2, 3-19
  26. Morellet, N., Jullian, N., De Rocquigny, H., Maigret, B., Darlix, J. L., and Roques, B. P. (1992) EMBO J. 11, 3059-3065
  27. Omichinski, J. G., Clore, G. M., Sakaguchi, K., Appella, E., and Gronenborn, A. M. (1991) FEBS Lett. 292, 25-30
  28. Summers, M. F., South, T. L., Kim, B., and Hare, D. R. (1990) Biochemistry 29, 329-340
  29. Schagger, H., and Von Jagow, G. (1987) Anal. Biochem. 166, 368-379
  30. Richardson, C. C., Schildkraut, C. L., Aposhian, H. V., and Kornberg, A. (1964) J. Biol. Chem. 239, 222-232
  31. Fried, M. G., and Crothers, D. M. (1984) J. Mol. Biol. 172, 241-262
  32. Liu-Johnson, H.-N., Gartenberg, M. R., and Crothers, D. M. (1986) Cell 47, 995-1005
  33. Bradford, M. M. (1976) Anal. Biochem. 72, 248-254
  34. Siegel, L. M., and Monty, K. J. (1966) Biochim. Biophys. Acta 112, 346-362
  35. Martin, R. G., and Ames, B. N. (1961) J. Biol. Chem. 236, 1372-1379
  36. Fried, M., and Crothers, D. M. (1981) Nucleic Acids Res. 9, 6505-6525
  37. Cann, J. R. (1989) J. Biol. Chem. 264, 17032-17040
  38. Sundquist, W. I., and Klug, A. (1989) Nature 342, 825-829
  39. Williamson, J. R., Raghuraman, M. K., and Cech, T. R. (1989) Cell 59, 871-880
  40. Marquet, R., Baudin, F., Gabus, C., Darlix, J.-L., Mougel, M., Ehresmann, C., and Ehresmann, B. (1991) Nucleic Acids Res. 19, 2349-2357
  41. Sundquist, W. I., and Heaphy, S. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 3393-3397
  42. Murchie, A. I. H., and Lilley, D. M. J. (1992) Nucleic Acids Res. 20, 49-53
  43. Weisman-Shomer, P., and Fry, M. (1993) J. Biol. Chem. 268, 3306-3312
  44. Walsh, K., and Gualberto, A. (1992) J. Biol. Chem. 267, 13714-13718
  45. Guo, Q., Lu, M., and Kallenbach, N. R. (1993) Biochemistry 32, 3596-3603
  46. Sloof, P., de-Haan, A., Eier, W., van-Iersel, M., Boel, E., van-Steeg, H., and Benne, R. (1992) Mol. Biochem. Parasitol. 56, 289-299
  47. Myler, P. J., Glick, D., Feagin, J. E., Morales, T. H., and Stuart, K. D. (1993) Nucleic Acids Res. 21, 687-694
  48. Sen, D., and Gilbert, W. (1988) Nature 334, 364-366
  49. Hammond-Kosack, M. C., Kilpatrick, M. W., and Docherty, K. (1993) J. Mol. Endocrinol. 10, 121-126
  50. Macaya, R. F., Schultze, P., Smith, F. W., Roe, J. A., and Feigon, J. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 3745-3749
  51. Fang, G., and Cech, T. R. (1993) Cell 74, 875-885
  52. Liu, Z., Frantz, J. D., Gilbert, W., and Tye, B. K. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 3157-3161
  53. Shlomai, J., and Linial, M. (1986) J. Biol. Chem. 261, 16219-16225
