(Received for publication, July 27, 1994; and in revised form, December 8, 1994)
The MF3 protein specifically recognizes telomeric and
non-telomeric DNA probes that can form G·G base-paired structures
(Gualberto, A., Patrick, R. M., and Walsh, K. (1992) Genes & Dev. 6, 815-824). Here we further characterize the nucleic
acid recognition properties of MF3 and present a mathematical analysis
that evaluates the potential extent of telomere site occupancy by this
factor. The substitution of dI at dG positions in telomeric DNA probes
revealed that a single dG at any position within the internal repeat
was sufficient for high affinity binding to MF3. The RNA analogs of
high affinity DNA sites were not bound specifically by MF3, but the
substitution of dU for dT in a DNA probe had little or no effect on
binding. These data demonstrate that ribose ring structure is a
critical feature of nucleoprotein complex formation, and this ribose
specificity may enable MF3 to occupy sites of unusual DNA structure
while minimizing interactions with cellular RNAs. Collectively, the
nucleic acid binding properties of MF3 suggest that it may occupy a
significant fraction of sites at telomere ends or other G-rich regions
of altered DNA structure in vivo.
Telomeres consist of long double-stranded stretches of G-rich
repeat sequences including 5'-T$_2$AG$_3$-3', which is
found in vertebrates, and 5'-T$_4$G$_4$-3', which is
found in ciliates. In some organisms, 12-16-nucleotide
single-stranded extensions of the G-rich strand have been documented at
the telomere end (1, 2). Single-stranded
oligonucleotide probes of telomeric repeats can form hairpin duplex,
triplex, and quadruple helical structures through G·G base
pairing, and specific protein complexes with these structures have been
reported (3, 4, 5, 6, 7, 8, 9).
These unusual nucleoprotein complexes may regulate telomere length or
have a capping function that prevents chromosome degradation or
recombination. Some ciliates have >10
telomere ends, and
the proteins that bind to these ends can be selectively released from 2 M salt-extracted macronuclear DNA by nuclease treatment (10). Molecular clones for telomere end-binding proteins have
been isolated from Oxytricha nova and Stylonychia
mytilus (11, 12, 13). However, the
identification of telomere end-binding proteins in higher eucaryotes is
a more formidable task due to the relatively low abundance of these
ends. Mammalian, avian, and Xenopus cell extracts contain
single-stranded DNA-binding activities that recognize telomeric
oligonucleotide probes (3, 14, 15, 16, 17).
The detection of these binding activities in cell extracts from diverse
species indicates that the telomeric nucleoprotein structures may be
highly conserved throughout evolution. However, it is difficult to know
with certainty whether the telomeric probe-binding activities detected
in higher eucaryotes represent true telomeric proteins or whether they
have entirely different functions within the cell. Recently, Ishikawa et al. (17) and McKay and Cooke (14) reported
that single-stranded telomeric probe-binding activities detected in
HeLa cell and mouse liver extracts represent abundant heterogeneous
nuclear ribonucleoproteins that may function in pre-mRNA splicing and
in chromosome capping. In view of these findings, it may be desirable
to elucidate the nucleic acid recognition properties of a putative
telomere-binding protein to determine whether these properties are
consistent with the hypothesis that the factor can occupy sites of
altered DNA structure under physiological conditions. Based upon these
measurements, an initial assessment can be made about whether a
telomeric probe-binding activity can potentially occupy sites at
chromosome ends in vivo.
Previously, we identified a potent
single-stranded DNA-binding activity, referred to as MF3, that
specifically recognized telomeric and non-telomeric
sequences (18). High affinity binding occurred with probes
containing two or more single-stranded stretches of guanine residues
that were capable of G·G base pairing (3). Here we report
that a single guanine N-2 group is essential for factor binding, and
this critical amino group can occur at any of the three guanine
positions within the internal repeat. This nucleic acid specificity
indicates that MF3 could bind to G-rich regions of altered DNA
structure, such as telomere ends, or to abundant cellular RNAs, which
form a multitude of structures containing non-canonical base pairs. To
test the latter possibility, the RNA analogs of high affinity
DNA-binding sites were tested for their ability to bind to MF3. The RNA
probes did not form specific nucleoprotein complexes with MF3. The
lower affinity for RNA appears to result from the presence of the
2'-hydroxyl on the ribose ring and not from differences in pyrimidine
structure (i.e. uracil versus thymine). A
mathematical analysis is presented that accounts for nonspecific
binding to nucleic acids, including RNA, and evaluates the ability of
MF3 to occupy specific sites of G-rich regions of altered DNA structure in vivo. These calculations indicate that the nucleic acid
recognition properties of MF3 are consistent with the hypothesis that
it can function by binding to telomere ends or other sites of unusual
DNA structure within the genome.
Figure 1: Single-stranded telomeric probe sequence and deoxyguanosine position designation. Two vertebrate telomeric repeats occur 3' of an 18-nucleotide sequence that is inert in terms of its ability to bind to factor (18). Derivative probes have deoxyinosine substituted for deoxyguanosine in the internal or terminal telomeric repeats. VET3.0 has three and VET4.0 has four wild-type telomeric repeats (not shown) that occur downstream of the same 18-nucleotide single-stranded fragment contained in the VET2.0 probe and its dI-substituted derivatives.
Figure 2:
A single guanine exocyclic amino group
occurring in the internal telomeric repeat is required for factor
binding. A, the VET2.0 parental probe and
deoxyinosine-substituted probes were tested for their ability to form
nucleoprotein complexes with factor in an electrophoretic mobility
shift assay. For each assay, 10,000 cpm (0.1 pmol) of each probe
was used with 0.2 µg of poly(dI-dC)·poly(dI-dC). B,
shown is the competition for nucleoprotein complex formation with
parental and deoxyinosine-substituted telomeric probes. Complexes were
formed between factor and radiolabeled VET2.0 in the absence or
presence of the designated competitor. Nonlabeled competitor
oligonucleotides were present at 0, 10, 30, or 100 ng (from left to right). The factor source for these experiments was
turkey erythrocytes. The nucleoprotein complex (C) and the
free probes (F) are indicated.
In a previous study (3), it was found that the
exocyclic N-2 amino groups of the internal telomeric repeat, but not
the terminal telomeric repeat, were required for the formation of a
structure that was partially protected from methylation by dimethyl
sulfate. Since these same groups were also required for specific factor
recognition, these data provided correlative evidence that MF3
recognizes a nucleic acid structure that is more complex than a random
coil, and they indicated that the nucleoprotein complex may have a
specific G·G base pair configuration at the core-binding site.
To extend these analyses to the series of dI-substituted derivatives
analyzed in this study, methylation protection studies were performed
on the VET2.1, VET2.2, VET2.3, and VET2.6 probes (Fig. 3). The
methylation protection of N-7 groups in the terminal repeat was most
extensive with the VET2.1 probe, which has three exocyclic N-2 groups
in the internal repeat. The VET2.2 probe, with two internal N-2 groups,
and VET2.3, with one internal N-2 group, gave protection patterns that
were intermediate between the VET2.1 and VET2.6 protection patterns.
Collectively, these data provide further evidence that the VET2 series
of probes can form a chemically protected structure that requires the
presence of at least one exocyclic N-2 group within the internal
telomeric repeat. Furthermore, these data indicate that the stability
of the chemically protected structure is dependent upon the total
number of internal exocyclic amino groups in that the extent of
protection is more pronounced with three N-2 groups than with two or
one.
Figure 3: Chemical protection analyses of the deoxyinosine-substituted telomeric probes in the absence of protein. Probes were compared regarding their sensitivity to methylation and cleavage following treatment with dimethyl sulfate. Methylation patterns of the telomeric repeats are shown above, where the guanosine/inosine residues are denoted as described for Fig. 1. The relative methylation of each probe at positions 4-6 is plotted below as the fraction of total methylation (sum of positions 1-6). These band intensity patterns indicate trends in methylation protection in the terminal repeat (positions 4-6). The ranking of degree of chemical protection at the terminal repeat is VET2.1 > VET2.2 > VET2.3 > VET2.6. Furthermore, for every probe analyzed, the greatest protection occurred at position 4, while position 6 was least protected.
Next we performed binding experiments to determine whether the
position of the dG exocyclic amino group in the internal telomeric
repeat is critical for factor binding. For this experiment, the
derivatives VET2.1 (end structure = GGGTTAIII), VET2.3
(IIGTTAIII), VET2.4 (IGITTAIII), and VET2.5 (GIITTAIII) were compared
for their ability to compete for nucleoprotein complex formation in an
electrophoretic mobility shift assay (Fig. 4). All of these
dI-substituted probes were capable of significant competition, but the
VET2.5 and VET2.4 probes (which have a single dG at positions 1 and 2,
respectively) were slightly weaker competitors than the VET2.3 probe,
which has a single dG at position 3. However, all the derivative probes
with a single dG in the internal repeat were dramatically better at
binding MF3 than was the probe that had a dI substitution at every dG
position (VET2.6). Collectively, the ranking of probe affinity for MF3
is as follows: VET2.0, VET2.1, VET2.2 > VET2.3 > VET2.4, VET2.5 ≫
VET2.6. These data demonstrate that the guanine that contributes
the critical N-2 group can occur at any dG position within the internal
repeat, although position 3 was optimal for binding to MF3.
Figure 4:
The exocyclic amino group required for
factor binding can occur at any guanine position in the internal
telomeric repeat. The indicated probes were tested for their ability to
compete for complex formation in an electrophoretic mobility shift
assay. Nucleoprotein complexes were formed between the factor and the
radiolabeled MREnc probe in the presence or absence of nonlabeled
inosine-containing oligonucleotides. The MREnc sequence has a 4-fold
higher affinity for factor than the VET2.0 probe that was used in Fig. 2B (3). For each assay, 10,000 cpm
(0.1 pmol) of MREnc probe was used with 0.2 µg of
poly(dI-dC)·poly(dI-dC), and the factor source was turkey
erythrocytes. Nonlabeled competitor oligonucleotides were present at 0,
10, 30, or 100 ng (from left to right). The
nucleoprotein complex (C) and the free probes (F) are
indicated.
Figure 5: MF3 forms specific nucleoprotein complexes with a VET2.0 DNA probe, but not with the corresponding VET2.0 RNA probe. The RNA analog of the VET2.0 DNA probe was synthesized using T7 RNA polymerase and a synthetic DNA template that had a double-stranded T7 promoter region upstream from the template sequence. Electrophoretic mobility shift assays were performed in the presence or absence of MF3 from Jurkat cells that was purified by heparin-Sepharose chromatography. Binding reactions contained 30,000 cpm of the DNA or RNA probe. Aliquots of each binding mixture were removed and analyzed by autoradiography following electrophoresis on a denaturing 20% polyacrylamide gel to confirm probe integrity.
Competition experiments were also performed
with the RNA analog of another DNA site that is referred to as MREnc (Fig. 6A). This probe corresponds to the noncoding
strand of the chicken skeletal actin promoter from positions -73
to -100, where MF3 specifically recognizes the region
encompassing the two G-rich repeats in the 3'-portion of this
molecule (18). This RNA analog, MREnc RNA, was compared with
deoxyribonucleic MREnc (MREnc DNA) for binding to MF3 in quantitative
competition experiments (Fig. 6A). The MREnc DNA
sequence was a potent competitor, but the RNA analog of this sequence
and tRNA were relatively ineffective in their ability to compete for
complex formation. Based on quantitative experiments with higher levels
of competitor, the affinity of MF3 for RNA was 3-4 orders of
magnitude lower than for the specific DNA site (Fig. 6B). The binding to MREnc RNA was nonspecific
because this probe was no more effective than tRNA in its ability to
compete for the formation of the specific nucleoprotein complex. We
note that the affinity of MF3 for nonspecific sites in RNA was higher
than for nonspecific sites in duplexed DNA, which have dissociation
constants of $10^{-4}$ M (Table 1).
Figure 6:
Comparison of DNA and RNA in the
competition of the specific nucleoprotein complex. Complexes were
formed between factor and the $^{32}$P-labeled MREnc probe in the
presence or absence of nonlabeled competitors. Competitors were the
noncoding DNA strand of MRE (MREnc DNA), the RNA analog of this MRE
sequence (MREnc RNA), and tRNA. The RNA analog of the skeletal actin
MREnc probe was synthesized using T7 RNA polymerase and a synthetic DNA
template that had a double-stranded T7 promoter region upstream from
the template sequence. Factor was from embryonic chicken skeletal
muscle, and electrophoretic mobility shift assays were performed with
25 pg of $^{32}$P-labeled probe and 100 ng of
poly(dI-dC)·poly(dI-dC). A, for each competition set, the
first lane contained no competitor (0), and subsequent lanes contained
0.3, 1.0, 3.0, 10, and 30 ng of competitor in this order. B,
binding reactions were performed in the absence of competitor (0) or
with competition sets containing increasing amounts of MREnc RNA or
tRNA. Competition sets contained 30 ng, 100 ng, 300 ng, 1 µg, or 3
µg of the designated RNA competitor in this order. The
nucleoprotein complex (C) and the free probes (F) are
indicated.
Further experiments were performed to determine the structural requirements for high affinity binding to specific sequences in single-stranded deoxyribonucleic acid. The ability of a protein to distinguish a single-stranded sequence of DNA from the same sequence of RNA can result from the presence of thymidine in place of uridine or from the absence of the 2'-hydroxyl group from the furanose ring. dU was substituted in place of dT over the entire noncoding strand of the MRE to test whether the methyl group at position 5 of the pyrimidine base had a role in determining MF3 specificity (Fig. 7). The modified sequence is referred to as dU-MREnc. Quantitative competition assays revealed that the substitution of dU for dT had a minimal (<3-fold) effect on the affinity of MF3 for this sequence (Fig. 7). From these data, it appears that the specific recognition of DNA versus RNA results from the difference in ribose ring structure between these nucleic acids. In RNA, the 2'-hydroxyl groups impose severe stereochemical constraints on the overall conformation of the nucleic acid (21), and these data are consistent with the hypothesis that MF3 recognizes a specific DNA structure that may be more complex than a random coil of single-stranded nucleic acid (3). Finally, the substitution of dI for dG in the 3'-portion of the MREnc probe diminished factor binding by a factor of 50 (Fig. 7), indicating that MF3 recognizes similar structural features in the MREnc and telomeric DNA probes.
Figure 7:
Nucleotide structure requirements for high
affinity binding to DNA. Complexes were formed between the $^{32}$P-labeled MREnc probe and factor in the presence or
absence of nonlabeled competitor DNAs. DNA competitors were the
noncoding strand of the MRE (MREnc) and its analogs, dU-MREnc and
dI-MREnc. The analog dU-MREnc has deoxyuridine substituted for
deoxythymidine throughout the probe. The analog dI-MREnc has
deoxyinosine substituted for deoxyguanosine in the 3'-portion of the
sequence. The dU-MREnc and dI-MREnc analogs have G instead of C at the most
5'-position, so they correspond to the MREnc RNA sequence used as a
competitor in Fig. 6. For each competition set, the first lane
contained no competitor (0), and subsequent lanes contained 0.3, 1.0,
3.0, 10, and 30 ng of competitor in this order. Factor was partially
purified from embryonic chicken skeletal muscle, and electrophoretic
mobility shift assays were performed with 25 pg of
$^{32}$P-labeled probe and 100 ng of
poly(dI-dC)·poly(dI-dC).
In Equation 1, $P_t$, $S_t$, $DNA_t$, and $RNA_t$ are the total concentrations of
the nucleic acid-binding protein, the specific binding site, DNA, and
RNA, respectively. $P\cdot S$, $P\cdot DNA$, and $P\cdot RNA$ are the
concentrations of the different protein-nucleic acid complexes, and $K_S$, $K_{DNA}$, and $K_{RNA}$
are the equilibrium dissociation constants for
specific binding, nonspecific binding to DNA, and nonspecific binding
to RNA, respectively. Equation 1 is similar to the competition
equation derived by Lin and Riggs (22), but additional terms
appear to account for nonspecific binding to RNA, a situation that is
likely to occur with single-stranded nucleic acid-binding proteins.
Under conditions where the total concentration of nonspecific sites in
cellular nucleic acids greatly exceeds the concentration of binding
protein and $DNA_t \gg P\cdot DNA$ and $RNA_t \gg P\cdot RNA$, Equation 1 can be simplified, and substituting the term
for fractional specific site occupancy ($\theta = P\cdot S / S_t$) gives Equation 2.
Equation 2 was used to evaluate how the occupancy of a specific site by MF3 or any other protein can be influenced by the abundance of factor and sites and by nonspecific binding to DNA and RNA (Fig. 8).
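The equations cited in this passage are not reproduced in the text as given here. A minimal reconstruction from the definitions above, assuming independent 1:1 equilibria and writing $S$ for the specific site and $\theta$ for fractional occupancy (both symbols are assumptions of this sketch, as is the exact arrangement of terms), might read:

```latex
% Mass balance for the protein and the three equilibria (full form).
\begin{align}
P_t &= [P] + [P\cdot S] + [P\cdot DNA] + [P\cdot RNA], \\
[P\cdot S] &= \frac{[P]\,(S_t - [P\cdot S])}{K_S}, \quad
[P\cdot DNA] = \frac{[P]\,(DNA_t - [P\cdot DNA])}{K_{DNA}}, \quad
[P\cdot RNA] = \frac{[P]\,(RNA_t - [P\cdot RNA])}{K_{RNA}}.
\end{align}
% With DNA_t >> [P.DNA], RNA_t >> [P.RNA], and theta = [P.S]/S_t,
% the free protein is [P] = K_S * theta / (1 - theta), which gives
% the simplified form.
\begin{equation}
P_t = \theta\,S_t + \frac{K_S\,\theta}{1-\theta}
      \left(1 + \frac{DNA_t}{K_{DNA}} + \frac{RNA_t}{K_{RNA}}\right).
\end{equation}
```

As a consistency check on this reconstruction, inserting $K_S$ = 1 nM, $K_{DNA}$ = 100 µM, $K_{RNA}$ = 4 µM, $DNA_t$ = 58 mM, $RNA_t$ = 6.7 mM, and $S_t$ = 2.5 nM gives $\theta \approx 0.10$ at $P_t$ = 0.25 µM and $\theta \approx 0.53$ at $P_t$ = 2.5 µM, close to the 10% and 53% occupancies quoted later in the text.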
Figure 8:
Parameters that influence estimated
specific site occupancy by factor in vivo. Plots were
constructed from the relationships expressed in Equation 2. Estimated
site occupancy ($\theta$) was calculated using the parameters determined
for the factor described in this study except when that parameter was
varied for the calculations. The dissociation constant for specific
sites ($K_S$) was $1 \times 10^{-9}$ M, and the dissociation constants for nonspecific
binding to DNA and RNA ($K_{DNA}$ and $K_{RNA}$) were $1 \times 10^{-4}$
and $4 \times 10^{-6}$ M, respectively. The
intracellular RNA concentration is 2.2 mg ml$^{-1}$, and
the intranuclear DNA concentration is 19 mg ml$^{-1}$ for a
chicken embryo fibroblast; and thus, the calculated concentrations of
nonspecific sites in RNA ($RNA_t$) and DNA
($DNA_t$) are $6.7 \times 10^{-3}$ and
$5.8 \times 10^{-2}$ M, respectively. For many of
these calculations, it was assumed that the intranuclear concentration
of factor (P
) was 2.5
10
M (1
10
copies/nucleus) and that the
concentration of specific sites ($S_t$) was $2.5 \times 10^{-9}$ M (100 per nucleus). A,
estimated specific site occupancy as a function of the concentration of
binding protein calculated with different values for the dissociation
constant for the specific protein-nucleic acid complex; B,
estimated specific site occupancy as a function of the concentration of
specific binding sites in the nucleus calculated for different nuclear
concentrations of binding protein; C, estimated specific site
occupancy as a function of total RNA concentration in the cell
calculated with different values for the dissociation constant for the
nonspecific interaction with RNA (for these calculations, the
contribution of nonspecific binding to DNA was assumed to be negligible
($DNA_t$ = 0)); D, estimated specific
site occupancy as a function of total DNA concentration in the nucleus
calculated with different values for the dissociation constant for the
nonspecific interaction with DNA (for these calculations, the
contribution of nonspecific binding to RNA was assumed to be negligible
($RNA_t$ = 0)).
The parameters used for calculations of site
occupancy were derived from previous measurements of cellular DNA and
RNA content and volume, estimates of telomere-binding protein levels,
and a value for the volume of the nucleus that has been used by others.
Similar assumptions have been used to estimate the extent of occupancy of the
adenovirus major late promoter by the MLTF DNA-binding
protein (23). The content of DNA and RNA in chicken embryo
fibroblasts is 1.3 and 1.55 pg/cell, respectively (24). Thus,
the calculated concentration of total intracellular RNA is 2.2 mg
ml$^{-1}$ ($RNA_t$ = 6.7 mM), and
the calculated concentration of total intranuclear DNA is 19 mg
ml$^{-1}$ ($DNA_t$ = 58 mM),
assuming that the nucleus is one-tenth the total volume of the cell.
Estimates for telomeric probe-binding protein in higher eucaryotes
range from <200 to
10
copies/cell(15, 16, 17) , and the
calculated concentration of telomere-binding protein ($P_t$)
will range from 5 nM to
10
M assuming that the nucleus is a sphere with a 5-µm
diameter (25). Likewise, the concentration of specific binding
sites ($S_t$) may vary from 100 per nucleus (2.5
nM), assuming a single binding site/telomere end, to 100,000
per nucleus (2.5 µM) or more if the protein binds
specifically to abundant non-telomeric sites. The binding constants for
MF3 were used for many of these calculations, where the dissociation
constant for the specific site ($K_S$) is 1
nM, and the dissociation constants for nonspecific binding to
DNA and RNA are 100 and 4 µM, respectively (Table 1) (3).
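The unit conversions behind these parameters can be checked with a short sketch. It assumes a spherical nucleus 5 µm in diameter, as stated above, and an average nucleotide residue mass of 330 g/mol; the residue mass and all function names are assumptions of this example, not values or code from the study.

```python
import math

AVOGADRO = 6.022e23       # molecules per mole
NUCLEOTIDE_MASS = 330.0   # g/mol per residue; assumed average, not from the text


def nuclear_volume_liters(diameter_um=5.0):
    """Volume in liters of a spherical nucleus of the given diameter."""
    radius_dm = (diameter_um / 2.0) * 1e-5   # 1 um = 1e-5 dm, and 1 dm^3 = 1 L
    return (4.0 / 3.0) * math.pi * radius_dm ** 3


def copies_to_molar(copies, diameter_um=5.0):
    """Molar concentration of `copies` molecules confined to the nucleus."""
    return copies / (AVOGADRO * nuclear_volume_liters(diameter_um))


def nucleotide_molar(mg_per_ml):
    """Molar concentration of nucleotide residues (mg/ml equals g/L)."""
    return mg_per_ml / NUCLEOTIDE_MASS


if __name__ == "__main__":
    for copies in (100, 200, 100_000):
        print(f"{copies:>7} copies/nucleus -> {copies_to_molar(copies):.2e} M")
    print(f"RNA at 2.2 mg/ml -> {nucleotide_molar(2.2) * 1e3:.1f} mM")
    print(f"DNA at 19 mg/ml  -> {nucleotide_molar(19.0) * 1e3:.1f} mM")
```

Under these assumptions, 100 copies per nucleus corresponds to about 2.5 nM and 100,000 copies to about 2.5 µM, in line with the values quoted in the text.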
Fig. 8A shows that the
concentration of the telomere-binding protein and the relative affinity
for a specific site are critical parameters that determine telomere
site occupancy. Under the conditions assumed for these calculations,
50% telomere site occupancy occurs with 10
binding
protein molecules/nucleus when the telomere nucleoprotein dissociation
constant is 10
M,
10
binding protein molecules/nucleus when the dissociation constant
is 10
M, and
10
binding
protein molecules/nucleus when the dissociation constant is
10
M (with $K_{RNA}$ and $K_{DNA}$ values of 4 and 100 µM,
respectively). For the telomeric probe-binding activity described here,
we estimate that the factor level in avian erythrocytes is
10
molecules/cell based upon purification data and
calculations of nucleoprotein complex levels at saturating
concentrations of telomeric probe assuming a 1:1 binding
stoichiometry (3). If this factor is confined to the nucleus,
the calculated level of telomere site occupancy is >92% with
$10^{6}$ MF3 molecules/cell and 53% with $10^{5}$ MF3
molecules/cell. However, if MF3 were present at $10^{4}$
copies/cell, which is the estimated level of the MLTF
transcription factor (26), the calculated level of telomere
site occupancy would only be 10%. These calculations indicate that
telomere end-binding proteins may have to be present at high levels to
achieve a significant level of site occupancy despite the relatively
low level of telomere ends. We note, however, that these estimates of
telomere site occupancy may represent minimal values because much of
the total DNA or RNA may not be free in solution, or large portions of
the nucleic acid may be masked by bound proteins. Despite uncertainties
about the levels of nucleic acid that are available for nonspecific
competition, these calculations indicate that the thermodynamic
properties of MF3 are reasonably consistent with the hypothesis that it
can occupy sites at telomere ends or other sequences of altered DNA
structure within the cell.
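The dependence of occupancy on factor abundance can be explored numerically. The sketch below assumes the simplified competition relationship $P_t = \theta S_t + K_S\,\theta/(1-\theta)\,(1 + DNA_t/K_{DNA} + RNA_t/K_{RNA})$, which is a reconstruction from the definitions in the text rather than the published equation, together with the parameter values quoted above. The function and constant names are hypothetical.

```python
import math

MOLAR_PER_COPY = 2.5e-11        # 100 copies/nucleus = 2.5 nM, as stated in the text
K_S = 1e-9                      # specific-site dissociation constant (M)
K_DNA, K_RNA = 100e-6, 4e-6     # nonspecific dissociation constants (M)
DNA_T, RNA_T = 58e-3, 6.7e-3    # nonspecific site concentrations (M)


def occupancy(p_total, s_total, k_s=K_S):
    """Fractional occupancy (theta) of the specific site.

    Solves P_t = theta*S_t + A*theta/(1 - theta), where
    A = K_S*(1 + DNA_t/K_DNA + RNA_t/K_RNA). This is the quadratic
    S_t*theta^2 - (P_t + S_t + A)*theta + P_t = 0; the smaller root
    is the one that lies between 0 and 1.
    """
    a = k_s * (1.0 + DNA_T / K_DNA + RNA_T / K_RNA)
    b = p_total + s_total + a
    # Numerically stable expression for the smaller quadratic root.
    return 2.0 * p_total / (b + math.sqrt(b * b - 4.0 * s_total * p_total))


def occupancy_from_copies(protein_copies, site_copies=100, k_s=K_S):
    """Same calculation with abundances given as copies per nucleus."""
    return occupancy(protein_copies * MOLAR_PER_COPY,
                     site_copies * MOLAR_PER_COPY, k_s)


if __name__ == "__main__":
    for copies in (1e4, 1e5, 1e6):
        theta = occupancy_from_copies(copies)
        print(f"{copies:.0e} copies/nucleus -> occupancy {theta:.2f}")
```

With these inputs the sketch returns occupancies of roughly 0.10, 0.53, and 0.92 for 10^4, 10^5, and 10^6 copies per nucleus, which matches the percentages quoted in this paragraph and suggests the reconstruction is at least numerically consistent with the original analysis.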
These calculations also indicate that the
extent of telomere site occupancy is relatively unaffected by the total
number of specific binding sites within the nucleus
($S_t$) as long as $S_t$ does not exceed the
concentration of binding proteins ($P_t$). For example, in Fig. 8B, the extent of site occupancy is plotted as a
function of the level of specific binding sites for three
concentrations of telomere-binding protein. At the higher levels of
telomere-binding protein (10
and 10
molecules/nucleus), the calculated site occupancies are similar
when the total number of sites is varied from 10 to 10
per
nucleus. The effects of nonspecific binding to RNA and DNA on telomere
site occupancy are shown in Fig. 8 (C and D,
respectively). As in the case of gene regulatory
proteins (25, 27), the interaction of telomere-binding
proteins with nonspecific sites in nucleic acids is predicted to have a
significant influence on the extent of specific site occupancy. For the
telomere-binding activity described here, the affinity for nonspecific
sites in single-stranded RNA is 25-fold higher than the affinity for
nonspecific sites in double-stranded DNA, and a greater fraction of
this protein is predicted to be bound nonspecifically to RNA than to
DNA under in vivo conditions.
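The predicted partition of the factor between RNA and DNA follows directly from the ratios of site concentration to dissociation constant. The sketch below assumes that nonspecific sites are in large excess and ignores the small pool bound to specific sites; the function name and the dictionary layout are choices made for this example.

```python
def nonspecific_partition(dna_t=58e-3, rna_t=6.7e-3, k_dna=100e-6, k_rna=4e-6):
    """Fractions of the protein that are free, DNA-bound, and RNA-bound.

    Each bound pool is proportional to (site concentration / Kd) relative
    to free protein, so the weights are 1 : DNA_t/K_DNA : RNA_t/K_RNA.
    The few specific sites are neglected in this approximation.
    """
    w_dna = dna_t / k_dna
    w_rna = rna_t / k_rna
    total = 1.0 + w_dna + w_rna
    return {"free": 1.0 / total, "DNA": w_dna / total, "RNA": w_rna / total}


if __name__ == "__main__":
    for pool, fraction in nonspecific_partition().items():
        print(f"{pool:>4}: {fraction:.4f}")
```

With the parameter values used in the text, roughly three-quarters of the factor is predicted to sit on RNA and about one-quarter on DNA, with well under 1% free, consistent with the figure of about 75% given in the Discussion.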
Because G·G base
pairs are likely to occur frequently in cellular RNAs, the specificity
of MF3 for DNA may be a feature that is critical for its function. To
determine the structural features that enable MF3 to distinguish
between DNA and RNA, dU was substituted for dT in one of the high
affinity DNA probes. The dU substitution had little or no effect on
binding (Fig. 7). These data rule out the possibility that the
methyl group at position 5 of the pyrimidine ring is a key determinant
in specific recognition, and they indicate that the factor
distinguishes between DNA and RNA on the basis of differences in ribose
ring structure. One possibility is that MF3 makes specific contacts
with the 2'-positions of the sugar molecules. An alternative hypothesis
is that the 2'-hydroxyl group of RNA prevents the nucleic acid from
attaining a conformation that is specifically recognized by MF3. The
2'-hydroxyl group will influence overall nucleic acid conformation by
altering the preferred pucker conformation of the ribose
ring (21). In complementary RNA double helices, the 2'-hydroxyl
group predominantly confines the ribose to the C3'-endo pucker, and the helix acquires the A conformation. However, in DNA
helices, the C2'-endo and C3'-endo ribose puckers occur, allowing numerous double-stranded
conformations. Thus, the extra flexibility of the deoxyribose ring,
permitted by the absence of the 2'-hydroxyl group, may be an important
feature in the specific recognition of DNA by MF3. Based on these data,
we propose that MF3 recognizes a G·G base-paired structure that
occurs in a DNA-specific conformation and that RNA is not bound with
high affinity because the 2'-hydroxyl group prevents the nucleic acid
from attaining this conformation. Through this type of mechanism, MF3
can selectively bind to G·G base pairs in DNA and minimize
interactions with cellular RNAs that can adopt a multitude of tertiary
structures through non-canonical base pairing.
The concentration of the binding protein is a
critical parameter in determining the extent of telomere site occupancy (Fig. 8A). MF3 is a relatively abundant factor, and in
avian erythrocytes and embryonic skeletal muscle, we estimate that the
level of this factor may be as high as $10^{6}$ copies/cell.
Given these parameters, the calculated level of telomere site occupancy
is >90% assuming that MF3 is present at $10^{6}$
molecules/cell and >50% assuming that MF3 is present at
$10^{5}$ molecules/cell. These calculations also indicate that a
significant fraction of MF3 (>99%) is likely to be bound to
nonspecific nucleic acid sequences in vivo. Similar behavior
has also been predicted for sequence-specific B-DNA-binding proteins,
such as the lac repressor, where >98% of the total factor
is bound to nonspecific sites in the genome(25, 27) .
However, for MF3, RNA is likely to be the predominant nonspecific
competitor (Fig. 8, compare C and D), and
nonspecific sites in RNA are predicted to complex with 75% of this
protein given our assumptions about physiological conditions. The Xenopus telomere end-binding factor (XTEF) is reported to be
present at <100 binding units/nucleus in somatic
tissues(16) . If this estimation of factor abundance is
correct, it can be calculated that an extraordinarily low dissociation
constant ($\leq 10^{-14}$ M) would be required for
significant (90%) telomere site occupancy (where $K_{DNA}$ and $K_{RNA}$ are 100 and 4
µM, respectively). On the other hand, if the dissociation
constant of XTEF for telomeric sites was in the nanomolar range, the
extent of telomere site occupancy would be <1% due to the
overwhelming nonspecific interactions with nucleic acids. Thus,
measurements of the XTEF affinity for specific and nonspecific sites in
nucleic acids will be required to assess the physiological significance
of the XTEF-telomere end interaction.
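The XTEF argument can be made concrete by inverting the same relationship for the dissociation constant. The sketch assumes exactly 100 binding units and 100 telomere ends per nucleus, the nonspecific parameters measured for MF3, and the reconstructed relationship $P_t = \theta S_t + K_S N\,\theta/(1-\theta)$ with $N = 1 + DNA_t/K_{DNA} + RNA_t/K_{RNA}$. None of these inputs are measured XTEF values, and the function names are hypothetical.

```python
MOLAR_PER_COPY = 2.5e-11                             # 100 copies/nucleus = 2.5 nM
NONSPECIFIC = 1.0 + 58e-3 / 100e-6 + 6.7e-3 / 4e-6   # N, using MF3 parameters


def required_k_s(theta, protein_copies, site_copies):
    """Specific-site Kd (M) needed to reach occupancy `theta`."""
    p_t = protein_copies * MOLAR_PER_COPY
    s_t = site_copies * MOLAR_PER_COPY
    not_on_sites = p_t - theta * s_t     # protein left over for free + nonspecific pools
    if not_on_sites <= 0:
        raise ValueError("too little protein to reach this occupancy")
    return not_on_sites * (1.0 - theta) / (theta * NONSPECIFIC)


def occupancy_upper_bound(k_s, protein_copies):
    """Upper bound on theta, obtained by dropping the theta*S_t term."""
    p_t = protein_copies * MOLAR_PER_COPY
    return p_t / (p_t + k_s * NONSPECIFIC)


if __name__ == "__main__":
    print(f"Kd for 90% occupancy: {required_k_s(0.9, 100, 100):.1e} M")
    print(f"Occupancy bound at Kd = 1 nM: {occupancy_upper_bound(1e-9, 100):.4f}")
```

Under these assumptions the required dissociation constant is on the order of 10^-14 M, while a nanomolar constant gives an occupancy of about 0.1%, in line with the qualitative conclusion of the paragraph above.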
The extent of specific site
occupancy is relatively unaffected by the total concentration of
specific sites in that significant competition for binding only occurs
when the concentration of specific sites exceeds the concentration of
the factor (Fig. 8B). In view of these considerations,
it is interesting to compare the levels of telomere end-binding
proteins that are required for significant site occupancy in ciliates
that have up to $2 \times 10^{7}$ telomere ends and in
animals that have $1 \times 10^{2}$ telomere ends. The
calculated level of telomere-binding protein required for 90% telomere
end occupancy is $1.9 \times 10^{7}$ in ciliates, where most of
the protein is bound to telomeric sites, and $8 \times 10^{5}$
in animals, where most of the protein is bound nonspecifically to
nucleic acid. This suggests that despite enormous differences in the
abundance of telomere ends in these two species
(>$10^{5}$-fold), the binding proteins must be present at
relatively high levels in both species to occupy a significant fraction
of these telomere end sites in vivo.
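The comparison between ciliates and animals can be sketched with the same relationship solved for total protein. The example assumes the MF3 binding constants and the same nuclear volume and nucleic acid concentrations for both cases, which is a simplification for a ciliate macronucleus. The ciliate input of 2 × 10^7 telomere ends is an assumed value, and the function names are hypothetical.

```python
MOLAR_PER_COPY = 2.5e-11                             # 100 copies/nucleus = 2.5 nM
K_S = 1e-9                                           # specific-site Kd (M)
NONSPECIFIC = 1.0 + 58e-3 / 100e-6 + 6.7e-3 / 4e-6   # 1 + DNA_t/K_DNA + RNA_t/K_RNA


def copies_required(theta, site_copies, k_s=K_S):
    """Protein copies per nucleus needed for fractional occupancy `theta`.

    Uses P_t = theta*S_t + K_S*N*theta/(1 - theta): the first term is
    protein on specific sites, the second is the free plus nonspecifically
    bound pool that must coexist with that occupancy.
    """
    s_t = site_copies * MOLAR_PER_COPY
    p_t = theta * s_t + k_s * NONSPECIFIC * theta / (1.0 - theta)
    return p_t / MOLAR_PER_COPY


def fraction_on_specific_sites(theta, site_copies, k_s=K_S):
    """Share of the total protein that sits on specific sites."""
    return theta * site_copies / copies_required(theta, site_copies, k_s)


if __name__ == "__main__":
    for label, ends in (("animal", 1e2), ("ciliate", 2e7)):
        need = copies_required(0.9, ends)
        share = fraction_on_specific_sites(0.9, ends)
        print(f"{label:>7}: {need:.2e} copies, {share:.1%} on telomeric sites")
```

With these inputs the sketch gives roughly 8 × 10^5 copies for 100 ends and 1.9 × 10^7 copies for 2 × 10^7 ends. In the first case almost all of the protein is bound nonspecifically, and in the second almost all of it is on telomeric sites, which is the contrast drawn in the paragraph above.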
Collectively, these
calculations permit an initial assessment of a protein's ability
to bind to telomere ends in vivo based on measurements of
factor interactions with nucleic acids performed in vitro.
These estimates can be revised if it is determined that the telomeric
sites are blocked by the binding of a protein with more favorable
thermodynamic characteristics or if other preassembled telomeric
proteins facilitate the binding of the factor to the telomere end. In
this regard, we note that the Oxytricha telomeric protein is a
heterodimer of subunits that bind cooperatively to telomeric DNA probes
to form a highly stable ternary complex(28) . Whether similar
subunit interactions occur at chromosome ends in multicellular
organisms remains to be determined, but we note a number of
similarities between MF3 and the macronuclear telomere end-binding
protein of ciliates. For example, both proteins recognize
oligonucleotide probes to telomeric repeats with similar apparent
affinities, both nucleoprotein complexes are relatively resistant to
high concentrations of salt in the binding reaction, and the
substitution of dU residues for dT does not significantly affect
complex formation (Fig. 7) (29, 30).
The data presented here are consistent with the
hypothesis that MF3 specifically binds to unusual non-Watson-Crick
base-paired structures of DNA. A key feature of the specific binding
site is the occurrence of two tracts of guanosine residues where the
exocyclic N-2 groups of the 3'-tract of guanosines are critical in
nucleoprotein complex formation. Here we show that a single exocyclic
N-2 group at any position within the internal repeat of a telomeric
probe is essential for factor binding, while all other exocyclic N-2
groups are expendable for this interaction. These binding data can be
correlated with chemical protection assays showing that probes
containing exocyclic N-2 groups in the internal telomeric repeat were
undermethylated relative to a probe that did not. Furthermore, the
extent of methylation protection correlated with the total number of
exocyclic N-2 groups within the internal repeat. These data indicate
that MF3 has a relatively low specificity for nucleotide sequence, but
a high specificity for a tertiary structure that may be stabilized by a
specific dG·dG base pair configuration. This tertiary structure
may be rare in chromatin and only occur at chromosome ends or at other
G-rich sequences that are sensitive to S1 nuclease digestion. However,
G·G base pairs may occur more frequently in cellular RNAs, and the
specificity of MF3 for deoxyribose may be a feature that enables this
factor to occupy a significant fraction of telomere ends or other sites
of altered DNA structure in vivo despite an overwhelming
abundance of non-Watson-Crick base-paired RNA. Finally, the
thermodynamic properties and binding specificity measurements reported
here may prove useful in evaluating the physiological roles of putative
telomere end-binding activities encoded by molecular clones.