From the Departments of Molecular Biophysics and
Biochemistry and ¶ Genetics, Yale University,
New Haven, Connecticut 06520-8114
ABSTRACT
The immunoglobulin heavy chain switch regions contain multiple runs of guanines on the top (nontemplate) DNA strand. Here we show that LR1, a B cell-specific, duplex DNA binding factor, binds tightly and specifically to synthetic oligonucleotides containing G-G base pairs (KD = 0.25 nM). LR1 also binds to single-stranded G-rich sequences (KD ≈ 10 nM). The two subunits of LR1, nucleolin and hnRNP D, bind with high affinity to G4 DNA (KD = 0.4 and 0.5 nM, respectively). LR1 therefore contains two independent G4 DNA binding domains. We propose that LR1 binds to G-G-paired structures that form during the transcription of the S regions that is prerequisite to recombination in vivo. Interactions of donor and acceptor S regions with subunits of LR1 could then juxtapose the switch regions for recombination.
INTRODUCTION
Immunoglobulin class switch recombination is a regulated recombination event that joins a rearranged and expressed heavy chain variable region (VDJ) to a new downstream constant (C) region, deleting the DNA between. Each immunoglobulin class removes antigen from the body in a distinct way, so switch recombination alters the pathway of antigen clearance without affecting antigen specificity. Switch recombination is region-specific: junction sites are found throughout the upstream (donor) and downstream (acceptor) S regions. Comparison of switch junction sequences shows that switch recombination does not depend on either sequence-specific or homologous recombination mechanisms (1). Circular molecules containing the deleted C region and flanking sequences are produced during switching (2), suggesting that switching involves a synapsis event in which distant switch regions are brought together into a recombination complex, which undergoes cleavage and religation to produce an excised switch circle and a chromosomal switch junction (see Fig. 1).
Switch recombination depends on G-rich DNA regions called switch or S regions. The S regions are 2-10 kilobases in length and located in the intron upstream of each C region that undergoes switching: Cµ, Cγ, Cε, and Cα (Fig. 1). The organization of the S regions in the heavy chain locus reflects their functional importance in switch recombination: there is an S region upstream of each C region except Cδ, and use of the Cδ region is governed by RNA processing, not DNA recombination. Simultaneous transcription of both activated switch regions is essential for subsequent recombination (reviewed in Ref. 3). During S-region transcription, the G-rich strand is the "top" or nontemplate strand.
Fig. 1. Immunoglobulin heavy chain switch recombination. The murine immunoglobulin heavy chain locus is illustrated in the top line. The leader (L), variable (VDJ), enhancer (Eµ), and constant (C) regions are shown as filled boxes, and the switch (S) regions are shown as open boxes. Switch recombination from µ to γ1 results in the deletion of Sµ, Cµ, Cδ, Sγ3, and Cγ3 sequences. The excised sequences can be isolated as circular DNA molecules, suggesting that during recombination, the donor Sµ and acceptor Sγ1 regions are juxtaposed. The bottom line illustrates the chromosomal heavy chain locus subsequent to switching to γ1.
G-rich DNA has unusual properties because of the unique pairing
potential of guanine. Guanine can interact with cytosine in standard
Watson-Crick G-C base pairs, and guanine can also interact with guanine
in structures stabilized by G-G Hoogsteen bonding (Fig.
2). One stable structure formed by G-G
bonding is the G quartet, in which four guanines associate in a planar
ring in which each G interacts with two other Gs (Fig. 2A).
G quartets can, in turn, stabilize interactions between runs of Gs,
thus allowing nucleic acids to form four-stranded structures called G4
DNA (Fig. 2B; Refs. 4-8). Synthetic oligonucleotides
derived from the Sµ and Sγ2b switch regions were among the first
sequences shown to form G4 DNA in vitro (4). Since that
time, G4 DNA formation has been characterized extensively in
experiments that show that a run of three Gs is sufficient to drive G4
DNA formation and that G-G interactions are essentially independent of
the sequence context in which the runs of Gs occur (reviewed in Ref.
9).
In mammalian cells, G-rich DNAs are found in three distinct genomic microenvironments: the heavy chain switch regions, the rDNA, and the telomeres. The fact that G-rich DNA occurs at very specific regions of the genome suggests that G-G pairing may be important to specific cellular functions. Consistent with this hypothesis, a number of proteins have been described that bind to, cleave, or promote formation of G-G-paired DNAs (10-17), including some that interact with telomeric sequences (18-21). We have recently shown that G4 DNA is the preferred substrate of one critical mammalian helicase, the BLM helicase (22). The BLM helicase is deficient in Bloom's syndrome, a human genetic disease characterized by extreme genomic instability, a tendency to develop malignancies, and immunodeficiency (23-25).
LR1 is a B cell-specific, sequence-specific DNA binding factor that binds to duplex sites in the S regions (26, 27). LR1 DNA binding activity is present in pre-B and B cell lines, and it is absent from resting B cells but induced in primary B cells activated to carry out switch recombination. This spectrum of LR1 activity correlates with the ability of a cell type to support recombination of extrachromosomal switch substrates (28). LR1 binds with very high affinity (KD = 1.8 nM) to duplex DNA sites conforming to the consensus, GGNCNAG(G/C)CTG(G/A) (29).
LR1 is a heterodimer of nucleolin and a specific isoform of hnRNP D (29, 30). Both nucleolin and hnRNP D are members of the large family of eukaryotic nuclear proteins that contain RNA binding domains (RBDs, also called RNA recognition motifs, or RRMs) and Arg-Gly-Gly repeats (RGGs). These structural motifs are commonly found in proteins that interact with RNA or single-stranded DNA (reviewed in Refs. 31-33). Surprisingly, despite the high affinity and sequence specificity of LR1 duplex DNA binding, neither of its subunits contains domains that commonly mediate duplex DNA interactions.
The unusual subunit composition of LR1 suggested that duplex DNA might
not be its only binding target. Here we report that LR1 specifically
binds G4 DNA with KD = 0.25 nM, 7-fold lower than its KD for duplex DNA binding. We further show that both recombinant nucleolin and hnRNP D also bind G4 DNA
(KD = 0.4 nM and 0.5 nM,
respectively). As nucleolin and hnRNP D, the two components of LR1, can
both independently bind G4 DNA, a single LR1 heterodimer can bind to
two separate G-rich regions of DNA. We suggest that LR1 binds to
G-G-paired structures that form during S region transcription, to
juxtapose donor and acceptor switch regions for recombination. G-G
pairing has the further potential to stabilize intermediates in the
recombination process.
EXPERIMENTAL PROCEDURES
Protein Preparations and Antibodies-- LR1 was purified 12,000-fold from nuclear extract of the murine pre-B cell line, PD31, by four chromatographic steps (29). Recombinant nucleolin was produced as a maltose-binding protein fusion containing amino acids 284-709 of human nucleolin, expressed in Escherichia coli from the pMalNuc plasmid, and purified as described previously (30). The fusion protein contains the four RBDs of nucleolin and the C-terminal RGG motifs; deletion of the acidic N terminus of nucleolin was essential to permit bacterial expression. Recombinant His6-tagged hnRNP D M20 was produced from a fusion construct in the pET30A(+) (Novagen) bacterial expression vector using an engineered murine cDNA clone in which an N-terminal His6 tag is fused to hnRNP D amino acid 31 (29). The M20 isoform of hnRNP D contains sequences encoded by alternative coding exon 2 but not exon 7 (34). His6-tagged hnRNP D M20 was expressed in E. coli and purified by nickel-chelate chromatography as described by the manufacturer (Novagen). Polyclonal antibodies were raised against recombinant human nucleolin and a synthetic peptide bearing a C-terminal sequence of hnRNP D and purified as described previously (29, 30).
G4 DNA Formation, DNA Labeling, and Methylation Footprinting-- Sequences of synthetic deoxyoligonucleotides used to form G4 DNA are shown in Table I. G4 DNAs were formed and end-labeled as previously described (22). In all cases, the characteristic G-G pairing was verified by methylation footprinting using dimethylsulfate (35). End-labeling of single-stranded oligonucleotides (22) and formation and labeling of DNA duplexes (26) also followed previously described procedures.
DNA Binding and Measurements of Binding Affinity-- Binding to duplex DNA was carried out in 15-µl reactions containing 20 mM HEPES, pH 7.5, 100 mM NaCl, 1 mM dithiothreitol, 0.1% Nonidet P-40, 2.5% glycerol, 2% polyvinyl alcohol, 100 µg/ml bovine serum albumin, and 4 fmol of 32P-labeled duplex DNA, incubated for 15 min at room temperature. Protein-DNA complexes were resolved by electrophoresis on 5% polyacrylamide gels in 90 mM Tris-borate, 1 mM EDTA, pH 8.3. Binding to G4 DNA and single-stranded DNA was carried out in 15-µl reactions containing 10 mM Tris, pH 7.4, 100 mM NaCl, 1 mM EDTA, 100 µg/ml bovine serum albumin, and 1 fmol of 32P-labeled DNA, incubated for 30 min at 37 °C, and the complexes were resolved by gel electrophoresis on 6% polyacrylamide, 45 mM Tris-borate-EDTA gels at 4 °C. When antibodies were included, they were preincubated with protein in a 10-µl volume on ice for 10 min before the addition of 10 µl of binding buffer containing labeled DNA.
Affinities were estimated by gel mobility shift assays in which binding
to a fixed amount of G4 DNA was assayed in the presence of increasing
amounts of protein. Concentrations of purified LR1 and recombinant
proteins were determined by Bradford microassay (Bio-Rad). Protein-DNA
complex formation was quantitated by phosphorimager analysis of the
dried gels, and KD values were calculated by
plotting the fraction of bound DNA at each protein concentration. Reported KD values are averages from at least three
separate experiments. To verify the very low KD
values for G4 DNA interactions, assays were performed at three DNA
concentrations, 330 fM, 3.3 pM, and 33 pM; the apparent KD was the same at all concentrations.
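The fitting model is not specified above. As a minimal sketch, assuming a single-site binding isotherm in which the fraction of DNA bound is f = [P]/(KD + [P]), each titration point yields an estimate KD = [P](1 - f)/f. This approximation treats free protein as equal to total protein, which is reasonable here because the labeled DNA (33 pM or less) is well below the KD being measured. The function names and the titration values in the example are hypothetical and are not data from this study.

```python
from statistics import median


def fraction_bound(protein_nM, kd_nM):
    """Single-site isotherm; assumes labeled DNA is far below KD."""
    return protein_nM / (kd_nM + protein_nM)


def estimate_kd(protein_nM, bound_fraction):
    """Estimate KD (nM) as the median of per-point values P * (1 - f) / f."""
    estimates = [
        p * (1.0 - f) / f
        for p, f in zip(protein_nM, bound_fraction)
        if 0.0 < f < 1.0  # points at exactly 0 or 1 carry no KD information
    ]
    return median(estimates)


# Hypothetical titration (NOT data from this study), simulated with KD = 0.25 nM.
protein = [0.05, 0.1, 0.25, 0.5, 1.0, 2.0]  # nM protein
bound = [fraction_bound(p, 0.25) for p in protein]
kd = estimate_kd(protein, bound)
```

Under this assumption the KD is simply the protein concentration at which half of the labeled DNA is shifted into complex.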
RESULTS
LR1 Binds G4 DNA--
Each of the immunoglobulin switch regions
contains reiterations of a consensus repeat characterized by at least
one run of three or more Gs (Table I).
To test the possibility that LR1 might recognize G4 DNA formed by S
region sequences, we carried out gel mobility shift experiments to
assay binding of highly purified protein to 5' end-labeled G4 DNA
formed from the P oligonucleotide. This oligonucleotide is a synthetic
49-mer derived from the Sγ2b switch region (4). G4 DNA, formed from
four separate G-rich strands, provides an excellent model for
G-G-paired structures, as it is readily formed at high yield and is
very stable in solution. Formation of G4 DNA was in all cases verified
by methylation footprinting (Refs. 4, 6, and 22; data not shown).
Highly purified LR1 (29) bound to G4 DNA formed from the P oligonucleotide with KD = 0.25 nM (Fig. 3A). This is a very low dissociation constant for interaction between a eukaryotic protein and DNA. LR1 also bound to single-stranded P oligonucleotide (Fig. 3A); the dissociation constant for this interaction is KD = 11 nM. LR1 bound other G-rich single-stranded DNA with similar KD values (data not shown). The complexes of LR1 with G4 DNA or single-stranded DNA were sensitive to proteinase K/SDS treatment, and protein binding therefore does not permanently alter DNA structure or conformation (data not shown). In assays of LR1 binding to Watson-Crick duplex DNA formed from the P oligonucleotide, a very small fraction of labeled DNA interacted with protein.
In the assay of LR1 binding to single-stranded P oligonucleotide shown in Fig. 3A, two bands are apparent in the lane that contains no protein. The faster-migrating band is single-stranded oligonucleotide, and the other is G4 DNA that has formed spontaneously. This illustrates the propensity of guanines to interact in solution. Similarly, concentrated solutions of GMP have been shown to form a viscous gel (36). Binding of the G4 DNA by LR1 probably accounts for the highly retarded species.
We used rabbit polyclonal antibodies raised against a recombinant fusion protein carrying amino acids 284-709 of nucleolin (30) or a C-terminal peptide of hnRNP D (see "Experimental Procedures") to verify that LR1 interacts with G4 DNA. As shown in Fig. 3B, neither preimmune nor immune serum antibodies affected the mobility of G4 DNA in the absence of protein. Anti-nucleolin antibodies dramatically inhibited DNA binding. Antibodies raised against the hnRNP D C-terminal peptide supershifted the protein-DNA complex. The LR1 heterodimer is therefore responsible for the observed G4 DNA binding activity.
At increasing concentrations of LR1, multiple complexes were evident in assays of LR1 binding to G4 DNA (Fig. 3A). This may reflect protein interaction with more than one G4 DNA molecule, as would occur if the LR1 heterodimer contained multiple G4 DNA binding domains. To investigate this possibility, we asked if recombinant nucleolin or hnRNP D could bind G4 DNA.
Recombinant Nucleolin Binds G4 DNA-- Recombinant nucleolin was expressed and purified as described under "Experimental Procedures" and assayed for binding to G4 DNA in gel mobility shift experiments. Recombinant nucleolin bound P oligonucleotide G4 DNA, with an estimated KD = 0.4 nM. It did not bind single-stranded DNA or Watson-Crick duplexes formed from the same oligonucleotide, even at very high protein concentrations (KD > 200 nM; Fig. 4A). The nucleolin-G4 DNA complex was competed by G4 DNA but not by single-stranded P oligonucleotide, as was expected from the direct binding assays (Fig. 4B).
Switch recombination in vivo frequently joins Sµ and Sγ switch regions. In the Sγ switch regions, the characteristic repeat is about 50 base pairs in length and contains one or more runs of Gs. In contrast, the Sµ switch region (like Sε and Sα) is composed of
variations of pentameric motifs like GGGGT, GAGCT, and GGGCT (see Table
I). We tested the ability of recombinant nucleolin to bind to G4 DNA
formed from an oligonucleotide, RX1, that derives from the murine Sµ
region. RX1 carries one GGGGT repeat and two GAGCT repeats (Table I).
As shown in Fig. 4C, recombinant nucleolin bound to RX1-G4
DNA with KD = 0.4 nM, comparable with binding to P oligonucleotide G4 DNA. Recombinant nucleolin also bound
to G4 DNAs formed from G-rich sequences not derived from the switch
regions (data not shown). G4 DNA binding by nucleolin therefore appears
to be specific for the G4 DNA structure, independent of surrounding sequence.
Recombinant hnRNP D Binds G4 DNA-- hnRNP D is a highly conserved protein that is expressed as three isoforms related by alternative splicing (34). Recombinant hnRNP D bound to G4 DNA (KD = 0.5 nM) but not to single-stranded DNA or Watson-Crick duplexes formed from the same oligonucleotide (KD > 200 nM; Fig. 5A). Formation of the complex between hnRNP D and P oligonucleotide G4 DNA was competed by G4 DNA formed from the P oligonucleotide but not by single-stranded P oligonucleotide (Fig. 5B). Recombinant hnRNP D also bound to G4 DNA formed from the Sµ region RX1 oligonucleotide (Fig. 5C) and to G4 DNAs formed from other G-rich sequences (data not shown).
Antibody Recognition of Nucleolin or hnRNP D Bound to G4 DNA-- We verified binding of recombinant nucleolin and hnRNP D to G4 DNA by assaying sensitivity to anti-nucleolin and anti-hnRNP D antibodies. As shown in Fig. 6, anti-nucleolin antibodies inhibited the interaction of nucleolin with G4 DNA but had no effect on the mobility of G4 DNA in the absence of added nucleolin. Anti-hnRNP D antibodies supershifted the complex of hnRNP D with G4 DNA but did not affect the mobility of free G4 DNA. The observation that the anti-C terminal hnRNP D antibodies supershifted the binding complex suggests that the epitopes recognized by this anti-peptide antibody preparation are in an exposed region of the hnRNP D polypeptide. In contrast, the dominant epitopes recognized by the polyclonal anti-nucleolin antibodies appear to be within the region of nucleolin that makes contact with DNA.
We previously studied recognition of LR1 duplex DNA binding activity by
these anti-nucleolin and anti-hnRNP D antibodies (29, 30). Analogous to
the observations shown in Fig. 6, in those experiments we found that
the anti-nucleolin antibodies removed nucleolin from the LR1
(nucleolin/hnRNP D) heterodimer (30), whereas anti-hnRNP D antibodies
supershifted the LR1-DNA complex (29).
DISCUSSION
We have shown that the B cell-specific factor, LR1, binds to G-rich S region sequences as Watson-Crick duplexes and as G-G-paired structures stabilized by Hoogsteen pairing. The dissociation constant of the interaction of LR1 with G4 DNA is 0.25 nM. This is a very high affinity interaction for a eukaryotic nucleic acid-binding protein. Moreover, both subunits of the LR1 heterodimer, nucleolin and hnRNP D, can independently bind G4 DNA with comparably high affinity.
G-G-paired DNA May Form during Transcription That Is Prerequisite to Switch Recombination-- Simultaneous transcription of both activated S regions is prerequisite to switch recombination (reviewed in Ref. 3). During switch region transcription, the G-rich nontemplate strand is unwound from the C-rich template strand, which is transcribed by RNA polymerase. The mechanistic basis for the dependence of switch recombination on transcription has not been understood. We hypothesize that transcription may allow the G-rich strand to transiently form structures stabilized by G-G pairing and that proteins involved in recombination recognize this DNA structure. Such structures are likely to be transient, as they can be unwound by the BLM helicase (22) and possibly other mammalian helicases. S regions are long: in the mouse and human, S regions are from 2 to 10 kilobases in length. If G-G pairing occurs during S-region transcription, then the potential for G-G pairing should be roughly proportional to the length of an S region and the number of runs of three or more Gs it contains. Long S regions will therefore increase the opportunity for recombination by providing additional sites at which G-G pairing can occur.
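The dependence on runs of Gs can be made concrete with a short sketch that counts runs of three or more consecutive Gs on one strand. The function name and the example sequence, assembled from the pentameric motifs GGGGT, GAGCT, and GGGCT, are illustrative assumptions and not sequences taken from Table I.

```python
import re


def g_runs(seq, min_len=3):
    """Return (start, end) spans of runs of at least min_len consecutive Gs."""
    pattern = r"G{%d,}" % min_len
    return [m.span() for m in re.finditer(pattern, seq.upper())]


# Hypothetical top-strand fragment (assumption, not an actual S region sequence).
example = "GGGGTGAGCTGAGCTGGGCT"
runs = g_runs(example)
run_count = len(runs)
```

On a longer sequence, the run count per kilobase would give a rough measure of the potential for G-G pairing discussed above.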
Other laboratories have shown that during in vitro transcription of the G-rich S regions, stable RNA:DNA hybrids form between the newly synthesized transcript and the C-rich template strand (37-39). Formation of such hybrids would increase the opportunity for G-G pairing, by increasing the time during which the G-rich top strand is freed from the Watson-Crick duplex.
LR1 Binding May Promote Synapsis of Two G-rich Switch Regions-- LR1 can bind duplex DNA, single-stranded G-rich DNA, and G4 DNA. Recognition of each of these substrates might contribute to LR1 function in switch recombination.
The LR1 heterodimer contains six RBD domains: four in nucleolin, and two in hnRNP D. These domains are found in many proteins that interact with single-stranded nucleic acids (reviewed in Refs. 31-33). Structurally, an RBD forms a platform upon which a single strand of nucleic acid is bound in an open conformation (40, 41). As others have pointed out (4, 7), G-rich DNA has an unusual potential to function in recombination because G-rich but nonhomologous regions can interact via G-G Hoogsteen pairing. RBDs have been shown to function in nucleic acid annealing (42-46). It is an interesting possibility that, by binding to a single-stranded region, LR1 may render the DNA available to interactions with other G-rich nucleic acids.
LR1 binds to duplex sites in the S regions that conform loosely to the
consensus GGNCNAG(G/C)CTG(G/A), and LR1 duplex DNA binding activity is
found only in B cells, where it correlates with switch recombination
(26-28). LR1 binds to one of its sites in the Sγ1 switch region with
KD = 1.8 nM, and binding is relatively
insensitive to mutations at most positions in this consensus (26, 27,
29, 30, 47, 48). The S regions are dense with sites that are very
similar to the LR1 binding consensus, and LR1 is likely to occupy some
fraction of these sites in B cells that have been activated for switch recombination.
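As an illustration of how loosely conforming sites might be located, the following sketch scans one strand for 12-nucleotide windows matching the consensus GGNCNAG(G/C)CTG(G/A), optionally tolerating a set number of mismatches. The mismatch threshold, the function names, and the example sequences are assumptions for illustration, and a mismatch count is only a crude proxy for binding affinity.

```python
# Allowed bases at each position of GGNCNAG(G/C)CTG(G/A); N is any base.
CONSENSUS = ["G", "G", "ACGT", "C", "ACGT", "A", "G", "GC", "C", "T", "G", "GA"]


def mismatches(window):
    """Count positions where the window departs from the consensus."""
    return sum(base not in allowed for base, allowed in zip(window, CONSENSUS))


def find_sites(seq, max_mismatches=0):
    """Return (position, site, mismatch count) for windows within the threshold."""
    seq = seq.upper()
    width = len(CONSENSUS)
    sites = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        count = mismatches(window)
        if count <= max_mismatches:
            sites.append((i, window, count))
    return sites


# Hypothetical sequences (assumptions, not taken from an S region).
exact = find_sites("TTGGACTAGGCTGGTT")
loose = find_sites("GGACTAGGCTGT", max_mismatches=1)
```

A complete search would also scan the reverse complement, which this sketch omits.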
LR1 bound to either single-stranded or duplex regions would be
poised to capture G4 DNA that formed even transiently. Because the
affinity of the LR1/G4 DNA interaction is so very high, LR1 bound to
duplex sites would comprise a reservoir of protein that would be poised
to capture G-G-paired DNA as it formed. Moreover, as both components of
the LR1 heterodimer, nucleolin and hnRNP D, can bind G4 DNA, LR1 has
two independent G4 DNA binding domains. The presence of these two
domains would enable a single LR1 heterodimer to interact with two
G-G-paired regions. If these regions are located on donor and acceptor
switch regions, then interaction with LR1 could juxtapose these two
switch regions for recombination.
ACKNOWLEDGEMENTS
We are grateful to our colleagues and friends for many interesting and useful discussions.
FOOTNOTES
* This research was supported by National Institutes of Health Grant R01 GM39799 (to N. M.), National Research Service Award GM15948 (to L. A. D.), and by a Ford Foundation Postdoctoral Fellowship (to L. A. H.). The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
§ Present address: Dept. of Surgery, Division of Transplantation Biology, Mayo Foundation, Rochester, MN 55905.
To whom correspondence should be addressed: Yale University,
266 Whitney Ave., New Haven, CT 06520-8114. Tel.: 203-432-5641; Fax:
203-432-3047; E-mail: nancy.maizels{at}yale.edu.
The abbreviations used are: C, constant; hnRNP, heterogeneous nuclear ribonucleoprotein; RBD, RNA binding domain.
REFERENCES