(Received for publication, June 5, 1995; and in revised form, July 10, 1995)
The combined factors that regulate the expression of cell adhesion molecules (CAMs) during development of the nervous system are largely unknown. To identify such factors for Ng-CAM, the neuron-glia CAM, constructs containing portions of the 5′ end of the Ng-CAM gene were examined for activity after transfection into N2A neuroblastoma and NIH3T3 cells. Positive regulatory elements active in both cell types included an Ng-CAM proximal promoter with SP1 and cAMP response element motifs extending 447 base pairs upstream of a single RNA start site and a region within the first exon corresponding to 5′-untranslated sequences. Negative regulatory elements included five neuron-restrictive silencer elements (NRSEs) and a binding site for Pax gene products in a 305-base pair segment of the first intron. Constructs containing the promoter together with the entire first intron were active in N2A cells but were silenced in NIH3T3 cells. This silencer activity was mapped to the NRSEs. In contrast, the Pax motif inhibited activity of Ng-CAM constructs in both cell types. The DNA elements defined in these transfection experiments were examined for their ability to bind nuclear factors. The region within the first exon formed a DNA-protein complex after exposure to nuclear extracts prepared from both NIH3T3 and N2A cells. The NRSE region formed a more prominent complex with proteins prepared from NIH3T3 cells than it did with extracts from N2A cells. A member of the Pax protein family, Pax-3, bound to the Pax motif. Mutations introduced within the Pax motif in its ATTA sequence eliminated this binding whereas mutations in its GTTCC sequence did not, suggesting that paired homeodomain interactions are important for the recognition of Pax-3 by this DNA target sequence. The combined data suggest that negative regulation by NRSEs and Pax proteins may play a key role in the place-dependent expression patterns of Ng-CAM during development.
Cell adhesion molecules (CAMs) are essential for
guiding tissue formation (1) and play key roles in the
development of the nervous system. CAMs important in the nervous system
include N-CAM-related molecules such as Ng-CAM (2, 3),
Nr-CAM (4, 5), L1 (NILE) (6, 7),
neurofascin/ABGP (8, 9), TAG-1/axonin-1/F3 (10, 11, 12), and contactin (13).
Ng-CAM, L1, Nr-CAM, neurofascin/ABGP from vertebrates and neuroglian
from Drosophila (14) comprise a subfamily of neural
CAMs containing six immunoglobulin domains and five fibronectin type
III repeats. Molecules of this Ng-CAM subfamily are expressed
prominently in axonal pathways in both the central nervous system and
peripheral nervous system and are involved in neurite fasciculation and
outgrowth. Each of these neural CAMs has a characteristic spatial and
temporal expression pattern during neural morphogenesis but the factors
that restrict expression of Ng-CAM and other neural CAM genes to
particular populations of neural cells are not well understood.
To define the sequences of DNA responsible for place-dependent expression of CAMs, we have focused on signals from homeobox and Pax gene products (15, 16, 17, 18). An attractive hypothesis is that Ng-CAM and other neural CAMs are targets of homeodomain and Pax proteins (19). During neural development, a number of transcriptional regulators encoded by the homeobox and Pax gene families appear in defined expression patterns along the anterior-posterior and dorsal-ventral axes of the embryo that correlate with the patterns of a variety of CAMs. Moreover, mutations in Pax genes are known to alter the programs of neural differentiation and migration, processes that are influenced by the activity of neural CAMs. For example, mutations in Pax-6 and Pax-3 genes lead to developmental defects of the nervous system as shown, respectively, in small eye and splotch mutant mice (20). Some of these defects appear to be caused by aberrant neuronal migration (21, 22), a process normally modulated by neural CAMs.
To examine the factors regulating Ng-CAM expression, we have isolated genomic clones containing the 5′ end of the Ng-CAM gene, characterized its proximal promoter, and located two regulatory regions within a 305-base pair segment of the first intron. One region of the first intron was found to contain five neuron-restrictive silencer elements (NRSEs) which extinguished expression of the Ng-CAM gene in a fibroblast cell line, but not a neuroepithelial cell line. Another region contained GTTCC and ATTA sequences characteristic of binding sites for Pax proteins. This Pax motif was found to bind Pax-3 in gel mobility shift experiments; such binding was disrupted when specific mutations were introduced in the ATTA sequence within the Pax motif. In transfection experiments with both NIH3T3 and N2A cells, this Pax motif was found to be a negative regulator of Ng-CAM gene expression independent of the silencing imposed by NRSEs. Our studies suggest that the NRSEs and Pax motif may play critical roles in the place-dependent expression of Ng-CAM in the nervous system.
Figure 2: DNA sequence of a 2536-base pair segment of the Ng-CAM gene starting at a position 447 base pairs upstream of the transcription start site and ending within the second exon, 2089 base pairs downstream from the RNA start site. The start of RNA transcription is indicated by a rightward pointing arrow. Potential regulatory motifs within the proximal promoter and the first intron of the gene are underlined. These sequences include the CRE, SP1, NRSE, and Pax motifs.
RNase protection
analysis was performed using the RPA II kit (Ambion). The template for
the RNA probe was made by polymerase chain reaction from the genomic
DNA. The downstream primer used for the amplification was Ng-7/T3-, an
oligonucleotide identical in sequence to oNg-7, but also containing the
T3 RNA polymerase promoter at the 5′ end. The upstream primer used in
the polymerase chain reaction was an oligonucleotide designated oNg-6,
derived from the sense strand of the Ng-CAM genomic sequence (Fig. 3, position -172 to -156). A labeled RNA probe
was synthesized from the template using T3 RNA polymerase and
[32P]UTP. Template DNA was then digested with
RNase-free DNase. Two fmol of probe in elution buffer was mixed with 1
µg of poly(A)+ RNA from 12-day embryonic chick
brains and hybridized at 45 °C for 16 h. RNase protection analysis
was performed using different dilutions of RNase to determine the
optimal conditions for cleavage. The products of both primer extension
and RNase protection analysis were resolved on an 8% polyacrylamide
sequencing gel.
Figure 3:
Activity of Ng-CAM constructs in NIH3T3 and
N2A cells. Top, diagram of the 5′ end of the Ng-CAM gene
showing the position of various restriction sites used to prepare
deletion constructs. Sequences included in exons are indicated with boxes that are either solid black or cross-hatched. The cross-hatched region corresponds
to the +82/+182 region of the first exon. The five NRSEs are
represented by boxes numbered 1-5. The Pax motif is located
immediately downstream from the NRSEs and is indicated by an open
box. N2A or NIH3T3 cells were transfected with the promoterless
CAT gene reporter vector (Bas) or 14 other constructs containing
various segments from the 5′ end of the Ng-CAM gene. Cell extracts were
normalized to an internal reference standard of β-galactosidase
activity and assayed for CAT activity as described under
"Materials and Methods." CAT activity for all constructs
was quantitated using a PhosphorImager from four separate experiments
performed in duplicate in which the activity levels varied no more than
5%.
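The normalization described in this legend can be illustrated with a minimal sketch. The function names and all numerical readings below are invented for illustration and are not data from this study; only the construct names (Ng447, Ng200, Bas) come from the text. The sketch assumes that each CAT reading is divided by the β-galactosidase reading from the same extract, that duplicates are averaged, and that activities are then expressed relative to a chosen reference construct.

```python
from statistics import mean


def normalized_cat(cat_signal, bgal_activity):
    """Divide each CAT reading by the beta-galactosidase reading of the same extract."""
    return [cat / bgal for cat, bgal in zip(cat_signal, bgal_activity)]


def relative_activity(readings, reference):
    """Average normalized CAT activity per construct, relative to `reference`."""
    averages = {
        name: mean(normalized_cat(cat, bgal))
        for name, (cat, bgal) in readings.items()
    }
    return {name: value / averages[reference] for name, value in averages.items()}


# Invented duplicate readings: (CAT signal, beta-galactosidase activity).
example_readings = {
    "Ng447": ([1200.0, 1800.0], [1.0, 1.5]),
    "Ng200": ([400.0, 800.0], [1.0, 2.0]),
    "Bas": ([12.0, 24.0], [1.0, 2.0]),
}

relative = relative_activity(example_readings, reference="Ng447")
for construct, value in relative.items():
    print(f"{construct}: {value:.2f}")
```

Dividing by the internal β-galactosidase standard before averaging keeps differences in transfection efficiency between dishes from being read as differences in promoter activity.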
For binding reactions involving nuclear
extracts, 5 fmol (approximately 20,000 cpm) of either
+82/+182 or NRSE probe was mixed with 10 µg of protein
from either NIH3T3 or N2A cells in a buffer containing 10 mM Hepes, 200 mM KCl, 0.5 mM dithiothreitol, 0.1%
Nonidet P-40, 100 ng/ml poly(dI-dC) with 10 µg of bovine serum
albumin. Competitors (either 10- or 100-fold excess of cold
+82/+182 or NRSE DNAs) were included in some binding
reactions. Binding components were incubated for 15 min at room
temperature and subjected to electrophoresis on a 4% polyacrylamide gel
in 0.25× TBE buffer at 400 V for 2 h at 4 °C.
For binding
reactions involving the Pax-3 fusion protein, 200,000 cpm of probe was
incubated with 0.5, 2.5, or 5 µg of GST/Pax-3 fusion protein.
Binding reactions were performed in a volume of 20 µl containing 15
mM Tris-HCl, pH 7.5, 6.5% glycerol, 90 mM KCl, and
0.2 mM dithiothreitol. Bovine serum albumin (0.5 mg/ml) and
sheared salmon sperm DNA (100 ng) were added to each binding reaction
to reduce nonspecific binding. Reactions were incubated at room
temperature for 30 min and subjected to electrophoresis on a 5%
polyacrylamide gel in 0.5× TBE buffer at 200 V at 4 °C. Gels
were dried and exposed to film at -70 °C.
Figure 1: Structure of the 5′ end of the Ng-CAM gene. Top, diagram of the 20-kilobase insert from the chicken cosmid clone Cos-Ng containing the first seven exons of the Ng-CAM gene. The portions of exons 1 and 2 encoding 5′-untranslated sequences are indicated with an open box. Exons encoding translated mRNA sequence are indicated with black boxes. Bottom, nucleotide sequence of the borders for the first seven exons and introns of the Ng-CAM gene. The coding sequences of the exons are shown in upper case letters, while intronic sequences are indicated with lower case letters. The ATG codon is located in exon 2 and amino acids encoded by nucleotides 3′ of this position are shown over the appropriate triplet codons. In some cases borders were inferred solely from comparison of the Ng-CAM genomic sequence to the published Ng-CAM cDNA sequence (3).
To
determine the site of transcription initiation within the Ng-CAM gene,
we performed primer extension and RNase protection analyses of
poly(A)+ RNA isolated from chick brain tissue at
embryonic day 12. Extension of an antisense oligonucleotide primer
(oNg-7) yielded a single radiolabeled band of 185 nucleotides in length
(data not shown). Two prominent bands were observed in an RNase
protection experiment using a radiolabeled RNA probe corresponding to
the region from -300 to +185. Because the primer extension
and the largest RNase protection products had termini at exactly the
same upstream base pair in the genomic sequence, the nucleotide at this
position was designated +1, the site of transcription initiation
(see Fig. 2).
The DNA sequence of a 2536-base pair segment of the 5′ end of the Ng-CAM gene including the 5′-flanking sequence, the first exon, the first intron, and the second exon was determined (Fig. 2). This sequence has been deposited in GenBank under accession number U31086. Several potential regulatory motifs were found in the 5′-flanking region immediately adjacent to and upstream of the RNA start site. They included two consensus binding sites for the SP1 transcription factor (28) located at -124 and -15 as well as a consensus cyclic AMP response element (CRE), located at -255, which is known to bind members of the CREB family of transcription factors (29). After analyses of deletion constructs in transfection experiments, additional elements were located in a 178-base pair segment of the first intron (+1388 to +1566) that included five copies of a sequence similar to that of the NRSE. Further analysis also uncovered a 30-base pair element (+1679 to +1708) similar to that of a binding site for Pax gene products (20, 30, 31). This Ng-CAM Pax motif contains GTTCC and ATTA core sequences characteristic of a target sequence recognized by those Pax proteins that contain two independent DNA binding domains, a paired domain and a homeodomain (20, 32).
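Positions in this report are given relative to the RNA start site (+1), with upstream positions negative. As a minimal sketch, assuming the common convention that there is no position 0 and that the sequenced segment runs from -447 to +2089, the hypothetical helpers below (not part of the original analysis) map such positions onto 0-based indices in the segment and check two of the lengths stated in the text.

```python
# Sketch only: the "no position 0" convention is an assumption.
UPSTREAM = 447      # bases of 5'-flanking sequence in the segment
DOWNSTREAM = 2089   # last position downstream of the RNA start site


def to_index(pos: int) -> int:
    """Map a start-site-relative position to a 0-based index in the segment."""
    if pos == 0 or pos < -UPSTREAM or pos > DOWNSTREAM:
        raise ValueError(f"position {pos} lies outside -{UPSTREAM}..+{DOWNSTREAM}")
    # Negative positions shift by the flank length; positive ones also skip the missing 0.
    return pos + UPSTREAM if pos < 0 else pos + UPSTREAM - 1


def span_length(start: int, end: int) -> int:
    """Number of bases from `start` to `end`, both inclusive."""
    return to_index(end) - to_index(start) + 1


segment_length = span_length(-UPSTREAM, DOWNSTREAM)
pax_motif_length = span_length(1679, 1708)
print(segment_length, pax_motif_length)
```

Under this convention the totals agree with the 2536-base pair segment and the 30-base pair Pax element described above.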
The various Ng-CAM reporter constructs and observed CAT activities in N2A and NIH3T3 cells are summarized in Fig. 3. Five reporter constructs containing different lengths of 5′-flanking sequence (see Fig. 3, constructs Ng4, Ng1.5, Ng447, Ng200, Ng447/182) were constitutively active in both NIH3T3 and N2A cells. The Ng447 construct, containing 447 base pairs of 5′-flanking sequence and 61 base pairs of the first exon, was approximately 2-fold more active than Ng4 and Ng1.5 in N2A cells, but not in NIH3T3 cells. The proximal promoter region within this construct contained two consensus binding sites for the SP1 transcription factor and a CRE. The Ng200 construct, in which the promoter region between -447 and -200 containing the CRE sequence was deleted, showed 3-fold less activity in both NIH3T3 and N2A cells than the Ng447 construct. This result indicated that regulatory elements between -200 and -447 are required for basal promoter activity. The promoterless CAT vector, pCAT-Basic (Bas), showed little, if any, activity in either cell type.
A construct designated Ng447/182, containing the entire first exon, was approximately 2-fold more active than the Ng447 construct in both NIH3T3 and N2A cells. These data suggested that the segment between +61 and +182 is a positive regulatory region of the Ng-CAM gene. A construct designated Ng447+I, containing the Ng-CAM proximal promoter together with the entire first exon, first intron, and a portion of the second exon, was active in N2A cells but was completely silent in NIH3T3 cells. This finding suggested that sequences within the first intron of the Ng-CAM gene were capable of silencing the gene in non-neuronal cells.
Figure 4: Comparison of the five NRSEs (Ng-NRSE1-5) located between +1388 and +1566 in the first intron of the Ng-CAM gene with a consensus NRSE sequence derived from NRSEs found in several neural-specific genes (24). The Ng-CAM NRSEs are in the orientation of the reverse complement from those found in other genes.
To examine whether the NRSEs were sufficient for silencing of the Ng-CAM proximal promoter, constructs were prepared in which the five NRSEs were placed upstream of the Ng447 region of the Ng-CAM gene linked to a CAT reporter. Two constructs were prepared in which the NRSEs were present either in the orientation normally found in the gene (SAS) or in the opposite orientation (SS). The SAS and SS constructs were both silent in NIH3T3 cells, showing CAT activities that were comparable to the Ng447+I and pCAT-Basic constructs (Fig. 3). However, in N2A cells, the SAS and SS constructs were highly active and showed no reduction of Ng-CAM proximal promoter activity. These studies suggest that, while the NRSEs are sufficient to silence Ng-CAM promoter activity in NIH3T3 cells, they do not show silencing in N2A cells.
Figure 5: The +82/+182 and NRSE regions of the Ng-CAM gene bind to nuclear factors prepared from NIH3T3 and N2A cells. Either the +82/+182 (panels A and B) or the NRSE (panels C and D) probe was mixed with nuclear extracts prepared from NIH3T3 cells (panels A and C) or from N2A cells (panels B and D). For binding reactions, probes were incubated without extract (all panels, lane 1) or with the appropriate nuclear extract (lanes 2-6). DNA-protein complexes formed between the +82/+182 and NRSE regions of the Ng-CAM gene with nuclear extracts are indicated with brackets labeled EX and NRSE, respectively. Competitor DNAs were added to some binding reactions (all panels, lanes 3-6). Competitors included either a 10- or 100-fold excess of the unlabeled DNA probe used in the specific binding reactions (all panels, lanes 3 and 4, respectively). To test whether the NRSE sequence could compete for binding of the +82/+182 probe to nuclear extracts, either a 10-fold or a 100-fold excess of cold NRSE DNA was added to the +82/+182 binding reactions (panels A and B, lanes 5 and 6, respectively). A similar test for cross-competition was performed by adding either a 10- or 100-fold excess of cold +82/+182 DNA to the NRSE binding reactions (panels C and D, lanes 5 and 6, respectively). Reactions were subjected to electrophoresis on a 4% polyacrylamide gel in 0.25× TBE buffer at 400 V for 2 h at 4 °C.
Binding experiments using the NRSE region as probe detected a clear-cut DNA-protein complex with nuclear extracts prepared from NIH3T3 cells and a slight, but detectable complex with proteins from N2A cells (Fig. 5, panels C and D, lane 2). Formation of the NRSE-protein complex (particularly evident in NIH3T3 cells) was inhibited with an excess of cold NRSE competitor (Fig. 5C, lanes 3 and 4). Moreover, the +82/+182 segment did not compete for the formation of the NRSE-protein complex (Fig. 5, panels A-D, lanes 5 and 6). Similar competition experiments performed in N2A cells showed no diminution of the faint signal, regardless of the competitor used (Fig. 5D, compare the band intensities in lanes 5 and 6 with those in lanes 2, 3, and 4). These results suggested that proteins that bind to the NRSE are much more abundant in NIH3T3 cells than in N2A cells. Furthermore, the nuclear proteins bound by the +82/+182 and the NRSE regions of the Ng-CAM gene appear to be different.
Four different probes were tested for binding to Pax-3 (Fig. 6). The Ng-wt probe contained the wild-type Pax motif (Fig. 3, the region between +1679 and +1708). The Ng-H variant contained three base pair substitutions that altered the ATTA motif. The Ng-P variant contained 11 base pair substitutions. Four substitutions destroyed the GTTCC motif which has been shown to be essential for the binding of the e5 DNA sequence to the paired domains of Pax-1 and Pax-3 (30, 32, 35). The seven other substitutions made in Ng-P were introduced after comparing the sequence of the Ng-CAM gene Pax motif to the consensus binding sequences for Pax proteins (31, 36, 37) and took account of critical base pair substitutions that may disrupt paired domain interactions with DNA. The third variant, Ng-HP (Fig. 6), contained the combination of mutations made in both Ng-H and Ng-P variants.
Figure 6:
Binding of a Pax-3 fusion protein to the
Pax motif in the first intron of the Ng-CAM gene. Top, sequences of four probes used in gel mobility shift experiments.
The 5 base pairs at the 5′ end of each sequence (lower-case
letters) are BamHI cohesive ends. The remaining 30 base
pairs (upper-case letters) are derived from the Pax motif
(between +1679 and +1708 in the Ng-CAM gene). Ng-wt is the
wild-type Ng-CAM Pax motif. GTTCC and ATTA motifs known to be important
for binding of Pax proteins to DNA are boxed. Three variants
of the Ng-wt sequence, designated Ng-HP, Ng-H, and Ng-P, contain
specific base pair substitutions which are highlighted in boldface
type. Bottom, gel mobility shift assay showing binding of the
GST/Pax-3 fusion protein to 32P-labeled Ng-wt, Ng-HP, Ng-H,
and Ng-P probes. The individual probe used for binding is indicated at
the bottom of each panel. Binding reactions contained the
indicated probe either: without added protein (lanes marked
1), with 2.5 µg of GST control protein purified from E.
coli NM522 cells transformed with pGEX-2T (lanes marked
2), or with 0.5, 2.5, or 5 µg of GST/Pax-3 or GST/Pax-6
fusion protein (lanes marked 3-5, respectively). Binding
reactions were subjected to electrophoresis on 6% polyacrylamide gels,
dried, and autoradiographed for 4 h.
The Ng-wt probe showed binding to the GST/Pax-3 protein, but no binding to the GST control protein. Similar binding experiments using the Ng-HP and Ng-H variants showed little or no binding to Pax-3. The Ng-P variant showed no detectable decrease in binding to Pax-3 as compared to that of the Ng-wt. This result suggested that the GTTCC motif and other base pairs important for paired domain interactions were not necessary for Pax-3 binding. Rather, these experiments suggested that it was the ATTA motif that was important for Pax-3 binding to the Ng-CAM Pax motif.
In the chicken, Ng-CAM is first detected in the central nervous system at embryonic day 3 in cells of the ventral neural tube that are the precursors of motor neurons (38). At later stages, Ng-CAM appears on a number of neurons and is distributed mainly on axons rather than cell bodies. In the cerebellum, Ng-CAM appears in Purkinje cells, is expressed by a number of different fibers, and plays a role in the migration of granule cell neurons on radial glia. The synthesis of Ng-CAM also shows dynamic changes during both myelination and during nerve regeneration. At the onset of myelination, Ng-CAM expression decreases in the central nervous system but not in the peripheral nervous system. Ng-CAM expression is increased after peripheral nerve injury in the spinal cord and in the nerve at the lesion site, but is decreased in dorsal root ganglia (39). These dynamic patterns of Ng-CAM synthesis in the nervous system prompted us to identify factors that regulate Ng-CAM gene expression.
Figure 7:
Location of DNA control elements in the 5′
end of the Ng-CAM gene. The locations of putative regulatory elements
in the proximal promoter, the SP1 sites and the CRE, are each indicated
with a distinct symbol. The +82/+182 region of the
first exon showing positive regulation of the Ng-CAM constructs in both
N2A and NIH3T3 cells is represented with a cross-hatched box.
The five NRSEs within the first intron showing cell type-specific
silencing of Ng-CAM promoter activity in NIH3T3 cells are represented
by leftward pointing arrows indicating that similarities to
the NRSE consensus sequence (24) are found on the bottom
strand. The Pax motif showing negative regulation of the Ng-CAM
promoter in both cell types and binding to Pax-3 is indicated by an open box. The first and second exons of the Ng-CAM gene are
labeled 1 and 2,
respectively.
A region of 447 base pairs of 5′-flanking upstream sequence was sufficient for basal Ng-CAM promoter activity in cells; addition of up to 4 kilobases of Ng-CAM 5′-flanking sequence showed no further increases in this activity. Potential regulatory sequences located in the proximal promoter included an additional SP1 motif at -134 and a consensus CRE (TGACGTCA) at -255 (Fig. 7). In both NIH3T3 and N2A cells, a 3-fold decrease in expression was observed for constructs that had a deletion in the -200 to -447 region of the Ng-CAM proximal promoter. This region contained the CRE and thus, trans-factors of the CREB family (43, 44) may control Ng-CAM gene expression.
Inclusion of the region between +82 and +182 in the first exon of the Ng-CAM gene in constructs led to a 2-fold stimulation of Ng-CAM promoter activity in both NIH3T3 and N2A cells. It is likely that this sequence either imparts additional stability to Ng-CAM mRNAs or binds a transactivator important for transcription of the Ng-CAM gene. We found that the +82 to +182 region bound to nuclear proteins from both NIH3T3 and N2A cells.
In cellular transfection experiments, we found that the Ng-CAM NRSEs silenced the Ng-CAM proximal promoter in NIH3T3 cells. In contrast, NRSEs did not silence the promoter in N2A cells. Thus, the NRSEs silenced Ng-CAM gene expression in non-neuronal cells. The multiple copies of NRSEs in the Ng-CAM first intron suggest that they may bind proteins cooperatively. Recently, a neuron-restrictive silencer factor (NRSF, also called REST) has been identified which binds to the NRSE (48, 49). The protein contains eight zinc fingers which are related to those found in proteins of the GLI-Krüppel family.
Consistent with their activity patterns, we found that the Ng-CAM NRSEs formed a more prominent DNA-protein complex with nuclear proteins from NIH3T3 cells than they did with proteins from N2A cells. It is therefore likely that the protein enriched in NIH3T3 cells which binds to the NRSEs and silences Ng-CAM gene expression is NRSF/REST or a related protein. Interestingly, constructs with deletions in the NRSEs also show slightly elevated expression in N2A cells when compared to constructs containing the NRSEs (see Fig. 3). Thus, in N2A cells a minor amount of NRSF/REST may contribute to silencing, but the predominance of positive factors may greatly override this activity. In a preliminary search of the published L1 gene sequence (40) we have located an NRSE with the sequence TCTGCTGTCCGTGGTGCTGGA within the first intron at position 277-297. The possibility must therefore be considered that NRSEs may be used in the negative regulatory programs of other neural CAM genes in the Ng-CAM family.
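Because the Ng-CAM NRSEs lie on the bottom strand, a search of this kind has to examine both orientations. The following is a minimal sketch of a two-strand scan with a mismatch tolerance; it is not the search procedure used in this study, the function names are hypothetical, and the test sequence is made up. Only the 21-nucleotide L1 NRSE and the CRE (TGACGTCA) are taken from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_both_strands(sequence: str, motif: str, max_mismatches: int = 0):
    """List (0-based start, strand) for windows matching the motif on either strand.

    A window that matches in both orientations (a palindromic motif) is
    reported once, as a "+" hit.
    """
    sequence, motif = sequence.upper(), motif.upper()
    motif_rc = reverse_complement(motif)
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        if mismatches(window, motif) <= max_mismatches:
            hits.append((start, "+"))
        elif mismatches(window, motif_rc) <= max_mismatches:
            hits.append((start, "-"))
    return hits


L1_NRSE = "TCTGCTGTCCGTGGTGCTGGA"  # sequence quoted in the text
# Made-up test sequence: one bottom-strand copy, a spacer, one top-strand copy.
toy_sequence = "AAAA" + reverse_complement(L1_NRSE) + "CCCC" + L1_NRSE
hits = scan_both_strands(toy_sequence, L1_NRSE)
print(hits)
```

Raising `max_mismatches` lets the same scan pick up elements that are only similar to a query, at the cost of more spurious hits in a long sequence.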
Recently, a sequence containing an ATTA motif has been located at -170 in the promoter of the gene for the neural CAM L1 (40). While this motif has some similarities with the Ng-CAM Pax motif, it also shows some important differences. The two Pax motifs are similar in that they both contain ATTA motifs and share a few identities in the base pairs flanking this sequence. However, in contrast to the Ng-CAM Pax motif, the L1(-170) motif closely resembles the consensus binding site for Pax-6 (31, 37). L1(-170) has been shown to bind to Pax-6, but does not bind to Pax-3. Unlike the binding of the Ng-CAM Pax motif to Pax-3 which is eliminated when mutations are introduced into the ATTA sequence, the binding of L1(-170) to Pax-6 was eliminated by mutations disrupting sequences which interact with the paired domain (50). The binding was unaffected by mutations in the ATTA motif that disrupt homeodomain interactions. It will be revealing to determine whether other genes encoding neural CAMs that are related to Ng-CAM and L1 at the amino acid level, such as Nr-CAM and neurofascin, also contain Pax motifs. Genes encoding this family of neural CAMs may all contain Pax motifs with subtle variations in sequence composition that may determine binding preferences and selective CAM gene control by different Pax proteins.
It will be necessary to determine in vivo what roles the DNA control elements described here play in the developmental expression pattern of Ng-CAM in the nervous system. It is likely that particular combinations of Ng-CAM regulatory elements identified in this study (see Fig. 7) are utilized to control specific contexts of Ng-CAM expression during neural development and regeneration. For example, the NRSEs and Pax motif and their bound proteins may act combinatorially to restrict Ng-CAM expression to particular classes of neural cells during development. This possibility may be explored in chicken embryos by using retroviral vectors (51). Chicken Ng-CAM constructs can also be tested in transgenic mice, an approach that has been used successfully across species to analyze the brain-specific expression directed by regulatory sequences from the chicken gene encoding the α2 neuronal acetylcholine receptor (52). Such animal studies will be particularly useful in determining how the regulatory regions identified in the present study function to determine place-dependent expression of Ng-CAM.
The nucleotide sequence(s) reported in this paper has been submitted to GenBank.