1 Department of Molecular and Cellular Biology, Division of Genetics and
Development, 401 Barker Hall, University of California, Berkeley, CA 94720,
USA
2 Hewlett-Packard Laboratories, 1501 Page Mill Road, Palo Alto, CA 94304,
USA
3 Computer Science Division Office, University of California, Berkeley, 387 Soda
Hall #1776, Berkeley, CA 94720-1776, USA
Authors for correspondence (e-mail: mlevine{at}uclink4.berkeley.edu and michele{at}opengenomics.org)
Accepted 12 February 2004
SUMMARY |
---|
Key words: Drosophila, Anopheles, cis-regulation, enhancers, neurogenic ectoderm, mesectoderm, Twist, Su(H), Dorsal
Introduction |
---|
Differential gene activity is primarily controlled by enhancers, which are
typically 500 bp in length and contain roughly ten binding sites for two or
more sequence-specific transcription factors (reviewed by
Levine and Tjian, 2003). The
total number of enhancers might be a critical determinant of organismal
complexity. Based on well-characterized genes such as even skipped
and fushi tarazu, which are regulated by multiple enhancers, one
might estimate the Drosophila genome to contain 30,000-50,000
enhancers (e.g. Davidson, 2001). The use of comparative genome methods to
understand animal diversity would be greatly facilitated by the existence of
'cis-regulatory codes' that link DNA sequence data with inferred patterns of
gene activity.
The dorsoventral patterning of the early Drosophila embryo provides a
well-defined system for applying computational methods to the problem of
predicting gene activity from DNA sequence information
(Markstein et al., 2002; Markstein and Levine, 2002).
Dorsoventral patterning is controlled by the sequence-specific
transcription factor Dorsal (reviewed by
Stathopoulos and Levine,
2002). The Dorsal protein is distributed in a broad nuclear
gradient in the early embryo, with peak levels in ventral regions, and
progressively lower levels in more lateral and dorsal regions. This regulatory
gradient initiates the differentiation of several embryonic tissues by
regulating the expression of over 30 target genes in a concentration-dependent
fashion (e.g. Casal and Leptin, 1996; Stathopoulos et al., 2002). Some of
these target genes are activated by high levels of
the Dorsal gradient within the presumptive mesoderm, whereas others are
activated by intermediate or low levels of the gradient in ventral and dorsal
regions of the neurogenic ectoderm, respectively. Previous studies identified
seven of the estimated 30 Dorsal target enhancers in the Drosophila
genome (reviewed by Rusch and Levine, 1996; Stathopoulos and Levine, 2002).
Their analysis raised the possibility that co-regulated
enhancers responding to the same levels of the Dorsal gradient share a
distinctive combination of cis-regulatory elements
(Stathopoulos et al., 2002).
Two of the previously identified enhancers are associated with the
rhomboid (rho) and ventral nervous system defective
(vnd) genes (White et al., 1983; Bier et al., 1990). Both enhancers are
activated by intermediate levels of the Dorsal gradient in ventral regions of
the neurogenic ectoderm (Ip et al., 1992; Stathopoulos et al., 2002).
The present study identified a third enhancer, from the brinker
(brk) gene (Jazwinska et al., 1999), which directs a similar pattern of
expression. The three co-regulated enhancers share three sequence motifs, in
addition to Dorsal binding sites: CACATGT, YGTGDGAA and CTGWCCY
(Stathopoulos et al., 2002).
The first two motifs bind the known transcription factors, Twist and
Suppressor of Hairless [Su(H)], respectively
(Thisse et al., 1987; Bailey and Posakony, 1995). All
three motifs are shown to function as critical regulatory elements, thereby
providing direct evidence that Twist and Su(H) are essential for the
specification of the neurogenic ectoderm. A whole-genome survey for tightly
linked Dorsal, Twist, Su(H) and CTGWCCY motifs identified only seven clusters
in the entire Drosophila genome. Three correspond to the 'input'
enhancers: rho, vnd and brk. Another two clusters are shown
to correspond to new neurogenic enhancers associated with the vein
(vn) and single-minded (sim) genes
(Kasai et al., 1992; Schnepp et al., 1996).
Additionally, the defined computational model for neurogenic gene expression
permitted the identification of an orthologous sim enhancer in the
distantly related Anopheles genome.
Materials and methods |
---|
Cloning and injection of DNA fragments
Genomic D. melanogaster DNA was prepared from a single
anesthetized yw male as described
(Gloor et al., 1993). Mosquito
DNA was derived from the Anopheles gambiae PEST strain (a gift from
Anthony James). DNA fragments encompassing identified clusters were amplified
from genomic DNA with the primer pairs listed (see supplemental data at
http://dev.biologists.org/supplemental/).
PCR products were purified with the Qiagen™ QIAquick® PCR
purification kit, and either cloned into the Promega™ pGEM® T-Easy
vector (brk, Ady, C1 and vn) or digested with restriction
enzymes corresponding to restriction sites added to the 5' ends of each
primer pair. PCR products cloned into pGEM® T-Easy (brk, Ady and
C1) were digested with NotI and cloned into the
gypsy-insulated pCaSpeR vector E2G (a gift from Hilary Ashe), or partially
digested with EcoRI (vn) and cloned into the
[-42evelacZ]-pCaSpeR vector (Small et al., 1992). The remaining PCR products
were directly digested and
cloned into a modified version of the E2G vector called newE2G, which contains
BglII, SpeI and EcoRI cloning sites in place of
NotI. Enhancers were mutagenized in pGEM® T-Easy using the
Stratagene™ QuikChange® Multi Site-Directed Mutagenesis Kit and the
primers indicated (see supplemental data at
http://dev.biologists.org/supplemental/).
Constructs were introduced into the D. melanogaster germline by
microinjection as described previously (e.g. Ip et al., 1992;
Jiang and Levine, 1993; Rubin and Spradling, 1982).
Between three and nine independent transgenic lines were obtained for each
construct.
Whole-mount in situ hybridization
Embryos were hybridized with digoxigenin-labeled antisense RNA probes as
described (Jiang et al.,
1991). An antisense lacZ RNA probe was used to examine
the staining patterns of transgenic embryos. To examine the patterns of
endogenous gene expression, probes were generated by PCR amplification from
genomic DNA. A 26 bp tail encoding the T7 RNA polymerase promoter
(aagTAATACGACTCACTATAGGGAGA) was included on the reverse primer. PCR products
were purified with the Qiagen™ PCR purification kit and used directly as
templates in transcription reactions. Between 500 bp and 3 kb of coding
sequence was used as a template for each probe.
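As a small illustration of the probe-template design described above, the following sketch attaches the 26 bp T7 tail to a gene-specific reverse primer. Placing the tail at the 5' end of the primer is an assumption (the text states only that the tail was included on the reverse primer), and both the function name `add_t7_tail` and the example primer sequence are hypothetical.

```python
# 26 bp tail encoding the T7 RNA polymerase promoter, as given in the text.
T7_TAIL = "aagTAATACGACTCACTATAGGGAGA"


def add_t7_tail(reverse_primer: str) -> str:
    """Prepend the T7 tail to the 5' end of a gene-specific reverse primer.

    Assumption: the tail sits 5' of the gene-specific portion, so that
    transcription from the PCR product would yield an antisense probe.
    """
    return T7_TAIL + reverse_primer.upper()


# Hypothetical gene-specific reverse primer, for illustration only.
example_primer = "gcttcaggttctcgatgtcc"
tailed_primer = add_t7_tail(example_primer)
```

Here `tailed_primer` is simply the tail followed by the upper-cased gene-specific sequence; primer design itself (melting temperature, specificity) is outside the scope of this sketch.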
Computational identification of shared motifs and enhancers
To identify shared motifs, we developed a program called MERmaid (available
at www.opengenomics.org), which finds all n-mers of any length that are
present or absent in specified
groups of sequences. In this study, we considered two classes of motifs:
'exact match' motifs, in which every position in the motif is filled by one
specific nucleotide; and 'fuzzy' motifs, in which up to two positions in the
motif can be occupied by any of the four nucleotides. The vn and
sim enhancers could be identified in genome-wide searches for
clusters of sequence motifs using the parameters indicated in the text and
supplement, and online search tools freely available at
www.flyenhancer.org (Markstein et al., 2002). A
similar tool is available for the mosquito genome at
www.mosquitoenhancer.org.
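The idea of collecting n-mers that are present in one group of sequences and absent from another can be sketched as follows. This is not the MERmaid program itself, whose implementation is not described here; it is a minimal illustration under stated assumptions: both DNA strands are scanned, 'fuzzy' motifs are written with `N` at up to two positions, and all function names are hypothetical.

```python
from itertools import combinations

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def exact_nmers(seq: str, n: int) -> set:
    """All n-mers found on either strand (scanning both strands is an assumption)."""
    seq = seq.upper()
    found = set()
    for i in range(len(seq) - n + 1):
        nmer = seq[i:i + n]
        if set(nmer) <= set("ACGT"):  # skip windows with ambiguous bases
            found.add(nmer)
            found.add(revcomp(nmer))
    return found


def fuzzy_nmers(seq: str, n: int, max_wild: int = 2) -> set:
    """n-mer patterns in which zero to max_wild positions are replaced by N."""
    patterns = set()
    for nmer in exact_nmers(seq, n):
        for k in range(max_wild + 1):
            for positions in combinations(range(n), k):
                chars = list(nmer)
                for p in positions:
                    chars[p] = "N"
                patterns.add("".join(chars))
    return patterns


def shared_motifs(present_in, absent_from, n, fuzzy=False):
    """Motifs found in every sequence of present_in and in none of absent_from."""
    finder = fuzzy_nmers if fuzzy else exact_nmers
    shared = set.intersection(*(finder(s, n) for s in present_in))
    for s in absent_from:
        shared -= finder(s, n)
    return shared
```

With authentic enhancers as `present_in` and false-positive clusters as `absent_from`, the function would return candidate motifs specific to the enhancers. Enumerating every wildcard placement is only practical for short n; a production tool would need a more economical index.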
Results |
---|
The rho, vnd and brk enhancers share common cis-regulatory elements
The rho, vnd and brk enhancers direct similar patterns of
gene expression (Fig. 2). The
rho and vnd enhancers were previously shown to contain
multiple copies of two different sequence motifs: CTGNCCY and CACATGT
(Stathopoulos et al., 2002). A
three-way comparison of minimal rho, vnd and brk enhancers permitted
a more refined definition of the CTGNCCY motif (CTGWCCY), and also allowed for
the identification of a third motif, YGTGDGAA
(Table 1, and supplemental data
at
http://dev.biologists.org/supplemental/).
The CACATGT and YGTGDGAA motifs bind the known transcription factors, Twist
and Suppressor of Hairless [Su(H)], respectively
(Thisse et al., 1991; Bailey and Posakony, 1995). All
three motifs are over-represented in authentic Dorsal target enhancers
directing expression in the ventral neurogenic ectoderm, as compared with the
10 false-positive Dorsal-binding clusters
(Table 1). As indicated in
Table 1, some of the
false-positive clusters contain motifs matching either Twist or CTGWCCY;
however, none of the false-positive clusters contain representatives of both
of these motifs. The rho enhancer is repressed in the ventral
mesoderm by the zinc-finger Snail protein
(Ip et al., 1992). The four
Snail-binding sites contained in the rho enhancer share the consensus
sequence, MMMCWTGY; the vnd and brk enhancers contain
multiple copies of this motif and are probably repressed by Snail as well.
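The motifs above are written with IUPAC ambiguity codes (Y = C or T, D = A, G or T, W = A or T, M = A or C). A minimal sketch of how such degenerate motifs could be located in a sequence is given below. It is an illustration rather than the search tool used in this study; scanning both strands is an assumption, and the function names are hypothetical.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "D": "[AGT]", "H": "[ACT]",
    "B": "[CGT]", "V": "[ACG]", "N": "[ACGT]",
}
_COMPLEMENT = str.maketrans("ACGTRYWSMKDHBVN", "TGCAYRWSKMHDVBN")

# Motifs named in the text.
MOTIFS = {
    "Twist": "CACATGT",
    "Su(H)": "YGTGDGAA",
    "CTGWCCY": "CTGWCCY",
    "Snail": "MMMCWTGY",
}


def revcomp_iupac(motif: str) -> str:
    """Reverse complement of a motif that may contain IUPAC codes."""
    return motif.translate(_COMPLEMENT)[::-1]


def motif_regex(motif: str) -> str:
    """Translate an IUPAC motif into a regular expression."""
    return "".join(IUPAC[c] for c in motif)


def find_sites(seq: str, motif: str) -> list:
    """Sorted 0-based start positions of the motif on either strand."""
    seq = seq.upper()
    starts = set()
    for pattern in {motif, revcomp_iupac(motif)}:
        # A lookahead is used so that overlapping matches are all reported.
        for hit in re.finditer("(?=" + motif_regex(pattern) + ")", seq):
            starts.add(hit.start())
    return sorted(starts)
```

Counting `len(find_sites(enhancer, motif))` for each entry in `MOTIFS` would give per-enhancer motif tallies of the kind that could be compared between authentic enhancers and false-positive clusters.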
Identification of the vein and sim enhancers
To determine whether the shared motifs would help identify additional
ventral neurogenic enhancers, the genome was surveyed for 250 bp regions
containing an average density of one site per 50 bp and at least one
occurrence of each of the four motifs for Dorsal, Twist, Su(H) and CTGWCCY. In
total, only seven clusters were identified (see supplemental data at
http://dev.biologists.org/supplemental/).
Three of the seven clusters correspond to the rho, vnd and
brk enhancers. Two of the remaining clusters are associated with
genes that are known to be expressed in ventral regions of the neurogenic
ectoderm: vein and sim
(Fig. 4A-D) (Kasai et al., 1992; Schnepp et al., 1996). Both
clusters were tested for enhancer activity by attaching appropriate genomic
DNA fragments to a lacZ reporter gene and then analyzing
lacZ expression in transgenic embryos. The cluster associated with
vein is located in the first intron, about 7 kb downstream of the
transcription start site. The vein cluster (497 bp) directs robust
expression in the neurogenic ectoderm, similar to the pattern of the
endogenous gene (Fig. 4A,B) (Schnepp et al., 1996). The
cluster located in the 5' flanking region of the sim gene (631
bp) directs expression in single lines of cells in the mesectoderm (the
ventral-most region of the neurogenic ectoderm), just like the endogenous
expression pattern (Fig. 4C,D) (Kasai et al., 1992). These
results indicate that the computational methods defined an accurate regulatory
model for gene expression in ventral regions of the neurogenic ectoderm of
D. melanogaster (see Discussion).
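The survey criteria described above (250 bp regions, at least one occurrence of each of the four motifs, and an average density of one site per 50 bp) can be sketched as a sliding-window search over pre-computed site positions. This is an illustrative reading of the criteria, not the search tool used in the study: the density requirement is interpreted as at least five sites per 250 bp window, distances are measured between site start positions, and the function name `find_clusters` is hypothetical.

```python
def find_clusters(sites_by_motif, window=250, min_sites=5):
    """Return merged (start, end) spans of qualifying motif clusters.

    sites_by_motif maps a motif name to a list of site start positions.
    A span qualifies when sites lying within `window` bp of each other
    include every motif at least once and number at least `min_sites`.
    """
    # Flatten to a position-sorted list of (position, motif_name).
    sites = sorted(
        (pos, name)
        for name, positions in sites_by_motif.items()
        for pos in positions
    )
    required = set(sites_by_motif)
    clusters = []
    counts = {}  # motif name -> number of sites currently in the window
    left = 0
    for right, (pos, name) in enumerate(sites):
        counts[name] = counts.get(name, 0) + 1
        # Drop sites from the left until the window is narrower than `window`.
        while pos - sites[left][0] >= window:
            left_name = sites[left][1]
            counts[left_name] -= 1
            if counts[left_name] == 0:
                del counts[left_name]
            left += 1
        total = right - left + 1
        if total >= min_sites and set(counts) == required:
            span = (sites[left][0], pos)
            if clusters and span[0] <= clusters[-1][1]:
                # Overlapping windows are merged into one cluster.
                clusters[-1] = (clusters[-1][0], span[1])
            else:
                clusters.append(span)
    return clusters
```

For example, `find_clusters({"Dorsal": [100, 180], "Twist": [120], "Su(H)": [150], "CTGWCCY": [200]})` reports a single span covering positions 100 to 200, whereas the same five sites spread over several kilobases yield no cluster.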
Discussion |
---|
Two of the seven composite clusters are likely to be false-positives, as
they are associated with genes that are not known to exhibit localized
expression across the dorsoventral axis. It is possible that the order,
spacing and/or orientation of the identified binding sites accounts for the
distinction between authentic enhancers and false-positive clusters. For
example, there is tight linkage of Dorsal and Twist sites in each of the five
neurogenic enhancers. This linkage might reflect Dorsal-Twist protein-protein
interactions that promote their cooperative binding and synergistic
activities. Previous studies identified particularly strong interactions
between Dorsal and Twist-Daughterless (Da) heterodimers
(Jiang and Levine, 1993; Castanon et al., 2001). Da is
ubiquitously expressed in the early embryo and is related to the E12/E47 bHLH
proteins in mammals (Murre et al., 1989). Dorsal-Twist linkage is not seen in
one of the two false-positive binding clusters.
The regulatory model defined by this study probably failed to identify all
enhancers responsive to intermediate levels of the Dorsal gradient. There are
at least 30 Dorsal target enhancers in the Drosophila genome, and it
is possible that 10 respond to intermediate levels of the Dorsal gradient
(e.g. Stathopoulos et al.,
2002). Thus, we might have missed half of all such target
enhancers. Perhaps the present study defined just one of several 'codes' for
neurogenic gene expression.
The possibility of multiple codes is suggested by the different contributions of the same regulatory elements to the activities of the vnd and brk enhancers. Mutations in the CTGWCCY motifs nearly abolish the activity of the brk enhancer, but have virtually no effect on the vnd enhancer (see Fig. 3). Future studies will determine whether there are distinct codes for Dorsal target enhancers that respond to either high or low levels of the Dorsal gradient. Indeed, it is somewhat surprising that the sog and CG12443 enhancers essentially lack Twist, Su(H) and CTGWCCY motifs, even though they direct lateral stripes of gene expression that are quite similar (albeit broader) to those seen for the rho, vnd and brk enhancers (see below and Fig. 5).
The Dorsal gradient produces three distinct patterns of gene expression
within the presumptive neurogenic ectoderm (summarized in
Fig. 5A). We propose that these
patterns arise from the differential usage of the Su(H) and Dorsal activators.
Enhancers that direct progressively broader patterns of expression become
increasingly more dependent on Dorsal and less dependent on Su(H) (indicated
in Fig. 5B). The sog
and CG12443 enhancers mediate expression in both ventral and dorsal
regions of the neurogenic ectoderm, and contain several optimal Dorsal sites
but no Su(H) sites. By contrast, the sim enhancer is active only in
the ventral-most regions of the neurogenic ectoderm, and contains just one
high-affinity Dorsal site but five optimal Su(H) sites. The reliance of
sim on Dorsal might be atypical for genes expressed in the
mesectoderm. For example, the m8 gene within the Enhancer of split
complex may be regulated solely by Su(H) (e.g.
Cowden and Levine, 2002). The
Anopheles sim enhancer might represent an intermediate between the
Drosophila sim and m8 enhancers, as it contains optimal
Su(H) sites but only one weak Dorsal site. This trend may reflect an
evolutionary conversion of Su(H) sites to Dorsal sites, and the concomitant
use of the Dorsal gradient to specify different neurogenic cell types. A
testable prediction of this model is that basal arthropods use Dorsal solely
for the specification of the mesoderm and Su(H) for the patterning of the
ventral neurogenic ectoderm.
ACKNOWLEDGMENTS |
---|
Footnotes |
---|
* These authors contributed equally to this study
REFERENCES |
---|
Aparicio, S., Chapman, J., Stupka, E., Putnam, N., Chia, J. M., Dehal, P., Christoffels, A., Rash, S., Hoon, S., Smit, A. et al. (2002). Whole-genome assembly and analysis of the genome of Fugu rubripes. Science 297, 1301-1310.
Bailey, A. M. and Posakony, J. W. (1995). Suppressor of hairless directly activates transcription of enhancer of split complex genes in response to Notch receptor activity. Genes Dev. 9, 2609-2622.
Berman, B. P., Nibu, Y., Pfeiffer, B. D., Tomancak, P., Celniker, S. E., Levine, M., Rubin, G. M. and Eisen, M. B. (2002). Exploiting transcription factor binding clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome. Proc. Natl. Acad. Sci. USA 99, 757-762.
Bier, E., Jan, L. Y. and Jan, Y. N. (1990). rhomboid, a gene required for dorsoventral axis establishment and peripheral nervous system development in Drosophila melanogaster. Genes Dev. 4, 190-203.
Casal, J. and Leptin, M. (1996). Identification of novel genes in Drosophila reveals the complex regulation of early gene activity in the mesoderm. Proc. Natl. Acad. Sci. USA 93, 10327-10332.
Castanon, I., Von Stetina, S., Kass, J. and Baylies, M. K. (2001). Dimerization partners determine the activity of the Twist bHLH protein during Drosophila mesoderm development. Development 128, 3145-3159.
Cowden, J. C. and Levine, M. (2002). The Snail repressor positions Notch signaling in the Drosophila embryo. Development 129, 1785-1793.
Davidson, E. H. (2001). Genomic Regulatory Systems: Development and Evolution. San Diego: Academic Press.
Dehal, P., Satou, Y., Campbell, R. K., Chapman, J., Degnan, B., De Tomaso, A., Davidson, B., Di Gregorio, A., Gelpke, M., Goodstein, D. M. et al. (2002). The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science 298, 2157-2167.
Gloor, G. B., Preston, C. R., Johnson-Schlitz, D. M., Nassif, N. A., Phillis, R. W., Benz, W. K., Robertson, H. M. and Engels, W. R. (1993). Type I repressors of P element mobility. Genetics 135, 81-95.
Gonzalez-Crespo, S. and Levine, M. (1993). Interactions between Dorsal and helix-loop-helix proteins initiate the differentiation of the embryonic mesoderm and neuroectoderm in Drosophila. Genes Dev. 7, 1703-1713.
Halfon, M. S., Grad, Y., Church, G. M. and Michelson, A. M. (2002). Computation-based discovery of related transcriptional regulatory modules and motifs using an experimentally validated combinatorial model. Genome Res. 12, 1019-1028.
Ip, Y. T., Park, R., Kosman, D., Bier, E. and Levine, M. (1992). The Dorsal gradient morphogen regulates stripes of rhomboid expression in the presumptive neuroectoderm of the Drosophila embryo. Genes Dev. 6, 1728-1739.
Jazwinska, A., Rushlow, C. and Roth, S. (1999). The role of brinker in mediating the graded response to Dpp in early Drosophila embryos. Development 126, 3323-3334.
Jiang, J. and Levine, M. (1993). Binding affinities and cooperative interactions with bHLH activators delimit threshold responses to the Dorsal gradient morphogen. Cell 72, 741-752.
Jiang, J., Kosman, D., Ip, Y. T. and Levine, M. (1991). The Dorsal morphogen gradient regulates the mesoderm determinant twist in early Drosophila embryos. Genes Dev. 5, 1881-1891.
Kasai, Y., Nambu, J. R., Lieberman, P. M. and Crews, S. T. (1992). Dorsal-ventral patterning in Drosophila: DNA binding of Snail protein to the single-minded gene. Proc. Natl. Acad. Sci. USA 89, 3414-3418.
Kosman, D., Ip, Y. T., Levine, M. and Arora, K. (1991). Establishment of the mesoderm-neuroectoderm boundary in the Drosophila embryo. Science 254, 118-122.
Levine, M. and Tjian, R. (2003). Transcription regulation and animal diversity. Nature 424, 147-151.
Markstein, M. and Levine, M. (2002). Decoding cis-regulatory DNAs in the Drosophila genome. Curr. Opin. Genet. Dev. 12, 601-605.
Markstein, M., Markstein, P., Markstein, V. and Levine, M. (2002). Dorsal binding clusters identify potential target genes in the Drosophila embryo. Proc. Natl. Acad. Sci. USA 99, 763-768.
Morel, V. and Schweisguth, F. (2000). Repression by suppressor of hairless and activation by Notch are required to define a row of single-minded expressing cells in the Drosophila embryo. Genes Dev. 14, 377-388.
Mural, R. J., Adams, M. D., Myers, E. W., Smith, H. O., Miklos, G. L., Wides, R., Halpern, A., Li, P. W., Sutton, G. G., Nadeau, J. et al. (2002). A comparison of whole-genome shotgun-derived mouse chromosome 16 and the human genome. Science 296, 1661-1671.
Murre, C., McCaw, P. S., Vaessin, H., Caudy, M., Jan, L. Y., Jan, Y. N., Cabrera, C. V., Buskin, J. N., Hauschka, S. D., Lassar, A. B. et al. (1989). Interactions between heterologous helix-loop-helix proteins generate complexes that bind specifically to a common DNA sequence. Cell 58, 537-544.
Rubin, G. M. and Spradling, A. C. (1982). Genetic transformation of Drosophila with transposable element vectors. Science 218, 348-353.
Rusch, J. and Levine, M. (1996). Threshold responses to the Dorsal regulatory gradient and the subdivision of the primary tissue territories in the Drosophila embryo. Curr. Opin. Genet. Dev. 6, 416-423.
Schnepp, B., Grumbling, G., Donaldson, T. and Simcox, A. (1996). Vein is a novel component in the Drosophila epidermal growth factor receptor pathway with similarity to the neuregulins. Genes Dev. 10, 2302-2313.
Schweisguth, F. and Posakony, J. W. (1992). Suppressor of Hairless, the Drosophila homolog of the mouse recombination signal-binding protein gene, controls sensory organ cell fates. Cell 69, 1199-1212.
Small, S., Blair, A. and Levine, M. (1992). Regulation of even-skipped stripe 2 in the Drosophila embryo. EMBO J. 11, 4047-4057.
Stathopoulos, A. and Levine, M. (2002). Dorsal gradient networks in the Drosophila embryo. Dev. Biol. 246, 57-67.
Stathopoulos, A., Van Drenth, M., Erives, A., Markstein, M. and Levine, M. (2002). Whole-genome analysis of dorsal-ventral patterning in the Drosophila embryo. Cell 111, 687-701.
Thisse, B., el Messal, M. and Perrin-Schmitt, F. (1987). The twist gene: isolation of a Drosophila zygotic gene necessary for the establishment of dorsoventral pattern. Nucl. Acids Res. 15, 3439-3453.
Thisse, C., Perrin-Schmitt, F., Stoetzel, C. and Thisse, B. (1991). Sequence-specific transactivation of the Drosophila twist gene by the dorsal gene product. Cell 65, 1191-1201.
White, K., DeCelles, N. L. and Enlow, T. C. (1983). Genetic and developmental analysis of the locus vnd in Drosophila melanogaster. Genetics 104, 433-448.