Silk gland factor-1 (SGF-1) regulates transcription of the
Bombyx sericin-1 gene via interaction with the SA site. In
this study, two related SGF-1 polypeptides of apparent molecular masses
of 40 and 41 kDa were purified. Specific interaction of these proteins
with the SA site was demonstrated by electrophoretic mobility shift and
dimethyl sulfate methylation interference assays. The SGF-1 40-kDa
protein was partially sequenced and characterized as a new member of
the fork head/HNF-3 family. Several full-length cDNAs encoding
the SGF-1 40-kDa and possibly also the 41-kDa proteins were cloned and
sequenced. SGF-1 mRNA is expressed consistently with the
presumed role of the SGF-1 protein product in regulating the sericin-1
gene. The SGF-1 protein contains putative transactivation domains. We
conclude that the 40- and 41-kDa SGF-1 proteins affect transcription of
the sericin-1 gene via binding to the SA site.
The silk glands of the silkworm Bombyx mori produce
vast amounts of several silk proteins. Genes for the H- and L-chain
fibroin are expressed in the posterior part of silk glands, whereas
several serine-rich proteins arise from transcripts of two sericin
genes expressed in the middle part of silk glands. We are interested in
the regulation of this massive tissue-specific gene expression (see
Ref. 1 for a review). Several protein factors were shown to interact
with DNA elements located upstream of the 5'-ends of the H-chain
fibroin and sericin-1 genes (2, 3). A silk gland-enriched factor, SGF-3, is known to bind to the SC site upstream of the sericin-1 gene promoter (3).
Another silk gland-enriched factor, named SGF-1, interacts with the
SA site located in the proximal upstream region of the sericin-1 gene
promoter. Removal of the SA element decreases in vitro transcription from the sericin-1 promoter in nuclear extract prepared from MSG (3). A major tissue- and sequence-specific complex was detected by electrophoretic mobility shift assay (EMSA) with a crude MSG extract, using an oligonucleotide containing the SA element as a probe (4). It was proposed that the protein component of this retarded complex is identical to SGF-1 (4). The same protein probably also binds to two regions upstream of the H-chain fibroin gene (2).
In the present report, we describe the purification of the SGF-1 protein and the molecular cloning of its corresponding cDNAs. Bombyx SGF-1 is a homologue of the Drosophila fork head protein (6) and therefore a new member of the fork head/HNF-3 family of transcriptional regulators.
Preparation of Nuclear Extract
Nuclear extract was obtained from MSG of 2-day-old final instar larvae as described earlier (3).
SGF-1 Binding to DNA-coupled Latex Particles
SGF-1 binds to the SA site in the sericin-1 gene promoter (3). DNA containing multiple repeats of the SA element was prepared by ligation of double-stranded oligonucleotides (SA20 oligonucleotide in Fig. 1). The ligated DNA was coupled to latex particles as described (8). The DNA-coupled particles were used to enrich SGF-1 binding activity from crude nuclear extract. The published procedure (8) was followed throughout, and all manipulations were done in a cold room. A typical binding reaction contained 35 mg of nuclear extract and 2 mg of Escherichia coli tRNA in 13 mM Hepes (pH 7.9), 8.3 mM MgCl2, 33 mM KCl, 170 mM NaCl, 17 mM EDTA, 0.1% Nonidet P-40, 15% (v/v) glycerol, and 1.3 mM dithiothreitol in a total volume of 1.5 ml.
A
bromodeoxyuridine-substituted oligonucleotide carrying the SGF-1
cognate sequence SA was prepared by filling in the SAUV oligonucleotide
shown in Fig. 1. This probe was UV cross-linked with the latex
fraction. Several specific bands with molecular masses ranging from 52 to 60 kDa were observed (data not shown). Some bands were apparently caused by probe or protein degradation during the incubation period. Our attempt to shorten the cross-linked probe by DNase digestion failed, probably because SGF-1 was degraded by proteases during the 37 °C incubation step. The molecular mass of the SGF-1 protein was estimated to lie within the range of 38-46 kDa by subtracting the contribution of the probe from the apparent molecular masses of the observed bands.
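The subtraction described above can be written out explicitly. The following minimal sketch back-calculates the probe contribution implied by the two reported ranges. That contribution is inferred from the ranges rather than stated in the text, and the function and variable names are illustrative only.

```python
# Apparent masses (kDa) of the cross-linked protein-probe bands, as reported.
band_range_kda = (52.0, 60.0)
# Estimated SGF-1 mass range (kDa), as reported.
protein_range_kda = (38.0, 46.0)


def implied_probe_contribution(bands, protein):
    """Return the probe contribution (kDa) implied at each end of the range."""
    return tuple(b - p for b, p in zip(bands, protein))


# Inferred value, not a measurement quoted from the paper.
probe_kda = implied_probe_contribution(band_range_kda, protein_range_kda)
print(probe_kda)
```

Both ends of the range imply the same offset, which is consistent with a single fixed probe contribution having been subtracted from every band.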
SDS-PAGE analysis of the latex fraction revealed a
prominent band of apparent molecular mass of 40 kDa (Fig. 2 A,
lane 1). The relative SGF-1 binding activity
estimated by EMSA correlated well with the intensity of the 40-kDa
protein band in various washes and successive elutions throughout the
DNA affinity enrichment procedure (data not shown). The 40-kDa protein
was therefore presumed to be identical with the SGF-1 factor. To
confirm this hypothesis, the 40-kDa band was excised from SDS-PAGE,
renatured, and analyzed by EMSA. The mobility of the resulting complex
was identical with that of the SGF-1-DNA complex in both the crude extract and the latex fraction (data not shown).
The latex
fraction was chromatographed on a reverse phase column. An additional,
less abundant protein of approximately 41 kDa co-eluted with the 40-kDa
protein in a single fraction composed of two overlapping peaks
(Fig. 2 A, lane 4, and data not shown).
This fraction is hereafter referred to as the reverse phase fraction.
Several members of this
family have been recently cloned in our laboratory from a Bombyx embryonic cDNA library.
The entire 3.1-kb insert of the SGF-1 cDNA 6 clone was
sequenced (Fig. 5 A). The sequence revealed a 1047-base
pair-long ORF (nucleotides 36-1082), encoding a 38.8-kDa protein
that contains all the peptides obtained by the SGF-1 (40 kDa) trypsin
digest. The starting AUG codon is preceded by the sequence AGCC, which
corresponds relatively well with the Drosophila translation
start consensus (15).
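The ORF figures quoted above can be checked with simple arithmetic. The sketch below derives the ORF length and codon count from the reported coordinates and compares a rough mass estimate with the reported 38.8 kDa. The average residue mass of 110 Da is a rule-of-thumb assumption introduced here, not a value from the paper, which presumably computed the mass from the actual sequence.

```python
# ORF coordinates from the text (1-based, inclusive).
orf_start, orf_end = 36, 1082

orf_length_bp = orf_end - orf_start + 1   # length of the ORF in base pairs
codons = orf_length_bp // 3               # number of complete codons

# Assumption: rule-of-thumb average mass of one amino acid residue (Da).
AVG_RESIDUE_DA = 110.0
approx_mass_kda = codons * AVG_RESIDUE_DA / 1000

print(orf_length_bp, codons, round(approx_mass_kda, 1))
```

The codon count equals the 349 residues given in the legend to Fig. 5, which suggests that the quoted coordinates do not include the stop codon. The rough estimate falls within about 0.5 kDa of the reported value.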
The 3'-untranslated end of SGF-1 cDNA 9 (and other cDNAs) differs from that of clone 6; four base pair substitutions were found within the last 100 base pairs (Fig. 5 B and data not shown). Sequences of the remaining portions of the 3'-untranslated region of clone 9 were not determined, but no differences were noticed after restriction mapping. We sequenced the entire ORF and the 5'-untranslated end of SGF-1 cDNA 9 and found that this sequence was identical with the corresponding region of clone 6 (data not shown).
Restriction mapping revealed that the 3'-end of clone 5 contains an extra 0.8-kb stretch; this sequence was not present in the other clones shown in Fig. 5 B. Partial sequencing of this region indicated that the 3'-end of clone 5 is merely an extension of the 3'-end found in the other cDNAs (Fig. 5 B and data not shown). The 5'-end and the ORF of cDNA 5 were sequenced except for the conserved fork head domain. No difference was found between the coding regions of clones 5 and 6 (data not shown).
The SGF-1 cDNA clones 6 and 9 differ by base substitutions in their 3'-untranslated regions (see above). This is not due to the existence of two closely related SGF-1 genes, since a Southern blot of genomic DNA showed a single hybridizing band in most restriction enzyme digests (Fig. 6 A). Therefore, we speculate that the observed differences between the two cDNA clones reflect the occurrence of two slightly different SGF-1 alleles.
We did not study the differences between cDNAs 6, 9, and 5 in detail. Our presumption is that the 3.1-kb-long clones 6 and 9, which differ by base substitutions at the 3'-end, are full-length cDNAs (Fig. 5 B), representing two slightly different alleles of the SGF-1 gene.
There is conclusive evidence that SGF-1 stimulates sericin-1 gene transcription via interaction with the SA site (3, 4). The SGF-1 transcription-stimulating activity was ascribed to a DNA binding protein that forms the major, and the only sequence-specific, retarded complex detected by EMSA with a probe containing the SA site in crude extract. The SGF-1 complex is tissue specific, being most prominent when extract from middle or posterior silk glands is used (3, 4). We propose that the SGF-1 protein, which we have cloned in the present study, is involved in the regulation of the sericin-1 gene because 1) this is the protein that forms the sequence-specific complex with the SA site probe in crude extract (Figs. 3 and 4), 2) high SGF-1 mRNA levels are found in silk glands of animals of the appropriate developmental stages but not in the control tissue (Fig. 6 B), and 3) SGF-1 is a new member of an established family of transcriptional regulators, possesses putative transactivation domains, and its Drosophila homologue is required for the development of salivary glands (see above).
Nevertheless, a direct test, preferably an in vitro transcription assay, is needed to study the function of recombinant SGF-1 protein and its domains in stimulating sericin-1 promoter activity. Other questions concern the putative roles of SGF-1 in the control of other genes coding for silk components and the possible interaction of SGF-1 with other protein factors, namely SGF-3 (5).
The nucleotide
sequence(s) reported in this paper has been submitted to the
GenBank/EMBL Data Bank with accession number(s) D38514.
We thank Dr. C.-c. Hui for Bombyx genomic
DNA; Dr. H. Kokubo for the actin oligonucleotide probe and useful
discussion; H. Kajiura and Y. Makino for peptide sequencing; Dr. M.
Yoshikuni for help with operating the Smart system HPLC; C. Inoue, M.
Ohkubo, E. Suzuki, and M. Sasaki for assistance; Prof. G. Eguchi, Dr.
M. Mochii, and Prof. Y. Nagahama for the access to Smart system HPLC,
and Drs. X. Xu, K. Nakai, and P.-X. Xu for useful discussion. We are
especially grateful to Dr. M. Jindra for his careful revision of the
manuscript (grammar and style).
is known to bind to
the SC site upstream of the sericin-1 gene promoter
(3) .
Removal of this SC element reduces in vitro transcription from the sericin-1 gene promoter in crude middle silk gland (MSG) nuclear extract (3, 4). POU-M1, a homologue of the Drosophila Cf1-a transcription factor, was recently found to be expressed in the middle (and to a lesser extent in the posterior) silk glands and shown to be identical with SGF-3 (5).
, 33 mM KCl, 170 mM NaCl, 17 mM EDTA, 0.1% Nonidet P-40, 15% (v/v) glycerol, and 1.3 mM dithiothreitol in total volume of 1.5 ml. This mixture was pre-incubated for 20 min, 20 mg of latex particles coupled to 20 µg of DNA was added, and the incubation was continued for an additional 60 min. Washes and elutions were performed with 20 mM Hepes (pH 7.9), 15% (v/v) glycerol, 10 mM EDTA, and increasing concentrations of NaCl. Typically, two 5-min washes (0.3 ml each) with the above solution containing 0.1 M NaCl were followed by four washes containing 0.3 M NaCl and by four 100-µl elutions containing 0.5 M NaCl. The eluates were pooled, and this SGF-1-enriched fraction was referred to as the latex fraction. The entire procedure was repeated until approximately 10 ml of the latex fraction was obtained.
EMSA
Binding reactions (20 µl) included 100 µg/ml poly(dI-dC), 100 µg/ml tRNA, and 1 mg/ml nuclear extract and were carried out in 10 mM Hepes (pH 7.9), 100 mM NaCl, 1 mM dithiothreitol, 10 mM EDTA, and either 8% (v/v) glycerol or 2.5% Ficoll 400. When EMSA was performed with the latex fraction (0.2 µl/reaction) instead of the nuclear extract, the amount of heterologous nucleic acid was reduced to 0.5 µg/ml tRNA, and Nonidet P-40 and bovine serum albumin were added to final concentrations of 0.1% and 1 mg/ml, respectively. Similar conditions were used to examine renatured proteins. A double-stranded oligonucleotide containing the wild type SGF-1 binding site (SA site, see Ref. 3) was used both as a probe (5 fmol/reaction) and as a specific competitor (SA oligonucleotide in Fig. 1). The mutated version of this oligonucleotide, which contained three T-to-A substitutions, was found to compete inefficiently with the wild type probe and served as a control competitor (SAM1 oligonucleotide in Fig. 1).
UV Cross-linking
A pair of SAUV oligonucleotides (Fig. 1) was used to incorporate 5-bromo-2'-deoxyuridine triphosphate and labeled dATP into the SA sequence. This probe was UV cross-linked (see Ref. 9, Suppl. 3) with SGF-1 present in the latex fraction. Reactions contained an excess of the unlabeled SA oligonucleotide, an excess of the unlabeled SAM1 oligonucleotide, or no competing oligonucleotide. The optimal interval for UV irradiation was empirically determined to be 20-25 min. The reactions were not treated with DNase prior to SDS-PAGE analysis.
Dimethyl Sulfate Methylation Interference
The protocol described in Ref. 9 (Suppl. 25) was followed except that passive elution was used to recover DNA from acrylamide gels.
Renaturation of Proteins
SDS-PAGE was performed as described (see Ref. 9, Suppl. 13). Proteins were stained with zinc (10), and the protein bands were excised, destained, and renatured as described (8).
HPLC
The Pharmacia Smart system micro HPLC was employed.
Figure 1:
Sequence of oligonucleotides used in this study. The double-stranded oligonucleotides were derived from the SA site sequence (3). Numbers indicate the positions of nucleotides relative to nucleotide +1 of the sericin-1 gene. The nucleotides mutated from the wild type sequence are indicated in lower case letters. These substitutions were introduced to facilitate labeling by Klenow enzyme (not underlined) or to interfere with SGF-1 binding (underlined). The active SA site in the SA20 and SAUV DNAs was generated after self-ligation and a filling-in procedure, respectively.
Proteins
The latex fraction was supplemented with solid guanidinium hydrochloride, 1% trifluoroacetic acid, and 50% acetonitrile (ACN) to achieve final concentrations of 6 M, 0.1%, and 4%, respectively. Up to 2 ml of this mixture was loaded on a µRPC C2/C18 PC 3.2/3 column. The gradient was run in 0.1% trifluoroacetic acid, from 4 to 40% ACN in 1 ml, and then from 40 to 80% ACN in 5 ml. The flow rate was 0.1 ml/min. The entire latex fraction was processed by reverse phase chromatography, aliquots containing the 40- and 41-kDa proteins were pooled, and this material (total 0.8 ml) was referred to as the reverse phase fraction.
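The gradient parameters given above translate directly into run times and solvent composition. The sketch below is a minimal illustration that assumes each gradient segment is linear, which the text does not state explicitly; the function names are hypothetical.

```python
FLOW_ML_PER_MIN = 0.1  # flow rate from the text

# Gradient segments from the text: (start % ACN, end % ACN, volume in ml).
SEGMENTS = [(4.0, 40.0, 1.0), (40.0, 80.0, 5.0)]


def segment_minutes(volume_ml, flow=FLOW_ML_PER_MIN):
    """Time needed to deliver volume_ml at the given flow rate."""
    return volume_ml / flow


def acn_percent_at(volume_ml):
    """ACN percentage after volume_ml of gradient (assumes linear segments)."""
    elapsed = 0.0
    for start, end, seg_vol in SEGMENTS:
        if volume_ml <= elapsed + seg_vol:
            return start + (end - start) * (volume_ml - elapsed) / seg_vol
        elapsed += seg_vol
    return SEGMENTS[-1][1]  # past the end of the gradient


for start, end, seg_vol in SEGMENTS:
    print(f"{start:.0f}-{end:.0f}% ACN: {segment_minutes(seg_vol):.0f} min")
```

Under these assumptions the first segment takes about 10 min and the second about 50 min, so the second segment is much shallower per unit volume than the first.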
Peptides
The trypsin digest of the 40-kDa protein was concentrated according to the protocol used in Ref. 11. All material was loaded on a µRPC C2/C18 SC 2.1/10 column and fractionated as recommended by the manufacturer.
Trypsin Digestion
The reverse phase fraction was mixed with 20 µl of 1.5% deoxycholate, concentrated 4-fold in a Speedvac, and precipitated with trichloroacetic acid using the Sigma P5656 kit. Precipitated proteins were resuspended in 50 µl of SDS-PAGE sample buffer and stored at -70 °C. Following SDS-PAGE, the gel was zinc stained, and the 40-kDa band was excised and destained with citric acid (10). The gel slice was then washed in ACN and treated with trypsin as described (11).
Peptide Sequencing
Custom sequencing (Takara) as well as the National Institute for Basic Biology facilities were used to obtain peptide sequences.
cDNA Library Construction
An RNA fraction enriched for poly(A)+ RNA was isolated from MSG of 2-day-old last instar larvae. The MSG cell layer was separated from the lumen containing silk proteins, and 150 mg of the cell layer was homogenized in 3.5 ml of a solution containing 3 M LiCl, 6 M urea, 10 mM sodium acetate (pH 5.2), and 0.4 mg/ml heparin. RNA was precipitated on ice and centrifuged. The precipitate was resuspended in 0.4 ml of the extraction solution supplied with the Quick-prep micro mRNA isolation kit (Pharmacia), and mRNA was isolated as recommended by the manufacturer. cDNA was synthesized using the TimeSaver cDNA synthesis kit (Pharmacia) and cloned into EcoRI/CIAP-treated lambdaZAP II vector (Stratagene). The library was plated on E. coli SURE cells (Stratagene). Phagemids from positive clones were rescued by the Ex-assist/SOLR system (Stratagene).
Northern Blot
Poly(A)+ RNA was obtained as described above; MSG lumen contents were eliminated if applicable. Samples of approximately 2 µg of poly(A)+ RNA per lane were electrophoresed in an agarose/formaldehyde gel as described (12). The RNA was electroblotted onto Hybond-N membrane (Amersham Corp.) in 25 mM sodium phosphate (pH 7.0).
DNA Sequencing
The sequence of SGF-1 cDNA 6 was determined from a set of overlapping deletions using the Sequenase version 2.0 system (U. S. Biochemical Corp.). Standard reactions containing dGTP were run in parallel with dITP-containing reactions. Oligonucleotide primers were synthesized to verify this sequence and to resolve sequences of other cDNA clones.
Purification of the SGF-1 Protein
Approximately 5000 pairs of MSGs were dissected from 2-day-old final instar larvae and used in batches of 1000 MSG pairs to prepare a total of 40 ml of nuclear extract containing 1.6 g of protein. This material was enriched for the SGF-1 binding activity using latex particles coupled with DNA containing the SGF-1 binding site SA (see "Experimental Procedures"). The estimated SGF-1 enrichment was greater than 500-fold, with 20-30% recovery. This fraction is hereafter referred to as the latex fraction.
DNA complex in both
the crude extract and the latex fraction (data not shown).
Figure 2:
SDS-PAGE analysis of purified fractions. M, marker lane. A, silver-stained gel. The crude MSG extract was enriched for the SGF-1 binding activity using DNA-coupled latex particles. This latex fraction (lane 1) was chromatographed on a reverse phase HPLC column. Lanes 2-4 contain fractions 12-14, respectively, obtained by the reverse phase chromatography step. Material in lane 4 was saved, and this fraction was referred to as the reverse phase fraction. The SGF-1 40-kDa protein is present in lanes 1, 3, and 4 (indicated by an arrow); the SGF-1 41-kDa protein migrates just above the 40-kDa band. The relative intensity of the SGF-1 41-kDa polypeptide band was exaggerated by the photography process. B, Coomassie Brilliant Blue-stained precipitated reverse phase fraction. Proteins present in the reverse phase fraction were precipitated by trichloroacetic acid/deoxycholate and dissolved in the SDS-PAGE loading buffer; one-tenth of this mixture was loaded on an SDS-PAGE gel. Approximately 400 ng of the SGF-1 40-kDa polypeptide was present, as estimated by Coomassie Brilliant Blue staining and comparison with standard proteins (standard proteins are not shown). The position of the SGF-1 40-kDa protein is indicated by an arrow, and the SGF-1 41-kDa protein is visible just above.
An aliquot of the reverse phase fraction was loaded on an SDS-PAGE gel, and both the 40- and 41-kDa protein bands were excised separately, renatured, and assessed by EMSA. Both proteins displayed the same binding specificity and produced retarded complexes that migrated indistinguishably from the SGF-1 complex formed in crude extract (Fig. 3). We used the methylation interference assay to examine the interaction of the renatured proteins with the proximal upstream region of the sericin-1 gene in greater detail. Both proteins yielded clear, apparently identical dimethyl sulfate footprints (compare lanes 8 and 11 in Fig. 4 A). Indistinguishable footprints were obtained when the DNA probe was incubated either with crude extract or with the latex fraction (Fig. 4 A). This confirms that the SGF-1 binding activity purified in the 40- and 41-kDa proteins is identical with the DNA binding activity ascribed to the SGF-1 factor in crude extract. The observed footprint is depicted in Fig. 4 B and maps within the DNase I protected region defined as the SA site (3). We will hereafter refer to the 40-kDa protein as SGF-1 (40 kDa) and to the 41-kDa protein as SGF-1 (41 kDa).
Figure 4:
Evidence that the purified SGF-1 proteins correspond to the SGF-1 factor in crude extract (methylation interference assay). A, a DNA fragment spanning positions -132 to -50 of the sericin-1 gene promoter was uniquely labeled at the 5'-end of the coding (lanes 1-12) or the non-coding (lanes 13-15) strand. Guanine and (to a lesser extent) adenine residues were partially methylated with dimethyl sulfate, and this probe was used in a gel shift assay with either crude extract, latex fraction, or renatured SGF-1 proteins (obtained as described in Fig. 3). Protein complexes and unbound DNAs were localized by autoradiography, eluted, and cleaved at methylated residues with piperidine. The protein-bound and free DNAs were compared on a sequencing gel. DNA extracted from the free probe band is present in the outer lanes 1 and 3 (crude extract), 4 and 6 (latex fraction), 7 and 9 (renatured SGF-1 40-kDa protein), 10 and 12 (renatured SGF-1 41-kDa protein), and 13 and 15 (latex fraction), whereas DNA extracted from the retarded band is present in the middle lanes 2 (crude extract), 5 (latex fraction), 8 (renatured SGF-1 40-kDa protein), 11 (renatured SGF-1 41-kDa protein), and 14 (latex fraction). Methylated G and A residues that interfere with protein binding are diminished in the protein-bound lanes compared with unbound DNA. B, an overview of the observed interferences. Numbers indicate the positions of nucleotides relative to nucleotide +1 of the sericin-1 gene. Filled arrowheads indicate strong interferences, and blank arrowheads indicate weak or ambiguous interferences.
Internal Sequence of SGF-1 (40 kDa) Protein
Proteins present in the reverse phase fraction were
precipitated with trichloroacetic acid/deoxycholate, and the sediment
was dissolved in SDS-PAGE loading buffer. One-tenth of this mixture was
electrophoresed to estimate the total yield of SGF-1 by Coomassie
Brilliant Blue staining (Fig. 2 B). About 4 µg of the
SGF-1 (40 kDa) protein (approximately 100 pmol) and 1 µg of the
SGF-1 (41 kDa) protein (25 pmol) were purified. These two proteins
together comprised over 85% of the precipitated reverse phase fraction.
The entire precipitated reverse phase fraction was loaded onto a
SDS-PAGE gel, and the SGF-1 (40 kDa) protein was excised from the gel
and digested by trypsin. The SGF-1 digestion products were separated by
HPLC, and their sequences were partially determined. Several SGF-1
fragments (21A, 25A, 44A, 8B, and 25B) revealed useful sequences. These
peptides are indicated by underlines in Fig. 5 A, and
their sequences are listed in the figure legend.
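The mass-to-mole conversions quoted above follow from the apparent molecular masses. The sketch below reproduces them and also scales the amount loaded on the gel in Fig. 2 B back to the total yield. The helper name is illustrative, and apparent masses from SDS-PAGE are treated as molar masses, which is an approximation.

```python
def micrograms_to_pmol(mass_ug, mass_kda):
    """Convert a protein mass to picomoles.

    1 ug / 1 kDa = 1e-6 g / 1e3 g/mol = 1e-9 mol = 1000 pmol.
    """
    return mass_ug * 1000 / mass_kda


pmol_40 = micrograms_to_pmol(4, 40)   # SGF-1 (40 kDa), about 4 ug
pmol_41 = micrograms_to_pmol(1, 41)   # SGF-1 (41 kDa), about 1 ug

# One-tenth of the precipitate was loaded and about 400 ng was seen on the gel,
# so the total is ten times the loaded amount.
total_40_ng = 400 * 10

print(round(pmol_40), round(pmol_41, 1), total_40_ng)
```

The 41-kDa figure comes out slightly below the quoted 25 pmol, consistent with the paper rounding an approximate estimate.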
Cloning of SGF-1 cDNA
The partial SGF-1 peptide sequences were compared with a non-redundant protein sequence database using the Fasta program (13). The overlapping peptides 21A and 25A were found to be contained within the Drosophila fork head protein (6). Peptide 8B was very similar to another part of Drosophila fork head (data not shown). We therefore concluded that SGF-1 belongs to the fork head/HNF-3 family (see Refs. 7 and 14).
One of these
clones appears to encode several SGF-1 peptides.
This embryonic cDNA clone was used to screen an MSG cDNA
library. Over 20 putative positive clones were identified among
approximately 3.5
10
plaque-forming units of the
unamplified library. Twelve of them were isolated, mapped by restriction
enzyme digestion and Southern hybridization, and partially sequenced.
Figure 5:
Nucleotide sequences of the SGF-1 cDNA clones. A, nucleotide sequence and predicted translation of the SGF-1 cDNA clone 6. The completely sequenced full-length cDNA 6 is 3078 base pairs long and encodes a 349-amino acid-long polypeptide. Amino acids are shown and numbered in italics. The putative polyadenylation signal and the peptide sequences determined from the trypsin digest of the SGF-1 40-kDa protein are underlined. Not all amino acid residues in the underlined peptides were resolved by peptide sequencing. The unambiguously determined amino acid residues are as follows: peptide 44A, EQXXXSPTSALQ; peptide 25A, FKDEK; peptide 21A, XKDEKK; peptide 41A, LLPGADTK; peptide 8B, QEPSGYAPAQHPF; peptide 25B, XYDVNYGYGXXPAXNYY. B, partial sequences of several SGF-1 cDNAs. Left, the extreme 5'-ends of SGF-1 cDNAs 6, 9, 13, and 5. The clones differ by only a few nucleotides and therefore may correspond to (nearly) full-length cDNAs. Right, the sequences of clones 6, 9, 13, and 3 represent the extreme 3'-ends of the respective clones. The sequence of clone 5 maps approximately 0.8 kb from the actual 3'-end of this clone. The underlined putative polyadenylation signal was probably used by clones 6, 3, 9 (and 13) but not by clone 5.
The SGF-1 cDNAs 6, 9, and 13 revealed essentially identical restriction maps when digested with PstI, XhoI, and KpnI enzymes (data not shown). The 5'-ends of these clones were partially sequenced and found to differ by only a few bases in length (Fig. 5 B and data not shown). The cDNA library was not amplified, and therefore these three cDNAs likely correspond to (nearly) full-length copies.
SGF-1 Expression
We were interested to see whether SGF-1 was expressed specifically or predominantly in MSG and whether the time course of its expression was consistent with its presumed role as a regulator of the sericin-1 gene. We detected two SGF-1 transcripts in MSG poly(A)+ RNA preparations. The major transcript (approximately 5 kb) was present in MSG of all stages examined, while the less abundant one (approximately 6 kb) was found in MSG of molting larvae and, at extremely low levels, also in MSG of 2-day-old final instar larvae (Fig. 6 B, lanes 2 and 3). Thus, the appearance of SGF-1 mRNA preceded the massive expression of the sericin-1 gene in the last larval instar (16), and the major 5-kb SGF-1 transcript was maintained throughout the last instar. The 5-kb SGF-1 mRNA could also be detected in the control tissue (Fig. 6 B, lane 5), although its relative amount was very low (see legend to Fig. 6, B and C).
Figure 6:
Evidence that SGF-1 is a single copy gene and that its transcripts are predominantly expressed in MSG. A, Southern blot analysis. Bombyx Sho-Wa strain genomic DNA was digested with EcoRI (lane 1), PstI (lane 2), PstI and XhoI (lane 3), or XhoI alone (lane 4). The restriction enzyme digests were resolved on an agarose gel, blotted, and probed with a fragment of the SGF-1 cDNA 6 (nucleotides 1-325, Fig. 5 A). The presence of a single hybridizing band in most lanes suggests that SGF-1 is a single copy gene. B, Northern blot analysis. Poly(A)+ RNAs were prepared from MSG of animals from the penultimate larval instar staged 72 h after the third ecdysis (lane 1), from MSG of the E2 stage in the fourth molt (lane 2; staged approximately 24 h later than the previous stage, see Ref. 24 for details), from MSG of 2-day-old last instar larvae (lane 3), from MSG of 5-day-old last instar larvae (lane 4), and from ovary of 4.5-day-old prepupae (lane 5). The RNAs were hybridized to the same SGF-1 probe as in A. The major 5-kb transcript is prominent in all MSG samples and is present as an extremely weak band in the ovary (lane 5). The minor 6-kb transcript is clearly observed in the E2 stage (lane 2) and is almost absent in the next stage (lane 3). C, the same Northern blot as in B rehybridized with a Bombyx cytoplasmic actin oligonucleotide probe. We used a Fuji Bas2000 image analyzer to calculate the amount of the major SGF-1 transcript after normalization to the actin level. The amount of SGF-1 5-kb mRNA present in the last instar is roughly constant; it is approximately two times higher than in the stage from the fourth instar and 40 times higher than in ovary. We excluded the molting stage from our calculations, since we feel that the actin level is not an appropriate marker for this stage.
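The correction on the actin level described in this legend amounts to dividing each SGF-1 signal by the actin signal from the same lane before comparing lanes. The sketch below illustrates that calculation only. The readings are hypothetical placeholders in arbitrary units, not values from the blot, and the function names are illustrative.

```python
def normalize(target_signal, reference_signal):
    """Return the target signal divided by the loading-control signal."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return target_signal / reference_signal


def fold_difference(sample, baseline):
    """Ratio of two (target, reference) pairs after normalization."""
    return normalize(*sample) / normalize(*baseline)


# Hypothetical arbitrary-unit readings, NOT data from the paper.
example_sample = (300.0, 150.0)    # (SGF-1 signal, actin signal)
example_baseline = (50.0, 100.0)

print(fold_difference(example_sample, example_baseline))
```

Normalizing first means that a lane loaded with more RNA does not appear to contain more transcript simply because of loading, provided the reference transcript is itself stable across the stages compared.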
SGF-1 Proteins and cDNA Clones
We have identified two closely related SGF-1 proteins with observed molecular masses of 40 and 41 kDa, respectively. Both yielded retarded EMSA complexes that comigrated with the complex generated by SGF-1 in crude extract (Fig. 3). Furthermore, the interaction of the SGF-1 (40 kDa) and SGF-1 (41 kDa) proteins with the methylated SA site appears identical when compared both with each other and with the SGF-1 protein present in crude extract (Fig. 4). In fact, the results of the methylation interference assay practically exclude the possibility that protein(s) other than the 40- and 41-kDa ones participate in the formation of the tissue- and sequence-specific SGF-1 complex in crude extract. The SGF-1 40- and 41-kDa proteins co-elute in a single fraction from reverse phase chromatography (Fig. 2 A and data not shown), further indicating their high similarity.
Figure 3:
Evidence that the purified SGF-1 proteins correspond to the SGF-1 factor in crude extract (electrophoretic mobility shift assay). EMSA was performed with 5 fmol of labeled SA oligonucleotide (Fig. 1) and either 10 µg of crude MSG extract (lane 1) or unknown but equal amounts of SGF-1 40-kDa protein (lanes 2-4) or unknown but equal amounts of SGF-1 41-kDa protein (lanes 5-7). To obtain SGF-1 proteins, the reverse phase fraction was analyzed by SDS-PAGE (Fig. 2 A, lane 4), and the 40- and 41-kDa SGF-1 polypeptides were excised and renatured as described under "Experimental Procedures." Lanes 3 and 6 contain 500 pmol of unlabeled SA oligonucleotide (efficient competitor containing functional SA site, Fig. 1), and lanes 4 and 7 contain 500 pmol of unlabeled SAM1 oligonucleotide (inefficient competitor bearing mutated SA site, Fig. 1).
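The competitor amounts in this legend imply a large molar excess over the labeled probe. The sketch below works out that ratio from the two quantities given; the excess itself is derived here and is not stated in the text, and the variable names are illustrative.

```python
FMOL_PER_PMOL = 1000  # unit conversion

probe_fmol = 5            # labeled SA oligonucleotide per reaction
competitor_pmol = 500     # unlabeled SA or SAM1 oligonucleotide

competitor_fmol = competitor_pmol * FMOL_PER_PMOL
molar_excess = competitor_fmol // probe_fmol

print(molar_excess)
```

At an excess of this size, failure of the SAM1 oligonucleotide to compete is a stringent indication that the mutated positions are needed for binding.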
We have analyzed over a dozen cDNA clones, including several likely full-length cDNAs (Fig. 5 and data not shown). These clones fall into three categories represented by SGF-1 cDNAs 6, 9, and 5. Based on Southern blot hybridization, all of them originate from a single gene (Fig. 6 A) and possess an identical ORF encoding a 38.8-kDa protein. This protein is a new member of the fork head family and contains all peptide sequences found in the SGF-1 (40 kDa) protein fragments (Fig. 5 A and legend). As discussed above, the SGF-1 (40 kDa) and SGF-1 (41 kDa) proteins are likely to be close relatives. If they arose from different mRNAs, one would expect the presence of cDNA clones encoding different protein products. However, no such different cDNAs were obtained. We therefore assume that both the 40- and 41-kDa (apparent molecular mass) proteins correspond to the 38.8-kDa product predicted from the cDNA sequences. One possibility is that the 41-kDa protein is a post-translationally modified version of the 40-kDa polypeptide. The 1.2-kDa difference between 40 and 38.8 kDa is well within the limits of experimental error of molecular mass determination by SDS-PAGE.
representing two
slightly different alleles of the SGF-1 gene. Both of these
clones could arise by the usage of the most proximal polyadenylation
signal (Fig. 5 B) and likely correspond to the major transcript
observed on Northern blot (Fig. 6 B).
The 3.9-kb-long clone 5 contains an additional 0.8 kb at its 3'-end, and its restriction map and partial sequence suggest that it is an extension of clone 9 (Fig. 5 B and data not shown). It is likely that a more distant polyadenylation site was used in the case of cDNA 5. This clone may correspond to the 6-kb transcript detected on the Northern blot (Fig. 6 B, lanes 2 and 3).
Regulation of Genes Coding for Silk Proteins by SGF-1
SGF-1 is a new member of the fork head/HNF-3 family. This family contains HNF-3 factors α, β, and γ, which regulate transcription of tissue-specific genes in rodent liver and several other organs derived from the embryonic gut (14). The other two members of this family are Xenopus XFKH-1 (17) and Drosophila fork head (6), which are developmental regulators. Members of this family are
characterized by a 110-amino acid-long conserved DNA binding motif (see
Ref. 18). This motif is highly conserved also in SGF-1
(Lys
-Glu
in Fig. 5 A,
domain I in Fig. 7), indicating that SGF-1 is a true
homologue of Drosophila fork head and mammalian HNF-3 rather than a homologue of some other Drosophila (19) or mammalian
(7, 20) fork head protein. In
addition, the SGF-1 binding site (Fig. 4 B) closely
resembles some sequences recognized by HNF-3
(21, 22) .
Two additional domains (Leu
-Leu
and
Asp
-Leu
, Fig. 5 A;
domains II and III, respectively, in
Fig. 7
) are conserved between Drosophila fork head and
the three mammalian HNF-3 factors. These two domains are
transactivation domains required for transcriptional stimulation by
HNF-3
(23) . Both of these domains can be recognized in
SGF-1 (Fig. 7). Except for two gaps, domain II is well conserved
between HNF-3
and SGF-1. It is likely that this conservation of
structure is paralleled by conservation of function; these regions may
be required for target gene transactivation. Further biochemical
evidence is necessary to address this question.
Figure 7:
Comparison of SGF-1 with other members of the fork head/HNF-3 family. The three domains conserved between Drosophila fork head protein (FKH) and mammalian HNF-3 factors are also present in SGF-1. The domain boundaries in SGF-1 are Lys and Glu (I), Leu and Leu (II), and Asp and Leu (III). Domain I is the DNA binding domain, and domains II and III participate in transactivation by HNF-3 (23).
Drosophila fork head is a homeotic gene required for differentiation of
embryonic termini. Aside from early terminal domains in ectoderm,
fork head is expressed in four additional tissues: the
developing midgut, salivary glands, nervous system, and the yolk
nuclei. With the exception of midgut, expression persists until very
late stages of embryonic development
(6). Salivary glands are
missing in fork head mutant embryos (cited in Ref. 6).
Lepidopteran silk glands are homologous to the salivary glands of
Diptera, and SGF-1 is expressed in Bombyx embryos. We speculate that the SGF-1 protein may be
initially required for the development of silk glands and subsequently
utilized in the control of genes coding for silk proteins. Indeed,
besides the sericin-1 gene, SGF-1 probably interacts with an upstream
region of the fibroin-H gene
(2), and there are putative SGF-1
binding sites in the regulatory regions of several other genes coding
for silk components. To our knowledge, no target genes for the
Drosophila fork head protein have been identified to date.
and 3) SGF-1
is a new member of an established family of transcriptional regulators,
possesses putative transactivation domains, and its Drosophila homologue is required for the development of salivary glands (see
above).
©1995 by The American Society for Biochemistry and Molecular Biology, Inc.