From a cDNA library prepared from venom glands of worker bees,
clones encoding the precursors of apamin and MCD peptide have been
isolated. The cDNAs are similar at the 5`-ends and identical in their
3`-regions. Analysis of the corresponding genes has revealed the
existence of six exons separated by introns rich in A + T.
Starting from the 5`-end, these exons are arranged in the following
order: three exons of the mast cell-degranulating (MCD) peptide
precursor, two exons of the gene for the apamin precursor, and finally
a 3`-exon present in both cDNAs. This suggests that the bulk of the
apamin gene resides in the third intron of the MCD peptide gene. Using inverse polymerase chain reaction, a segment of
genomic DNA upstream of the first exon of the MCD precursor
gene was obtained. The sequence of this segment shows 81% identity to
the DNA sequence preceding the first exon of the apamin gene and both
contain a putative TATA box. We thus propose that the mRNA encoding the
apamin precursor originates from a primary transcript which starts in
the third intron of the MCD peptide gene. Both cDNAs encode
unusually small precursors comprising only 46 amino acids in the case of
apamin and 50 in the case of the MCD peptide.
The venom gland of the honeybee, Apis
mellifera, produces an aqueous secretion which contains in
significant quantity only two enzymes and four
peptides (1, 2). The enzymes are a phospholipase A2
Plasmid preparation and subsequent digestion with
EcoRI and BamHI yielded insert sizes of 250-700
bp. Clones were further characterized by hybridization with the
peptide-specific oligonucleotides Apa1 and MCD1, respectively. Apa1
coded for the amino-terminal apamin sequence Cys-Asn-Cys-Lys-Ala-Pro
(TGT/C-AAT/C-TGT/C-AAA/G-GCX-CC, 17 mer), while MCD1 contained
the codons of the fragment Lys-Cys-Asn-Cys-Lys-Arg (residues 2-7)
(AAA/G-TGT/C-AAT/C-TGT/C-AAA/G-A/CG, 17 mer). Of the positive clones,
about one-half hybridized with the Apa1 and the other with the MCD1
oligonucleotide. The clones with the largest inserts were sequenced
using the chain termination method.
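The degenerate oligonucleotide notation used above can be checked mechanically. The following is a minimal sketch, assuming that "/" joins two alternative bases at a single position and that "X" stands for all four bases; the function names are illustrative and are not part of the original work. It recovers the stated length of each probe and counts how many distinct sequences each mixture contains.

```python
from math import prod

ANY_BASE = "ACGT"  # "X" in the notation stands for all four bases


def parse_degenerate(oligo):
    """Return one string of allowed bases per position, e.g. 'TGT/C' -> ['T', 'G', 'TC']."""
    positions = []
    chars = iter(oligo.replace("-", ""))  # hyphens only separate codons
    for ch in chars:
        if ch == "/":
            # the base after "/" is an alternative for the previous position
            positions[-1] += next(chars)
        elif ch == "X":
            positions.append(ANY_BASE)
        else:
            positions.append(ch)
    return positions


def degeneracy(oligo):
    """Number of distinct sequences present in the oligonucleotide mixture."""
    return prod(len(p) for p in parse_degenerate(oligo))


APA1 = "TGT/C-AAT/C-TGT/C-AAA/G-GCX-CC"
MCD1 = "AAA/G-TGT/C-AAT/C-TGT/C-AAA/G-A/CG"

for name, oligo in (("Apa1", APA1), ("MCD1", MCD1)):
    print(name, len(parse_degenerate(oligo)), "mer, degeneracy", degeneracy(oligo))
```

Under these assumptions both probes are 17 mers, in agreement with the lengths given in the text.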
Genomic
DNA was amplified with two oligonucleotides: (a) AMG-5
(G-ATT-TCT/C-ATG-CTG/A-AGA-TG, 18 mer) containing the codons for the
amino-terminal sequence (Me)t-Ile-Ser-Met-Leu-Arg-Cy(s) of both
precursors; (b) AMG-3 (CCATTTTGATGAATCCAA, 18 mer, antisense)
which bound to the 3`-untranslated region of both cDNAs. In a 50-µl
reaction, 0.2-2 µg of genomic DNA and 25 pmol of each primer
were used for the PCR (buffer as described above). After 30 cycles (40
s at 92 °C, 40 s at 52 °C, 2 min at 72 °C), two fragments
containing about 400 bp (AMH1) and 1200 bp (AMH2) were specifically
amplified. Both were subcloned into the pBluescript vector and
sequenced.
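Because AMG-3 is given in antisense orientation, the stretch of the cDNA sense strand that it binds is its reverse complement. A minimal sketch of that conversion (the helper name is illustrative):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


AMG3 = "CCATTTTGATGAATCCAA"  # antisense primer, as listed in the text
target_on_sense_strand = reverse_complement(AMG3)
print(target_on_sense_strand)
```

The printed sequence is what one would search for in the 3`-untranslated region of either cDNA to locate the primer binding site.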
The cloned cDNAs for the precursors of
apamin and MCD peptide show a high degree of homology at both ends. The
5`-untranslated region and the segment encoding the signal and the
propart of the respective precursors are 85% identical. Even more
striking is the fact that the 3`-ends of the two cloned cDNAs are
completely identical. In view of these rather unusual sequence
identities and similarities, it was considered of interest to
investigate the structure of the corresponding genes.
The postulated splicing
scheme depicted in Fig. 3 would yield the two mRNAs containing
identical 3`-ends. In this scheme, two exons and two introns of the
apamin gene form part of the third intron of the MCD gene.
The precursors for apamin and MCD peptide are unusual for two
reasons. First, they are extremely small, containing only 46 and 50
amino acids, respectively. It has previously been shown that precursor
polypeptides comprised of only 60-80 amino acids, such as those
for melittin from bee venom, cecropin from moth hemolymph, and PGLa
from amphibian skin, can be transported across the membrane of the
endoplasmic reticulum via a post-translational route which is
independent of signal recognition particle and docking protein
(21-23). Whether this is also true for the even smaller
precursors of these bee venom peptides remains to be seen. The second
aspect in which these precursors differ from the numerous other
propeptides and proproteins that have been analyzed in recent years is
the processing reactions which have to take place to obtain the mature
peptide. The conversion of proapamin and pro-MCD peptide to the final
products requires hydrolysis of Ser/Cys and Pro/Ile bonds,
respectively. The endo- and/or exopeptidases catalyzing these reactions
are not known. It has been shown that the liberation of melittin from
its precursor proceeds via stepwise cleavage of dipeptides. A
dipeptidyl aminopeptidase cleaving after proline and alanine residues
participates in the processing of promelittin (24) as well as of
several precursors from other sources (25). However, it is
unlikely that this exopeptidase could hydrolyze the proparts of the
precursors of apamin or MCD peptide. In this context, it is noteworthy
that liberation of hyaluronidase, another bee venom constituent, from
its precursor requires hydrolysis of a Thr/Pro bond (6), again a
most unusual site for proteolysis. Generally, it appears likely that
processing reactions in invertebrates are much more diverse than in
vertebrate cells (see e.g. Ref. 26).
Apamin and MCD peptide
have different biological activities, yet they show some sequence
similarity (see Fig. 1). This homology is much more pronounced
when one compares the sequence of the cloned cDNAs encoding these
precursors. The 5`-untranslated region and the segments encoding the
signal and the propeptides of the two precursors are 85% identical. More
striking, however, was the fact that the 3`-ends of the mRNAs for
these precursors were found to be identical. This raised the intriguing
possibility that a common 3`-exon was present in both mRNAs. Our
analysis of the gene encoding these precursors has corroborated this
notion.
The genomic organization was shown to be as follows:
first, three exons of the precursor for the MCD peptide, then two exons of
the apamin precursor, and finally the 3`-exon present in both
mRNAs. For the apamin precursor, this last exon contains the genetic
information for the COOH-terminal hexapeptide of the final product,
whereas in the case of the MCD precursor, this exon is entirely within
the 3`-untranslated region. The introns in the two genes occur in
homologous positions: the first is located between the penultimate and
the ultimate amino acid of the propart, while the second is inserted
before the sixth residue from the carboxyl end. Moreover, these second
introns are of identical length, each containing 81 bp.
These results
demonstrate that the 5`-region of the gene for the apamin precursor
forms part of the third intron of the MCD peptide gene. This
sequence is homologous to the upstream sequence preceding the latter
gene. In fact a TATA box is present in both instances about 75 bp prior
to the initiating ATG codons. We thus assume that the two genes have a
transcription start site close to these TATA boxes. The primary
transcript starting at the first TATA box would then be spliced in such
a way that two exons and introns of the apamin gene are excised as part
of the third intron of the MCD precursor gene. As shown
schematically in Fig. 3, this would yield the mRNA encoding the
precursor of the MCD peptide. Initiation of transcription at the
second TATA box and subsequent splicing of the two introns gives rise
to the mRNA for the apamin precursor. This would also explain the fact
that both mRNAs terminate in the same exon sequence. Apamin and MCD
peptide are present in about equal amounts in the venom of worker bees,
which indicates that both promoters are used with roughly the same
frequency. They may also be controlled jointly, since only trace
amounts of the two mRNAs can be detected on Northern blots with total
RNA from queen bee venom glands.
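The proposed scheme can be summarized as a small model: six exons in genomic order, two transcription start sites, and splicing that retains only the exons belonging to the gene whose promoter was used plus the shared 3`-exon. This is a sketch of the hypothesis as stated above, not of any measured data; the exon labels and function name are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    name: str
    gene: str  # "MCD", "apamin", or "common"


# Exons in their genomic order, 5' to 3', as described in the text.
GENOMIC_EXONS = [
    Exon("MCD exon 1", "MCD"),
    Exon("MCD exon 2", "MCD"),
    Exon("MCD exon 3", "MCD"),
    Exon("apamin exon 1", "apamin"),
    Exon("apamin exon 2", "apamin"),
    Exon("common 3' exon", "common"),
]

# Index of the first exon downstream of each putative TATA box.
PROMOTER_INDEX = {"MCD": 0, "apamin": 3}


def mature_mrna(promoter):
    """Exons retained after splicing of the transcript started at the given promoter."""
    primary_transcript = GENOMIC_EXONS[PROMOTER_INDEX[promoter]:]
    return [e.name for e in primary_transcript if e.gene in (promoter, "common")]


print(mature_mrna("MCD"))
print(mature_mrna("apamin"))
```

In this model the apamin exons are removed from the longer transcript together with the third MCD intron, and both mRNAs end in the same exon.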
Both the
origin of new genes by duplications and the generation of multiple
products from one gene through alternative splicing have been
documented in numerous instances. In addition, several cases of genes
within other genes, particularly within introns of other genes, have
been described. This was first shown for the Gart gene of
Drosophila, which contains a gene for a pupal cuticle protein
in one of its introns (27). Last, a few examples are known where
transcription initiation sites within a gene lead to the formation of
smaller polypeptides. For example, a number of dystrophin-related
proteins have been described which are encoded by mRNAs transcribed
from the same gene by the use of an alternative internal
promoter (28). The common gene for the bee venom constituents
MCD peptide and apamin apparently represents a new variation on the
theme of gene organization with some similarity to these other cases.
It could have originated from a primordial MCD precursor gene
through, e.g. unequal crossing over, whereby the promoter
region, two exons and two introns of this gene were duplicated.
Transcription from both promoters then yields the two mRNAs encoding
different components of honeybee venom.
and a hyaluronidase, and the amino acid sequence of these
proteins has been determined (3, 4, 5, 6). The main
constituent of bee venom is melittin, which represents 50-60% of
the dry weight. This is a lytic peptide that forms an amphipathic helix
(7) and inserts into phospholipid bilayers. Other peptides, each
present at less than one-tenth the amount of melittin, are mast
cell-degranulating (hence the name MCD) peptide, apamin, and secapin.
MCD peptide, also called peptide 401 (8), contains 22 amino acids
(see Fig. 1). It has a potent histamine-releasing activity (1, 2), and it
also acts as an anti-inflammatory agent (8). Upon injection into
the brain, MCD peptide elicits in a dose-dependent manner a variety of
symptoms ranging from arousal to convulsions (9). It was shown
subsequently that in rat brain MCD peptide binds with high affinity to
voltage-sensitive potassium channels (10, 11, 12). Apamin, which contains 18
amino acids and shows some sequence similarity to MCD peptide (see
Fig. 1), is a potent neurotoxin when administered by
intraventricular injection (1, 13). At nanomolar
concentration, apamin specifically inhibits a particular class of
calcium-dependent potassium channels (14, 15). Evidence
has been presented that mammalian brain contains a peptide with
biological activities similar to apamin (16).
Figure 1:
Amino acid sequences of apamin (A) and MCD peptide (B).
Carboxyl-terminal amides are marked (NH2). The
CNCK-sequences found in both peptides are underlined. The
arrow above the sequences indicates the length and orientation
of the oligonucleotides used for the PCR and screening
experiments.
Using cDNA
cloning techniques, the structures of the precursors of
melittin (17), phospholipase A2 (4), and
hyaluronidase (6) have been elucidated. Here we present the
sequence of the cloned cDNAs and of the gene encoding the precursors of
MCD peptide and apamin. The sequence data suggest that the gene evolved
from a common ancestor through a rather unusual partial duplication.
Isolation of mRNA from Venom Glands and cDNA Synthesis
Poly(A)-rich RNA was isolated from venom glands of
worker bees as previously described (6). First strand cDNA was
synthesized with 2.5 µg of RNA, the primer-adaptor
TGATTCAGGATCCTATCGA(T) and reverse transcriptase
(Superscript, Life Technologies, Inc.) using a modified version of the
RACE protocol (18). The reaction was diluted 25-fold with TE
buffer (10 mM Tris-Cl, 1 mM EDTA, pH 8.0). This cDNA
pool was then used for the polymerase chain reaction (PCR) experiments.
For amplification of the 3`-ends of the cDNAs derived from the mRNAs
encoding the precursors of apamin and MCD peptide, degenerate
oligonucleotides encoding the last four amino acids plus the glycine
residue required for amidation were used. These were Apa2 coding for
Cys-Gln-Gln-His-Gly (TGT/C-CAA/G-CAA/G-CAT/C-GG, 14 mer) and MCD2
containing the codons for the peptide Cys-Gly-Lys-Asn-Gly
(TGT/C-GGX-AAA/G-AAT/C-GG, 14 mer, X stands for all
four bases). The conditions used for the two separate PCRs were the
following. In a total volume of 50 µl, 10 µl of cDNA pool, 50
pmol of either Apa2 or MCD2, and 25 pmol of the adaptor Ada2
(TGATCAGGATCCTATCG, 17 mer), 10 mM Tris-HCl (pH 9.0), 50
mM KCl, 0.01% gelatin, 1.5 mM magnesium chloride,
0.1% Triton X-100, 0.2 mM dNTPs (Pharmacia) and 0.2 unit of
Hi-Taq polymerase (Vienna Laboratories) were mixed. After 30
cycles (40 s at 92 °C, 1 min at 50 °C (for MCD2), or 51 °C
(for Apa2), and 1 min at 72 °C), a
150-bp fragment was
amplified from the PCR in the presence of the Apa2 primer and a
somewhat larger one from the other reaction. The fragments were eluted
from the agarose gel, phosphorylated, blunt ended, and subcloned into
the pBluescript vector (Stratagene). Both strands of the cloned cDNAs
were sequenced by the chain termination method using the Sequenase 2.0
kit (U. S. Biochemical Corp.).
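The design of Apa2 and MCD2 from the carboxyl-terminal peptide sequences can be reproduced by back-translation. The sketch below assumes a small codon table covering only the residues involved, writes the wobble position in the same "/" and "X" notation as the text, and truncates the final glycine codon to its first two bases, as in the sequences listed above. The table and function name are illustrative.

```python
# Two fixed bases plus the alternatives at the third (wobble) position.
# Only the residues needed for these two primers are included.
CODONS = {
    "Cys": ("TG", "TC"),
    "Gln": ("CA", "AG"),
    "His": ("CA", "TC"),
    "Lys": ("AA", "AG"),
    "Asn": ("AA", "TC"),
    "Gly": ("GG", "X"),  # X = all four bases
}


def degenerate_oligo(peptide):
    """Back-translate a hyphen-separated peptide into the notation used in the text."""
    residues = peptide.split("-")
    parts = []
    for i, residue in enumerate(residues):
        prefix, wobble = CODONS[residue]
        if i == len(residues) - 1:
            parts.append(prefix)  # last codon: first two bases only
        elif wobble == "X":
            parts.append(prefix + "X")
        else:
            parts.append(prefix + "/".join(wobble))
    return "-".join(parts)


print(degenerate_oligo("Cys-Gln-Gln-His-Gly"))  # Apa2
print(degenerate_oligo("Cys-Gly-Lys-Asn-Gly"))  # MCD2
```

Omitting the third base of the last codon avoids a degenerate position at the 3`-end of the primer, which is presumably why both primers are 14 mers rather than 15 mers.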
Preparation and Screening of a cDNA Library
The
PCR fragments were labeled with [α-32P]dATP
and [α-32P]dCTP using the Klenow polymerase
and a random primed DNA labeling kit (Boehringer Mannheim). A cDNA
library prepared from venom glands of worker bees (6) was
screened with these fragments, and 24 positive clones were selected for
further analysis.
Isolation of Genomic Fragments
Genomic DNA was
isolated as described by John et al. (19). Bee pupae
were homogenized in 10 ml of solution A (10 mM Tris-HCl, pH
7.6, 10 mM KCl, 10 mM magnesium chloride) containing
1.2% (v/v) Nonidet P-40. Nuclei were spun down and lysed in solution B
(solution A containing 0.5 M NaCl and 0.5% SDS). After
successive extraction with phenol (saturated with 1 M
Tris-HCl, pH 8.0), phenol-chloroform-isoamyl alcohol (24:24:1), and
chloroform-isoamyl alcohol (24:1), the DNA was precipitated with two
volumes of ethanol. The washed DNA was then dissolved in TE buffer at a
concentration of 0.2 µg/µl and stored at 4 °C.
Isolation of 5`-Upstream Regions
The promoter
region preceding the first exon of the gene for the MCD peptide
precursor was isolated by using inverse PCR technology (20).
Genomic DNA was digested with Sau3A, extracted with
phenol-chloroform-isoamyl alcohol (24:24:1) and chloroform-isoamyl
alcohol (24:1) and then precipitated. Self-ligation was performed under
conditions favoring the formation of circles rather than concatamers,
i.e. less than 1 µg of DNA/ml, incubation at 16 °C for
15 h. The reaction was stopped by extraction with
phenol-chloroform-isoamyl alcohol (24:24:1) and chloroform-isoamyl
alcohol (24:1), and DNA was precipitated with isopropyl alcohol and
dissolved in TE buffer (25 µg/ml). The primers used for PCR were
IP-1 (TACCATCGTCGGTGTTAC, nucleotides 79-62 of fragment AMH2,
antisense) and IP-2 (AGACGTTGTCAACAGCAT, nucleotides 1128-1145 of
AMH2, sense orientation). In a final volume of 50 µl, 2 µg of
the religated DNA and 25 pmol each of IP-1 and IP-2 were incubated with
Hi-Taq polymerase as described above. After 35 cycles (40 s at
92 °C, 40 s at 55 °C, 3 min at 72 °C), 10 µl of the
reaction mixture were loaded onto an agarose gel, blotted, and
hybridized with labeled AMG-3 primer. Labeling of AMG-3 was performed
with T4 polynucleotide kinase in the presence of
[γ-32P]ATP. The hybridization conditions were
as follows: 6 × SSPE (1 × SSPE = 150 mM
NaCl, 10 mM sodium phosphate, 1 mM EDTA) and 0.1% SDS
at 30 °C for 90 min in the presence of about 10
cpm/ml
of labeled AMG-3. Filters were washed three times with 6 × SSPE
and 0.1% SDS for 10 min at 35 °C. Under these conditions, a
fragment containing about 1.4 kb hybridized with AMG-3. One-half of the
PCR product was then loaded onto an agarose gel, and the barely visible
band at 1.4 kb was eluted (Geneclean II, BIO 101). This fragment was
reamplified under the same conditions for 20 cycles, purified,
blunt-ended with T4 DNA polymerase (BioLabs), and subcloned into the
pBluescript vector. For sequence analysis, the Sequenase kit and a set
of appropriate primers were used.
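The logic of the inverse PCR step can be illustrated with a toy example. In the sketch below only the Sau3A site (GATC) and the IP-1 and IP-2 primer sequences come from the text; the flanking sequences, the spacer, and the function names are hypothetical placeholders. A restriction fragment is treated as a circle after self-ligation, and the product runs from the sense primer across the ligation junction to the binding site of the antisense primer.

```python
SITE = "GATC"  # Sau3A recognition sequence
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def digest(dna, site=SITE):
    """Cut immediately 5' of every occurrence of the site."""
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


def inverse_pcr(fragment, antisense_primer, sense_primer):
    """Expected product from a self-ligated (circular) restriction fragment."""
    start = fragment.index(sense_primer)
    end = fragment.index(reverse_complement(antisense_primer)) + len(antisense_primer)
    # read from the sense primer to the fragment end, across the ligation
    # junction, and on to the end of the antisense primer binding site
    return fragment[start:] + fragment[:end]


IP1 = "TACCATCGTCGGTGTTAC"  # antisense, near the 5'-end of the known region
IP2 = "AGACGTTGTCAACAGCAT"  # sense, near the 3'-end of the known region

UPSTREAM = "AAATTTAAATTT"    # hypothetical unknown 5'-flank
DOWNSTREAM = "TTTAAATTTAAA"  # hypothetical unknown 3'-flank
KNOWN = reverse_complement(IP1) + "CCCC" + IP2  # stand-in for the known region
genome = "GGGG" + SITE + UPSTREAM + KNOWN + DOWNSTREAM + SITE + "GGGG"

fragment = next(f for f in digest(genome) if IP2 in f)
product = inverse_pcr(fragment, IP1, IP2)
print(product)
```

The product contains the 3`-flank, a single regenerated GATC site, and then the 5`-flank, which is the arrangement expected for a fragment amplified from a religated circle.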
Sequence Analysis
For sequence comparisons, the
Gene Works Program, version 2.3.1 (IntelliGenetics Inc.) was used on an
Apple computer. Nucleotide sequences of cDNAs and genomic fragments
have been deposited in the data base.
The Apamin Precursor
From poly(A)-rich RNA
isolated from worker bee venom glands, cDNA was synthesized with the
oligo(dT) adapter as primer for the reverse transcriptase
reaction (23). A degenerate oligonucleotide derived from the
carboxyl-terminal pentapeptide of apamin (see Fig. 1A)
was then used for the synthesis of the second strand. The nucleotide
sequence of the 149-bp fragment thus obtained started at the 5`-end
with the codons for the carboxyl end of apamin followed by a glycine
and an in-frame stop codon (see Fig. 2A). This fragment
was then labeled and used to screen a cDNA library prepared from worker
bee venom glands. Two dozen positive clones were selected and
rescreened with a degenerate oligonucleotide derived from the
amino-terminal sequence Cys-Asn-Cys-Lys-Ala-Pro of apamin. About every
other clone gave a positive signal in this second screening. Two of
these clones, Apamin-3 and Apamin-6 containing inserts of
450 and 300 bp, respectively, were investigated further. The nucleotide
sequence of the insert present in clone Apamin-6 is shown in
Fig. 2A. Starting with the first ATG codon, this
sequence contains a single open reading frame which encodes
prepro-apamin comprising 46 amino acids. The cloned cDNA terminates at
the 3`-end with a poly(A) segment which is preceded by the
polyadenylation signal AATAAA. The following segments can be discerned
in the precursor: (i) a signal sequence which probably terminates after
serine 19; (ii) a proregion of 8 residues which ends in proline; (iii)
the sequence of apamin; and (iv) a glycine residue required for
formation of the carboxyl-terminal amide.
Figure 2:
Nucleotide sequence of cloned cDNAs
encoding the precursors of apamin (A) and MCD peptide
(B). The deduced amino acid sequence of the precursors is
shown in the single letter code above the nucleotide sequence.
The predicted end of the signal peptides (29) and the end of the
proregion are marked, as are the stop codons (///). The
arrow indicates the beginning of the segment where
the sequences of both cDNAs are identical. The polyadenylation signals
close to the 3`-ends are underlined.
The nucleotide sequence of
the insert present in clone Apa3 was found to be largely identical to
that of Apa6, except that an intron of 146 bp was present between the
Met-Pro codons immediately preceding the sequence of apamin (see
Fig. 2A). This intron is very rich in A + T (86%).
The Precursor for MCD Peptide
Using the RACE
protocol (18) combined with the polymerase chain reaction, a
fragment derived from the 3`-end of the mRNA for the precursor of MCD
peptide was obtained. Surprisingly, the nucleotide sequence of this
fragment was found to be identical to the 3`-end of the cloned cDNAs
encoding the apamin precursor (Fig. 2B). In fact, most
of the clones eliminated in the screen with the oligonucleotide
specific for apamin (see above) were found to contain inserts encoding
the precursor of MCD peptide. The inserts present in three of these
clones were sequenced, and the results obtained with clone mcd-1 are
shown in Fig. 2B. After the first initiation codon, this
sequence had an open reading frame encoding a polypeptide of 50 amino
acids. The precursor of MCD peptide contains a signal peptide of 19
amino acids, a propart of 8 residues terminating in serine, the sequence
of mature MCD peptide and a carboxyl-terminal glycine from which the
terminal amide is derived.
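The segment lengths given for the two precursors can be checked against the reported totals. This is simple bookkeeping using the numbers stated in the text (signal peptide, propart, mature peptide, and the glycine that donates the terminal amide); the dictionary layout is illustrative.

```python
# Segment lengths in amino acids, as stated in the text.
PRECURSORS = {
    "apamin": {"signal": 19, "pro": 8, "mature": 18, "amidation Gly": 1},
    "MCD peptide": {"signal": 19, "pro": 8, "mature": 22, "amidation Gly": 1},
}

totals = {name: sum(parts.values()) for name, parts in PRECURSORS.items()}
print(totals)
```

The sums reproduce the 46 and 50 residues deduced from the open reading frames.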
Analysis of the Genes Encoding the Precursors of Apamin
and MCD Peptide
Starting with genomic DNA, PCR experiments were
performed to amplify parts of both genes simultaneously. Two
oligonucleotides, AMG-5 and AMG-3, were synthesized that bind to
regions close to the 5`- and the 3`-ends of both cDNAs, respectively
(see "Experimental Procedures"). Under suitable conditions,
two fragments containing about 400 bp (AMH-1) and 1200 bp (AMH-2) were
amplified. AMH-1 contains the three exons encoding the apamin
precursor. Of the two introns, one was already known from the
apamin-3 clone, while the second is located precisely in front of
the common 3`-ends of the cDNAs. Sequence analysis of AMH-2 gave
another surprising result. Starting from the 5`-end, this sequence
comprises three exons of the precursor of the MCD peptide, interrupted
by two short introns, then a larger intron of 400 bp, and finally the
sequence of AMH-1. The genomic sequence encoding the two precursors is
schematically shown in Fig. 3.
Figure 3:
Schematic representation of the genes for
the MCD peptide and apamin precursors. The structure of the genes was
deduced from the sequence of AMH1 and AMH2. Exons are drawn as
rectangles, introns as a thick line. The stop codons
in two exons are marked by dots. Above and below the genomic
sequence, the postulated splicing patterns yielding the mature mRNAs
for the two precursors are indicated.
The gene for the precursor of
the MCD peptide contains two introns at the same positions as the
apamin precursor gene. All these introns are rich in A +
T (84-88%). The second intron of both genes contains 81 bp, of
which 51 (63%) are identical. The first introns of the two genes are,
however, of different length, 127 bp for the apamin and 100 bp for the
MCD peptide gene, respectively, and only limited sequence
similarity close to the ends is discernible.
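The identity figure for the second introns follows directly from the counts given above. The short sketch below shows the calculation, together with a helper that counts identical positions in a pair of gapped, pre-aligned sequences; the example alignment passed to the helper is a made-up placeholder and not sequence data from this work.

```python
def percent_identity(identical, length):
    """Identity as a percentage of the compared length."""
    return 100.0 * identical / length


def count_identities(aligned_a, aligned_b):
    """Count matching positions in two aligned sequences; gap columns never count."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))


# Second introns: 51 identical positions out of 81 bp.
print(round(percent_identity(51, 81)))

# Hypothetical toy alignment, for illustration only.
print(count_identities("ATTA-T", "ATAA-T"))
```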
The 5`-Upstream Region of the MCD Peptide Gene
The
intron preceding the first exon of the apamin precursor contains a
putative TATA box (see "Discussion"). For the analysis of
the corresponding region preceding the first exon of the gene for the
MCD peptide precursor, we used the inverse PCR (25).
Genomic bee DNA was digested with Sau3A and religated under
conditions favoring the formation of circles. Using the
oligonucleotides IP-1 and IP-2 (see "Experimental
Procedures"), several fragments were amplified including the
1.4-kb fragment IPS-1.4. This fragment contained on the two sides of
the GATC sequence recognized by the Sau3A restriction
endonuclease about 330 bp derived from the 5`-side and 800 bp from
the 3`-side of the genes for the two precursors. In Fig. 4, this
5`-sequence is compared with the sequence of the region that precedes
the first exon of the apamin gene. The two sequences are 81% identical
and both contain a TATA box 72 bp (MCD gene) or 76 bp
(apamin gene) upstream from the ATG initiating codon. This
provides further support for the assumption that the two genes
originated from a partial duplication of a primordial gene.
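The distance between a putative TATA box and the initiating ATG can be measured with a simple string search. The sketch below is only an illustration: the motif "TATAAA", the convention of measuring from the first base of the motif to the first base of the ATG, and the toy sequence are all assumptions, since the text does not specify the exact motif or how the 72 and 76 bp were counted.

```python
def tata_distance(sequence, motif="TATAAA"):
    """Distance in bp from the start of the last motif upstream of the final ATG to that ATG."""
    atg = sequence.rfind("ATG")
    if atg == -1:
        return None
    tata = sequence.rfind(motif, 0, atg)
    if tata == -1:
        return None
    return atg - tata


# Hypothetical upstream region: filler, motif, 66 bp of spacer, start codon.
toy_upstream = "GCGCGC" + "TATAAA" + "C" * 66 + "ATG"
print(tata_distance(toy_upstream))
```

Applied to the two genomic 5`-sequences, a search of this kind would locate the candidate promoter elements compared in Fig. 4.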
Figure 4:
Comparison of 5`-regions of the two genes.
The genomic nucleotide sequences preceding the initiating methionine of
the precursors for MCD peptide (upper lines) and apamin
(lower lines) are shown. The latter sequence starts with the
end of the third exon (underlined) of the MCD precursor gene
(see Fig. 3), followed by the sequence of the following intron. The
start of the cloned cDNAs is indicated by (<), and putative TATA
boxes are marked by dots above the sequence. Identities are
marked (filled squares); gaps (-) were introduced to maximize
homology.
©1995 by The American Society for Biochemistry and Molecular Biology, Inc.