(Received for publication, March 3, 1997, and in revised form, June 10, 1997)
From the Institute of Biological Chemistry, and Department of Biochemistry and Biophysics, Washington State University, Pullman, Washington 99164-6340
Grand fir (Abies grandis) has been
developed as a model system for studying defensive oleoresin formation
in conifers in response to insect attack or other injury. The
turpentine fraction of the oleoresin is a complex mixture of
monoterpene (C10) olefins in which (−)-limonene and (−)-α- and (−)-β-pinene are prominent components; (−)-limonene
and (−)-pinene synthase activities are also induced upon stem
wounding. A similarity based cloning strategy yielded three new
cDNA species from a wounded stem cDNA library that appeared to
encode three distinct monoterpene synthases. After expression in
Escherichia coli and enzyme assay with geranyl diphosphate as substrate, subsequent analysis of the terpene products by chiral phase gas chromatography and mass spectrometry showed that these sequences encoded a (−)-limonene synthase, a myrcene synthase, and a
(−)-pinene synthase that produces both α-pinene and β-pinene. In
properties and reaction stereochemistry, the recombinant enzymes resemble the corresponding native monoterpene synthases of
wound-induced grand fir stem. The deduced amino acid sequences
indicated the limonene synthase to be 637 residues in length (73.5 kDa), the myrcene synthase to be 627 residues in length (72.5 kDa), and the pinene synthase to be 628 residues in length (71.5 kDa); all of
these monoterpene synthases appear to be translated as preproteins bearing an amino-terminal plastid targeting sequence. Sequence comparison revealed that these monoterpene synthases from grand fir
resemble sesquiterpene (C15) synthases and diterpene
(C20) synthases from conifers more closely than other
monoterpene synthases from angiosperm species. This similarity between
extant monoterpene, sesquiterpene, and diterpene synthases of
gymnosperms is surprising since functional diversification of this
enzyme class is assumed to have occurred over 300 million years ago.
Wound-induced accumulation of transcripts for monoterpene synthases was
demonstrated by RNA blot hybridization using probes derived from the
three monoterpene synthase cDNAs. The availability of cDNA
species encoding these monoterpene synthases will allow an
understanding of the regulation of oleoresin formation in conifers and
will ultimately permit the transgenic manipulation of this defensive
secretion to enhance resistance to insects. These cDNAs also
furnish tools for defining structure-function relationships in this
group of catalysts that generate acyclic, monocyclic, and bicyclic
olefin products.
Chemical defense of conifer trees against bark beetles and their
associated fungal pathogens relies primarily upon constitutive and
inducible oleoresin biosynthesis (1, 2). This defensive secretion is a
complex mixture of monoterpene and sesquiterpene olefins (turpentine)
and diterpene resin acids (rosin) that is synthesized constitutively in
the epithelial cells of specialized structures, such as resin ducts and
blisters or, in the case of induced oleoresin formation, in
undifferentiated cells surrounding wound sites (3). The volatile
fraction of conifer oleoresin, which is toxic to both bark beetles and
their fungal associates (4), may consist of up to 30 different
monoterpenes (5), including acyclic types (e.g.
myrcene), monocyclic types (e.g. limonene), and bicyclic
types (e.g. pinenes) (Fig. 1).
Although the oleoresin is toxic, many bark beetle species nevertheless employ turpentine volatiles in host selection and can convert various
monoterpene components into aggregation or sex pheromones to promote
coordinated mass attack of the host (2, 6). In grand fir (Abies
grandis), increased formation of oleoresin monoterpenes, sesquiterpenes, and diterpenes is induced by bark beetle attack (3, 7,
8), and this inducible defense response is mimicked by mechanically
wounding sapling stems (3, 8, 9). Therefore, grand fir has been
developed as a model system to study the biochemical and molecular
genetic regulation of constitutive and inducible terpene biosynthesis
in conifers (10).
Most monoterpenes are derived from geranyl diphosphate, the ubiquitous
C10 intermediate of the isoprenoid pathway, by synthases that catalyze the divalent metal ion-dependent ionization
(to 1) and isomerization of this substrate to enzyme-bound linalyl diphosphate which, following rotation about C2-C3, undergoes a
second ionization (to 2) followed by cyclization to the α-terpinyl cation, the first cyclic intermediate en route to both monocyclic and bicyclic products (11, 12) (Fig. 1). Acyclic monoterpenes, such as myrcene, may arise by deprotonation of
carbocations 1 or 2, whereas the isomerization
step to linalyl diphosphate is required in the case of cyclic types,
such as limonene and pinenes, which cannot be derived from geranyl
diphosphate directly because of the geometric impediment of the
trans-double bond at C2-C3 (11, 12). Many monoterpene
synthases catalyze the formation of multiple products, including
acyclic, monocyclic, and bicyclic types, by variations on this basic
mechanism (13-15). For example, (−)-limonene synthase, the principal
monoterpene synthase of spearmint (Mentha spicata) and
peppermint (Mentha × piperita), produces small
amounts of myrcene, (−)-α-pinene, and (−)-β-pinene in addition to
the monocyclic product (16, 17). Conversely, six different inducible
monoterpene synthase activities have been demonstrated in extracts of
wounded grand fir stem (18) indicating that formation of acyclic,
monocyclic, and bicyclic monoterpenes in this species involves several
genes encoding distinct catalysts. The inducible (−)-pinene synthase
has been purified (19) and isotopically sensitive branching experiments
employed to demonstrate that this enzyme synthesizes both (−)-α- and
(−)-β-pinene (20).
Deciphering the molecular genetic control of oleoresinosis and
examining structure-function relationships among the monoterpene synthases of grand fir require isolation of the cDNA species
encoding these key enzymes. Although a protein-based cloning strategy
was recently employed to acquire a cDNA for the major
wound-inducible diterpene synthase from grand fir, abietadiene synthase
(9, 21, 22), all attempts at the reverse genetic approach to cloning of
grand fir monoterpene synthases have failed (10). As an alternative, a
similarity based PCR1
strategy was developed (10) that employed sequence information from
terpene synthases of angiosperm origin, namely a monoterpene synthase,
(−)-(4S)-limonene synthase, from spearmint (M. spicata, Lamiaceae) (17), a sesquiterpene synthase,
5-epi-aristolochene synthase, from tobacco (Nicotiana
tabacum, Solanaceae) (23), and a diterpene synthase, casbene
synthase, from castor bean (Ricinus communis,
Euphorbiaceae) (24).
In this paper, we describe the successful application of this strategy
to the amplification of specific hybridization probes and their use in
the isolation of six new "terpene synthase-like" cDNAs. Three
of the full-length clones were functionally expressed in
Escherichia coli and thereby identified as myrcene synthase, (−)-limonene synthase, and a pinene synthase that produces both (−)-α- and (−)-β-pinene (Fig. 1). This is the first report of the
isolation of any cDNA encoding a monoterpene synthase from a
gymnosperm, and the first report to describe the cloning of several
different monoterpene synthases (for acyclic, monocyclic, and bicyclic
products) from a single plant species. Sequence comparison revealed
significantly greater conservation between the grand fir monoterpene
synthases and other gymnosperm terpene synthases than with angiosperm
terpene synthases, and targeted a number of highly conserved amino acid
residues for further study. Additionally, Northern hybridization
analysis demonstrated that induced oleoresinosis in grand fir is
regulated at the level of monoterpene synthase RNA accumulation.
[1-3H]Geranyl diphosphate (250 Ci/mol)
(25), [1-3H]farnesyl diphosphate (125 Ci/mol) (26), and
[1-3H]geranylgeranyl diphosphate (120 Ci/mol) (21) were
prepared as described previously. Terpenoid standards were from our own collection. All other biochemicals and reagents were purchased from
Sigma or Aldrich, unless otherwise noted. Construction of the λZAP II
cDNA library, using mRNA isolated from wounded grand fir
sapling stems, was described previously (22).
Based on comparison of sequences
of limonene synthase from spearmint (17),
5-epi-aristolochene synthase from tobacco (23), and casbene
synthase from castor bean (24), four conserved regions were identified
for which a set of consensus degenerate primers (primers A-D) were
synthesized. Primers A-C have been described previously (10); primer D
(see Fig. 2) was designed based on the
conserved amino acid sequence motif DD(T/I)(I/Y/F)D(A/V)Y(A/G) of the
above noted terpene synthases (17, 23, 24). The sequence of sense
primer D was 5′-GA(C/T) GA(C/T) III T(T/A)(T/C) GA(C/T) GCI (C/T)A(C/T)
GG-3′. Each of the sense primers, A, B, and D, was used for PCR in
combination with antisense primer C by employing a broad range of
amplification conditions. PCR was performed in a total volume of 50 µl containing 20 mM Tris/HCl (pH 8.4), 50 mM
KCl, 5 mM MgCl2, 200 µM each
dNTP, 1-5 µM each primer, 2.5 units of Taq
polymerase (Life Technologies, Inc.), and 5 µl of purified grand fir
stem cDNA library phage as template (1.5 × 10⁹
plaque-forming units/ml). Analysis of the PCR reaction products by
agarose gel electrophoresis (27) revealed that only the combination of
primers C and D generated a specific PCR product of approximately 110 bp. This PCR product was gel-purified, ligated into pT7Blue (Novagen),
and transformed into E. coli XL1-Blue cells. Plasmid DNA was
prepared from 41 individual transformants, and the inserts were
sequenced (DyeDeoxy Terminator Cycle Sequencing, Applied Biosystems).
Four different insert sequences were identified and were designated as
probes 1, 2, 4, and 5. Subsequent isolation of four new cDNA
species, encoding terpene synthases from grand fir corresponding to
these probes, allowed the identification of three additional conserved
sequence elements which were used to design a set of three new PCR
primers.
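The degeneracy of a consensus primer can be tallied directly from the bracketed notation used above. The following is a minimal sketch, assuming that each inosine (I) counts as a single base choice and that alternatives are written as slash-separated bases in parentheses; the helper names are illustrative and are not part of the original procedure.

```python
from math import prod


def parse_degenerate_primer(primer: str) -> list[set[str]]:
    """Split a primer such as 'GA(C/T) III' into one set of bases per position."""
    seq = primer.replace(" ", "")
    positions = []
    i = 0
    while i < len(seq):
        if seq[i] == "(":
            # A parenthesized group lists the alternative bases for one position.
            close = seq.index(")", i)
            positions.append(set(seq[i + 1:close].split("/")))
            i = close + 1
        else:
            # A plain letter (including inosine, I) is a single choice.
            positions.append({seq[i]})
            i += 1
    return positions


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides in the synthesized pool."""
    return prod(len(p) for p in parse_degenerate_primer(primer))


# Sense primer D as given in the text (5' to 3').
PRIMER_D = "GA(C/T) GA(C/T) III T(T/A)(T/C) GA(C/T) GCI (C/T)A(C/T) GG"

if __name__ == "__main__":
    print("length (nt):", len(parse_degenerate_primer(PRIMER_D)))
    print("degeneracy:", degeneracy(PRIMER_D))
```

Under these assumptions primer D is a 23-mer with 128-fold degeneracy; the four inosine positions keep the pool small while still covering the variable codon positions of the DD(T/I)(I/Y/F)D(A/V)Y(A/G) motif.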
Degenerate primer E (designed to conserved element GE(K/T)(V/I)M(E/D)EA
(see Fig. 2)) and degenerate primer F (designed to conserved element
Q(F/Y/D)(I/L)(T/L/R)RWW) were based on comparison of the sequences of
five cloned terpene synthases from grand fir as follows: a monoterpene
synthase corresponding to probe 2, two sesquiterpene
synthases2 corresponding to
probe 4 and probe 5, respectively, a previously described diterpene
synthase (22), and a truncated terpene
synthase3 corresponding to
probe 1. The sequence of sense primer E was 5′-GGI GA(A/G) A(A/C)(A/G)
(A/G)TI ATG GA(A/G) GA(A/G) GC-3′ and of sense primer F was 5′-GA(A/G)
(C/T)TI CA(G/A) (C/T)TI (A/C/T)(C/G/T)I (A/C)GI TGG
TGG-3′. Degenerate primer G (see Fig. 2) was designed according to the
amino acid sequence DVIKG(F/L)NW obtained from a peptide generated by
trypsin digestion of purified (−)-pinene synthase from grand
fir.4 The sequence of
antisense primer G was 5′-CCA (A/G)TT IA(A/G) ICC (C/T)TT IAC
(A/G)TC-3′. Primers E and F were independently used for PCR
amplification in combination with primer G, with grand fir
stem cDNA library as template. The combination of primers E and G
yielded a specific PCR product of approximately 1020 bp. This PCR
product was ligated into pT7Blue and transformed into E. coli XL1-Blue. Plasmid DNA was prepared from 20 individual transformants, and inserts were sequenced from both ends. The sequence
of this 1022-bp insert was identical for all 20 plasmids and was
designated as probe 3.
For library screening, 100 ng of each
probe (1 through 5) was amplified by PCR, gel purified, randomly
labeled with [α-32P]dATP (28), and used individually to
screen replica filters of 10⁵ plaques of the wound-induced
grand fir stem cDNA library plated on E. coli LE392.
Hybridization with probes 1, 2, 4, and 5 was performed for 14 h at
65 °C in 3 × SSPE and 0.1% SDS. Filters were washed three
times for 10 min at 55 °C in 3 × SSPE with 0.1% SDS and
exposed for 12 h to Kodak XAR film at −70 °C (27). All of the
λZAPII clones yielding positive signals were purified through a
second round of hybridization (probe 1 gave 25 positives, probe 2 gave
16 positives, probe 4 gave 49 positives, and probe 5 gave 12 positives). Hybridization with probe 3 was performed as before, but the
filters were washed three times for 10 min at 65 °C in 3 × SSPE and 0.1% SDS before exposure. Approximately 400
λZAPII clones
yielded strong positive signals, and 34 of these were purified through
a second round of hybridization at 65 °C. Approximately 400 additional clones yielded weak positive signals with probe 3, and 18 of
these were purified through a second round of hybridization for 20 h at 45 °C. Purified
λZAP II clones isolated using all five probes
were in vivo excised as Bluescript II SK(−) phagemids and
transformed into E. coli XLOLR according to the
manufacturer's instructions (Stratagene). The size of each cDNA
insert was determined by PCR using T3 and T7 promoter primers, and
selected inserts (>1.5 kb) were partially sequenced from both
ends.
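The counts of positives reported above can be converted into apparent clone frequencies in the library. The sketch below is a simple arithmetic check, assuming that each probe was screened against 10⁵ plaques as stated and that the number of positives is a rough proxy for transcript abundance; the function names are illustrative.

```python
PLAQUES_SCREENED = 10**5

# First-round positives reported for probes 1, 2, 4, and 5.
POSITIVES = {"probe 1": 25, "probe 2": 16, "probe 4": 49, "probe 5": 12}

# Approximate number of strong positives reported for probe 3.
PROBE_3_STRONG = 400


def frequency_percent(count: int, screened: int = PLAQUES_SCREENED) -> float:
    """Positives expressed as a percentage of the plaques screened."""
    return 100.0 * count / screened


def fold_range(counts: dict[str, int]) -> float:
    """Ratio between the most and least frequently detected probes."""
    return max(counts.values()) / min(counts.values())


if __name__ == "__main__":
    for probe, count in POSITIVES.items():
        print(f"{probe}: {frequency_percent(count):.3f}% of plaques")
    print(f"probe 3 (strong): {frequency_percent(PROBE_3_STRONG):.3f}% of plaques")
    print(f"fold range, probes 1, 2, 4, 5: {fold_range(POSITIVES):.1f}")
```

On these numbers the four original probes span roughly a 4-fold range (49 versus 12 positives), whereas the strong probe 3 signals are about 25 times more frequent than the probe 2 positives.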
Except for
cDNA clones pAG3.18 and pAG3.48, all of the partially sequenced
inserts were either truncated at the 5′-end, or were out of frame, or
bore premature stop codons upstream of the presumptive methionine start
codon. For the purpose of functional expression, a 2001-bp insert
fragment from plasmid pAG2.2 and a 1903-bp insert fragment from pAG3.18
were subcloned in frame into pGEX vectors (Pharmacia Biotech Inc.). A
2046-bp insert fragment from pAG10 was subcloned in frame into the
pSBETa vector (29). To introduce suitable restriction sites for
subcloning, fragments were amplified by PCR using primer combinations
2.2-BamHI (5′-CAA AGG GAT CCA GAA TGG CTC TGG-3′) and 2.2-NotI
(5′-AGT AAG CGG CCG CTT TTT AAT CAT ACC CAC-3′) with pAG2.2 as template,
3.18-EcoRI (5′-CTG CAG GAA TTC GGC ACG AGC-3′) and 3.18-SmaI
(5′-CAT AGC CCC GGG CAT AGA TTT GAG CTG-3′) with pAG3.18, and 10-NdeI
(5′-GGC AGG AAC ATA TGG CTC TCC TTT CTA TCG-3′) and 10-BamHI
(5′-TCT AGA ACT AGT GGA TCC CCC GGG CTG CAG-3′) with pAG10. PCR reactions
were performed in volumes of 50 µl containing 20 mM
Tris/HCl (pH 8.8), 10 mM KCl, 10 mM
(NH4)2SO4, 2 mM
MgSO4, 0.1% Triton X-100, 5 µg of bovine serum albumin,
200 µM each dNTP, 0.1 µM each primer, 2.5 units of recombinant Pfu polymerase (Stratagene), and 100 ng
of plasmid DNA with the following program: denaturation at 94 °C, 1 min; annealing at 60 °C, 1 min; extension at 72 °C, 3.5 min; 35 cycles with final extension at 72 °C, 5 min. The PCR products were
purified by agarose gel electrophoresis and used as template for a
secondary PCR amplification with the identical conditions in total
volumes of 250 µl each. Products from this secondary amplification
were digested with the above indicated restriction enzymes, purified by
ultrafiltration, and then ligated, respectively, into
BamHI/NotI-digested pGEX-4T-2 to yield plasmid
pGAG2.2, into EcoRI/SmaI-digested pGEX-4T-3 to
yield plasmid pGAG3.18, and into
NdeI/BamHI-digested pSBETa to yield plasmid
pSBAG10; these plasmids were then transformed into E. coli
XL1-Blue or E. coli BL21(DE3).
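Each subcloning primer is named for the restriction site it introduces, which can be confirmed by searching the primer for the corresponding recognition sequence. The following sketch assumes the standard recognition sequences for BamHI, NotI, EcoRI, SmaI, and NdeI and uses the primer sequences exactly as listed above; the dictionary and function names are illustrative.

```python
# Standard recognition sequences (5' to 3') of the enzymes used for subcloning.
SITES = {
    "BamHI": "GGATCC",
    "NotI": "GCGGCCGC",
    "EcoRI": "GAATTC",
    "SmaI": "CCCGGG",
    "NdeI": "CATATG",
}

# Subcloning primers as given in the text (5' to 3').
PRIMERS = {
    "2.2-BamHI": "CAA AGG GAT CCA GAA TGG CTC TGG",
    "2.2-NotI": "AGT AAG CGG CCG CTT TTT AAT CAT ACC CAC",
    "3.18-EcoRI": "CTG CAG GAA TTC GGC ACG AGC",
    "3.18-SmaI": "CAT AGC CCC GGG CAT AGA TTT GAG CTG",
    "10-NdeI": "GGC AGG AAC ATA TGG CTC TCC TTT CTA TCG",
    "10-BamHI": "TCT AGA ACT AGT GGA TCC CCC GGG CTG CAG",
}


def enzyme_of(primer_name: str) -> str:
    """Extract the enzyme from a primer name, e.g. '2.2-BamHI' gives 'BamHI'."""
    return primer_name.split("-", 1)[1]


def site_offset(primer_name: str, primer: str) -> int:
    """0-based offset of the named site in the primer, or -1 if it is absent."""
    seq = primer.replace(" ", "").upper()
    return seq.find(SITES[enzyme_of(primer_name)])


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        offset = site_offset(name, primer)
        status = f"found at offset {offset}" if offset >= 0 else "NOT FOUND"
        print(f"{name}: {SITES[enzyme_of(name)]} {status}")
```

All six primers contain the site indicated by their names. The 10-BamHI primer also carries a SmaI site (CCCGGG), so the enzyme named in the primer label, rather than a blind site search, determines which cut is used for ligation.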
For expression, bacterial strains E. coli XLOLR/pAG3.18,
E. coli XLOLR/pAG3.48, E. coli XL1-Blue/pGAG2.2,
E. coli XL1-Blue/pGAG3.18, and E. coli
BL21(DE3)/pSBAG10 were grown to A600 = 0.5 at
37 °C in 5 ml of LB medium (27) supplemented with 100 µg of
ampicillin/ml or 30 µg of kanamycin/ml as determined by the vector.
Cultures were then induced by addition of 1 mM
isopropyl-1-thio-β-D-galactopyranoside and grown for
another 12 h at 20 °C. Cells were harvested by centrifugation (2000 × g, 10 min) and resuspended in either 1 ml of
monoterpene synthase assay buffer (50 mM Tris/HCl (pH 7.5),
500 mM KCl, 1 mM MnCl2, 5 mM dithiothreitol, 0.05% (w/v) NaHSO3, and
10% (v/v) glycerol), 1 ml of sesquiterpene synthase assay buffer (10 mM dibasic potassium phosphate, 1.8 mM
monobasic potassium phosphate (pH 7.3), 140 mM NaCl, 10 mM MgCl2, 5 mM dithiothreitol,
0.05% (w/v) NaHSO3, and 10% (v/v) glycerol), or 1 ml of
diterpene synthase assay buffer (30 mM Hepes (pH 7.2), 7.5 mM MgCl2, 5 mM dithiothreitol, 10 µM MnCl2, 0.05% (w/v) NaHSO3,
and 10% (v/v) glycerol). Cells were disrupted by sonication
(Braun-Sonic 2000 with microprobe at maximum power for 15 s at
0-4 °C); the homogenates were cleared by centrifugation
(18,000 × g, 10 min), and 1 ml of the resulting supernatant was assayed for monoterpene synthase activity with 2.5 µM [1-3H]geranyl diphosphate, for
sesquiterpene synthase activity with 3.5 µM
[1-3H]farnesyl diphosphate, or for diterpene synthase
activity with 5 µM [1-3H]geranylgeranyl
diphosphate following standard protocols (11, 21, 26). In the case of
the monoterpene synthase and sesquiterpene synthase assays, the
incubation mixture was overlaid with 1 ml of pentane to trap volatile
products. In all cases, after incubation at 31 °C for 2 h, the
reaction mixture was extracted with pentane (3 × 1 ml), and the
combined extract was passed through a 1.5-ml column of anhydrous
MgSO4 and silica gel (Mallinckrodt 60 Å) to provide the
terpene hydrocarbon fraction free of oxygenated metabolites. The
columns were subsequently eluted with 3 × 1 ml of ether to collect any oxygenated products, and an aliquot of each fraction was
taken for liquid scintillation counting to determine conversion rate.
To obtain sufficient product for analysis by radio-GLC, chiral capillary GLC, and GLC-MS, preparative-scale enzyme incubations were carried out. Thus, the enzyme was prepared from 50 ml of cultured bacterial cells by extraction with 3 ml of assay buffer as above, and the extracts were incubated with excess substrate overnight at 31 °C. The hydrocarbon fraction was isolated by elution through MgSO4-silica gel as before, and the pentane eluate was concentrated for evaluation by capillary radio-GLC as described (30), by chiral column capillary GLC (5), and by combined GLC-MS (Hewlett-Packard 6890 GC-MSD with cool (40 °C) on-column injection, detection via electron impact ionization (70 eV), helium carrier at 0.7 p.s.i., column: 0.25-mm inner diameter × 30-m fused silica with 0.25-µm film of 5MS (Hewlett-Packard) programmed from 35 °C (5 min hold) to 230 °C at 5 °C/min).
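The length of the GLC-MS oven program follows from the stated temperatures and ramp rate. A minimal sketch, assuming a single linear ramp with no final hold (none is mentioned in the text); the function name is illustrative.

```python
def oven_program_minutes(start_c: float, end_c: float,
                         ramp_c_per_min: float,
                         initial_hold_min: float = 0.0) -> float:
    """Duration of an initial hold followed by one linear temperature ramp."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return initial_hold_min + (end_c - start_c) / ramp_c_per_min


if __name__ == "__main__":
    # 35 °C (5 min hold) to 230 °C at 5 °C/min, as described above.
    print(oven_program_minutes(35, 230, 5, initial_hold_min=5))
```

Under these assumptions each injection occupies the oven for 44 min, excluding cool-down and re-equilibration.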
Sequence Analysis. Inserts of all recombinant Bluescript plasmids, pAG1.28, pAG2.2, pAG3.18, pAG3.48, pAG4.30, pAG5.9, and pAG10, and inserts of all recombinant pGEX plasmids, pGAG2.2, pGAG3.18, and pSBAG10, were completely sequenced on both strands via primer walking and nested deletions (27) using the DyeDeoxy Terminator Cycle Sequencing method (Applied Biosystems). Sequence analysis was done using the Wisconsin Package version 9.0, Genetics Computer Group (GCG), Madison, WI.
RNA Extraction and Northern Blotting. Grand fir sapling stem
tissue was harvested prior to wounding or 2 days after wounding by a
standard procedure (18). Total RNA was isolated (31), and 20 µg of
RNA per gel lane was separated under denaturing conditions (27) and
transferred to nitrocellulose membranes (Schleicher and Schuell)
according to the manufacturer's protocol. To prepare hybridization
probes, cDNA fragments of 1.4-1.5 kb were amplified by PCR from
ag2.2 with primer JB29 (5′-CTA CCA TTC CAA TAT CTG-3′) and
primer 2-8 (5′-GTT GGA TCT TAG AAG TTC CC-3′), from ag3.18
with primer 3-9 (5′-TTT CCA TTC CAA CCT CTG GG-3′) and primer 3-11
(5′-CGT AAT GGA AAG CTC TGG CG-3′), and from ag10 with
primer 7-1 (5′-CCT TAC ACG CCT TTG GAT GG-3′) and primer 7-3
(5′-TCT GTT GAT CCA GGA TGG TC-3′). The probes were randomly labeled with
[α-32P]dATP (28). Blots were hybridized for 24 h
at 55 °C in 3 × SSPE and 0.1% SDS, washed at 55 °C in
1 × SSPE and 0.1% SDS, and subjected to autoradiography as
described above at −80 °C for 24 h.
Grand fir has been developed as a model system for the study of induced oleoresin production in conifers in response to wounding and insect attack (1, 2, 7, 10, 32). The chemistry and biosynthesis of the oleoresin monoterpenes, sesquiterpenes, and diterpenes have been well defined (5, 8, 9, 18, 19, 21, 33); however, structural analysis of the responsible terpene synthases as well as studies on the regulation of oleoresinosis require the isolation of cDNA species encoding the terpene synthases. Protein purification from conifers, as the basis for cDNA isolation, has been of limited success (22) and thus far has not permitted cloning of any of the monoterpene synthases from these species (10).
As a possible alternative to protein-based cloning of terpene
synthases, a homology-based PCR strategy was recently proposed (10)
that was founded upon the three terpene synthases of plant origin then
available, a monoterpene synthase, (−)-(4S)-limonene synthase, from spearmint (M. spicata, Lamiaceae) (17), a
sesquiterpene synthase, 5-epi-aristolochene synthase, from
tobacco (N. tabacum, Solanaceae) (23), and a diterpene
synthase, casbene synthase, from castor bean (Ricinus
communis, Euphorbiaceae) (24). Despite the taxonomic distances
between these three angiosperm species and the differences in substrate
utilized, reaction mechanism, and product type of the three enzymes, a
comparison of the deduced amino acid sequences identified several
conserved regions that appeared to be useful for the design of
degenerate PCR primers (see Fig. 2 and "Experimental Procedures").
Using cDNA from a wound-induced grand fir stem library as template,
one set of primers (C and D) PCR-amplified products corresponding to
four distinct sequence groups, all of which showed significant
similarity to sequences of cloned terpene synthases of plant origin.
The four different inserts were designated as probes 1, 2, 4, and 5 and were employed for isolation of the corresponding cDNA clones by plaque hybridization.
Screening of 10⁵ cDNA phage plaques from the wounded
grand fir stem library, with each of the four probes, yielded a 4-fold difference in the number of positives (see "Experimental
Procedures"), most likely reflecting different levels of expression
of the corresponding genes. Size-selected inserts (>1.5 kb) of
purified and in vivo excised clones were partially sequenced
from both ends and were shown to segregate into four distinct groups
corresponding to the four hybridization probes. Since all cDNAs
corresponding to probes 1, 4, and 5 were truncated at their 5′-ends,
only inserts of the largest representatives of each group, clone
ag1.28, clone ag2.2 (apparently full length),
clone ag4.30, and clone ag5.9, were completely
sequenced. The sequences of clone ag1.28 (2414 bp, with an
ORF of 2350 nt encoding 782 amino acids), clone ag2.2 (2196 bp, with an ORF of 1881 nt encoding 627 amino acids), clone ag4.30 (2979 bp, with an ORF of 1731 nt encoding 577 amino
acids), and clone ag5.9 (1394 bp, with an ORF of 1194 nt
encoding 398 amino acids) were compared pairwise with each other and
with other cloned plant terpene synthases (Fig.
3). Truncated clone ag1.28 resembled most closely in size and sequence (72% similarity, 49% identity) a diterpene cyclase, abietadiene synthase, from grand fir
(22). Clones ag4.30 and ag5.9 share approximately
80% similarity (60% identity) at the amino acid level and are almost
equally distant from both clone ag1.28 and full-length clone
ag2.2 (range of 65-70% similarity and 45-47% identity);
the amino acid sequence similarity between ag1.28 and
ag2.2 is 65% (41% identity). Considering the high level of
homology between ag4.30 and ag5.9, these
comparisons suggest that the four new cDNAs, ag1.28,
ag2.2, ag4.30, and ag5.9, represent
the three major subfamilies of grand fir terpene synthase genes (Fig.
3) encoding monoterpene synthases, sesquiterpene synthases, and
diterpene synthases. Isolation of full-length clones corresponding to
ag1.28, ag4.30, and ag5.9, by
employing PCR-based rapid amplification of cDNA ends, and
functional identification of ag4.30 and ag5.9 as
two new sesquiterpene synthases will be described
elsewhere.2,3
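The reported ORF lengths and residue counts can be cross-checked with simple codon arithmetic. The sketch below assumes that the ORF length is quoted without the stop codon, so that a consistent entry satisfies nt = 3 × residues; this convention is inferred from the numbers and is not stated in the text.

```python
# (insert length in bp, ORF length in nt, encoded residues) as reported above.
CLONES = {
    "ag1.28": (2414, 2350, 782),
    "ag2.2": (2196, 1881, 627),
    "ag4.30": (2979, 1731, 577),
    "ag5.9": (1394, 1194, 398),
}


def orf_consistent(orf_nt: int, residues: int) -> bool:
    """True when the ORF length equals three nucleotides per residue."""
    return orf_nt == 3 * residues


def orf_discrepancy(orf_nt: int, residues: int) -> int:
    """Nucleotides left unexplained by the residue count (0 when consistent)."""
    return orf_nt - 3 * residues


if __name__ == "__main__":
    for clone, (insert_bp, orf_nt, residues) in CLONES.items():
        flag = "ok" if orf_consistent(orf_nt, residues) else (
            f"off by {orf_discrepancy(orf_nt, residues)} nt")
        print(f"{clone}: insert {insert_bp} bp, ORF {orf_nt} nt, "
              f"{residues} residues -> {flag}")
```

Under this assumption ag2.2, ag4.30, and ag5.9 are internally consistent, whereas the figures quoted for the truncated clone ag1.28 differ by 4 nt (2350 versus 3 × 782 = 2346). The sketch only flags the difference and does not establish its cause.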
Identification of cDNA Clone ag2.2 as Myrcene Synthase
The pAG2.2 insert appeared to be a full-length clone encoding a protein of molecular weight 72,478 with a calculated pI at 6.5 (Fig. 2). The size of the translated protein encoded by ag2.2 (627 residues) is in the range of the monoterpene synthase preproteins for limonene synthase from spearmint (17) and Perilla frutescens (34) but is about 240 amino acids shorter than the two gymnosperm diterpene synthase preproteins for abietadiene synthase (22) and taxadiene synthase (35). Monoterpene and diterpene biosyntheses are compartmentalized in plastids, whereas sesquiterpene biosynthesis is cytosolic (reviewed in Refs. 36-38); thus, monoterpene and diterpene synthases are encoded as preproteins bearing an amino-terminal transit peptide for import of these nuclear gene products into plastids where they are proteolytically processed to the mature forms (39, 40). Both the size of the deduced protein and the presence of an amino-terminal domain (of 60-70 amino acids) with features characteristic of a targeting sequence (rich in serine residues (16-18%) and low in acidic residues (four Asp or Glu) (39, 40)) suggest that ag2.2 encodes a monoterpene synthase rather than a sesquiterpene synthase or a diterpene synthase.
Since pAG2.2 contained the terpene synthase insert in reversed
orientation, the ORF was subcloned in frame with glutathione S-transferase, for ultimate ease of purification (41, 42), into pGEX-4T-2, yielding plasmid pGAG2.2. The recombinant fusion protein was expressed in E. coli strain XL1-Blue/pGAG2.2 and
then extracted and assayed for monoterpene synthase, sesquiterpene synthase, and diterpene synthase activity using tritium-labeled geranyl
diphosphate, farnesyl diphosphate, and geranylgeranyl diphosphate as
the respective substrate. Enzymatic production of a terpene olefin was
observed only with geranyl diphosphate as substrate, and the only
product was shown to be myrcene (Fig. 1) by radio-GLC and GLC-MS
comparison to an authentic standard (Fig.
4). Bacteria transformed with pGEX vector
containing the ag2.2 insert in antisense orientation did not
afford detectable myrcene synthase activity when the cells were induced and the
protein was isolated and assayed as above. A myrcene synthase cDNA
has not been obtained previously from any source, although myrcene is a
minor co-product (2%) of the native and recombinant limonene synthase
from spearmint (16, 17) and of several enzymes from sage (15). cDNA
cloning and functional expression of myrcene synthase, which is one of several wound-inducible monoterpene synthase activities of grand fir
(18), demonstrates that this acyclic monoterpene is formed by a
distinct enzyme and is not a co-product of another synthase.
Identification of cDNA Clone ag3.18 as (−)-Pinene Synthase
Alignment of the four new terpene synthase cDNA sequences (ag1.28, ag2.2, ag4.30, and ag5.9) and that for abietadiene synthase (22) allowed the identification of several conserved sequence motifs among this enzyme family from grand fir, which provided the foundation for an extended similarity based cloning approach. Two new sense primers E and F were designed according to conserved sequence elements, whereas a degenerate antisense primer G was designed based upon very limited amino acid sequence information from pinene synthase4 (see Fig. 2 and "Experimental Procedures"). Only the combination of primers E and G amplified a specific product of 1022 bp, which was designated as probe 3.
Hybridization of 10⁵ grand fir λZAP II cDNA clones
with probe 3 yielded two types of signals, comprising about 400 strongly positive clones and an equal number of weak positives,
indicating that the probe recognized more than one type of cDNA.
Thirty-four of the former clones and 18 of the latter were purified,
the inserts were selected by size (2.0-2.5 kb), and the in
vivo excised clones were partially sequenced from both ends. Those
clones that afforded weak hybridization signals were shown to contain
inserts that were either identical to myrcene synthase clone
ag2.2 or exhibited no significant sequence similarity to
terpene synthases. Clone pAG3.48 contained the myrcene synthase ORF in
the correct orientation and in frame for expression from the Bluescript
plasmid vector. This cDNA was functionally expressed in E. coli, and the resulting enzyme was shown to accept only geranyl
diphosphate as the prenyl diphosphate substrate and to produce myrcene
as the exclusive reaction product. This finding with pAG3.48 confirms
that expression as the glutathione S-transferase fusion from
pGAG2.2 does not influence substrate utilization or product outcome of
the myrcene synthase.
Clones that gave strong hybridization signals segregated into distinct sequence groups represented by clone ag3.18 (2018-bp insert with ORF of 1884 nt; encoded protein of 628 residues at 71,505 Da and pI of 5.5) and ag10 (2084-bp insert with ORF of 1911 nt; encoded protein of 637 residues at 73,477 Da and pI of 6.4) (Fig. 2). ag3.18 and ag10 form a subfamily together with the myrcene synthase clone ag2.2 that is characterized by a minimum of 79% pairwise similarity (64% identity) at the amino acid level (Fig. 3). Like myrcene synthase, both ag3.18 and ag10 encode amino-terminal sequences of 60-70 amino acids that are rich in serine (19-22 and 11-15%, respectively) and low in acidic residues (4 and 2 residues, respectively) (Fig. 2) characteristic of plastid transit peptides (39, 40).
Plasmid pAG3.18 contained the presumptive terpene synthase ORF in frame
for direct expression from the Bluescript plasmid, whereas the
ag10 ORF was in reversed orientation. Both ag3.18 and ag10 were subcloned into expression vectors yielding
plasmids pGAG3.18 and pSBAG10. Recombinant proteins were expressed in
bacterial strain E. coli XLOLR/pAG3.18, E. coli
XL1-Blue/pGAG3.18, and E. coli BL21(DE3)/pSBAG10. When
extracts of the induced cells were tested for terpene synthase activity
with all of the potential prenyl diphosphate substrates, only geranyl
diphosphate was utilized. Extracts from E. coli
BL21(DE3)/pSBAG10 converted geranyl diphosphate to limonene as the
major product with lesser amounts of α-pinene, β-pinene, and
β-phellandrene, as determined by radio-GLC and combined GLC-MS (Fig.
5). Chiral phase capillary GLC on
β-cyclodextrin revealed the limonene product to be the
(−)-(4S)-enantiomer and the pinene products to be the
related (−)-(1S,5S)-enantiomers. Although
optically pure standards were not available for the analysis, stereochemical considerations suggest that the minor product
β-phellandrene is also the mechanistically related
(−)-(4S)-antipode (13, 14, 20, 43, 44). Similar analysis of
the monoterpene products generated from geranyl diphosphate by
cell-free extracts of E. coli XLOLR/pAG3.18 and E. coli XL1-Blue/pGAG3.18 demonstrated the presence of a 42:58%
mixture of α-pinene and β-pinene (Fig. 6), the same product ratio previously
described for the purified, native (−)-pinene synthase from grand fir
(19). Chiral phase capillary GLC confirmed the products of the
recombinant pinene synthase to be the
(−)-(1S,5S)-enantiomers, as expected. No other monoterpene co-products were detected with the recombinant (−)-pinene synthase, as observed previously for the native enzyme (19).
Evidence for the formation of both α- and β-pinene by a single
enzyme has been previously provided through co-purification studies,
and differential inhibition and inactivation studies, as well as by
isotopically sensitive branching experiments (13, 20, 45, 46). The
cDNA cloning of pinene synthase provides the ultimate proof that a
single enzyme forms both products. The calculated molecular weight of
the (−)-pinene synthase deduced from ag3.18 is
approximately 64,000 (excluding the putative transit peptide), which
agrees well with the molecular weight of 63,000 established for the
native enzyme from grand fir by gel permeation chromatography and
SDS-polyacrylamide gel electrophoresis (19).
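As a rough consistency check on these figures, the following sketch takes the preprotein length and mass given in the abstract for pinene synthase (628 residues, 71.5 kDa) and estimates the mean residue mass, then the transit peptide length implied by a mature protein of about 64,000. The function and constant names are illustrative. The assumption that the transit peptide has the same average residue mass as the rest of the protein is made only for this illustration and is not a measured value.

```python
# Back-of-envelope estimate; all names here are illustrative.
PREPROTEIN_RESIDUES = 628     # pinene synthase preprotein length (from the abstract)
PREPROTEIN_MASS_DA = 71_500   # deduced preprotein mass (from the abstract)
MATURE_MASS_DA = 64_000       # approximate mass without the putative transit peptide


def mean_residue_mass(total_mass_da: float, n_residues: int) -> float:
    """Average mass per residue of the preprotein, in daltons."""
    return total_mass_da / n_residues


def implied_transit_peptide_length(
    preprotein_mass_da: float, mature_mass_da: float, residue_mass_da: float
) -> float:
    """Residues removed, ASSUMING the peptide has the average residue mass."""
    return (preprotein_mass_da - mature_mass_da) / residue_mass_da


residue_mass = mean_residue_mass(PREPROTEIN_MASS_DA, PREPROTEIN_RESIDUES)
transit_len = implied_transit_peptide_length(
    PREPROTEIN_MASS_DA, MATURE_MASS_DA, residue_mass
)
print(f"mean residue mass: {residue_mass:.1f} Da")
print(f"implied transit peptide: about {transit_len:.0f} residues")
```

An estimate of this kind only indicates the order of magnitude of the targeting sequence; the actual cleavage site would have to be determined experimentally.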
A limonene synthase cDNA has thus far been cloned only from two
very closely related angiosperm species (17, 34), and the isolation of
a pinene synthase cDNA has not been reported before. Pinene
synthase has previously received considerable attention as a major
defense-related monoterpene synthase in conifers (18, 19). In the grand
fir cDNA library, which was synthesized from mRNA obtained from
wound-induced sapling stems, clones corresponding to pinene synthase
are at least 10 times more abundant than clones for myrcene synthase.
This finding reflects the relative proportions of the induced levels of
activities of these enzymes in grand fir saplings; pinene synthase and
limonene synthase are the major monoterpene synthase activities,
whereas the induced level of myrcene synthase activity is relatively
low (18). The cDNAs for inducible monoterpene synthases provide
probes for genetic and molecular analysis of oleoresin-based defense in
conifers. Northern blots (Fig. 7) of
total RNA extracted from non-wounded sapling stems and from stems 2 days after wounding (when enzyme activity first
appears)5 were probed with
cDNA fragments for ag2.2, ag3.18, and
ag10 and thus demonstrated that increased mRNA
accumulation for monoterpene synthases is responsible for this induced,
defensive response in grand fir. The availability of cloned,
defense-related monoterpene synthases presents several possible avenues
for transgenic manipulation of oleoresin composition to improve tree
resistance to bark beetles and other pests. For example, altering the
monoterpene content of oleoresin may chemically disguise the host and
decrease insect aggregation by changing the levels of pheromone
precursors or predator attractants, or lower infestation by increasing
toxicity toward beetles and their pathogenic fungal associates (1, 2, 6).
Properties of the Recombinant Monoterpene Synthases
All three recombinant enzymes require Mn2+ for activity, and Mg2+ is essentially ineffective as the divalent metal ion cofactor. This finding confirms earlier results obtained with the native monoterpene synthases of grand fir and lodgepole pine (Pinus contorta) (19, 47). All terpene synthases and prenyltransferases are thought to employ a divalent metal ion, usually Mg2+ or Mn2+, in the ionization steps of the reaction sequence to neutralize the negative charge of the diphosphate leaving group (12, 48, 49), and all relevant sequences thus far obtained bear a conserved aspartate-rich element (DDXXD) considered to be involved in divalent metal ion binding (22, 50-54). In addition to this strict, general dependence on a divalent metal ion, the monoterpene synthases of conifers are unique in their further requirement for a monovalent cation (K+), a feature that distinguishes the gymnosperm monoterpene synthases from their counterparts from angiosperm species and implies a fundamental structural and/or mechanistic difference between these two families of catalysts (47). All three recombinant monoterpene synthases depend upon K+, with maximum activity achieved at approximately 500 mM KCl. A requirement for K+ has been reported for a number of different types of enzymes, including those that catalyze phosphoryl cleavage or transfer reactions (55) such as Hsc70 ATPase (56). The crystal structure of bovine Hsc70 ATPase indicates that both Mg2+ and K+ interact directly with phosphate groups of the substrate and implicates three active site aspartate residues in Mg2+ and K+ binding (56), reminiscent of the proposed role of the conserved DDXXD motif of the terpene synthases and prenyltransferases in divalent cation binding, a function also supported by recent site-directed mutagenesis (57-60) and by x-ray structural analysis (52) of farnesyl diphosphate synthase.
cDNA cloning and functional expression of the myrcene, limonene, and pinene synthases from grand fir represent the first example of the isolation of multiple synthase genes from the same species and provide tools for evaluation of structure-function relationships in the construction of acyclic, monocyclic, and bicyclic monoterpene products and for detailed comparison to catalysts from phylogenetically distant plants that carry out ostensibly identical reactions (13, 16, 17, 61). The recent acquisition of cDNA isolates encoding sesquiterpene synthases2 and diterpene synthases (22) from grand fir should, together with the monoterpene synthases, also permit addressing the structural basis of chain length specificity for prenyl diphosphate substrates in this family of related enzymes.
Sequence Comparison and a Proposed Gene Nomenclature
Previous studies based on substrate protection from inactivation with selective amino acid modifying reagents have implicated functionally important cysteine, histidine, and arginine residues in a range of different monoterpene synthases (16, 19, 47, 62, 63). Sequence alignment of 21 terpene synthases of plant origin (17, 22-24, 34, 35, 64-68) reveals two absolutely conserved arginine residues, corresponding to Arg184 and Arg365 of pinene synthase (Fig. 2), one highly conserved cysteine residue (pinene synthase Cys543), and one highly conserved histidine residue (pinene synthase His186). The DDXXD sequence motif (pinene synthase Asp379, Asp380, and Asp383) (Fig. 2) is absolutely conserved in all relevant plant terpene synthases, as are several other amino acid residues corresponding to Phe198, Leu248, Glu322, Trp329, Trp460, and Pro467 of pinene synthase.
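Locating the DDXXD element in a deduced amino acid sequence can be sketched with a simple pattern search. The example below is a minimal illustration, not the procedure used in this work: the function name `find_ddxxd` is hypothetical, and the test fragment is an invented string rather than a grand fir sequence. A lookahead pattern is used so that overlapping occurrences are all reported.

```python
import re

# Lookahead so that overlapping occurrences of the motif are all reported.
DDXXD = re.compile(r"(?=(DD..D))")


def find_ddxxd(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched pentapeptide) for each DDXXD."""
    return [(m.start() + 1, m.group(1)) for m in DDXXD.finditer(sequence.upper())]


# Hypothetical fragment for illustration only; NOT a grand fir sequence.
example = "MKLIVDDIYDVAGTLDDDFDE"
for start, motif in find_ddxxd(example):
    # The aspartates sit at start, start + 1, and start + 4, the same spacing
    # as Asp379, Asp380, and Asp383 in pinene synthase.
    print(start, motif, (start, start + 1, start + 4))
```

Positions are reported 1-based so that they can be compared directly with residue numbers quoted in the text.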
Amino acid sequences of the plant terpene synthases were compared with each other and with the deduced sequences of several sesquiterpene synthases cloned from microorganisms (54, 69, 70). As with all other plant terpene synthases, no significant conservation in primary sequence exists between the monoterpene synthases from grand fir and the terpene synthases of microbial origin, except for the DDXXD sequence motif previously identified as a common element of all terpene synthases, and prenyltransferases that employ a related electrophilic reaction mechanism (45, 51, 71). The evidence is presently insufficient to determine whether extant plant and microbial terpene synthases represent divergent evolution from a common ancestor, which may also have given rise to the prenyltransferases, or whether these similar catalysts evolved convergently.
Comparative levels of identity among the known plant terpene synthases (17, 22-24, 34, 35, 64-68) are indicated in Fig. 3. This family of genes, which we suggest be denoted by the three-letter designation tps (terpene synthase), currently consists of six sub-groups with at least 40% amino acid identity between members. The sub-groups are ordered tpsa through tpsf, corresponding to the priority of publication of a member sequence. Interestingly, the monoterpene synthases, sesquiterpene synthases, and diterpene synthases from gymnosperms (tpsd), including taxadiene synthase from Pacific yew (Taxus brevifolia) (35), are more closely related to each other than to their respective counterparts from angiosperms (Fig. 3). This pattern of segregation, which implies limited evolutionary change in these gymnosperm catalysts, is noteworthy since fossilized oleoresin (amber) dating from the Carboniferous period (72, 73) indicates that gymnosperm terpene synthases had undergone functional specialization over 300 million years ago.
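The grouping criterion (at least 40% amino acid identity between members) can be illustrated with a short sketch that computes pairwise identity over an existing alignment and then clusters the sequences. Several choices here are assumptions for illustration only: identity is counted over columns in which neither sequence has a gap, grouping is by single linkage (a sequence joins a group if it reaches the threshold with any member), and the sequence names and toy alignment are placeholders. The method actually used to derive the identities in Fig. 3 is not restated here and may differ.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return 100.0 * sum(x == y for x, y in compared) / len(compared)


def group_by_identity(aligned: dict[str, str], threshold: float = 40.0) -> list[set[str]]:
    """Single-linkage grouping: join two sequences if identity >= threshold."""
    parent = {name: name for name in aligned}

    def find(name: str) -> str:
        # Follow parent links up to the representative of the group.
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(aligned, 2):
        if percent_identity(aligned[a], aligned[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in aligned:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical toy alignment; names and sequences are placeholders.
toy = {
    "seq1": "MDDLYDAGTW",
    "seq2": "MDDLYDSGTW",
    "seq3": "MDD-YDSGSW",
    "seq4": "KQRPHEVNCI",
}
print(group_by_identity(toy))
```

Single linkage is the most permissive reading of the criterion; requiring every pair within a group to reach the threshold (complete linkage) would be stricter and could split groups that single linkage merges.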
The nucleotide sequences reported in this paper have been submitted to the GenBank™/EMBL Data Bank with accession numbers U87908, U87909, and AF006193.
We thank Eva Katahira, Hiroko Ichii, David Williams, Michael Phillips, and Thomas Savage for technical assistance; Gerhard Munske of the Washington State University Laboratory for Bioanalysis and Biotechnology for primer synthesis and nucleotide sequencing; Douglas J. McGarvey for guidance in developing the gene nomenclature; and Joyce Tamura-Brown for typing the manuscript.