(Received for publication, November 17, 1994; and in revised form, January 30, 1995 )
Expression of plant tetrapyrroles is high in photosynthetic
tissues and in legume root nodules in the form of chlorophyll and heme,
respectively. The universal tetrapyrrole precursor δ-aminolevulinic
acid (ALA) is synthesized from glutamate 1-semialdehyde (GSA) by GSA
aminotransferase in plants, which is encoded by gsa.
Immunoblot analysis showed that GSA aminotransferase was expressed in
soybean leaves and nodules, but not in roots, and that protein
correlated with enzyme activity. These observations indicate that GSA
aminotransferase expression is controlled in tetrapyrrole formation and
argue against significant activity of an enzyme other than the well
described aminotransferase for GSA-dependent ALA formation. gsa mRNA and protein were induced in soybean nodules, and their
activation was temporally intermediate between those of the respective
early and late genes enod2 and lb. A GSA
aminotransferase gene, designated gsa1, was isolated and
appears to be one of two gsa genes in the soybean genome. gsa1 mRNA accumulated to high levels in leaves and nodules,
but not in uninfected roots as discerned with a gsa1-specific
probe. Message levels were higher in leaves from etiolated plantlets
than in mature plants, and expression in the former was slightly
elevated by light. The expression pattern of gsa1 mRNA was
qualitatively similar to that of total gsa. The data strongly
suggest that gsa1 is a universal tetrapyrrole synthesis gene
and that a gsa gene specific for a tissue, tetrapyrrole, or
light condition is unlikely. The gsa1 promoter contained a
genetic element found in numerous Drosophila melanogaster genes; the so-called GAGA element displayed single-stranded
character in vitro and formed a complex with nuclear factors
from nodules and leaves but not from roots. From these observations we
infer that the GAGA element is involved in the transcriptional control
of gsa1.
The tetrapyrroles chlorophyll, heme, siroheme, and bilins are
expressed in plants for participation in numerous cellular processes,
and they are synthesized from the universal tetrapyrrole precursor ALA
(reviewed in (1) ). Chlorophyll is the most abundant
tetrapyrrole in plants, and the bulk of ALA synthesized in
photosynthetic tissues is incorporated into the chlorophyll ring.
Accordingly, some evidence shows that ALA formation in those tissues is
controlled by, or coordinated with, factors related to
photosynthesis(2, 3, 4, 5, 6, 7) .
Glutamate-dependent ALA formation occurs in plants by a
three-step mechanism termed the C5 pathway; the latter two
steps are committed to ALA synthesis and are catalyzed by glutamyl-tRNA
reductase and glutamate 1-semialdehyde (GSA) aminotransferase,
respectively(1, 8) . Plant cDNA or genes encoding
C5 pathway enzymes have been isolated from several sources (7, 9, 10) and
from a green alga(6) . Radiolabel from
[1-14C]glutamate is incorporated into
mitochondrial heme a as well as into plastid chlorophyll in etiolated
seedlings of maize which, along with an absence of ALA synthase
activity, indicates that higher plants use the C5 pathway
for synthesis of all tetrapyrroles, at least in photosynthetic
tissue(11) .
Legume root nodules are specialized plant
organs elicited by rhizobia bacteria that contain a large quantity of
heme for the prosthetic group of plant hemoglobin (reviewed in (12) ), but lack chlorophyll. A soybean glutamate-dependent ALA
synthesis activity is induced in nodules(13, 14) , as
well as other genes encoding heme pathway enzymes (15, 16) , but no plant ALA synthase activity is
detectable. In addition, soybean nodule cDNA encoding GSA
aminotransferase was isolated, and both enzyme activity and mRNA are
induced in the symbiotic tissue(10) . These data strongly
support the universality of the C5 pathway in higher plants
with respect to the tetrapyrrole formed and its distribution in tissues
where ALA formation can be discerned.
It is not certain whether the
C5 pathway itself is heterogeneous with regard to the
enzymes that catalyze a given step or to the number of functional genes
that encode an enzyme. Evidence for two enzymes with glutamyl-tRNA
reductase activity has been described in two bacterial
species(17, 18, 19) , and although no similar
situation has been reported in plants, separate ALA pools for heme and
chlorophyll synthesis have been proposed in plant
chloroplasts(20) . A plant gene and cDNA encoding glutamyl-tRNA
reductase (hemA) have been isolated from Arabidopsis
thaliana(7) , and multiple copies of hemA are
inferred. Two genes encoding GSA aminotransferase have also been
isolated from A. thaliana with high homology in their
exons(7) .
Thus, enzymes of the C5
pathway may be encoded by gene families in plants, but whether a
particular gene within a family has specificity for a tissue,
tetrapyrrole, or developmental state is not known. This question is
difficult to address in green tissues because heme, although
functionally important, is quantitatively a minor tetrapyrrole in most
plants. However, symbiotic root nodules are not only unique organs
within the plant kingdom, but also have high ALA synthetic activity that is
not fated to chlorophyll formation.
In the present work, we isolate
soybean gsa1, one of two gsa genes, and provide
evidence that it is involved in the synthesis of cellular tetrapyrroles
in leaves and root nodules, thereby demonstrating the universality of a
C5 pathway gene in a higher plant. In addition, we show that gsa1 is regulated, and present evidence for a cis-acting
regulatory element in the gsa1 promoter that has heretofore
been described only in Drosophila. Finally, we argue that the
primary structure of extant plant gsa genes results from
recent evolutionary events.
Figure 1: Overexpression of nodule gsa cDNA and detection of GSA aminotransferase protein and enzyme activity in soybean fractions. A, gsa cDNA was overexpressed in E. coli, and a Coomassie-stained SDS-PAGE gel of the purified product from inclusion bodies is shown (inset). GSA aminotransferase activity measured in the inclusion body fraction is shown as ALA formed from GSA/mg of protein as a function of time. B, Western blot analysis of GSA aminotransferase and enzyme activity in leaves of etiolated plants (L), uninfected roots (R), nodules (N), and the nodule bacteroid fraction (B). For the Western blot, 20 µg of protein was loaded per lane, and antibodies were raised against the protein purified in A. GSA aminotransferase activity is expressed as nmol ALA formed in 20 min/mg of protein.
For gel retardation assays, the
oligonucleotides (dG-dA) and (dT-dC) were
synthesized at the core facility at the State University of New York at
Buffalo and annealed to form double-stranded DNA. The DNA was then
end-labeled with 32P using [γ-32P]dATP and polynucleotide kinase.
Gel retardation assays were carried out as described
previously(34) . 0.5 ng (50 fmol) of radiolabeled probe (
6
10
becquerels) and 1500 ng of the unlabeled
nonspecific competitor DNA (dI-dC)(dI-dC)
were used per reaction, along with 5 µg of nuclear extract
from leaves, roots, or nodules. The samples were run on nondenaturing
12% polyacrylamide gels and subsequently developed by autoradiography.
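The probe and competitor amounts given above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the quantities stated in the text (0.5 ng and 50 fmol of probe, 1500 ng of competitor); the function and variable names are illustrative and are not part of the original protocol.

```python
def implied_molar_mass(mass_ng: float, amount_fmol: float) -> float:
    """Return the molar mass (g/mol) implied by a mass in ng and an amount in fmol."""
    grams = mass_ng * 1e-9       # ng -> g
    moles = amount_fmol * 1e-15  # fmol -> mol
    return grams / moles


def mass_excess(competitor_ng: float, probe_ng: float) -> float:
    """Return the fold mass excess of competitor DNA over labeled probe."""
    return competitor_ng / probe_ng


# Quantities stated in the text for one binding reaction.
probe_molar_mass = implied_molar_mass(mass_ng=0.5, amount_fmol=50)
competitor_excess = mass_excess(competitor_ng=1500, probe_ng=0.5)

print(f"Implied probe molar mass: {probe_molar_mass:.0f} g/mol")
print(f"Competitor mass excess: {competitor_excess:.0f}-fold")
```

Under these figures the probe has an implied molar mass of about 10,000 g/mol, and the nonspecific competitor is present at a 3000-fold mass excess over the labeled probe.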
Figure 2:
Northern blot analysis of the expression
of nodule mRNAs as a function of nodule age. Approximately 5 µg of
poly(A)+ RNA from uninfected roots (U) and
from nodules 10, 13, and 25 days post-infection were loaded onto each
lane. A single filter was hybridized with each radiolabeled cDNA
separately, and the filter was stripped after each hybridization and
exposure. Ubiquitin (Ubi) was used as a control for a
constitutively expressed gene. Exposure times varied with different
probes, so they cannot be directly compared with each
other.
Figure 3: Western blot analysis of GSA aminotransferase (GSA) and leghemoglobin (LB) protein in nodules as a function of nodule age. Protein corresponding to 2 mg of tissue was loaded per lane and run on a 12% SDS-PAGE gel; protein was transferred to a filter and analyzed with antibodies raised against the respective protein.
Figure 4:
Southern blot analysis of soybean genomic
DNA probed with a 1-kb NcoI/EcoRI nodule gsa cDNA fragment (A) or a 25-base oligonucleotide probe
(primer 1) (B). 25 µg of DNA was digested with EcoRV (R), HincII (H), or EcoRI (E) and run on a 0.7% agarose gel. DNA from the gel was
transferred to nitrocellulose and probed with 32P-labeled
DNA.
Figure 5: Gene structures of soybean gsa1 and A. thaliana gsa1 and gsa2. The striped areas represent exon coding regions and open areas represent introns.
One remarkable difference among the gsa genes from soybean and Arabidopsis is the intron variability with respect to number, size, and relative positions (Fig. 5). The only feature shared among the three genes is the approximate size of the first exon and the position of the 5′ boundary of the first intron. The nucleotide sequences of the coding regions of the three genes are highly homologous (73-83%), as are the peptide sequences (80-90% identical and 88-95% similar); thus, it appears that these genes were derived from a common ancestor. Therefore, the intron variability indicates that plant gsa gene structure has changed recently on an evolutionary time scale, subsequent to the establishment of higher plant lineages.
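Percent identity figures of the kind quoted above are computed from pairwise alignments. The sketch below shows one common convention, in which gapped columns are excluded from the denominator; the text does not state which convention or alignment program was used, so this choice is an assumption, and the two short sequences are invented placeholders, not gsa sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences.

    Assumption: columns containing a gap ('-') in either sequence are
    ignored, which is one of several conventions in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a.upper(), aligned_b.upper()):
        if residue_a == "-" or residue_b == "-":
            continue  # skip gapped columns
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 9 ungapped columns, 8 of them identical.
toy_a = "ATGGCT-ACC"
toy_b = "ATGGCAGACC"
toy_identity = percent_identity(toy_a, toy_b)
print(f"{toy_identity:.1f}% identical")
```

Counting gapped columns as mismatches instead would lower the value, so identity percentages from different sources are comparable only when the convention is the same.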
Figure 6:
Northern blot analysis of gsa1 and total gsa RNA from various soybean tissues.
Poly(A)+ RNA was analyzed from leaves (L),
roots (R), and nodules (N) from 23-day-old plants and
from leaves of dark-grown etiolated plantlets that were either
illuminated (I) or kept in the dark (D) for 24 h
prior to harvest. The RNA was probed either with gsa cDNA to
assess total gsa mRNA or with a 25-base oligonucleotide
(primer 1) specific to gsa1. Exposure times varied with
different probes, thus they cannot be compared with each other in a
quantitative way.
Figure 7: Upstream region of soybean gsa1. The underlined region denotes the translation start site. Asterisks below nucleotides denote transcription start sites. The putative TATA box is shown with a broken underline. The GAGA element is boxed.
Figure 8:
S1 nuclease sensitivity of the GAGA
element. Plasmid-borne DNA containing upstream sequence of gsa1 was used as a template for (A) sequencing reactions or
was (B) treated with S1 nuclease for 0, 16, 32, or 64 min
(lanes 1, 2, 3, 4, respectively) and used as template for synthesis of
a complementary strand. The T7 primer was used resulting in the strand
containing (GA) as the template for the reactions. A and B were exposed for 12 and 48 h, respectively, and are
different portions of the same gel (see "Materials and
Methods").
A synthetic (dG-dA)(dT-dC)
double-stranded DNA was used in gel retardation experiments to
discern nuclear factors from soybean tissues that bind to the GAGA
element. The mobility of GAGA DNA was retarded on polyacrylamide gels
when treated with nuclear extracts from nodules or from leaves of
illuminated etiolated plantlets, but no retarded species resulted from
treatment with root extracts (Fig. 9). The retarded DNA was
discerned as a doublet for both nodule and etiolated leaf extracts (Fig. 9), which was also observed for pure Drosophila GAGA protein complexed with GAGA DNA(41) . In addition,
leaf extracts yielded another feature on the gel that appeared to be a
diffuse shadow rather than a sharp band, and we have not attempted to
interpret that feature. In conclusion, formation of complexes with DNA
was observed only with nuclear extracts from tissues where gsa1 is strongly expressed, which suggests that binding of a nuclear
factor to the GAGA element positively affects the transcription of gsa1.
Figure 9:
Retardation of GAGA DNA mobility in
nondenaturing gels. 32P-end-labeled
(dG-dA)(dT-dC) (50 fmol) was incubated
with 5 µg of nuclear extract from nodules (N), roots (R), or leaves from illuminated etiolated plantlets (L) or was incubated as free probe (F). The samples
were run on a 12% nondenaturing polyacrylamide gel and then assayed by
autoradiography.
ALA formation in root nodules is unique among plants in that
none of the ALA produced there is incorporated into chlorophyll. In
addition, ALA synthesis activity is high relative to that found in
other root cell types and presumably in other nonphotosynthetic
tissues. Finally, synthesis is induced in response to interactions with
a bacterium and should be controlled by factors related to symbiosis
and nodule development rather than to photosynthesis. Despite these
unique aspects of the symbiotic organ, previous
studies(10, 13, 14) and the current work
underscore the similarities between leaves and nodules with respect to
ALA synthesis. Herein, we show that gsa1 is a regulated gene
that is strongly expressed in root nodules and leaves and the data
indicate that a putative second gsa gene is either not
expressed or else has an expression pattern similar to that of gsa1. Thus, it is unlikely that soybean has a gsa gene that is specific to a particular tissue, tetrapyrrole, or an
external stimulus such as light, and therefore the evidence strongly
supports the universality of a step of the C5 pathway at the
genetic level in a higher plant. This implies that a pool of ALA
committed to either chlorophyll or heme as proposed by Huang and
Castelfranco (20) does not have a genetic basis, but may rely
on spatial separation of common enzymes. Multiple hemA genes,
which encode the first committed enzyme glutamyl-tRNA reductase, have
been inferred in A. thaliana (7), but the specificity of a
given gene has not been assessed. In addition, the induction of gsa1 in root nodules implies that vigorous ALA synthesis can be
uncoupled from chloroplast development (see (5) ); hence gsa1 may be affected by separate and independent signal
transduction pathways.
Analysis of GSA aminotransferase mRNA,
protein, and enzyme activity showed that gsa is regulated and
that control occurs at the RNA level (Figs. 1, 2, 3, and 6; (10) ). These observations prompted us to initiate an analysis
of the promoter region of gsa1, which led to the
identification of a DNA element hitherto characterized only in Drosophila. The so-called GAGA element was located immediately
downstream of the putative TATA box, and the pure, plasmid-borne
element was sensitive to S1 nuclease. Because the nuclease-sensitive
GAGA elements in the promoters of Drosophila his3-his4 (44)
and soybean gsa1 have different flanking sequences, the
dinucleotide repeat itself appears to be sufficient for the sensitivity
and does not depend on the context in which the element is found. Gel
mobility shift experiments showed that nuclear extracts from nodules
and from leaves of greening etiolated plantlets contained a factor
which bound to GAGA DNA ((dG-dA)(dT-dC)),
but a GAGA binding factor was not discerned in root extracts. These
data indicate that binding of a nuclear factor to the GAGA element has
a positive effect on transcription, and they underscore similarities in
the regulation of gsa1 in leaves and nodules despite the
specialization of those tissues for different metabolic processes.
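A dinucleotide repeat such as the GAGA element can be located in a promoter sequence with a simple pattern search. The sketch below is illustrative only: the minimum repeat count, the function name, and the toy sequence are assumptions made for the example and are not taken from the gsa1 upstream region or from the methods used in this work.

```python
import re


def find_ga_tracts(sequence: str, min_repeats: int = 4):
    """Return (start, end, repeat_count) for each run of consecutive GA dinucleotides.

    Coordinates are 0-based and end-exclusive. The min_repeats threshold
    is an arbitrary illustrative cutoff.
    """
    pattern = re.compile(r"(?:GA){%d,}" % min_repeats)
    tracts = []
    for match in pattern.finditer(sequence.upper()):
        repeat_count = (match.end() - match.start()) // 2
        tracts.append((match.start(), match.end(), repeat_count))
    return tracts


# Hypothetical toy promoter: a TATA-like motif followed by six GA repeats.
toy_promoter = "TATAAATAC" + "GA" * 6 + "CCATG"
toy_tracts = find_ga_tracts(toy_promoter)
print(toy_tracts)  # → [(9, 21, 6)]
```

The same search run with the pattern `TC` instead of `GA` would report the element on the complementary strand.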
A group of nodule proteins called nodulins, such as leghemoglobin or Enod2, are generally described as being strictly specific to the symbiotic tissue(45) . Other proteins such as soybean phenylalanine ammonia-lyase and chalcone synthase are found throughout the plant, but they are encoded by gene families of which some members are symbiosis-specific (46) . Soybean GSA aminotransferase differs from these other proteins involved in symbiosis in that its presence in nodules must result from regulatory factors that alter the spatial expression of a gene normally expressed strongly only in photosynthetic tissues. Soybean ALA dehydratase, the enzyme that metabolizes ALA directly for tetrapyrrole synthesis, is also induced in nodules, but unlike GSA aminotransferase, the dehydratase is controlled at the level of protein synthesis or turnover (16) . It is not clear why two enzymes of the same pathway should be regulated by separate mechanisms, but we note that porphobilinogen, the product of ALA dehydratase, is committed to plant tetrapyrrole synthesis in nodules(47) , whereas ALA may be taken up by B. japonicum for bacterial heme formation(47) . In developing nodules, gsa expression preceded that of lb with respect to both mRNA and protein (Figs. 2 and 3). GSA aminotransferase protein was strongly expressed in nodules by 13 days, whereas leghemoglobin was only weakly expressed compared with levels seen in older nodules. These data are consistent with previous observations showing that glutamate-dependent ALA formation activity in soybean is high in young nodules where the leghemoglobin content is low but discernible(14) . It is plausible that increased plant ALA synthesis is needed prior to leghemoglobin synthesis, either for heme-dependent respiration associated with cell division or for bacterial heme formation. Transcripts of the early nodulin gene enod2 were strongly expressed in 10-day-old nodules, in which gsa message induction was discernible but weak (Fig. 2). 
Although no anti-Enod2 antibodies were available to us to follow the time course of protein expression, other work shows Enod2 to be a nodule structural protein (see (37) ). Thus, it is probably found in nodules prior to 13 days post-infection, when GSA aminotransferase was first discerned (Fig. 3). It is likely that gsa expression in nodules is controlled differently than that of lb or enod2, as can be inferred from the lack of a GAGA element in the promoters of the latter genes and from the broader spatial expression of gsa throughout the plant.
The coding region of gsa1 is arranged on three exons, separated by two small introns. It is remarkable that the genomic arrangements of soybean gsa1 and the two gsa genes from A. thaliana are very different from each other with respect to the size, position, and number of introns. Soybean GSA aminotransferase has greater homology to the Arabidopsis enzymes than to any other known plant or bacterial GSA aminotransferase; hence it is very likely that the three genes share a common ancestor rather than having arisen from separate lineages that converged during evolution. It follows then that the intron variability resulted either from the differential loss of introns found in the common ancestral gene or from the acquisition of introns by each gene. In either case, the events leading to intron variability are likely to be recent, after the establishment of modern lineages. It is also possible that all three genes were present in a common ancestor and that entire genes have been lost, in which case gene loss would also be subsequent to the establishment of modern lineages. A corollary to the theory of ancient introns is that exons represent functional domains which were differentially spliced to allow protein diversity from a limited genome in evolving organisms (discussed in (48) ). It is clear, however, that there can be no correlation between exons and protein domains that fits all three known extant plant gsa genes. Therefore, the gsa genes cannot both be ancient and be accommodated by the exon theory.
The nucleotide sequence reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number U20260.