©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
gsa1 Is a Universal Tetrapyrrole Synthesis Gene in Soybean and Is Regulated by a GAGA Element (*)

(Received for publication, November 17, 1994; and in revised form, January 30, 1995 )

Jana M. Frustaci, Indu Sangwan, and Mark R. O'Brian (§)

From the Department of Biochemistry, State University of New York, Buffalo, New York 14214

ABSTRACT

Expression of plant tetrapyrroles is high in photosynthetic tissues and in legume root nodules in the form of chlorophyll and heme, respectively. The universal tetrapyrrole precursor δ-aminolevulinic acid (ALA) is synthesized from glutamate 1-semialdehyde (GSA) by GSA aminotransferase in plants, which is encoded by gsa. Immunoblot analysis showed that GSA aminotransferase was expressed in soybean leaves and nodules, but not in roots, and that protein correlated with enzyme activity. These observations indicate that GSA aminotransferase expression is controlled in tetrapyrrole formation and argue against significant activity of an enzyme other than the well described aminotransferase for GSA-dependent ALA formation. gsa mRNA and protein were induced in soybean nodules, and their activation was temporally intermediate between those of the respective early and late genes enod2 and lb. A GSA aminotransferase gene, designated gsa1, was isolated and appears to be one of two gsa genes in the soybean genome. gsa1 mRNA accumulated to high levels in leaves and nodules, but not in uninfected roots, as discerned with a gsa1-specific probe. Message levels were higher in leaves from etiolated plantlets than in mature plants, and expression in the former was slightly elevated by light. The expression pattern of gsa1 mRNA was qualitatively similar to that of total gsa. The data strongly suggest that gsa1 is a universal tetrapyrrole synthesis gene and that a gsa gene specific for a tissue, tetrapyrrole, or light condition is unlikely. The gsa1 promoter contained a genetic element found in numerous Drosophila melanogaster genes; the so-called GAGA element displayed single-stranded character in vitro and formed a complex with nuclear factors from nodules and leaves but not from roots. From these observations we infer that the GAGA element is involved in the transcriptional control of gsa1.


INTRODUCTION

The tetrapyrroles chlorophyll, heme, siroheme, and bilins are expressed in plants for participation in numerous cellular processes, and they are synthesized from the universal tetrapyrrole precursor ALA (reviewed in (1)). Chlorophyll is the most abundant tetrapyrrole in plants, and the bulk of ALA synthesized in photosynthetic tissues is incorporated into the chlorophyll ring. Accordingly, some evidence shows that ALA formation in those tissues is controlled by, or coordinated with, factors related to photosynthesis (2, 3, 4, 5, 6, 7). Glutamate-dependent ALA(^1) formation occurs in plants by a three-step mechanism termed the C(5) pathway; the latter two steps are committed to ALA synthesis and are catalyzed by glutamyl-tRNA reductase and glutamate 1-semialdehyde (GSA) aminotransferase, respectively (1, 8). Plant cDNA or genes encoding C(5) pathway enzymes have been isolated from several sources (7, 9, 10),(^2)(^3) and from a green alga (6). Radiolabel from [1-^14C]glutamate is incorporated into mitochondrial heme a as well as into plastid chlorophyll in etiolated seedlings of maize, which, along with an absence of ALA synthase activity, indicates that higher plants use the C(5) pathway for synthesis of all tetrapyrroles, at least in photosynthetic tissue (11).

Legume root nodules are specialized plant organs elicited by rhizobia bacteria that contain a large quantity of heme for the prosthetic group of plant hemoglobin (reviewed in (12) ), but lack chlorophyll. A soybean glutamate-dependent ALA synthesis activity is induced in nodules(13, 14) , as well as other genes encoding heme pathway enzymes (15, 16) , but no plant ALA synthase activity is detectable. In addition, soybean nodule cDNA encoding GSA aminotransferase was isolated, and both enzyme activity and mRNA are induced in the symbiotic tissue(10) . These data strongly support the universality of the C(5) pathway in higher plants with respect to the tetrapyrrole formed and its distribution in tissues where ALA formation can be discerned.

It is not certain whether the C(5) pathway itself is heterogeneous with regard to the enzymes that catalyze a given step or to the number of functional genes that encode an enzyme. Evidence for two enzymes with glutamyl-tRNA reductase activity has been described in two bacterial species (17, 18, 19), and although no similar situation has been reported in plants, separate ALA pools for heme and chlorophyll synthesis have been proposed in plant chloroplasts (20). A plant gene and cDNA encoding glutamyl-tRNA reductase (hemA) have been isolated from Arabidopsis thaliana (7), and multiple copies of hemA are inferred. Two genes encoding GSA aminotransferase, with high homology in their exons, have also been isolated from A. thaliana (7).(^3) Thus, enzymes of the C(5) pathway may be encoded by gene families in plants, but whether a particular gene within a family has specificity for a tissue, tetrapyrrole, or developmental state is not known. This question is difficult to address in green tissues because heme, although functionally important, is quantitatively a minor tetrapyrrole in most plants. However, symbiotic root nodules are not only unique organs within the plant kingdom; they also have high ALA synthetic activity that is not fated to chlorophyll formation.

In the present work, we isolate soybean gsa1, one of two gsa genes, and provide evidence that it is involved in the synthesis of cellular tetrapyrroles in leaves and root nodules, thereby demonstrating the universality of a C(5) pathway gene in a higher plant. In addition, we show that gsa1 is regulated, and present evidence for a cis-acting regulatory element in the gsa1 promoter that has heretofore been described only in Drosophila. Finally, we argue that the primary structure of extant plant gsa genes results from recent evolutionary events.


MATERIALS AND METHODS

Bacteria and Plants

Escherichia coli strain BL21(DE3) (pLysS) expresses T7 RNA polymerase (21) and was used for overexpression of gsa1 cDNA in the present work; it was grown in LB medium (22) supplemented with 25 µg/ml chloramphenicol, and 100 µg/ml ampicillin was added to maintain pBluescript-derived plasmids in that strain. Bradyrhizobium japonicum strain I110 was the soybean endosymbiont used in the present work, and it was grown in GSY medium (23). Soybeans (Glycine max cv. Essex), either inoculated with B. japonicum or not inoculated, were grown in a growth chamber under a 16 h light/8 h dark regime at 25 °C. Nodules, leaves, and roots were harvested for enzyme assays or RNA extraction; DNA was extracted from leaves only. Etiolated soybean plants were grown in total darkness for 9 days and then exposed to direct light for the final 24 h before harvesting the leaves.

GSA Aminotransferase Activity

Enzymatic formation of ALA from GSA (provided by Dr. C. G. Kannangara) by soybean nodule, root, leaf, and bacteroid extracts and by purified GSA aminotransferase obtained by overexpression of gsa cDNA in E. coli was carried out as described previously (24). Reactions containing either 0.5 mg of cell extract or 0.03 mg of partially purified recombinant GSA aminotransferase were incubated in 1 ml containing 50 mM MOPS, pH 6.8, 1 mM dithiothreitol, 20 µM pyridoxal phosphate, and 5 mM levulinic acid. Controls contained heat-inactivated protein samples. Reactions were started by addition of GSA to 50 µM final concentration and proceeded at 30 °C for the times indicated in Fig. 1 and its legend. Reactions were then terminated, and ALA was isolated from reaction samples by Dowex ion exchange chromatography and solvent extraction and then quantified as described previously (14).
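The activities in Fig. 1 are expressed as nmol of ALA formed per mg of protein. The normalization can be sketched as follows, assuming that ALA has already been quantified in nmol and that the heat-inactivated control is subtracted as a blank. The function names and the example values are hypothetical illustrations, not data from this work. The only figures taken from the text are the 50 µM GSA, the 1-ml reaction volume, and the 0.5 mg of extract.

```python
def substrate_nmol(conc_uM, volume_ml):
    """Amount of substrate in nmol (µM x ml = nmol)."""
    return conc_uM * volume_ml


def specific_activity(ala_nmol, control_nmol, protein_mg):
    """nmol ALA formed per mg protein, after subtracting the control blank."""
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    return (ala_nmol - control_nmol) / protein_mg


# From the text: 50 µM GSA in a 1-ml reaction caps the ALA that can be formed.
max_ala = substrate_nmol(50, 1)

# Hypothetical readings for a reaction containing 0.5 mg of cell extract.
activity = specific_activity(ala_nmol=6.0, control_nmol=1.0, protein_mg=0.5)
print(max_ala, activity)
```

Because the substrate is limited to 50 nmol per reaction, a measured ALA yield approaching that value would indicate substrate depletion rather than the initial rate.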


Figure 1: Overexpression of nodule gsa cDNA and detection of GSA aminotransferase protein and enzyme activity in soybean fractions. A, gsa cDNA was overexpressed in E. coli, and a Coomassie-stained SDS-PAGE gel of the purified product from inclusion bodies is shown (inset). GSA aminotransferase activity measured in the inclusion body fraction is shown as ALA formed from GSA/mg of protein as a function of time. B, Western blot analysis of GSA aminotransferase and enzyme activity in leaves of etiolated plants (L), uninfected roots (R), nodules (N), and the nodule bacteroid fraction (B). For the Western blot, 20 µg of protein was loaded per lane, and antibodies were raised against protein purified in A. GSA aminotransferase activity is expressed as nmol ALA formed in 20 min/mg of protein.



Analysis of Genomic DNA and Isolation of a gsa Gene Fragment

To isolate DNA from soybean, 3 g of leaves were pulverized to powder in liquid N(2), and DNA was then extracted as described previously (22). The DNA was purified by subsequent precipitations with NaCl (25), cetyl trimethylammonium bromide (26), and polyethylene glycol (27). The final DNA pellet was dissolved in 10 mM Tris, pH 8, 1 mM EDTA. Southern blot analysis of the genomic DNA was carried out as described previously (22). A 2-kb EcoRI fragment of the soybean genome that hybridized to gsa cDNA was isolated by inverse PCR using oligonucleotides 5'-GACGACAGATAAGACTCTCTCACTC-3' (hereafter referred to as primer 1) and 5'-GATCTCTGATCGCATCT-3'. Genomic DNA isolated from leaves was digested with EcoRI and size-fractionated by centrifugation in a 1-5 M NaCl step gradient as described previously (28). Fractions enriched for DNA homologous to gsa cDNA were pooled, ligated with T4 ligase to produce circular DNA, and then used as template DNA for inverse PCR as described previously (29). The PCR product was cloned into pBluescript and identified as the gsa1 gene by comparing its nucleotide sequence to the corresponding cDNA.
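As a quick consistency check on the two inverse PCR oligonucleotides given above, the following sketch computes their lengths, GC content, and reverse complements. The sequences are copied from the text; the helper names are hypothetical and serve only this illustration.

```python
PRIMER_1 = "GACGACAGATAAGACTCTCTCACTC"  # primer 1, as given in the text
PRIMER_2 = "GATCTCTGATCGCATCT"          # second inverse PCR primer

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in (("primer 1", PRIMER_1), ("primer 2", PRIMER_2)):
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```

Primer 1 is 25 nucleotides long, in agreement with its description elsewhere in this report as a 25-base oligonucleotide probe.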

Sensitivity of gsa Promoter to S1 Nuclease

A 191-bp DNA fragment, including the gsa1 promoter region, was ligated into pBluescript SKII, and supercoiled recombinant plasmid was isolated by cesium chloride centrifugation (22). DNA (50 µg) was added to a reaction mixture of 300 µl final volume containing 30 mM sodium acetate, pH 4.6, 30 mM NaCl, and 1 mM ZnCl(2). S1 nuclease (16 units) was added, and the digestion was carried out at 37 °C. Aliquots containing 5 µg of DNA were removed at various time points and immediately extracted with a saturated acid-phenol solution, followed by a chloroform extraction. The DNA was used as a template for synthesis of a radiolabeled complementary strand from a T7 primer using a Sequenase kit (U. S. Biochemical Corp.) and [α-35S]dCTP, except that dideoxynucleotides were omitted from the reactions. In these experiments, termination of polymerization results from S1 nuclease-dependent breaks in the template DNA. Sequencing reactions were carried out in parallel using the T7 primer, undigested DNA, and dideoxynucleotides according to the manufacturer's instructions, and all samples were loaded onto a 6% polyacrylamide sequencing gel. In our hands, the fragments resulting from the sequencing reactions were more heavily radiolabeled than those obtained using S1 nuclease-treated DNA as template; thus, the data are presented as different exposures of the same gel.

RNA Isolation and Analysis

Nodules, leaves, and roots were excised, frozen in liquid N(2), and homogenized in a blender with buffer and phenol (2:2:3, w/v/v). The homogenization buffer contained 500 mM Tris, pH 8, 10 mM MgCl(2), 1 mM EDTA, 100 mM NaCl, 0.5% (w/v) deoxycholate, and 1 mM β-mercaptoethanol. Total RNA was isolated from the homogenate as described previously (22), and poly(A) RNA was isolated using oligo(dT) cellulose columns. Northern blot analysis of poly(A) RNA was carried out as described previously under high stringency conditions (22) using either oligonucleotide or cDNA probes. gsa cDNA used in Northern blots was obtained previously (10). Other DNAs were obtained from T. Bisseling (enod2; (30)), K. Marcker (lba; (31)), and D. P. S. Verma (ubi; (32)). The 5' end of gsa1 mRNA from nodules and from leaves of etiolated plants was determined by S1 nuclease protection and primer extension as described previously (22). Primer 1 (see above) was used to prime RNA-dependent DNA synthesis in the primer extension analysis.

Preparation of Nuclear Extracts and Gel Retardation Assays

Preparation of nuclear extracts from leaves, nodules, or roots was carried out using a modified protocol of Jensen et al. (33), made available to us by Dr. Frans de Bruijn. 20 g of tissue was pulverized in liquid N(2) and resuspended in 5 ml of buffer A/g of tissue (buffer A is 10 mM MES, pH 6, 10 mM NaCl, 5 mM EDTA, 0.15 mM spermine, 0.5 mM spermidine, 10 mM β-mercaptoethanol, 1 mM phenylmethylsulfonyl fluoride, 0.6% Triton X-100, and 0.25 M sucrose). The homogenate was filtered successively through one and two layers of Miracloth (CalBiochem), and the filtrate was centrifuged for 5 min at 2000 × g. The pellet, which contained the crude nuclei, was washed once with 5 ml of buffer A and then resuspended in 5 ml of buffer B (1 part 5× buffer A to 7.5 parts Percoll, pH 6 (w/w)). The homogenate was centrifuged at 5000 × g for 5 min, and the fraction floating on top of the gradient was collected and washed twice with buffer A by centrifugation at 2000 × g for 5 min. The pellet was resuspended in buffer C (20 mM Hepes, pH 7.9, 420 mM NaCl, 12% glycerol, 1.5 mM MgCl(2), 0.2 mM EDTA, 0.5 mM dithiothreitol, and 0.5 mM phenylmethylsulfonyl fluoride) and passed through a small French pressure cell (4 ml capacity) at 900 p.s.i. The extracts were incubated at 4 °C for 1 h with slow shaking on an Orbitron shaker, and nuclear debris was then removed by centrifugation in a microcentrifuge at 15,000 × g for 30 min. Supernatant fractions were aliquoted in Eppendorf tubes and stored at -70 °C.

For gel retardation assays, the oligonucleotides (dG-dA)(9) and (dT-dC)(9) were synthesized at the core facility at the State University of New York at Buffalo and annealed to form double-stranded DNA. The DNA was then end-labeled with 32P using [γ-32P]dATP and polynucleotide kinase. Gel retardation assays were carried out as described previously (34). 0.5 ng (50 fmol) of radiolabeled probe (6 × 10^6 becquerels) and 1500 ng of the unlabeled nonspecific competitor DNA (dI-dC)(n)·(dI-dC)(n) were used per reaction, along with 5 µg of nuclear extract from leaves, roots, or nodules. The samples were run on nondenaturing 12% polyacrylamide gels and subsequently developed by autoradiography.
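The stated probe amount, 0.5 ng (50 fmol), can be checked with a short mass-to-mole conversion. This is a rough sketch that assumes an average mass of 650 g/mol per base pair, a common approximation that is not taken from this work; the function name is likewise hypothetical. The 18-bp length follows from the (dG-dA)(9)·(dT-dC)(9) duplex.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair; assumed approximation


def duplex_fmol(mass_ng, length_bp, bp_mass=AVG_BP_MASS):
    """Convert a mass of double-stranded DNA (ng) to femtomoles."""
    mol = (mass_ng * 1e-9) / (length_bp * bp_mass)
    return mol * 1e15


probe_fmol = duplex_fmol(0.5, 18)   # 18-bp GAGA duplex, 0.5 ng per reaction
competitor_excess = 1500 / 0.5      # mass excess of (dI-dC)(n) over probe
print(round(probe_fmol, 1), competitor_excess)
```

Under that assumption the probe amounts to roughly 43 fmol, consistent with the rounded value of 50 fmol, and the nonspecific competitor is present at a 3000-fold mass excess over the probe.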

Overexpression of Soybean gsa cDNA in E. coli and Antibody Production

Soybean gsa cDNA was isolated and described previously (10). Modified 5' and 3' ends of a subcloned fragment were constructed such that it could be cloned into the NdeI-BamHI sites of pET3c (21) and translated from the second methionine codon in the open reading frame. To do this, the upstream primer 5'-CCGCATATGGCCGTATCTATCGACCC-3' and the downstream primer 5'-CCTGGATCCAACCATCAGATCTCCCT-3' were employed in a PCR reaction using pKN4 (10) as template. The PCR product was blunt-ended with T4 polymerase and ligated into the EcoRV site of pBluescript SKII to construct pKN4C1. The gsa coding region was removed from pKN4C1 by digestion with BamHI and partial digestion with NdeI and ligated into the NdeI/BamHI sites of pET3c to construct pET3CGSAA. The product encoded by pET3CGSAA lacks the putative plastid leader peptide and should be nearly identical to the mature peptide. The modified gsa cDNA in pET3CGSAA was expressed in E. coli strain BL21(DE3) (pLysS) as described previously (35), and the nearly pure 45-kDa protein recovered from the inclusion body fraction of those cells catalyzed GSA-dependent ALA formation (Fig. 1). For antibody production, 50 µg of the inclusion body extract was loaded onto a preparative 10% SDS-PAGE gel, and the GSA aminotransferase was excised from the gel with a razor blade. The excised fragment was homogenized by freezing and thawing and then forcing it through an 18-gauge needle several times. The sample was then used to raise antibodies in rabbits as described previously (22). Crude antiserum was affinity-purified using GSA aminotransferase bound to nitrocellulose as described previously (36).
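The cloning strategy above relies on the NdeI and BamHI recognition sequences carried by the two PCR primers. A minimal sketch that locates those sites is given below. The primer sequences are copied from the text, and the recognition sequences CATATG (NdeI) and GGATCC (BamHI) are the standard ones; the helper name and the 0-based index convention are choices made for this illustration only.

```python
UPSTREAM = "CCGCATATGGCCGTATCTATCGACCC"
DOWNSTREAM = "CCTGGATCCAACCATCAGATCTCCCT"

SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}


def find_sites(seq, sites=SITES):
    """Map each enzyme whose site occurs in seq to its first 0-based position."""
    return {name: seq.find(motif) for name, motif in sites.items() if motif in seq}


print(find_sites(UPSTREAM))    # NdeI site in the upstream primer
print(find_sites(DOWNSTREAM))  # BamHI site in the downstream primer

# The last three bases of the NdeI site, ATG, can serve as the start codon.
ndei_start = UPSTREAM.find(SITES["NdeI"])
print(UPSTREAM[ndei_start + 3:ndei_start + 6])
```

The upstream primer carries the NdeI site and the downstream primer carries the BamHI site, which matches the directional cloning into the NdeI/BamHI sites of pET3c described above.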

Western Blotting

Protein extracts were run on 10-12% SDS-PAGE gels, transferred to nitrocellulose or Immobilon (Millipore) filters, and screened with antibodies raised against soybean GSA aminotransferase or leghemoglobin as described previously(22) . Cross-reactive material bound to the filter was discerned with peroxidase-conjugated goat anti-rabbit IgG and visualized either by color development using 4-chloro-1-naphthol or by chemiluminescence using the Renaissance kit (DuPont NEN) according to the manufacturer's instructions.


RESULTS

Expression of GSA Aminotransferase Protein and Enzyme Activity

The soybean gsa nodule cDNA isolated previously (10) was overexpressed in E. coli such that the product was missing the putative plastid leader peptide. The purified protein expressed GSA aminotransferase activity (Fig. 1A), thereby formally demonstrating that the gsa cDNA encodes that enzyme. Expression of GSA aminotransferase was examined in soybean tissues in Western blots using antibodies raised against the purified protein. Immunoreactive protein was detected in the plant fraction of root nodules elicited by B. japonicum and in leaves of etiolated plantlets, but not in uninfected roots (Fig. 1B). In addition, GSA aminotransferase enzyme activity was also observed only in leaves and nodules (Fig. 1B; (10) ), with protein and enzyme activity being about 2-fold greater in the etiolated leaves than in nodules. The correlation between protein and enzyme activity argues against significant expression of an enzyme other than the well described aminotransferase that has GSA-dependent ALA formation activity. A high level of GSA aminotransferase was observed only in tissues that synthesize a large quantity of tetrapyrrole, showing that expression of gsa is a control point for ALA and tetrapyrrole synthesis in soybean. Furthermore, no GSA aminotransferase protein or enzyme activity was found in extracts of B. japonicum bacteroids isolated from nodules, showing that the protein observed in the plant fraction was not a bacterial contaminant, nor does the plant enzyme compartmentalize into a bacterial space. This latter conclusion is relevant because bacterial heme expression is rescued in a B. japonicum ALA auxotroph in nodules, which we have attributed to provision of ALA itself rather than ALA synthetic enzymes to the bacterial endosymbiont by the plant host(13) .

Temporal Expression of gsa in Root Nodules

Root nodule ontogeny is broadly divided into early and late development, with the latter stage commencing with the onset of nitrogen fixation. In addition, developmental stages of cells of so-called determinate nodules of soybean are approximately uniform at a given time, allowing temporal information to be obtained from analysis of whole nodules (reviewed in (37) and (38) ). We compared the temporal expression of gsa with those of the nodule-specific genes enod2 and lb, which are well described markers of early and late development, respectively(37, 39, 40) . Northern blot experiments showed that enod2 mRNA was not detected in uninfected root, but was easily discerned by 10 days post-infection, whereas lb mRNA was not observed in nodules until 13 days (Fig. 2). A weak but discernible induction of gsa mRNA was observed at 10 days post-infection in Northern blot experiments, and it was strongly expressed by 13 days and maintained thereafter to at least 25 days (Fig. 2). GSA aminotransferase protein was also expressed in developing nodules by Western blot analysis. GSA aminotransferase protein was not detected in extracts from uninfected roots or from 10-day-old nodules, but was strongly expressed in nodules at 13 days and older (Fig. 3), as was gsa message (Fig. 2). Unlike GSA aminotransferase, leghemoglobin was only weakly expressed in 13-day-old nodules compared with the amount observed in older nodules (Fig. 3). The data show that full expression of gsa was temporally intermediate between the expressions of enod2 and lb and is likely to be controlled at the RNA level.


Figure 2: Northern blot analysis of the expression of nodule mRNAs as a function of nodule age. Approximately 5 µg of poly(A) RNA from uninfected roots (U) and from nodules 10, 13, and 25 days post-infection were loaded onto each lane. A single filter was hybridized with each radiolabeled cDNA separately, and the filter was stripped after each hybridization and exposure. Ubiquitin (Ubi) was used as a control for a constitutively expressed gene. Exposure times varied with different probes, so they cannot be directly compared with each other.




Figure 3: Western blot analysis of GSA aminotransferase (GSA) and leghemoglobin (LB) protein in nodules as a function of nodule age. Protein corresponding to 2 mg of tissue was loaded per lane and run on a 12% SDS-PAGE gel; protein was transferred to a filter and analyzed with antibodies raised against the respective protein.



Isolation and Characterization of gsa Genomic DNA

Southern blot analysis showed that a 1-kb NcoI/EcoRI 3' fragment of gsa cDNA hybridized to two fragments of genomic DNA when digested with EcoRV, HincII, or EcoRI, but only one fragment was observed for each digestion when a 25-bp oligonucleotide probe (primer 1) was used that corresponded to the 5'-untranslated region of the cDNA (Fig. 4). Primer 1 was one of two primers used to isolate the unique 2-kb EcoRI genomic fragment by inverse PCR (see "Materials and Methods"), and the nucleotide sequence was subsequently determined. The cloned fragment contained the entire gsa coding region on three exons that were separated by two small introns (Fig. 5). We designate this gene gsa1. The nucleotide sequence of the gsa cDNA shared 100% identity with the gsa1 exon sequence; hence, we conclude that the cDNA was derived from the gsa1 transcript. The gsa1 gene contained no internal restriction sites for EcoRV, HincII, or EcoRI within the region delimited by the probe used in the Southern blot; thus, the hybridization to two genomic fragments (Fig. 4) indicates that gsa1 is one of two gsa genes in the soybean genome. Similarly, two different genes from A. thaliana encoding GSA aminotransferase have been reported (7).(^3)


Figure 4: Southern blot analysis of soybean genomic DNA probed with a 1-kb NcoI/EcoRI nodule gsa cDNA fragment (A) or a 25-base oligonucleotide probe (primer 1) (B). 25 mg of DNA was digested with EcoRV (R), HincII (H), or EcoRI (E) and run on a 0.7% agarose gel. DNA from the gel was transferred to nitrocellulose and probed with 32P-labeled DNA.




Figure 5: Gene structures of soybean gsa1 and A. thaliana gsa1 and gsa2. The striped areas represent exon coding regions and open areas represent introns.



One remarkable difference among the gsa genes from soybean and Arabidopsis is the intron variability with respect to number, size, and relative positions (Fig. 5). The only feature shared among the three genes is the approximate size of the first exon and the position of the 5' boundary of the first intron. The nucleotide sequences of the coding regions of the three genes are highly homologous (73-83%), as are the peptide sequences (80-90% identical and 88-95% similar); thus, it appears that these genes were derived from a common ancestor. Therefore, the intron variability indicates that plant gsa gene structure has changed recently on an evolutionary time scale, subsequent to the establishment of higher plant lineages.

gsa1 Is a Universal Tetrapyrrole Synthesis Gene in Soybean

Expression of GSA aminotransferase in root nodules for heme synthesis shows that the enzyme, and most likely the C(5) pathway, is not confined to chlorophyll synthesis or to photosynthetic tissues. In addition, enzyme activity and message are found in leaves of dark-grown etiolated plantlets (10), which, along with the observations of nodules, shows that gsa can be strongly expressed in the absence of light. The presence of multiple gsa genes in plants led us to ask whether a given gene has specificity for a tissue, for light, or for the tetrapyrrole into which the ALA is incorporated. RNA from several soybean tissues was analyzed by Northern blotting, using either a 1-kb cDNA fragment that hybridizes to two genomic regions (Fig. 4) or a 25-bp oligonucleotide specific to gsa1 (primer 1). The data show clearly that gsa1 mRNA was strongly expressed in leaves and nodules from 23-day-old plants, but very little message was observed in uninfected roots (Fig. 6). Therefore, gsa1 is induced in nodules for heme synthesis, and the same gene is expressed in leaves, where chlorophyll is the predominant tetrapyrrole. To assess the light requirement for gsa1 mRNA expression, message was analyzed in leaves of dark-grown etiolated plantlets, where the level was higher than in any tissue from 23-day-old plants grown under a light/dark regime (Fig. 6). gsa1 mRNA increased about 2-fold in the etiolated plantlet leaves upon illumination for 24 h prior to harvest (Fig. 6). The data strongly imply that soybean gsa1 is a universal tetrapyrrole synthesis gene that is expressed significantly in tissues where ALA is synthesized for heme or chlorophyll formation. In addition, the pattern of gsa1 mRNA expression and that of total gsa were qualitatively the same, indicating either that the putative second gsa gene is not expressed in the tissues examined or that its pattern of expression is similar to that of gsa1. Again, this conclusion indicates that the same gene or genes are required for ALA synthesis from GSA irrespective of the fate of the precursor or the tissue where it is formed. Finally, the data described herein, along with previous work (10), show that ALA formation in soybean is controlled, at least in part, by gsa expression, and that this regulation is at the RNA level.


Figure 6: Northern blot analysis of gsa1 and total gsa RNA from various soybean tissues. Poly(A) RNA was analyzed from leaves (L), roots (R), and nodules (N) from 23-day-old plants and from leaves of dark-grown etiolated plantlets that were either illuminated (I) or kept in the dark (D) for 24 h prior to harvest. The RNA was probed either with gsa cDNA to assess total gsa mRNA or with a 25-base oligonucleotide (primer 1) specific to gsa1. Exposure times varied with different probes, thus they cannot be compared with each other in a quantitative way.



Identification of a GAGA Element in the gsa1 Promoter Region

The isolated EcoRI genomic fragment encoding GSA aminotransferase included 223 bp of DNA upstream of the translation start site (Fig. 7). Two transcription start sites were found 90 and 114 bp upstream of the initiation codon as determined by S1 nuclease and by primer extension analysis using RNA either from nodules or from leaves of light-exposed etiolated plantlets (Fig. 7, data not shown). Because primer 1 (see above) was used for the primer extension experiments and is specific for the gsa1 gene, we interpret those data as two transcription start sites of a single gene rather than a start site for each of two genes. A striking feature of the 5' upstream region of gsa1 is a perfect dinucleotide repeat of (GA)(9) found between the 5'-most transcription start site and a putative TATA element (Fig. 7). This motif has been observed in the promoters of several Drosophila genes, where it binds the GAGA transcription factor and affects transcription positively ((41) and references therein). For Drosophila hsp26 and hsp70, binding to the GAGA element results in localized nucleosome disruption, thereby making the promoters accessible for transcription (42, 43). Pure plasmid DNA containing a GAGA element has been shown to be sensitive to S1 nuclease in the dinucleotide repeat region at low pH (44), indicating that the double-stranded DNA has some single-stranded character. A 191-bp fragment of the soybean gsa1 promoter region was cloned into pBluescript and treated with S1 nuclease, and the treated DNA was then used as a template for DNA synthesis (Fig. 8). The data show clearly that the GAGA element of the plasmid-borne gsa1 promoter region was sensitive to S1 nuclease, as seen by chain termination within the dinucleotide repeat region.
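A perfect dinucleotide repeat such as (GA)(9) can be located in a sequence with a simple pattern search. The sketch below is an illustration only: the function name is hypothetical, and the test sequence is an invented toy string, not the gsa1 upstream sequence shown in Fig. 7.

```python
import re


def find_dinucleotide_repeats(seq, unit="GA", min_repeats=9):
    """Return (start, end, n_repeats) for each run of at least min_repeats units."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_repeats))
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(unit))
        for m in pattern.finditer(seq.upper())
    ]


# Invented toy sequence with a (GA)9 run placed after a TATA-like stretch.
toy = "TTATAAATACC" + "GA" * 9 + "TTCACGT"
print(find_dinucleotide_repeats(toy))

# On the complementary strand the same element reads as a (TC)9 run.
print(find_dinucleotide_repeats("AA" + "TC" * 9 + "GG", unit="TC"))
```

Positions are 0-based with an exclusive end, so the length of the run in base pairs is the difference between the two coordinates.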


Figure 7: Upstream region of soybean gsa1. The underlined region denotes the translation start site. Asterisks below nucleotides denote transcription start sites. The putative TATA box is shown with a broken underline. The GAGA element is boxed.




Figure 8: S1 nuclease sensitivity of the GAGA element. Plasmid-borne DNA containing upstream sequence of gsa1 was used as a template for (A) sequencing reactions or was (B) treated with S1 nuclease for 0, 16, 32, or 64 min (lanes 1, 2, 3, 4, respectively) and used as template for synthesis of a complementary strand. The T7 primer was used, resulting in the strand containing (GA)(9) as the template for the reactions. A and B were exposed for 12 and 48 h, respectively, and are different portions of the same gel (see "Materials and Methods").



A synthetic (dG-dA)(9)·(dT-dC)(9) double-stranded DNA was used in gel retardation experiments to discern nuclear factors from soybean tissues that bind to the GAGA element. The mobility of GAGA DNA was retarded on polyacrylamide gels when treated with nuclear extracts from nodules or from leaves of illuminated etiolated plantlets, but no retarded species resulted from treatment with root extracts (Fig. 9). The retarded DNA was discerned as a doublet for both nodule and etiolated leaf extracts (Fig. 9), which was also observed for pure Drosophila GAGA protein complexed with GAGA DNA (41). In addition, leaf extracts yielded another feature on the gel that appeared to be a diffuse shadow rather than a sharp band, and we have not attempted to interpret that feature. In conclusion, formation of complexes with DNA was observed only with nuclear extracts from tissues where gsa1 is strongly expressed, which implies that binding of a nuclear factor to the GAGA element positively affects the transcription of gsa1.


Figure 9: Retardation of GAGA DNA mobility in nondenaturing gels. 32P-end-labeled (dG-dA)(9)·(dT-dC)(9) (50 fmol) was incubated with 5 µg of nuclear extract from nodules (N), roots (R), or leaves from illuminated etiolated plantlets (L) or was incubated as free probe (F). The samples were run on a 12% nondenaturing polyacrylamide gel and then assayed by autoradiography.




DISCUSSION

ALA formation in root nodules is unique among plants in that none of the ALA produced there is incorporated into chlorophyll. In addition, ALA synthesis activity is high relative to that found in other root cell types and presumably in other nonphotosynthetic tissues. Finally, synthesis is induced in response to interactions with a bacterium and should be controlled by factors related to symbiosis and nodule development rather than to photosynthesis. Despite these unique aspects of the symbiotic organ, previous studies (10, 13, 14) and the current work underscore the similarities between leaves and nodules with respect to ALA synthesis. Herein, we show that gsa1 is a regulated gene that is strongly expressed in root nodules and leaves, and the data indicate that a putative second gsa gene is either not expressed or else has an expression pattern similar to that of gsa1. Thus, it is unlikely that soybean has a gsa gene that is specific to a particular tissue, tetrapyrrole, or external stimulus such as light, and therefore the evidence strongly supports the universality of a step of the C(5) pathway at the genetic level in a higher plant. This implies that a pool of ALA committed to either chlorophyll or heme, as proposed by Huang and Castelfranco (20), does not have a genetic basis, but may rely on spatial separation of common enzymes. Multiple hemA genes, which encode the first committed enzyme glutamyl-tRNA reductase, have been inferred in A. thaliana (7), but the specificity of a given gene has not been assessed. In addition, the induction of gsa1 in root nodules implies that vigorous ALA synthesis can be uncoupled from chloroplast development (see (5)); hence, gsa1 may be affected by separate and independent signal transduction pathways.

Analysis of GSA aminotransferase mRNA, protein, and enzyme activity showed that gsa is regulated and that control occurs at the RNA level (Figs. 1, 2, 3, and 6; Ref. 10). These observations prompted us to initiate an analysis of the promoter region of gsa1, which led to the identification of a DNA element hitherto characterized only in Drosophila. The so-called GAGA element was located immediately downstream of the putative TATA box, and the pure, plasmid-borne element was sensitive to S1 nuclease. Because the nuclease-sensitive GAGA elements in the promoters of Drosophila his3-his4 (44) and soybean gsa1 have different flanking sequences, the dinucleotide repeat itself appears to be sufficient for the sensitivity and does not depend on the context in which the element is found. Gel mobility shift experiments showed that nuclear extracts from nodules and from leaves of greening etiolated plantlets contained a factor that bound to GAGA DNA ((dG-dA)9·(dT-dC)9), but a GAGA binding factor was not discerned in root extracts. These data indicate that binding of a nuclear factor to the GAGA element has a positive effect on transcription, and they underscore similarities in the regulation of gsa1 in leaves and nodules despite the specialization of those tissues for different metabolic processes.

A group of nodule proteins called nodulins, such as leghemoglobin and Enod2, are generally described as being strictly specific to the symbiotic tissue (45). Other proteins such as soybean phenylalanine ammonia-lyase and chalcone synthase are found throughout the plant, but they are encoded by gene families of which some members are symbiosis-specific (46). Soybean GSA aminotransferase differs from these other proteins involved in symbiosis in that its presence in nodules must result from regulatory factors that alter the spatial expression of a gene normally expressed strongly only in photosynthetic tissues. Soybean ALA dehydratase, the enzyme that metabolizes ALA directly for tetrapyrrole synthesis, is also induced in nodules, but unlike GSA aminotransferase, the dehydratase is controlled at the level of protein synthesis or turnover (16). It is not clear why two enzymes of the same pathway should be regulated by separate mechanisms, but we note that porphobilinogen, the product of ALA dehydratase, is committed to plant tetrapyrrole synthesis in nodules (47), whereas ALA may be taken up by B. japonicum for bacterial heme formation (47). In developing nodules, gsa expression preceded that of lb with respect to both mRNA and protein (Figs. 2 and 3). GSA aminotransferase protein was strongly expressed in nodules by 13 days, whereas leghemoglobin was only weakly expressed compared with levels seen in older nodules. These data are consistent with previous observations showing that glutamate-dependent ALA formation activity in soybean is high in young nodules, where the leghemoglobin content is low but discernible (14). It is plausible that plant ALA is in increased demand prior to leghemoglobin synthesis, either for heme-dependent respiration associated with cell division or for bacterial heme formation. Transcripts of the early nodulin gene enod2 are strongly expressed in 10-day-old nodules, in which gsa message induction is discernible but weak (Fig. 2). Although no anti-Enod2 antibodies were available to us to follow the time course of protein expression, other work shows Enod2 to be a nodule structural protein (see Ref. 37). Thus, it is probably found in nodules prior to 13 days post-infection, when GSA aminotransferase was first discerned (Fig. 3). It is likely that gsa expression in nodules is controlled differently from that of lb or enod2, as can be inferred from the lack of a GAGA element in the promoters of those genes and from the broader spatial expression of gsa throughout the plant.

The coding region of gsa1 is arranged on three exons, separated by two small introns. It is remarkable that the genomic arrangements of soybean gsa1 and the two gsa genes from A. thaliana are very different from each other with respect to the size, position, and number of introns. Soybean GSA aminotransferase has greater homology to the Arabidopsis enzymes than to any other known plant or bacterial GSA aminotransferase; hence it is very likely that the three genes share a common ancestor rather than having arisen from separate lineages that have converged during evolution. It follows then that the intron variability resulted either from the differential loss of introns found in the common ancestral gene or from the acquisition of introns by each gene. In either case, the events leading to intron variability are likely to be recent, after the establishment of modern lineages. It is also possible that all three genes were present in a common ancestor and that entire genes have been lost, in which case gene loss would also be subsequent to the establishment of modern lineages. A corollary to the theory of ancient introns is that exons represent functional domains that were differentially spliced to allow protein diversity from a limited genome in evolving organisms (discussed in Ref. 48). It is clear, however, that there can be no correlation between exons and protein domains that fits all three known extant plant gsa genes. Therefore, the gsa genes cannot both be ancient and be accommodated by the exon theory.


FOOTNOTES

*
This work was supported by the Cooperative State Research Service, United States Department of Agriculture, under Agreement 91-37305-6750. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) U20260.

§
To whom correspondence should be addressed: Dept. of Biochemistry, 140 Farber Hall, State University of New York, Buffalo, NY 14214. Tel.: 716-829-3200; Fax: 716-829-2725; E-mail: cammrob@ubvms.cc.buffalo.edu.

(^1)
The abbreviations used are: ALA, δ-aminolevulinic acid; GSA, glutamate 1-semialdehyde; MOPS, 4-morpholinepropanesulfonic acid; kb, kilobase pair(s); PCR, polymerase chain reaction; bp, base pair(s); MES, 4-morpholineethanesulfonic acid; PAGE, polyacrylamide gel electrophoresis.

(^2)
K. B. Axelsen and B. Grimm, GenBank number X65973.

(^3)
J. Wenzlau and S. Berry-Lowe, GenBank number U10278.


ACKNOWLEDGEMENTS

We thank Drs. Ton Bisseling, Kjeld Marcker, Desh Pal Verma, and Gary Stacey for cDNAs and antibodies. We also thank Christine Kaczor for construction of pET3CGSAA.


REFERENCES

  1. Beale, S. I., and Weinstein, J. D. (1990) in Biosynthesis of Heme and Chlorophylls (Dailey, H. A., ed) pp. 287-391, McGraw-Hill, New York
  2. Masoner, M., and Kasemir, H. (1975) Planta 126, 111-117
  3. Huang, L., and Castelfranco, P. A. (1989) Plant Physiol. (Bethesda) 90, 996-1002
  4. Huang, L., Bonner, B. A., and Castelfranco, P. A. (1989) Plant Physiol. (Bethesda) 90, 1003-1008
  5. Beator, J., and Kloppstech, K. (1993) Plant Physiol. (Bethesda) 103, 191-196
  6. Matters, G. L., and Beale, S. I. (1994) Plant Mol. Biol. 24, 617-629
  7. Ilag, L. L., Kumar, M., and Söll, D. (1994) Plant Cell 6, 265-275
  8. Jahn, D., Verkamp, E., and Söll, D. (1992) Trends Biochem. Sci. 17, 215-218
  9. Grimm, B. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 4169-4173
  10. Sangwan, I., and O'Brian, M. R. (1993) Plant Physiol. (Bethesda) 102, 829-834
  11. Schneegurt, M. A., and Beale, S. I. (1986) Plant Physiol. (Bethesda) 81, 965-971
  12. Appleby, C. A. (1984) Annu. Rev. Plant Physiol. 35, 443-478
  13. Sangwan, I., and O'Brian, M. R. (1991) Science 251, 1220-1222
  14. Sangwan, I., and O'Brian, M. R. (1992) Plant Physiol. (Bethesda) 98, 1074-1079
  15. Madsen, O., Sandal, L., Sandal, N. N., and Marcker, K. A. (1993) Plant Mol. Biol. 23, 35-43
  16. Kaczor, C. M., Smith, M. W., Sangwan, I., and O'Brian, M. R. (1994) Plant Physiol. (Bethesda) 104, 1411-1417
  17. Rieble, S., and Beale, S. I. (1991) J. Biol. Chem. 266, 9740-9745
  18. Verkamp, E., Jahn, M., Jahn, D., Kumar, A. M., and Söll, D. (1992) J. Biol. Chem. 267, 8275-8280
  19. Jahn, D., Michelsen, U., and Söll, D. (1991) J. Biol. Chem. 266, 2542-2548
  20. Huang, L., and Castelfranco, P. A. (1990) Plant Physiol. (Bethesda) 92, 172-178
  21. Studier, F. W., Rosenberg, A. H., Dunn, J. J., and Dubendorff, J. W. (1990) Methods Enzymol. 185, 60-89
  22. Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A., and Struhl, K. (1987) Current Protocols in Molecular Biology, Wiley Interscience, New York
  23. Frustaci, J. M., Sangwan, I., and O'Brian, M. R. (1991) J. Bacteriol. 173, 1145-1150
  24. Hoober, J. K., Kahn, A., Ash, D. E., Gough, S., and Kannangara, C. G. (1988) Carlsberg Res. Commun. 53, 11-25
  25. Fang, G., Hammar, S., and Grumet, R. (1992) BioTechniques 13, 52-55
  26. Dellaporta, S. L., Wood, J., and Hicks, J. B. (1983) Plant Mol. Biol. Rep. 1, 19-21
  27. Rowland, L. J., and Nguyen, B. (1993) BioTechniques 14, 735-736
  28. O'Brian, M. R., Kirshbom, P. M., and Maier, R. J. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 8390-8393
  29. Ochman, H., Gerber, A. S., and Hartl, D. L. (1988) Genetics 120, 621-623
  30. Franssen, H. J., Thompson, D. V., Idler, K., Kormelink, R., van Kammen, A., and Bisseling, T. (1989) Plant Mol. Biol. 14, 103-106
  31. Hyldig-Nielsen, J. J., Jensen, E., Paludan, K., Wiborg, O., Garrett, R., Jorgensen, P., and Marcker, K. A. (1982) Nucleic Acids Res. 10, 689-701
  32. Fortin, M. G., Purohit, S. K., and Verma, D. P. S. (1988) Nucleic Acids Res. 16, 11377
  33. Jensen, E. O., Marcker, K. A., Schell, J., and de Bruijn, F. J. (1988) EMBO J. 7, 1265-1271
  34. Fried, M., and Crothers, D. M. (1981) Nucleic Acids Res. 9, 6505-6525
  35. Frustaci, J. M., and O'Brian, M. R. (1993) Appl. Environ. Microbiol. 59, 2347-2351
  36. Olmsted, J. B. (1981) J. Biol. Chem. 256, 11955-11957
  37. Nap, J. P., and Bisseling, T. (1990) Science 250, 948-954
  38. Caetano-Anolles, G., and Gresshoff, P. M. (1991) Annu. Rev. Microbiol. 45, 345-382
  39. Franssen, H. J., Nap, J. P., Gloudemans, T., Stiekema, W., van Dam, H., Govers, F., Louwerse, J., van Kammen, A., and Bisseling, T. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 4495-4499
  40. Dickstein, R., Bisseling, T., Reinhold, V. N., and Ausubel, F. M. (1988) Genes & Dev. 2, 677-687
  41. Soeller, W. C., Oh, C. E., and Kornberg, T. B. (1993) Mol. Cell. Biol. 13, 7961-7970
  42. Lu, Q., Wallrath, L. L., Granok, H., and Elgin, S. C. R. (1993) Mol. Cell. Biol. 13, 2802-2814
  43. Tsukiyama, T., Becker, P. B., and Wu, C. (1994) Nature 367, 525-532
  44. Gilmour, D. S., Thomas, G. H., and Elgin, S. C. R. (1989) Science 245, 1487-1490
  45. Verma, D. P. S., Fortin, M. G., Stanley, J., Mauro, V. P., Purohit, S., and Morrison, N. (1986) Plant Mol. Biol. 7, 51-61
  46. Estabrook, E. M., and Sengupta-Gopalan, C. (1991) Plant Cell 3, 299-308
  47. Chauhan, S., and O'Brian, M. R. (1993) J. Bacteriol. 175, 7222-7227
  48. Stoltzfus, A., Spencer, D. F., Zuker, M., Logsdon, J. M., and Doolittle, W. F. (1994) Science 265, 202-207
