©1996 by The American Society for Biochemistry and Molecular Biology, Inc.
Differential Expression of the Expression Site-associated Gene I Family in African Trypanosomes (*)

(Received for publication, November 28, 1995; and in revised form, February 5, 1996)

Rodney W. Morgan, Najib M. A. El-Sayed, Jadwiga K. Kepa, Mehrdad Pedram, John E. Donelson (§)

From the Department of Biochemistry, University of Iowa and Howard Hughes Medical Institute, Iowa City, Iowa 52242

ABSTRACT

A minimum of 20 different mRNA species encoding related members of the expression site-associated gene I (ESAG-I) family occur in metacyclic variant antigen type 4 bloodstream trypanosomes. None of these ESAG-I mRNAs are derived from the metacyclic variant antigen type 4 variant surface glycoprotein (VSG) gene expression site, and some appear to come from pseudogenes. The ESAG-Is are transcribed in both procyclic and bloodstream trypanosomes, but their mRNAs accumulate to a detectable steady state level only in bloodstream trypanosomes. At least five different groups of 3'-untranslated regions (3'-UTRs) are represented among these ESAG-I mRNAs, suggesting that the 3'-UTR does not contribute to their differential expression. Some ESAG-I mRNAs completely lack a 3'-UTR or have only a single nucleotide as a 3'-UTR. Transcription of the ESAG-Is is sensitive to alpha-amanitin, indicating that they are transcribed by a different RNA polymerase than the VSG genes. These results collectively demonstrate that ESAG-Is are a heterogeneous population that can be expressed independently of VSG genes, but like the VSG genes, their mRNAs are present in the bloodstream stage of the parasite and not in the procyclic stage.


INTRODUCTION

African trypanosomes evade their hosts' immune response by sequentially expressing different variant surface glycoproteins (VSGs) (^1) from a repertoire of 1000 or more VSG genes. The 20 or more potential expression sites for a VSG gene are invariably situated near a telomere, whereas the transcriptionally silent VSG genes are scattered throughout the chromosomes. The mechanisms that activate one and only one of these telomere-linked expression sites at a time in a given trypanosome are only partially understood. In some cases, activation is associated either with duplicative transposition of a silent donor VSG gene to a telomere-linked expression site or with a telomere exchange event. In other cases, a silent VSG gene already at a telomere-linked site is activated in situ without apparent DNA rearrangement (for recent reviews, see (1, 2, 3)).

Transcription of at least some telomere-linked VSG expression sites is initiated 45-60 kb upstream of the VSG gene and proceeds through as many as 9 or 10 members of different gene families called expression site-associated genes (ESAGs). The resultant polycistronic pre-mRNA is processed into individual mRNAs by 5' trans-splicing and 3' polyadenylation (4, 5, 6, 7). The steady state levels of the ESAG mRNAs are as much as 100-700-fold less than that of the VSG mRNA, indicating that expression of these co-transcribed genes is regulated at least in part by post-transcriptional events such as pre-mRNA processing and/or mRNA stability (8). The different ESAGs in an expression site are distinguished from one another by numbers or Roman numerals. ESAG-1 (or ESAG-I) is designated as the first gene preceding the VSG gene, and in general the larger the number or numeral, the further upstream within the expression site the ESAG is (the exception being ESAG-8, which lies between ESAG-3 and -4 (3)). Most of the 31-kb sequence of the AnTat 1.3A VSG gene expression site has been reported (3, 6, 9), and known protein products of its nine ESAG representatives include an adenylate cyclase (ESAG-4), two transferrin receptor subunits (ESAG-6 and -7), and a putative zinc finger protein (ESAG-8). The functions of the other ESAG products encoded in this expression site remain to be elucidated.

The ESAG-I family was the first ESAG family to be discovered (8, 10). Its 14-25 members encode amphiphilic glycoproteins of about 46 kDa whose function and cellular location are not known (10, 11). We have previously reported the genomic sequence of an ESAG-I that is located several kb upstream of the VSG gene expressed by the MVAT4 trypanosome clone (12). A promoter has been found to occur between the ESAG-I and the VSG gene in this telomere-linked expression site (13). Nuclear run-on assays, primer extension experiments, and reporter gene transfections all indicate that this promoter initiates synthesis of a monocistronic pre-mRNA encoding only one protein, the VSG. No additional open reading frame occurs in this pre-mRNA, suggesting that the upstream ESAG-I must be part of another transcription unit, if indeed it is transcribed at all.

To resolve the question of whether the upstream ESAG-I is expressed in MVAT4 trypanosomes, we isolated two dozen ESAG-I cDNAs from an MVAT4 cDNA library. We discovered that none of these cDNAs was identical to the ESAG-I upstream of the MVAT4 VSG gene. This finding led to the study described here, which demonstrates that many different ESAG-Is are transcribed in bloodstream trypanosomes by an alpha-amanitin-sensitive RNA polymerase and suggests that the term "expression site-associated gene" may be a misnomer for this gene family.


MATERIALS AND METHODS

Trypanosomes

Bloodstream trypanosome clones MVAT4, MVAT5-Rx2, MVAT7, WRATat1.1, and WRATat1.19 from the WRATat serodeme of Trypanosoma brucei rhodesiense (14) were grown and isolated from rats as described previously (13, 15, 16). The WRATat1.1 trypanosome clone is the progenitor of the other clones. The MVAT4, MVAT5-Rx2, and MVAT7 bloodstream clones express VSGs that are also expressed by metacyclic stage trypanosomes (17). The clones were shown by immunofluorescence to be at least 99% pure with respect to the VSG being expressed. Procyclic trypanosomes derived from bloodstream clones MVAT7 or MVAT5-Rx2 were maintained in culture as described (13).

Analysis of Nascent (Run-on) RNA in Isolated Nuclei

The nuclei of procyclic or bloodstream trypanosomes were isolated using a protocol kindly provided by Etienne Pays and described previously (13, 18, 19) and stored at -70 °C until used. They were thawed and incubated with [alpha-32P]UTP, and their RNAs were isolated for use as probes in Southern blots as described (13). In some experiments alpha-amanitin (500 µg/ml) was added to the nuclei prior to incubation.

Other Procedures

The bloodstream T. brucei rhodesiense cDNA libraries were constructed in ZAP (Stratagene) as described earlier for the MVAT4 cDNA library (20). The libraries were screened with 32P-labeled probe A (see Fig. 2) using standard procedures (21) under the following moderately stringent hybridization and washing conditions: 42 °C for 16 h with the labeled probe in 50% formamide, 6× SSC, 0.1% SDS, 5× Denhardt's solution, and 100 µg/ml denatured salmon sperm DNA, followed by a single washing at 45 °C for 1 h in 0.2× SSC and 0.1% SDS. Genomic DNAs (22) and total RNAs (23) were isolated from bloodstream or procyclic trypanosomes for Southern and Northern blots (21). Probes A-F shown in Fig. 2 were generated by either polymerase chain reaction amplification or restriction enzyme digestion. Hybridization probes were labeled with 32P using the random priming method (24). DNA sequences in plasmids were determined by a combination of manual sequencing (25) using a Sequenase kit (U.S. Biochemical Corp.) and automated sequencing using an ABI 373 automated sequencer (Perkin-Elmer). Sequences were aligned using the HIBIO Macintosh DNASIS program (Hitachi) and the CLUSTAL algorithm (26).


Figure 2: Schematic representation of the nucleotide sequences of 14 different ESAG-I cDNAs and the ESAG-I upstream of the MVAT4 VSG gene. The cDNAs were isolated from the following cDNA libraries: ESAG-Ia and ESAG-If (WRATat 1.1 library); ESAG-Ib and ESAG-Ig (WRATat 1.19 library); all remaining ESAG-Is (MVAT4 library). MVAT4 ESAG-I was isolated from a genomic DNA clone in an MVAT4 genomic DNA library (12). ESAG-Ia was chosen as the reference with which the others are compared because it is the longest sequence. The ESAG-I coding region is about 1 kb. Vertical lines represent one to four nucleotide changes; dashed lines denote insertions or deletions of one to four nucleotides. Patterned regions represent segments of nonidentity; identical patterns indicate identical or near identical sequences. Black circles indicate a stop codon interrupting the open reading frame. Small black rectangles at the beginning of some cDNAs indicate a putative start codon of ATA. Small open rectangles at the beginning of ESAG-Il, ESAG-Im, and the MVAT4 ESAG-I indicate the conventional start codon, ATG. SL indicates that the cDNA contains at least part of the 5'-spliced leader sequence. All the cDNAs have a 3' poly(A) tail. ESAG-Ij completely lacks a 3'-UTR, and ESAG-Il has a single nucleotide as a 3'-UTR. ESAG-Ic and ESAG-Ij are identical except that ESAG-Ij lacks the 3'-UTR. Probes A-F, indicated by thick black horizontal lines, were used in hybridization experiments.




RESULTS

Comparison of ESAG-I cDNAs

The telomere-linked expression sites for the genes encoding the MVAT4 VSG and the WRATat1.1/1.19 VSG were characterized in earlier experiments (13, 39) and are shown in Fig. 1. MVAT4 bloodstream trypanosomes express a metacyclic VSG gene without apparent DNA rearrangement from a promoter located about 2 kb upstream of the VSG gene's start codon. An ESAG-I situated about 3 kb upstream of this promoter will be referred to as MVAT4 ESAG-I. Nuclear run-on experiments have demonstrated that most, if not all, of the 3-kb intergenic region between MVAT4 ESAG-I and the promoter is not transcribed (13).


Figure 1: Diagrams showing the telomere-linked expression sites for the genes encoding the MVAT4 and WRATat1.1/1.19 VSGs. Rectangles indicate the VSG and ESAG-Is; black circles denote telomeres; the flag indicates the promoter for the MVAT4 VSG gene; zig-zag lines show tandem 76-bp repeats; wavy lines denote precursor RNAs. Probe A was used to screen an MVAT4 cDNA library and is also depicted in Fig. 2.



WRATat1.1 and WRATat1.19 are separately isolated trypanosome clones expressing the same bloodstream VSG gene, also without apparent DNA rearrangement. WRATat1.19 was originally cloned from a mouse infected by tsetse flies that had ingested WRATat1.1 trypanosomes. In both WRATat1.1 and WRATat1.19, the expressed VSG gene is preceded by a barren region of 25 kb or more that is composed predominantly of 76-bp repeats. The promoter for this VSG gene has not been identified, but nuclear run-on assays using ultraviolet (UV)-irradiated nuclei from WRATat1.1 indicated that it is located far upstream of the gene and perhaps in front of the 76-bp repeats. Thus, this expression site resembles other bloodstream VSG expression sites whose primary transcripts have been shown to be 45-60 kb in length (3, 4, 5, 6). It is not known if an ESAG-I is located in front of this barren region or if one is represented in the very long primary transcript encoding the WRATat1.1/1.19 VSG.

To examine the expression of ESAG-Is in these trypanosomes, the 660-bp probe A indicated in Fig. 1 and Fig. 2 was used to screen cDNA libraries constructed from poly(A) RNA of each of these three trypanosome clones. When 70,000 clones in the MVAT4 cDNA library were screened, 24 clones were identified (0.034%). Since about 4% of the cDNAs in the same library encode the MVAT4 VSG, the ratio of ESAG-I cDNAs to MVAT4 VSG cDNAs in this library is 0.034:4 or about 1:120. Likewise, the WRATat1.1 and WRATat1.19 cDNA libraries were found to contain a similar ratio of ESAG-I cDNAs to VSG cDNAs.
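The library arithmetic above can be checked with a short calculation. This is only an illustrative sketch in Python; the variable names are ours, and the inputs are the three figures quoted in the text (70,000 clones screened, 24 positives, about 4% VSG cDNAs).

```python
# Figures quoted in the text; variable names are illustrative only.
clones_screened = 70_000   # MVAT4 cDNA clones screened with probe A
esag_positive = 24         # clones that hybridized to probe A
vsg_percent = 4.0          # approximate share of the library encoding the MVAT4 VSG

# Share of the library made up of ESAG-I cDNAs, in percent.
esag_percent = 100 * esag_positive / clones_screened

# Number of VSG cDNAs per ESAG-I cDNA.
vsg_per_esag = vsg_percent / esag_percent

print(f"ESAG-I cDNAs: {esag_percent:.3f}% of the library")
print(f"VSG cDNAs per ESAG-I cDNA: about {vsg_per_esag:.0f}")
```

With unrounded inputs the ratio comes out near 1:117, which agrees with the "about 1:120" figure given that the 4% VSG share is itself approximate.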

The 24 ESAG-I cDNAs from an MVAT4 cDNA library and four additional ESAG-I cDNAs (two each from WRATat1.1 and WRATat1.19 cDNA libraries) were chosen for further study. Partial sequence determinations of the 24 ESAG-I cDNAs from the MVAT4 cDNA library revealed that 20 possessed unique coding sequences, none of which was identical to MVAT4 ESAG-I. Thus, a minimum of 20 different ESAG-I mRNA species occur in MVAT4 trypanosomes, and it seems likely that additional unique ESAG-I cDNAs could be found if the library were rescreened and more positive clones were sequenced. The complete sequences of 11 of the 24 cDNAs were determined (Fig. 2). The complete sequences of the remaining 13 cDNAs were not determined because their partial sequences indicated that they were very similar to at least one of the other sequences. As it turned out, 2 of the 11 cDNAs were found to be identical (collectively called ESAG-Ik in Fig. 2), and another two differ only in the lengths of their 3'-UTRs (ESAG-Ic and ESAG-Ij in Fig. 2). In addition, the complete sequences of the four ESAG-I cDNAs from the other two libraries were determined and were also found to have nucleotide differences. These 14 distinct ESAG-I cDNAs are compared schematically in Fig. 2 along with the MVAT4 ESAG-I coding sequence. The 14 cDNAs are called ESAG-Ia to -In, with ESAG-Ia being chosen as the reference sequence for the sake of comparison because it is the longest sequence determined.

As is readily apparent from Fig. 2, some ESAG-I cDNAs are more similar to each other than are others. For example, ESAG-Ia to -Ij have very similar coding regions, but only the coding regions of ESAG-Ic and ESAG-Ij are identical. Among these 10 ESAG-Is, the first four share long 3'-UTRs (>1 kb) that are very similar, the fifth has a related intermediately sized 3'-UTR, and the next three (ESAG-If, -Ig, and -Ih) share an unrelated short 3'-UTR (207 bp) whose divergence extends into the last 16 codons of the coding region. The last two cDNAs among this common group of 10 have either a very short 3'-UTR of 15 bp (ESAG-Ii) or completely lack a 3'-UTR (ESAG-Ij). In the latter case, the last residue of the termination codon, TGA, is the first residue of the poly(A) tail. Five of these 10 common cDNAs have an interior termination codon (indicated by the black dot in Fig. 2 and corresponding to codon 106 of ESAG-Ia) that disrupts the open reading frame, suggesting they are derived from pseudogenes. At least one other ESAG-I sequence with an internal termination codon has been reported (27). ESAG-Ik appears to be a hybrid ESAG; the 5'-half resembles the above group of 10, whereas the 3'-half, including the 3'-UTR, is divergent. Another unexpected feature of this group of 10 ESAG-I cDNAs is that some of those whose 5'-ends extend to the 5'-spliced leader do not possess the usual ATG start codon. Instead, they have the codon ATA at this position (small black rectangles in Fig. 2). Although it is not known if the corresponding AUA in the RNA can serve to initiate protein synthesis in trypanosomes, this triplet can function as a start codon in prokaryotes and mitochondria (28, 29, 30). Whether it can serve as a start codon in eukaryotes probably depends on the flanking nucleotides (31) and remains to be demonstrated in trypanosomes. In contrast, ESAG-Il, ESAG-Im, and the MVAT4 ESAG-I have the conventional ATG start codon.
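For readers who wish to screen their own sequences for the kind of interior termination codon described above, the check amounts to reading the coding sequence in frame and reporting the first stop codon that precedes the final codon. The following Python sketch illustrates the idea; the function name `first_internal_stop` and the two toy sequences are hypothetical and are not ESAG-I data, and the sketch is not the procedure used in this study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_internal_stop(cds):
    """Return the 1-based codon number of the first in-frame stop codon
    that occurs before the final codon, or None if there is none.
    The sequence is read in frame from its first nucleotide."""
    cds = cds.upper()
    n_codons = len(cds) // 3
    # The final codon is skipped because it is the normal terminator.
    for i in range(n_codons - 1):
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i + 1
    return None

# Toy sequences invented for illustration (not ESAG-I data).
intact = "ATGGCTTGTGAATGA"       # ATG GCT TGT GAA TGA
interrupted = "ATAGCTTAGGAATGA"  # ATA GCT TAG GAA TGA

print(first_internal_stop(intact))       # no internal stop is found
print(first_internal_stop(interrupted))  # the stop is at codon 3
```

Note that the first toy sequence contains TGA out of frame (within TGTGAA); the in-frame reading ignores it, which is why only the frame defined by the start codon matters for calling a pseudogene candidate.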

The last three cDNAs shown in Fig. 2 (ESAG-Il, -Im, and -In) differ more substantially in sequence, both among each other and from the other ESAG-Is, than do the first 11. These differences are represented by the differently patterned segments. ESAG-Il has a 3'-UTR consisting of a single cytosine between the termination codon TGA and the poly(A) tail. MVAT4 ESAG-I, the last sequence represented in Fig. 2, is also quite divergent from any of the above 14 cDNAs.

Fig. 3 displays the deduced amino acid sequences for the eight most divergent ESAG-I coding regions, again with ESAG-Ia as the reference sequence. Three general features are apparent from this amino acid alignment. First, the N-terminal halves of the ESAG-I coding sequences are clearly more similar than are the C-terminal halves, an observation made previously for a smaller number of ESAG-I sequences (8, 11, 12). The only exception to this rule is ESAG-Id (second line in Fig. 3), which displays more divergence from ESAG-Ia in the front half than in the back half. Thus, from a functional standpoint, the more highly conserved N-terminal half appears to tolerate fewer changes than the C-terminal half. The second apparent feature is that individual ESAG-Is often differ from each other in small blocks of amino acids. Correspondingly, two or more ESAG-Is may share a block of 2-10 amino acids that the other ESAG-Is do not have. This result suggests that ESAG-I family members may undergo occasional internal cross-over events among themselves, diversifying their sequences. An example of this possibility is ESAG-Ik, whose N-terminal half closely resembles the common group of 10 and whose C-terminal half is clearly derived from a different sequence that nevertheless has within it some conserved blocks of amino acids. Another illustration comes from a comparison of ESAG-Im and ESAG-In, two of the more highly divergent ESAG-Is whose amino acid sequences are much more similar in their N-terminal halves than in their C-terminal halves. Finally, six of the eight cysteines in ESAG-Ia are conserved in all of the ESAG-Is, and some are conserved within highly divergent regions, suggesting that these cysteines may play an important structural or functional role.


Figure 3: Comparison of the deduced amino acid sequences of the indicated ESAG-Is. Dots indicate identical amino acids. Amino acid differences are highlighted in black. Note that some ESAG-Is share short blocks of identity, whereas others share large regions of identity. The site of the interior termination codon indicated by the black dots in other cDNAs shown in Fig. 2 is denoted by an asterisk.



Northern and Southern Blot Analyses

Fig. 4A shows Northern blots of RNAs from procyclic trypanosomes (lane P) and from bloodstream trypanosome clones MVAT4 and MVAT7 (lanes 4 and 7). The blots are probed with three probes indicated in Fig. 2 and a tubulin probe. Probe B is a representative of the coding sequence shared by the common group of 10 cDNAs described above. Probe C represents the long 3'-UTR possessed by a subset of this common group, and probe D is the divergent 3'-half of ESAG-Ik.


Figure 4: Northern blot analysis of ESAG-I mRNAs. A, Northern blots of total RNAs (5 µg/lane) from procyclic (lane P), bloodstream MVAT4 (lane 4), and bloodstream MVAT7 (lane 7) trypanosomes probed with ESAG-I probes B, C, and D shown in Fig. 2 and with the tubulin coding region. B, similar Northern blots of bloodstream WRATat1.1 (lane 1) and WRATat1.19 (lane 19) trypanosome RNAs probed with probes B and C. Lane M contains standard length markers.



Probe B hybridizes to two RNA size classes, one of 2.3 kb and one of 1.3 kb. If the poly(A) tail accounts for about 0.2 kb, the two classes correspond in size to the ESAG-I cDNAs in the common group of 10 that have either the long 3'-UTR (ESAG-Ia to -Id) or the unrelated short 3'-UTR (ESAG-If to -Ih). In some lanes an intermediately sized band can be seen that likely corresponds to ESAG-Ie, which has an intermediately sized 3'-UTR. No hybridization to procyclic RNA was detected, even under very long exposure times. Thus, the steady state level of ESAG-I RNA is much higher in bloodstream parasites than in procyclic organisms.

Probe C does not hybridize to the 1.3-kb RNA, supporting the above interpretation that this RNA size class corresponds to ESAG-I cDNAs lacking this long 3'-UTR. Probe C does hybridize to the 2.3-kb RNA, as expected, and also hybridizes to a larger RNA species not recognized by probe B. The significance of this larger RNA is uncertain. Perhaps all or part of the probe C sequence occurs in an RNA species that does not have an ESAG-I coding region or has an ESAG-I coding region that has diverged enough to not cross-hybridize with probe B under the moderately stringent conditions used. Since a representative of this larger RNA species was not among the cDNAs examined, its presence was not studied further. Probe D recognizes an RNA slightly smaller than 1.3 kb, as expected from the size of its cDNA (Fig. 2). The tubulin probe (which does not have small amounts of vector sequence that hybridize to the marker DNAs as do the other probes) generates signals of similar intensities in the other three lanes, showing that similar amounts of RNA were loaded in each lane.

Fig. 4B shows that similar hybridization patterns are obtained when RNAs from the bloodstream WRATat1.1 and WRATat1.19 trypanosome clones are probed with fragments B and C. Thus, multiple ESAG-I mRNA species are present in these trypanosome clones as well, consistent with the finding that the four ESAG-I cDNAs examined from the WRATat1.1 and WRATat1.19 cDNA libraries are all different. Since the expression site for the 1.1/1.19 VSG is transcribed into a long polycistronic pre-mRNA, the presence of multiple, heterogeneous ESAG-I transcripts is not an aberration peculiar to active expression sites that are transcribed into a monocistronic VSG pre-mRNA.

When probe A (the MVAT4 ESAG-I) was used in Northern blots under the same hybridization stringency conditions, no detectable signal was observed with either bloodstream or procyclic RNAs, even after long exposure times (not shown). This result is consistent with the fact that none of the 24 ESAG-I cDNAs isolated from the MVAT4 cDNA library encodes the MVAT4 ESAG-I. Thus, if MVAT4 ESAG-I RNA is present in MVAT4 trypanosomes, it is at a level too low to detect on Northern blots and at best represents only a few percent of the total ESAG-I RNA population in MVAT4 organisms.

Fig. 5 shows Southern blots of genomic DNAs from MVAT4, MVAT7, and WRATat1.1 trypanosomes hybridized to probes B, C, and F. EcoRI and HindIII were used to digest the DNAs because none of the ESAG-I cDNA sequences possess their cleavage sites. The hybridization patterns were identical for all three genomes with all of the probes tested, providing no evidence for DNA rearrangements among the ESAG-Is. Probe B (coding region of the common 10) recognizes at least eight HindIII fragments, one of which is much more intense than the others, suggesting that either multiple copies of this particular fragment exist or multiple genes occur within the fragment. Probe C (long 3'-UTR) hybridizes to a subset of the EcoRI and HindIII fragments to which probe B binds, as expected if this 3'-UTR is encoded by only some of the genes recognized by probe B. Probe F, which should recognize another subset of these same genes, generates a simple banding pattern but appears to detect an EcoRI and HindIII fragment not recognized by probe B. The interpretation of this result is not clear, but it suggests that at least one copy of the probe F sequence in the genome might be flanked by a region other than the probe B sequence.


Figure 5: Southern blot analysis of ESAG-Is. DNAs (5 µg/lane) from trypanosome clones MVAT4 (lane 4), MVAT7 (lane 7), and WRATat1.1 (lane 1) were digested with EcoRI or HindIII and probed with ESAG-I probes B, C, and F shown in Fig. 2.



Analysis of Nascent ESAG-I Transcripts in Nuclei of Procyclic and Bloodstream Trypanosomes

Nuclear run-on assays were used to detect the nascent transcripts of the ESAG-Is. Radiolabeled run-on RNAs from nuclei of procyclic and bloodstream trypanosomes incubated in the presence or absence of alpha-amanitin were used to probe the ESAG-I coding region. Transcription from the VSG and PARP gene expression sites is known to be resistant to alpha-amanitin, suggesting that these expression sites are transcribed by RNA polymerase I or a modified RNA polymerase II (32, 33). The procyclic portion of Fig. 6 (left panel) shows the results when procyclic run-on RNA was used as the probe. As expected, procyclic RNA synthesized in the absence of alpha-amanitin (panel C) hybridizes to the PARP and tubulin genes but not to the VSG gene. It also hybridizes with a reduced signal to probe B (the coding region of the common 10) but not to probe A (MVAT4 ESAG-I coding region). This result indicates that at least some of the ESAG-Is are transcribed in procyclic organisms even though procyclic ESAG-I RNA was not detected on Northern blots. When alpha-amanitin was present during the procyclic nuclei incubation (panel B), transcription of the PARP genes was unaffected but transcription of the genes for tubulin and ESAG-I was greatly reduced, indicating that the ESAG-Is are transcribed by a conventional RNA polymerase II, similar to the tubulin genes but distinct from the PARP and VSG genes. The bloodstream portion of Fig. 6 (right panel) shows a similar experiment using run-on RNA from bloodstream MVAT4 nuclei. In the absence of alpha-amanitin this RNA hybridizes strongly to the MVAT4 VSG and weakly to probe B. Thus, the single MVAT4 VSG gene is transcribed at a much higher rate than are the multiple ESAG-Is, consistent with the 120:1 ratio of their respective cDNAs in the MVAT4 cDNA library. Again, no hybridization to probe A was detected, consistent with the finding that transcripts of MVAT4 ESAG-I are only a small fraction of the pool of heterogeneous ESAG-I RNAs in MVAT4 trypanosomes, if indeed they even exist at all. The alpha-amanitin had no effect on VSG gene transcription, as expected, but diminished transcription of the ESAG-Is to an undetectable level, indicating that the MVAT4 VSG gene and the ESAG-Is are transcribed by different RNA polymerase complexes. The run-on RNAs were also used to probe other unique regions within the collection of ESAG-I cDNAs (not shown). Although the strongest signals were obtained to the probe B sequence, consistent with its presence in more than half of the ESAG-I cDNAs, weak alpha-amanitin-sensitive signals were also obtained to probes C and D and to fragments unique to ESAG-Il, -Im, and -In.


Figure 6: Analysis of nascent ESAG-I transcripts in procyclic and bloodstream form trypanosomes. Left panel (procyclic): run-on RNA prepared from nuclei of procyclic form trypanosomes incubated in the presence (+) or absence (-) of alpha-amanitin was used to probe Southern blots. Panel A shows a photograph of an ethidium bromide-stained gel containing the excised inserts of different plasmids. Fragment T encodes tubulin; fragment P encodes PARP; fragment V encodes the MVAT7 VSG; fragments A and B are the ESAG-I probes A and B shown in Fig. 2. The 3-kb fragment in each lane is the linearized vector, and the larger fragments are partial digestion products. Panels B and C are autoradiograms of blots containing the fragments shown in panel A. The arrow indicates the signal to fragment B in panel C. Right panel (bloodstream): run-on RNA prepared from nuclei of bloodstream form MVAT4 trypanosomes incubated in the presence (+) or absence (-) of alpha-amanitin was used to probe Southern blots. In panel A, fragment V encodes the MVAT4 VSG, and fragments A and B are the ESAG-I probes A and B. The arrow indicates the signal to fragment B in panel C.




DISCUSSION

Recent reviews on the molecular mechanisms of trypanosome antigenic variation, including one from this lab (1), invariably refer to ESAG-Is as genes that are co-transcribed with the VSG gene in the active VSG gene expression site. In MVAT4 trypanosomes this clearly is not the situation. No evidence was obtained for the expression of MVAT4 ESAG-I in MVAT4 organisms. None of the ESAG-I cDNAs had the same sequence as MVAT4 ESAG-I (Fig. 1 and Fig. 2), nuclear run-on RNA from MVAT4 nuclei did not hybridize to MVAT4 ESAG-I (Fig. 6), and Northern blots probed with the MVAT4 ESAG-I probe (probe A) did not detect a transcript. Thus, if MVAT4 ESAG-I is expressed in MVAT4 organisms, its mRNA is a very small fraction of the total pool of heterogeneous ESAG-I mRNAs and too rare to detect by conventional techniques.

A possible explanation for this unexpected finding is that the monocistronic MVAT4 expression site is not representative of the polycistronic expression sites identified for at least some other bloodstream VSG genes (3, 4, 5, 6). However, the expression site for the WRATat1.1/1.19 VSG gene does resemble these other polycistronic VSG expression sites, and WRATat1.1/1.19 trypanosomes likewise contain a heterogeneous population of ESAG-I mRNAs. Indeed, Northern blots (Fig. 4) suggest that four different trypanosome clones (MVAT4, MVAT7, WRATat1.1, and WRATat1.19) express essentially the same array of different ESAG-Is. Yet, in those cases where an ESAG-I does occur within the polycistronic transcription unit of a VSG gene expression site, its RNA sequence is the predominant ESAG-I mRNA in that trypanosome clone (8). The simplest explanation of these results is that an expressed ESAG-I need not be in a telomere-linked VSG gene expression site, but it does no harm if it is there. Perhaps ESAG-Is become part of a telomere-linked, polycistronic expression site only if they inadvertently land there as a partner in a recombination event such as a transposon-mediated transposition, a VSG gene conversion, or some other DNA rearrangement event. Our preliminary characterizations of genomic DNA clones containing ESAG-Is suggest that they are scattered about the genome, sometimes in clusters of at least four gene copies, and Southern blots of CHEF gels probed with probes B and C indicate that they occur predominantly on large chromosomes of 2,000 kb or more (not shown). In addition, Southern blots of restricted genomic DNA (Fig. 5) provide no evidence that they are near large barren regions such as the 76-bp repeats or telomeric repeats. Thus, although some members of the ESAG-I family are located in a VSG expression site, others are not and manage to be transcribed from these other sites.

The nuclear run-on assays (Fig. 6) indicated that the ESAG-Is are transcribed in both procyclic and bloodstream trypanosomes by an alpha-amanitin-sensitive RNA polymerase, in contrast to VSG genes whose RNA synthesis is resistant to alpha-amanitin. Yet Northern blots showed that the ESAG-I mRNAs accumulate to a detectable steady state level only in bloodstream trypanosomes, similar to the VSG mRNAs. These results are consistent with the earlier findings of Graham and Barry (34), except that they found that synthesis of ESAG-I transcripts in procyclic organisms was resistant to alpha-amanitin rather than sensitive. Although the reason for this difference is unclear, the distribution of ESAG-Is within the genome of the T. brucei 221 clone used in their work is undoubtedly different from that in our trypanosome clones, so at least some ESAG-Is of the two respective serodemes could be on different transcription units in procyclic organisms.

In the case of VSG mRNAs, their semiconserved 3'-UTRs are crucial in conferring the bloodstream stage specificity to the VSG mRNAs (35). However, at least five different general groupings of 3'-UTR sequences exist among the heterogeneous ESAG-I cDNAs described here (represented in ESAG-Ia, -If, -Ik, -Im, and -In). Our attempts to identify substantive sequence similarities among these five main 3'-UTR groupings via either pairwise alignments or a group alignment of all five were not particularly revealing. The longest stretch of sequence identity among all five is five bp. The maximum sequence identity between any two of the five groups is 17 of 23 positions. Still other poly(A) ESAG-I cDNAs were found to have 3'-UTRs of only 15, 1, or 0 nucleotides, suggesting but not proving that the 3'-UTR is not responsible for the bloodstream stage-specific stability of the ESAG-I mRNAs as it is for VSG mRNAs. One possibility for the existence of these unique ESAG-I 3'-UTR sequences is that they might confer different properties to the ESAG-I RNAs within bloodstream trypanosomes, such as different half-lives, specific cytoplasmic compartmentalization, or differential expression in stumpy and slender forms. Another possibility is that their bloodstream stage-specificity could be conferred, not by the 3'-UTRs but by sequences within a conserved segment of the more highly conserved N-terminal coding regions. It should be possible to test these and other models for the bloodstream stage specificity of ESAG-I mRNAs using transient transfections with a reporter gene containing different segments of the ESAG-I coding regions and 3'-UTRs.
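A simple, alignment-free way to look for a stretch of identity shared by every member of a set of sequences is to search for their longest common substring. The Python sketch below illustrates that idea only; it is not the DNASIS/CLUSTAL alignment procedure used here, the function name `longest_shared_substring` is hypothetical, and the three short sequences are invented toy data rather than ESAG-I 3'-UTRs.

```python
def longest_shared_substring(seqs):
    """Return the longest substring present in every sequence.
    Brute-force search over substrings of the shortest sequence,
    adequate for a handful of short UTR-sized sequences."""
    shortest = min(seqs, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in seqs):
                return candidate
    return ""

# Toy sequences invented for illustration (not ESAG-I 3'-UTRs).
toy_utrs = ["GATTACATTT", "CCATTACAGG", "TTTACATACG"]

shared = longest_shared_substring(toy_utrs)
print(shared, len(shared))
```

Because this measure ignores position, a shared substring found this way may lie at unrelated locations in the different sequences, so it can only overestimate the identity that a positional alignment would report.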

The small blocks of identity and/or dissimilarity among some of the deduced ESAG-I amino acid sequences (Fig. 3) are reminiscent of similar properties of a few duplicated VSG genes (36, 37, 38). In those VSG examples, the newly duplicated VSG gene is a mosaic of segments from two or more closely related donor VSG isogenes and is likely created by multiple cross-over events among the isogenes during duplication. Sometimes, these donor VSG genes are actually pseudogenes with internal termination codons that prohibit their own expression into protein but do not interfere with the conversion of other segments of their sequence into a new gene. Similar events could also scramble blocks of sequences among the multiple ESAG-Is and pseudogenes, leading to the patterns revealed in Fig. 2 and Fig. 3.

The function of the ESAG-I proteins remains an enigma. Their sequences and the rarity of their mRNAs suggest that they are minor surface proteins (3, 8), although attempts to verify their surface location using antibodies have been precluded by their low abundance (10). The results described here do not shed new light on their possible functions but do demonstrate that their presence is not linked to a specific VSG or VSG gene expression site. In addition, the complete sequence conservation of some segments of their N-terminal halves and the complete conservation in all ESAG-Is of six of the eight cysteines in ESAG-Ia (see Fig. 3) suggest that, despite their heterogeneity and low abundance, ESAG-I proteins do serve a role for bloodstream trypanosomes. This is a role that apparently can be fulfilled by heterogeneous mixtures of related proteins rather than a homogeneous protein population such as is required for VSG function. The challenge now is to identify that ESAG-I role.


FOOTNOTES

*
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) U40840, U40841, and U41223-U41234 (for ESAG-Ia through ESAG-In cDNAs) and M21052 (for MVAT4 ESAG-I).

§
To whom correspondence should be addressed: Dept. of Biochemistry, University of Iowa, Iowa City, IA 52242. Tel.: 319-335-7889; Fax: 319-335-6764.

(^1)
The abbreviations used are: VSG, variant surface glycoprotein; ESAG, expression site-associated gene; UTR, untranslated region; MVAT, metacyclic variant antigen type; WRATat, Walter Reed Army Trypanozoon antigen type; PARP, procyclic acidic repetitive protein; kb, kilobase(s); bp, base pair(s).


ACKNOWLEDGEMENTS

We thank Kwang Kim for generously providing the MVAT7 RNA and DNA, Shiyong Li for conducting some of the early experiments, and the University of Iowa DNA Facility for automated DNA sequence determinations (National Institutes of Health Grant DK25295).


REFERENCES

  1. Donelson, J. E. (1995) J. Biol. Chem. 270, 7783-7786
  2. Graham, S. V. (1995) Parasitol. Today 11, 217-223
  3. Vanhamme, L. and Pays, E. (1995) Microbiol. Rev. 59, 223-240
  4. Kooter, J. M., van der Spek, H. J., Wagter, R., d'Oliveira, C. E., van der Hoeven, F., Johnson, P. J., and Borst, P. (1987) Cell 51, 261-272
  5. Johnson, P. J., Kooter, J. M., and Borst, P. (1987) Cell 51, 273-281
  6. Pays, E., Tebabi, P., Pays, A., Coquelet, H., Revelard, P., Salmon, D., and Steinert, M. (1989) Cell 57, 835-845
  7. Hajduk, S., Adler, B., Bertrand, K., Fearon, K., Hager, K., Hancock, K., Harris, M., Le Blanc, A., Moore, R., Pollard, V., Priest, J., and Wood, Z. (1992) Am. J. Med. Sci. 303, 258-270
  8. Cully, D. F., Ip, H. S., and Cross, G. A. M. (1985) Cell 42, 173-182
  9. Lips, S., Revelard, P., and Pays, E. (1993) Mol. Biochem. Parasitol. 62, 135-138
  10. Cully, D. F., Gibbs, C. P., and Cross, G. A. M. (1986) Mol. Biochem. Parasitol. 21, 189-197
  11. Barnes, D. A., Mottram, J. C., and Agabian, N. (1990) Mol. Biochem. Parasitol. 41, 101-114
  12. Son, H. J., Cook, G. A., Hall, T., and Donelson, J. E. (1989) Mol. Biochem. Parasitol. 33, 59-66
  13. Alarcon, C. M., Son, H. J., Hall, T., and Donelson, J. E. (1994) Mol. Cell. Biol. 14, 5579-5591
  14. Esser, K. M., Schoenbechler, M. J., and Gingrich, J. B. (1982) J. Immunol. 129, 1715-1718
  15. Lu, Y., Hall, T., Gay, L. S., and Donelson, J. E. (1993) Cell 72, 397-406
  16. Lu, Y., Alarcon, C. M., Hall, T., Reddy, L. V., and Donelson, J. E. (1994) Mol. Cell. Biol. 14, 3971-3980
  17. Fredman, P., Richert, N. D., Magnani, J. L., Willingham, M. C., Pastan, I., and Ginsburg, V. (1983) J. Biol. Chem. 258, 11206-11210
  18. Coquelet, H., Tebabi, P., Pays, A., Steinert, M., and Pays, E. (1989) Mol. Cell. Biol. 9, 4022-4025
  19. Coquelet, H., Steinert, M., and Pays, E. (1991) Mol. Biochem. Parasitol. 44, 33-42
  20. El-Sayed, N. M., Alarcon, C. M., Beck, J. C., Sheffield, V. C., and Donelson, J. E. (1995) Mol. Biochem. Parasitol. 73, 75-90
  21. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  22. Chirgwin, J. M., Przbyla, A. E., MacDonald, R. J., and Rutter, W. J. (1979) Biochemistry 18, 5294-5299
  23. Chomczynski, P., and Sacchi, N. (1987) Anal. Biochem. 162, 156-159
  24. Feinberg, A. P., and Vogelstein, B. (1983) Anal. Biochem. 132, 6-13
  25. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
  26. Higgins, D. G., and Sharp, P. M. (1988) Gene (Amst.) 73, 237-244
  27. Alexandre, S., Guyaux, M., Murphy, N. B., Coquelet, H., Pays, A., Steinert, M., and Pays, E. (1988) Mol. Cell. Biol. 8, 2367-2378
  28. Poulis, M. I., Shaw, D. C., Campbell, H. D., and Young, I. G. (1981) Biochemistry 20, 4178-4185
  29. Belin, D., Hedgpeth, J., Selzer, G. B., and Epstein, R. H. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 700-704
  30. Barrell, B. G., Bankier, A. T., and Drowin, J. (1979) Nature 282, 189-194
  31. Kozak, M. (1989) Mol. Cell. Biol. 9, 5073-5080
  32. Rudenko, G., Chung, H-M. M., Pham, V. P., and Van der Ploeg, L. H. T. (1991) EMBO J. 10, 3387-3397
  33. Zomerdijk, J. C. B. M., Kieft, R., and Borst, P. (1991) Nature 353, 772-775
  34. Graham, S. V., and Barry, J. D. (1991) Mol. Biochem. Parasitol. 47, 31-42
  35. Berberof, M., Vanhamme, L., Tebabi, P., Pays, A., Jefferies, D., Welburn, S., and Pays, E. (1995) EMBO J. 14, 2925-2934
  36. Kamper, S. M., and Barbet, A. F. (1992) Mol. Biochem. Parasitol. 53, 33-44
  37. Pays, E., Houard, S., Pays, A., Van Assel, S., Dupont, F., Aerts, D., Huet-Duvillier, G., Gomes, V., Richet, C., Degand, P., Van Meirvenne, N., and Steinert, M. (1985) Cell 42, 821-829
  38. Thon, G., Baltz, T., Giroud, C., and Eisen, E. (1990) Genes & Dev. 9, 1374-1383
  39. Kepa, J. K. (1990) The Characterization of a T.b. rhodesiense VSG Gene before and after Passage through the Tsetse Fly, Ph.D. thesis, University of Iowa
