Proposed pathway for the biosynthesis of serovar-specific glycopeptidolipids in Mycobacterium avium serovar 2

Torsten M. Eckstein, John T. Belisle and Julia M. Inamine

Mycobacteria Research Laboratories, Department of Microbiology, Immunology and Pathology, Colorado State University, Fort Collins, CO 80523-1682, USA

Correspondence
Julia M. Inamine
jinamine{at}colostate.edu


   ABSTRACT
Members of the Mycobacterium avium complex are distinguished by the presence of highly antigenic surface molecules called glycopeptidolipids (GPLs) and the oligosaccharide portion of the serovar-specific GPL defines the 28 serovars. Previously, the genomic region (ser2) encoding the enzymes responsible for the glycosylation of the lipopeptide core to generate the serovar-2-specific GPLs has been described. In this work, the ser2 gene clusters of M. avium serovar 2 strains 2151 and TMC 724 were fully sequenced and compared to the homologous regions of M. avium serovar 1 strain 104, M. avium subsp. paratuberculosis and M. avium subsp. silvaticum. It was also determined that 104Rg, a mutant of strain 104 that produces truncated GPLs, lost several GPL biosynthesis genes by deletion. This comparison, together with analysis of protein similarities, supports a biosynthetic model in which serovar-2-specific GPLs are synthesized from a serovar-1-specific GPL intermediate that is derived from a non-specific GPL precursor. We also identified a gene encoding an enzyme that is necessary for the biosynthesis of serovar-3- and 9-specific GPLs, but not serovar-2-specific GPLs, suggesting that the different serovars may have evolved from the acquisition or loss of genetic information. In addition, a subcluster of genes for the biosynthesis and transfer of fucose, which are needed to make serovar-specific GPLs such as those of serovar 2, is found in the non-GPL-producing M. avium subspecies paratuberculosis and silvaticum.


Abbreviations: 6-dTal, 6-deoxytalose; Fuc, fucose; GPLs, glycopeptidolipids; MAC, Mycobacterium avium complex; Me, methyl; nsGPLs, non-specific glycopeptidolipids; Rha, rhamnose; ssGPLs, serovar-specific glycopeptidolipids

The GenBank/EMBL/DDBJ accession numbers for the sequences reported in this paper are AF125999 and AF143772.


   INTRODUCTION
The mycobacterial envelope contains a unique array of glycolipids (Brennan, 1988; Aspinall et al., 1995). Members of the Mycobacterium avium complex (MAC) are distinguished from other members of this genus by the presence of highly antigenic cell surface molecules called glycopeptidolipids (GPLs), which are subdivided into non-specific GPLs (nsGPLs) and serovar-specific GPLs (ssGPLs) (reviewed by Chatterjee & Khoo, 2001). All nsGPLs have an N-acylated lipopeptide core that is glycosylated at the C-terminal L-alaninol with a mono- or dimethylated rhamnose (Rha), and at the D-allo-threonine with a 6-deoxytalose (6-dTal). Based on the structure, it is hypothesized that the ssGPLs arise from further glycosylation of 6-dTal, yielding a haptenic oligosaccharide that defines each of the 28 serovars of the MAC (Brennan & Goren, 1979; Aspinall et al., 1995). We previously described a genomic region termed ser2 in two strains of M. avium subsp. avium (M. avium) serovar 2 that encodes the enzymic machinery for the glycosylation of the lipopeptide core to produce the serovar-2-specific GPL (Belisle et al., 1991, 1993b). Four functional loci (ser2A, ser2B, ser2C and ser2D) within this gene cluster were mapped and their putative functions were analysed by transposon mutagenesis (Mills et al., 1994). Recently, it was determined that the rtfA gene located in the ser2A locus is responsible for the transfer of Rha to 6-dTal (Eckstein et al., 1998).

The ser2 gene cluster should be highly conserved among strains containing the serovar-2-specific GPL; however, previous studies demonstrated restriction fragment length polymorphisms in the ser2 region of the genomes of M. avium strains 2151 and TMC 724 (Belisle et al., 1993b). In the present study, the ser2 gene cluster and the flanking regions of these two strains were sequenced and compared with the homologous regions of M. avium strain 104 (serovar 1), M. avium subsp. paratuberculosis and M. avium subsp. silvaticum. In addition, the genetic basis for a rough colony variant of strain 104 was determined to be the deletion of 10 kb of DNA that includes several GPL biosynthetic genes. The results support a proposed pathway for ssGPL biosynthesis in M. avium serovar 2 in which serovar-2-specific GPLs are synthesized from nsGPLs via a serovar-1-specific GPL intermediate, and raise interesting questions about the evolutionary mechanism for creating the different serovars.


   METHODS
Growth of bacterial strains and DNA isolation.
E. coli DH5α was used for propagation of all recombinant plasmids. Cells were grown in Luria–Bertani medium and recombinants were selected by the addition of ampicillin (100 µg ml⁻¹) or kanamycin (25 µg ml⁻¹). Recombinant plasmids were isolated from E. coli using a QIAprep Spin Miniprep Kit (Qiagen) according to the manufacturer's instructions.

M. avium strain 104 and its rough derivative (104Rg) were from J. Torrelles and D. Chatterjee (Torrelles et al., 2002). These strains were grown on Middlebrook 7H11 agar (Difco) containing 10 % oleic acid-albumin-glucose-catalase (OADC). After 2 weeks of growth at 37 °C, a single colony was used to inoculate Middlebrook 7H9 broth (Difco) containing 10 % OADC. The 10 ml starting culture was scaled up several times to a final volume of 200 ml. Cells were harvested by centrifugation at 3000 g for 30 min. For isolation of genomic DNA, harvested cells were resuspended in TES buffer (50 mM Tris, pH 8·0; 10 mM EDTA, pH 8·0; 100 mM NaCl) and stored on ice. Cells were lysed by vortexing with 0·5 mm zirconium beads for 2 min. SDS (final concentration 1 %) was added to the lysed cells, which were then kept on ice for 15 min. The lysate was treated with proteinase K (final concentration 100 µg ml⁻¹) at 55 °C for 15 min and then extracted with phenol/chloroform/isoamyl alcohol (25 : 24 : 1). The aqueous phase was loaded on an equilibrated Genomic Tip 100 (Qiagen) and the genomic DNA was eluted with TE (10 mM Tris, pH 8·0; 1 mM EDTA, pH 8·0), according to the manufacturer's instructions.

Cloning, restriction endonuclease mapping and sequencing of ser2 and flanking regions from M. avium strains 2151 and TMC 724.
The recombinant plasmids used in this study and the procedures used for their derivation are shown in Table 1. These plasmids were characterized by digestion with restriction endonucleases SmaI, KpnI, PstI, EcoRI, EcoRV, SacI, NotI and BamHI (Invitrogen), and subclones were constructed by ligation of restriction fragments into the respective sites of the pBluescript II SK(-) vector (Stratagene) using T4 DNA ligase (Invitrogen). These subclones were used as double-stranded DNA templates for DNA sequencing with the pBluescript T3 and M13 -20 primers. Custom primers were synthesized as necessary to resolve sequence ambiguities. Sequencing of DNA was performed by the Macromolecular Resources Facility at Colorado State University. Contiguous DNA sequences, ORF analysis and codon usage were determined with Sequencher 3.0 software (Gene Codes Corporation) and FramePlot 2.3beta (http://www.nih.go.jp/~jun/cgi-bin/frameplot.pl) (Ishikawa & Hotta, 1999). Identification of the putative function of each ORF was achieved via similarity searches between the deduced amino acid sequences and known proteins using BLAST (http://www.ncbi.nlm.nih.gov/BLAST/) and multi-alignments were generated using MultAlin (http://www.toulouse.inra.fr/multalin.html) (Corpet, 1988).
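
The restriction mapping step described above can be illustrated computationally. The following Python sketch is not part of the original analysis; it only locates the recognition sequences of the eight enzymes named in the text within a DNA string. The function name find_sites and the 40 nt test fragment are hypothetical and were chosen for illustration.

```python
# Recognition sequences (5'->3') of the enzymes named in the text.
# All eight sites are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "SmaI": "CCCGGG",
    "KpnI": "GGTACC",
    "PstI": "CTGCAG",
    "EcoRI": "GAATTC",
    "EcoRV": "GATATC",
    "SacI": "GAGCTC",
    "NotI": "GCGGCCGC",
    "BamHI": "GGATCC",
}


def find_sites(sequence):
    """Return 1-based start positions of each recognition site in `sequence`."""
    seq = sequence.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + 1)
            start = seq.find(site, start + 1)  # allow overlapping sites
        hits[enzyme] = positions
    return hits


# Hypothetical 40 nt test fragment, not taken from the ser2 sequence.
toy_fragment = "AAGGTACCTTTTCTGCAGAAAACCCGGGTTTTGGATCCAA"
site_map = find_sites(toy_fragment)
for enzyme, positions in site_map.items():
    print(f"{enzyme:6s} {positions}")
```

Only site positions are reported here; converting them to fragment sizes would additionally require the cut position within each recognition sequence.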


Table 1. Recombinant plasmids used for DNA sequencing

 
PCR analysis, cloning and sequencing of the ser1 and flanking regions remaining in M. avium strain 104Rg.
Forward primer P1 (5'-CCGGCCGTTCCTGGTGAAGTG-3') and reverse primer P2 (5'-GATCGCCCGGAACGTCTTCTT-3') were synthesized by the Macromolecular Resources Facility at Colorado State University. PCR amplification was performed using a 2400 GeneAmp PCR System (Perkin–Elmer) and Taq polymerase (Invitrogen). The PCR programme used with the above primers was 28 cycles of 94·0 °C for 1 min, 63·4 °C for 1 min and 72·0 °C for 2·5 min. Prior to the first cycle, the starting temperature of 94·0 °C was held for 5 min, and at the end of the last cycle, a temperature of 72·0 °C was held for 7 min. The PCR product was resolved on a 0·8 % agarose gel and stained with ethidium bromide.
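
As a quick check on the primers and cycling conditions given above, the short Python sketch below computes the length and GC content of P1 and P2 and the total programmed hold time of the PCR. The primer sequences and times are taken from the text; the helper names gc_percent and programme_minutes are hypothetical, and ramp times between steps are ignored.

```python
P1 = "CCGGCCGTTCCTGGTGAAGTG"  # forward primer, from the text
P2 = "GATCGCCCGGAACGTCTTCTT"  # reverse primer, from the text


def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    gc = sum(base in "GC" for base in primer.upper())
    return 100.0 * gc / len(primer)


def programme_minutes(cycles, step_minutes, initial_hold, final_hold):
    """Total programmed hold time in minutes, ignoring ramp times."""
    return initial_hold + cycles * sum(step_minutes) + final_hold


for name, primer in (("P1", P1), ("P2", P2)):
    print(f"{name}: {len(primer)} nt, {gc_percent(primer):.1f} % GC")

# 5 min initial hold, 28 cycles of 1 + 1 + 2.5 min, 7 min final hold.
total = programme_minutes(cycles=28, step_minutes=(1.0, 1.0, 2.5),
                          initial_hold=5.0, final_hold=7.0)
print(f"Programmed time: {total:.0f} min")
```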

The 3690 bp PCR product was purified from the gel using the QIAquick Gel Extraction Kit (Qiagen), according to the manufacturer's instructions, cloned into pGEM-T Easy (Promega) and the insert was sequenced using the SP6 and T7 primers. Subclones for DNA sequencing were generated with SstI, NcoI and HincII (Invitrogen). Sequencing of DNA was performed by the Macromolecular Resources Facility at Colorado State University. DNA sequence alignments, ORFs and codon usage were determined with Sequencher 3.0 software (Gene Codes Corporation). Identification of the site of the genomic deletion was facilitated by comparing the sequence data to the genome sequence data from M. avium strain 104 provided by The Institute for Genomic Research (http://www.tigr.org).

Nucleotide sequence accession numbers.
The sequence data presented in this article are available in GenBank at AF143772 (M. avium 2151) and AF125999 (M. avium TMC 724). The previously published rtfA sequence data (Eckstein et al., 1998) at AF060183 are included in AF125999, and the previously published IS1601 sequence data (Eckstein et al., 2000) at AF060182 are included in AF143772.


   RESULTS
Sequence analysis and definition of the ser2 gene cluster
The ser2 gene clusters of M. avium serovar 2 strains 2151 and TMC 724, as well as their flanking regions, were sequenced and the ORFs were determined (Table 2). Elucidation of putative gene function was facilitated by comparing this sequence information to the homologous regions in the genome sequence data from M. avium serovar 1 strain 104 provided by The Institute for Genomic Research (http://www.tigr.org). The data were also compared to the GS element of M. avium subsp. paratuberculosis and M. avium subsp. silvaticum (Tizard et al., 1998; Bull et al., 2000). The alignments of the five genetic maps of these regions are shown in Fig. 1.


Table 2. Analysis of the ORFs in the ser2 regions of M. avium serovar 2 strains 2151 and TMC 724

 


Fig. 1. Restriction endonuclease map and genetic organization of the gene clusters involved in the glycosylation of the lipopeptide core of M. avium strain 104 (a), M. avium strain 2151 (b), M. avium strain TMC 724 (c), M. avium subsp. silvaticum (d) and M. avium subsp. paratuberculosis (e). The boxes represent the following homologous gene clusters (from left to right as shown in panel b): gene cluster upstream of ser1 (left crosshatching), ser1 gene cluster (black), ser3' gene cluster (white), fbt gene cluster (grey), drr gene cluster (right crosshatching). Genes encoding putative proteins with unknown functions are labelled with orf and letters or numbers. The names of the ORFs in parentheses in (d) and (e) are from Bull et al. (2000) and Tizard et al. (1998). In (e), orfA' is the ORF lacking 66 nt from the 3' end of orfA (a and b), and gtfC' is the ORF lacking 258 nt from the 5' end of gtfC (b, c and d). Restriction sites are: H, HindIII; K, KpnI; P, PstI.

 
The ser2 gene cluster is defined as the genomic region encoding the enzymic machinery necessary for the glycosylation of the lipopeptide core that results in the serovar-2-specific GPL (Belisle et al., 1993a). This cluster was previously delineated by restriction mapping, Southern blot analysis and the characterization of naturally occurring rough morphotypes that could produce only the non-glycosylated lipopeptide core (Belisle et al., 1993a, b; Eckstein et al., 2000). Based on the sequence data and ORF analyses, we propose that the genetic definition of the ser2 gene cluster be all of the ORFs from mtfA to mtfE, and gtfC to gtfD (Fig. 1b, c), and that the ser2 region is composed of two functional subclusters designated ser1 and fbt that are flanked by genes that do not appear to be involved in GPL biosynthesis (see Fig. 1 and legend). As explained below, the ser1 and fbt subclusters can be assigned possible functions with regard to the synthesis of the serovar-2-specific GPLs, while the intervening ser3' region appears to be an anomaly.

Definition and proposed role for the ser1 subcluster
The lipopeptide cores of all MAC GPLs are glycosylated at two positions: the C-terminal L-alaninol is glycosylated with a mono- or dimethylated Rha (it can also be trimethylated Rha in Mycobacterium smegmatis), and the D-allo-threonine is glycosylated with 6-dTal. This is the structure of nsGPLs. The simplest of the ssGPLs is that of serovar 1 (Table 3) which contains a single Rha residue linked to 6-dTal. This basic oligosaccharide structure is in all other ssGPLs except for those of serovars 5 and 10/11 (Aspinall et al., 1995) and so the genes for its glycosylation and methylation should be present in all the relevant serovars. The ser1 region (indicated by the solid black box in Fig. 1) was common to both serovar 1 (strain 104) and serovar 2 (strains 2151 and TMC 724), and the ORFs encompassed by this region are proposed to participate in the biosynthesis of the serovar-1-specific GPL.


Table 3. Structural similarity of the oligosaccharides of ssGPLs of M. avium serovars 1, 2, 3 and 9

GlcA, Glucuronic acid; Ac, acetyl.

 
The ser1 subcluster encodes two putative glycosyltransferases (GtfA, GtfB), a known rhamnosyltransferase (RtfA) (Eckstein et al., 1998) and four putative methyltransferases (MtfA, MtfB, MtfC and MtfD) (Table 2). The deduced amino acid sequences of GtfA, GtfB and RtfA are highly similar to putative glycosyltransferases of Mycobacterium tuberculosis (Rv1524, Rv1526c) (Cole et al., 1998) and of Mycobacterium leprae (L518_C2_147) (Eiglmeier et al., 1998). They also show high levels of identity (63·4–65·1 %) and similarity (84·9–86·8 %) to one another, with GtfB and RtfA being the most similar. RtfA transfers Rha to 6-dTal (Eckstein et al., 1998) and by analogy we propose that GtfB transfers Rha to the C-terminal L-alaninol of the lipopeptide core. This leaves GtfA as the likely candidate for adding 6-dTal to the D-allo-threonine.
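
The distinction between identity and similarity in the percentages quoted above can be made concrete with a small Python sketch. It counts identical and conservatively substituted residues over the ungapped columns of a pairwise alignment. The substitution groups, the function name and the two aligned fragments are assumptions chosen for illustration; the text does not state which scoring scheme produced the reported values, so this sketch is not expected to reproduce them.

```python
# Illustrative conservative-substitution groups (an assumption, not the
# scoring scheme used in the paper).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) over columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Hypothetical aligned fragments, not real GtfB or RtfA sequence.
ident, simil = identity_and_similarity("MKVLSTDE-A", "MRILTTDQGA")
print(f"{ident:.1f} % identity, {simil:.1f} % similarity")
```

Because every identical position is also counted as similar, the similarity value can never be lower than the identity value, which matches the pattern of the figures reported in the text.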

The roles for GtfA and GtfB proposed above and the order in which these glycosylation steps occur were further examined by genetic analysis of strain 104Rg, a naturally occurring rough derivative of M. avium strain 104. Torrelles et al. (2002) determined that 104Rg produces truncated GPLs that lack the 6-dTal attached to D-allo-threonine, but retain the methylated Rha on the L-alaninol. In the present study, PCR analysis of 104Rg genomic DNA in comparison to that of the parental strain indicated that 104Rg contains gtfB, but not gtfA (data not shown). Given the presence of two copies of IS999 flanking the gtfA region in strain 104 (Fig. 1a), we reasoned that the absence of gtfA in 104Rg might be due to deletion, based on our previous finding that homologous recombination between direct repeats of IS1601 resulted in the deletion of the ser2 gene cluster to produce the Rg-1 morphotype of strain 2151 (Eckstein et al., 2000). However, these particular copies of IS999 are inverted repeats rather than direct repeats in the genome sequence of strain 104 (http://www.tigr.org) and homologous recombination between them should result in inversion rather than deletion. Nevertheless, based on the PCR results, the deletion hypothesis was tested by using primers based on orfA and rtfA sequences flanking the left and right (Fig. 1a) copies of IS999, respectively, to amplify 104Rg genomic DNA. A 3690 bp PCR product was obtained and this was cloned and sequenced. The DNA sequence indicates that 104Rg lost the 10 kb region containing orfB–gtfA while retaining one copy of IS999 (the left copy in Fig. 1a). It is possible that the 28 bp inverted repeats at the ends of IS999 mediated the deletion event since Mahairas et al. (1996) identified 24 and 12 bp repeats that were involved in the deletion of the RD2 and RD3 regions, respectively, in Mycobacterium bovis BCG. The region deleted from strain 104 includes three genes (mtfA and mtfB as well as gtfA) from the ser1 subcluster. 
Since 104Rg produces a lipopeptide core with methylated Rha on the L-alaninol (Torrelles et al., 2002), and we found that it has lost gtfA while retaining gtfB, it follows that GtfB is likely to transfer Rha to the C-terminal L-alaninol of the lipopeptide core before GtfA adds 6-dTal to the D-allo-threonine. Thus, the proposed biosynthetic scheme begins with GtfB (Fig. 2).
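
The expectation stated above, that homologous recombination between direct repeats deletes the intervening DNA whereas recombination between inverted repeats inverts it, can be sketched in a few lines of Python. The gene layout below is a simplified, hypothetical rendering of the gtfA region of strain 104: the strands assigned to the genes are arbitrary placeholders, and the function name recombine is an assumption. The sketch models only this textbook expectation, not the alternative mechanism involving the 28 bp terminal repeats of IS999.

```python
def recombine(chromosome, left, right):
    """Homologous recombination between repeat copies at indices left < right.

    chromosome is a list of (name, strand) tuples with strand '+' or '-'.
    Direct repeats (same strand): the intervening segment and one repeat
    copy are excised. Inverted repeats (opposite strands): the intervening
    segment is inverted in place.
    """
    flip = {"+": "-", "-": "+"}
    between = chromosome[left + 1:right]
    if chromosome[left][1] == chromosome[right][1]:
        return chromosome[:left + 1] + chromosome[right + 1:]
    inverted = [(name, flip[strand]) for name, strand in reversed(between)]
    return chromosome[:left + 1] + inverted + chromosome[right:]


# Simplified, hypothetical layout; only the IS999 strands matter here.
region = [
    ("orfA", "+"), ("IS999", "+"), ("orfB", "+"), ("mtfA", "+"),
    ("mtfB", "+"), ("gtfA", "+"), ("IS999", "-"), ("rtfA", "+"),
]
inverted_case = recombine(region, 1, 6)

# Same layout, but with the right-hand copy flipped to a direct repeat.
direct_layout = region[:6] + [("IS999", "+")] + region[7:]
direct_case = recombine(direct_layout, 1, 6)

print("inverted repeats:", [name for name, _ in inverted_case])
print("direct repeats:  ", [name for name, _ in direct_case])
```

In this toy model only the direct-repeat case leaves orfA, a single IS999 copy and rtfA adjacent to one another, which is the arrangement that primers in orfA and rtfA would amplify as a short product.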



Fig. 2. Proposed pathway for the biosynthesis of serovar-2-specific GPL in Mycobacterium avium serovar 2. D-Phe, D-phenylalanine; D-aThr, D-allo-threonine; D-Ala, D-alanine; Man, mannose. Enzyme names were used as mentioned in the text. Sugar carriers are not shown.

 
As noted above, 104Rg also lost mtfA and mtfB. The methyltransferases form two different groups based on either genetic location or deduced amino acid similarities. The mtfA and mtfB genes are physically separated from the mtfC and mtfD genes by gtfA and rtfA (Fig. 1). MtfA and MtfD are most similar to each other (37·5 % identity and 65·6 % similarity), as are MtfB and MtfC (85·3 % identity and 94·4 % similarity). Only MtfB and MtfC showed significant similarity to the known mycinamicin III O-methyltransferase of Micromonospora griseorubida (Inouye et al., 1994). MtfD is highly similar (90·6 %) to a methyltransferase (Mtf1 or MeTase1) in M. smegmatis that adds a methyl (Me) group to the 3-position of the Rha attached to L-alaninol (Patterson et al., 2000). Mtf1 is not highly related (53·0 % similarity) to MtfC. It is thus proposed that MtfC is responsible for the generation of 4-O-Me-Rha and MtfD is responsible for the 3-O-Me-Rha to produce nsGPL (Fig. 2). Since strain 104 also produces an nsGPL containing 3-O-Me-6-dTal that is missing in 104Rg (Torrelles et al., 2002), it is reasonable to conclude that the gene responsible for this methylation is within the genetic region deleted in 104Rg. Thus it is proposed that MtfA or MtfB methylates the 6-dTal (in serovar 1 strains). Given the high similarity between MtfA and MtfD, it is more likely that MtfA methylates the 3-position of 6-dTal. In summary, given that the truncated GPL from 104Rg lacks 6-dTal or 3-O-Me-6-dTal on D-allo-threonine, but retains the methylated Rha attached to L-alaninol (Torrelles et al., 2002), our genetic analyses support a proposed pathway for the biosynthesis of the serovar-2-specific GPL in which GtfB, MtfC and MtfD carry out the first two steps, namely the addition of the dimethylated Rha to L-alaninol, while GtfA and MtfA (or MtfB) are most likely involved in the addition of 6-dTal or 3-O-Me-6-dTal to D-allo-threonine (Fig. 2). 
The final action of the above-mentioned RtfA, namely the addition of Rha to 6-dTal, would thus lead to the formation of the serovar-1-specific GPL as shown in Fig. 2.

The anomalous ser3' gene cluster
The ser3' subcluster (indicated by the white box in Fig. 1) was so-named because it encodes a putative D-glucose dehydrogenase that would be needed to make the D-glucuronic acid found in the ssGPLs of M. avium serovars 3 or 9 (Table 3). The deduced amino acid sequence demonstrated a high degree of similarity (69 and 65 %) to known D-glucose dehydrogenases of Bacillus subtilis (Hilt et al., 1991) and Pseudomonas aeruginosa (van Schie et al., 1985), respectively. The presence of this subcluster in the serovar 2 strains is anomalous because D-glucuronic acid is not found in the serovar-2-specific GPL.

Proposed role for the fbt subcluster in the de novo biosynthesis and transfer of L-fucose (L-Fuc)
The presence of dimethylfucose is the only difference between the serovar-2-specific and serovar-1-specific GPLs (Table 3), and so the genes for Fuc synthesis and addition should be present in serovar 2 strains, but not serovar 1. The region defined as fbt (Fuc biosynthesis and transfer; indicated by the grey box in Fig. 1) fulfils this criterion and appears to contain the genes needed for the de novo biosynthesis and transfer of L-Fuc. The deduced amino acid sequences of gtfC, mdhA, merA, mtfF and gtfD demonstrated a high degree of similarity (52 % identity, 67 % similarity) to proteins involved in the de novo synthesis and transfer of L-Fuc in E. coli (Stevenson et al., 1996; Andrianopoulous et al., 1998). Interestingly, this gene cluster is identical to the GS element of M. avium subsp. paratuberculosis and subsp. silvaticum (Tizard et al., 1998; Bull et al., 2000) although neither of these subspecies expresses GPLs (see Discussion).

It is possible that one of the two putative glycosyltransferase genes (gtfC and gtfD) belongs to the ser3' rather than the fbt gene cluster because a transferase would be needed for the glucuronic acid found in the ssGPLs of serovars 3 or 9. However, mtfE is clearly part of the fbt subcluster, despite its physical separation from the other fbt genes, based on the earlier work of Mills et al. (1994). In that study, transposon mutants of the cloned ser2 region from M. avium strain 724 were analysed in M. smegmatis to identify genes involved in ssGPL biosynthesis (the M. avium genes required for nsGPL synthesis could not be characterized because M. smegmatis makes nsGPLs). The loci responsible for making 2-O-Me-Fuc and 3-O-Me-Fuc map to the mtfF and mtfE genes, respectively.

Comparison of the ser2 regions of strains 2151 and TMC 724
Comparison of the two ser2 gene clusters and their flanking regions revealed that the nucleotide sequences and gene organization were strongly conserved and only a few nucleotide differences were observed. Most were silent mutations, but some affected the deduced amino acid sequences of gtfB, mtfE and gdhA (4, 2 and 2 aa, respectively). Besides these highly conserved characteristics, there were three major differences between the organization of the ser2 gene clusters in M. avium TMC 724 and 2151. First, two ORFs encoding a putative dehydrogenase (dhgA) and a putative haemolytic protein (hlpA) are disrupted by IS1245 and IS1348, respectively, in M. avium 2151, but not in M. avium TMC 724. Second, a potential hot spot for insertion sequences was observed downstream of the fbt gene cluster in both M. avium strains, but IS1601 was found in 2151 while IS2534 was found in TMC 724. The site of integration of these two IS elements in the two strains differs by only 133 bp. Finally, M. avium TMC 724 contained a putative O-acetyltransferase gene (atfA) downstream of the fbt subcluster that is not present in the region sequenced for M. avium 2151.


   DISCUSSION
In previous studies, detailed restriction endonuclease mapping and Southern blot analyses revealed restriction fragment length differences in the ser2 gene clusters of M. avium serovar 2 strains 2151 and TMC 724 (Belisle et al., 1993b). This work shows that the DNA sequences of both ser2 gene clusters and their flanking regions are nearly identical, with the above-mentioned differences in restriction patterns being due to IS elements. Two ORFs encoding a putative dehydrogenase (dhgA) and a putative haemolytic protein (hlpA) are disrupted by IS elements in strain 2151, but not in strain TMC 724. Given that strains 2151 and TMC 724 both produce ssGPLs (Belisle et al., 1991, 1993a), these genes must not be involved in the biosynthesis of serovar-2-specific GPLs. Maslow et al. (1999) reported a correlation between haemolytic activity and disseminated infection, whereby M. avium strains from patients with disseminated MAC infection demonstrated haemolytic activity while strains from patients with pneumonia lacked this characteristic. Based on this observation, one can speculate that HlpA (as well as DhgA) might contribute to the intracellular survival of the bacilli. Comparison of the growth characteristics of these two strains in macrophages shows that strain TMC 724 grows faster and produces more colony forming units than does strain 2151 (Pedrosa et al., 1994; Florido et al., 1997).

Both ser2 gene clusters have the same organization of genes encoding the enzymic machinery for the biosynthesis of the ssGPL, and this type of genetic conservation has been observed in other systems, such as the gene clusters responsible for expression of E. coli group 1 K-antigens (Rahn et al., 1999). The nucleotide and deduced amino acid sequences of the ORFs in the ser2 regions of both strains are identical, with three exceptions (gtfB, mtfE and gdhA) as noted earlier. Given the similarity in structure of Rha, Fuc and 6-dTal, one would expect a high degree of similarity in the enzymes transferring these 6-deoxyhexose sugars. Indeed, RtfA, GtfA and GtfB are very similar to one another. However, the putative fucosyltransferases GtfC and GtfD are not related to one another or the other three glycosyltransferases. None of the five glycosyltransferases contains the E-x(7)-E-A-x(18)-E hexosyltransferase motif identified by Henderson & Nataro (1999). However, this motif was based on the analysis of glucosyl-, galactosyl- and mannosyltransferases, and not 6-deoxyhexosyltransferases, suggesting that the loss of the hydroxyl group on the sixth carbon might be the reason for the mismatch. Multi-alignment of putative 6-deoxyhexosyltransferases yielded two new motifs (Fig. 3a, b). The first motif, G-[TS]-R-G-D-x-[EQ]-P-x-x-A-x(4)-L-x(3)-G-x-x-V (Fig. 3a), is in the N-terminal 60 aa of the transferase, and the second motif, [DE]-x(18)-[AG]-[VI]-[VI]-H-H-G-G-x-G-[TS]-T (Fig. 3b), is about 200 aa away. Interestingly, NovM, an enzyme that transfers a Rha that is C-methylated at the fifth carbon, does not exhibit the first motif, but contains the second one. This suggests that the fifth and the sixth carbons of the 6-deoxyhexoses play a major role in binding to the first motif. RtfA, GtfA and GtfB contain both of the proposed 6-deoxyhexosyltransferase motifs while GtfC and GtfD do not. 
This supports our biosynthetic model (see below) in which GtfC or GtfD transfer methylated Fuc molecules and would thus have different binding and transfer motifs than the 6-deoxyhexosyltransferases.
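
The two proposed 6-deoxyhexosyltransferase motifs are written in a PROSITE-like syntax, which can be translated mechanically into regular expressions for scanning protein sequences. The Python sketch below shows one way to do this; the function names and the test peptide are hypothetical, and the peptide was constructed to contain both motifs rather than taken from any real transferase.

```python
import re

MOTIF_1 = "G-[TS]-R-G-D-x-[EQ]-P-x-x-A-x(4)-L-x(3)-G-x-x-V"
MOTIF_2 = "[DE]-x(18)-[AG]-[VI]-[VI]-H-H-G-G-x-G-[TS]-T"


def prosite_to_regex(pattern):
    """Translate a PROSITE-style motif into a Python regular expression."""
    regex = []
    for element in pattern.split("-"):
        match = re.fullmatch(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)\))?", element)
        if match is None:
            raise ValueError(f"unsupported motif element: {element!r}")
        residue, count = match.groups()
        residue = "." if residue == "x" else residue  # x = any residue
        regex.append(residue + (f"{{{count}}}" if count else ""))
    return "".join(regex)


def find_motif(pattern, protein):
    """Return 1-based start positions of non-overlapping motif matches."""
    return [m.start() + 1
            for m in re.finditer(prosite_to_regex(pattern), protein)]


# Hypothetical peptide built to contain both motifs (not a real protein).
toy_protein = ("MA" + "GTRGDAEPLLAKKKKLSSSGNNV" + "QQ"
               + "D" + "A" * 18 + "GVIHHGGKGST")

print(prosite_to_regex(MOTIF_1))
print("motif 1 at", find_motif(MOTIF_1, toy_protein))
print("motif 2 at", find_motif(MOTIF_2, toy_protein))
```

This simple translator handles only fixed repeat counts such as x(4) and would need extending for ranges or for anchoring a motif to a particular region of the protein, such as the N-terminal 60 aa mentioned for the first motif.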



Fig. 3. Motif analysis of different putative 6-deoxyhexosyl transferases. (a) Motif 1, (b) motif 2. RtfA, GtfA and GtfB are derived from this study. Rv1524 and Rv1526c, glycosyltransferases of M. tuberculosis (Q50583 and Q50581, respectively); L518_C2_147, glycosyltransferase of M. leprae (Q49929); Gtf2, glycosyltransferase of M. smegmatis (AF192151); T10P12.7, glycosyltransferase of Arabidopsis thaliana (AC007203); F17A9.17, glycosyltransferase of Arabidopsis thaliana (AC016827); Ugt80A1, glycosyltransferase of Avena sativa (Z83832); BgtfB, balhimycin-glycosyltransferase of Amycolatopsis mediterranei (Y16952); GtfE, glycosyltransferase of Amycolatopsis mediterranei (U84350); TylN, deoxyallosyltransferase of Streptomyces fradiae (AJ005397); NovM, glycosyltransferase of Streptomyces sphaeroides (AF170880).

 
Comparison of the deduced amino acid sequences of the six methyltransferases found in the ser2 gene cluster with putative enzymes responsible for the methylation of 6-deoxyhexoses revealed the motif [LM]-x(13)-[DEN]-x(10-11)-L-x(9)-[GA]-x(6)-[DE] (Fig. 4). This motif is not one of the five highly conserved amino acid sequences (regions I–V) that were identified by Ibrahim et al. (1998) when they analysed several plant O-methyltransferases. These enzymes are responsible for the methylation of chalcones, flavanones, flavones, isoflavones, flavonols and anthocyanins, but not sugars, and so one might not expect any similarity to the methyltransferases in our study. However, regions I–IV are thought to be involved in S-adenosyl-L-methionine and metal binding, and S-adenosyl-L-methionine is the common methyl donor, so we expected our methyltransferases to demonstrate some homology to those proposed regions. One obvious explanation is that the binding site for the methyl donor is different in bacteria and plants.
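
Because the methyltransferase motif contains a variable-length spacer, x(10-11), it is useful to know how many residues a match can span, for example when choosing a window for a multiple alignment. The Python sketch below parses the motif as written in the text and sums the minimum and maximum lengths of its elements; the function and variable names are hypothetical.

```python
import re

MTF_MOTIF = "[LM]-x(13)-[DEN]-x(10-11)-L-x(9)-[GA]-x(6)-[DE]"

# One motif element: a residue, a residue class or x, optionally followed
# by a repeat count "(n)" or a range "(n-m)" as written in the text.
ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)(?:-(\d+))?\))?")


def motif_span(pattern):
    """Return (min_length, max_length) of sequences matching the motif."""
    shortest = longest = 0
    for _, low, high in ELEMENT.findall(pattern):
        low = int(low) if low else 1      # no count means exactly one
        high = int(high) if high else low  # no range means fixed length
        shortest += low
        longest += high
    return shortest, longest


span = motif_span(MTF_MOTIF)
print(f"motif spans {span[0]} to {span[1]} residues")
```

For the motif given in the text this yields a span of 43 to 44 residues.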



Fig. 4. Motif analysis of different putative O-Me-transferases. MtfA, MtfB, MtfC, MtfD, MtfE and MtfF are derived from this study. Mtf1 and Mtf2, methyltransferases of M. smegmatis (AF192151); AveBVII, dTDP-6-deoxy-L-hexose 3-O-methyltransferase of Streptomyces avermitilis (AB032523); TylF, macrocin-O-methyltransferase of Streptomyces fradiae (AF147703); MycF, mycinamicin III O-methyltransferase of Micromonospora griseorubida (D16097); NovP, O-methyltransferase of Streptomyces sphaeroides (AF170880).

 
All of this information was used to propose the biosynthesis cascade shown in Fig. 2. Briefly, GtfB adds Rha to the C-terminal L-alaninol of the lipopeptide core and this Rha is then dimethylated by MtfC and MtfD. Both the transfer of this particular 6-deoxyhexose, as well as the proposal that the sugar is not methylated until after it is transferred, correlate with the high degree of similarity between GtfB and the known rhamnosyltransferase RtfA (Eckstein et al., 1998) as well as the structure of the truncated GPL made by 104Rg (Torrelles et al., 2002). GtfA demonstrated less homology to RtfA than did GtfB, so we hypothesized that GtfA is responsible for the transfer of 6-dTal to D-allo-threonine to produce the nsGPL. The nsGPL then serves as the substrate for RtfA which adds Rha to the 6-dTal to yield the serovar-1-specific GPL. In serovar 1, the 6-dTal can alternatively be methylated at the 3-position (Torrelles et al., 2002), a process that is assigned to MtfA or MtfB, primarily because mtfA and mtfB are also found in serovar 4 (Krzywinska & Schorey, 2003) and are thus not serovar-specific. These methyltransferases could also modify the fatty acid portion of the GPLs, as has been shown for Mtf2 in M. smegmatis (Jeevarajah et al., 2002), and so this is the most speculative portion of the proposed pathway. In the final part of the biosynthetic model (Fig. 2), the de novo synthesis of Fuc by MdhA and MerA, and the methylation of Fuc by MtfE and MtfF, are followed by the linkage of the dimethylated Fuc to the non-methylated Rha by GtfC or GtfD to generate the serovar-2-specific GPL.
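
The proposed cascade can be summarized as an ordered list of steps, each requiring particular genes, which makes it straightforward to ask how far the pathway could proceed for a given gene complement. The Python sketch below is a deliberately simplified model of the pathway as described in the text: it treats the steps as strictly linear, omits the alternative 3-O-methylation of 6-dTal assigned to MtfA or MtfB, and ignores sugar carriers and transport. The function name, product labels and strain gene sets are assumptions made for illustration.

```python
# Ordered steps of the proposed pathway. Each step lists requirement
# groups; at least one gene from every group must be present.
PATHWAY = [
    ("core with Rha on L-alaninol", [{"gtfB"}]),
    ("core with dimethylated Rha on L-alaninol", [{"mtfC"}, {"mtfD"}]),
    ("nsGPL", [{"gtfA"}]),
    ("serovar-1-specific GPL", [{"rtfA"}]),
    ("serovar-2-specific GPL",
     [{"mdhA"}, {"merA"}, {"mtfE"}, {"mtfF"}, {"gtfC", "gtfD"}]),
]


def furthest_product(genes):
    """Return the last product reachable before a required gene is missing."""
    product = "lipopeptide core"
    for name, requirements in PATHWAY:
        if not all(genes & alternatives for alternatives in requirements):
            break
        product = name
    return product


# Gene sets simplified from the text (illustrative, not complete genomes).
SER1 = {"gtfA", "gtfB", "rtfA", "mtfA", "mtfB", "mtfC", "mtfD"}
FBT = {"gtfC", "mdhA", "merA", "mtfF", "gtfD", "mtfE"}
strains = {
    "serovar 2 (2151, TMC 724)": SER1 | FBT,
    "serovar 1 (104)": SER1,
    "104Rg": SER1 - {"gtfA", "mtfA", "mtfB"},
}

for strain, genes in strains.items():
    print(f"{strain}: {furthest_product(genes)}")
```

Under these assumptions the model stops at the dimethylated Rha stage for 104Rg, in line with the truncated GPL reported for that strain, and reaches the serovar-1-specific and serovar-2-specific GPLs for the other two gene sets.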

The proposed pathway is similar to the biosynthetic cascades for other cell-wall surface markers. Stevenson et al. (1996) described a biosynthetic pathway for the extracellular colanic acid of E. coli K-12 in which the four sugars are synthesized de novo in the cytosol and then transferred to the lipid carrier in the inner membrane. Polymerization and acetylation are then performed in the outer membrane of the cell wall. Similar pathways have been proposed by Arakawa et al. (1995) for the biosynthesis of the serotype K2 capsular polysaccharide of Klebsiella pneumoniae, and by Hashimoto et al. (1993) for the synthesis of the Vi antigen in Salmonella typhi. The transport of the cell surface markers to the final destination is usually mediated by an ABC transporter system (Hashimoto et al., 1993; Stevenson et al., 1996). We could not identify the genes for such a transport system for the GPLs within the ser2 gene clusters of M. avium 2151 or TMC 724. However, a second transport mechanism was discussed by Stevenson et al. (1996) in which the lipid carrier involved in the synthesis of colanic acid acts as a transporter. They proposed that the lipid carrier is flipped from the cytoplasm into the periplasm. Further work is needed to determine whether this transport mechanism or the ABC transporter system is responsible for the transfer of the GPLs to the outer surface of the cell wall of M. avium. In this regard, it should be noted that Recht et al. (2000) have suggested that tmtpC may be involved in both the biosynthesis and transport of GPLs in M. smegmatis.

Downstream of the ser2 gene cluster of M. avium TMC 724 is an ORF (atfA) encoding a putative O-acetyltransferase. AtfA contains all three of the acetyltransferase motifs proposed by Slauch et al. (1996). The atfA gene is not found in the analogous regions of strains 2151 or 104, but it is present within the GS gene cluster of M. avium subsp. paratuberculosis and subsp. silvaticum. Although AtfA is not necessary for the biosynthesis of ssGPLs, it might be involved in GPL modification since ssGPLs can sometimes be acetylated, methylated or sulfated (Aspinall et al., 1995; Chatterjee & Khoo, 2001). Recht & Kolter (2001) recently showed that the atf1 gene of M. smegmatis encodes an acetyltransferase that acetylates one or two sites on the 6-dTal of nsGPL. The nucleotide sequences on either side of IS2534-atfA in strain 724 are identical to those surrounding the right copy of IS1601 (Fig. 1) in strain 2151. This suggests that atfA may have been inserted along with the IS element, although IS2534-atfA is not flanked by direct repeats.

The presence of repetitive elements in the ser regions raises interesting possibilities with regard to the evolution of the different serovars. The ser2 region is flanked by IS1601 in strain 2151, and the ser1 region is flanked by IS999 and RE (a repetitive element containing portions of IS999) in strain 104, suggesting that these gene clusters together with the flanking IS elements form ‘biosynthetic islands’. A similar gene organization was found for the cap gene cluster of H. influenzae (Kroll, 1992) and the K10 capsule gene cluster in E. coli (Clarke et al., 1999). The IS elements flanking the cap genes generated a duplication of the gene cluster resulting in enhanced production of the capsule polysaccharide (Kroll, 1992). Clarke et al. (1999) found that the K10 capsule genes are flanked by IS3 and prophage elements, leading them to speculate that these elements were involved in the acquisition of the capsule gene cluster at its present chromosomal location. Two observations from the present work suggest that the acquisition or loss of such biosynthetic islands has occurred in the MAC. First, the anomalous ser3' gene subcluster contains gdhA, a gene encoding a putative D-glucose dehydrogenase. This enzyme produces D-glucuronic acid, which is found only in the ssGPLs of M. avium serovars 3 or 9 (Aspinall et al., 1995) (Table 3). Thus, the ser2 gene cluster in M. avium TMC 724 and 2151 might represent an intermediate step in which a serovar 2 strain acquired new genetic information to produce serovars 3 and 9. The second observation is that the fbt gene cluster of M. avium 2151 and TMC 724 is identical to the GS element, an 8·9 kb genomic region reported to be present in M. avium subsp. paratuberculosis and M. avium subsp. silvaticum, but absent from some strains of M. avium subsp. avium (Tizard et al., 1998; Bull et al., 2000). Interestingly, the GS region from M. avium subsp. silvaticum contains exactly the same IS element in exactly the same site as in M. avium strain TMC 724 (IS1612 is identical to IS2534). This suggests that M. avium subsp. silvaticum might have been derived from a serovar 2 strain of M. avium such as strain TMC 724, or vice versa.

In summary, this work represents a comprehensive analysis of the genes responsible for the glycosylation of the GPLs of M. avium. The data presented show a strong conservation of gene sequences and organization, together with significant differences among strains which suggest that the acquisition or deletion of genes has resulted in different serovars with specific haptenic oligosaccharides as well as different subspecies. This work also provides a template for future research to confirm gene function in the pathway for GPL biosynthesis and the role of GPLs in the biology of M. avium. One obstacle to these types of studies is the poor transformability of M. avium subspecies avium. Thus far, we have not been able to transform strain 2151 and so the genetic analyses of GPL biosynthesis in M. avium have been restricted to naturally occurring deletion mutants. However, J. N. Maslow, S.-W. Lee, T. M. Eckstein, J. M. Inamine and J. T. Belisle (unpublished) have recently used a transformation frequency of 4 [4 c.f.u. (µg DNA)⁻¹] with strain 724 to successfully carry out allelic exchange mutagenesis, and so it should be possible to test individual genes in the proposed GPL biosynthesis pathway in future studies.


   ACKNOWLEDGEMENTS
 
This work was supported by grants AI-41925 and AI-51283 from the National Institute of Allergy and Infectious Diseases, National Institutes of Health.


   REFERENCES
 
Andrianopoulos, K., Wang, L. & Reeves, P. R. (1998). Identification of the fucose synthetase gene in the colanic acid gene cluster of Escherichia coli K-12. J Bacteriol 180, 998–1001.

Arakawa, Y., Wacharotayankun, R., Nagatsuka, T., Ito, H., Kato, N. & Ohta, M. (1995). Genomic organization of the Klebsiella pneumoniae cps region responsible for serotype K2 capsular polysaccharide synthesis in the virulent strain Chedid. J Bacteriol 177, 1788–1796.

Aspinall, G. O., Chatterjee, D. & Brennan, P. J. (1995). The variable surface glycolipids of mycobacteria: structures, synthesis of epitopes, and biological properties. Adv Carbohydr Chem Biochem 51, 169–242.

Belisle, J. T., Pascopella, L., Inamine, J. M., Brennan, P. J. & Jacobs, W. R., Jr (1991). Isolation and expression of a gene cluster responsible for biosynthesis of the glycopeptidolipid antigens of Mycobacterium avium. J Bacteriol 173, 6991–6997.

Belisle, J. T., McNeil, M. R., Chatterjee, D., Inamine, J. M. & Brennan, P. J. (1993a). Expression of the core lipopeptide of the glycopeptidolipid surface antigens in rough mutants of Mycobacterium avium. J Biol Chem 268, 10510–10516.

Belisle, J. T., Klaczkiewicz, K., Brennan, P. J., Jacobs, W. R., Jr & Inamine, J. M. (1993b). Rough morphological variants of Mycobacterium avium. Characterization of genomic deletions resulting in the loss of glycopeptidolipid expression. J Biol Chem 268, 10517–10523.

Brennan, P. J. (1988). Mycobacterium and other actinomycetes. In Microbial Lipids, vol. 1, pp. 203–298. Edited by C. Ratledge & S. G. Wilkinson. London: Academic Press.

Brennan, P. J. & Goren, M. B. (1979). Structural studies on the type-specific antigens and lipids of the Mycobacterium avium–Mycobacterium intracellulare–Mycobacterium scrofulaceum serocomplex. J Biol Chem 254, 4205–4211.

Bull, T. J., Sheridan, J. M., Martin, H., Sumar, N., Tizard, M. & Hermon-Taylor, J. (2000). Further studies on the GS element: a novel mycobacterial insertion sequence (IS1612), inserted into an acetylase gene (mpa) in Mycobacterium avium subsp. silvaticum but not in Mycobacterium avium subsp. paratuberculosis. Vet Microbiol 77, 453–463.

Chatterjee, D. & Khoo, K.-H. (2001). The surface glycopeptidolipids of mycobacteria: structures and biological properties. Cell Mol Life Sci 58, 2018–2042.

Clarke, B. R., Pearce, R. & Roberts, I. S. (1999). Genetic organization of the Escherichia coli K10 capsule gene cluster: identification and characterization of two conserved regions in Group III capsule gene clusters encoding polysaccharide transport functions. J Bacteriol 181, 2279–2285.

Cole, S. T., Brosch, R., Parkhill, J. & 39 other authors (1998). Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393, 537–544.

Corpet, F. (1988). Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res 16, 10881–10890.

Eckstein, T. M., Silbaq, F. S., Chatterjee, D., Kelly, N. J., Brennan, P. J. & Belisle, J. T. (1998). Identification and recombinant expression of a Mycobacterium avium rhamnosyltransferase gene (rtfA) involved in glycopeptidolipid biosynthesis. J Bacteriol 180, 5567–5573.

Eckstein, T. M., Inamine, J. M., Lambert, M. L. & Belisle, J. T. (2000). A genetic mechanism for the deletion of the ser2 gene cluster and formation of rough morphological variants of Mycobacterium avium. J Bacteriol 182, 6177–6182.

Eiglmeier, K., Honore, N., Woods, S. A., Caudron, B. & Cole, S. T. (1993). Use of an ordered cosmid library to deduce the genomic organization of Mycobacterium leprae. Mol Microbiol 7, 197–206.

Florido, M., Appelberg, R., Orme, I. M. & Cooper, A. M. (1997). Evidence for a reduced chemokine response in the lungs of beige mice infected with Mycobacterium avium. Immunology 90, 600–606.

Hashimoto, Y., Li, N., Yokoyama, H. & Ezaki, T. (1993). Complete nucleotide sequence and molecular characterization of ViaB region encoding Vi antigen in Salmonella typhi. J Bacteriol 175, 4456–4465.

Henderson, I. R. & Nataro, J. P. (1999). A conserved motif in the hexosyltransferases. Mol Microbiol 33, 222.

Hilt, W., Pfleiderer, G. & Fortnagel, P. (1991). Glucose dehydrogenase from Bacillus subtilis expressed in Escherichia coli. I: purification, characterization and comparison with glucose dehydrogenase from Bacillus megaterium. Biochim Biophys Acta 1076, 298–304.

Ibrahim, R. K., Bruneau, A. & Bantignies, B. (1998). Plant O-methyltransferases: molecular analysis, common signature and classification. Plant Mol Biol 36, 1–10.

Inouye, M., Suzuki, H., Takada, Y., Muto, N., Horinouchi, S. & Beppu, T. (1994). A gene encoding mycinamicin III O-methyltransferase from Micromonospora griseorubida. Gene 141, 121–124.

Ishikawa, J. & Hotta, K. (1999). FramePlot: a new implementation of the Frame analysis for predicting protein-coding regions in bacterial DNA with a high G+C content. FEMS Microbiol Lett 174, 251–253.

Jeevarajah, D., Patterson, J. H., McConville, M. J. & Billman-Jacobe, H. (2002). Modification of glycopeptidolipids by an O-methyltransferase of Mycobacterium smegmatis. Microbiology 148, 3079–3087.

Kroll, J. S. (1992). The genetics of encapsulation in Haemophilus influenzae. J Infect Dis 165 (Suppl. 1), S93–S96.

Krzywinska, E. & Schorey, J. S. (2003). Characterization of genetic differences between Mycobacterium avium subsp. avium strains of diverse virulence with a focus on the glycopeptidolipid biosynthesis cluster. Vet Microbiol 91, 249–264.

Mahairas, G. G., Sabo, P. J., Hickey, M. J., Singh, D. C. & Stover, C. K. (1996). Molecular analysis of genetic differences between Mycobacterium bovis BCG and virulent M. bovis. J Bacteriol 178, 1274–1282.

Maslow, J. N., Dawson, D., Carlin, E. A. & Holland, S. M. (1999). Hemolysin as a virulence factor for systemic infection with isolates of Mycobacterium avium complex. J Clin Microbiol 37, 445–446.

Mills, J. A., McNeil, M. R., Belisle, J. T., Jacobs, W. R., Jr & Brennan, P. J. (1994). Loci of Mycobacterium avium ser2 gene cluster and their functions. J Bacteriol 176, 4803–4808.

Patterson, J. H., McConville, M. J., Coppel, R. L. & Billman-Jacobe, H. (2000). Identification of a methyltransferase from Mycobacterium smegmatis involved in glycopeptidolipid synthesis. J Biol Chem 275, 24900–24906.

Pedrosa, J., Florido, M., Kunze, Z. M., Castro, A. G., Portaels, F., McFadden, J., Silva, M. T. & Appelberg, R. (1994). Characterization of the virulence of Mycobacterium avium complex (MAC) isolates in mice. Clin Exp Immunol 98, 210–216.

Rahn, A., Drummelsmith, J. & Whitfield, C. (1999). Conserved organization in the cps gene clusters for expression of Escherichia coli group 1 K antigens: relationship to the colanic acid biosynthesis locus and the cps genes from Klebsiella pneumoniae. J Bacteriol 181, 2307–2313.

Recht, J. & Kolter, R. (2001). Glycopeptidolipid acetylation affects sliding motility and biofilm formation in Mycobacterium smegmatis. J Bacteriol 183, 5718–5724.

Recht, J., Martínez, A., Torello, S. & Kolter, R. (2000). Genetic analysis of sliding motility in Mycobacterium smegmatis. J Bacteriol 182, 4348–4351.

Slauch, J. M., Lee, A. A., Mahan, M. J. & Mekalanos, J. J. (1996). Molecular characterization of the oafA locus responsible for acetylation of Salmonella typhimurium O-antigen: OafA is a member of a family of integral membrane trans-acylases. J Bacteriol 178, 5904–5909.

Stevenson, G., Andrianopoulos, K., Hobbs, M. & Reeves, P. R. (1996). Organization of the Escherichia coli K-12 gene cluster responsible for production of the extracellular polysaccharide colanic acid. J Bacteriol 178, 4885–4893.

Tizard, M., Bull, T., Millar, D., Doran, T., Martin, H., Sumar, N., Ford, J. & Hermon-Taylor, J. (1998). A low G+C content genetic island in Mycobacterium avium subsp. paratuberculosis and M. avium subsp. silvaticum with homologous genes in Mycobacterium tuberculosis. Microbiology 144, 3413–3423.

Torrelles, J. B., Ellis, D., Osborne, T., Hoefer, A., Orme, I. M., Chatterjee, D., Brennan, P. J. & Cooper, A. M. (2002). Characterization of virulence, colony morphotype and glycopeptidolipid of Mycobacterium avium strain 104. Tuberculosis 82, 293–300.

van Schie, B. J., Hellingwerf, K. J., van Dijken, J. P., Elferink, M. G., van Dijl, J. M., Kuenen, J. G. & Konings, W. N. (1985). Energy transduction by electron transfer via a pyrrolo-quinoline quinone-dependent glucose dehydrogenase in Escherichia coli, Pseudomonas aeruginosa, and Acinetobacter calcoaceticus (var. lwoffi). J Bacteriol 163, 493–499.

Received 29 May 2003; accepted 26 June 2003.