Phylogeny of Mycobacterium avium strains inferred from glycopeptidolipid biosynthesis pathway genes

Elzbieta Krzywinska†, Jaroslaw Krzywinski and Jeffrey S. Schorey

Department of Biological Sciences, Center for Tropical Disease Research and Training, University of Notre Dame, 130 Galvin Life Science Center, Notre Dame, IN 46556, USA

Correspondence
Jeffrey S. Schorey
schorey.1@nd.edu


   ABSTRACT
 
The Mycobacterium avium complex (MAC) encompasses two species, M. avium and Mycobacterium intracellulare, which are opportunistic pathogens of humans and animals. The standard method of MAC strain differentiation is serotyping based on variation in the antigenic glycopeptidolipid (GPL) composition. To elucidate the relationships among M. avium serotypes, a phylogenetic analysis of 13 reference and clinical M. avium strains from 8 serotypes was performed using as markers two genomic regions (890 bp of the gtfB gene and 2150 bp spanning the rtfA-mtfC genes) that are associated with the strains' serological properties. Strains belonging to three other known M. avium serotypes were not included in the phylogeny inference due to apparent lack of the marker sequences in their genomes, as revealed by PCR and Southern blot analysis. These studies suggest that serotypes prevalent in AIDS patients have multiple origins. In trees inferred from both markers, serotype 1 strains, known to have the simplest and shortest GPLs of all serotypes, were polyphyletic. Likewise, comparisons of the inferred phylogenies with the molecular typing results imply that the existing tools used in epidemiological studies may be poor estimators of M. avium strain relatedness. Additionally, trees inferred from each marker had significantly incongruent topologies due to a well-supported alternative placement of strain 2151, suggesting a complex evolutionary history of this genomic region.


Abbreviations: GPL, glycopeptidolipid; ITS, internal transcribed spacer; MAC, Mycobacterium avium complex

The GenBank accession numbers for the sequences reported in this paper are AY376356–AY376382.

†Present address: Department of Medical Microbiology, Medical University of Gdansk, Do Studzienki 38, 80-227 Gdansk, Poland.


   INTRODUCTION
 
The Mycobacterium avium complex (MAC) is recognized as an important group of bacteria causing disease in man and animals. MAC consists of two principal species, M. avium and Mycobacterium intracellulare, containing morphologically and biochemically indistinguishable organisms, and many strains not assigned to either species (Wayne & Sramek, 1992). Widespread reports on the isolation of MAC from patients with AIDS and from patients without predisposing conditions increase the need for a reliable strain identification system to obtain information on the epidemiology and virulence of MAC. The standard technique used for MAC strain differentiation has been serologic typing based on the sugar residue composition of surface glycopeptidolipids (GPLs). Of 28 described MAC serotypes, 11 have been assigned to M. avium (serotypes 1–6, 8–11 and 21) and 11 to M. intracellulare (serotypes 7, 12–20 and 25), based on DNA and antibody probes and HPLC analysis (Saito et al., 1990). Serotyping analysis showed that MAC infections in AIDS patients were almost exclusively due to M. avium belonging predominantly to serotypes 1, 4, 6 and 8; however, the prevalence of isolated serotypes varied depending on geographic location (Julander et al., 1996; Yakrus & Good, 1990). This raises the question of whether these serotypes form a related subset of strains.

Two forms of GPLs are present in MAC, the serotype-specific GPL (ssGPL) and the non-serotype-specific GPL (nsGPL) (for a review, see Chatterjee & Khoo, 2001). All GPLs have in common an N-acylated lipotetrapeptide core that bears a rhamnosylated alaninyl C terminus. In both GPL forms, a single 6-deoxytalose (6-dTal) unit is attached to the D-allo-Thr. In the case of the ssGPL, 6-dTal is further glycosylated to yield a haptenic oligosaccharide that varies in composition among different serotypes. The biosynthetic pathway leading to the formation of the GPLs is poorly understood. Recently, Eckstein et al. (2003) proposed a pathway leading to the formation of the serotype-specific GPL of serotype 2. However, the exact function of the genes potentially involved remains unknown, with one exception. It has been experimentally demonstrated that the rtfA gene encodes a rhamnosyltransferase that catalyses the addition of rhamnose to 6-dTal (Eckstein et al., 1998; Maslow et al., 2003). Glycosylation performed by this rhamnosyltransferase is assumed to be the first step in the conversion of the non-specific GPLs to serotype-specific GPLs.

Although seroagglutination has been the most frequently employed method for differentiating MAC strains, up to one-third of all isolates could not be serotyped due to autoagglutination or a failure to react with any serum (Tsang et al., 1992). This has promoted the development of DNA-based tools to differentiate MAC strains and to analyse strain relatedness. Currently available differentiation methods include DNA–rRNA hybridization with commercial probes (Accuprobe; GenProbe), nucleotide sequence analysis of the 16S–23S rDNA internal transcribed spacer (ITS) (Frothingham & Wilson, 1993), the 16S rRNA (Rogall et al., 1990) and hsp65 (Swanson et al., 1997) genes, as well as IS1245 RFLP typing (Guerrero et al., 1995; van Soolingen et al., 1998) and analysis of restriction polymorphism of a PCR-amplified fragment of the hsp65 gene (Telenti et al., 1993). However, different genotypic assays often yield inconsistent results and they do not correlate well with serotyping (Wasem et al., 1991), suggesting different evolutionary forces acting upon individual markers or differences in their evolutionary histories. Therefore, markers based directly on genes involved in serotype specificity would be more illuminating.

In addition, the evolutionary relationships within M. avium are very poorly understood. Phylogeny of MAC has been studied using 16S–23S rDNA ITS and hsp65 sequences as markers; however, because of extremely low sequence variation the relationships among M. avium strains were nearly completely unresolved (Frothingham & Wilson, 1993; Swanson et al., 1997). Until now, only three sequence variants differing at two alignment positions have been identified in the hsp65 gene (Leao et al., 1999; Swanson et al., 1997). The ITS locus was slightly more variable because it allowed for differentiation of seven distinct M. avium genotypes referred to as sequevars (Frothingham & Wilson, 1993; Mijs et al., 2002). Further phylogenetic studies implementing alternative markers are necessary to help answer fundamental questions related to M. avium epidemiology and pathogenicity, such as why certain serotypes are more prevalent in AIDS patients.

Previously we found relatively high sequence variation within the 5' region of the GPL biosynthesis cluster among four M. avium strains belonging to three serotypes (Krzywinska & Schorey, 2003). In addition, gene organization within this region was highly conserved among all four strains and possibly conserved among other M. avium strains, suggesting its potential value for phylogeny reconstruction. Here we employ selected GPL genes as markers to determine phylogenetic relationships among M. avium serotypes, and compare the inferred trees with serotype designations and information obtained from markers used in earlier studies. Our analysis allowed us to gain new perspective on the utility of the existing M. avium typing tools and on the mechanisms of M. avium evolution, involving likely secondary gain of serotypic characteristics and horizontal gene transfer.


   METHODS
 
Mycobacterial strains.
MAC strains used in this study, together with their serotypes, IS1245 RFLP patterns and sequevars are listed in Table 1. MAC reference strains were obtained as frozen broth cultures. Other mycobacterial strains have been maintained in our laboratory as described previously (Bohlson et al., 2001). M. intracellulare was included in the study as an outgroup in the phylogenetic analysis.


 
Table 1. MAC strains

 
PCR amplification and DNA sequencing of GPL cluster genes.
Three primer pairs were designed based on the nucleotide sequence alignment of the GPL biosynthesis cluster from M. avium strains A5 (GenBank accession no. AY130970), 724 (AF125999), 2151 (AF143772) and 104 [The Institute for Genomic Research (TIGR), http://www.tigr.org]. Primers rtfAF (5'-GCTGTGGCAAGTTATGGG-3') and rtfAR (5'-CGGGTGAGGTATTCGGG-3') were used to amplify a fragment of the rhamnosyltransferase A gene corresponding to nucleotide positions 2310–3482 in the A5 strain GPL cluster sequence; primers gtfBF (5'-CGGTGGTCTCGATGCTTT-3') and gtfBR (5'-TGCCACGCTCAAATCGGT-3') were used to amplify a fragment of the glycosyltransferase B gene spanning positions 6332–7557; and primers mtfCF (5'-CTGCGGAAGATCCTGGC-3') and mtfCR (5'-TCGATCTCTACGATCTCC-3') were used to amplify a fragment spanning positions 3446–4479, encompassing the 3' end of the rhamnosyltransferase A gene, an intergenic region and most of the methyltransferase C gene. Selection of these genomic regions as markers was dictated by their higher sequence variability relative to other aligned portions of the cluster. The PCR template was prepared by centrifugation of a 200 µl frozen broth culture or glycerol stock of mycobacteria, resuspension in 50 µl sterile TE and boiling for 10 min. The lysates were divided into aliquots and stored at –20 °C. PCR reactions were performed using the Advantage-GC cDNA Polymerase Mix (Clontech), in a final volume of 50 µl containing 1 µl crude lysate, 10 mM dNTP mix and 50 pmol each primer. PCR thermal conditions consisted of an initial denaturation step of 2 min at 94 °C, followed by 30 cycles at 94 °C for 30 s, 58–59 °C for 40 s, 72 °C for 1 min, and a final extension at 72 °C for 10 min. PCR products were excised from a gel, purified with QIAEX II Gel Extraction Kit (Qiagen) and, with one exception (see Results and Discussion), sequenced directly from both strands using forward, reverse and internal primers. 
Sequencing was carried out on an ABI 3700 sequencer using ABI BigDye terminator chemistry and the sequences were edited using Sequence Navigator (Perkin-Elmer). To confirm the identity of the sequences, amplification and sequencing were performed twice on eight strains selected at random.

Southern blot hybridization.
Mycobacterial genomic DNA was isolated (Hunter et al., 1982) and 5 µg was digested with BamHI, separated by electrophoresis on a 0·6 % agarose gel and capillary transferred to positively charged nylon membranes using a standard protocol (Sambrook et al., 1989). Purified PCR products from M. avium 101, labelled with digoxigenin using the DIG High Prime DNA Labelling Kit (Boehringer Mannheim), were used as probes. Membranes were hybridized overnight at 68 °C with the probes and signal detection was performed according to the manufacturer's protocol. Post-hybridization washes were performed under high stringency (2x SSC, 0·1 % SDS at room temperature; 0·1x SSC, 0·1 % SDS at 68 °C) or low stringency (2x SSC, 0·1 % SDS at room temperature; 1x SSC, 0·1 % SDS at 55 °C) conditions. Before reprobing, membranes were stripped according to the manufacturer's protocol.

DNA fingerprinting.
DNA fingerprinting was performed by restriction fragment length polymorphism (RFLP) typing using the insertion sequence IS1245 as a probe, according to van Soolingen et al. (1998).

ITS sequencing and sequevar determination.
The ITS region was amplified as described by De Smet et al. (1995) and sequenced. To determine the sequevar of each strain, the ITS sequences were compared with the published sequevar sequences (Frothingham & Wilson, 1993; Mijs et al., 2002) following alignment using CLUSTALX (Thompson et al., 1997).

Sequence analysis.
Maximum-parsimony (MP) and maximum-likelihood (ML) phylogenetic analyses of CLUSTALX-aligned sequences were carried out with PAUP 4.0b10 (Swofford, 2001), using heuristic searches and TBR branch-swapping. The analyses were done by stepwise random addition of sequences with 1,000 (MP) or 100 (ML) replications; confidence in the inferred topologies was estimated by bootstrapping, with 500 (MP) or 100 (ML) pseudoreplicates, each with 10 random additions of sequences. The F81+I substitution model used in likelihood analyses was selected for both datasets using MODELTEST 3.06 (Posada & Crandall, 1998). The presence of conflict between the datasets was evaluated by using the incongruence length difference (ILD) test (Farris et al., 1995), and the significance of differences between the most likely tree and the tree with the alternative topology was evaluated in the Shimodaira & Hasegawa (1999) test, both implemented in PAUP.
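The bootstrap step described above rests on resampling alignment columns with replacement to build each pseudoreplicate dataset. The following is a minimal sketch of that resampling idea only; it is not PAUP's implementation, and the function name `bootstrap_columns`, the toy sequences and the taxon labels are all hypothetical.

```python
import random


def bootstrap_columns(alignment, rng):
    """Build one bootstrap pseudoreplicate by sampling alignment
    columns with replacement (same number of columns as the input)."""
    n_sites = len(next(iter(alignment.values())))
    if any(len(seq) != n_sites for seq in alignment.values()):
        raise ValueError("all sequences must have the same aligned length")
    # The same column indices are applied to every sequence, so each
    # resampled column is an intact column of the original alignment.
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks)
            for name, seq in alignment.items()}


# Toy alignment (hypothetical data, not sequences from this study).
toy_alignment = {
    "strain_1": "ACGTACGTAC",
    "strain_2": "ACGTACGAAC",
    "strain_3": "ACTTACGAAC",
    "outgroup": "GCTTACGTTC",
}

rng = random.Random(42)  # fixed seed so the sketch is reproducible
pseudoreplicates = [bootstrap_columns(toy_alignment, rng) for _ in range(100)]
```

In a full analysis, a tree would be inferred from every pseudoreplicate, and the support for a clade would be the percentage of those trees in which the clade appears.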


   RESULTS AND DISCUSSION
 
The strains in the MAC have gained importance because of the high prevalence of disseminated MAC infection in patients with AIDS and the increasing number of infections in non-AIDS patients. A significant number of studies have been initiated to understand the epidemiology of these mycobacteria; nevertheless, the origin of MAC infections in humans remains a matter of speculation. It has been suggested that MAC infections in AIDS patients may be caused by a limited number of related M. avium strains (Ritacco et al., 1998; Yakrus & Good, 1990), almost invariably belonging to serotypes 1, 4, 6 and 8 (Julander et al., 1996; Yakrus & Good, 1990), but other evidence indicates that they do not represent a single clonal lineage (Feizabadi et al., 1996). Here we address the issue of phylogenetic relationships among M. avium serotypes by using, as markers, genes associated with the strains' serological properties.

DNA amplification, gene sequences and alignment
A total of 19 MAC strains belonging to 13 serotypes were included in this study (Table 1). Data from four strains were derived from GenBank (M. avium 724, accession no. AF125999; M. avium 2151, AF143772; M. avium A5, AY130970) and TIGR (M. avium 104) databases. Sequences from other strains, except the five PCR-negative strains, were obtained in this study (see below). The final dataset used in phylogenetic analysis consisted of 13 strains of M. avium and one strain of M. intracellulare (Table 1).

In all M. avium PCR-positive strains amplification with gtfB primers yielded a 926 bp fragment of the glycosyltransferase B gene, with rtfA primers yielding a 1151 bp fragment of the rhamnosyltransferase A gene and with mtfC primers yielding a 1032–1034 bp fragment, consisting of the 3' end (161 bp) of the rhamnosyltransferase A gene, an intergenic spacer (101–103 bp) and the 5' fragment (770 bp) of the methyltransferase C gene. Use of the rtfA and mtfC primer pairs, amplifying slightly overlapping genomic regions, allowed us to obtain a 2150 bp fragment of contiguous genomic sequence, referred to as rtfA-mtfC. In M. intracellulare serotype 13 (strain 5509-Borstel) the only specific product was obtained with the rtfA primers after lowering the annealing temperature and cloning. Interestingly, the positive clones contained, apart from a corresponding sequence in M. avium, an entire 3' end of the rtfA gene, an intergenic region and a 139 bp fragment of the mtfC gene. Inspection of the obtained sequence revealed that the annealing of the rtfAR primer was prevented by four mismatches with its target site. Instead, the amplification was effected solely by the rtfAF primer, which, not counting the target site within the rtfA gene, annealed in a reverse orientation within the 5' end of the mtfC gene.

Despite efforts involving changes in PCR conditions and using isolated genomic DNA instead of crude bacterial lysates as templates, amplification of markers from M. avium serotypes 5 (strain 25546-759), 10 (strains 1602-1965 and Borne) and 11 (strain 14186-1424), and from M. intracellulare strain 157 Manten was unsuccessful. With the exception of a product from M. intracellulare strain 5509-Borstel amplified using rtfA primers, lowering annealing temperature led to the amplification of only non-specific products. PCR failure in such cases can be attributed to mismatches between primers and their target sites or lack of the target sequences in the genomes. Southern blot analysis of the genomic DNA from PCR-negative strains was performed to determine the cause of the amplification failure (data not shown). Under high stringency conditions, PCR-amplified fragments of rtfA, mtfC and gtfB genes used as probes yielded a signal only for M. intracellulare serotypes 7 (strain 157 Manten) and 13 (strain 5509-Borstel), although in both cases the probes hybridized to different molecular mass fragments compared to M. avium serotype 21 DNA used as a positive control. This indicated that the lack of PCR products resulted from modified primer target sites in 157 Manten and 5509-Borstel strains. No hybridization signal in the PCR-negative M. avium strains suggested that either their genomes lack the DNA fragments homologous to the probes or low similarity to the probes prevented hybridization. To allow detection of more divergent sequences, the same blots were reprobed at low stringency. This approach appeared uninformative, because in each strain, including the control, several DNA fragments showed cross-reactivity with the probes, consistent with the presence of genes similar to the marker sequences within the M. avium genome. Because the rtfA locus in M. intracellulare strain 5509-Borstel was detected at high stringency, despite substantial divergence from the rtfA probe (see below), we conclude that the marker sequences are absent from the PCR-negative M. avium strains 25546-759 (serotype 5), 1602-1965 (serotype 10), Borne (serotype 10) and 14186-1424 (serotype 11). Evidence suggests that these strains may be closely related. IS1245-based RFLP typing clustered serotype 5 and 10 strains into a fairly homogeneous group, which differed from serotype 11 strains in their banding pattern (Ritacco et al., 1998). However, GPLs of serotype 10 and 11 have been shown to be identical by GC-MS and ELISA (Denner et al., 1992). Another distinct characteristic of the ssGPLs that is shared by all three PCR-negative M. avium serotypes is a unique component that allows attachment of haptenic oligosaccharide to the N-acylpeptide core (Chatterjee & Khoo, 2001). We suggest that the lack of genes used as markers in the present study reflects an altered genetic basis for the ssGPL biosynthesis in these strains.

The concatenated nucleotide sequences of both markers with primer sequences removed produced an unambiguous 3007 character-long alignment encompassing 309 variable and 82 parsimony informative positions. There were four indels, 1–3 bp long, present in the intergenic spacer region. The alignments of both markers are available from the NCBI website via Popset. Across all studied strains the numbers of variable and informative sites were, respectively, 218 and 44 for rtfA, 17 (plus 7 indel sites) and 5 (plus 3 indel sites) for the intergenic spacer, 55 and 19 for mtfC, and 19 and 14 for gtfB. There were 50 (11 in M. avium) non-synonymous (causing amino acid change) nucleotide substitutions in rtfA, 2 (both in M. avium) in mtfC and 6 in gtfB. Proportions of polymorphic sites varied between the studied genes. Maximum pairwise sequence divergence within the ingroup reached 1·4 % for the gtfB gene, 3·6 % for the rtfA gene and 4·0 % for the mtfC gene. The divergence of M. avium strains from M. intracellulare within the available region was much greater, reaching ~15 % for each pairwise comparison.
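The site counts and pairwise divergences reported above can be illustrated with a short sketch. It assumes the usual definitions: a variable site has at least two different nucleotides, a parsimony-informative site has at least two nucleotides that each occur in at least two sequences, and divergence is the uncorrected proportion of differing sites (p-distance). The function names, the gap handling and the toy sequences are hypothetical and are not the procedure or data used in this study.

```python
from collections import Counter

# Characters treated as missing data in this sketch (an assumption).
MISSING = {"-", "?", "N"}


def site_classes(seqs):
    """Return (variable, parsimony_informative) site counts."""
    variable = informative = 0
    for column in zip(*seqs):
        counts = Counter(base for base in column if base not in MISSING)
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two states, each in >= 2 sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


def p_distance(a, b):
    """Uncorrected proportion of differing sites between two sequences,
    skipping positions where either sequence has missing data."""
    pairs = [(x, y) for x, y in zip(a, b)
             if x not in MISSING and y not in MISSING]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy alignment (hypothetical data, not sequences from this study).
toy = ["ACGTACGT",
       "ACGTACGA",
       "ACTTACGA",
       "ACTTGCGT"]

n_variable, n_informative = site_classes(toy)   # (3, 2)
divergence = p_distance(toy[0], toy[1])         # 0.125, i.e. 12.5 %
```

As a consistency check on the figures given above, the per-region counts add up to the totals for the concatenated alignment: 218 + 17 + 55 + 19 = 309 variable sites and 44 + 5 + 19 + 14 = 82 parsimony-informative sites.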

The 13 M. avium strains, for which sequences were obtained, were represented by 9 different genotypes, where a unique combination of two markers defined a particular genotype. Strains 101, 2993, 104, 13528-1079 and SJB#2 shared the same genotype.

Phylogeny
The analysis of the rtfA-mtfC region resulted in trees with basically identical topology regardless of the inference method applied (Fig. 1). Three clades (named here A, B and C) could be distinguished within M. avium, with strains 724, 2151, 6195, B-92 and 17584-286 (clade C) forming a sister group to the remaining strains, among which Sparrow 185 (clade B) emerged as a distinct basal branch. This topology was well supported by bootstrap under MP, but support under ML was lower, especially for the clustering of clades A and B.



 
Fig. 1. Phylogenetic relationships among M. avium strains inferred under ML criteria using rtfA-mtfC (left) and gtfB (right) gene sequences as markers. Sequevar (ITS genotype), IS1245 RFLP pattern (RFLP) and serotype (ser) designations of each strain are mapped on the rtfA-mtfC tree; three distinct clades are denoted as A, B and C. The rtfA-mtfC tree is rooted with M. intracellulare 5509-Borstel; the gtfB tree is unrooted. Because of substantial differences in sequence variation between markers (small number of variable sites in gtfB) branch lengths in both trees are not drawn to the same scale. MP and ML bootstrap values, respectively, are given for clades with support higher than 50 %. Strain 2151, found in discordant positions in trees, is boxed. RFLP patterns: mb, multibanded; bt, bird-type; 1b, one band.

 
Apart from being less resolved, trees based on the gtfB gene fragment differed from the rtfA-mtfC trees in placing strain 2151 within clade A rather than clade C. This discrepancy was well supported by bootstrap. The other difference concerned strain A5, which, instead of being placed within clade A, was inferred as a sister to (Sparrow 185+remaining strains from clade A). The position of strain A5 in the trees, however, was based on 4 out of 14 informative positions present in the gtfB alignment, and the signal from the remaining 10 informative positions was not in conflict with the topology inferred from the rtfA-mtfC marker. It is likely that the characters supporting the A5 location in the gtfB tree in those four positions do not have a single origin, but have arisen by independent substitutions or other mechanisms not involving vertical inheritance.

A phylogenetic analysis of both markers combined was not performed because the two datasets were significantly incongruent according to the ILD test (P<0·001).

In the present dataset, three serotypes were represented by more than one strain. Although limited, this sample allowed us to address the question of whether strains with a given serotype have a clonal origin. Our results suggest that this is likely not to be the case, at least for the serotype 1 strains, here inferred to be polyphyletic, i.e. not sharing the most recent common ancestor. As mentioned earlier, serotypic differences in M. avium are caused by variation in the ssGPL composition, which is controlled by a complex glycosylation pathway. Most of the molecular mechanisms and involvement of individual gene products in this pathway have yet to be elucidated. However, it has been suggested that during ssGPL biosynthesis the sugars are added singly and sequentially to the lipopeptide core (Eckstein et al., 1998). This hypothesis may be fundamental for the understanding of the polyphyly of serotype 1 strains observed in this study. The oligosaccharide chain in the ssGPL of serotype 1, consisting of a disaccharide unit [α-L-Rhap-(1→2)-L-dTal], is the simplest and shortest of all serotypes (Table 2). With the exception of serotypes 5, 10 and 11, which are distinguished by the 3-O-Me-α-L-Rhap-(1→2)-dTal component, this unit is present in all serotypes, serving as a core to which the next sugar is added during oligosaccharide chain elongation (Chatterjee & Khoo, 2001). Insertional inactivation by a mobile element or deletion of the gene responsible for the addition of that sugar would lead to the formation of the serotype 1 GPL, regardless of the presence of genes encoding enzymes necessary for further modification of the GPL. This would result in a phylogenetic pattern of serotype 1 having multiple origins. Feasibility of such a scenario is supported by an observation of a spontaneous deletion of the GPL genomic region mediated by IS1601 that caused complete loss of GPL in strain 2151 (Eckstein et al., 2000). It is also supported by a proposed GPL biosynthesis pathway model in which a serotype 1-specific GPL is regarded as an intermediate for the synthesis of the serotype 2 GPL (Eckstein et al., 2003). Moreover, this scenario is consistent with genetic heterogeneity of serotype 1 strains revealed by IS1245-based RFLP typing (this study; Ritacco et al., 1998). Clustering of strains belonging to serotypes 2, 3 and 9 in clade C correlates with the very similar structure of their oligosaccharides (Table 2), strongly suggesting a common origin of these strains. From this perspective, strain B-92, belonging to serotype 1, seems not to fit its nested position within clade C. It is worth noting that strains belonging to serotypes 8 and 21, having very similar GPL structure, were clustered in clade A.
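The notion of polyphyly used above can be made concrete with a small sketch that asks whether a set of strains forms a clade in a rooted tree. It assumes a tree written as nested tuples and treats a group as monophyletic only when it matches exactly the set of leaves under some internal node. The toy tree and its labels are hypothetical and do not reproduce the topology in Fig. 1.

```python
def leaves(node):
    """Return the set of leaf labels under a node (leaves are strings)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Yield the leaf set of every subtree, including single leaves."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if `group` is exactly the leaf set of some subtree."""
    return frozenset(group) in set(clades(tree))


# Toy rooted tree (hypothetical labels: serotype, then an isolate letter).
toy_tree = (
    "outgroup",
    (
        ("ser1_a", "ser2_a"),
        (("ser1_b", "ser3_a"), "ser4_a"),
    ),
)

serotype_1 = {"ser1_a", "ser1_b"}
print(is_monophyletic(toy_tree, serotype_1))            # False
print(is_monophyletic(toy_tree, {"ser1_a", "ser2_a"}))  # True
```

In the toy tree the two serotype 1 isolates sit in different clades, so the smallest clade containing both also contains other serotypes. That is the pattern described above for the serotype 1 strains.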


 
Table 2. Comparison of the structure of known serotype-specific GPLs from M. avium serotypes used in present study

 
Is there a correlation between our results and the sequence data used in earlier phylogenetic studies? For the comparisons we used the ITS locus, which was shown to have higher discriminatory power among M. avium strains compared to the hsp65 gene (Frothingham & Wilson, 1993; Mijs et al., 2002). The correlation of the ITS sequence variants with our marker phylogenies was evaluated by mapping each strain's sequevar designation on trees inferred in this study (Fig. 1). Sequevar data were obtained either by sequencing (strains 101, 104, 2151 and A5) or from the literature (Frothingham & Wilson, 1993). With the exception of strain 17584-286, corresponding to sequevar Mav-D, all of the analysed strains belonged to sequevars Mav-A or Mav-B. There was only rough correspondence between sequevars and inferred clades; sequevars were not monophyletic in either tree. This finding, however, may have no biological significance considering the fact that Mav-A is differentiated from Mav-B by only one mutational change, which could have occurred independently in different strains belonging to a given sequevar. The notion of independent acquisition of the same sequevar is supported by discrepancies between Mav-A sequevar designations and IS1245 RFLP patterns (Fig. 1).

IS1245 RFLP typing has become a frequently used molecular tool for distinguishing isolates within the MAC (Guerrero et al., 1995; van Soolingen et al., 1998). Studies have shown that human M. avium isolates almost invariably displayed polymorphic multibanded IS1245 RFLP patterns (Guerrero et al., 1995; Ritacco et al., 1998), similar to isolates from pigs (Komijn et al., 1999). In contrast, bird isolates were found to possess identical three-banded patterns (Ritacco et al., 1998). Mijs et al. (2002) proposed the designation of these bird-type M. avium isolates [additionally characterized by (1) being unable to grow at 24 and 45 °C, (2) belonging to Mav-A ITS sequevar and (3) being associated with serotypes 1, 2 and 3] as a separate, evolutionarily conserved taxon. The limited number of M. avium serotypes that cause infections in humans and similarity in their genotypes manifested by multibanded IS1245 RFLP patterns suggest that strains pathogenic to humans may also be closely related. However, our study suggests that they do not have a common origin, because neither the multibanded strains nor the typically human serotypes were inferred as monophyletic, although they clustered predominantly in clade A (Fig. 1). Similarly, three-banded bird-type isolates were not monophyletic in the rtfA-mtfC tree, but formed a monophyletic clade in the gtfB tree.

Alternative placement of strain 2151 in the trees inferred with two analysed markers begs an explanation. Significant discrepancy between trees (Shimodaira-Hasegawa test, P<0·001) hints at horizontal gene transfer as a likely cause (Philippe & Douady, 2003). Further evidence for the involvement of gene transfer comes from our analysis of a larger genomic region encompassing the GPL gene cluster (Krzywinska et al., 2004). There we show that a short chromosome region encompassing a gtfB gene fragment used for the phylogenetic analysis has been transferred between M. avium strains. Because the rtfA-mtfC marker was outside of the transferred region, we hypothesize that rtfA-mtfC provides a better estimation of M. avium phylogeny than gtfB. However, further studies of MAC phylogeny should utilize other new markers, possibly outside the GPL cluster. Despite a considerable amount of phylogenetic information, a prerequisite of a good phylogenetic marker, rtfA-mtfC cannot be used for tracing the history of the entire group because it was not present in the genomes of all strains studied. It is conceivable that the same will apply to other genes from the cluster, when a range of strains, representing all serotypes, are considered. Moreover, this study clearly shows that phylogenetic inference, to be meaningful, should be based on more than one marker.

Our analysis suggests that serotypes prevalent in AIDS patients have multiple origins. Further study involving a larger number of serotypes and different genomic regions would show whether our hypotheses hold. Here we have proposed a simple scenario for a secondary gain of the serotype 1 GPL that would result in a lack of concordance between serotype delineation and the distribution of particular genotypic and phenotypic characters in strains belonging to serotype 1. It cannot be excluded that a similar lack of concordance concerning serotypes 2 and 3 (Komijn et al., 1999; Ritacco et al., 1998) is a consequence of horizontal transfer. However, resolving this issue would require further genomic studies involving a larger number of representative strains.


   ACKNOWLEDGEMENTS
 
We thank Andrea Cooper for providing M. avium 724 and 2151, Kathleen Eisenach for providing M. avium A5, and Eric Brown for providing M. avium 104 and 101. We acknowledge Colorado State University and NIH for providing MAC reference strains (NIAID contract AI-75320 entitled ‘TB Research Materials and Vaccine Testing Program’) and the Institute for Genomic Research for providing sequences of M. avium strain 104. We thank Nora J. Besansky for access to computing and Frank Collins for use of the sequencing facility. We also thank Hope Hollocher for her careful reading of the manuscript and helpful suggestions. This work was supported by a grant from the United States Department of Agriculture, CSREES.


   REFERENCES
 
Aspinall, G. O., Khare, N. K., Sood, R. K., Chatterjee, D., Rivoire, B. & Brennan, P. J. (1991). Structure of the glycopeptidolipid antigen of serovar 20 of the Mycobacterium avium serocomplex, synthesis of allyl glycosides of the outer di- and tri-saccharide units of the antigens of serovars 14 and 20, and serology of the derived neoglycoproteins. Carbohydr Res 216, 357–373.[Medline]

Bohlson, S. S., Strasser, J. A., Bower, J. J. & Schorey, J. S. (2001). Role of complement in Mycobacterium avium pathogenesis: in vivo and in vitro analyses of the host response to infection in the absence of complement component C3. Infect Immun 69, 7729–7735.[Abstract/Free Full Text]

Chatterjee, D. & Khoo, K. H. (2001). The surface glycopeptidolipids of mycobacteria: structures and biological properties. Cell Mol Life Sci 58, 2018–2042.

Denner, J. C., Tsang, A. Y., Chatterjee, D. & Brennan, P. J. (1992). Comprehensive approach to identification of serovars of Mycobacterium avium complex. J Clin Microbiol 30, 473–478.

De Smet, K. A., Brown, I. N., Yates, M. & Ivanyi, J. (1995). Ribosomal internal transcribed spacer sequences are identical among Mycobacterium avium-intracellulare complex isolates from AIDS patients, but vary among isolates from elderly pulmonary disease patients. Microbiology 141, 2739–2747.

Eckstein, T. M., Silbaq, F. S., Chatterjee, D., Kelly, N. J., Brennan, P. J. & Belisle, J. T. (1998). Identification and recombinant expression of a Mycobacterium avium rhamnosyltransferase gene (rtfA) involved in glycopeptidolipid biosynthesis. J Bacteriol 180, 5567–5573.

Eckstein, T. M., Inamine, J. M., Lambert, M. L. & Belisle, J. T. (2000). A genetic mechanism for deletion of the ser2 gene cluster and formation of rough morphological variants of Mycobacterium avium. J Bacteriol 182, 6177–6182.

Eckstein, T. M., Belisle, J. T. & Inamine, J. M. (2003). Proposed pathway for the biosynthesis of serovar-specific glycopeptidolipids in Mycobacterium avium serovar 2. Microbiology 149, 2797–2807.

Farris, J. S., Kallersjo, M., Kluge, A. G. & Bult, C. (1995). Testing significance of incongruence. Cladistics 10, 315–319.

Feizabadi, M. M., Robertson, I. D., Cousins, D. V., Dawson, D., Chew, W., Gilbert, G. L. & Hampson, D. J. (1996). Genetic characterization of Mycobacterium avium isolates recovered from humans and animals in Australia. Epidemiol Infect 116, 41–49.

Frothingham, R. & Wilson, K. H. (1993). Sequence-based differentiation of strains in the Mycobacterium avium complex. J Bacteriol 175, 2818–2825.

Guerrero, C., Bernasconi, C., Burki, D., Bodmer, T. & Telenti, A. (1995). A novel insertion element from Mycobacterium avium, IS1245, is a specific target for analysis of strain relatedness. J Clin Microbiol 33, 304–307.

Hough, L. & Theobald, R. S. (1963). Dealkylation. In Methods in Carbohydrate Chemistry, pp. 203–206. Edited by W. L. Whistler & R. S. Wolfrom. New York: Academic Press.

Hunter, S. W., Fujiwara, T. & Brennan, P. J. (1982). Structure and antigenicity of the major specific glycolipid antigen of Mycobacterium leprae. J Biol Chem 257, 15072–15078.

Julander, I., Hoffner, S., Petrini, B. & Ostlund, L. (1996). Multiple serovars of Mycobacterium avium complex in patients with AIDS. APMIS 104, 318–320.

Khoo, K. H., Jarboe, E., Barker, A., Torrelles, J., Kuo, C. W. & Chatterjee, D. (1999). Altered expression profile of the surface glycopeptidolipids in drug-resistant clinical isolates of Mycobacterium avium complex. J Biol Chem 274, 9778–9785.

Komijn, R. E., de Haas, P. E., Schneider, M. M., Eger, T., Nieuwenhuijs, J. H., van den Hoek, R. J., Bakker, D., van Zijd Erveld, F. G. & van Soolingen, D. (1999). Prevalence of Mycobacterium avium in slaughter pigs in The Netherlands and comparison of IS1245 restriction fragment length polymorphism patterns of porcine and human isolates. J Clin Microbiol 37, 1254–1259.

Krzywinska, E. & Schorey, J. S. (2003). Characterization of genetic differences between Mycobacterium avium subsp. avium strains of diverse virulence with a focus on the glycopeptidolipid biosynthesis cluster. Vet Microbiol 91, 249–264.

Krzywinska, E., Krzywinski, J. & Schorey, J. S. (2004). Naturally occurring horizontal gene transfer and homologous recombination in Mycobacterium. Microbiology 150, 1707–1712.

Leao, S. C., Briones, M. R., Sircili, M. P., Balian, S. C., Mores, N. & Ferreira-Neto, J. S. (1999). Identification of two novel Mycobacterium avium allelic variants in pig and human isolates from Brazil by PCR-restriction enzyme analysis. J Clin Microbiol 37, 2592–2597.

Maslow, J. N., Irani, V. R., Lee, S. H., Eckstein, T. M., Inamine, J. M. & Belisle, J. T. (2003). Biosynthetic specificity of the rhamnosyltransferase gene of Mycobacterium avium serovar 2 as determined by allelic exchange mutagenesis. Microbiology 149, 3193–3202.

McNeil, M., Chatterjee, D., Hunter, S. W. & Brennan, P. J. (1989). Mycobacterial glycolipids: isolation, structures, antigenicity, and synthesis of neoantigens. Methods Enzymol 179, 215–242.

Mijs, W., de Haas, P., Rossau, R., Van der Laan, T., Rigouts, L., Portaels, F. & van Soolingen, D. (2002). Molecular evidence to support a proposal to reserve the designation Mycobacterium avium subsp. avium for bird-type isolates and ‘M. avium subsp. hominissuis’ for the human/porcine type of M. avium. Int J Syst Evol Microbiol 52, 1505–1518.

Philippe, H. & Douady, C. J. (2003). Horizontal gene transfer and phylogenetics. Curr Opin Microbiol 6, 498–505.

Posada, D. & Crandall, K. A. (1998). MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818.

Ritacco, V., Kremer, K., van der Laan, T., Pijnenburg, J. E., de Haas, P. E. & van Soolingen, D. (1998). Use of IS901 and IS1245 in RFLP typing of Mycobacterium avium complex: relatedness among serovar reference strains, human and animal isolates. Int J Tuberc Lung Dis 2, 242–251.

Riviere, M. & Puzo, G. (1992). Use of 1H NMR ROESY for structural determination of O-glycosylated amino acids from a serine-containing glycopeptidolipid antigen. Biochemistry 31, 3575–3580.

Rogall, T., Wolters, J., Flohr, T. & Bottger, E. C. (1990). Towards a phylogeny and definition of species at the molecular level within the genus Mycobacterium. Int J Syst Bacteriol 40, 323–330.

Saito, H., Tomioka, H., Sato, K., Tasaka, H. & Dawson, D. J. (1990). Identification of various serovar strains of Mycobacterium avium complex by using DNA probes specific for Mycobacterium avium and Mycobacterium intracellulare. J Clin Microbiol 28, 1694–1697.

Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989). Molecular Cloning: a Laboratory Manual, 2nd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.

Shimodaira, H. & Hasegawa, M. (1999). Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol 16, 1114–1116.

Swanson, D. S., Kapur, V., Stockbauer, K., Pan, X., Frothingham, R. & Musser, J. M. (1997). Subspecific differentiation of Mycobacterium avium complex strains by automated sequencing of a region of the gene (hsp65) encoding a 65-kilodalton heat shock protein. Int J Syst Bacteriol 47, 414–419.

Swofford, D. L. (2001). PAUP*: Phylogenetic Analysis Using Parsimony (*and other methods), version 4.0beta. Sunderland, MA: Sinauer.

Telenti, A., Marchesi, F., Balz, M., Bally, F., Bottger, E. C. & Bodmer, T. (1993). Rapid identification of mycobacteria to the species level by polymerase chain reaction and restriction enzyme analysis. J Clin Microbiol 31, 175–178.

Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. & Higgins, D. G. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25, 4876–4882.

Tsang, A. Y., Denner, J. C., Brennan, P. J. & McClatchy, J. K. (1992). Clinical and epidemiological importance of typing of Mycobacterium avium complex isolates. J Clin Microbiol 30, 479–484.

van Soolingen, D., Bauer, J., Ritacco, V. & 8 other authors (1998). IS1245 restriction fragment length polymorphism typing of Mycobacterium avium isolates: proposal for standardization. J Clin Microbiol 36, 3051–3054.

Wasem, C. F., McCarthy, C. M. & Murray, L. W. (1991). Multilocus enzyme electrophoresis analysis of the Mycobacterium avium complex and other mycobacteria. J Clin Microbiol 29, 264–271.

Wayne, L. G. & Sramek, H. A. (1992). Agents of newly recognized or infrequently encountered mycobacterial diseases. Clin Microbiol Rev 5, 1–25.

Yakrus, M. A. & Good, R. C. (1990). Geographic distribution, frequency, and specimen source of Mycobacterium avium complex serotypes isolated from patients with acquired immunodeficiency syndrome. J Clin Microbiol 28, 926–929.

Received 30 December 2003; revised 15 March 2004; accepted 22 March 2004.