Comparative whole-genome analyses reveal over 100 putative phase-variable genes in the pathogenic Neisseria spp.

Lori A. S. Snyder (1), Sarah A. Butcher (2) and Nigel J. Saunders (1)

(1) The Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK
(2) Oxford University Bioinformatics Centre, The Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK

Author for correspondence: Nigel J. Saunders. Tel: +44 1865 275521. Fax: +44 1865 275515. e-mail: saunders@molbiol.ox.ac.uk


   ABSTRACT
Previously, a complete genome analysis of Neisseria meningitidis strain MC58 revealed the largest repertoire of putative phase-variable genes described in any species to date. Initial comparisons with two incomplete Neisseria spp. genome sequences available at that time revealed differences in the repeats associated with these genes in the form of polymorphisms, the absence of the potentially unstable elements in some alleles, and in the repertoire of the genes that were present. Analyses of the complete genomes of N. meningitidis strain Z2491 and Neisseria gonorrhoeae strain FA1090 have been performed and are combined with a comprehensive comparative analysis between the three available complete genome sequences. This has increased the sensitivity of these searches and provided additional contextual information that facilitates the interpretation of the functional consequences of repeat instability. This analysis identified: (i) 68 phase-variable gene candidates in N. meningitidis strain Z2491, rather than the 27 previously reported; (ii) 83 candidates in N. gonorrhoeae strain FA1090; and (iii) 82 candidates in N. meningitidis strain MC58, including an additional 19 identified through cross-comparisons with the other two strains. In addition to the 18 members of the opa gene family, a repertoire of 119 putative phase-variable genes is described, indicating a huge potential for diversification mediated by this mechanism of gene switching in these species that is central to their interactions with the host and environmental transitions. Eighty-two of these are either known (14) or strong (68) candidates for phase variation, which together with the opa genes make a total of 100 identified genes. The repertoires of the genes identified in this analysis diverge from the different species groupings, indicating horizontal exchange that significantly affects the species and strain complements of these genes.

Keywords: phase variation, Neisseria gonorrhoeae, Neisseria meningitidis, repeat, genome analysis

Abbreviations: HPT, homopolymeric tract


   INTRODUCTION
The pathogenic Neisseria spp. Neisseria meningitidis and Neisseria gonorrhoeae are causative agents of meningitis and septicaemia, and gonorrhoea respectively. These populations are characterized by genetic diversity at several levels. Horizontal exchange of DNA between strains and neisserial species creates a situation in which sequences present within clones are available over time to the rest of the population (Bowler et al., 1994 ; Feil et al., 1995 ; Saunders et al., 1999 ). There is extensive allelic diversity in some genes, particularly those under antigenic selection pressures (Malorny et al., 1998 ). There are genes within a strain that undergo recombination between expressed and silent cassettes to generate diversity within clonal populations (Haas & Meyer, 1987 ). Finally, there are genes that are switched on and off by phase variation that can provide a large repertoire of phenotypes from within a clonal population, which can provide adaptation to changing environmental conditions (Sparling et al., 1986 ; Stern et al., 1986 ; Meyer & van Putten, 1989 ; Yang & Gotschlich, 1996 ; Saunders et al., 2000 ). In Neisseria spp., phase variation is consistently associated with reversible changes within simple DNA repeats, composed of repeated sequence motifs of less than 10 nt. These repeats are located either within ORFs or within promoters. Repeats located within ORFs alter the relative translational reading frame of the sequence 5' and 3' of the repeat as the number of repeats alters (Stern et al., 1986 ; Sparling et al., 1986 ; Stibitz et al., 1989 ). Changes in the length of repeats located within promoters alter the relative position of promoter components and thereby influence transcription (Sarkari et al., 1994 ; van der Ende et al., 1995 ).
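
The effect of an intragenic repeat on the reading frame can be illustrated with a toy sequence. The following is a minimal sketch and not part of the original analysis: the ORF, the helper names `toy_orf`, `downstream_frame` and `first_stop`, and the choice of a poly-C tract with an in-frame length of 6 are all hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def toy_orf(n_c):
    """Hypothetical ORF: start codon, a poly-C tract of n_c bases, fixed 3' region."""
    return "ATG" + "C" * n_c + "GCTGATAAAGGTTAA"

def downstream_frame(tract_length, in_frame_length):
    """Frame offset (0, 1 or 2) of the sequence 3' of the tract,
    relative to a tract length known to leave the ORF in-frame."""
    return (tract_length - in_frame_length) % 3

def first_stop(seq):
    """Index of the first stop codon read in frame from position 0, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None

for n in (6, 7, 9):
    print(n, downstream_frame(n, 6), first_stop(toy_orf(n)))
```

In this toy case the (C)6 and (C)9 tracts leave the terminal stop codon in frame, whereas (C)7 shifts the frame by one base and introduces a premature stop immediately 3' of the tract.
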

Phase variation is a strategy used by many bacterial species. The advent of complete bacterial genome sequences has facilitated the investigation and identification of the repertoires of genes that are regulated in this way on a whole-system basis. The first analysis of this type was performed in Haemophilus influenzae strain Rd in which 12 genes were identified on the basis of their association with tetrameric repeat sequence motifs (Hood et al., 1996). Search methodology was subsequently improved to include all possible repeat motifs and to present each in its sequence context. This was initially used to analyse the genome sequence of Helicobacter pylori strain 26695, identifying 26 candidate phase-variable genes (Saunders et al., 1998), several of which have been subsequently confirmed experimentally (Appelmelk et al., 1999, 2000; Peck et al., 1999; Wang et al., 1999; Josenhans et al., 2000; Yamaoka et al., 2000). In addition to increased sensitivity, this approach allowed several genes that had previously been suggested to be phase-variable in this strain, on the basis of repeat searching alone (Tomb et al., 1997), to be excluded on the basis of the sequence context of the repeat. Analysis of the N. meningitidis strain MC58 genome sequence revealed 65 putative phase-variable genes of which 44 were either previously recognized (13) or were particularly ‘strong’ candidates (31) (Saunders et al., 2000). Since the capacity of bacterial populations to generate diversity increases exponentially with the number of genes that can be phase-varied, the number of variable genes found in these whole-genome analyses suggests that clones are capable of switching between vast repertoires of phenotypes. The size of this repertoire and the nature of the genes for which a function can be determined indicate that this type of gene switching is a central aspect of the population’s response to changing conditions.

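
The exponential relationship can be made concrete under a simplifying assumption that is not stated in the text: each phase-variable gene switches independently between exactly two states (on and off). Under that idealization n genes give at most 2^n combinations. The function name below is hypothetical, and the gene counts are those quoted above for H. influenzae Rd (12), H. pylori 26695 (26) and N. meningitidis MC58 (65).

```python
def phenotype_combinations(n_genes):
    """Upper bound on on/off expression states for n independently switching genes."""
    return 2 ** n_genes

for n in (12, 26, 65):
    print(n, phenotype_combinations(n))
```

Promoter-located repeats may modulate expression levels rather than act as strict on/off switches, so this figure is best read as an order-of-magnitude illustration rather than a count of realized phenotypes.
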
As discussed previously (Saunders et al., 2000 ), the Neisseria spp. represent a unique context in which to interpret the results of this type of analysis because of the extensive existing body of knowledge on the length, composition and instability of repeats associated with phase variation. These include relatively short homopolymeric tracts (HPTs), e.g. of (C)7 in a gene for capsule biosynthesis (Hammerschmidt et al., 1996 ). These observations were consistent with Markov modelling which identified an over-abundance of repeats above this length in N. meningitidis strain MC58 (Saunders et al., 2000 ), which is also seen in the other two neisserial sequences (data not presented), suggestive of a mutational bias for instability in these sequences. Although there may be circumstances when additional regions of instability exist (van der Ende et al., 2000 ), when repeats that are consistently associated with gene switching are identifiable in appropriate sequence contexts this has greater predictive power than in other species. This is enhanced by the opportunity to perform comparative analyses of three genomes.

In contrast to the large repertoire of phase-variable candidates described in N. meningitidis strain MC58, the report of the complete sequence of N. meningitidis strain Z2491 described only 27 potential phase-variable genes (Parkhill et al., 2000 ). This suggested that a significant difference exists between the adaptability and behaviour of these two strains from different serogroups. Initial comparisons with the MC58 sequence suggested that strain Z2491 possesses a larger repertoire than initially described (Saunders et al., 2000 ). However, comparison between different sequences of this type can only be made on the basis of similar analyses using the same methodologies.

This paper describes independent analyses of the complete genome sequences of N. meningitidis strain Z2491 and N. gonorrhoeae strain FA1090 to determine their repertoires of potentially phase-variable genes. The results of these analyses were then combined with the previous analysis of N. meningitidis strain MC58 and a comprehensive comparison was made between the three genome sequences of the candidate genes identified in each individual sequence. This revealed many additional genes in all three genomes and clearly demonstrates the value of performing comparative analysis – the information it provides increasing both the sensitivity and specificity of the search results. It also provides a more comprehensive background for understanding the contribution of phase variation to the behaviour of these species.


   METHODS

The complete genome sequences of N. meningitidis serogroup A strain Z2491 (Parkhill et al., 2000) and N. gonorrhoeae strain FA1090 (publicly available from 1997, downloaded November 2000 from ftp://ftp.genome.ou.edu/pub/gono/gono2k.fa; GenBank no. AE004969) were analysed using previously described whole-genome analysis methodology (Saunders et al., 1998, 2000), using an ACEDB graphical interface (Durbin & Thierry-Mieg, 1991; available from www.sanger.ac.uk). Briefly, whole-genome sequences were analysed using BLASTN and BLASTX (Altschul et al., 1997; available from http://www.ncbi.nlm.nih.gov) searches against all sequences in the EMBL, TREMBL and SWISS-PROT databases classified below ‘metazoan’ and excluding viral sequences. Perfect repeats with motifs of 1–10 bases were identified using ARRAYFINDER (Hancock et al., 1999; software available from J. Hancock on request). As a complementary analysis, tandem repeats composed of repeated motifs of up to 1000 bp were identified using ETANDEM from EMBOSS (European Molecular Biology Open Software Suite, version 1.9.0). All repeats were displayed in their sequence contexts with respect to ORFs and termination codons, together with their protein and nucleotide homologies, using the tools within ACEDB. These complete genome sequence databases were then analysed for simple DNA repeats within their sequence contexts to determine the repertoire of putative phase-variable genes. HPTs of more than 6 Gs or Cs, or more than 8 As or Ts, were each investigated, as were repeats below these thresholds when associated with a frameshift. Other repeats composed of ≥4 copies of dinucleotides and ≥3 copies of tetramer and longer motifs were also investigated. All repeats were analysed to interpret the significance of the repeat on the basis of sequence context and the potential effect of length variation on the expression of the associated reading frames.

To facilitate this analysis, the complete genome of N. gonorrhoeae strain FA1090 was annotated and assigned sequence feature numbers (XNG#) using ACEDB as described previously (Tettelin et al., 2000), with the addition of BLASTN and BLASTX searches using the two meningococcal annotations (Parkhill et al., 2000; Tettelin et al., 2000). The annotated gonococcal ORFs were extracted from the genome sequence in FASTA format and translated using the TRANSEQ program from EMBOSS. Three new ACEDB databases were then generated in which each of the three genomes was compared to the annotations for the other two by BLASTN and BLASTX searches. These were used to investigate the analogous sequence regions containing repeats potentially associated with phase variation in the three genomes. Further analysis of regions was performed using the tools within GCG10 (Genetics Computer Group, Madison, WI, USA) through the Oxford University Bioinformatics Centre and/or Jellyfish (freeware available from http://www.biowire.com). Gonococcal reading frames are referred to on the basis of their similarity to meningococcal sequences where possible. The sequences of the remaining in-house-annotated genes and NMA1707a, an ORF not annotated by Parkhill et al. (2000), are available from the Saunders group web page (via http://www.path.ox.ac.uk). Some of these data are also available as supplementary data on the journal web site (http://www.mic.sgmjournals.org).
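
The repeat thresholds given above can be expressed as a short scan over a nucleotide string. This is a minimal sketch using regular expressions; it is not ARRAYFINDER or ETANDEM and does not reproduce their behaviour. The function names and the test sequence are hypothetical, only perfect repeats are detected, and the sequence-context interpretation (ORFs, promoters, frameshifts) that the analysis depends on is not attempted.

```python
import re

def _is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]

def _tandem(seq, k, min_copies):
    """Yield (start, motif, copies) for perfect tandem repeats of k-base motifs."""
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
    pos = 0
    while True:
        m = pattern.search(seq, pos)
        if m is None:
            return
        motif = m.group(1)
        if _is_primitive(motif):
            yield (m.start(), motif, len(m.group()) // k)
            pos = m.end()
        else:
            # Skip pseudo-motifs such as 'GG' or 'AGAG', which belong to
            # the homopolymer or dinucleotide classes instead.
            pos = m.start() + 1

def find_simple_repeats(seq):
    """Return sorted (start, motif, copies) tuples meeting the stated thresholds."""
    seq = seq.upper()
    hits = []
    # HPTs: more than 6 G or C (>= 7), more than 8 A or T (>= 9).
    for m in re.finditer(r"G{7,}|C{7,}|A{9,}|T{9,}", seq):
        hits.append((m.start(), m.group()[0], len(m.group())))
    # Dinucleotide motifs: at least 4 copies.
    hits.extend(_tandem(seq, 2, 4))
    # Tetramer and longer motifs (up to 10 bases): at least 3 copies.
    for k in range(4, 11):
        hits.extend(_tandem(seq, k, 3))
    return sorted(hits)

example = "TTGA" + "C" * 7 + "GAT" + "CAAG" * 3 + "CT" + "TA" * 4 + "GG"
print(find_simple_repeats(example))
```

Repeats below these thresholds that are associated with a frameshift, which were also investigated in the analysis, would need the ORF context and are outside the scope of this sketch.
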


   RESULTS AND DISCUSSION
Whole-genome analysis of N. meningitidis strain Z2491 for potential phase-variable genes
With their description of the complete sequence, Parkhill et al. (2000; supplementary information, Table 3) published a table of 27 tandem repeat sequences indicative of phase variation in the serogroup A N. meningitidis strain Z2491. The current analysis identified 54 candidate genes (Table 1 & Table 4), including some but not all of those proposed by Parkhill et al. (2000) (note that the repeat described in the original sequence paper table NMA0406 is in the gene annotated as NMA0407). Twenty-nine (53·7%) of these genes were not previously identified as candidates in N. meningitidis strain MC58 (Saunders et al., 2000 ).


Table 3. Modified putative phase-variable genes in N. meningitidis strain MC58 and comparison of the genes in the other sequences

 

Table 1. Putative phase-variable genes identified by genome analysis of N. meningitidis strain Z2491 and comparison of the genes in the other sequences

 

Table 4. opa genes of three pathogenic Neisseria spp.

 
Five of the initially described genes (NMA0402, NMA0419, NMA0072, an unascribed non-coding region and NMA1996) were excluded from this list because of the nature and/or location of the repeats. The repeat associated with NMA0402 (truA) is neither within the coding region nor does it appear to be associated with a promoter. NMA0419 is associated with a (GC)6 repeat towards the 3' end of the gene where length variation would not alter the expression of most of the gene and alteration within which would actually increase the length of the final gene product. Since this is an untypical context for a repeat associated with phase variation and also since (GC)n repeats would have a relatively high Tm and have not yet been demonstrated to be associated with phase variation, this gene was excluded. On similar grounds NMA0072 (pdxA) with a (GC)6 repeat was also eliminated, as was an additional gene containing a (GC)6 repeat that was not included in the previously published list from strain Z2491 (NMA0263 lpxC/envA). Parkhill et al. (2000) were correct in ascribing a (G)11 to a non-coding intergenic sequence which was therefore not reported in this study. Finally, the (67 bp)3 repeat within NMA1996 (natD) is located in the 3' end of the gene and is longer than repeats currently associated with phase variation.

Twenty-one of the candidate genes in strain Z2491 (listed in Table 1) were not identified by either of the previous whole-genome analyses of N. meningitidis (Parkhill et al., 2000; Saunders et al., 2000). Seven of these are hypothetical proteins: NMA0640, NMA1313, NMA1373, NMA1436, NMA1457, NMA1562 and NMA2111. NMA1707a was not annotated by Parkhill et al. (2000) and is located between NMA1707 and NMA1708. Fourteen of the identified genes have defined functions or have been assigned putative functions on the basis of homology. One of these, NMA1371 (vacB), is a homologue of genes associated with virulence through their involvement in host-cell colonization (Tobe et al., 1992; Ruiz-Lozano & Bonfante, 2000) and is the type of virulence-associated gene typically associated with phase variation. A number of the other genes are apparently metabolic function-related genes indicative of a role of phase variation in other forms of niche adaptation. These include: NMA0555 (ppk), which encodes polyphosphate kinase and is responsible for the synthesis of polyphosphate (Tinsley & Gotschlich, 1995); NMA0831, a homologue of potD from E. coli, a polyamine-binding periplasmic protein which is part of an ABC transport system for uptake of polyamine (Antognoni et al., 1999); NMA2028, a homologue of amiC, which negatively regulates the amide-inducible aliphatic amidase operon in Pseudomonas aeruginosa (Wilson & Drew, 1991; O’Hara et al., 1999); NMA2216, a homologue of bioH, which may be involved in biotin biosynthesis (Eisenberg, 1987); and NMA2103, a homologue of thiL, which is involved in thiamine biosynthesis (Imamura & Nakayama, 1982). One ORF identified in this analysis, the putative hydrolase NMA0286/NMA0285, is not currently annotated as frameshifted by its repeat tract, as this analysis suggests it is, and it therefore has two annotation numbers for the 5' and 3' ends of the gene. Also identified were NMA0074 (gidA), NMA0579 (a putative prolyl oligopeptidase family protein), NMA0611 (Mrp/NBP35 family protein), NMA0995 (recB), NMA1149 (sucA), NMA1221 (a phage integrase homologue) and NMA1656 (dnaX).

Whole-genome analysis of N. gonorrhoeae strain FA1090 for potential phase-variable genes
The genome sequence of N. gonorrhoeae strain FA1090 has been publicly available since 1997, and was completed, contiguated and publicly released in this form in September 2000 (GenBank no. AE004969; also http://www.genome.ou.edu/gono.gb). Seventy-two putative phase-variable genes, including the 11 opa genes, were identified in this gonococcal strain (Table 2 & Table 4). Excluding the opa genes, only 20 of these were previously identified between the analyses of Saunders et al. (2000) and Parkhill et al. (2000) in N. meningitidis. In total 39 genes not previously identified as phase-variable were found in this analysis of strain FA1090.


Table 2. Putative phase-variable genes identified by genome analysis of N. gonorrhoeae strain FA1090 and comparison of the genes in the other sequences

 
Once the opa genes, which are numerous in the gonococci but are also present in the meningococci, lgtC (XNG2047) and lgtD (XNG2048), which are LPS biosynthetic genes known to be present in other meningococcal strains (Yang & Gotschlich, 1996 ; GenBank no. U14554; Jennings et al., 1999 ), and XNG1207a and XNG1577, which are part of a paralogous family with XNG1788 present in meningococci (NMB0297 and NMA0212), are accounted for, there are currently 11 candidate phase-variable genes that are unique to N. gonorrhoeae. Only four of these are not hypothetical proteins: XNG1341, a putative adhesin which is inactivated in strain FA1090 by two frameshifts in addition to that caused by the (CAAG)20 repeat; and XNG0470, XNG1014 and XNG1513 which form a paralogous family of endodeoxyribonucleases (84·3% nucleotide identity). Of these paralogues, XNG0470 has a (G)7 HPT and is frameshifted while XNG1014 and XNG1513 have similarly located (G)6 HPTs and are in-frame. The remaining genes unique to the gonococcal strain FA1090 are XNG0503a, XNG0473, XNG1000, XNG1511, XNG1733, XNG1834 and XNG1856.

Comparisons using three genes that appear to be non-functional in this gonococcal sequence, but which are associated with repeats likely to mediate switching have revealed previously unrecognized potential phase-variable genes in N. meningitidis. XNG0400, a putative glycosyltransferase, is inactive in N. gonorrhoeae and not a candidate for phase variation on the basis of its (A)7 repeat alone. The meningococcal homologues are not only intact, but also have polymorphisms in the poly-A tract, which is associated with a frameshift in NMB0846 and is in-frame in NMA1058/NMA1057. The azlC-related protein, XNG0439, has a (G)6 in the frameshifted and degenerate gonococcal gene, while in the meningococci (NMB0892 and NMA1111) the (G)5 tract is in-frame. HPTs 5 bp in length are relatively stable in the LPS genes of N. meningitidis (Jennings et al., 1999 ), which suggests that other strains would have the potential for phase variation of this gene at high frequency only with longer repeats. In N. gonorrhoeae strain FA1090, tspA (XNG1534) has five frameshifts, one of which is caused by a (C)7 that in the two N. meningitidis strains is a (C)6 (NMB0341 and NMA2146). Although the gonococcal gene has multiple frameshifts, the polymorphism in the HPT suggests the potential for phase variability of the tspA gene in N. meningitidis. TspA is a T-cell and B-cell stimulating antigen considered to be a possible vaccine candidate in N. meningitidis (Kizil et al., 1999 , GenBank no. AJ010113).

The putative phase-variable genes from N. gonorrhoeae include two genes that have been previously investigated in the Neisseria spp. but not previously reported to be phase-variable. sodB (XNG0431) is frameshifted in the region of an (A)7 in the FA1090 genome while the meningococcal sequences (NMB0884 and NMA1104) and the GenBank entry for sodB from an unspecified N. gonorrhoeae strain (GenBank no. AY010758) have (A)6 at the equivalent location. Variation associated with a repeat of this type has only been described in porA to date (van der Ende et al., 2000 ). The other previously identified gene, lldD (XNG0610; previously lldA; Erwin & Gotschlich, 1996 ), encodes an L-lactate dehydrogenase. The phase variation of this gene has not previously been recognized.

In Escherichia coli and Thermus thermophilus, dnaX contains a repeat that has been demonstrated to generate ribosomal frameshifting two-thirds of the way into the gene, such that two proteins are translated from one gene, the γ and τ subunits of the DNA polymerase III holoenzyme (Blinkowa & Walker, 1990; Flower & McHenry, 1990; Tsuchihashi & Kornberg, 1990; Yurieva et al., 1997). While the (C)7 identified this as a potential phase-variable gene in N. gonorrhoeae strain FA1090 and N. meningitidis strain Z2491, this HPT is located one-third of the way through the coding sequence. A second C HPT, a (C)6, is present in two of the Neisseria spp. in a location similar to the poly-A tracts of E. coli and T. thermophilus, but is not present in meningococcal strain MC58. This suggests that these tracts are not involved in the joint generation of equivalent γ and τ subunits of the DNA polymerase III holoenzyme in strain MC58, although their possible instability in the other strains would need further experimental investigation.

It is interesting to note that the analysis of the gonococcal genome indicates that a homologue of ppx, the gene encoding exopolyphosphatase, is a phase-variable candidate (XNG0954), while in the meningococcal genomes it is not (NMB1467 and NMA1679). In the meningococcal genomes, however, the ppk gene (NMB1900 and NMA0555), encoding polyphosphate kinase, is potentially phase-variable while the gonococcal ppk is not (XNG0003). PPX hydrolyses polyphosphate (polyP) to orthophosphate (Pi) and PPK synthesizes polyP from Pi (Akiyama et al., 1993 ). It appears that each species has the potential to phase-vary one of these two opposing enzymes.

Comparative analysis of the potential phase-variable genes of N. meningitidis strain MC58 revisited
Several aspects of the original analysis of N. meningitidis strain MC58 can be reappraised. In Table 1 of Saunders et al. (2000) , each repeat identified in N. meningitidis strain MC58 was evaluated for its presence in the other two Neisseria spp. genome sequences, including the then incomplete sequence of N. gonorrhoeae strain FA1090. The other sequences were classified as having a repeat of the same length, a polymorphic repeat (i.e. the repeat is of a different length), no repeat at the equivalent location or no homologue of the gene containing the repeat. These limited designations have been extended in this analysis, in which the NMA ORF designations for N. meningitidis strain Z2491 have been included to facilitate cross-comparisons (Table 1, Table 2 & Table 3), as have the actual sequences corresponding to the repeats (Table 3). In addition, two candidates have been removed from the original N. meningitidis strain MC58 list (Saunders et al. 2000 ). These are NMB0300 and NMB1140. NMB0300 is a hypothetical protein homologous to NMA2186. Both contain a G6 HPT but only the serogroup A sequence for this gene is intact. In the serogroup B strain MC58, NMB0300 is a dead gene that is frameshifted, but not at the (G)6 repeat. NMB1140, encoding the cell cycle protein mesJ, has been removed because it is consistently associated with a repeat tract of (A)8 in all three sequenced strains, it is probably essential and alternative annotation now suggests that this gene is probably not frameshifted.

The more comprehensive comparative analysis reveals that six genes described with either polymorphic repeats or containing no repeat in the other sequences from strain MC58 (Saunders et al. 2000 ) are degenerate in one of the other two sequenced strains. The haemoglobin receptor gene hmbR (NMB1668) is degenerate in N. gonorrhoeae strain FA1090, as are the serine protease (NMB1969) and the function-unknown genes NMB0312 and NMB0593. The hypothetical protein NMB1786 is non-functional in N. meningitidis strain Z2491. Lactoferrin-binding protein gene lbpB and the 5' region of the associated lbpA (NMB1540), which contains the (G)8 repeat, are not present in N. gonorrhoeae strain FA1090. The remaining fragment of lbpA contains two frameshifts. The association between ‘dead’ genes and phase variation may not be co-incidental. While it is possible for a phase-variable gene to be essential for some part of the normal life cycle of the organism, it is self evident that a gene that can be switched off in a programmed way cannot be essential under all conditions. If a mutation were to occur in a gene while it was not adaptive or while switched off when transcription-coupled repair might also be reduced, then these genes might be expected to be more prone to gene loss than others. Even phase-variable genes for which host interactions are well established, such as opc, can be lost in this way as in N. meningitidis strain S3446 (Saunders, 1999 ). It may be that phase-variable genes therefore represent a group of genes more prone to degeneration than others and the observation of ‘dead’ candidate genes may be supportive of their phase variability in other strains.

Some of the genes are described differently in the different annotations. The ‘adhesion and penetration protein homologue’/NMB1985 is included in this paper as hap and the ‘saccharide acetylase’/NMB1836 is listed as wbpC. The ‘cell adhesion molecule’/NMB2104 has been included as mafA-3 and does not contain the same (G)6 repeat and frameshift in strain Z2491 as it does in strain MC58. The hypothetical protein NMB1275/NMA1480 has an (AGCA)3 repeat in both meningococcal sequences, while the gonococcal sequence differs at four bases such that it is not repetitive (AACCGGCAAACA). The two meningococcal genes are annotated differently and comparisons including the gonococcal sequence now suggest that this gene is not frameshifted (in agreement with NMA1480). On the basis of the differences between the sequences and the presence of a tetramer frequently associated with unstable repeats this gene remains a candidate but should no longer be regarded as ‘strong’ on the basis of the sequence evidence alone.
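
The comparison of the meningococcal (AGCA)3 repeat with the gonococcal sequence can be checked position by position. The sketch below assumes that the two 12-base sequences are aligned end to end without gaps, which the text implies but does not state; the function name is hypothetical.

```python
def mismatch_positions(a, b):
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

meningococcal = "AGCA" * 3          # (AGCA)3 in NMB1275/NMA1480
gonococcal = "AACCGGCAAACA"         # non-repetitive gonococcal sequence

print(mismatch_positions(meningococcal, gonococcal))
```

Under this alignment the sequences differ at four positions, in agreement with the text, and three of the four substitutions fall within the first two copies of the motif.
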

Comparative analysis of the repeat-associated genes in the three genome sequences
In N. gonorrhoeae, 11 of the 72 candidates are the opa genes, of which the N. meningitidis strains MC58 and Z2491 have four and three, respectively (Table 4). Excluding the opa genes on the basis of the current analysis, N. meningitidis strain Z2491 has 51, N. gonorrhoeae strain FA1090 has 61 and N. meningitidis strain MC58 has 59 potential phase-variable genes. The previous large difference between the number of phase-variable genes between the serogroup A and B genome sequences is therefore more a reflection of the different analyses than the different strains. In addition to the opa genes (Table 4), cross comparisons reveal there to be a total repertoire of 119 distinct potentially phase-variable genes in the three Neisseria spp. strains investigated (Table 5).


Table 5. Repertoire of phase-variable genes in the pathogenic Neisseria spp.

 
In N. meningitidis strain Z2491, an additional seven candidates were found by comparison with N. gonorrhoeae strain FA1090 (NMA0337, NMA0954, NMA1077, NMA1090, NMA1589, NMA1679 and NMA2146) and a further seven from comparison with N. meningitidis strain MC58 (NMA0542, NMA0562, NMA0683, NMA1480, NMA1725, NMA1990 and NMA2146). This brings the total of candidate phase-variable genes to 68 in N. meningitidis strain Z2491, including 22 of the 27 previously reported (Parkhill et al., 2000 ).

Comparative analysis identified 11 additional candidates in N. gonorrhoeae strain FA1090. Five of these are from comparison with N. meningitidis strain Z2491 (homologues of NMA0286/NMA0285, NMA0478, NMA0619, NMA2175 and NMA2216) and six candidates from analysis of strain MC58 (homologues of NMB0488, NMB1507, NMB1525, NMB1734, NMB1913 and NMB1985). There are, therefore, 83 potential phase-variable genes in N. gonorrhoeae strain FA1090.

Comparison with the findings from the other two Neisseria spp. genomes reveals 19 more phase-variable gene candidates in N. meningitidis strain MC58, making 82 in total in this strain. These 19 additions to the single-genome analysis of Saunders et al. (2000), which account for 23% of the revised total, emphasize how comparative genomic analysis can extend the information accessible from a single genome sequence. These additional N. meningitidis strain MC58 candidates are NMB0040/NMB0039, NMB0270, NMB0341, NMB0377, NMB0385, NMB0741, NMB0785, NMB0846, NMB0872, NMB0955, NMB1001, NMB1202, NMB1255, NMB1350, NMB1467, NMB1846, NMB1900, NMB2093 and NMB2145.
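
The strain totals quoted in this section can be reconciled from the counts given earlier. The tally below is a sketch that uses the stated counts only and does not re-derive them from the gene identifiers. The MC58 starting figure of 63 is an inference made here, taken as the 65 candidates of the original analysis less the two removed (NMB0300 and NMB1140); the variable names are hypothetical.

```python
# Candidates from each single-genome analysis (opa genes included).
single_genome = {
    "Z2491": 54,
    "FA1090": 72,
    "MC58": 65 - 2,   # inferred: original 65 minus NMB0300 and NMB1140
}

# Candidates added through cross-comparison with the other two genomes.
added_by_comparison = {
    "Z2491": 7 + 7,   # from FA1090 and from MC58
    "FA1090": 5 + 6,  # from Z2491 and from MC58
    "MC58": 19,
}

totals = {s: single_genome[s] + added_by_comparison[s] for s in single_genome}
share_added_mc58 = round(100 * added_by_comparison["MC58"] / totals["MC58"], 1)

print(totals)
print(share_added_mc58)
```

On these figures the 19 MC58 additions correspond to roughly 23% of the final total of 82 for that strain.
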

By comparing the results from the three analyses of the Neisseria spp. genome sequences, likely candidacy as phase-variable genes can be strengthened by evidence of interstrain and interspecies polymorphisms in the repeat tract. The repeat polymorphism data from this study alone present compelling evidence of phase variation for nine genes (in addition to the opa genes) already known to be phase-variable: hpuA (Chen et al., 1998), pglA (Jennings et al., 1998), opc (Sarkari et al., 1994), porA (van der Ende et al., 1995), hmbR (Richardson & Stojiljkovic, 1999), pilC2 (Jonsson et al., 1991), lgtA (Yang & Gotschlich, 1996), frpB (renamed fetA by Carson et al., 2000) and lgtG (Banerjee et al., 1998). Twenty-nine ORFs not previously known to be phase-variable also have polymorphisms in the length of the repeat between the strains that make them clear candidates for translational phase variation and two candidates for transcriptional phase variation (Table 5). Although not evident from the sequenced N. meningitidis strains, interstrain variation in the HPT of NMB0415, dca, has also been demonstrated previously (Snyder et al., 2001).

As previously discussed the candidature of different genes varies with the length and composition of the repeat and the availability of polymorphic information (Saunders et al., 2000 ). With respect to the HPTs there is evidence for instability in C or G repeats of 7 bp in length (Hammerschmidt et al., 1996 ), whereas similar repeats of 5 bp are relatively stable (Jennings et al., 1999 ). A threshold of instability at 7 bp in length is also supported by Markov chain analysis (Saunders et al., 2000 ). On these grounds (C or G)6 repeats have not been identified as candidates unless they are also associated with a frameshift in which a longer repeat would be associated with a functional gene and/or variation in length or sequence polymorphism in one of the other sequences. (C or G)7 repeats are accordingly only considered to be moderate candidates unless they are associated with a frameshift or similar variability between the sequences. From this analysis, all candidate dinucleotide repeats with four copies are invariant between the strains. Further, variation of this type and length of repeat has not been demonstrated in this species, although dinucleotide repeats composed of AT/TA, CT/GA or CA/GT pairs are unstable in other Gram-negative species (van Ham et al., 1993 ; Saunders et al., 1998 ; Peck et al., 1999 ; Eckert & Yan, 2000 ). Given the weaker candidature of genes associated with these repeats, the 24 candidates associated with these repeats have been placed in a separate table (Table 6). An indication of the likelihood of phase variability of each of the 95 remaining candidates (in addition to the opa genes) is listed in Table 5. Of these, 82 (86·3%) are either known phase-variable genes (14) or are particularly strong candidates for phase variation (68). Of these 95 candidates, 42 contain frameshifts (44·2%) and 14 are dead genes (14·7%) in at least one of the strains analysed, while 8 candidates have promoter-associated repeats.
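
The tract-length rules for poly-G and poly-C repeats described in this paragraph can be restated as a small decision function. This is a minimal sketch and not the procedure used in the analysis: the function name, the boolean arguments and the returned labels are hypothetical, only the (C or G)6 and (C or G)7 cases stated above are encoded, and the condition that a longer repeat would be associated with a functional gene is simplified to a single frameshift flag.

```python
def classify_gc_tract(length, frameshift=False, polymorphic=False):
    """Label a poly-G or poly-C tract of 6 or 7 bp using the rules in the text.

    frameshift:  the tract is associated with a frameshift (simplified flag)
    polymorphic: tract length or sequence differs in one of the other genomes
    The labels returned are illustrative, not terms taken from the tables.
    """
    supported = frameshift or polymorphic
    if length == 6:
        # (C or G)6 tracts are listed only when supporting evidence exists.
        return "candidate" if supported else "not listed"
    if length == 7:
        # (C or G)7 tracts are moderate candidates unless supported.
        return "candidate with support" if supported else "moderate candidate"
    raise ValueError("only tract lengths 6 and 7 are covered by this sketch")

print(classify_gc_tract(6))
print(classify_gc_tract(6, frameshift=True))
print(classify_gc_tract(7, polymorphic=True))
```

Other tract lengths are deliberately rejected, since the paragraph gives no explicit rule for them beyond noting that 5 bp tracts are relatively stable.
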


Table 6. Reduced likelihood candidates with four copies of dinucleotide repeats

 
In addition to the sheer number of genes identified, another noteworthy feature is the diversity of functions that are observed in these genes. While a significant number (31 of 61 with known functions or homologies; 50·8%) of genes fall within the functional categories traditionally associated with phase variation and virulence – namely surface proteins, LPS and sugar metabolism genes, toxins and restriction modification genes – a large number of the remainder (30 of 61; 49·2%) appear to have other functions (Table 5). This in part probably reflects the success of this approach in identifying the role of these genes in as yet undefined aspects of virulence. In addition, it also probably reflects the true role of phase variation as a general mechanism of environmental and niche adaptation, of which changes due to colonization niches and immune responses are only an important subset.

Although the difference between the meningococcal strains is not as great as was initially indicated from the original sequence papers, there is still a significant difference between the two sequenced strains. To put this in context, there is less difference in the total repertoire of variable genes between the gonococcus (83 candidates) and the meningococcal strain MC58 (82 candidates) than there is between serogroup A (68 candidates) and B meningococci. Of the candidate genes that are not common to all three genomes, two are unique to strain Z2491 (NMA0132 and NMA0407) while the other sequenced strains have 13 (N. gonorrhoeae strain FA1090) and 14 (N. meningitidis strain MC58) unique candidates. It should be noted that there is no clear distinction between which genes are present in each of the species, for seven candidate genes are common only to the gonococcus and the meningococcal strain Z2491 and three are common only to the gonococcus and meningococcal strain MC58 (Fig. 1). This suggests that there is a substantial degree of horizontal transfer between these species and strains that affects the gene complement as well as the more recognized mosaicisms, or that there have been multiple acquisition events from a common source. It is therefore meaningful to analyse and consider the genes of this type in these species as a whole, which further increases the potential for diversification and adaptability in these species. That there are more than 100 such genes, including over 80 known or strong candidates within only three strains, indicates that this type of switching event is central to the behaviour, adaptability and virulence of these organisms in a way that is unprecedented in other species studied to date.



Fig. 1. Venn diagram showing the distribution of different candidate genes in the three genomes.
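A three-way distribution of this kind can be derived with ordinary set operations once equivalent candidate genes in the three genomes have been given a shared label. The sketch below assumes that mapping has already been made; the function name and the gene labels are placeholders and do not correspond to entries in Table 5.

```python
def venn_regions(mc58, z2491, fa1090):
    """Partition three sets of candidate-gene labels into the seven
    regions of a three-set Venn diagram. Hypothetical helper."""
    return {
        "MC58 only": mc58 - z2491 - fa1090,
        "Z2491 only": z2491 - mc58 - fa1090,
        "FA1090 only": fa1090 - mc58 - z2491,
        "MC58 and Z2491 only": (mc58 & z2491) - fa1090,
        "MC58 and FA1090 only": (mc58 & fa1090) - z2491,
        "Z2491 and FA1090 only": (z2491 & fa1090) - mc58,
        "all three": mc58 & z2491 & fa1090,
    }


if __name__ == "__main__":
    # Placeholder labels, for illustration only.
    mc58 = {"geneA", "geneB", "geneC", "geneD"}
    z2491 = {"geneA", "geneB", "geneE"}
    fa1090 = {"geneA", "geneC", "geneE", "geneF"}
    for region, genes in venn_regions(mc58, z2491, fa1090).items():
        print(region, len(genes), sorted(genes))
```

Because the seven regions are disjoint, their sizes sum to the size of the union of the three sets, which gives a quick consistency check on any such tabulation.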

 

   ACKNOWLEDGEMENTS
 
L.A.S.S. is supported by the E. P. Abraham Trust. N.J.S. is supported by a Wellcome Trust Advanced Research Fellowship. This work is supported by the Wellcome Trust (N.J.S.). The N. gonorrhoeae sequence was obtained from the Gonococcal Genome Sequencing Project at the University of Oklahoma, which was supported by USPHS/NIH grant no. AI-38399 (L. A. Lewis,1 A. F. Gillaspy,1 R. E. McLaughlin,1 M. Gipson,1 T. Ducey,1 T. Ownbey,1 K. Hartman,1 C. Nydick,1 M. Carson,1 J. Vaughn,1 C. Thomson,1 L. Song,2 S. Lin,2 X. Yuan,2 F. Najar,2 M. Zhan,2 Q. Ren,2 H. Zhu,2 S. Qi,2 S. M. Kenton,2 H. Lai,2 J. D. White,2 S. Clifton,2 B. A. Roe2 & D. W. Dyer1; 1The University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA; 2The University of Oklahoma, Norman, OK, USA).


   REFERENCES
 
Akiyama, M., Crooke, E. & Kornberg, A. (1993). An exopolyphosphatase of Escherichia coli. The enzyme and its ppx gene in a polyphosphate operon. J Biol Chem 268, 633-639.

Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389-3402.

Antognoni, F., Del Duca, S., Kuraishi, A., Kawabe, E., Fukuchi-Shimogori, T., Kashiwagi, K. & Igarashi, K. (1999). Transcriptional inhibition of the operon for the spermidine uptake system by the substrate binding protein PotD. J Biol Chem 274, 1942-1948.

Appelmelk, B. J., Martin, S. L., Monteiro, M. A. & 10 other authors (1999). Phase variation in Helicobacter pylori lipopolysaccharide due to changes in the lengths of poly(C) tracts in alpha3-fucosyltransferase genes. Infect Immun 67, 5361-5366.

Appelmelk, B. J., Martino, M. C., Veenhof, E. & 7 other authors (2000). Phase variation in H type I and Lewis a epitopes of Helicobacter pylori lipopolysaccharide. Infect Immun 68, 5928-5932.

Banerjee, A., Wang, R., Uljon, S. N., Rice, P. A., Gotschlich, E. C. & Stein, D. C. (1998). Identification of the gene (lgtG) encoding the lipooligosaccharide beta chain synthesizing glucosyl transferase from Neisseria gonorrhoeae. Proc Natl Acad Sci USA 95, 10872-10877.

Blinkowa, A. L. & Walker, J. R. (1990). Programmed ribosomal frameshifting generates the Escherichia coli DNA polymerase III gamma subunit from within the tau subunit reading frame. Nucleic Acids Res 18, 1725-1729.

Bowler, L. D., Zhang, Q. Y., Riou, J. Y. & Spratt, B. G. (1994). Interspecies recombination between the penA genes of Neisseria meningitidis and commensal Neisseria species during the emergence of penicillin resistance in N. meningitidis: natural events and laboratory simulation. J Bacteriol 176, 333-337.

Carson, S. D., Stone, B., Beucher, M., Fu, J. & Sparling, P. F. (2000). Phase variation of the gonococcal siderophore receptor FetA. Mol Microbiol 36, 585-593.

Chen, C. J., Elkins, C. & Sparling, P. F. (1998). Phase variability of hemoglobin utilization in N. gonorrhoeae. Infect Immun 66, 987-993.

Durbin, R. & Thierry-Mieg, J. T. (1991). A C. elegans DataBase. Documentation, code and data available from http://www.acedb.org.

Eckert, K. A. & Yan, G. (2000). Mutational analyses of dinucleotide and tetranucleotide microsatellites in Escherichia coli: influence of sequence on expansion mutagenesis. Nucleic Acids Res 28, 2831-2838.

Eisenberg, M. A. (1987). Biosynthesis of biotin and lipoic acid. In Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology, pp. 544-550. Edited by F. C. Neidhardt, J. L. Ingraham, B. Magasanik, K. B. Low, M. Schaechter & H. E. Umbarger. Washington, DC: American Society for Microbiology.

van der Ende, A., Hopman, C. T., Zaat, S., Essink, B. B., Berkhout, B. & Dankert, J. (1995). Variable expression of class 1 outer membrane protein in Neisseria meningitidis is caused by variation in the spacing between the -10 and -35 regions of the promoter. J Bacteriol 177, 2475-2480.

van der Ende, A., Hopman, C. T. & Dankert, J. (2000). Multiple mechanisms of phase variation of PorA in Neisseria meningitidis. Infect Immun 68, 6685-6690.

Erwin, A. L. & Gotschlich, E. C. (1996). Cloning of a Neisseria meningitidis gene for L-lactate dehydrogenase (L-LDH): evidence for a second meningococcal L-LDH with different regulation. J Bacteriol 178, 4807-4813.

Feil, E., Carpenter, G. & Spratt, B. G. (1995). Electrophoretic variation in adenylate kinase of Neisseria meningitidis is due to inter- and intraspecies recombination. Proc Natl Acad Sci USA 92, 10535-10539.

Flower, A. M. & McHenry, C. S. (1990). The gamma subunit of DNA polymerase III holoenzyme of Escherichia coli is produced by ribosomal frameshifting. Proc Natl Acad Sci USA 87, 3713-3717.

Haas, R. & Meyer, T. F. (1987). Molecular principles of antigenic variation in Neisseria gonorrhoeae. Antonie Leeuwenhoek 53, 431-434.

van Ham, S. M., van Alphen, L., Mooi, F. R. & van Putten, J. P. (1993). Phase variation of H. influenzae fimbriae: transcriptional control of two divergent genes through a variable combined promoter region. Cell 73, 1187-1196.

Hammerschmidt, S., Muller, A., Sillmann, H. & 7 other authors (1996). Capsule phase variation in Neisseria meningitidis serogroup B by slipped-strand mispairing in the polysialyltransferase gene (siaD): correlation with bacterial invasion and the outbreak of meningococcal disease. Mol Microbiol 20, 1211-1220.

Hancock, J. M., Shaw, P. J., Bonneton, F. & Dover, G. A. (1999). High sequence turnover in the regulatory regions of the developmental gene hunchback in insects. Mol Biol Evol 16, 253-265.

Hood, D. W., Deadman, M. E., Jennings, M. P., Bisercic, M., Fleischmann, R. D., Venter, J. C. & Moxon, E. R. (1996). DNA repeats identify novel virulence genes in Haemophilus influenzae. Proc Natl Acad Sci USA 93, 11121-11125.

Imamura, N. & Nakayama, H. (1982). thiK and thiL loci of Escherichia coli. J Bacteriol 151, 708-717.

Jennings, M. P., Virji, M., Evans, D., Foster, V., Srikhanta, Y. N., Steeghs, L., van der Ley, P. & Moxon, E. R. (1998). Identification of a novel gene involved in pilin glycosylation in Neisseria meningitidis. Mol Microbiol 29, 975-984.

Jennings, M. P., Srikhanta, Y. N., Moxon, E. R., Kramer, M., Poolman, J. T., Kuipers, B. & van der Ley, P. (1999). The genetic basis of the phase variation repertoire of lipopolysaccharide immunotypes in Neisseria meningitidis. Microbiology 145, 3013-3021.

Jonsson, A. B., Nyberg, G. & Normark, S. (1991). Phase variation of gonococcal pili by frameshift mutation in pilC, a novel gene for pilus assembly. EMBO J 10, 477-488.

Josenhans, C., Eaton, K. A., Thevenot, T. & Suerbaum, S. (2000). Switching of flagellar motility in Helicobacter pylori by reversible length variation of a short homopolymeric sequence repeat in fliP, a gene encoding a basal body protein. Infect Immun 68, 4598-4603.

Kizil, G., Todd, I., Atta, M., Borriello, S. P., Ait-Tahar, K. & Ala’Aldeen, D. A. (1999). Identification and characterization of TspA, a major CD4(+) T-cell- and B-cell-stimulating Neisseria-specific antigen. Infect Immun 67, 3533-3541.

Malorny, B., Morelli, G., Kusecek, B., Kolberg, J. & Achtman, M. (1998). Sequence diversity, predicted two-dimensional protein structure, and epitope mapping of neisserial Opa proteins. J Bacteriol 180, 1323-1330.

Meyer, T. F. & van Putten, J. P. (1989). Genetic mechanisms and biological implications of phase variation in pathogenic neisseriae. Clin Microbiol Rev Suppl 2, S139-S145.

O’Hara, B. P., Norman, R. A., Wan, P. T., Roe, S. M., Barrett, T. E., Drew, R. E. & Pearl, L. H. (1999). Crystal structure and induction mechanism of AmiC-AmiR: a ligand-regulated transcription antitermination complex. EMBO J 18, 5175-5186.

Parkhill, J., Achtman, M., James, K. D. & 25 other authors (2000). Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491. Nature 404, 502-506.

Peak, I. R. A., Jennings, M., Hood, D. W. & Moxon, E. R. (1999). Tetranucleotide repeats identify novel virulence determinant homologues in Neisseria meningitidis. Microb Pathog 26, 13-23.

Peck, B., Ortkamp, M., Diehl, K. D., Hundt, E. & Knapp, B. (1999). Conservation, localization and expression of HopZ, a protein involved in adhesion of Helicobacter pylori. Nucleic Acids Res 27, 3325-3333.

Richardson, A. R. & Stojiljkovic, I. (1999). HmbR, a hemoglobin-binding outer membrane protein of Neisseria meningitidis, undergoes phase variation. J Bacteriol 181, 2067-2074.

Ruiz-Lozano, J. M. & Bonfante, P. (2000). A Burkholderia strain living inside the arbuscular mycorrhizal fungus Gigaspora margarita possesses the vacB gene, which is involved in host cell colonization by bacteria. Microb Ecol 39, 137-144.

Sarkari, J., Pandit, N., Moxon, E. R. & Achtman, M. (1994). Variable expression of the Opc outer membrane protein in Neisseria meningitidis is caused by size variation of a promoter containing poly-cytidine. Mol Microbiol 13, 207-217.

Saunders, N. J. (1999). Bacterial phase variation associated with repetitive DNA. PhD thesis, The Open University.

Saunders, N. J., Peden, J. F., Hood, D. W. & Moxon, E. R. (1998). Simple sequence repeats in the Helicobacter pylori genome. Mol Microbiol 27, 1091-1098.

Saunders, N. J., Hood, D. W. & Moxon, E. R. (1999). Bacterial evolution: bacteria play pass the gene. Curr Biol 9, R180-R183.

Saunders, N. J., Jeffries, A. C., Peden, J. F., Hood, D. W., Tettelin, H., Rappuoli, R. & Moxon, E. R. (2000). Repeat-associated phase variable genes in the complete genome sequence of Neisseria meningitidis strain MC58. Mol Microbiol 37, 207-215.

Snyder, L. A. S., Saunders, N. J. & Shafer, W. M. (2001). A putatively phase variable gene (dca) required for natural competence in Neisseria gonorrhoeae but not Neisseria meningitidis is located within the division cell wall (dcw) gene cluster. J Bacteriol 183, 1233-1241.

Sparling, P. F., Cannon, J. G. & So, M. (1986). Phase and antigenic variation of pili and outer membrane protein II of Neisseria gonorrhoeae. J Infect Dis 153, 196-201.

Stern, A., Brown, M., Nickel, P. & Meyer, T. F. (1986). Opacity genes in Neisseria gonorrhoeae: control of phase and antigenic variation. Cell 47, 61-71.

Stibitz, S., Aaronson, W., Monack, D. & Falkow, S. (1989). Phase variation in Bordetella pertussis by frameshift mutation in a gene for a novel two-component system. Nature 338, 266-269.

Tettelin, H., Saunders, N. J., Heidelberg, J. & 39 other authors (2000). Complete genome sequence of Neisseria meningitidis serotype B strain MC58. Science 287, 1809-1815.

Tinsley, C. R. & Gotschlich, E. C. (1995). Cloning and characterization of the meningococcal polyphosphate kinase gene: production of polyphosphate synthesis mutants. Infect Immun 63, 1624-1630.

Tobe, T., Sasakawa, C., Okada, N., Honma, Y. & Yoshikawa, M. (1992). vacB, a novel chromosomal gene required for expression of virulence genes on the large plasmid of Shigella flexneri. J Bacteriol 174, 6359-6367.

Tomb, J. F., White, O., Kerlavage, A. R. & 39 other authors (1997). The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388, 539-547.

Tsuchihashi, Z. & Kornberg, A. (1990). Translational frameshifting generates the gamma subunit of DNA polymerase III holoenzyme. Proc Natl Acad Sci USA 87, 2516-2520.

Wang, G., Rasko, D. A., Sherburne, R. & Taylor, D. E. (1999). Molecular genetic basis for the variable expression of Lewis Y antigen in Helicobacter pylori: analysis of the α(1,2) fucosyltransferase gene. Mol Microbiol 31, 1265-1274.

Wilson, S. & Drew, R. (1991). Cloning and DNA sequence of amiC, a new gene regulating expression of the Pseudomonas aeruginosa aliphatic amidase, and purification of the amiC product. J Bacteriol 173, 4914-4921.

Yamaoka, Y., Kwon, D. H. & Graham, D. Y. (2000). A M(r) 34,000 proinflammatory outer membrane protein (oipA) of Helicobacter pylori. Proc Natl Acad Sci USA 97, 7533-7538.

Yang, Q. L. & Gotschlich, E. C. (1996). Variation of gonococcal lipooligosaccharide structure is due to alterations in poly-G tracts in lgt genes encoding glycosyl transferases. J Exp Med 183, 323-327.

Yurieva, O., Skangalis, M., Kuriyan, J. & O’Donnell, M. (1997). Thermus thermophilis dnaX homolog encoding gamma- and tau-like proteins of the chromosomal replicase. J Biol Chem 272, 27131-27139.

Received 17 April 2001; revised 21 May 2001; accepted 5 June 2001.