USDA Agricultural Research Service, Russell Research Center, PO Box 5677, Athens, GA 30604-5677, USA
Author for correspondence: Richard Meinersmann. Tel: +1 706 546 3236. Fax: +1 706 546 3633. e-mail: rmeiners{at}asrr.arsusda.gov
ABSTRACT
Keywords: Campylobacter, flagellin, concerted evolution, DNA multiple-sequence analyses
INTRODUCTION
The genes encoding flagellin in Campylobacter (flaA and flaB) offer an opportunity to study gene duplication further. The sequences of both flaA and flaB have been published for two isolates of Campylobacter jejuni (Khawaja et al., 1992; Nuijten et al., 1990) and one isolate of Campylobacter coli (Logan et al., 1989). The coding regions of the flaA and flaB sequences were 1722-1731 bases in length, and flaB is separated from flaA by approximately 180 bases. In each pair of genes from a single isolate, flaA differed from flaB by about 5%. Variation of each of the two genes from one strain to another was as much as 30%, with most of the variation concentrated in a central region about a third the length of the gene (Khawaja et al., 1992; Logan et al., 1989; Nuijten et al., 1990). We have recently completed DNA sequence analyses of flaA from 15 Campylobacter isolates and verified the presence of a major hypervariable region from approximately base 700 to base 1450, and a short variable region between bases 450 and 600 (Meinersmann et al., 1997). In Salmonella, nanostructure analyses of the flagellin demonstrated that similar variable regions are located within domains of the protein that are surface-exposed (Yamashita et al., 1995). We expect that the flagellins from Campylobacter have a similar conformational structure. It has been hypothesized that the second copy of fla serves as a potential donor for reassortment and recombination of the DNA as a mechanism for creating new antigenic variants for immune avoidance (Alm et al., 1993). Wassenaar et al. (1993) supported this hypothesis by selecting a variant in which flaB apparently replaced a defective flaA.
A σ28-like promoter site has been identified for flaA and a σ54-like site has been identified for flaB (Guerry et al., 1990; Nuijten et al., 1990). An inverted repeat suggestive of a transcription terminator was found downstream from flaA and upstream of the flaB promoter (Khawaja et al., 1992; Logan et al., 1989; Nuijten et al., 1990). Different environmental factors appear to affect the separate promoters and, in culture-grown cells, flaB is expressed at a rate lower than that of flaA (Guerry et al., 1990) or not at all (Nuijten et al., 1990). Insertional mutagenesis of flaA yields cells with a truncated flagellum, greatly reduced motility, and reduced ability to colonize animals (Nachamkin et al., 1993; Wassenaar et al., 1991, 1993). Insertional mutagenesis of flaB yields cells that have flagella with apparently normal morphology and colonization ability and, perhaps, slightly decreased motility.
The high similarity between flaA and flaB in individual strains suggests that duplication of the gene is a relatively recent event. However, given the strain-to-strain variation of the fla genes, this conclusion is not certain. This study was undertaken to determine whether patterns of evolution could be identified that explain the maintenance of a pair of highly similar fla genes within each strain of Campylobacter. We expected that the hypothesis that the duplicates provided variants for phase variation was correct and we wanted to determine the extent of variation that mechanism would create. However, the results indicated that the evolution of the flaA and flaB genes in Campylobacter is coordinated. The residues that are predicted to be exposed are conserved between the flaA and flaB of a particular isolate, while the same region has a high variability between strains. At the same time, there are specific portions of the sequence that distinguish flaA from flaB and are conserved between strains.
METHODS
To obtain the DNA sequences for flaB, sequences upstream of flaB were first determined as described below in order to produce additional information on the non-coding intergenic region. Template DNA was generated by PCR using primers that annealed to a conserved portion of the 3' end of flaA paired with a downstream primer that annealed to a conserved portion at the 5' end of the known flaB sequences. The products of these PCR reactions were ligated into a T/A cloning vector and sequenced as described by Meinersmann et al. (1997). An upstream primer was then designed based on a conserved portion of the intergenic sequence upstream of the flaB start codon. This was paired with a primer that was designed to complement conserved sequence 3' distal to the flaB coding region of aligned sequences for strains VC167, TGH9011 and 81116. These primers were used in a PCR to amplify the entire flaB and a small amount of flanking sequence at both ends under permissive conditions (i.e. with an annealing temperature 5 °C lower than the expected optimum), and the products were sequenced as described by Meinersmann et al. (1997), utilizing the same primers as were used for flaA sequencing. The sequences of this final PCR product were used for the flaB data in the analyses described below.
Computational analyses.
Alignments were made of the DNA sequence of flaA from 16 strains or flaB from seven strains using CLUSTALX (Thompson et al., 1994) and the alignments were manually edited to move gaps inserted by the alignment program so that they did not interrupt the codon reading frame. Alignment scores were either unchanged or improved by such editing. The aligned sequences were analysed with PAUP* version 4.0b2a (Swofford, 1999) to reconstruct possible phylogenetic relationships and produce homoplasy indexes [homoplasy index=1 minus (the minimum possible number of steps between two individuals on a tree divided by the number of steps observed in the reconstructed tree between those two individuals); see Discussion for further details on homoplasy]. The aligned sequences were subjected to Sawyer's runs test (Sawyer, 1989), which looks for substitution patterns that are statistically consistent with recombination events, implemented in the program GENECONV version 1.02 (distributed by S. Sawyer at http://www.math.wustl.edu/~sawyer). This program performs the Bonferroni correction of Karlin-Altschul P values, which is the most conservative method of determining the P values.
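The two quantities named in this paragraph can be illustrated with a minimal Python sketch. It is not the code used by PAUP* or GENECONV; the function names `homoplasy_index` and `bonferroni` are hypothetical, and the Bonferroni step shown is the generic textbook form (raw P value multiplied by the number of comparisons, capped at 1), assumed here only for illustration.

```python
def homoplasy_index(min_steps: int, observed_steps: int) -> float:
    """Homoplasy index as defined in the text:
    1 - (minimum possible steps / steps observed on the reconstructed tree)."""
    if observed_steps <= 0 or min_steps < 0 or min_steps > observed_steps:
        raise ValueError("expected 0 <= min_steps <= observed_steps and observed_steps > 0")
    return 1.0 - min_steps / observed_steps


def bonferroni(p_value: float, n_comparisons: int) -> float:
    """Generic Bonferroni correction (an assumption, not GENECONV's own code):
    multiply the raw P value by the number of comparisons and cap at 1."""
    if not 0.0 <= p_value <= 1.0 or n_comparisons < 1:
        raise ValueError("expected 0 <= p_value <= 1 and n_comparisons >= 1")
    return min(1.0, p_value * n_comparisons)


# Hypothetical numbers: 5 steps is the minimum, 10 steps were observed.
print(homoplasy_index(5, 10))
# Hypothetical raw P value tested across 21 sequence pairs.
print(bonferroni(0.001, 21))
```

An index of 0 means the observed tree needs no more steps than the minimum (no homoplasy); values approaching 1 indicate that most observed steps are repeated or reversed changes.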
Alignments were analysed in the MULTICOMP program (Reeves et al., 1994) using the algorithm developed by Li (1993) for determining synonymous and non-synonymous substitutions. KS is the synonymous substitution rate and KA is the non-synonymous substitution rate expressed as the corrected number of substitutions per site. Rates greater than 1·00 are not real, but are due to corrections for possible multiple substitutions at a single site and remain proportional to the rate of neighbouring substitutions. The analyses were performed with a window size of 30 bases (10 codons). Use of longer windows gave less saturated results but the resolution was less than desired. To analyse synonymous and non-synonymous substitutions from flaA to flaB within strains, the sequences for seven strains were aligned pairwise for each strain, and each pair was analysed with the SITES program (distributed by Jody Hey, Rutgers University) to show the sites of synonymous and non-synonymous substitutions for each sequence pair. Aligned sequences were analysed with PAUP* 4.0b2a (Swofford, 1999) to construct the most parsimonious relationships and the homoplasy indexes were also determined with PAUP*.
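The windowed comparison described above can be sketched in Python as follows. This is a deliberately simplified illustration, not the Li (1993) estimator or the MULTICOMP or SITES implementations: it only counts codon positions where two aligned, in-frame sequences differ and labels each difference as synonymous or non-synonymous under the standard genetic code, with no correction for multiple substitutions. The names `classify_codons` and `window_counts` and the toy sequences are hypothetical.

```python
# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codons(seq_a: str, seq_b: str) -> list:
    """Label each aligned codon pair: 'S' (synonymous difference),
    'N' (non-synonymous difference) or '-' (identical, gapped or ambiguous)."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length and in frame")
    calls = []
    for pos in range(0, len(seq_a), 3):
        codon_a = seq_a[pos:pos + 3].upper()
        codon_b = seq_b[pos:pos + 3].upper()
        if codon_a == codon_b or codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            calls.append("-")
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            calls.append("S")
        else:
            calls.append("N")
    return calls


def window_counts(calls: list, window_codons: int = 10) -> list:
    """Slide a window (default 10 codons = 30 bases) one codon at a time and
    return (synonymous, non-synonymous) difference counts for each window."""
    return [
        (calls[start:start + window_codons].count("S"),
         calls[start:start + window_codons].count("N"))
        for start in range(0, len(calls) - window_codons + 1)
    ]


# Toy example with a 2-codon window so the short sequences yield two windows.
calls = classify_codons("GGTAAATTT", "GGCAAGTTA")
print(calls)
print(window_counts(calls, window_codons=2))
```

Raw counts such as these saturate quickly in highly variable regions, which is one reason corrected rates like KS and KA are preferred for the actual analysis.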
Peptide sequences.
The DNA sequences were translated to peptide sequences that were then aligned. One hundred per cent consensus peptide sequences were generated for the aligned flaA sequences and the aligned flaB sequences. Positions in which a consensus was not achieved were marked with an X, unless all the residues were in the same pam250S group of similar amino acids (Dayhoff et al., 1978), in which case the position was marked 0 for Ala, Gly, Pro, Ser, Thr; 1 for Asp, Glu, Asn, Gln, Asx, Glx; 2 for His, Lys, Arg; 3 for Ile, Leu, Met, Val; 4 for Phe, Trp, Tyr; or 5 for Cys. The two consensus sequences were aligned and a new consensus was generated.
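The consensus rule described above can be sketched as a short Python function. The group codes follow the text; the function name `consensus`, the one-letter residue codes (B for Asx, Z for Glx) and the treatment of gaps (a column containing a gap alongside residues is marked X) are assumptions made for this illustration.

```python
# Similarity groups as listed in the text, using one-letter amino acid codes.
GROUPS = {
    "0": "AGPST",   # Ala, Gly, Pro, Ser, Thr
    "1": "DENQBZ",  # Asp, Glu, Asn, Gln, Asx, Glx
    "2": "HKR",     # His, Lys, Arg
    "3": "ILMV",    # Ile, Leu, Met, Val
    "4": "FWY",     # Phe, Trp, Tyr
    "5": "C",       # Cys
}
RESIDUE_TO_GROUP = {res: code for code, members in GROUPS.items() for res in members}


def consensus(aligned: list) -> str:
    """100% consensus of aligned peptides: the residue if the column is
    invariant, the group code if all residues share one group, otherwise X."""
    if not aligned or len({len(seq) for seq in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    out = []
    for column in zip(*(seq.upper() for seq in aligned)):
        residues = set(column)
        if len(residues) == 1:
            out.append(column[0])
            continue
        groups = {RESIDUE_TO_GROUP.get(res) for res in residues}
        # None appears when a column holds a gap or an unknown symbol.
        if len(groups) == 1 and None not in groups:
            out.append(groups.pop())
        else:
            out.append("X")
    return "".join(out)


# Toy alignment (hypothetical peptides, not flagellin sequences).
print(consensus(["MAST", "MGSW", "MPTY"]))
```

In the toy alignment the first column is invariant, the second and third columns fall entirely within group 0, and the last column mixes groups 0 and 4 and is therefore marked X.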
RESULTS
DISCUSSION
The extremely low number of intrastrain synonymous differences between flaA and flaB in the region between bases 832 and 1515 implies editing or gene conversion with selection at the DNA level (Endo et al., 1996). The higher level of intrastrain flaA to flaB synonymous differences between bases 244 and 735 suggests selection for conservation at the protein or functional level without gene conversion (Endo et al., 1996). However, the observed rates for this region could be within the range that might occur in parts of the Campylobacter genome that do not have selection for diversity. No multiple-sequence analyses of Campylobacter housekeeping genes have yet been published that would allow this comparison to be made. From the KS/KA ratios in Table 2 it appears that the greatest functional constraint on sequence is in Block V and the lowest functional constraint is in Block II and Block IV. Block IV is the segment that has almost no differences between the flaA and flaB within individual strains. However, even with the degree of variability seen, one constraint that was maintained was the G+C content. The G+C content of the fla sequences from C. jejuni strains used in this study ranged from 35·0 to 37·2 mol% (mean 36·4 mol%).
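For readers who want to check a G+C figure of this kind on their own sequences, a minimal Python sketch follows. The function name `gc_content` is hypothetical, and the choice to ignore gaps and ambiguous bases in the denominator is an assumption of this sketch rather than something stated in the study.

```python
def gc_content(seq: str) -> float:
    """Return G+C content in mol% over unambiguous bases (A, C, G, T) only;
    alignment gaps and ambiguity codes are ignored (an assumption)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequence (hypothetical): 3 of 4 counted bases are G or C.
print(gc_content("GGCA-N"))
```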
Recombinant shuffling of the variable regions of flaA and flaB within a strain would not itself generate interstrain variability with intrastrain conservation. In fact, the rate of concerted evolution is expected to be higher when there are structural constraints (Li, 1997). Recombination errors that may increase variability should occur at the recombination junctions. If this were occurring with fla, then it is expected that the interstrain variability would be concentrated at the ends of the region that is intrastrain conserved. Alignment gaps were seen that are consistent with illegitimate recombinant events (Jakupciak & Wells, 1999). However, we have found that interstrain variability is greatest within the segment that is conserved intrastrain, perhaps indicating that the generation of diversity in that region is not recombination-dependent.
Campylobacter is competent for transformation (Wang & Taylor, 1990) and does not appear to be clonal (Aeschbacher & Piffaretti, 1989). Therefore, it is possible that generation of variability may involve recombinant shuffling of segments of fla between strains. Harrington et al. (1997) presented convincing evidence of interstrain recombination within the flaA gene. There would then need to be a simultaneous or sequential insertion of the new sequence into the analogous segment of the other copy of fla within the strain. Intrastrain recombinant events between the flaA and flaB that cause a loss of the interstrain-conserved regions, i.e. the regions that distinguish flaA from flaB within a strain, would have to be treated as an error and lost (purged). Unequal crossing-over appears to be a prevalent mechanism of concerted evolution in eukaryotes (Li, 1997). It is unlikely that unequal crossing-over occurs with Campylobacter fla because the mechanism usually involves changes in the copy number. Precision splicing of regions of the genes between flaA and flaB could occur, but putative insertion sequences or chi sequences (West, 1992) that would facilitate such a mechanism were not conserved within the fla sequences examined in this study. Also, recombination between the two copies of the flagellin would have had to invoke some copy-choice mechanism that favoured the low G+C content that is characteristic of the Campylobacter genome. That is to say, if a mutation occurred in one copy of the flagellin that substituted a new G or C, the amelioration process of the two copies would need to favour the A or T copy. Since transitional nucleotide substitutions are biochemically more favoured than transversions (Li, 1997), failure to do so would drive towards a 50% G+C content ratio. Alternatively, there may be non-replicative mechanisms for conversion of one fla sequence based on the other fla sequence, such as a mismatch repair mechanism. If there was no preference for which version of the gene was maintained, either mechanism, recombination or mismatch repair, might increase the rate of generation of diversity of the genes within a population of the organism. Some progeny of a cell with a mutation in the intrastrain-conserved segment would end up with the version that was originally flaA and others would end up with the flaB version. Gene conversion using mismatch repair of the two genes might also allow for simpler mechanisms that could favour adenosine or thymidine. The mean interstrain differences for flaA were slightly higher (about 162 bp difference) than the interstrain flaB differences (about 159 bp). If changes usually occurred first in flaA and were later transferred to flaB, a higher number of interstrain differences would be expected in flaA. That conclusion cannot be drawn from the data because the differences are not great enough.
Examination of the amino acid sequence differences between the consensus sequences of FlaA and FlaB may explain why two copies of flagellin are maintained. In the amino acid sequence from residue 290 to residue 480, only 71 of the residues were conserved in the aligned FlaA or FlaB amino acid sequences that were analysed. This is an indication that the flagellin protein has little functional dependence on primary amino acid sequence in that region. This is the region, however, that is maintained between the flaA and flaB within an individual strain and, based on homology with the Salmonella flagellin, is expected to be the bearer of surface-exposed epitopes (Yamashita et al., 1995). Furthermore, the high homoplasy indexes for flaBs imply a stronger selective pressure on flaB than on flaA. Six of the 12 amino acids that are strictly conserved in each copy of FlaA but differ between FlaA and FlaB represent differences in amino acids that can be O-linked (serine or threonine). Five of these modifiable amino acids are in the carboxy-terminal region. Doig et al. (1996) have demonstrated that Campylobacter flagellins are glycosylated. It would seem that the different copies of the flagellin gene offer the greatest opportunity for antigenic phase variation by changing the glycosylation sites.
Copy-to-copy conversion can increase the mutation rate of the gene by bringing together the mutations from two sites. This would contribute to clonal differences of the organism but at the expense of the ability to have greater phase variation. The advantage of maintaining two copies of the flagellin gene remains unclear, but it would be efficient if they contributed both to function or phase variation and to increasing the genetic pool for producing greater clone-to-clone variation. Presumably, host immune responses would exert selection for diversity that would drive the expansion of new alleles of flagellin. The two independent promoters for flaA and flaB (Guerry et al., 1990; Nuijten et al., 1990) would allow a mechanism for phase variation, but the host factors that might activate either promoter are not known.
The pattern of concerted evolution noted in this study is unusual for prokaryotes. Most Salmonella maintain two copies of flagellin genes, fliC and fljB. Okazaki et al. (1993) demonstrated exchange of genetic information between the phase variant flagellin genes fliC and fljB of S. typhimurium after selection against expression of parental fliC epitopes. Neisseria gonorrhoeae uses gene conversion to exchange sequence between the expressed pilin gene and non-expressed pseudogenes (Zhang et al., 1992). We are not aware of any study that has analysed the Salmonella flagellin or Neisseria pilin gene families for concerted evolution.
Concerted evolution is defined as lower divergence between copies of a gene within an individual than between that gene and its counterparts in other species (Li, 1997). We have demonstrated that concerted evolution of segments of Campylobacter fla appears to occur at a rate that at least equals the rate of clonal divergence. There is no information currently available on the divergence rate of other genes in Campylobacter. Flagellin clearly has a greater diversity than expected for most of the genome. Thus it can be concluded that the intrastrain-conserved region (Block IV) in Campylobacter fla demonstrates concerted evolution relative to the remainder of the flagellin gene, but we do not know if this is true relative to the entire genome. The concerted evolution occurring in Block IV must happen by gene conversion events that occur almost as fast as the mutation rate, since the Block IV sequences of flaA and flaB are almost identical within a strain.
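The comparison that underlies this conclusion, intrastrain flaA to flaB divergence versus interstrain divergence of the same segment, can be sketched in Python as below. This is only an illustration of the idea using uncorrected difference counts; the function names and the toy sequences are hypothetical and are not data from this study.

```python
from itertools import combinations


def differences(seq_a: str, seq_b: str) -> int:
    """Count mismatched positions between two aligned sequences, skipping gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b and "-" not in (a, b))


def mean(values) -> float:
    values = list(values)
    return sum(values) / len(values)


def intra_vs_inter(strains: dict) -> tuple:
    """strains maps a strain name to (flaA_segment, flaB_segment).
    Returns (mean intrastrain flaA-flaB differences,
             mean interstrain differences over flaA-flaA and flaB-flaB pairs)."""
    if len(strains) < 2:
        raise ValueError("need at least two strains")
    intra = mean(differences(fla_a, fla_b) for fla_a, fla_b in strains.values())
    inter = mean(
        differences(x[copy], y[copy])
        for x, y in combinations(strains.values(), 2)
        for copy in (0, 1)
    )
    return intra, inter


# Hypothetical toy segments; a pattern with intra << inter is what
# concerted evolution of a segment would look like.
toy = {
    "strain1": ("AAAA", "AAAA"),
    "strain2": ("CCAA", "CCAA"),
    "strain3": ("AGGA", "AGGT"),
}
intra, inter = intra_vs_inter(toy)
print(intra, inter)
```

A segment such as Block IV would show an intrastrain mean near zero alongside a large interstrain mean, whereas a segment evolving independently in each copy would show intrastrain differences of the same order as the interstrain ones.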
ACKNOWLEDGEMENTS
REFERENCES
Aeschbacher, M. & Piffaretti, J. C. (1989). Population genetics of human and animal enteric Campylobacter isolates. Infect Immun 57, 1432-1437.
Alm, R. A., Guerry, P. & Trust, T. J. (1993). Significance of duplicated flagellin genes in Campylobacter. J Mol Biol 230, 359-363.
Dayhoff, M., Schwartz, R. M. & Orcutt, B. C. (1978). Atlas of Protein Sequence and Structure, vol. 5, suppl. 3, p. 345. Silver Spring, MD: National Biomedical Research Foundation.
Doig, P., Kinsella, N., Guerry, P. & Trust, T. J. (1996). Characterization of a post-translational modification of Campylobacter flagellin: identification of a sero-specific glycosyl moiety. Mol Microbiol 19, 379-387.
Endo, T., Ikeo, K. & Gojobori, T. (1996). Large-scale search for genes on which positive selection may operate. Mol Biol Evol 13, 685-690.
Guerry, P., Logan, S. M., Thornton, S. & Trust, T. J. (1990). Genomic organization and expression of Campylobacter flagellin genes. J Bacteriol 172, 1853-1860.
Harrington, C. S., Thomson-Carter, F. M. & Carter, P. E. (1997). Evidence for recombination in the flagellin locus of Campylobacter jejuni: implications for the flagellin gene typing scheme. J Clin Microbiol 35, 2386-2392.
Jakupciak, J. P. & Wells, R. D. (1999). Genetic instabilities in (CTG.CAG) repeats occur by recombination. J Biol Chem 274, 23468-23479.
Jensen, R. A. & Gu, W. (1996). Evolutionary recruitment of biochemically specialized subdivisions of Family I within the protein superfamily of aminotransferases. J Bacteriol 178, 2161-2171.
Khawaja, R., Neote, K., Bingham, H. L., Penner, J. L. & Chan, V. L. (1992). Cloning and sequence analysis of the flagellin gene of Campylobacter jejuni TGH9011. Curr Microbiol 24, 213-221.
Kowalchuk, G. A., Gregg-Jolly, L. A. & Ornston, L. N. (1995). Nucleotide sequences transferred by gene conversion in the bacterium Acinetobacter calcoaceticus. Gene 153, 111-115.
Li, W.-H. (1993). Unbiased estimation of the rates of synonymous and nonsynonymous substitutions. J Mol Evol 36, 96-99.
Li, W.-H. (1997). Molecular Evolution. Sunderland, MA: Sinauer Associates.
Logan, S. M., Trust, T. J. & Guerry, P. (1989). Evidence for posttranslational modification and gene duplication of Campylobacter flagellin. J Bacteriol 169, 3031-3038.
Madoff, L. C., Michel, J. L., Gong, E. W., Kling, D. E. & Kasper, D. L. (1996). Group B streptococci escape host immunity by deletion of tandem repeat elements of the alpha C protein. Proc Natl Acad Sci USA 93, 4131-4136.
Mattatall, N. R., Daines, D. A., Liu, S.-L. & Sanderson, K. E. (1996). Salmonella typhi contains identical intervening sequences in all seven rrl genes. J Bacteriol 178, 5323-5326.
Meinersmann, R. J., Helsel, L. O., Fields, P. I. & Hiett, K. L. (1997). Discrimination of Campylobacter jejuni by fla gene sequencing. J Clin Microbiol 35, 2810-2814.
Nachamkin, I., Yang, X. H. & Stern, N. J. (1993). Role of Campylobacter jejuni flagella as colonization factors for three day old chicks: analysis with flagellar mutants. Appl Environ Microbiol 59, 1269-1273.
Nuijten, P. J., van Asten, F. J., Gaastra, W. & van der Zeijst, B. A. (1990). Structural and functional analysis of two Campylobacter jejuni flagellin genes. J Biol Chem 265, 17798-17804.
Okazaki, N., Matsuo, S., Saito, K., Tominaga, A. & Enomoto, M. (1993). Conversion of the Salmonella phase 1 flagellin gene fliC to the phase 2 fljB on the Escherichia coli K-12 chromosome. J Bacteriol 175, 758-766.
Reeves, P. R., Farnell, L. & Lan, R. (1994). MULTICOMP: a program for preparing sequence data for phylogenetic analysis. Comput Appl Biosci 10, 281-284.
Sawyer, S. A. (1989). Statistical tests for detecting gene conversion. Mol Biol Evol 6, 526-538.
Stewart, C. B. (1993). The powers and pitfalls of parsimony. Nature 361, 603-607.
Swofford, D. L. (1999). PAUP*. Phylogenetic analysis using parsimony (*and other methods), version 4. Sunderland, MA: Sinauer Associates.
Swofford, D. L., Olsen, G. J., Waddell, P. J. & Hillis, D. M. (1996). Phylogenetic inference. In Molecular Systematics, pp. 407-514. Edited by D. M. Hillis, C. Moritz & B. K. Mable. Sunderland, MA: Sinauer Associates.
Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673-4680.
Wang, Y. & Taylor, D. E. (1990). Natural transformation in Campylobacter species. J Bacteriol 172, 949-955.
Wassenaar, T. M., Bleumink-Pluym, N. M. C. & van der Zeijst, B. A. M. (1991). Inactivation of Campylobacter jejuni flagellin genes by homologous recombination demonstrates that flaA but not flaB is required for invasion. EMBO J 10, 2055-2061.
Wassenaar, T. M., van der Zeijst, B. A., Ayling, R. & Newell, D. G. (1993). Colonization of chicks by motility mutants of Campylobacter jejuni demonstrates the importance of flagellin A expression. J Gen Microbiol 139, 1171-1175.
West, S. C. (1992). Enzymes and molecular mechanisms of genetic recombination. Annu Rev Biochem 61, 603-640.
Yamashita, I., Vonderviszt, F., Mimori, Y., Suzuki, H., Oosawa, K. & Namba, K. (1995). Radial mass analysis of the flagellar filament of Salmonella: implications for the subunit folding. J Mol Biol 253, 547-558.
Zhang, Q. Y., DeRyckere, D., Lauer, P. & Koomey, M. (1992). Gene conversion in Neisseria gonorrhoeae: evidence for its role in pilus antigenic variation. Proc Natl Acad Sci USA 89, 5366-5370.
Received 18 February 2000;
revised 19 May 2000;
accepted 26 May 2000.