Model of Vibrio cholerae toxin coregulated pilin capable of filament formation

Rajagopal Chattopadhyaya1,3 and Asoke Chandra Ghose2

1 Department of Biochemistry and 2 Department of Microbiology, Bose Institute, Calcutta 700054, India


    Abstract
 
A complete three-dimensional model (RCSB001169; PDB code 1qqz) for the Vibrio cholerae toxin coregulated pilus protein (TcpA), including residues 1–197, is presented. To build the model we used the crystal structure of the Neisseria gonorrhoeae pilin (PilE), available biochemical data on TcpA, variations in the primary sequences of TcpA among V.cholerae strains, and secondary structure prediction, hydrophilicity, surface probability and antigenicity plots for TcpA. In our TcpA model, the first 137 residues possess a structure similar to that of PilE, but the remainder is different. Though the ladle shape is preserved, TcpA possesses a larger ladle head, or globular domain, than PilE. Using this model, it has been possible to identify two kinds of conserved residues: (i) those forming the core of the TcpA monomer and (ii) those involved in the monomer–monomer interactions leading to fibre formation. Residues on the fibre exterior, important in mediating bacterium (pilus)–bacterium (pilus) and bacterium (pilus)–host interactions, show more variability than those of (i) and (ii).

Keywords: adhesion/fibre-forming protein/pilin/TcpA/Vibrio cholerae


    Introduction
 
The Gram-negative bacterium Vibrio cholerae, the causative agent of cholera, produces cholera toxin and colonizes the small intestine of the host. Along with the toxin, a toxin coregulated pilus (TCP) is expressed by V.cholerae as part of the toxR virulence regulon (Miller et al., 1987; Taylor et al., 1987). TCP is required for colonization of the intestinal mucosa of the host, as shown in the infant mouse cholera model (Taylor et al., 1987; Thelin and Taylor, 1996) and in human volunteer studies (Herrington et al., 1988).

TCP is a member of the type IV pili, which are 1000–4000 nm long, 60–90 Å diameter fibres formed by ordered association of thousands of identical pilin subunits plus a few copies of pilus-associated proteins (Rudel et al., 1995). TcpA, the major component of TCP, is a 20 kDa protein involved in bacterial adhesion to host cell surfaces (Strom and Lory, 1993). Its other functions, as a CTXφ receptor (Waldor and Mekalanos, 1996) and as a coat protein of the bacteriophage VPIφ (Karaolis et al., 1999), make it still more important to study. Many bacteria possess a type IV pilin similar to TcpA (Strom and Lory, 1993; Giron et al., 1997), assembled by a common cellular process (Strom et al., 1994). The crystal structure of only one type IV pilin, PilE of Neisseria gonorrhoeae (Parge et al., 1995), was used to build our TcpA model; a crystal structure of a truncated Pseudomonas aeruginosa pilin (Hayes et al., 2000) was reported subsequently.

We demonstrate that using the crystal structure of the 158-residue N.gonorrhoeae pilin (PilE) it is possible to build a reliable model of the V.cholerae TcpA consistent with the available data in the literature on TcpA.


    Materials and methods
 
PilE crystal structure provides a basis for TcpA structure

We argue that the PilE crystal structure is an excellent guide for a major part of the TcpA structure since:

(1) The hydropathy plot is extremely similar between TcpA and PilE for the 30 N-terminal residues (Figure 3a).



 
Fig. 3. Hydrophilicity, surface probability and antigenicity index predicted from the primary sequence of TcpA of the classical O1 strain are plotted in (a–c), respectively, as a function of the residue number. The hydrophilicity plot for PilE is included in (a) for comparison (dotted lines). This shows immediately that, on the whole, TcpA is more hydrophobic than PilE. The secondary structures of both proteins are also plotted between (a) and (b).

 
(2) Identical residues at positions –1 (Gly), 3 (Leu), 5 (Glu), 9 (Val) and 18 (Ala) are found in all nine type IV pilins including PilE and TcpA (Strom and Lory, 1993).

(3) Single residue mutations at positions –1, 4, 9 and 20 in PilE were found to abolish autoagglutination (AA) in N.gonorrhoeae. Corresponding mutations in TcpA also demonstrated AA defects and were highly defective in colonization (Chiang et al., 1995).

(4) In addition, both PilE and TcpA have their first cysteine at position 120 or 121, though the position of the second cysteine differs considerably. This suggests that the protein fold must be extremely similar for both pilins till approximately position 120.

(5) Proteins sharing an α–β roll structure usually have related functions (Orengo and Thornton, 1993) and therefore the crystal structure (Parge et al., 1995) represents a new structural and functional class of α–β roll having wider applicability beyond N.gonorrhoeae.

(6) A sequence alignment of classical TcpA with PilE shows only ~23% sequence similarity. At this level, two proteins may or may not share the same fold. Sequence similarity alone is no longer considered a good criterion in this 'grey' area, now that pairwise sequence comparisons have been assessed and statistical methods of identifying distant evolutionary relationships evaluated (Brenner et al., 1998). To investigate such relationships, a SCOP database search was performed. The primary sequence of the classical strain of TcpA was used as the query sequence in a SCOP database search at the website http://stash.mrc-lmb.cam.ac.uk/PDB_ISL/ and this strongly indicated structural similarity with PilE, with a Z-score of 776 and an E-value of 3.3 × 10⁻⁴⁸. The search also indicated a pilin fold as in PilE and an α–β protein.

Highly conserved TcpA sequences

Pairwise sequence identities between the four TcpA sequences range from 78 to 82%, with 138 invariant positions out of 197 (70%) (Figure 1). Such a high degree of sequence identity suggests a near identical three-dimensional structure (Chothia and Lesk, 1986; Brenner et al., 1998) among the strains, with hypervariable regions predicted to be mostly on the surface, as in the case of PilE (Jonsson et al., 1994; Parge et al., 1995). The variable regions 48–63, 69–85, 91–103, 111–114, 126–143 and 153–158 of PilE in various gonococcal strains (Haas and Meyer, 1986) were examined by us in the light of the PilE crystal structure: regions 56–63 and 79–85 are partly exposed, whereas 91–103, 126–143 and 153–157 are exposed and only 111–114 is buried in the PilE fibres.
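The two statistics quoted above (pairwise percent identity and the count of invariant alignment columns) can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length with `-` for gaps; the function names and the toy fragments are hypothetical and are not the TcpA sequences of Figure 1.

```python
def pairwise_identity(a, b):
    """Percent identity between two aligned sequences; gap columns are skipped."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def invariant_positions(aligned):
    """1-based alignment columns where every sequence carries the same residue."""
    return [i + 1 for i, col in enumerate(zip(*aligned))
            if "-" not in col and len(set(col)) == 1]


# Toy aligned fragments (hypothetical, for illustration only).
toy = ["MTLLEVIIVL", "MTLLEVIIVL", "MSLLEVLIVL", "MTLIEVIIAL"]

identity_0_2 = pairwise_identity(toy[0], toy[2])   # 8 of 10 columns match
invariant = invariant_positions(toy)               # columns identical in all four
```

Applied to the real alignment, the same counting gives the fraction quoted in the text: 138 invariant positions out of 197 is about 70%.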



 
Fig. 1. Structural alignment of a PilE primary sequence with four primary sequences of TcpA associated with classical O1 (Taylor et al., 1987), El Tor O1 (Rhine and Taylor, 1994), non-O1 non-O139 (Nandi et al., 2000) and another non-O1 non-O139 strain (Novais et al., 1999), shown up to residue 137. After this residue, the two structures do not share corresponding secondary structural elements. Dots in the last three TcpA sequences indicate identity with the classical O1 strain. Residues contributing side chains to any one of the hydrophobic cores in our monomer model are marked with the symbol 'c'. The 72 core residues are mostly hydrophobic in TcpA. Residues with side chains exposed in the fibre are marked with the symbol 'e'. The next line indicates the secondary structure of TcpA in our final model and the secondary structure of PilE is given on the last line for comparison, showing that up to β6 there exist corresponding secondary structural elements in both pilin structures.

 
Secondary structure prediction for TcpA and PilE by the GCG package

We compared the secondary structure predictions by both the Chou–Fasman and Garnier–Osguthorpe–Robson methods (Genetics Computer Group, 1995) for PilE with its secondary structure in the crystal (Parge et al., 1995). By the Chou–Fasman method, only 20 of the first 54 residues comprising α1 are predicted to be α-helical, regions α2 and α3 in the crystal structure are not predicted to be helical and, among the six β-strands, only β2, β4 and β5 (partly) are predicted. The GCG secondary structure prediction for PilE was thus of poor accuracy. It can therefore be expected to be poor for TcpA as well, since TcpA is structurally and functionally homologous to PilE.

Residuewise raw Chou–Fasman probabilities used to build model except for α1

Though secondary structure predictions performed within the GCG package were judged to be unreliable, residuewise raw Pα, Pβ and Pt values obtained within the InsightII package (Biosym/MSI, 1995) were used to build our model. The region 1–54 showed higher Pβ values than Pα, particularly for regions 1–21 and 42–45. Though region 60–70 is variable among the four TcpA sequences (Figure 1), it showed higher Pα values than Pβ for all four sequences, thus indicating α2 in this region. Figure 2 plots Pβ values of PilE and TcpA. In all four sequences, region 70–77 (β1 and preceding loop) showed higher Pβ values than Pα. Similarly, region 87–98 (β2) showed higher Pβ values than Pα. Residues 104–110 (β3) showed almost equal values of Pα and Pβ, but since this region is so hydrophobic it is likely to be a β-strand, as also suggested by comparison with PilE. All four TcpA sequences indicated higher Pβ values than Pα for the regions 117–121 (β4), 122–127 (β5) and 130–136 (β6). Higher Pα values than Pβ were found for all four sequences for regions 138–145 (α4) and 152–158 (α5). Higher Pβ values than Pα were found for regions 159–165 (β7), 175–179 (β8) and 192–196 (β9). Thus, except for α1 and some ambiguity about 104–110, the residuewise plots of Chou–Fasman probabilities were used for the secondary structure assignment in our model.



 
Fig. 2. Chou–Fasman β-strand probabilities for PilE and TcpA using the InsightII package displayed for the 140 N-terminal residues. Although probabilities for both proteins predict a β-structure in portions of α1, we favoured a long helix in TcpA following the crystal structure. However, peaks in region 70–140 for TcpA were used to predict the start and end points of the first six β-strands, e.g. 77–80, 88–94, 103–111, 117–121, 123–127, 130–136. The corresponding peaks for the remaining strands β7, β8 and β9 are not shown here as they are all beyond residue 140.

 
PilE crystal structure used for building TcpA model for residues 1–137

The residuewise Chou–Fasman probabilities of TcpA and PilE indicated that, up to residue 137, TcpA possesses secondary structural elements corresponding to those in the PilE crystal structure. Since TcpA is expected to be stabilized by a hydrophobic core similar to that of PilE, secondary structural elements were placed so as to conserve the hydrophobic cores. Differences between our TcpA model and the PilE crystal structure within the first 137 residues can be noted from Figure 4a and the description of the model below.




 
Fig. 4. (a) α-Carbon superposition in stereo of our TcpA model (green) with the crystal structure of PilE (red), which has been used to build our model up to residue 137. Every tenth residue of TcpA is labelled. (b) Six TcpA molecules representing the fibre formation are shown as an α-carbon stereo view, in a formation analogous to that postulated for PilE by computational methods (figure 5e of Parge et al., 1995). Each molecule is related to its immediate neighbour by a 72° rotation about the fibre axis (5-fold symmetry), which is vertical in this figure, and an 8.2 Å translation parallel to the same axis. The magenta coloured molecule has the same orientation as the red one, but is 41 Å (the pitch) above the latter.
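The helical relation stated in the caption (72° rotation and 8.2 Å rise per subunit, hence a 41 Å pitch after five subunits) can be checked with a short Python sketch. The function name, the choice of the z axis as the fibre axis, the rotation sense and the reference coordinate are assumptions made for illustration; they are not taken from the model file.

```python
import math

ROTATION_DEG = 72.0  # rotation per subunit about the fibre axis (from the caption)
RISE = 8.2           # translation per subunit along the axis, in Å (from the caption)


def place_subunit(point, k):
    """Map a point of subunit 0 onto subunit k of the helix.

    The fibre axis is taken as z and the rotation as counter-clockwise
    when viewed down the axis; both are assumptions of this sketch.
    """
    x, y, z = point
    theta = math.radians(ROTATION_DEG * k)
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z + RISE * k)


# Hypothetical atom 30 Å from the axis in subunit 0.
reference_atom = (30.0, 0.0, 0.0)
copies = [place_subunit(reference_atom, k) for k in range(6)]

# Subunit 5 has completed 5 x 72 = 360 degrees, so it sits directly
# above subunit 0, displaced by 5 x 8.2 = 41 Å (the pitch).
pitch = copies[5][2] - copies[0][2]
```

Six copies correspond to the six molecules of Figure 4b: the sixth has the same orientation as the first and lies one pitch above it.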

 
Hydrophilicity plots of PilE and TcpA compared

Figure 3a gives Kyte–Doolittle hydrophilicity plots of both proteins obtained using the GCG package (Genetics Computer Group, 1995). The plot is extremely similar for both pilins for regions 1–30, 50–70 and 90–104. However, TcpA appears to be more hydrophobic than PilE (Figure 3a), particularly in regions 41–48 (the later part of α1), 70–80 (β1 and its preceding loop), 105–115 (β4) and 123–137 (β5, β6).
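For readers unfamiliar with this kind of plot, a sliding-window Kyte–Doolittle profile can be sketched in Python as follows. The scale values are the published Kyte–Doolittle hydropathy indices, where positive means hydrophobic, so a hydrophilicity plot is the same curve with the sign reversed. The window length of 7 and the function name are assumptions of this sketch, not the settings used in the GCG package.

```python
# Kyte-Doolittle hydropathy indices (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy in a sliding window centred on each residue.

    Returns one value per residue that has a full window around it,
    so the result has len(seq) - window + 1 entries.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    half = window // 2
    profile = []
    for i in range(half, len(seq) - half):
        segment = seq[i - half:i + half + 1]
        profile.append(sum(KD[aa] for aa in segment) / window)
    return profile


def hydrophilicity_profile(seq, window=7):
    """Same profile with the sign flipped, so peaks mark hydrophilic stretches."""
    return [-v for v in hydropathy_profile(seq, window)]
```

Comparing two proteins then amounts to overlaying their profiles residue by residue, as done for TcpA and PilE in Figure 3a.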

How region 138–169 was built

After residue 137, PilE does not have any α-helix (Parge et al., 1995). But for TcpA, regions 138–145 and 153–158 show strong helical preference (Pα ≥ 1.2) in all four sequences. Similarly, all four sequences show a strong preference for β-strand in region 160–167 (Pβ ≥ 1.15). Two additional constraints were used for this region: (i) mutating Glu158 to Leu produces a kinked pilus (Sun et al., 1991), hence Glu158 must be in the interior of the fibre, interacting with α1, and (ii) β7 (163–167) is one of the most hydrophobic regions (Figure 3a) and should remain buried in the fibre; this was achieved by forming an antiparallel sheet of β7 with β4 (Figure 4a).
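The thresholding step described above (picking out stretches where a residuewise propensity stays at or above a cutoff such as Pα ≥ 1.2) can be sketched in Python. This is only an illustration of the idea: the function name, the minimum run length and the example values are hypothetical, and the sketch does not reproduce how InsightII computes or smooths its raw values.

```python
def regions_above(values, threshold, min_len=4, first_residue=1):
    """Return (start, end) residue ranges where values stay >= threshold.

    values[0] belongs to residue `first_residue`; runs shorter than
    `min_len` residues are discarded.
    """
    regions = []
    start = None
    for i, v in enumerate(values):
        if v >= threshold and start is None:
            start = i                      # a run begins
        elif v < threshold and start is not None:
            if i - start >= min_len:       # a run ends; keep it if long enough
                regions.append((start + first_residue, i - 1 + first_residue))
            start = None
    if start is not None and len(values) - start >= min_len:
        regions.append((start + first_residue, len(values) - 1 + first_residue))
    return regions


# Hypothetical residuewise helix propensities for residues 137-145.
example = [1.0, 1.25, 1.3, 1.21, 1.4, 0.9, 1.3, 1.3, 0.8]
helical = regions_above(example, threshold=1.2, min_len=4, first_residue=137)
```

With the made-up values above, only residues 138–141 form a run of at least four residues at or above 1.2; the two-residue run at 143–144 is dropped.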

Placement of the two last ß-strands

All four TcpA sequences indicated β8 in the region 176–180 (Pβ ≥ 1.2) and β9 in the region 192–196 (Pβ ≥ 1.15). From residue 169 onwards, we predict that the protein chain must travel radially outwards in the fibre since both the surface probability (Figure 3b) and antigenicity index (Figure 3c) show large peaks beginning at residue 170. Here the protein chain must not cover the loop containing Glu83, thought to be exposed (Sun et al., 1991). Hence β8 can form a parallel β-sheet with the hitherto exposed face of β6. Next, the Cys186-containing loop runs antiparallel to β4, which contains Cys120. The last β-strand, β9, is one of the most hydrophobic regions of the protein (Figure 3a) and was therefore buried in the hydrophobic pocket between β4 and β7 within the subunit.

Refinement

The whole protein structure was energy refined in stages using the Discover program within InsightII.

Surface probability plot

In our model, α1 and α2 are exposed on the surface (Figure 4a), thus agreeing with peaks in surface probability for residues 22–70 (Figure 3b). The highest peak at 82–85 agrees with our model as this is an exposed loop containing Glu83 (Figures 3b and 4a). Other peaks at 96–103, 114–116, 129–131 and 148–152 agree with our model as they are part of loops (Figures 3b and 4a). There is a broad peak at 169–175 agreeing with a long exposed loop (Figures 3b and 4a). Region 138–145 is hydrophobic (Figure 3a) and has low surface probability by Emini's method (Figure 3b), but this region is exposed in our model. This is justified by the hypervariability noticed in 135–144 (Figure 1), by analogy with PilE.

Antigenicity index plot

The Jameson–Wolf antigenicity index plot shows broad peaks in regions 27–41*, 49–61*, 81–88, 98–104, 113–120*, 148–152, 170–175 and 180–190, starred regions being buried in the fibre. The remaining regions correspond to the loops β1–β2, β2–β3, α4–α5, β7–β8 and β8–β9, respectively, and are expected to be antigenic in the fibre. This agrees with published data (Sun et al., 1991, 1997).


    Results and discussion
 
TcpA structure

Our TcpA molecule has an elongated (85 × 42 × 26 Å), ladle-shaped pilin structure, similar to PilE (85 × 34 × 26 Å), the size of the ladle head or α–β roll globular domain being larger in TcpA (Figure 4a). The TcpA model is composed of: (1) a long α-helix α1 (residues 2–54; TcpA secondary structural elements denoted by a subscript number, whereas those of PilE do not have their numbers as subscripts) exactly as in PilE; (2) a second helical region α2 (residues 59–68); (3) two β-hairpins forming a four-stranded antiparallel β-sheet (residues 76–95, 104–121), all with nearest neighbour (+1) β-strand connections; (4) the β2–β3 loop connection including a single turn α3 (residues 96–103); (5) Cys120, which emerges out of β4 but points in the radially outward direction of the fibre, coinciding with Phe120 of PilE; (6) strands β5 (residues 123–127) and β6 (residues 130–136), disposed as in PilE; (7) two α-helices, α4 (residues 138–145) and α5 (residues 152–158), separated by a loop (residues 146–151); (8) β7 (residues 163–168) forming two hydrogen bonds with β4; (9) a long loop region (residues 169–174) travelling radially outwards in the fibre; (10) a short β-strand β8 (residues 175–179) forming hydrogen bonds with β6; (11) a loop region (residues 180–191) including Cys186, which points radially inwards in the fibre towards Cys120 and is bridged to it; and (12) a final β9 (residues 192–196) sandwiched between β4 and β7.

Several hydrophobic cores stabilize the TcpA structure

Within the ladle head, a hydrophobic core exists between the latter part of the long helix α1 (residues 30–54) and the four-stranded antiparallel sheet. Residues contributing side chains to any of the four hydrophobic cores are marked by the symbol 'c' in Figure 1. Though there is some variation in this region in α1 or the four-stranded sheet, viable mutations always preserve the nature and size of the amino acid (Figure 1).

A second hydrophobic core exists between the sheets (β2, β3) and (β5, β6). A third hydrophobic core is formed between β6, β8 and the loop containing Cys186. A fourth hydrophobic core is formed between β3, β4, β7 and β9. Side chains participating in these cores are similarly conserved (Figure 1).

Consistency of structural database with the disulfide

In PilE, Cys121 emanates from a β-strand and Cys151 from a loop antiparallel to it (Parge et al., 1995). This property has been maintained in our TcpA model. Well over half of the backbone strands are in extended conformation at the ends of disulfides (Richardson, 1981). Though a small fraction of backbone conformations at the end of disulfides are part of a β-sheet, it is extremely rare to have a disulfide bridging two β-strands. Since Cys120 in TcpA is part of β4 by analogy with the PilE structure, this demands that Cys186 be part of a loop, as we have in our model. Moreover, the Pβ values near residue 186 were low in the Chou–Fasman prediction.

Protective epitopes in TcpA agree with model

From a collection of monoclonal antibodies that recognize the native structure of TcpA, Mab16.1 and Mab169.1 were used to localize the corresponding epitopes (Sun et al., 1991). Mab169.1 recognized peptide 157–182 (rather, 168–182, a loop + β8, since 157–167 is buried in the fibre), but not peptides 144–168 or 174–199. Mab16.1 recognized 174–199 (β8 + a loop including Cys186) but not 157–182 or 144–168. Mab169.1 recognizes the folded structure of TcpA as it does not interact with the reduced form (Sun et al., 1991). It was found to be highly protective when used for immunization in the infant mouse model and Mab16.1 was ~50% protective. In the pilin monomer as well as in the fibre generated from our model, both regions recognized by these two monoclonal antibodies are on the surface.

Six peptides corresponding to regions 1–26, 24–43, 77–90, 145–168, 157–182 and 174–199 were used to immunize rabbits (Sun et al., 1997). The last four peptides conferred significant protection against V.cholerae. In our model, all four of these regions are either partly or fully exposed in the fibre, hence antigenic. The last two peptides are recognized as the epitopes of Mab169.1 and Mab16.1, respectively. Region 77–90 includes β1 (one side of which is exposed) and the β1–β2 loop containing Glu83. Region 145–168 includes the loop 146–151, which in our model is on the fibre surface.

Agreement of model with V8 protease sites

V8 protease cuts TcpA after E83 only, though there are potential proteolytic sites at E151 and E158; these two sites are not accessible to the enzyme (Sun et al., 1991). E158 interacts with α1 and is partly buried in the subunit, hence inaccessible. E151 is partly buried by 190–192 in the subunit and also forms a hydrogen bond with the adjacent N152 side chain. Only E83 is in an exposed loop and therefore V8 cuts here. Thus, the model explains why only one of the three potential sites can be cleaved by V8 protease.

Consistency with mutagenesis data

Our model is consistent with the site-directed mutagenesis data of Sun et al. (1997) and Kirn et al. (1997, 2000) as explained below. These studies describe the morphology, AA, colonization and CTX phage sensitivity in TcpA due to 18 different single amino acid substitutions.

Abnormal fold mutants. The C120S and C186S mutants do not form pilin: lacking one of its two cysteines, TcpA cannot form the disulfide bridge and does not fold properly.

No fibre mutant. The K165A mutant is remarkable in producing no fibre at all, indicating that K165 is an interior residue in the fibre. In our model, the K165 side chain interacts with main chain carbonyls of α2 of a neighbouring molecule, thus aiding fibre formation. Alanine at 165 cannot retain that favourable interaction stabilizing the fibre.

AA defective mutants. D129A, D146A, D149A, H181A and E183A mutants are AA defective. All five side chains are exposed in our fibre model, as expected since AA depends on inter-filament interactions.

Normal AA mutants. K137A, E158A, K172A, K184A and K187A are normal regarding AA. The surface charge of K137 is evidently not essential for AA since El Tor has Q137, thus lacking a positively charged residue here (Figure 1). E158A is also a natural mutation with normal AA; E158 is buried in the fibre. K172 is on a surface loop but the K172 Nζ atom forms a hydrogen bond with D113 O in our model, so its side chain does not really affect AA. K184A is normal in AA since the K184 side chain is buried in the subunit, with a hydrogen bond between the K184 Nζ and T125 Oγ atoms in our model. The K187 side chain is in a pocket near the disulfide bridge and is partly buried in the subunit, so changing it does not alter the surface properties appreciably.

Improved AA mutants. P169A and D175A mutations improved AA, hence these residues must either be on the fibre surface or affect the surface residues in some way. P169 is in the interior of the fibre but the loop following this residue can be affected by this mutation, whereas D175 is on the surface of the fibre.

Mutants affecting fibre formation through α1. The K121A mutant forms long bundles; fibre formation is obviously affected by this mutation, suggesting K121 is in the interior of the fibre. In our model, the K121 side chain points radially inwards and interacts with N41 Oδ1 in α1. A smaller side chain at 121 should lead to a more compact ladle head in TcpA, affecting fibre formation. Though E158A is a normal mutation found in V.cholerae strains normal in AA, E158L gives a kinked, unraveled pilus, further confirming that E158 must be in the interior of the fibre. The E158sup-1 mutant represents a spontaneous revertant (Kirn et al., 1997) whose location was within α1. Examining α1, we concluded that only R26 can provide a positively charged side chain making a salt bridge with the E158 side chain. In our model, the E158 and R26 side chains interact. The E158L mutation displaces α1 slightly, modifying fibre formation.

Comparison with P.aeruginosa pilin

After the TcpA model was built by us using the PilE structure, a crystal structure of P.aeruginosa type IVA (PAK) pilin with a partially truncated N-terminal helix (residues 29–144) was reported (Hayes et al., 2000). The presence of the α–β roll structure in PAK pilin is consistent with our prediction that this class of pilins should share this fold. Though the PAK pilin structure is smaller than PilE due to the truncation and smaller number of residues, it affirmed some of our predictions regarding TcpA: (i) there is variability among the pilins in the secondary structural elements immediately following the long N-terminal helix before the start of α1 (Figure 1); (ii) the structure of the β2–β3 loop is more or less conserved among type IV pilins (Figure 4a) [figure 2 of Hayes et al. (2000)]; (iii) the first Cys residue emanates from β4 and the spatial position of the disulfide is conserved (Figure 4a); (iv) there is variation among pilins in the number of secondary structure elements near the C-terminus, including the second Cys (Figure 1); and (v) the size of the ladle head or globular domain increases with the total number of residues in the pilin (Figure 4a) [figure 2 of Hayes et al. (2000)].

Fibre formation by TcpA

Parge et al. (1995) generated over 115 000 models by systematically building repetitive right- and left-handed helical assemblies of pilin monomers and dimers and concluded that the PilE fibre has five subunits per turn of helix, a 40–41 Å pitch and a 70 Å outer diameter from electron micrographs. We argue that, since several residues are invariant in α1 among pili from various organisms, the manner of association of TcpA to form its fibres would be similar to that indicated for PilE, with a salt bridge from the positively charged N-terminus of subunit n + 1 to the conserved negatively charged side chain of Glu5 of subunit n. For this interaction to be preserved in TcpA, we further argue that the pitch of the TcpA fibre must be equal to that of PilE and that the extra residues in TcpA must be accommodated by an increase in the diameter of the filament, originating from a larger globular domain. Assuming the TcpA fibre has the same density as that of PilE, the average diameter of the TcpA filament is calculated to be 81 Å including side chain atoms (for 197 residues as opposed to 158), matching the actual average diameter in the fibre generated from our model. It is within the normal range (Giron et al., 1997) for these pilins.
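The scaling argument can be made concrete with a rough Python sketch. It assumes a solid cylindrical fibre, so that with equal pitch, equal subunits per turn and equal density the cross-sectional area grows in proportion to the number of residues per subunit and the diameter grows as its square root. The function name is hypothetical, and the reference diameter used here is the 70 Å outer diameter quoted for PilE, which may not be the value that entered the calculation reported in the text.

```python
import math


def scaled_diameter(ref_diameter, ref_residues, new_residues):
    """Diameter of a solid cylinder holding more residues per subunit.

    With pitch and density held fixed, volume per turn (and hence the
    cross-sectional area) is proportional to residues per subunit, so
    the diameter scales with the square root of the residue ratio.
    """
    return ref_diameter * math.sqrt(new_residues / ref_residues)


# PilE: 158 residues, 70 Å outer diameter; TcpA: 197 residues.
estimate = scaled_diameter(70.0, 158, 197)
```

Under these simplifications the estimate comes out close to 78 Å, slightly below the 81 Å average diameter (including side chain atoms) reported above; the difference presumably reflects the reference diameter and the treatment of side chains in the original calculation, which the sketch does not attempt to reproduce.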

The association of several subunits is shown in Figure 4b. The regions including α2 and the following loop (65–74) and β7 are involved in lateral contacts between adjacent monomers (subunits n and n + 1) in the fibre, thus explaining the result obtained for the K165A mutation (Kirn et al., 1997, 2000). The K165 Nζ atom hydrogen bonds to G72 O of an adjacent subunit in our model. In addition, several hydrophobic side-chain–side-chain interactions are observed in the inter-subunit interface: P169 with L69', P191 with G72', F192 with F96' and V194 with V74'. Positions 69, 72, 74, 165, 191, 192 and 194 are totally conserved in the four sequences (Figure 1). Position 96 allows either F or W, similar side chains, and position 169 shows a T in one of the sequences instead of P (Figure 1).

In another type of interaction, parallel to the fibre axis, subunit n interacts with subunit n + 5, which has an identical orientation but differs in position by the pitch, 41 Å. Residues involved from subunit n include: Y51, R52, G53, L54, G55 (five residues at the end of α1), L115, T116, Q117 (in the β3–β4 loop), P169 and A170 (in the β7–β8 loop). Residues involved from subunit n + 5 include: A18, V21, T22 (three in α1), I95, R100 (in the β2–β3 loop), D146 and L147 (in the α4–α5 loop). Positions 18, 21, 22, 51, 52, 54, 55, 95, 100, 115, 116, 117, 146 and 147 are totally invariant in the four strains (Figure 1). Positions 53 (G, S or A), 169 (P or T) and 170 (S or G) show some variation but the nature of the amino acid is still conserved by these mutations.

Thus, the reason for the ≥78% sequence identity observed between the TcpA sequences from the various strains is that, in addition to the core residues, the surface residues important in fibre formation must also be conserved. Invariance among core residues accounts for only about 25% (48/197) of the sequence identity.

Variation of molecular surfaces in various strains

Most of the hypervariable regions in TcpA are exposed in the fibre: 62–68, 92–96, 135–144, 170–184; only 152–158 is not fully exposed. Residues from the hypervariable regions have side chains on the fibre surface. The antigenic surfaces vary considerably among the TcpA strains, thus escaping immune detection. Further, TcpA is known to act as a receptor of CTXφ, a filamentous phage that is known to infect V.cholerae (Waldor and Mekalanos, 1996). A recent study (Boyd et al., 2000) has shown diversity in CTX core region sequences, particularly in OrfU, a phage coat protein likely to mediate its attachment to the host bacterium. Thus, the variability in TcpA sequences is expected to play a crucial role in the specific recognition of the phage by the bacterium through TcpA–OrfU interaction. However, the limited nature (up to the 22% level) of the variation noted amongst TcpA sequences of various V.cholerae strains suggests that further changes in these sequences may not be advantageous to the organism, which needs to preserve the structure of TcpA and its fibre-forming contacts to maintain its function.

Predictions from model

It is evident from the model that, out of 72 core amino acids (marked by 'c' in Figure 1), consisting primarily of hydrophobic residues, 48 are invariant. Further, alterations in the remaining 24 positions are, more or less, restricted to the category of 'homologous' changes. Therefore, residues involved in the formation of the core region appear to be quite critical for the monomeric structure of the protein, which is likely to be disrupted by certain mutations of these residues. Although similar changes within the surface exposed residues 1–31, or those involved in the two types of subunit–subunit interactions (e.g. n with n + 1 and n with n + 5), may not disrupt the monomer structure itself, they might affect the pilus formation capacity of the monomers. Side chains exposed in the pilus fibre (marked by 'e' in Figure 1) are primarily located in the hypervariable regions mentioned and are likely to account for the antigenic variations among the strains (Jonson et al., 1991; Rhine and Taylor, 1994; Novais et al., 1999; Nandi et al., 2000). It can also be predicted that mutations in these positions would affect properties of the organism that have been shown to be TCP mediated, e.g. the ability to autoagglutinate through bacterium–bacterium interaction, the ability to interact with CTXφ and/or the ability to colonize through bacterium–host cell receptor (putative) interaction.

The model has been constructed on the basis of PilE as far as correspondence between secondary structural elements was observed. The remaining portion of TcpA is constrained to assume our proposed structure by several biochemical data in the literature. Further, the existence of the disulfide bridge constrains the possible structures.


    Notes
 
3 To whom correspondence should be addressed. E-mail: raja@boseinst.ernet.in


    Acknowledgments
 
The authors thank the Department of Biotechnology, Government of India for supporting a molecular modelling facility at our Institute. The work was partly supported by the Council of Scientific and Industrial Research, Government of India.


    References
 
Biosym/MSI (1995) Insight II User Guide. Biosym/MSI, San Diego.

Boyd,E.F., Heilpern,A.J. and Waldor,M.K. (2000) J. Bacteriol., 182, 5530–5538.

Brenner,S.E., Chothia,C. and Hubbard,T.J.P. (1998) Proc. Natl Acad. Sci. USA, 95, 6073–6078.

Chiang,S.L., Taylor,R.K., Koomey,M. and Mekalanos,J.J. (1995) Mol. Microbiol., 17, 1133–1142.

Chothia,C. and Lesk,A.M. (1986) EMBO J., 5, 823–826.

Genetics Computer Group (1995) Wisconsin Sequence Analysis Package Program Manual, section 11–45. Genetics Computer Group, Madison, WI.

Giron,J.A., Gomez-Duarte,O.G., Jarvis,K.G. and Kaper,J.B. (1997) Gene, 192, 39–43.

Haas,R. and Meyer,T.F. (1986) Cell, 44, 107–115.

Hayes,B., Sastry,P.A., Hayakama,K., Read,R.J. and Irvin,R.T. (2000) J. Mol. Biol., 299, 1005–1017.

Herrington,D.A., Hall,R.H., Losonsky,G., Mekalanos,J.J., Taylor,R.K. and Levine,M.M. (1988) J. Exp. Med., 168, 1487–1492.

Jonson,G., Holmgren,J. and Svennerholm,A.-M. (1991) Microb. Pathogen., 11, 179–188.

Jonsson,A., Liver,D., Falk,P. and Normark,S. (1994) Mol. Microbiol., 13, 403–416.

Karaolis,D.K.R., Somara,S., Maneval,D.R., Johnson,J.A. and Kaper,J.B. (1999) Nature, 399, 375–379.

Kirn,T.J., Sandoe,C.M.P., Chiang,S.L., Mekalanos,J.J. and Taylor,R.K. (1997) Abstracts, Thirty-third Joint Conference on Cholera and Related Diarrheal Diseases, December 3–5, 1997. The U.S.–Japan Cooperative Medical Science Program, Clearwater Beach, FL, pp. 30–32.

Kirn,T.J., Lafferty,M.J., Sandoe,C.M.P. and Taylor,R.K. (2000) Mol. Microbiol., 35, 896–910.

Miller,V.L., Taylor,R.K. and Mekalanos,J.J. (1987) Cell, 48, 271–279.

Nandi,B., Nandy,R.K., Vicente,A.C.P. and Ghose,A.C. (2000) Infect. Immun., 68, 948–952.

Novais,R.C., Coelho,A., Salles,C.A. and Vicente,A.C.P. (1999) FEMS Microbiol. Lett., 171, 49–55.

Orengo,C. and Thornton,J.M. (1993) Structure, 1, 105–120.

Parge,H.E., Forest,K.T., Hickey,M.J., Christensen,D.A., Getzoff,E.D. and Tainer,J.A. (1995) Nature, 378, 32–38.

Rhine,J.A. and Taylor,R.K. (1994) Mol. Microbiol., 13, 1013–1020.

Richardson,J. (1981) Adv. Protein Chem., 34, 167–339.

Rudel,T., Scheuerpflug,I. and Meyer,T.F. (1995) Nature, 373, 357–359.

Strom,M.S. and Lory,S. (1993) Annu. Rev. Microbiol., 47, 565–596.

Strom,M.S., Nunn,D.N. and Lory,S. (1994) Methods Enzymol., 235, 527–540.

Sun,D., Seyer,J.M., Kovari,I., Sumrada,R.A. and Taylor,R.K. (1991) Infect. Immun., 59, 114–118.

Sun,D., Lafferty,M.J., Peek,J.A. and Taylor,R.K. (1997) Gene, 192, 79–85.

Taylor,R.K., Miller,V.L., Furlong,D.B. and Mekalanos,J.J. (1987) Proc. Natl Acad. Sci. USA, 84, 2833–2837.

Thelin,K.H. and Taylor,R.K. (1996) Infect. Immun., 64, 2853–2856.

Waldor,M.K. and Mekalanos,J.J. (1996) Science, 272, 1910–1914.

Received June 12, 2001; revised December 14, 2001; accepted January 4, 2002.