Centre for Functional and Applied Genomics, Institute for Molecular Bioscience¹ and Department of Biochemistry², University of Queensland, Brisbane, QLD 4072, Australia
Author for correspondence: John S. Mattick. Tel: +61 7 3365 4446. Fax: +61 7 3365 4388. e-mail: j.mattick{at}imb.uq.edu.au
ABSTRACT
Keywords: proteome, gene families, regulators, adhesins, type II secretion
Abbreviations: FHA, filamentous haemagglutinin; MCP, methyl-accepting chemotaxis protein
INTRODUCTION
METHODS
A total of 6405 ORFs were surveyed manually to identify errors such as large overlaps, the majority of which were found to be opposite-strand ORF shadows on the basis of weak BLAST matches to GenPept and/or the P. putida genome. Of these, 1162 apparently spurious ORFs were deleted. The remaining 5243 ORFs were loaded into a public web-browsable genome database. BLAST outputs were manually analysed and an annotation was added to the database for each ORF. The BLAST matches were also used to automatically name known P. aeruginosa genes and to record the closest named protein from other organisms. Additional data were parsed from the BLAST output using a Perl script to produce a descriptive line, for example 'n% identical to [GenBank accession no.] gene product [organism] in a 341 aa overlap'. Domain analysis of our ORF set was performed using SMART (http://smart.embl-heidelberg.de/) (Schultz et al., 2000) and the Pfam database (http://pfam.wustl.edu/) (Bateman et al., 2000) using HMMER (http://hmmer.wustl.edu/). Domain data were added to the SQL database and integrated into the genome browser in both tabular and graphical form.
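The descriptive line produced from the BLAST output could be assembled along the following lines. This is a minimal Python sketch, not the Perl script used for the database: the `BlastHit` record, its field names and the function `comment_line` are hypothetical, and the square brackets in the quoted example are read here as placeholders rather than literal characters.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """Hypothetical container for fields parsed from one BLAST hit."""
    identity_pct: int  # percentage identity over the aligned region
    accession: str     # GenBank accession number of the matched sequence
    organism: str      # source organism of the matched sequence
    overlap_aa: int    # length of the aligned region in amino acids


def comment_line(hit: BlastHit) -> str:
    """Build a descriptive line in the style quoted in the text."""
    return (
        f"{hit.identity_pct}% identical to {hit.accession} gene product "
        f"{hit.organism} in a {hit.overlap_aa} aa overlap"
    )
```

For instance, `comment_line(BlastHit(57, "X00000", "Example organism", 341))` uses invented values purely to show the shape of the output string.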
RESULTS AND DISCUSSION
gene name, restricted to known P. aeruginosa genes;
name of homologue, found in closest named protein match field (see below);
gene ID number, a unique identifier exclusive to this database;
keyword, found in manual annotation;
position in genome, according to nucleotide number, minutes or a clickable genome map;
Pfam domains, from a selection of 665 domains identified by HMMER analysis;
sequence similarity, using BLASTX, BLASTN and BLASTP.
When a search is performed the results are listed on a separate page with links for each ORF to the gene viewer. The gene viewer (Fig. 1) contains the following information:
gene ID number;
gene description, a manually annotated field;
gene position, indicated by ORF start and ORF stop fields (note these do not indicate transcriptional orientation, merely the region of the genome spanned by the ORF);
peptide length;
closest named protein match, automatically generated from BLASTP output corresponding to the closest homologue in which the protein name has been defined in GenBank;
comment line, automatically generated from BLASTP output corresponding to the closest homologue;
Medline links to known P. aeruginosa genes and to the closest named protein match;
SMART domains and sequence features, such as coiled coils, signal peptides and transmembrane regions, in tabular form with links to additional information, alignments and structures;
Pfam domains in graphical form and in tabular form with links to domain information and alignments;
genomic context, a graphical view of surrounding ORFs in a 20 kb scrollable window including transcriptional orientation and reading frame;
nucleotide and peptide sequence.
The genome database contains information for 5243 ORFs, which we estimate represent more than 95% of the P. aeruginosa proteome. A total of 1291 ORFs had no or low similarity (defined here as expectation value E > 10⁻¹⁰) to all other sequences in the current databases. While the formal annotation and publication of the P. aeruginosa genome is the prerogative of those who have done the sequencing and assembly, the genome database described herein provides a useful interactive tool for gene discovery and analysis of the publicly available genome sequence. Searching of the database enables the identification and retrieval of information on various categories of proteins, their genomic position and surrounding genomic landscape, with graphical outputs of domain homologies and links to other databases, including Medline.
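The "no or low similarity" criterion given above can be written as a simple threshold test. The sketch below is illustrative only; the function name and the use of `None` to represent an ORF with no BLAST hit are assumptions, not part of the database.

```python
LOW_SIMILARITY_E = 1e-10  # threshold stated in the text


def has_no_or_low_similarity(best_evalue):
    """Return True when an ORF has no hit, or only hits with E > 1e-10.

    best_evalue is the smallest expectation value among the ORF's BLAST
    hits, or None when no hit was reported (an assumed encoding).
    """
    return best_evalue is None or best_evalue > LOW_SIMILARITY_E
```

Under this reading, an ORF whose best hit has E = 10⁻⁵ is counted as low similarity, whereas one with E = 10⁻³⁰ is not.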
This database can be used to analyse genes and gene clusters interactively and in more detail. We have identified a number of interesting loci which illustrate the usefulness of the database for probing P. aeruginosa molecular biology. Analysis of the database reveals that the P. aeruginosa genome contains, for example, 84 ORFs encoding proteins with a CheY-like domain (response regulator receiver domain in Pfam) and 55 ORFs encoding proteins with a histidine kinase domain (E < 10⁻⁷). Seventeen proteins were found to contain both domains, indicating that they could function as dual sensor-regulators, and two of these contained an additional C-terminal CheY-like domain. There are three orphan histidine kinases and 19 proteins containing receiver domains that are not associated with a cognate histidine kinase. Numerous other proteins with Pfam domains representing other well-characterized regulatory domains were identified, including 117 LysR-type and 58 AraC-type helix-turn-helix domains (E < 10⁻⁷).
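Counts of this kind amount to set operations over domain assignments. The following is a minimal sketch, assuming domain hits are available as `(orf_id, domain_label, evalue)` tuples; the labels `"receiver"` and `"histidine_kinase"` and both function names are hypothetical stand-ins, not actual Pfam identifiers or database fields.

```python
from collections import defaultdict

DOMAIN_E_CUTOFF = 1e-7  # cutoff quoted in the text


def domain_sets(hits, cutoff=DOMAIN_E_CUTOFF):
    """Map each domain label to the set of ORF IDs carrying it.

    hits is an iterable of (orf_id, domain_label, evalue) tuples; only
    hits with an expectation value below the cutoff are kept.
    """
    by_domain = defaultdict(set)
    for orf_id, domain, evalue in hits:
        if evalue < cutoff:
            by_domain[domain].add(orf_id)
    return by_domain


def hybrid_orfs(hits, receiver="receiver", kinase="histidine_kinase"):
    """Return ORF IDs carrying both a receiver and a kinase domain."""
    by_domain = domain_sets(hits)
    return by_domain[receiver] & by_domain[kinase]
```

The same dictionary of sets can be reused for other tallies, for example ORFs with a receiver domain but no kinase domain, by taking a set difference instead of an intersection.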
The genome database also reveals that P. aeruginosa contains 38 ORFs with DUF1 and/or DUF2 domains (SMART), which are domains of unknown function. The DUF1 domain was first recognized in the Caulobacter crescentus signalling protein PleD, which controls cell differentiation (Hecht & Newton, 1995; Aldridge & Jenal, 1999), and subsequently, along with DUF2, in enzymes which also contain putative oxygen-sensing domains and that are involved in cyclic di-GMP regulation of cellulose synthesis in Acetobacter xylinus (DUF1 and DUF2; Tal et al., 1998). These domains, originally referred to as GGDEF (DUF1) and EAL (DUF2), are found in a wide variety of bacterial proteins and occur in P. aeruginosa either singly (often with predicted transmembrane domains), together or in various combinations with other known or suspected signalling domains, including (i) CheY-like response regulator receiver domains; (ii) PAS/PAC domains (which are input-sensing domains found in many prokaryotic and eukaryotic regulatory proteins, including those regulating virulence, oxygen sensing, hydrogen sensing, sporulation, nitrogen metabolism, light reception and photopigments and circadian rhythms; see Ponting & Aravind, 1997); (iii) GAF domains (which are present in diverse phototransducing proteins and cGMP-specific phosphodiesterases; see Aravind & Ponting, 1997); (iv) HAMP domains [which are also known as 'domain found in bacterial signalling proteins' (Pfam) and which occur in numerous bacterial histidine kinases, adenylyl cyclases, methyl-accepting proteins and other signalling proteins, possibly involved in sensing conformational change; see Aravind & Ponting, 1999]; and (v) periplasmic substrate-binding domains, also known as 'extracellular solute-binding proteins, family 3' (Pfam). It is clear that DUF1- and DUF2-containing proteins are all members of a signal transduction or biochemical system whose precise function(s) and mode of action remain virtually unknown, but which indicates that there is still much to learn about signalling pathways and networks in P. aeruginosa.
Analysis of the P. aeruginosa genome with this database has also yielded other interesting observations, including the discovery of multiple chemotaxis-like systems, novel filamentous haemagglutinin (FHA)-like genes, novel type 1 fimbrial genes and composite genetic elements, among others, which were hitherto not known to exist in P. aeruginosa and which are presented below. The positions of loci are given as nucleotide numbers corresponding to the 15 December 1999 sequence release.
P. aeruginosa possesses multiple chemotaxis-like systems
The term chemotaxis is used to describe the ability of bacteria to adjust their swimming motility in response to environmental chemorepellents/attractants. The best characterized of the chemotaxis systems is that of Escherichia coli (Parkinson, 1993; Eisenbach, 1996). This system is composed of membrane-bound methyl-accepting chemotaxis proteins (MCPs) which sense specific environmental signals and are methylated or demethylated by the methyltransferase CheR or the methylesterase CheB, respectively, as part of a feedback circuit which adjusts the methylation state of the MCP, allowing temporal sensing of chemical gradients. The MCPs are coupled via the adaptor protein CheW to the histidine kinase CheA, which autophosphorylates in response to cues from the MCPs. CheA then phosphotransfers to the response regulator CheY, which interacts directly with the flagellar motor switch to control the direction of flagellar rotation, conferring chemotaxis to the bacterium. CheA also competitively phosphotransfers to the methylesterase CheB. This system also contains another protein, CheZ, which serves to dephosphorylate the response regulator CheY. Related chemotaxis systems have been described in many other bacteria. These systems share the six core components of an MCP, and homologues of CheW, CheA, CheY, CheR and CheB, but may also include other proteins which are specific for each system.
The chemotaxis system which controls swimming motility in P. aeruginosa has been described by Masduki et al. (1995) and Kato et al. (1999). This system is encoded by the genes cheY, cheZ, cheA, cheB and cheW, which are located in a cluster downstream of the flagellar structural genes, and cheR, which is reportedly situated at a distinct locus (>1·8 Mb away; Kato et al., 1999). Three MCPs encoded by pctA, pctB and pctC have been identified which feed into this chemotaxis pathway and are also situated at a distinct locus (Kuroda et al., 1995; Taguchi et al., 1997). A second related system has been described in P. aeruginosa which controls type IV fimbrial-mediated twitching motility (Darzins, 1993, 1994, 1995; Alm & Mattick, 1997; C. B. Whitchurch and others, unpublished). This system is composed of a putative MCP (PilJ), a methyl transferase CheR-like protein (PilK), two CheW homologues (PilI and ChpC), a methyl esterase CheB homologue (ChpB) and a CheA-like histidine kinase ChpA. This system also has three CheY-like response regulator/receiver modules: PilG, PilH and a domain located at the C terminus of ChpA.
Using the search word 'chemotaxis' in the 'search by word in description' field on the Search page of the database yields 52 ORFs with this word in the description line. Of these, 26 encode putative MCPs which are scattered throughout the genome. Only four of these genes (pctA, pctB, pctC and pilJ) have been characterized to date, which means much remains to be learned about chemo-sensing in P. aeruginosa and the specificities of each of these MCPs, as well as the particular pathways into which they connect.
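A word search over the description field can be pictured as a simple filter. The sketch below is only an illustration of the idea: the dictionary layout, the function name and the case-insensitive matching are assumptions, and the database itself is backed by SQL rather than an in-memory mapping.

```python
def search_descriptions(annotations, word):
    """Return sorted gene IDs whose description contains the given word.

    annotations maps a gene ID to its description text. Matching is
    case-insensitive here, which is an assumption of this sketch.
    """
    needle = word.lower()
    return sorted(
        gene_id
        for gene_id, description in annotations.items()
        if needle in description.lower()
    )
```

Applied to the full annotation set with the word 'chemotaxis', a filter of this kind would return the list of ORFs whose description line mentions chemotaxis.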
The remaining chemotaxis ORFs are situated in five distinct clusters (see genome positions 194014, 449639, 1585440, 3759948 and 4143740 and surrounds; Fig. 2).
The gene cheR, which is required for swimming chemotaxis in P. aeruginosa, is situated around 2 Mb away, as predicted by Kato et al. (1999), at 3759948 (Fig. 2). Upstream of cheR is an ORF which is predicted to encode a protein homologous to CheV of Campylobacter jejuni, Helicobacter pylori, Bacillus subtilis and V. parahaemolyticus (Fig. 2). These proteins are composed of two domains, an N-terminal CheW domain and a C-terminal CheY domain. Interestingly, the P. aeruginosa CheV homologue is very closely related to that of V. parahaemolyticus (57% identity, 74% similarity) and, as is the case in P. aeruginosa, the cheV gene of V. parahaemolyticus is also located directly upstream of cheR at a site remote from that of the remaining chemotaxis genes (see GenBank U12817). The conserved genetic arrangement of the swimming chemotaxis genes of P. aeruginosa and V. parahaemolyticus suggests lateral gene transfer and/or a common ancestry, and it will be interesting to compare not just gene and predicted protein sequences, but also the conservation of gene order and genomic landscape between species as a means of trying to unravel their histories. The role of the P. aeruginosa CheV homologue is unknown but we predict that it may also play some role in the chemotactic control of swimming motility, or a related function. The genes encoding the MCPs PctA, PctB and PctC, which are known to control swimming motility in response to amino acids, are located at 481252.
The cluster of twitching-motility-related chemotaxis-like genes, pilG, pilH, pilI, pilJ, pilK, chpA, chpB and chpC, is situated at 449639 (Fig. 2). These genes are similar in arrangement and structure to the frz genes controlling gliding motility in Myxococcus xanthus (Ward & Zusman, 1999), except that they are more complex. In particular, the central chpA gene encodes a protein with six predicted Hpt (histidine phosphotransfer) domains, one of which has a serine in place of the conserved histidine, as well as a C-terminal CheY receiver domain, making this one of the most complex signal transduction proteins yet described in nature (C. B. Whitchurch and others, unpublished). It is also worth noting that the current release of the P. aeruginosa genome sequence appears to contain a sequence error at this point, or at least a difference from the pre-existing GenBank entry covering this region, which results in the loss of a stop codon and the apparent fusion of two adjacent genes (pilL and chpA; see Gene ID 551, cf. GenBank U79580). Given the magnitude of the sequence and the high G+C content of the genome, it is not necessarily surprising that this sequence release may have errors which will affect the accuracy of the annotation at various sites.
The remaining two clusters of putative chemotaxis-like genes (clusters 4 and 5; Fig. 2) are entirely novel and have no known function. Cluster 4 is a complete set of chemotaxis-like genes encoding homologues of CheY, CheA, CheW, CheR and CheB, and has two associated MCPs. This cluster appears to be related to the chemotaxis genes of a number of bacteria including Caulobacter crescentus, Rhodobacter sphaeroides and Bacillus subtilis in that it also encodes a homologue of the chemotaxis protein CheD, which facilitates the methylation of MCPs by CheR in these systems (Rosario & Ordal, 1996).
Cluster 5 encodes homologues of CheR, CheB, CheA and CheW as well as an MCP and a second protein containing a C-terminally located CheW domain. Unlike the other chemotaxis-like gene clusters of P. aeruginosa, cluster 5 does not encode an individual CheY protein. However, SMART and Pfam analyses show the CheA homologue of this cluster (Gene ID 4817) is actually a CheA/CheY hybrid protein, like ChpA, as it also possesses a response regulator receiver domain situated at its C terminus. SMART and Pfam analyses also show that a CheY-like response regulator receiver domain is found at the N terminus of the protein encoded by the ORF situated at the end of this cluster (Gene ID 4814). These analyses also show that this protein contains a DUF1 domain at its C terminus (see above). Other proteins with a similar domain structure include the C. crescentus protein PleD, which is part of a signal transduction pathway controlling cell differentiation (Hecht & Newton, 1995; Aldridge & Jenal, 1999). We predict that this protein in P. aeruginosa participates in a phosphotransfer-dependent signal transduction pathway involving the cluster 5 chemotaxis-like system. Interestingly, this cluster shows very close similarity and synteny to a locus of seven genes in Pseudomonas fluorescens, termed the wsp (wrinkly spreader) operon (E. Bantanaki & P. B. Rainey, personal communication). Mutations in the wsp operon affect the production of cellulose (which is encoded by the wss operon) and abolish rapid surface spreading of colonies (A. J. Spiers & P. B. Rainey, unpublished). Intriguingly, however, PAO1 lacks a homologue of the wss operon.
It is also interesting that the twitching motility chemotaxis-like system, the swimming motility chemotaxis system and the wsp-like chemotaxis system contain more than one CheW module. Considering the large number of ORFs encoding putative MCPs (26) scattered around the genome, it is tempting to speculate that the presence of multiple CheW homologues in these systems may facilitate the interaction of a greater number of MCPs into the various signal transduction pathways, thereby diversifying the range of chemical signals to which these pathways can respond.
Novel general secretion pathways
P. aeruginosa contains at least 35 genes required for the biosynthesis and function of type IV fimbriae (Alm & Mattick, 1997; Semmler et al., 2000), which are polar filaments involved in epithelial attachment and twitching motility (Semmler et al., 1999). These filaments are composed of a structural subunit, PilA or pilin, which has a highly conserved and highly hydrophobic N-terminal domain that forms the core of a helical structure (Parge et al., 1995). Assembly of type IV fimbriae also requires a number of accessory proteins, including other pilin-like proteins, a specific prepilin peptidase, inner- and outer-membrane proteins and nucleotide-binding proteins. Homologues of these proteins are also involved in type II protein secretion in P. aeruginosa and DNA uptake in a wide variety of bacteria, suggesting that these are all subsets of a supersystem which shares a common evolutionary origin, architecture and functional basis (Hobbs & Mattick, 1993; Mattick & Alm, 1995). Given our laboratory's long-standing interest in this system, we used the genome database to see if any other type IV pilin-like genes existed in P. aeruginosa in addition to those required for type IV fimbrial production (pilAEVWX and fimTU; Alm et al., 1996; Alm & Mattick, 1997) and type II protein secretion (xcpTX; Tommassen et al., 1992; Bleves et al., 1998). A BLAST search using the first 50 residues of unprocessed PilA identified two separate loci (A and B, at 3011000 and 731500, respectively), each of which contained a cluster of five ORFs which are predicted to encode proteins with a type IV pilin-like hydrophobic N-terminal domain (Fig. 3).
The similarity of type IV fimbriae and type II protein secretion apparatus and DNA uptake extends beyond the common use of prepilin-like subunits to include the accessory proteins: GspE-like nucleotide-binding proteins PilB/XcpR/ComG-1; GspF-like inner-membrane proteins PilC/XcpS/ComG-2; GspO-like prepilin peptidase PilD (shared by fimbrial and type II secretion prepilin subunits)/ComC; and GspD-like outer-membrane proteins PilQ/XcpQ (Alm & Mattick, 1997; Tommassen et al., 1992; Albano et al., 1989). Using the graphical viewer of our database it was clear that locus A contained ORFs encoding a nucleotide-binding protein and inner-membrane protein, whilst locus B contained three ORFs encoding a nucleotide-binding protein and inner- and outer-membrane proteins (Fig. 4). Furthermore, a closer examination of locus B revealed the existence of GspC, GspL and GspM homologues (Fig. 4). The presence of an apparently complete set of type II secretion genes paralogous to the xcp cluster suggests a possible role for locus B in protein secretion, independent of the Xcp gene products.
Novel F17-like fimbriae loci
The P. aeruginosa genome contains at least three loci encoding type 1-like fimbriae, which are quite distinct from type IV fimbriae. These fimbriae are commonly found in enteric pathogens and include classical type 1 fimbriae, P pili and F17 fimbriae, all of which belong to the group of fimbrial structures assembled by the chaperone/usher pathway and which appear to function strictly as adhesion devices (Hung & Hultgren, 1998; Soto & Hultgren, 1999). This type of fimbria has not previously been identified in P. aeruginosa. The chaperone/usher pathway requires the interaction of a periplasmic immunoglobulin-like chaperone protein with a fimbrial subunit, targeting of the chaperone-subunit complex to an outer-membrane usher protein, disassociation of the complex and assembly into pili across the outer membrane (Hung & Hultgren, 1998). The simplest fimbriae assembled by this pathway are the thin, flexible pili (including F17) which generally require only a chaperone, usher, subunit and an adhesin (assembled at the distal end of the structure) (Soto & Hultgren, 1999).
Initial BLAST searches of our database identified three genes which showed similarity to the major subunit of E. coli F17-A fimbriae (see positions 1073235, 4569005 and 2342419 for loci A, B and C, respectively; Fig. 5). Each fimbrial subunit contained a signal sequence required for export from the cytoplasm (a feature of known fimbrial subunits) and a typical β-zipper motif (a penultimate tyrosine, alternating hydrophobic residues at positions 4, 6 and 8, and a glycine at position 14 from the C terminus) known to be involved directly in the interaction with a chaperone (Holmgren et al., 1992). By using the Pfam search tool to identify ORFs in our database containing the 'Gram-negative pili assembly chaperone' Pfam domain (Bateman et al., 2000), we found that all three fimbrial subunit genes were associated with one (or two) gene(s) encoding a FGS-type fimbrial chaperone protein (Fig. 5). A multiple alignment of putative P. aeruginosa chaperone peptide sequences with PapD (the prototype FGS-type chaperone; Lindberg et al., 1989) revealed that the P. aeruginosa homologues shared the majority of residues found to be conserved in other members of the chaperone family (Holmgren et al., 1992) (data not shown). Similarly, a Pfam search using the 'fimbrial usher protein' Pfam domain (Bateman et al., 2000) revealed that all three loci were accompanied by genes encoding typical fimbrial usher proteins (Hung & Hultgren, 1998; Fig. 5). These searches also revealed other, apparently orphan proteins with similarity to fimbrial chaperones and ushers (at position 558380 and 5217449; Gene ID 665 and 6046, respectively) which are adjacent to ORFs encoding proteins with low similarity to other fimbrial proteins and that may represent other as yet unrecognized fimbrial loci or surface structures.
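The β-zipper description given in this section (a penultimate tyrosine, hydrophobic residues at positions 4, 6 and 8, and a glycine at position 14 from the C terminus) can be turned into a quick screening test. The sketch below is a literal reading of that description and nothing more: it assumes the C-terminal residue is position 1 (so the penultimate residue is position 2), and the set of residues treated as hydrophobic is an assumption, since the text does not define one.

```python
# Assumed hydrophobic residue set; the text does not specify one.
HYDROPHOBIC = set("AVLIMFWY")


def residue_from_c_terminus(seq, position):
    """Return the residue at a 1-based position counted from the C terminus."""
    return seq[-position]


def has_beta_zipper(seq):
    """Check a peptide for the C-terminal pattern described in the text."""
    seq = seq.upper()
    if len(seq) < 14:
        return False  # too short to have a residue at position 14
    return (
        residue_from_c_terminus(seq, 2) == "Y"
        and all(
            residue_from_c_terminus(seq, p) in HYDROPHOBIC for p in (4, 6, 8)
        )
        and residue_from_c_terminus(seq, 14) == "G"
    )
```

A pattern test of this kind is only a coarse filter; candidate subunits would still need to be judged alongside signal sequence predictions and similarity to known fimbrial proteins, as was done in the analysis described here.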
Loci B and C and the homologous locus in Y. pestis each have an additional chaperone, a feature not previously described in other organisms. Fimbrial chaperones and outer-membrane ushers are known to operate in parental pairs, which are able to assemble foreign fimbrial subunits (albeit less efficiently than their natural partners) provided they possess the necessary motifs (i.e. a β-zipper) (Jones et al., 1993). However, the partners within chaperone/usher pairs are not thought to be as readily interchangeable (Klemm et al., 1995; Jones et al., 1993). Given that the order of subunit assembly is proposed to be determined by kinetic partitioning (Hung & Hultgren, 1998), it is possible that additional chaperones may allow more efficient incorporation of fimbrial subunits from heterologous loci via heterologous membrane usher proteins. Elucidating the role (if any) that an additional chaperone plays in fimbrial biogenesis awaits mutational studies.
The adhesive specificity of type 1 and related fimbriae is usually determined by the structure of the tip protein(s) known as fimbrial adhesins (Hultgren et al., 1993). Fimbrial adhesins can be thought of as having a C-terminal region with the conserved features of the major and minor fimbrial subunits that allow interaction with a chaperone (i.e. β-zipper), and a variant N-terminal receptor-binding domain (Hung & Hultgren, 1998). Generally, fimbrial adhesins are encoded in the same cluster as the other structural components of the fimbriae (Hung & Hultgren, 1998). Gene 2776 of locus C (corresponding to y39 of Y. pestis) has a signal sequence and similarity to fimbrial subunits in the C-terminal domain (including a β-zipper), indicating that it probably encodes a fimbrial adhesin. Similarly, gene 5316 of locus B also appears to be a fimbrial adhesin. Although there was no apparent adhesin gene associated with locus A, a gene with low similarity to type 1 fimbrial subunits was identified elsewhere in the genome (position 5948579; Gene ID 6870). Closer analysis revealed a signal sequence, β-zipper and an N-terminal domain with no similarity to other fimbrial subunits, suggesting that this ORF is likely to encode a fimbrial adhesin. It should also be noted that genes 663 and 664, associated with the aforementioned orphan chaperone, could also encode fimbrial adhesins. Furthermore, the existence of other, as yet unidentified adhesin ORFs in the genome cannot be ruled out. A single fimbrial adhesin, FimD, is known to be shared by two fimbrial types in Bordetella pertussis to facilitate antigenic variation (Willems et al., 1992) and thus it is possible that this type of sharing also occurs in P. aeruginosa.
Finally, we undertook a preliminary search for physical evidence of thin fimbrial structures by electron microscopy. Fig. 6 shows the presence of at least one type, and possibly two types, of peritrichous thin fimbriae in P. aeruginosa. This is the first time, to our knowledge, that this type of fimbriae has been reported in P. aeruginosa. Which loci encode which particular fimbriae and whether these fimbriae are involved in P. aeruginosa virulence, and under what circumstances, will require gene knockout and in vivo and in vitro analyses, but the presence of such fimbriae suggests that a new aspect of the biology of P. aeruginosa remains to be explored.
Despite the similarity of P. aeruginosa FHA-like proteins to Bordetella pertussis FHA, there are significant differences, particularly in the nature of their repeat regions. Bordetella pertussis FHA contains a set of short repeats in the N-terminal domain prior to the RGD box (Locht et al., 1993; Fig. 7). P. aeruginosa FHA-like A, B, D, E and F also contain repeats which vary in length and position across the family. FHA-A and FHA-B are nearly 100% identical across much of their nucleotide sequence and the additional nucleotides of sequence present in FHA-A are largely due to a string of additional repeats near the N-terminal domain. FHA-E and FHA-F share 40% amino acid identity overall but FHA-E has a unique N-terminal region of ten 18-aa repeats as well as patches of additional sequence in the C-terminal repeat region. The close similarity seen between FHA-A and FHA-B, and FHA-E and FHA-F, respectively, suggests that each pair arose from duplication events (more recent in the case of FHA-A and FHA-B). The remaining two FHA-like proteins are remarkably different. FHA-C contains no obvious repeats whereas FHA-D contains a unique set of 13 near-identical 81-aa repeats and one partial repeat in its C-terminal domain. We consider it likely that the FHA-like ORFs in P. aeruginosa encode a group of large surface-expressed proteins that are exported in a manner similar to Bordetella pertussis FHA, are probably antigenic and may have a role in adhesion to eukaryotic cells. These are all experimentally testable hypotheses.
Composite genetic elements
E. coli contains a class of complex genetic composites known as Rhs elements whose evolution has been studied extensively (Wang et al., 1998). In E. coli the Rhs element generally consists of (i) a 3·7 kb G+C-rich core encoding a large protein (composed of a core element and variable 100-200 aa C-terminal core extension) thought to associate with the cell surface, (ii) a small downstream ORF, (iii) a downstream insertion sequence and sometimes (iv) one or more ORFs at a 5' position (such as vgr). Vgr is a large protein of unknown function distinguished by 18-19 repetitions of a Val-Gly dipeptide occurring with an 8-residue periodicity (Wang et al., 1998). It has been speculated that, as both Vgr and the Rhs core ORF are hydrophilic and are characterized by a regularly repeated peptide motif, both ORFs may encode ligand-binding proteins found either on the bacterial cell wall or secreted into the medium (Hill et al., 1995). Although a precise role for these elements has yet to be determined, the degree of sequence conservation and their relative abundance (0·8% of E. coli K-12 genome) suggest that maintenance of Rhs elements provides some advantage to the host cell (Wang et al., 1998). As far as we are aware, there has been no report of these elements in bacterial species other than E. coli, although their unusually high G+C content suggests that they may have originated elsewhere and been transferred into the E. coli genome (Wang et al., 1998). However, they do appear to be present in other unfinished bacterial genomes (see below).
During annotation of the P. aeruginosa genome we identified several ORFs encoding proteins with significant similarity to Vgr. Using BLAST we were able to identify 10 complete homologues of Vgr in the P. aeruginosa genome (Table 1). Using the graphical viewer of our genome database we were also able to identify adjacent ORFs that, together with these Vgr-like ORFs, might be considered composite genetic elements (Table 1). Interestingly, only one Vgr-like ORF was found to be associated with a complete Rhs element (which shares 32% identity to the E. coli Rhs core element and features a unique 149-aa core extension). The Rhs core ORF at Vgr locus I contains more than 20 repetitions of the motif YDxxGRL, concentrated (and most periodic) between residues 400 and 800, reflecting the pattern observed in the E. coli core ORF (Wang et al., 1998). No additional Rhs core elements were identified in the genome, although the remnants of such a sequence exist at a position not linked with a Vgr ORF (2758000). All but one of the remaining Vgr-like sequences were found to be upstream of ORFs encoding (generally large) proteins that, despite sharing no similarity, are nearly all hydrophilic proteins of unknown function and characterized by repeating peptide motifs (i.e. of a similar nature to the Vgr and the Rhs core ORF) (Table 1). Although no other downstream ORFs exhibit the periodicity of the Rhs core ORF at Vgr-I, we consider it likely that (at least) some of these ORFs may form a composite genetic element with their associated Vgr-like ORF. Interestingly, Vgr-A, Vgr-C and Vgr-D are each associated with downstream ORFs that are present in other unfinished genome sequences, notably those genomes that also encode homologues of Rhs core ORFs and Vgr (data not shown). Some composite genetic elements in E. coli contain a small ORF, encoding a product with high similarity to Vibrio cholerae haemolysin co-regulated protein of unknown function (hcp), upstream of vgr (Williams et al., 1996; Wang et al., 1998). There are three hcp homologues (58% identity with both E. coli and V. cholerae Hcp) in the P. aeruginosa genome which share almost identical nucleotide sequence and are found upstream of Vgr homologues (Table 1). Therefore, in P. aeruginosa, the Vgr-like ORFs appear to exist as part of larger composite elements, similar to the arrangement in E. coli.
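Repeated motifs such as YDxxGRL can be located with a simple pattern search. The following is a minimal sketch, assuming that 'x' stands for any single residue; the function name and its `first`/`last` parameters are hypothetical and are not part of the database.

```python
import re

# 'x' in YDxxGRL is read as any single residue (an assumption of this sketch).
RHS_MOTIF = re.compile(r"YD..GRL")


def motif_positions(seq, first=1, last=None):
    """Return 1-based start positions of YDxxGRL motifs in a peptide.

    Only motifs starting within residues first..last (inclusive) are
    reported, so a region such as residues 400 to 800 can be examined.
    """
    last = len(seq) if last is None else last
    starts = [m.start() + 1 for m in RHS_MOTIF.finditer(seq.upper())]
    return [p for p in starts if first <= p <= last]
```

The spacing between consecutive positions returned by such a search gives a rough view of how periodic the repeats are within a chosen region.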
Other features of the genome
The previous sections outline a few of the findings which have arisen from our analysis of the P. aeruginosa genome using the interactive database. All of these reveal hitherto unsuspected aspects of P. aeruginosa biology and open up new avenues of experimental investigation. There are many more, for which space does not permit a full discussion, but which include the following.
A 15 kb cluster of fix-like genes encoding a micro-oxic respiration system, similar to that used by nitrogen-fixing bacteria (Fischer, 1994), including two adjacent homologous operons encoding membrane-bound cbb3-type cytochrome oxidase complexes (fixNOQP), an adjacent operon encoding a membrane-bound cation pump complex (fixGHIS) and genes encoding homologues of nitrogen fixation/micro-oxic regulatory cascade proteins (fixL, fixJ and fixK/anr) (see 1678000).
An almost complete copy of bacteriophage Pf1 (see 784500).
A cluster of three genes encoding a homogentisate catabolism pathway that demonstrates remarkable (50%) identity to homologues in both humans and fungi (Fernandez-Canon & Penalva, 1995) (see 2186000).
A number of novel colicins, bacteriocins and accessory proteins (see 1066500, 3490500, 4634000 and 4245000).
Six almost identical copies of a 1300 nt insertion element encoding a 38 kDa transposase and conserved flanking regions (see 500500, 2557000, 3044000, 3843000, 4473500 and 5382000).
It is already clear that the study of any bacterium is greatly assisted by the availability of its genome sequence, which ipso facto reveals the full repertoire of its genes and its proteome, and which can vastly accelerate mutational and molecular genetic analyses. Conversely, it is becoming apparent that the lack of a genome sequence is a great impediment to the experimental investigation of a bacterium and renders such studies, by comparison, inefficient and incomplete. This is amply illustrated by the case studies given in this paper. We hope that this database, integrating a variety of bioinformatic analyses and links, will be a useful platform for the ongoing genomic and molecular biological analysis of P. aeruginosa. It is anticipated that future versions of the database will be based upon the complete annotated P. aeruginosa genome when this becomes publicly available and will incorporate experimental information and linked clone resources, such as a minimal cosmid tiling path covering the genome (B. Huang and others, unpublished). The platform on which this database has been developed also makes it amenable to the generation of similar genome databases using other complete or near-complete microbial genomes.
NOTE ADDED IN PROOF |
---|
ACKNOWLEDGEMENTS |
---|
REFERENCES |
---|
Aldridge, P. & Jenal, U. (1999). Cell cycle-dependent degradation of a flagellar motor component requires a novel-type response regulator. Mol Microbiol 32, 379-391.
Alm, R. A. & Mattick, J. S. (1996). Identification of two genes with prepilin-like leader sequences required for type 4 fimbrial biogenesis in Pseudomonas aeruginosa. J Bacteriol 178, 3809-3817.
Alm, R. A. & Mattick, J. S. (1997). Genes involved in the biogenesis and function of type-4 fimbriae in Pseudomonas aeruginosa. Gene 192, 89-98.
Alm, R. A., Hallinan, J. P., Watson, A. A. & Mattick, J. S. (1996). Fimbrial biogenesis genes of Pseudomonas aeruginosa pilW and pilX increase the similarity of type 4 fimbriae to the GSP protein-secretion systems and pilY1 encodes a gonococcal PilC homologue. Mol Microbiol 22, 161-173.
Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389-3402.
Aravind, L. & Ponting, C. P. (1997). The GAF domain: an evolutionary link between diverse phototransducing proteins. Trends Biochem Sci 22, 458-459.
Aravind, L. & Ponting, C. P. (1999). The cytoplasmic helical linker domain of receptor histidine kinase and methyl-accepting proteins is common to many prokaryotic signalling proteins. FEMS Microbiol Lett 176, 111-116.
Barenkamp, S. J. & Leininger, E. (1992). Cloning, expression, and DNA sequence analysis of genes encoding nontypeable Haemophilus influenzae high-molecular-weight surface-exposed proteins related to filamentous hemagglutinin of Bordetella pertussis. Infect Immun 60, 1302-1313.
Bateman, A., Birney, E., Durbin, R., Eddy, S. R., Howe, K. L. & Sonnhammer, E. L. (2000). The Pfam protein families database. Nucleic Acids Res 28, 263-266.
Bleves, S., Voulhoux, R., Michel, G., Lazdunski, A., Tommassen, J. & Filloux, A. (1998). The secretion apparatus of Pseudomonas aeruginosa: identification of a fifth pseudopilin, XcpX (GspK family). Mol Microbiol 27, 31-40.
Buchrieser, C., Prentice, M. & Carniel, E. (1998). The 102-kilobase unstable region of Yersinia pestis comprises a high-pathogenicity island linked to a pigmentation segment which undergoes internal rearrangement. J Bacteriol 180, 2321-2329.
Buchrieser, C., Rusniok, C., Frangeul, L., Couve, E., Billault, A., Kunst, F., Carniel, E. & Glaser, P. (1999). The 102-kilobase pgm locus of Yersinia pestis: sequence analysis and comparison of selected regions among different Yersinia pestis and Yersinia pseudotuberculosis strains. Infect Immun 67, 4851-4861.
Cope, L. D., Thomas, S. E., Latimer, J. L., Slaughter, C. A., Muller-Eberhard, U. & Hansen, E. J. (1994). The 100 kDa haem:haemopexin-binding protein of Haemophilus influenzae: structure and localization. Mol Microbiol 13, 863-873.
Darzins, A. (1993). The pilG gene product, required for Pseudomonas aeruginosa pilus production and twitching motility, is homologous to the enteric, single-domain response regulator CheY. J Bacteriol 175, 5934-5944.
Darzins, A. (1994). Characterization of a Pseudomonas aeruginosa gene cluster involved in pilus biosynthesis and twitching motility: sequence similarity to the chemotaxis proteins of enterics and the gliding bacterium Myxococcus xanthus. Mol Microbiol 11, 137-153.
Darzins, A. (1995). The Pseudomonas aeruginosa pilK gene encodes a chemotactic methyltransferase (CheR) homologue that is translationally regulated. Mol Microbiol 15, 703-717.
Domenighini, M., Relman, D., Capiau, C., Falkow, S., Prugnola, A., Scarlato, V. & Rappuoli, R. (1990). Genetic characterization of Bordetella pertussis filamentous haemagglutinin: a protein processed from an unusually large precursor. Mol Microbiol 4, 787-800.
Eisenbach, M. (1996). Control of bacterial chemotaxis. Mol Microbiol 20, 903-910.
Fernandez-Canon, J. & Penalva, M. (1995). Molecular characterization of a gene encoding a homogentisate dioxygenase from Aspergillus nidulans and identification of its human and plant homologues. J Biol Chem 270, 21199-21205.
Fischer, H. M. (1994). Genetic regulation of nitrogen fixation in rhizobia. Microbiol Rev 58, 352-386.
Hecht, G. B. & Newton, A. (1995). Identification of a novel response regulator required for the swarmer-to-stalked-cell transition in Caulobacter crescentus. J Bacteriol 177, 6223-6229.
Hill, C. W., Feulner, G., Brody, M. S., Zhao, S., Sadosky, A. B. & Sandt, C. H. (1995). Correlation of Rhs elements with Escherichia coli population structure. Genetics 141, 15-24.
Hobbs, M. & Mattick, J. S. (1993). Common components in the assembly of type 4 fimbriae, DNA transfer systems, filamentous phage and protein secretion apparatus; a general system for the formation of surface-associated protein complexes. Mol Microbiol 10, 233-243.
Holmgren, A., Kuehn, M. J., Branden, C. I. & Hultgren, S. J. (1992). Conserved immunoglobulin-like features in a family of periplasmic pilus chaperones in bacteria. EMBO J 11, 1617-1622.
Hultgren, S. J., Abraham, S., Caparon, M., Falk, P., St Geme, J. W. D. & Normark, S. (1993). Pilus and nonpilus bacterial adhesins: assembly and function in cell recognition. Cell 73, 887-901.
Hung, D. L. & Hultgren, S. J. (1998). Pilus biogenesis via the chaperone/usher pathway: an integration of structure and function. J Struct Biol 124, 201-220.
Jones, C. H., Pinkner, J. S., Nicholes, A. V., Slonim, L. N., Abraham, S. N. & Hultgren, S. J. (1993). FimC is a periplasmic PapD-like chaperone that directs assembly of type 1 pili in bacteria. Proc Natl Acad Sci USA 90, 8397-8401.
Kato, J., Nakamura, T., Kuroda, A. & Ohtake, H. (1999). Cloning and characterization of chemotaxis genes in Pseudomonas aeruginosa. Biosci Biotechnol Biochem 63, 155-161.
Klemm, P., Jorgensen, B. J., Kreft, B. & Christiansen, G. (1995). The export systems of type 1 and F1C fimbriae are interchangeable but work in parental pairs. J Bacteriol 177, 621-627.
Kuroda, A., Kumano, T., Taguchi, K., Nikata, T., Kato, J. & Ohtake, H. (1995). Molecular cloning and characterization of a chemotactic transducer gene in Pseudomonas aeruginosa. J Bacteriol 177, 7019-7025.
Lindberg, F., Tennent, J. M., Hultgren, S. J., Lund, B. & Normark, S. (1989). PapD, a periplasmic transport protein in P-pilus biogenesis. J Bacteriol 171, 6052-6058.
Locht, C., Bertin, P., Menozzi, F. D. & Renauld, G. (1993). The filamentous haemagglutinin, a multifaceted adhesin produced by virulent Bordetella spp. Mol Microbiol 9, 653-660.
Martinez, A., Ostrovsky, P. & Nunn, D. (1998). Identification of an additional member of the secretin superfamily of proteins in Pseudomonas aeruginosa that is able to function in type II protein secretion. Mol Microbiol 28, 1235-1246.
Masduki, A., Nakamura, J., Ohga, T., Umezaki, R., Kato, J. & Ohtake, H. (1995). Isolation and characterization of chemotaxis mutants and genes of Pseudomonas aeruginosa. J Bacteriol 177, 948-952.
Mattick, J. S. & Alm, R. A. (1995). Common architecture of type 4 fimbriae and complexes involved in macromolecular traffic. Trends Microbiol 3, 411-413.
Motallebi-Veshareh, M., Rouch, D. A. & Thomas, C. M. (1990). A family of ATPases involved in active partitioning of diverse bacterial plasmids. Mol Microbiol 4, 1455-1463.
Parge, H. E., Forest, K. T., Hickey, M. J., Christensen, D. A., Getzoff, E. D. & Tainer, J. A. (1995). Structure of the fibre-forming protein pilin at 2·6 Å resolution. Nature 378, 32-38.
Parkinson, J. S. (1993). Signal transduction schemes of bacteria. Cell 73, 857-871.
Pestova, E. V. & Morrison, D. A. (1998). Isolation and characterization of three Streptococcus pneumoniae transformation-specific loci by use of a lacZ reporter insertion vector. J Bacteriol 180, 2701-2710.
Ponting, C. P. & Aravind, L. (1997). PAS: a multifunctional domain family comes to light. Curr Biol 7, R674-R677.
Poole, K., Schiebel, E. & Braun, V. (1988). Molecular characterization of the hemolysin determinant of Serratia marcescens. J Bacteriol 170, 3177-3188.
Pugsley, A. P. (1993). The complete general secretory pathway in Gram negative bacteria. Microbiol Rev 57, 50-108.
Read, T. D., Dowdell, M., Satola, S. W. & Farley, M. M. (1996). Duplication of pilus gene complexes of Haemophilus influenzae biogroup aegyptius. J Bacteriol 178, 6564-6570.
Relman, D., Tuomanen, E., Falkow, S., Golenbock, D. T., Saukkonen, K. & Wright, S. D. (1990). Recognition of a bacterial adhesin by an integrin: macrophage CR3 (alpha M beta 2, CD11b/CD18) binds filamentous hemagglutinin of Bordetella pertussis. Cell 61, 1375-1382.
Rosario, M. M. & Ordal, G. W. (1996). CheC and CheD interact to regulate methylation of Bacillus subtilis methyl-accepting chemotaxis proteins. Mol Microbiol 21, 511-518.
Schultz, J., Copley, R. R., Doerks, T., Ponting, C. P. & Bork, P. (2000). SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res 28, 231-234.
Semmler, A. B. T., Whitchurch, C. B. & Mattick, J. S. (1999). A re-examination of twitching motility in Pseudomonas aeruginosa. Microbiology 145, 2863-2873.
Semmler, A. B. T., Whitchurch, C. B., Leech, A. J. & Mattick, J. S. (2000). Identification of a novel gene, fimV, involved in twitching motility in Pseudomonas aeruginosa. Microbiology 146, 1321-1332.
Soto, G. E. & Hultgren, S. J. (1999). Bacterial adhesins: common themes and variations in architecture and assembly. J Bacteriol 181, 1059-1071.
Stibitz, S., Weiss, A. A. & Falkow, S. (1988). Genetic analysis of a region of the Bordetella pertussis chromosome encoding filamentous hemagglutinin and the pleiotropic regulatory locus vir. J Bacteriol 170, 2904-2913.
Stover, C. K., Pham, X. Q., Erwin, A. L. & 28 other authors (2000). Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen. Nature 406, 959-964.
Strom, M. S. & Lory, S. (1991). Amino acid substitutions in pilin of Pseudomonas aeruginosa. Effect on leader peptide cleavage, amino-terminal methylation, and pilus assembly. J Biol Chem 266, 1656-1664.
Taguchi, K., Fukutomi, H., Kuroda, A., Kato, J. & Ohtake, H. (1997). Genetic identification of chemotactic transducers for amino acids in Pseudomonas aeruginosa. Microbiology 143, 3223-3229.
Tal, R., Wong, H. C., Calhoon, R. & 11 other authors (1998). Three cdg operons control cellular turnover of cyclic di-GMP in Acetobacter xylinum: genetic organization and occurrence of conserved domains in isoenzymes. J Bacteriol 180, 4416-4425.
Tommassen, J., Filloux, A., Bally, M., Murgier, M. & Lazdunski, A. (1992). Protein secretion in Pseudomonas aeruginosa. FEMS Microbiol Rev 103, 73-90.
Turner, L. R., Lara, J. C., Nunn, D. N. & Lory, S. (1993). Mutations in the consensus ATP-binding sites of XcpR and PilB eliminate extracellular protein secretion and pilus biogenesis in Pseudomonas aeruginosa. J Bacteriol 175, 4962-4969.
Uphoff, T. S. & Welch, R. A. (1990). Nucleotide sequencing of the Proteus mirabilis calcium-independent hemolysin genes (hpmA and hpmB) reveals sequence similarity with the Serratia marcescens hemolysin genes (shlA and shlB). J Bacteriol 172, 1206-1216.
Wang, Y. D., Zhao, S. & Hill, C. W. (1998). Rhs elements comprise three subfamilies which diverged prior to acquisition by Escherichia coli. J Bacteriol 180, 4102-4110.
Ward, M. J. & Zusman, D. R. (1999). Motility in Myxococcus xanthus and its role in developmental aggregation. Curr Opin Microbiol 2, 624-629.
Wheeler, R. T. & Shapiro, L. (1997). Bacterial chromosome segregation: is there a mitotic apparatus? Cell 88, 577-579.
Willems, R. J., van der Heide, H. G. & Mooi, F. R. (1992). Characterization of a Bordetella pertussis fimbrial gene cluster which is located directly downstream of the filamentous haemagglutinin gene. Mol Microbiol 6, 2661-2671.
Willems, R. J., Geuijen, C., van der Heide, H. G., Renauld, G., Bertin, P., van den Akker, W. M., Locht, C. & Mooi, F. R. (1994). Mutational analysis of the Bordetella pertussis fim/fha gene cluster: identification of a gene with sequence similarities to haemolysin accessory genes involved in export of FHA. Mol Microbiol 11, 337-347.
Williams, S. G., Varcoe, L. T., Attridge, S. R. & Manning, P. A. (1996). Vibrio cholerae Hcp, a secreted protein coregulated with HlyA. Infect Immun 64, 283-289.
Wu, H., Kato, J., Kuroda, A., Ikeda, T., Takiguchi, N. & Ohtake, H. (2000). Identification and characterization of two chemotactic transducers for inorganic phosphate in Pseudomonas aeruginosa. J Bacteriol 182, 3400-3404.
Yahr, T. L., Goranson, J. & Frank, D. W. (1996). Exoenzyme S of Pseudomonas aeruginosa is secreted by a type III pathway. Mol Microbiol 22, 991-1003.
Received 15 May 2000; revised 10 July 2000; accepted 13 July 2000.