Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, The Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
Correspondence
David W. Ussery
(dave{at}cbs.dtu.dk)
Genomes of the month
Twelve new microbial genomes have been published since the last Genome Update column was written. This month's collection of genomes, listed in Table 1, includes five published bacterial genomes (Candidatus Blochmannia pennsylvanicus, Colwellia psychrerythraea, Mycoplasma hyopneumoniae, Mycoplasma synoviae and Pseudomonas syringae pv. syringae B728a). A further four bacterial genomes have been deposited in GenBank (Candidatus Pelagibacter ubique, Pseudomonas syringae pv. phaseolicola 1448A, Psychrobacter arcticum and Staphylococcus haemolyticus). In addition, three protozoan genomes have been published (Leishmania major, Trypanosoma cruzi and Trypanosoma brucei).
Candidatus Blochmannia pennsylvanicus is an endosymbiont of ants and is related to other insect mutualists (e.g. Buchnera of aphids and Wigglesworthia of tsetse flies). All of these are γ-Proteobacteria that have undergone genome reduction and have a stable genome organization with low levels of rearrangement. Degnan et al. (2005) sequenced the Candidatus B. pennsylvanicus BPEN genome and took advantage of this stability of endosymbiont genomes to compare rates of gene evolution. Overall, they found a 10- to 50-fold faster amino acid substitution rate for Candidatus Blochmannia compared to other bacteria. However, gene order and strand orientation are strongly conserved among Candidatus Blochmannia species, so the underlying architecture of the chromosome is stable even though the amino acid sequences are changing.
Colwellia psychrerythraea 34H is the type species of the genus Colwellia and is a good model organism for the study of life in permanently cold environments; although members of the γ-Proteobacteria group, Colwellia species are strictly psychrophilic and thus require temperatures of less than 20 °C. C. psychrerythraea prefers temperatures ranging from −1 °C to +10 °C, but can grow at lower temperatures in sugar solutions or under deep-sea pressures. A recent comparative genomic analysis of the 5·4 Mbp sequenced genome suggests that a collection of synergistic changes in the overall genome content and amino acid composition supports its psychrophilic lifestyle (Methé et al., 2005). An increase in polar residues (in particular serine), the favouring of aspartate over glutamate and a decrease in charged surface residues are all likely to promote architectural changes in enzymes, resulting in increased effectiveness at cold temperatures. In addition, many of the 4937 predicted CDSs are likely to confer cryotolerance, e.g. genes for polyhydroxyalkanoates (PHA), which may also aid in pressure adaptation, genes for extracellular polysaccharides that can serve as cryoprotectants, genes involved in the synthesis of branched membrane lipids that reduce membrane viscosity at cold temperatures, and genes for cold-shock proteins (Methé et al., 2005).
Mycoplasma hyopneumoniae is the aetiological agent of swine enzootic pneumonia. It is a major threat to swine health and is responsible for great economic damage every year in the swine industry. Mycoplasma synoviae is a poultry pathogen causing respiratory disease and synovitis. Vasconcelos et al. (2005) compared and analysed three complete Mycoplasma genomes: two M. hyopneumoniae strains (strain 7448, a pathogenic strain, and strain J, a non-pathogenic strain) and one strain of the avian pathogen M. synoviae (strain 53). To examine different aspects of Mycoplasma evolution, these genomes were also compared with eight other available Mycoplasma genome sequences. Genomic comparison revealed strain-specific regions, high rates of genomic rearrangements, alterations in adhesin sequences and possible horizontal gene transfer between M. synoviae and Mycoplasma gallisepticum (Vasconcelos et al., 2005).
Pseudomonas syringae is a widespread bacterial pathogen of many plant species. This month, a second P. syringae genome has been published (P. syringae pv. syringae B728a; Feil et al., 2005) and the sequence of a third strain has been deposited in GenBank (P. syringae pv. phaseolicola 1448A). As can be seen in Table 1, the P. syringae pv. syringae B728a genome consists of a single circular chromosome of 6·1 Mbp, with no plasmid, whilst the P. syringae pv. phaseolicola 1448A genome is slightly smaller (5·9 Mbp for the main chromosome) and carries two additional plasmids. It is beyond the scope of this article to compare the three P. syringae genomes, but Table 2 lists the number of sigma factors (Kiil et al., 2005) and the number of two-component regulatory systems for all eight Pseudomonas genomes that have been sequenced so far. It is worth noting that, although the number of ECF sigma factors in all three P. syringae genomes is only about half that found in other Pseudomonas species (e.g. 10 vs about 20), the number of histidine kinases is about the same, roughly 60, and the number of response regulators (85–89) in the three P. syringae genomes is also close to that found in the other Pseudomonas genomes.
While L. major probably has genes permitting it to invade white blood cells (Ivens et al., 2005), T. brucei can evade the immune system by creating a smokescreen of millions of molecules on its surface (Berriman et al., 2005); this is made by combining fragmented products of pseudogenes. Due to the huge genetic diversity, the sequencing of T. cruzi was especially difficult. Only with data from the other two genomes, combined with special build tools based on the defined nucleotide positions (DNP) method, was it possible for the T. cruzi genome to be assembled (El-Sayed et al., 2005b).
Method of the month: distributions of two-component transduction systems in bacterial genomes
Two-component signal transduction systems (sometimes simply referred to as two-component systems, here abbreviated to TCSs) are found in prokaryotes and in some eukaryotes. They provide an elegant mechanism for transducing extracellular signals. Each system is composed of a sensor histidine kinase (HK), which is typically membrane-spanning, and a response regulator (RR). When the HK receives a signal it autophosphorylates its HK domain; the phosphate group is then transferred to the receiver domain of the RR, thus activating the response regulator.
To recognize the RRs we used a profile hidden Markov model (HMM), downloaded from the Pfam database of protein families and HMMs (http://pfam.wustl.edu/), which targets the receiver domain. The HK domains are more diverse, and the Pfam database contains four different HMMs that recognize different classes. Table 2 details the HKs and RRs found in this month's bacterial genomes. For the Mycoplasma genomes, no TCSs are detected by our models. There are considerably more TCSs in the Pseudomonas genomes and, as can be seen from the table, this is generally true for all the Pseudomonas genomes sequenced so far. Compared to the phyla mean plotted in Fig. 1, the Candidatus Pelagibacter ubique, Psychrobacter arcticum and Staphylococcus haemolyticus genomes have very few TCSs. Of course, at this stage we do not really know what to expect, since most of the bacteria sequenced are still from a fairly narrow taxonomic range.
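The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline used for Table 2: the Pfam model names, the E-value cut-off and the layout of the hit records are all assumptions, since the text states only that one receiver-domain model and four HK models were used.

```python
from collections import defaultdict

# Assumed Pfam model names; the article does not name the models it used.
RR_MODELS = {"Response_reg"}
HK_MODELS = {"HisKA", "HisKA_2", "HisKA_3", "His_kinase"}


def count_tcs_proteins(hits, max_evalue=1e-3):
    """Count distinct HK and RR proteins per genome.

    hits: iterable of (genome, protein_id, model_name, evalue) tuples,
    e.g. parsed from the output of an HMM search (hypothetical layout).
    Returns {genome: (n_hk, n_rr)}.
    """
    hk = defaultdict(set)
    rr = defaultdict(set)
    for genome, protein, model, evalue in hits:
        if evalue > max_evalue:  # assumed significance cut-off
            continue
        if model in HK_MODELS:
            hk[genome].add(protein)
        elif model in RR_MODELS:
            rr[genome].add(protein)
    # Sets ensure a protein matched by several models is counted once.
    return {g: (len(hk[g]), len(rr[g])) for g in set(hk) | set(rr)}
```

A genome with no significant hits (as reported here for Mycoplasma) does not appear in the result, so a caller would treat a missing genome as (0, 0). A hybrid protein carrying both an HK domain and a receiver domain is counted once in each category under this scheme.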
In the Firmicutes we find no HKs or RRs in the genera Mycoplasma, Ureaplasma and Phytoplasma; although it is not clear why this might be, it is worth noting that all of these organisms lack a cell wall. If the TCSs are responsible for sensing changes in the external environment, perhaps they are not needed in constant environments that change little. The Firmicutes with the most TCSs are members of the Bacillus and Clostridium genera. In the Proteobacteria (the phylum with the most genomes) we also find organisms that lack TCSs completely, for example many of the endosymbionts with reduced genomes (i.e. there are no TCSs in Candidatus Blochmannia, Buchnera and Ehrlichia canis). Wigglesworthia glossinidia and Chlorochromatium aggregatum both lack HKs, but have a single RR. On the other hand, some Proteobacteria (often environmental organisms) have quite a large number of TCSs, for example Dechloromonas aromatica and the Pseudomonas species, as well as Bradyrhizobium, Geobacter, Burkholderia and Rhizobium species. The Spirochaetes have a very large variance, with only Leptospira at the high end; the rest (i.e. Treponema and Borrelia) have few TCSs.
Based on the results for the 250 genomes, it seems that the numbers of HKs and RRs are closely linked for most of the genomes, as can be seen in Fig. 1. In spite of the scale difference (larger scale for RRs), the ranges shown in the box and whisker plots look quite similar in both diagrams. Free-living bacteria have around 20 TCSs on average, although environmental bacteria can have considerably more. The Bacteroides have about 50 TCSs per genome, the largest mean number of TCSs for a phylum. It is hoped that the next 250 genomes sequenced will be more reflective of the biological diversity around us, and perhaps will help us to get a better grasp of what range of numbers to expect for a given phylum.
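The per-phylum comparison above amounts to grouping genome-level counts by phylum and summarizing each group. A minimal sketch, assuming one hypothetical (phylum, HK count, RR count) record per genome; the function name and record layout are illustrative and are not taken from the analysis behind Fig. 1:

```python
from collections import defaultdict
from statistics import mean, median


def summarise_by_phylum(records):
    """Summarize HK and RR counts per phylum.

    records: iterable of (phylum, hk_count, rr_count), one per genome
    (hypothetical layout).
    """
    grouped = defaultdict(list)
    for phylum, hk, rr in records:
        grouped[phylum].append((hk, rr))

    summary = {}
    for phylum, pairs in grouped.items():
        hks = [hk for hk, _ in pairs]
        rrs = [rr for _, rr in pairs]
        summary[phylum] = {
            "genomes": len(pairs),
            "hk_mean": mean(hks),
            "hk_median": median(hks),
            "hk_range": (min(hks), max(hks)),
            "rr_mean": mean(rrs),
            "rr_median": median(rrs),
            "rr_range": (min(rrs), max(rrs)),
        }
    return summary
```

Reporting the median and range alongside the mean matters for phyla with few sequenced genomes or a wide spread, where a single large genome can dominate the mean.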
Beyond simply counting the HKs and RRs in a given genome, the question remains as to whether all (or most) of the two-component systems are encoded by genes in the same operon, as classically described. In summary, we find that the majority of HKs and RRs are not in the same operon (e.g. within 2000 bp of each other, in the same direction), although most of the time they occur within clusters of 15 000 bp (see Fig. 2). In addition, there are some orphan RRs, which are found in isolated places throughout the genome, outside the clusters containing HKs and RRs.
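The distance criteria above can be expressed as a small classification routine. This is a sketch under stated assumptions: the Gene record, the gap-based definition of distance and the function names are illustrative, and the circularity of the chromosome is ignored. Only the 2000 bp and 15 000 bp thresholds and the same-direction requirement come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    kind: str    # "HK" or "RR"
    start: int   # leftmost coordinate (bp)
    end: int     # rightmost coordinate (bp)
    strand: str  # "+" or "-"


def gap(a, b):
    """Distance in bp between two genes; 0 if they overlap.

    Assumes a linear coordinate system (no wrap-around at the origin).
    """
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def classify_rr(rr, hks, operon_bp=2000, cluster_bp=15000):
    """Label an RR by its genomic context relative to the HKs."""
    # Same strand and close by: treated here as a putative operon.
    if any(h.strand == rr.strand and gap(rr, h) <= operon_bp for h in hks):
        return "operon"
    # Near an HK, but not operon-like: part of a wider cluster.
    if any(gap(rr, h) <= cluster_bp for h in hks):
        return "cluster"
    return "orphan"
```

Measuring the gap between gene boundaries rather than between start coordinates is a design choice; the text does not say which was used, and the choice shifts borderline pairs between categories.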
Acknowledgements
This work was supported by a grant from the Danish Center for Scientific Computing. We also thank the Sanger Centre (www.sanger.ac.uk/Projects/) and the Joint Genome Institute (http://genome.jgi-psf.org/finished_microbes/) for making their genome sequences and preliminary annotations available to the public.
REFERENCES
Berriman, M., Ghedin, E., Hertz-Fowler, C. & 99 other authors (2005). The genome of the African trypanosome Trypanosoma brucei. Science 309, 416–422.
Degnan, P. H., Lazarus, A. B. & Wernegreen, J. J. (2005). Genome sequence of Blochmannia pennsylvanicus indicates parallel evolutionary trends among bacterial mutualists of insects. Genome Res 15, 1023–1033.
El-Sayed, N. M., Myler, P. J., Blandin, G. & 42 other authors (2005a). Comparative genomics of trypanosomatid parasitic protozoa. Science 309, 404–409.
El-Sayed, N. M., Myler, P. J., Bartholomeu, D. C. & 79 other authors (2005b). The genome sequence of Trypanosoma cruzi, etiologic agent of Chagas disease. Science 309, 409–415.
Feil, H., Feil, W. S., Chain, P. & 17 other authors (2005). Comparison of the complete genome sequences of Pseudomonas syringae pv. syringae B728a and pv. tomato DC3000. Proc Natl Acad Sci U S A 102, 11064–11069.
Ivens, A. C., Peacock, C. S., Worthey, E. A. & 98 other authors (2005). The genome of the kinetoplastid parasite, Leishmania major. Science 309, 436–442.
Kiil, K., Binnewies, T. T., Sicheritz-Pontén, T., Willenbrock, H., Hallin, P. F., Wassenaar, T. M. & Ussery, D. W. (2005). Genome update: sigma factors in 240 bacterial genomes. Microbiology 151, 3147–3150.
Margulies, M., Egholm, M., Altman, W. E. & 53 other authors (2005). Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380.
Methé, B. A., Nelson, K. E., Deming, J. W. & 24 other authors (2005). The psychrophilic lifestyle as revealed by the genome sequence of Colwellia psychrerythraea 34H through genomic and proteomic analyses. Proc Natl Acad Sci U S A 102, 10913–10918.
Pennisi, E. (2005). Biochemistry. Cut-rate genomes on the horizon? Science 309, 862.
Shendure, J., Porreca, G. J., Reppas, N. B. & 7 other authors (2005). Accurate multiplex polony sequencing of an evolved bacterial genome. Science 309, 1728–1732.
Van Domselaar, G. H., Stothard, P., Shrivastava, S. & 7 other authors (2005). BASys: a web server for automated bacterial genome annotation. Nucleic Acids Res 33, W455–W459.
Vasconcelos, A. T., Ferreira, H. B., Bizarro, C. V. & 83 other authors (2005). Swine and poultry pathogens: the complete genome sequences of two strains of Mycoplasma hyopneumoniae and a strain of Mycoplasma synoviae. J Bacteriol 187, 5568–5577.