* School of Biotechnology & Biomolecular Sciences, University of New South Wales, Australia; Molecular & Computational Biology, University of Southern California, Los Angeles, California;
Stanford University School of Medicine, Stanford, California
Correspondence: E-mail: m.tanaka{at}unsw.edu.au.
Abstract |
---|
Key Words: Mycobacterium tuberculosis; molecular epidemiology; insertion sequence; transposition rate; regulation; Akaike information criterion
Introduction |
---|
In fact, many mechanisms of regulation have been described for various families of IS elements. One of the better-characterized elements in prokaryotes is IS10, in which several different mechanisms have been found to regulate transposition. These include the transcription of an antisense RNA that blocks translation of the transposase, dam-mediated methylation of the element, and the action of host factors IHF and HU (Kleckner et al. 1996). IS3 and IS911 are of particular interest here because they belong to the same group of IS elements as IS6110 (Fayet et al. 1990; McAdam et al. 1990). The IS3 transposase is inhibited by proteins (OrfA and OrfB) that are alternatively expressed through a single-base frameshift during translation of the IS3 message (Sekine, Eisaki, and Ohtsubo 1994). Similarly, it has been shown that IS911 produces a repressor that competes with the transposase (Haren et al. 2000). Additionally, IS911 makes use of alternative promoters (PIRL and Pjunc) to modulate transposition rates: the stronger promoter (Pjunc) is formed only transiently during transposition (Duval-Valentin et al. 2001). Taken together, these studies highlight the great diversity of mechanisms of regulation operating on IS elements. As it is likely that additional mechanisms have yet to be discovered, it is difficult to determine a priori which, if any, regulation mechanisms exist for a given element.
Although IS6110 is widely used as a genetic marker in the molecular epidemiology of tuberculosis, the details of transposition in IS6110 are poorly understood. However, some progress has been made by conducting manipulative experiments of IS6110 in the related species M. smegmatis. We now know that particular insertions can alter the expression of nearby genes (Safi et al. 2004), and that transposition rates depend on the genetic background and environmental factors: they are stimulated by the presence of nearby promoters (Wall et al. 1999) and by microaerobic exposure (Ghanekar et al. 1999).
In the epidemiological setting, it is important to know the rate at which the marker changes in vivo in order to make inferences about the speed of transmission of the infectious agent (Yeh et al. 1998; de Boer et al. 1999; Tanaka and Rosenberg 2001). It is also important to know if the rate of change is a function of other factors (e.g., Eilers et al. 2004). In the case of insertion sequences, if each copy of the element in a given genome acts independently, then the rate of change (the transposition rate, or more precisely, the substitution rate for IS element genotypic profiles) is a linear function of copy number (Rosenberg, Tsolaki, and Tanaka 2003; Tanaka and Rosenberg 2001). Departures from independence will be reflected in departures from linearity in this relationship.
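The linearity under independence can be checked with a small numerical sketch. It assumes, purely for illustration, that each copy transposes as an independent Poisson process with a hypothetical per-copy rate `r`; the name and the values below are not taken from the data.

```python
import math

def prob_no_change_independent(r, k, t):
    """Probability that none of k independent copies transposes within time t."""
    per_copy = math.exp(-r * t)   # a single copy stays in place
    return per_copy ** k          # all k copies stay in place (independence)

def prob_no_change_pooled(r, k, t):
    """Same probability from one Poisson process with pooled rate r * k."""
    return math.exp(-(r * k) * t)

# Hypothetical per-copy rate and sampling interval (illustrative units).
r, t = 0.01, 12.0
for k in (1, 5, 10, 20):
    a = prob_no_change_independent(r, k, t)
    b = prob_no_change_pooled(r, k, t)
    print(k, round(a, 6), round(b, 6))
```

The two columns agree for every copy number, which is the sense in which independent copies imply a rate that is linear in k.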
We offer a novel approach to the detection of control of IS elements, applicable to naturally occurring strains of pathogens. We statistically quantify the copy number control of the insertion sequence IS6110 in M. tuberculosis. Our method is potentially applicable to other IS elements and prokaryotes, although we have not found data sets of the appropriate kind.
Methods |
---|
In Figure 1, for the 303 serial isolates, we plot time intervals between repeated visits as a function of copy number. Serial samples that involve a change in the IS6110 fingerprint are distinguished from those that do not involve a change. It appears from this plot that strains of intermediate copy number (say 7 to 17 copies) are less stable than strains of low and high copy number. The apparent variation in stability across copy number is likely due at least in part to the heavier sampling of strains of intermediate copy number, as depicted in Figure 4b of Rosenberg, Tsolaki, and Tanaka (2003). One of our goals is to explore this issue quantitatively in order to resolve this ambiguity.
We start with a general model providing the probability of a fingerprint with a given copy number k changing within a patient in a given time period t. Let p be the set of parameters of a particular model.
We assume that negative selection results from lethal effects of transposition. We further assume that there is no cost to simply carrying the element, so that a mutation will be selectively neutral or advantageous in the cellular population within the host, provided it survived the transposition event. The negligible metabolic burden of carrying multiple copies of the element justifies this assumption. Let the probability that a mutant survives the effects of a transposition event be σ(k, p), and let the probability of a mutant reaching fixation given that it survives transposition be u. Then, if transposition follows a Poisson process with transposition rate g(k, p) per genome, substitution follows a marked Poisson process, and is therefore Poisson with rate g(k, p)σ(k, p)u. Note that both the transposition and selection functions may depend on copy number k. Henceforth we omit the parameter u and let it be subsumed by the transposition function so that the change rate (θ, to be described in more detail later) describes the overall substitution process.
Analogously to the model of Rosenberg, Tsolaki, and Tanaka (2003) with "change resolution" and "frequent sampling," the probability w of a change being observed during time interval t is

$$ w(k, t, p) = 1 - e^{-\theta(k, p)\, t} \qquad (1) $$

Letting G_i indicate whether the ith sample in the data corresponds to a changed fingerprint, and writing k_i and t_i for the copy number and time interval of that sample, the likelihood of the parameters given the data is

$$ L(p) = \prod_i w(k_i, t_i, p)^{G_i} \, [1 - w(k_i, t_i, p)]^{1 - G_i} \qquad (2) $$
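The likelihood calculation can be sketched in a few lines. This is a minimal illustration, assuming that the change probability takes the Poisson form 1 - exp(-θt) for a change rate θ and that each serial sample contributes one Bernoulli term; the function names, the linear rate model, and the toy samples are hypothetical and not drawn from the study.

```python
import math

def change_probability(theta, t):
    """P(at least one substitution in time t) for a Poisson process with rate theta."""
    return 1.0 - math.exp(-theta * t)

def log_likelihood(rate_fn, params, samples):
    """Bernoulli log-likelihood over serial samples.

    samples: iterable of (k, t, changed), with changed equal to 0 or 1.
    rate_fn(k, params): model-specific change rate for copy number k.
    """
    total = 0.0
    for k, t, changed in samples:
        w = change_probability(rate_fn(k, params), t)
        total += math.log(w) if changed else math.log(1.0 - w)
    return total

def linear_rate(k, params):
    """Hypothetical 'independent copies' model: rate proportional to copy number."""
    (a,) = params
    return a * k

# Toy data: (copy number, interval, changed?). Values are invented.
toy = [(5, 10.0, 0), (12, 30.0, 1), (8, 6.0, 0), (15, 24.0, 1)]
print(log_likelihood(linear_rate, (0.002,), toy))
```

Any other rate function of k and the parameters can be passed as `rate_fn`, so the same routine would serve every model in a comparison.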
Selecting from a Set of Models |
---|
Models of IS Element Change
We investigate four models of transposition. Each model may include or exclude selection against the element, and we explore two ways to model selection. The combinations of the transposition and selection models result in 4 × 3 = 12 different models of genetic marker change. We describe below the components of all of these models, and depict the relationships among all twelve models in Figure 2.
We now turn to the models of selection.
The various combinations of the models will also be labeled according to the numbering of the components given above. For example, the model involving sharing of transposase with copy interaction is labeled 2.3.
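The labeling scheme can be enumerated mechanically. The sketch below assumes only that transposition components are numbered 1 to 4 and selection components 1 to 3, with labels written as "transposition.selection"; apart from 2.3, the text does not state which index corresponds to which component, so no names are attached here.

```python
from itertools import product

N_TRANSPOSITION = 4   # four transposition models
N_SELECTION = 3       # no selection, plus two ways of modeling selection

# Build every "i.j" label, e.g. "2.3" for transposition model 2 with selection model 3.
labels = [
    f"{i}.{j}"
    for i, j in product(range(1, N_TRANSPOSITION + 1), range(1, N_SELECTION + 1))
]

print(len(labels))
print(labels)
```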
Model Selection and Statistics
As described earlier in the General Framework section, we use the various models established in the previous section to find the likelihood values given by equation (2). These likelihoods are then used to derive the AIC values for the different models. Table 1 shows the results of our model-selection analysis; using the estimates for the three best and two worst models, we plot the transposition function w in Figure 3. The model in which transposase is shared by all copies of the element in the genome, combined with negative selection against the element via copy interaction, best explains the data, with an Akaike weight of around 45%. Note that the top six models, with a total weight of 97%, all include selection in some form. Although the best two models produce very similar results, as shown in Figure 3, Sharing + Copy Interaction has the lower AIC value because it explains the data more parsimoniously (with one fewer parameter).
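For readers unfamiliar with the procedure, AIC values and Akaike weights follow from the maximized log-likelihoods by the standard formulas AIC = -2 ln L + 2K and weight proportional to exp(-ΔAIC/2). The sketch below uses invented log-likelihoods and parameter counts; the model names "A", "B", and "C" are placeholders, not entries of Table 1.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion for a fitted model."""
    return -2.0 * log_lik + 2.0 * n_params

def akaike_weights(aic_values):
    """Normalized relative likelihoods exp(-delta/2), with delta = AIC - min(AIC)."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical (log-likelihood, number of parameters) pairs.
fits = {"A": (-100.0, 2), "B": (-99.8, 3), "C": (-104.0, 1)}
aics = {name: aic(ll, k) for name, (ll, k) in fits.items()}
weights = dict(zip(aics, akaike_weights(list(aics.values()))))

for name in fits:
    print(name, aics[name], round(weights[name], 3))
```

In this toy case model B has a slightly higher likelihood than model A but one extra parameter, so A has the lower AIC. This is the same kind of trade-off described above for the two best models.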
Hypothesis Testing
Hypotheses regarding whether one of a pair of models has a "significantly better" fit than the other can readily be tested. Because there is some nesting of the models (as shown in Figure 2), we can use likelihood ratio tests (LRTs) to compare certain pairs of models. The test statistic values for LRTs can be extracted from the AIC values in Table 1. Table 2 shows the results of all 13 tests. In all but four tests, the more complex model was a significant refinement of the simpler model.
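Because AIC = -2 ln L + 2K, the LRT statistic for a nested pair is 2(ln L_complex - ln L_simple) = AIC_simple - AIC_complex + 2(K_complex - K_simple). The sketch below applies this identity and compares the statistic with a chi-square distribution whose degrees of freedom equal the difference in parameter count. That reference distribution is an assumption here (it presumes the usual regularity conditions), and the AIC values in the example are invented.

```python
import math

def lrt_statistic(aic_simple, k_simple, aic_complex, k_complex):
    """2 * (lnL_complex - lnL_simple), recovered from AIC = -2 lnL + 2K."""
    return (aic_simple - aic_complex) + 2.0 * (k_complex - k_simple)

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses Q(1/2, h) = erfc(sqrt(h)), Q(1, h) = exp(-h), and the recurrence
    Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1), where h = x / 2.
    """
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(h))
    else:
        a, q = 1.0, math.exp(-h)
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Hypothetical nested pair: simple model (K = 1) versus complex model (K = 2).
stat = lrt_statistic(210.0, 1, 204.0, 2)
p_value = chi2_sf(stat, df=1)
print(stat, p_value)
```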
Discussion |
---|
Notably, the best six models in Table 1 all involve negative selection against the element. Even if copy interaction is not the actual cause of selection, it is likely that some form of negative selection is acting on the element. It is also interesting that the best two models both involve transposase sharing. This supports the hypothesis that separate copies of the element do not operate independently. In agreement with the findings of van der Spuy et al. (2003), the current evidence suggests that the linear model of transposition (Independent) should be replaced for IS6110; our analysis offers some alternatives.
Transposition Rate as a Function of Copy Number
This study concerns transposition rates as functions of copy number, a topic of relevance to molecular epidemiology and molecular evolution. However, the body of research on molecular mechanisms regulating transposition rate does not often consider the effect of other copies. If the molecular mechanisms are to be effective in regulating the expansion of insertion sequences, they should preferentially lower transposition rates in high-copy strains. If molecular mechanisms simply slow down the transposition rate regardless of copy number, the uncontrolled proliferation of copies may be temporarily retarded, but in the long run, the copy numbers may increase to extreme levels.
It is also important to consider transposition rates changing as a function of genetic and environmental factors. For instance, there may be location effects, in which rates depend on insertion position in the genome, or transposition may occur as a "stress response." Another way to put this is that there could be heterogeneities in the rate over space and time. It is possible that the isolates of intermediate copy number in our sample represent a set of strains that are predisposed to change.
The examination of the mobility of IS elements as a function of copy number raises the question of what will happen to the population of insertion sequences in a species in the long term. Will IS6110 go extinct in the long term or will it persist? The persistence of IS6110 may be allowed by a balance between element replication and negative selection against copies, as suggested by our analysis. There may be occasional beneficial effects produced by the element. Although the element probably does not move between bacterial cells at a pace rapid enough to escape its destructive effects, the long-term rate of (possibly trans-specific) horizontal transfer may be sufficient to ensure survival (Bergstrom, Lipsitch, and Levin 2000). The extinction of IS6110 is also a possible long-term outcome. There is no a priori reason to expect a family of IS elements to evolve strategies to create "safe" equilibrium distributions of copy number. The peaked distributions of IS6110 copy number suggest that the dynamics of the element are out of equilibrium, which may reflect a transient presence of the element in M. tuberculosis (Tanaka et al. 2000).
It is not known whether elements other than IS6110 follow the same process of copy number control suggested by our analysis, but the analysis could easily be adapted for other data sets. It should be possible to design experiments using well-characterised elements and host species (e.g., IS3 or IS10 in E. coli) to study a range of alternative models as done here.
Genomic Conflict: Something to Fight About?
Two main alternative views exist about the relationship between IS elements (and other mobile genes) and the rest of the genome, which can be discussed in terms of the metaphor of genomic conflict. First, insertion sequences might be selfish, implying that they replicate within genomes despite causing deleterious effects in the host genome. According to this metaphor, it is in the evolutionary interest of the insertion sequence to increase its replication rate, whereas the genome should do the opposite, namely, down-regulate the rate of transposition. A second and opposing view is that the genome is a well-coordinated system that has resolved most conflicts or inefficiencies. That is, insertion sequences have a role in the genome to produce beneficial effects aligned with the interests of the rest of the genome.
Although the results of this study favor the first view, the two views are not mutually exclusive. The evolution of mechanisms that regulate copy number effectively would benefit both host and element in the case of organisms that undergo little genetic exchange, such as M. tuberculosis. Furthermore, it is likely that insertion sequences, like all mutation rate modifiers, produce both beneficial and deleterious effects as they undergo transposition (Chao et al. 1983). In the context of pathogenic bacteria, an important example of the adaptive role of insertion sequences is their complicity in the acquisition of antibiotic resistance genes and virulence factors. As the workings of bacterial genomes are unraveled, we will need to assess the role of IS elements: how they affect genome organization and give rise to genetic innovation.
Molecular Epidemiology and IS Elements
Insertion sequences have been widely exploited for genotyping bacterial pathogens, many of which have little variation at individual nucleotides. The mycobacterial insertion sequence IS6110 exhibits great variability in both copy number and genomic location (Hermans et al. 1990; McAdam et al. 1990; Stanley and Saunders 1996), making it a valuable tool for studying tuberculosis. IS6110-based genotyping is the most widely used method in molecular epidemiologic studies, which have provided fundamental insights into the contemporary transmission and pathogenesis of tuberculosis (Small et al. 1994).
In order to use any genetic marker rationally, however, we must know something about its underlying biology. For example, if a marker changes very slowly, clusters of identical genotypes overestimate the severity of disease transmission, whereas if it evolves very fast, clusters will differentiate quickly and an outbreak may be underestimated.
In the analysis of clusters of IS6110-based genotypes, it is important to recognise that strains with different copy numbers evolve at different rates. This study demonstrates statistically that strains with intermediate copy numbers (7 to 17) are substantially less stable than strains of low and high copy numbers. Thus, for intermediate copy numbers, more permissive definitions of clusters might be used.
Insertion Sequences and Error Catastrophe
We tentatively raise an intriguing medical implication following from an understanding of IS element control. Our analysis demonstrates the presence of negative selection against IS6110 increasing with copy number. If it is possible to increase the rate of transposition, sufficient damage may be caused to the genome to lead to the demise of the bacterial host. Hence, it may be possible to develop a drug treatment that targets IS6110 by interfering with its regulation within M. tuberculosis. One advantage of such a drug for tuberculosis would be its specificity to bacterial transposition. A potential difficulty, as with many antibacterials, is the probable evolution of resistance.
Although a treatment of this kind is not likely to be soon attainable, there is a precedent for this idea in anti-viral therapy. Ribavirin works by elevating the mutation rate beyond the "catastrophe threshold" such that the viral population is no longer viable (Crotty et al. 2000). Also related is the phenomenon of hybrid dysgenesis in Drosophila caused by P elements (Kidwell, Kidwell, and Sved 1977), in which the removal of transposase repression leads to elevated transposition rates and consequently to deleterious effects to the genome.
Appendix A: Standard Errors |
---|
The (a, b)th element of the information matrix for a single observation is given by
![]() | (3) |
![]() | (4) |
![]() | (5) |
The Fisher information matrix I is computed by constructing the matrix with elements (a, b) given by the sum over all data points:
![]() | (6) |
The variance-covariance matrix is the inverse of this matrix, which can readily be evaluated numerically.
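As a concrete and deliberately simplified illustration, consider a one-parameter model in which the change rate is a·k and each observation is Bernoulli with change probability w = 1 - exp(-a·k·t). For such a model the standard expected information per observation is (∂w/∂a)² / [w(1 - w)], and the matrix inverse reduces to a reciprocal. The parameter name `a`, the model form, and the sample values are assumptions made for this sketch and are not the fitted quantities of the study.

```python
import math

def fisher_information_linear(a, samples):
    """Expected information for a in the hypothetical model theta = a * k.

    With w = 1 - exp(-a*k*t) and dw/da = k*t*exp(-a*k*t), each Bernoulli
    observation contributes (dw/da)**2 / (w * (1 - w)); contributions are
    summed over all (k, t) pairs in samples.
    """
    info = 0.0
    for k, t in samples:
        e = math.exp(-a * k * t)
        w = 1.0 - e
        dw = k * t * e
        info += dw * dw / (w * (1.0 - w))
    return info

def standard_error(a, samples):
    """Square root of the inverse information (scalar case)."""
    return 1.0 / math.sqrt(fisher_information_linear(a, samples))

# Invented (copy number, interval) pairs and parameter value.
samples = [(5, 10.0), (12, 30.0), (8, 6.0), (15, 24.0)]
print(standard_error(0.002, samples))
```

With several parameters, the same per-observation terms would fill a matrix that is summed over data points and then inverted numerically.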
References |
---|
Bergstrom, C. T., M. Lipsitch, and B. R. Levin. 2000. Natural selection, infectious transfer and the existence conditions for bacterial plasmids. Genetics 155:1505–1519.
Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. Springer-Verlag, New York.
Calabrese, P., and R. Durrett. 2003. Dinucleotide repeats in the Drosophila and human genomes have complex, length-dependent mutation processes. Mol. Biol. Evol. 20:715–725.
Chao, L., C. Vargas, B. B. Spear, and E. C. Cox. 1983. Transposable elements as mutator genes in evolution. Nature 303:633–635.
Charlesworth, B., and D. Charlesworth. 1983. The population dynamics of transposable elements. Genet. Res. 42:1–27.
Crotty, S., D. Maag, J. Arnold, W. Zhong, J. Lau, Z. Hong, R. Andino, and C. Cameron. 2000. The broad-spectrum antiviral ribonucleoside ribavirin is an RNA virus mutagen. Nat. Med. 6:1375–1379.
De Boer, A. S., M. W. Borgdorff, P. E. W. de Haas, N. J. D. Nagelkerke, J. D. A. van Embden, and D. van Soolingen. 1999. Analysis of rate of change of IS6110 RFLP patterns of Mycobacterium tuberculosis based on serial patient isolates. J. Infect. Dis. 180:1238–1244.
Doolittle, W. F., and C. Sapienza. 1980. Selfish genes, the phenotype paradigm and genome evolution. Nature 284:601–603.
Duval-Valentin, G., C. Normand, V. Khemici, B. Marty, and M. Chandler. 2001. Transient promoter formation: a new feedback mechanism for regulation of IS911 transposition. EMBO J. 20:5802–5811.
Eilers, P. H. C., D. V. Soolingen, N. T. N. Lan, R. M. Warren, and M. W. Borgdorff. 2004. Transposition rates of Mycobacterium tuberculosis IS6110 restriction fragment length polymorphism patterns. J. Clin. Microbiol. 42:2461–2464.
Fayet, O., P. Ramond, P. Polard, M. F. Prere, and M. Chandler. 1990. Functional similarities between retroviruses and the IS3 family of bacterial insertion sequences? Mol. Microbiol. 4:1771–1777.
Ghanekar, K., A. McBride, O. Dellagostin, S. Thorne, R. Mooney, and J. McFadden. 1999. Stimulation of transposition of the Mycobacterium tuberculosis insertion sequence IS6110 by exposure to a microaerobic environment. Mol. Microbiol. 33:982–993.
Gray, Y. H. M., M. M. Tanaka, and J. A. Sved. 1996. P-element-induced recombination in Drosophila melanogaster: hybrid element insertion. Genetics 144:1601–1610.
Haren, L., C. Normand, P. Polard, R. Alazard, and M. Chandler. 2000. IS911 transposition is regulated by protein-protein interactions via a leucine zipper motif. J. Mol. Biol. 296:757–768.
Hermans, P. W., D. van Soolingen, J. W. Dale, A. R. Schuitema, R. A. McAdam, D. Catty, and J. D. van Embden. 1990. Insertion element IS986 from Mycobacterium tuberculosis: a useful tool for diagnosis and epidemiology of tuberculosis. J. Clin. Microbiol. 28:2051–2058.
Kidwell, M. G., J. F. Kidwell, and J. A. Sved. 1977. Hybrid dysgenesis: a syndrome of aberrant traits including mutation, sterility and male recombination. Genetics 86:813–833.
Kleckner, N., R. M. Chalmers, D. Kwon, J. Sakai, and S. Bolland. 1996. Tn10 and IS10 transposition and chromosome rearrangements: mechanism and regulation in vivo and in vitro. Curr. Top. Microbiol. Immunol. 204:49–82.
Langley, C. H., E. Montgomery, R. Hudson, N. Kaplan, and B. Charlesworth. 1988. On the role of unequal exchange in the containment of transposable element copy number. Genet. Res. 52:223–235.
McAdam, R. A., P. W. Hermans, D. van Soolingen, Z. F. Zainuddin, D. Catty, J. D. van Embden, and J. W. Dale. 1990. Characterization of a Mycobacterium tuberculosis insertion sequence belonging to the IS3 family. Mol. Microbiol. 4:1607–1613.
Niemann, S., E. Richter, and S. Rusch-Gerdes. 1999. Stability of Mycobacterium tuberculosis IS6110 restriction fragment length polymorphism patterns and spoligotypes determined by analyzing serial isolates from patients with drug-resistant tuberculosis. J. Clin. Microbiol. 37:409–412.
Orgel, L. E., and F. H. C. Crick. 1980. Selfish DNA: the ultimate parasite. Nature 284:604–607.
Rosenberg, N. A., A. G. Tsolaki, and M. M. Tanaka. 2003. Estimating change rates of genetic markers using serial samples: applications to the transposon IS6110 in Mycobacterium tuberculosis. Theor. Popul. Biol. 63:347–363.
Safi, H., P. F. Barnes, D. L. Lakey, H. Shams, B. Samten, R. Vankayalapati, and S. T. Howard. 2004. IS6110 functions as a mobile, monocyte-activated promoter in Mycobacterium tuberculosis. Mol. Microbiol. 52:999–1012.
Sawyer, S., D. E. Dykhuizen, R. F. DuBose, L. Green, T. Mutagadura-Mhlanga, D. F. Wolczyk, and D. L. Hartl. 1987. Distribution and abundance of insertion sequences among natural isolates of Escherichia coli. Genetics 115:51–63.
Sawyer, S., and D. Hartl. 1986. Distribution of transposable elements in prokaryotes. Theor. Popul. Biol. 30:1–16.
Sekine, Y., N. Eisaki, and E. Ohtsubo. 1994. Translational control in production of transposase and in transposition of insertion sequence IS3. J. Mol. Biol. 235:1406–1420.
Small, P. M., P. C. Hopewell, S. P. Singh, A. Paz, J. Parsonnet, D. C. Ruston, G. F. Schecter, C. L. Daley, and G. K. Schoolnik. 1994. The epidemiology of tuberculosis in San Francisco: A population-based study using conventional and molecular methods. N. Engl. J. Med. 330:1703–1709.
Stanley, J., and N. Saunders. 1996. DNA insertion sequences and the molecular epidemiology of Salmonella and Mycobacterium. J. Med. Microbiol. 45:236–251.
Tanaka, M. M., and N. A. Rosenberg. 2001. Optimal estimation of transposition rates of insertion sequences for molecular epidemiology. Stat. Med. 20:2409–2420.
Tanaka, M. M., P. M. Small, H. Salamon, and M. W. Feldman. 2000. The dynamics of repeated elements: applications to the epidemiology of tuberculosis. Proc. Natl. Acad. Sci. U. S. A. 97:3532–3537.
Van der Spuy, G. D., R. M. Warren, M. Richardson, N. Beyers, M. A. Behr, and P. D. van Helden. 2003. Use of genetic distance as a measure of ongoing transmission of Mycobacterium tuberculosis. J. Clin. Microbiol. 41:5640–5644.
Van Embden, J. D. A., M. D. Cave, J. T. Crawford, J. W. Dale, K. D. Eisenach, B. Gicquel, P. Hermans, C. Martin, R. McAdam, T. M. Shinnick, et al. 1993. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology. J. Clin. Microbiol. 31:406–409.
Wall, S., K. Ghanekar, J. McFadden, and J. W. Dale. 1999. Context-sensitive transposition of IS6110 in Mycobacteria. Microbiology 145(Pt 11):3169–3176.
Yeh, R. W., A. Ponce De Leon, C. B. Agasino, J. A. Hahn, C. L. Daley, P. C. Hopewell, and P. M. Small. 1998. Stability of Mycobacterium tuberculosis DNA genotypes. J. Infect. Dis. 177:1107–1111.