Discrimination of Mycobacterium tuberculosis complex bacteria using novel VNTR-PCR targets

Robin A. Skuce¹, Thomas P. McCorry², Julie F. McCarroll¹, Solvig M. M. Roring², Alistair N. Scott², David Brittain¹, Stephen L. Hughes³, R. Glyn Hewinson³ and Sydney D. Neill¹,²

Veterinary Sciences Division, Department of Agriculture and Rural Development, Stormont, Belfast BT4 3SD, UK¹
The Queen’s University of Belfast, Department of Veterinary Science, Stormont, Belfast BT4 3SD, UK²
Veterinary Laboratories Agency, Department for Environment, Food and Rural Affairs, Weybridge, Surrey KT15 3NB, UK³

Author for correspondence: Robin A. Skuce. Tel: +44 28 90 525771. Fax: +44 28 90 525745. e-mail: Robin.Skuce@dardni.gov.uk


   ABSTRACT
The lack of a convenient high-resolution strain-typing method has hampered the application of molecular epidemiology to the surveillance of bacteria of the Mycobacterium tuberculosis complex, particularly the monitoring of strains of Mycobacterium bovis. With the recent availability of genome sequences for strains of the M. tuberculosis complex, novel PCR-based M. tuberculosis-typing methods have been developed, which target the variable-number tandem repeats (VNTRs) of minisatellite-like mycobacterial interspersed repetitive units (MIRUs), or exact tandem repeats (ETRs). This paper describes the identification of seven VNTR loci in M. tuberculosis H37Rv, the copy number of which varies in other strains of the M. tuberculosis complex. Six of these VNTRs were applied to a panel of 100 different M. bovis isolates, and their discrimination and correlation with spoligotyping and an established set of ETRs were assessed. The number of alleles varied from three to seven at the novel VNTR loci, which differed markedly in their discrimination index. There was positive correlation between spoligotyping, ETR- and VNTR-typing. VNTR-PCR discriminates well between M. bovis strains. Thirty-three allele profiles were identified by the novel VNTRs, 22 for the ETRs and 29 for spoligotyping. When VNTR- and ETR-typing results were combined, a total of 51 different profiles were identified. Digital nomenclature and databasing were intuitive. VNTRs were located both in intergenic regions and annotated ORFs, including PPE (novel glycine-asparagine-rich) proteins, a proposed source of antigenic variation, where VNTRs potentially code repeating amino acid motifs. VNTR-PCR is a valuable tool for strain typing and for the study of the global molecular epidemiology of the M. tuberculosis complex. The novel VNTR targets identified in this study should further increase the power of this approach.

Keywords: bovine tuberculosis, Mycobacterium bovis, tandem repeat DNA, spoligotyping, molecular epidemiology

Abbreviations: DR, direct repeat; ETR, exact tandem repeat; HGDI, Hunter–Gaston discrimination index; MIRU, mycobacterial interspersed repetitive units; PPE, novel glycine-asparagine-rich; QUB, Queen’s University Belfast; UPGMA, unweighted pair group method with arithmetic means; VNTR, variable-number tandem repeat


   INTRODUCTION
Bovine tuberculosis, caused by Mycobacterium bovis, is considered to be a barrier to trade and continues to pose a significant problem for the agricultural economies of many countries (Clifton-Hadley & Wilesmith, 1995). Progress towards eradication of this disease has been hampered by a lack of precise epidemiological data. Epidemiology suggests that bovine tuberculosis is multifactorial. Key issues that need to be addressed include quantifying the relative importance of interbovine transmission and the role of wild and feral animals in disease maintenance and transmission, as well as husbandry and environmental factors (Goodchild & Clifton-Hadley, 2001).

In recent years molecular epidemiology, i.e. the integration of robust strain-typing procedures with conventional epidemiological traceback approaches, has produced more specific data with which to inform, influence and monitor control and surveillance strategies for Mycobacterium tuberculosis (van Soolingen et al., 1999). Strain typing has often challenged accepted dogmas (van Helden, 1998) and has been used to investigate important biological properties of strains (Kato-Maeda et al., 2001). The recent integration of molecular epidemiology and mathematical modelling offers the potential to quantify the risks posed by different subpopulations of the community (Borgdorff et al., 2000).

M. bovis belongs to the M. tuberculosis complex and has an extremely wide host range (O’Reilly & Daborn, 1995). Despite demonstrable phenotypic differences, members of the M. tuberculosis complex possess a remarkably high degree of genetic identity (Domenech et al., 2001). They are rich in repetitive DNA (Cole et al., 1998), a feature which has been exploited for molecular typing. Restriction fragment, or restriction enzyme, analysis has been successfully applied to the molecular epidemiology of M. bovis, most notably in New Zealand (Collins, 1998). The restriction fragment length polymorphism (RFLP) analysis technique, which exploits repetitive DNA elements for use as probes in Southern blotting, has proven to be a highly discriminatory tool (Skuce et al., 1996). However, this technique is cumbersome and technically demanding, not least in the analysis, nomenclature and databasing of complex banding patterns (Heersma et al., 1998). Most M. bovis isolates, particularly those of bovine origin, harbour only one or a few copies of IS6110. Therefore, the accepted IS6110-RFLP protocol agreed for M. tuberculosis (van Embden et al., 1993) is not appropriate for M. bovis. To identify M. bovis strains, additional discrimination is required with further RFLP procedures, such as PGRS-RFLP analysis and direct repeat (DR)-RFLP analysis (Skuce et al., 1996), or alternatives such as pUCD probing (O’Brien et al., 2000a). However, these are not ideally suited to inter-laboratory typing studies.

Spoligotyping is based on the detection of DNA polymorphisms within the DR cluster (Groenen et al., 1993; van Embden et al., 2000), which is specific to the M. tuberculosis complex. The number of DR elements in the cluster can vary between strains of the M. tuberculosis complex. These 36 bp DRs are interspersed with non-repetitive DNA spacers of 36–41 bp. Spacers have been sequenced, 37 from M. tuberculosis H37Rv and six from M. bovis BCG, synthesized as oligonucleotides and immobilized on a nylon membrane. Isolates are strain-typed on the basis of detecting the presence, or absence, of specific spacers using PCR and a reverse-line cross-blot technique (Kamerbeek et al., 1997). However, spoligotyping was found to be less discriminatory than RFLP analysis (Roring et al., 1998) for M. bovis isolates.

Tandem repeat loci, similar to eukaryotic minisatellites, have been identified in M. tuberculosis. These so-called variable-number tandem repeats (VNTRs) often differ in copy number between isolates (Frothingham & Meeker-O’Connell, 1998). During the preparation of this manuscript several groups have described, classified and analysed tandem repeat loci within the available genome sequences of the M. tuberculosis complex (Supply et al., 2000; Smittipat & Palittapongarnpim, 2000). Structures consisting of 40–100 bp repetitive sequences, called mycobacterial interspersed repetitive units (MIRUs; Magdalena et al., 1998a, b; Supply et al., 1997, 2000), were found scattered in 41 locations in the M. tuberculosis H37Rv chromosome; twelve were polymorphic in MIRU copy number between isolates. These novel targets offer the potential for the development of high-resolution, convenient and high-throughput typing methods. The key advantages of VNTR-typing are already evident for M. tuberculosis (Mazars et al., 2001). VNTR–MIRU-typing is PCR-based, and the typing data produced are numerical and easily managed; the data are also applicable to the global molecular epidemiology of the M. tuberculosis complex (Mazars et al., 2001). VNTR–MIRU-typing nomenclature is informative inasmuch as it is based on the repeat copy number at specific loci. This allows the relatedness of isolates to be calculated. Identification of further VNTR–MIRU loci should allow this approach to be extended to M. bovis, which has proven difficult to type by existing methods (Cousins et al., 1998; Skuce & Neill, 2001).

In this study, novel VNTR loci were identified by a bioinformatics approach using the available genome sequences for the M. tuberculosis complex. These were applied to a panel of M. bovis isolates from various animal sources and European locations, and to representative reference isolates of the M. tuberculosis complex. Results were assessed for their discriminatory power, and for the correlation of calculated genetic distances with an established panel of exact tandem repeats (ETRs; ETR-A to ETR-E; Frothingham & Meeker-O’Connell, 1998) and spoligotyping.


   METHODS
Isolates, culture and DNA extraction.
DNA was extracted from a test panel of 100 M. bovis isolates collected from partner laboratories in a multicentre M. bovis genotyping project (EU SMT4 CT96 2097) (Skuce et al., 1998). These isolates differed in geographical, temporal and host species origin. Representative cattle isolates from Northern Ireland (NI) (n=36), the Republic of Ireland (ROI) (n=15), England (n=8) and Spain (n=9) were cultured from lymph-node tissues. Single isolates originating from badgers in NI (n=2), the ROI (n=10) and England (n=1) were also included. Deer isolates from the ROI (n=3) and Sweden (n=1), goat isolates from Spain (n=10) and the ROI (n=1), and four single isolates from the ROI originating from a dog, sheep, pig and human, respectively, were also included in the study. Isolates were coded, so that their identities were not disclosed until after the analysis was complete. A total of 21 isolates were included from six bovine tuberculosis outbreaks (A–F) in Ireland. The remaining 79 isolates had no known epidemiological association. Reference subspecies of the M. tuberculosis complex were also typed by VNTR-PCR (Table 3). M. tuberculosis H37Rv, Mycobacterium microti NCTC 5710 and M. bovis BCG NCTC 5692 were obtained from the National Collection of Type Cultures, Colindale, UK. M. tuberculosis CDC1551 DNA was kindly provided by Colorado State University. Mycobacterium africanum ATCC 25420 was obtained from the American Type Culture Collection, Manassas, VA, USA. M. bovis BCG P3 and M. tuberculosis RIVM 14323 (Heersma et al., 1998) were obtained from the National Institute of Public Health and the Environment (RIVM, Bilthoven, the Netherlands). M. bovis AF2122/97 was obtained from the Veterinary Laboratories Agency (Weybridge, UK). A loopful of each isolate was harvested into 50 µl TE buffer (10 mM Tris/HCl, pH 8·0; 1 mM EDTA) in screw-cap 1·5 ml microcentrifuge tubes and boiled. After microcentrifugation (10 s), the supernatant was transferred to fresh tubes. DNA concentration was estimated using a Hoefer TKO fluorimeter (Hoefer Scientific Instruments). DNA was diluted with TE buffer to a final concentration of 10 ng µl⁻¹ and stored at −20 °C.

Bioinformatics.
The ‘lookup’ program of the SEQNET computing service, now merged with the Human Genome Mapping Project Resource Centre (http://www.hgmp.mrc.ac.uk), was used to interrogate the available annotation of the M. tuberculosis H37Rv sequence contigs (as of January 1998, http://www.sanger.ac.uk) for the keyword ‘repeat’. Cosmid sequences were downloaded and scanned manually for perfect, or near-perfect, tandem repeat loci in the range 20–150 bp. VNTR loci were located and analysed on the M. tuberculosis H37Rv sequence, and displayed using the Artemis computer program (http://www.sanger.ac.uk). Genome sequences for M. tuberculosis CDC1551 and M. bovis AF2122/97 were searched at The Institute for Genomic Research (http://www.tigr.org) and the Sanger Centre (http://www.sanger.ac.uk), respectively. Orthologous VNTR loci were identified in these sequences. PCR genotyping data were recorded in Microsoft Excel 97 and analysed using the programs GelCompar version 4.0 and BioNumerics 2.1 (Applied Maths).

VNTR-PCR.
VNTR-PCR primers (Table 2) were designed to anneal upstream and downstream of each tandem-repeat locus. PCR was performed in a total volume of 60 µl containing ~10 ng template DNA and 45 µl PCR Supermix (Gibco-BRL Life Technologies) that contained 22 mM Tris/HCl pH 8·4, 55 mM KCl, 1·65 mM MgCl₂, 220 µM of each of the four dNTPs and 1·1 U Taq DNA polymerase; 20 µM of each primer was used. PCRs were run in DNA thermal cyclers (model 480; Perkin Elmer) under the following conditions: 95 °C for 12 min, 40 cycles of 94 °C for 30 s, 60 °C for 1 min and 72 °C for 2 min, followed by a final extension at 72 °C for 7 min. Isolates were amplified using primers for ETR-A through ETR-E (Frothingham & Meeker-O’Connell, 1998) according to the authors’ recommendations. PCR products were analysed by agarose gel electrophoresis, using 100 bp Stepladders (Promega) and 20 bp (FMC) DNA ladders, and visualized by ethidium bromide staining. Product sizes were estimated and the exact number of complete repeats present was calculated using a derived allele-naming table, based on the number of complete repeats which could theoretically be present in a PCR product of a given size, allowing for extra flanking nucleotides and primer size. Loci were named simply on the basis of the order in which they were found by the initial search. VNTR allele calls were entered and manipulated in BioNumerics as ‘character’ data. Composite datasets were created for the results of the six Queen’s University Belfast (QUB)-VNTRs (QUB-5, 11a, 11b, 18, 23 and 26) and the five ETRs (ETR-A, B, C, D and E). Distance trees were derived by clustering with the unweighted pair group method with arithmetic means (UPGMA), using ‘categorical’ character table values.
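The allele-naming step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' allele-naming table: the function name `call_allele`, the `flank_bp` parameter (primers plus flanking nucleotides) and the 0·5 threshold for counting a large partial repeat as a whole repeat are all hypothetical. The example uses the ETR-C repeat geometry quoted in the Results (three 58 bp repeats plus a 37 bp partial repeat) together with an invented 100 bp of flanking sequence.

```python
def call_allele(product_bp: float, repeat_bp: int, flank_bp: int,
                partial_threshold: float = 0.5) -> int:
    """Estimate the repeat copy number from an estimated PCR product size.

    flank_bp is the combined length of primers and flanking nucleotides
    (hypothetical parameter); partial_threshold is an assumed cut-off above
    which a trailing partial repeat is counted as a whole repeat.
    """
    repeat_region = product_bp - flank_bp
    if repeat_region < 0:
        raise ValueError("product is shorter than the flanking sequence")
    whole, remainder = divmod(repeat_region, repeat_bp)
    # A partial repeat covering a large share of the unit counts as one repeat.
    if remainder / repeat_bp >= partial_threshold:
        whole += 1
    return int(whole)


# ETR-C-like geometry: 3 x 58 bp + 37 bp partial repeat, plus 100 bp
# of invented flanking sequence, is called as allele 4.
etr_c_call = call_allele(product_bp=100 + 3 * 58 + 37, repeat_bp=58, flank_bp=100)
```

Because gel-estimated sizes are approximate, a real table would also need a tolerance around each expected product size; that refinement is omitted here.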

Spoligotyping.
The M. bovis test panel (n=100) was spoligotyped as described by Kamerbeek et al. (1997). DNA (~10 ng) was prepared from boiled cells and amplified by PCR using the primers DRa and DRb. The resultant spoligotype patterns were recorded and analysed using GelCompar. Spoligotype data were entered and manipulated in the BioNumerics package as ‘character’ data. The Dice coefficient was used to plot dendrograms using UPGMA. Spoligotypes were determined for some of the reference isolates of the M. tuberculosis complex.
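For readers unfamiliar with the Dice coefficient, the following sketch shows how it scores two spoligotype patterns encoded as presence/absence strings. It is an illustration only: the function name and the four-spacer toy patterns are hypothetical (real patterns cover the full spacer set), and the source does not describe how BioNumerics implements the coefficient internally.

```python
def dice_similarity(pattern_a: str, pattern_b: str) -> float:
    """Dice coefficient for two spoligotypes written as '1'/'0' strings,
    where '1' means the spacer was detected and '0' means it was absent."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must cover the same set of spacers")
    present_a = pattern_a.count("1")
    present_b = pattern_b.count("1")
    if present_a + present_b == 0:
        return 1.0  # two empty patterns are treated as identical
    # Spacers detected in both isolates.
    shared = sum(1 for x, y in zip(pattern_a, pattern_b) if x == y == "1")
    return 2 * shared / (present_a + present_b)


# Toy four-spacer example: two shared spacers, three and two spacers present.
toy_similarity = dice_similarity("1101", "1001")  # 2*2 / (3+2) = 0.8
```

Note that the Dice coefficient ignores spacers absent from both patterns, so shared absences do not inflate similarity.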

Allelic diversity and discrimination.
The Hunter–Gaston equation (Hunter & Gaston, 1988), an application of Simpson’s index of diversity (Simpson, 1949), was used to calculate the allelic diversity, or the ‘Hunter–Gaston discrimination index’ (HGDI), at each locus. The equation reads:

\[ D = 1 - \frac{1}{N(N-1)} \sum_{j=1}^{N} a_j \]

where D is the index of discriminatory power, a_j is the number of strains in the population which are indistinguishable from the jth strain, and N is the number of strains in the population (Struelens et al., 1996).
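As a worked illustration of the index defined above, the sketch below computes D from a list of type assignments. It uses the equivalent grouped form of the Hunter–Gaston index, in which a type containing n strains contributes n(n−1) to the sum; this assumes that a_j counts the other strains indistinguishable from strain j. The function name and the toy allele lists are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def hgdi(types: Sequence[Hashable]) -> float:
    """Hunter-Gaston discrimination index for one typing method or locus.

    `types` holds one entry per strain (an allele, a spoligotype, or a
    tuple of alleles for a combined profile).
    """
    n = len(types)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(types)
    # Ordered pairs of distinct strains that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_type_pairs / (n * (n - 1))


# Four strains split evenly between two alleles: D = 1 - 4/12.
toy_index = hgdi([3, 3, 4, 4])
```

Passing tuples of alleles instead of single alleles gives the index for a combination of loci, which is how combined discrimination could be tabulated.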

Correlation and congruence.
Genetic relationships among the isolates in the panel with no known epidemiological associations (n=79) were estimated using UPGMA and plotted as dendrograms. Agreement between the genetic relationships inferred from the similarity matrices used to plot the dendrograms, based on the two VNTR-PCR datasets and the spoligotyping dataset, was estimated by calculating the experiment congruence (Pearson’s correlation) in the BioNumerics software.
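One plausible reading of this congruence calculation is a Pearson correlation between the paired entries of two similarity matrices built over the same isolates. The sketch below follows that reading; it is an assumption, since the source does not describe the internal BioNumerics procedure. The function names, the simple categorical similarity (fraction of loci with identical alleles) and the toy profiles are all hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import List, Sequence


def categorical_similarity(p: Sequence, q: Sequence) -> float:
    """Fraction of loci at which two profiles carry the same allele."""
    return sum(x == y for x, y in zip(p, q)) / len(p)


def pairwise_similarities(profiles: Sequence[Sequence]) -> List[float]:
    """Upper triangle of the similarity matrix, in a fixed pair order."""
    return [categorical_similarity(profiles[i], profiles[j])
            for i, j in combinations(range(len(profiles)), 2)]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant similarities")
    return sxy / sqrt(sxx * syy)


def congruence(profiles_a: Sequence[Sequence], profiles_b: Sequence[Sequence]) -> float:
    """Correlation between two typing methods applied to the same isolates."""
    return pearson(pairwise_similarities(profiles_a),
                   pairwise_similarities(profiles_b))


# Three toy isolates typed by two hypothetical methods.
method_a = [(2, 3), (2, 4), (5, 4)]
method_b = [(1,), (1,), (2,)]
toy_congruence = congruence(method_a, method_b)
```

The isolates must be listed in the same order in both datasets so that each pair of similarity values refers to the same pair of isolates.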


   RESULTS
M. tuberculosis H37Rv bioinformatics search
The initial search for the repeated sequences reported in this study was done in mid-1998, using the then incomplete M. tuberculosis H37Rv genome sequence contig. Twenty-eight cosmids whose annotation included the keyword ‘repeat’ were identified. Seven such loci (QUB-5, 11a, 11b, 15, 18, 23 and 26), containing perfect, or near-perfect, repeats followed by partial repeats of varying lengths (Tables 1 and 2), were selected for further study. VNTR loci described in this study fit the proposed nomenclature convention, based on the first four of seven digits of their nucleotide number (Smittipat & Palittapongarnpim, 2000). However, for clarity the original names, i.e. QUB or ETR, are used throughout. With the exception of QUB-5, the novel VNTR loci contain repeats which are multiples of three nucleotides. QUB-15 appears to be intergenic, whereas the other loci are associated with annotated ORFs (Table 2). QUB-11a and QUB-11b map to the novel glycine-asparagine-rich (PPE) protein Rv1917c of the multiple polymorphic tandem repeat class (MPTR), and the VNTRs appear to code for unrelated repeating 23 aa motifs. QUB-18 was located in the middle of PPE ORF Rv1753c and comprised five near-perfect 78 bp DRs, with the potential to code for multiple 26 aa motifs. QUB-23 was located in Rv1435c and comprised five imperfect 21 bp repeats, which could code for multiple 7 aa repeats, constituting 17% of this ORF. QUB-26 is composed of five 111 bp DR units with the potential to code for multiple 37 aa repeat motifs making up 89% of Rv3611. The location of the ETRs has been described previously by Frothingham & Meeker-O’Connell (1998).


Table 1. Location and arrangement of novel (this study) and existing (Frothingham & Meeker-O’Connell, 1998) VNTR loci within M. tuberculosis genome sequences, showing perfect (or near-perfect) repeat sizes (bp), complete copy number and downstream partial repeat sizes (bp)

 

Table 2. VNTR-PCR primer sequences used in this study, their relationship to existing VNTRs and their potential involvement with ORFs from M. tuberculosis H37Rv

 
In silico analysis
Results of in silico analysis are summarized in Table 1; several of the QUB-VNTRs and ETRs differed in copy number between the genome sequences of strains of the M. tuberculosis complex. For example, the alleles found at QUB-11a for M. tuberculosis H37Rv, M. tuberculosis CDC1551 and M. bovis AF2122/97 had copy numbers of two, six and nine, respectively. This 69 bp repeat encodes a series of in-frame 23 aa repeats within the PPE protein Rv1917c, which differ in copy number between these sequenced strains. The same phenomenon is seen with QUB-11b, QUB-18, QUB-23 and QUB-26 (Table 1). The specific PCR primers for each VNTR locus and their potential involvement with ORFs from M. tuberculosis H37Rv are given in Table 2. VNTR allele profiles for the test panel and M. tuberculosis complex reference strains are given in Table 3. Additional M. tuberculosis complex subspecies reference strains were also VNTR-typed (Table 3). Unique allele profiles were obtained from each of these eight M. tuberculosis complex strains, including two M. bovis BCG substrains. The apparent discrepancy between the bioinformatics-predicted (Table 1) and the ‘actual’ (Table 3) allele calls for the sequenced strains of the M. tuberculosis complex can be explained by differences in the interpretation of the makeup of these loci and their derived allele-naming table. Where the partial repeat of a VNTR locus consists of a significant proportion of the entire repeat, the allele-naming table considers this partial repeat as a whole repeat. For example, Frothingham & Meeker-O’Connell (1998) describe the ETR-C locus in M. tuberculosis H37Rv as (4×58 bp)−21, whereas Smittipat & Palittapongarnpim (2000) describe ETR-C as (3×58 bp)+37; both describe the same 211 bp of repeat sequence. The allele-naming table used in this study would call this the 4 allele. The same type of discrepancy is seen with ETR-E.


Table 3. VNTR allele profiles and spoligotypes of M. bovis isolates and reference isolates of the M. tuberculosis complex

 
M. bovis test panel
Applied to our test panel, the number of alleles detected varied from three to seven for QUB-VNTRs, and from three to six for ETRs, leading to 5100000 possible combinations, assuming that the loci are independent. For example, QUB-11a (Fig. 1) is a VNTR locus with seven alleles in this panel, consisting of two, five, six, seven, eight, nine or ten copies of a 69 bp repeat, whereas QUB-11b has three alleles, consisting of two, three or four copies of an unrelated 69 bp repeat.
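The count of possible combinations under independence is simply the product of the number of alleles at each locus. The sketch below applies this to the five QUB loci whose allele counts are stated in the text (seven for QUB-11a; three each for QUB-5, QUB-11b, QUB-18 and QUB-23); the counts for QUB-26 and the ETRs are not reproduced in this section, so they are deliberately left out rather than guessed. The function name is hypothetical.

```python
from math import prod
from typing import Mapping


def possible_profiles(allele_counts: Mapping[str, int]) -> int:
    """Number of distinct allele profiles if loci vary independently."""
    return prod(allele_counts.values())


# Allele counts stated in the text for five of the six QUB loci.
stated_counts = {"QUB-5": 3, "QUB-11a": 7, "QUB-11b": 3, "QUB-18": 3, "QUB-23": 3}
partial_total = possible_profiles(stated_counts)  # 3 * 7 * 3 * 3 * 3 = 567
```

Extending the dictionary with the remaining loci would give the full figure quoted above; independence is an upper-bound assumption, and a strongly clonal population will show far fewer profiles in practice.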



Fig. 1. Length polymorphisms at the QUB-11a locus in multiple isolates of M. bovis. Locus QUB-11a was amplified by PCR and the products were resolved by agarose gel electrophoresis. Length polymorphisms correspond to multiples of a 69 bp tandem repeat unit. Lanes: 1, PCR negative control; Ma, 100 bp DNA ladder (Promega); 2, M. tuberculosis H37Rv (allele 2); 3, M. bovis AF2122/97 (allele 9); 4, M. bovis 028 (allele 10); 5, M. bovis 029 (allele 10); 6, M. bovis 030 (allele 10); Mb, 100 bp DNA ladder; 7, M. bovis 031 (allele 10); 8, M. bovis 032 (allele 10); 9, M. bovis 033 (allele 9); 10, M. bovis 034 (allele 10); 11, M. bovis 035 (allele 9); Ma, 100 bp DNA ladder; 12, M. bovis 036 (allele 5).

 
Although there were different QUB-15 alleles within the sequenced strains of the M. tuberculosis complex, QUB-15 was not further characterized in this study.

A total of 33 different allele profiles were identified by the QUB-VNTRs, compared with 22 for the ETR set and 29 for spoligotyping (Table 4). When the allele profiles for the QUB and ETR sets were combined, a total of 51 different profiles were identified. VNTR allele profiles and spoligotypes of M. bovis isolates and reference isolates of the M. tuberculosis complex are available as supplementary data at www.mic.sgmjournals.org.


Table 4. HGDI calculated for individual VNTRs, various combinations of VNTRs and spoligotyping of the M. bovis test panel (n=100)

 
Outbreak isolates
Within each of outbreaks B, C, D and F, isolates had identical spoligotypes and VNTR types, with the exception of isolate 068, which appeared to have the 4 allele at QUB-26 instead of the 3 allele found in the other isolate (100) in outbreak F (Table 3). Outbreaks A and E comprised isolates with different spoligotypes and VNTR profiles differing at several loci. Outbreak E isolates were recovered from four cattle and two badgers trapped on the same farm premises. Interestingly, the M. bovis isolate from one of the badgers (056) appears to be the same as that which caused the outbreak in cattle. However, the M. bovis isolate (057) from the other badger was significantly different. This observation was supported by the spoligotyping data and previously published RFLP data (Skuce et al., 1996).

Allelic diversity and discrimination
With our M. bovis test panel the HGDI for individual loci varied from 0·06 for QUB-23 to 0·62 for QUB-11a, and from 0·08 for ETR-D to 0·65 for ETR-A. QUB-5, QUB-11b, QUB-18 and QUB-23 showed striking differences in their HGDIs, despite the fact that three alleles were identified at each locus (Table 4). By combining the results at various loci the discrimination of VNTR-PCR was significantly improved (Table 4). For example, 67·6% of the discrimination of the combined QUB and ETR sets was due to ETR-A alone. Similarly, 97·1% of the total discrimination of all 11 VNTRs was provided by just four VNTRs (ETR-A, QUB-11a, QUB-26 and ETR-B, in order of discrimination). Spoligotyping was also capable of resolving isolates which had the same VNTR profile (Table 3).

Forty-five of the isolates were spoligotype ST140, the most common spoligotype among M. bovis strains from the UK and Ireland (Skuce et al., 1996; Costello et al., 1999; Durr et al., 2000). QUB-VNTRs resolved these 45 isolates into 14 VNTR profiles, with the largest subset being 15 isolates (HGDI=0·84). The ETRs resolved the ST140 isolates into seven profiles, with 27 isolates in the largest subset (HGDI=0·60). QUB-VNTRs and ETRs combined resolved these 45 ST140 isolates into 20 profiles, the largest subset comprising nine isolates (HGDI=0·92).

Correlation (experiment congruence) between VNTRs, ETRs and spoligotyping
Dendrograms were created using the UPGMA genetic distance matrix, calculated from each of the QUB-VNTR, ETR and spoligotyping datasets for 79 M. bovis isolates with no known epidemiological connections (i.e. the test panel, excluding the 21 outbreak isolates). The resulting dendrograms (not shown) were compared and the experiment congruence between the datasets was calculated using Pearson’s correlation (ρ). A weak to moderate, positive, linear correlation was found between the QUB-VNTR and ETR datasets (ρ=0·448), and between spoligotyping and QUB-VNTRs (ρ=0·416). A moderate to strong, positive, linear association was found between spoligotyping and the ETR data (ρ=0·678).


   DISCUSSION
Novel targets
We have described the identification of novel VNTR loci within the M. tuberculosis H37Rv genome and their testing as new genetic tools for the study of the molecular epidemiology of M. bovis. In this study, 28 such targets were identified in the first genome screen; six are described in detail. VNTRs were located in both coding and intergenic regions in M. tuberculosis H37Rv. All reported VNTR loci within bacterial coding regions have repeat sizes which are multiples of three nucleotides. In pathogenic bacteria, VNTRs in coding regions potentially give rise to variations in the surface-exposed proteins involved in pathogenicity, which might help the pathogen adapt to avoid the host immune response (Domenech et al., 2001).

Location and consequences
Of the VNTR loci described here, six are located in ORFs, where they potentially code protein repeat motifs of 7–37 aa, which differ in copy number between the three sequenced strains of the M. tuberculosis complex. Intriguingly, QUB-11a, QUB-11b and QUB-18 are located in PPE proteins, which have recently been cited as potential sources of antigenic variation in M. tuberculosis complex strains (Cole et al., 1998). The QUB-11a and QUB-11b VNTR loci map to Rv1917c and are largely responsible for the RFLPs attributed to the pUCD probe (O’Brien et al., 2000b). Apparent size differences have been detected amongst the novel glycine-alanine-rich (PE) proteins of the polymorphic GC-rich repetitive sequence (PGRS) class of different isolates (Domenech et al., 2001), and PE-PGRS genes of Mycobacterium marinum are upregulated in experimental granuloma models (Ramakrishnan et al., 2000). The cellular location and function of PE and PPE proteins are currently unknown. From large amounts of sequence data, Musser and colleagues have extrapolated that ~20 of the 167 PE/PPE proteins (~12%) in M. tuberculosis H37Rv would be polymorphic in other members of the M. tuberculosis complex (Musser et al., 2000). In addition to the PPE protein Rv3135 (Musser et al., 2000), we have confirmed that there is the potential for length variation in the protein products of ORFs Rv1917c (PPE) and Rv1753c (PPE) (Brosch et al., 2001). Length variation in the protein products of ORFs Rv1435c and Rv3611 has not been previously reported.

Performance, discrimination and experiment congruence
With the test panel, the HGDI varied from 0·06 for QUB-23 to 0·65 for ETR-A. Using all 11 VNTRs and ETRs the HGDI was 0·96, which demonstrated significantly higher resolution than spoligotyping. The allelic diversity of the 12 VNTR–MIRU loci (Mazars et al., 2001) was equivalent to the highly discriminatory IS6110-RFLP for M. tuberculosis isolates, although the authors cautioned against dispensing with certain VNTRs which had relatively low allelic diversity. Such loci resolved some strains which remained unresolved at other VNTR loci. ETR-PCR has been proposed as a novel method for genotyping within the M. tuberculosis complex (Frothingham & Meeker-O’Connell, 1998). However, although ETR-PCR was able to discriminate M. bovis isolates and M. bovis BCG substrains, it was subsequently shown to lack discrimination with isolates of the M. tuberculosis complex (Kremer et al., 1999; Filliol et al., 2000). We have shown that most of the discrimination attributed to the ETR-A to ETR-E set was due to ETR-A and ETR-B. Similarly, QUB-11a, QUB-11b and QUB-26 contributed most of the discrimination of the QUB-5 to QUB-26 set. We show that the discrimination of the VNTR technique can be greatly improved by combining VNTR loci (Table 4).

Proposed strategy for use of VNTR-PCR
There is now an increasingly large panel of VNTR-type loci whose performance has not been systematically evaluated. The application of the Hunter–Gaston equation to the genotyping of a comprehensive panel of isolates provides a mechanism for recording the discrimination provided by individual loci, or combinations of loci. The correlation between MIRU-typing and IS6110-RFLP for a test panel of M. tuberculosis isolates was highly significant (ρ=0·512, Mazars et al., 2001). The positive correlation determined between QUB-VNTR-, ETR-typing and spoligotyping suggests that the methods group isolates in a similar fashion. Polymorphisms found with different molecular markers show strong mutual association because the M. tuberculosis complex has a strongly clonal structure (Sreevatsan et al., 1997).

VNTR-PCR proved to be a robust, convenient, highly discriminatory technique, which is reproducible and appropriate for typing isolates of the M. tuberculosis complex, including those with a low IS6110 copy number. If these loci are independent there would be 5100000 possible allelic variants using these markers with our test panel. The significance of VNTR–MIRU-derived clusters remains to be determined empirically (Frothingham & Meeker-O’Connell, 1998). VNTR–MIRU loci appear to be sufficiently stable to allow meaningful epidemiological studies to be conceived and undertaken (Mazars et al., 2001). A further attraction is that the performance of VNTR-PCR can be tailored to suit specific studies, where high throughput, convenience or discrimination may be issues. Nomenclature and databasing, which have proven so difficult for restriction-enzyme-analysis-, RFLP- and PFGE-based genotyping, are relatively simple and intuitive for VNTR-PCR. An added attraction of VNTR-typing would be the potential to detect and genotype bacteria of the M. tuberculosis complex directly in a range of clinical samples, as has been demonstrated for spoligotyping (Kamerbeek et al., 1997; Roring et al., 2000). The technique would be suitable for high-throughput automation using PCR workstations, and DNA sequencing platforms running allele-calling software.

Understanding the molecular basis of pathogen variation is not only important for discriminating and tracing clinically relevant strains, but also provides insights into pathogenesis, host adaptation and the origin of new pathogenic forms (Reid et al., 2001). For M. bovis, the integration of VNTR-typing with conventional epidemiological approaches, advanced animal movement recording and geographical information systems has the potential to be a powerful new technology, which should improve our understanding of bovine tuberculosis. The novel loci described here, when used in combination with other VNTR-type loci, should provide a robust and high-resolution tool for the molecular epidemiology of the M. tuberculosis complex.


   ACKNOWLEDGEMENTS
 
This work was funded by the European Union, Framework IV [Standards, Measurements and Testing Programme (T.P.McC., EU SMT4 CT96 2097)]. Balancing funding was provided by the Department of Agriculture and Rural Development for Northern Ireland (R.A.S., D.B., J.F.McC. and S.D.N.) and the Department for Environment, Food and Rural Affairs (S.M.M.R., A.N.S., R.G.H. and S.L.H.). The authors wish to acknowledge Mark Smyth for his excellent technical support, Stanley McDowell for his advice and help with spreadsheets and data management, and Richard Porter for supervising bacterial culture.


   REFERENCES
 
Borgdorff, M. W., Behr, M. A., Nagelkerke, M. J., Hopewell, P. C. & Small, P. M. (2000). Transmission of tuberculosis in San Francisco and its association with immigration and ethnicity. Int J Tuberc Lung Dis 4, 287–294.

Brosch, R., Pym, A. S., Gordon, S. V. & Cole, S. T. (2001). The evolution of mycobacterial pathogenicity: clues from comparative genomics. Trends Microbiol 9, 452–458.

Clifton-Hadley, R. & Wilesmith, J. (1995). An epidemiological outlook on bovine tuberculosis in the developed world. In Proceedings of the Second International Conference on Mycobacterium bovis, pp. 178–182, Dunedin, New Zealand: University of Otago Press.

Cole, S. T., Brosch, R., Parkhill, J. & 39 other authors (1998). Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393, 537–544.

Collins, D. M. (1998). Molecular epidemiology: Mycobacterium bovis. In Mycobacteria – Molecular Biology and Virulence, pp. 123–135. Edited by C. Ratledge & J. Dale. Oxford: Blackwell Science.

Costello, E., O’Grady, D., Flynn, O., O’Brien, R., Rogers, M., Quigley, F., Egan, J. & Griffin, J. (1999). Study of restriction fragment length polymorphism analysis and spoligotyping for epidemiological investigation of Mycobacterium bovis infection. J Clin Microbiol 37, 3217–3222.

Cousins, D. V., Skuce, R. A., Kazwala, R. R. & van Embden, J. D. A. (1998). Towards a standardised approach to DNA fingerprinting of Mycobacterium bovis. Int J Tuberc Lung Dis 2, 471–478.

Domenech, P., Barry, C. E., III & Cole, S. T. (2001). Mycobacterium tuberculosis in the post-genomic age. Curr Opin Microbiol 4, 28–34.

Durr, P. A., Hewinson, R. G. & Clifton-Hadley, R. S. (2000). Molecular epidemiology of bovine tuberculosis I. Mycobacterium bovis genotyping. Rev Sci Tech Off Int Epizoot 19, 675–688.

van Embden, J. D. A., Cave, M. D., Crawford, J. T. & 7 other authors (1993). Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standard methodology. J Clin Microbiol 31, 406–409.

van Embden, J. D. A., van Gorkom, T., Kremer, K., Jansen, R., van der Zeijst, B. A. M. & Schouls, L. M. (2000). Genetic variation and evolutionary origin of the direct repeat locus of Mycobacterium tuberculosis complex bacteria. J Bacteriol 182, 2393–2401.

Filliol, I., Ferdinand, S., Negroni, L., Sola, C. & Rastogi, N. (2000). Molecular typing of Mycobacterium tuberculosis based on variable number of tandem DNA repeats used alone and in association with spoligotyping. J Clin Microbiol 38, 2520–2524.

Frothingham, R. & Meeker-O’Connell, W. A. (1998). Genetic diversity in the Mycobacterium tuberculosis complex based on variable numbers of tandem DNA repeats. Microbiology 144, 1189–1196.

Goodchild, A. V. & Clifton-Hadley, R. S. (2001). Cattle-to-cattle transmission of Mycobacterium bovis. Tuberculosis 81, 23–41.

Groenen, P. M., Bunschoten, A. E., van Soolingen, D. & van Embden, J. D. A. (1993). Nature of DNA polymorphism in the direct repeat cluster of Mycobacterium tuberculosis: application for strain differentiation by a novel typing method. Mol Microbiol 10, 1057–1065.

Heersma, H. F., Kremer, K. & van Embden, J. D. A. (1998). Computer analysis of IS6110 RFLP patterns of Mycobacterium tuberculosis. Methods Mol Biol 101, 395–422.

van Helden, P. (1998). Molecular epidemiology: human tuberculosis. In Mycobacteria – Molecular Biology and Virulence, pp. 110–122. Edited by C. Ratledge & J. Dale. Oxford: Blackwell Science.

Hunter, P. R. & Gaston, M. A. (1988). Numerical index of the discriminatory ability of typing systems: an application of Simpson’s index of diversity. J Clin Microbiol 26, 2465–2466.

Kamerbeek, J., Schouls, L. M., Kolk, A. & 8 other authors (1997). Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology. J Clin Microbiol 35, 907–914.

Kato-Maeda, M., Bifani, P. J., Kreiswirth, B. N. & Small, P. M. (2001). The nature and consequence of genetic variability within Mycobacterium tuberculosis. J Clin Invest 107, 533–537.

Kremer, K., van Soolingen, D., Frothingham, R. & 9 other authors (1999). Comparison of methods based on different molecular epidemiological markers for typing of Mycobacterium tuberculosis complex strains: interlaboratory study of discriminatory power and reproducibility. J Clin Microbiol 37, 2607–2618.

Magdalena, J., Vachee, A., Supply, P. & Locht, C. (1998a). Identification of a new DNA region specific for members of Mycobacterium tuberculosis complex. J Clin Microbiol 36, 937–943.

Magdalena, J., Supply, P. & Locht, C. (1998b). Specific differentiation between Mycobacterium bovis BCG and virulent strains of the Mycobacterium tuberculosis complex. J Clin Microbiol 36, 2471–2476.

Mazars, E., Lesjean, S., Banuls, A.-L., Gilbert, M., Vincent, V., Gicquel, B., Tibayrenc, M., Locht, C. & Supply, P. (2001). High-resolution minisatellite-based typing as a portable approach to global analysis of Mycobacterium tuberculosis molecular epidemiology. Proc Natl Acad Sci USA 98, 1901–1906.

Musser, J. M., Amin, A. & Ramaswamy, S. (2000). Negligible genetic diversity of Mycobacterium tuberculosis host immune system protein targets: evidence of limited selective pressure. Genetics 155, 7–16.

O’Brien, R., Flynn, O., Costello, E., O’Grady, D. & Rogers, M. (2000a). Identification of a novel DNA probe for strain typing Mycobacterium bovis by restriction fragment length polymorphism analysis. J Clin Microbiol 38, 1723–1730.

O’Brien, R., Danilowiez, B. S., Bailey, L., Flynn, O., Costello, E., O’Grady, D. & Rogers, M. (2000b). Characterization of the Mycobacterium bovis restriction fragment length polymorphism DNA probe pUCD and performance comparison with standard methods. J Clin Microbiol 38, 3362–3369.

O’Reilly, L. M. & Daborn, C. J. (1995). The epidemiology of Mycobacterium bovis infections in animals and man: a review. Tuber Lung Dis 76, 1–46.

Ramakrishnan, L., Federspiel, N. A. & Falkow, S. (2000). Granuloma-specific expression of mycobacterium virulence proteins from the glycine-rich PE-PGRS family. Science 288, 1436–1439.

Reid, S. D., Hoe, N. P., Smoot, L. M. & Musser, J. M. (2001). Group A Streptococcus: allelic variation, population genetics, and host–pathogen interactions. J Clin Invest 107, 393–399.

Roring, S. M. M., Brittain, D., Bunschoten, A. E., Hughes, M. S., Skuce, R. A., van Embden, J. D. A. & Neill, S. D. (1998). Spacer oligotyping of Mycobacterium bovis isolates compared to typing by restriction fragment length polymorphism analysis using PGRS, DR and IS6110. Vet Microbiol 61, 111–120.

Roring, S. M. M., Hughes, M. S., Skuce, R. A. & Neill, S. D. (2000). Simultaneous detection and strain differentiation of Mycobacterium bovis directly from bovine tissue specimens by spoligotyping. Vet Microbiol 74, 227–236.

Simpson, E. H. (1949). Measurement of diversity. Nature 163, 688.

Skuce, R. A. & Neill, S. D. (2001). Molecular epidemiology of Mycobacterium bovis: exploiting molecular data. Tuberculosis 81, 169–175.

Skuce, R. A., Brittain, D., Hughes, M. S. & Neill, S. D. (1996). Differentiation of Mycobacterium bovis isolates from animals by DNA typing. J Clin Microbiol 34, 2469–2474.

Skuce, R. A., Brittain, D., van Embden, J. D. A., Smyth, T., Sharp, J. M., Rogers, M., Hewinson, R. G., Garcia-Marin, J. F. & Neill, S. D. (1998). Development of novel standardised methodology and nomenclature for the identification of Mycobacterium bovis strains. EU Contract SMT4 CT96 2097 Final Report (EU Commission).

Smittipat, N. L. & Palittapongarnpim, P. (2000). Identification of possible loci of variable number of tandem repeats of Mycobacterium tuberculosis. Tuber Lung Dis 80, 69–74.

van Soolingen, D., Borgdorff, M. W., de Haas, P. E., Sebek, M. M., Veen, J., Dessens, M., Kremer, K. & van Embden, J. D. A. (1999). Molecular epidemiology of tuberculosis in the Netherlands: a nationwide study from 1993 through 1997. J Infect Dis 180, 726–736.

Sreevatsan, S., Pan, X. & Musser, J. M. (1997). Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proc Natl Acad Sci USA 94, 9869–9874.

Struelens, M. J. (1996). Consensus guidelines for the appropriate use and evaluation of microbial epidemiologic typing systems. Clin Microbiol Infect 2, 2–11.

Supply, P., Magdalena, J., Himpens, S. & Locht, C. (1997). Identification of novel intergenic repetitive units in a mycobacterial two-component system operon. Mol Microbiol 26, 991–1003.

Supply, P., Mazars, E., Lesjean, S., Vincent, V., Gicquel, B. & Locht, C. (2000). Variable human minisatellite-like regions in the Mycobacterium tuberculosis genome. Mol Microbiol 36, 762–771.

Received 13 August 2001; revised 26 October 2001; accepted 30 October 2001.