Centers for Disease Control and Prevention, Respiratory Diseases Branch, 1600 Clifton Rd, Mailstop C02, Atlanta, GA 30333, USA¹
National Centre for Streptococcus, Provincial Laboratory of Public Health for Northern Alberta, 8440-112 St, Edmonton, Alberta, Canada T6G 2J2²
Author for correspondence: Bernard Beall. Tel: +1 404 639 1237. Fax: +1 404 639 3123. e-mail: beb0{at}cdc.gov
ABSTRACT |
---|
Keywords: emm gene sequences, sof variable gene sequences, Streptococcus pyogenes, opacity factor
Abbreviations: AOF, anti-opacity factor; CDC, Centers for Disease Control; GAS, group A streptococci; NT, nontypable; OF, opacity factor
a Present Address: Libera Università Campus Bio-Medica, Via E. Longoni 83, 00155 Rome, Italy.
INTRODUCTION |
---|
M protein serotyping has served as a subtyping standard for GAS for much of the 20th century. It has long been known that GAS strains within certain M surface-virulence-protein serotypes are associated with the opacity-factor-positive (OF+) phenotype (Gooder, 1961; Widdowson et al., 1970). Of 86 known M-protein serotypes and provisional serotypes, 36 historically correlate with the OF+ phenotype (Fraser & Colman, 1985; Johnson & Kaplan, 1993; Facklam et al., 1999), and these strains are commonly found in sterile- and nonsterile-site infections (Colman et al., 1993). Antisera against these OF+ GAS strains have been reported to inhibit the OF+ reaction only in strains of the same M serotype (Maxted et al., 1973). This observation of anti-OF (AOF) specificity is consistent with previous observations that the sof locus is quite variable between strains of different M serotypes (Rakonjac et al., 1995). In fact, much of the N-terminal 80% of the Sof protein sequence appears to be hypervariable, with interspersed small conserved regions (Rakonjac et al., 1995; Courtney et al., 1999). Most of this large, variable domain has been found to be essential for OF activity, which complicates the determination of epitopes targeted by AOF sera.
Although specific M serotypes have been shown to be conferred by epitopes at the mature M protein N terminus (see Fischetti, 1989 for review of M protein structure), for unknown reasons many OF+ strains have always been very difficult to M serotype. Instead, in many studies the M serotype has been inferred based upon AOF specificity. This is a fundamentally illogical inference, since the emm and sof genes are situated at least 15 kb apart (Rakonjac et al., 1995), and horizontal gene transfer events do occur in GAS (Bessen & Hollingshead, 1994; Whatmore et al., 1994). M-protein gene (emm) sequences have been documented in some instances to be identical between strains of differing genetic lineages (Whatmore et al., 1994). The differing strain backgrounds within specific emm types are often reflected by differing serological specificities of the poorly defined T antigens (Beall et al., 1997), although each of the commonly occurring emm/M types is represented primarily by a closely related T agglutination pattern, suggesting overall genetic relatedness within many emm types (Johnson & Kaplan, 1993; Beall et al., 1998). Recently we replaced M typing at the CDC with emm sequence typing, since limited M-typing data indicated that 5' emm sequence can be correlated very well to M serological data (Beall et al., 1996, 1997). Furthermore, we have found that isolates within the same emm type that share similar or identical T agglutination patterns are usually genetically highly related on the basis of genomic-restriction-digest pattern analysis (unpublished observations).
For only a minority of emm types has it been shown that the emm gene specifically encodes the M serotype (for examples see Hollingshead et al., 1986; Robbins et al., 1987; Miller et al., 1988; Mouw et al., 1988; Dale et al., 1993). Since many GAS strains, including most OF+ strains, have emm-like genes in addition to emm (described as the single gene amplified by a specific primer set; Whatmore et al., 1994) at the vir locus (Hollingshead et al., 1993; Whatmore et al., 1995), these other emm-like genes potentially contribute to the M serotype, since they are likely to be present in crude M antigen extracts. In this work, using a set of highly geographically and temporally diverse OF+ GAS strains, we have found additional circumstantial evidence that 5' emm sequences dictate M-serotype specificity. We also present data demonstrating that although the sequence of the first 190–240 codons of sof is generally highly predictive of AOF type, a 100–450 residue region within the previously defined enzymic domain (Rakonjac et al., 1995; Courtney et al., 1999) appears likely to dictate AOF type specificity. We show several instances where sof types are not predictive of M type or the corresponding emm sequence type, and in several instances the combination of sof and emm type appears to be highly predictive of genetically related strain sets.
METHODS |
---|
Of the 86 recognized M serotypes, 36 have been consistently associated with the OF+ phenotype (Fraser & Colman, 1985; Johnson & Kaplan, 1993; Colman et al., 1993; Facklam et al., 1999). For two of these serotypes historically associated with the OF+ phenotype, M13 and M27, two distinct reference emm sequence types exist (Facklam et al., 1999). These emm types are shown in Table 1 as M13L/emm13L and M13W/emm13W, and M27L/emm27L and M27G/emm27G. Additionally, it has been found that isolates containing the commonly occurring emm sequence types st2967 (M. Lovgren & G. Tyrrell, unpublished data) and pt5118 (M92) (Facklam et al., 1999) represent unique M and/or AOF sero-specificities, which brings the total of OF+ GAS emm types associated with known serological correlates to 39.
For each of the emm types featured in this study, the 5' sof sequence was obtained from a CDC reference strain. CDC reference strains for many M nontypable (NT) strains with new emm sequence types were also subjects of this study. To maximize potential strain variability within emm types, strains with unusual T pattern/emm type associations and strains from diverse geographic locations were examined.
Sequence analysis.
Sequence analysis was carried out using the Wisconsin package version 10.0. Signal-sequence predictions were carried out as described at the web site http://www.cbs.dtu.dk/services/SignalP/ (Nielsen et al., 1997), using the N-terminal 22 aa from the GenBank accession AF019890 (or U02290 and X88303, which are identical) plus the first 48 aa deduced from primer F-based sequence.
emm typing.
emm and sof gene-specific PCR was performed using standard protocols described for the Boehringer Mannheim Hi Fidelity system. emm sequence typing and criteria defining emm type designations have been previously described (Beall et al., 1998; CDC, 1999b). All emm sequences used for this study are available at http://www.cdc.gov/ncidod/biotech/strep/strepindex.html and were independently obtained in the CDC streptococcal laboratory from CDC reference strains. All of these emm sequences were in close agreement with the given GenBank accession numbers in Table 1, except that in some instances longer sequences were generated for purposes of sequence comparisons. Sequences of a designated emm type shared 97–100% sequence identity over at least 252 bases of the corresponding CDC reference strain emm sequence encoding the mature protein. The emm68.1 allele carried by two isolates lacked M68 mature-protein codons 3–9 and had three single-base changes resulting in conservative substitutions.
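The type-assignment criterion above (97–100% identity over at least 252 bases of the reference sequence) can be illustrated with a minimal Python sketch. It assumes the query and reference have already been aligned to equal length; the function and constant names are hypothetical and are not part of the published CDC typing protocol.

```python
# Thresholds taken from the text; names are illustrative only.
MIN_IDENTITY = 97.0   # percent identity required for the same emm type
MIN_LENGTH = 252      # minimum number of compared bases


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def same_emm_type(query: str, reference: str) -> bool:
    """Apply the identity and length thresholds to an aligned pair."""
    if len(reference) < MIN_LENGTH:
        return False
    return percent_identity(query, reference) >= MIN_IDENTITY
```

Over exactly 252 bases, this threshold tolerates up to seven mismatches (245/252 is about 97.2%), while eight mismatches fall just below it.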
sof amplification and sequencing.
Conserved primer sets were based upon comparison of the sof gene with GenBank accession nos U02290/X88303 and AF019890, which represented the only two GAS sof sequences in GenBank at the time of this work. Primers F2 (5'-GTATAAACTTAGAAAGTTATCTGTAGG-3') and R3 (5'-GGCCATAACATCGGCACCTTCGTCAATT-3') were used to generate approximately 560–700 bp fragments from all strains that encompassed sof sequence encoding the mature protein plus 22 residues of signal sequence. Primer F (5'-GGGCTCGTCTCCGTCGGAACGATGCTG-3') was used for sequencing the sof 5' region encoding 7–31 signal-sequence residues and up to 270 mature-protein residues. For many strains, primers F2+R5 (5'-GTAAAGGATGCTTCACGTTTGTCTCCAG-3') were used to amplify most of the sof structural gene. F3 (5'-GAAG/CAAATTGACGAAGGTGCCGATGT-3') was another universal primer used for sequence analysis and PCR. Various other conserved or nonconserved sof primers were also used for amplification and sequencing reactions.
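For readers handling these primers computationally, the following Python sketch expands the mixed position in F3 and computes reverse complements. It assumes that the "G/C" notation in the F3 sequence denotes a single position synthesized as either G or C; the helper names are hypothetical.

```python
from itertools import product

# Primer sequences copied from the text (5' to 3').
R3 = "GGCCATAACATCGGCACCTTCGTCAATT"
F3 = "GAAG/CAAATTGACGAAGGTGCCGATGT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def expand_slash_primer(primer: str) -> list:
    """Expand 'X/Y' mixed positions into every concrete primer sequence."""
    positions = []
    i = 0
    while i < len(primer):
        if i + 2 < len(primer) and primer[i + 1] == "/":
            positions.append(primer[i] + primer[i + 2])  # two alternatives
            i += 3
        else:
            positions.append(primer[i])
            i += 1
    return ["".join(choice) for choice in product(*positions)]
```

Applied to F3, `expand_slash_primer` yields two 26-mers that differ only at the fourth base.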
PFGE.
Selected strains were typed by PFGE of chromosomal digests using SmaI. Isolates differing by only 1–6 bands from a common reference strain for each group were assigned a common type. Isolates with more than six band differences from subtype 1 of each type were considered unrelated and were assigned a different PFGE type (Tenover et al., 1995).
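The band-difference rule can be sketched in Python as follows. This is only an illustration: it assumes each pattern has already been reduced to a set of matched band labels and that "bands of difference" means bands present in one pattern but not the other. Band matching on real gels requires size tolerances that are not modelled here, and the function names are hypothetical.

```python
def band_differences(isolate: set, reference: set) -> int:
    """Count bands present in exactly one of the two patterns."""
    return len(isolate ^ reference)


def pfge_relationship(isolate: set, reference: set) -> str:
    """Classify an isolate against the subtype 1 reference pattern."""
    diff = band_differences(isolate, reference)
    if diff == 0:
        return "indistinguishable"
    if diff <= 6:
        return "same type"        # 1-6 band differences: common PFGE type
    return "different type"       # more than six differences: unrelated
```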
RESULTS |
---|
Ten of the 57 M nontypable (NT) strains were found to be within the emm types 22, 25, 28, 48, 73, 75, 81 and 27/77, for which corresponding M-typing sera were available. At present, the basis for this nontypability is not known. In contrast, only one of the strains shown in this study, 1588-96, was not emm sequence typable, since an emm-specific amplicon could not be generated. Although this strain was also M NT, it is likely that altered primer-annealing site(s), rather than the absence of the emm gene, prevented successful PCR, since this strain multiplies in the indirect bactericidal assay (data not shown; see Johnson & Kaplan, 1996 for assay).
In one example, previous M-typing results recorded many years earlier were in disagreement with our emm-sequencing results. Strain D734, the source of the first sof gene to be sequenced, was previously recorded as an M type 22 strain (Rakonjac et al., 1995). We found it to be M NT and to have the emm sequence type pt2233 (Table 1). It must be noted that D734 was a strain from Dr Lancefield's collection (http://www.rockefeller.edu/vaf) that was serotyped long before type PT2233 was documented (Fraser & Colman, 1985) and may have cross-reacted with M type 22 antisera.
Sequence correlates of new emm sequence types with classical M serotypes
Four M-typable strains were found to have new emm gene sequences that differed significantly from the sequences predicted by their M types. Still, for each of these four strains, clear correlations with the M type emm sequences were found. Two type st1160 isolates were found to type as serologically identical to M2 in gel-diffusion tests. This observation correlated to 23 residues of identical sequence shared between ST1160 and a portion of the 35 N-terminal M2 residues known to elicit opsonic type-specific antibodies (Dale et al., 1996) (Fig. 1a). This 23-residue sequence is not perfectly conserved between any other two known M sequences. It is interesting that while the N-terminal ends of the predicted ST1160, M73 and ST2967 proteins all have highly related sequences with similar overall homology to M2, strains with the emm sequence st1160 are serotype M2, while the emm73 and st2967 sequences are correlated with the specific serotypes M73 and ST2967 respectively (Table 1).
The remaining M serotypable strain, SS1416 (st4532), typed as M76, even though the deduced ST4532 50 N-terminal residues share only 54·2% sequence identity with the corresponding M76 sequence. However, besides the M27G sequence, M76 represents the closest match to the ST4532 N terminus (and we do not have anti-M27G typing sera). A 40-residue segment of the mature ST4532 protein consisting of residues 27–66 is consistent with the serological M76 result, since other than seven substitutions (six conservative), it is identical to the corresponding M76 segment (data not shown).
Serological non-identical M cross-reactivity correlates with emm sequence type
In two examples where partial M serotype identity was found in gel-diffusion tests, clear sequence correlations were also found. In one example, M extracts from two strains carrying an emm68 allele (emm68.1) lacking mature codons 3–9 and containing three conservative substitutions (Fig. 1c) were found to specifically cross-react only with M68 antiserum and showed partial identity against reference M68 strain extracts.
All three st448 strain extracts, representing two different genetic backgrounds on the basis of sof sequence types and T agglutination patterns, also showed partial identity with M49 extracts when tested with M49 antisera. Significantly, the closest sequence match to the ST448 protein is M49, and the two deduced proteins share marked similarity between M49 residues 26–90 and the corresponding N-terminal residues 18–81 of ST448 (Fig. 1d).
Distinct classical M serotypes corresponding with identical 5' emm sequences overlap in M type specificity
For unknown reasons, classical M type reference strains for M27L, M77, M44, M61, M81 and PT1658 have been reported to have distinct M types, even though the strain pairs for M27L/M77, M44/M61 and M81/PT1658 have recently been found to share emm gene sequence types (Whatmore et al., 1994; Beall et al., 1996). Our data were in partial disagreement with classical M-typing data in that two of four strains sharing the emm44/61 sequence, including the CDC Lancefield M44 type strain SS511, typed as both M44 and M61 in this study (Tables 1 and 2). Why only T pattern 5/27/44, sof44, emm44/61 (M44+61) strains, but not T pattern 11/12, emm44/61, sof61 (M61) strains, displayed this dual M type specificity is unknown (Table 1).
Another example of distinct M type reference strains with identical M serotypes and corresponding identical emm sequence types was provided by the PT1658 (in Table 1 with sof1658) and M81 reference strains (Whatmore et al., 1994; http://www.cdc.gov/ncidod/biotech/strep/strepindex.html). All five strains with the emm81 sequence, accounting for T NT, T4 or T3/13/B3264-related agglutination patterns and three distinct sof sequence types, were found to be M type 81, and this result was recently confirmed at the Public Health Laboratory Service, Colindale, UK (Table 1).
5' sof sequences
All of the OF+ reference strains and various OF+ isolates shown yielded approximately 580–750 bp sof-specific PCR products with the primer pair F2+R3, which anneal to sequences encoding signal sequence and a conserved amino-proximal region respectively (Fig. 2). This conserved amino-proximal region represents one of several highly conserved short sequences previously seen to be distributed along the length of the enzymic domain (Courtney et al., 1999). Additionally, all strains tested yielded approximately 3·0 kb amplicons with the F2+R5 primer pair, with R5 annealing to a conserved sequence overlapping the wall-attachment-motif encoding sequence (Fig. 2). With the exception of sof61, sof3875, sof1482, sof2967, sof1207, sof9 and sof44, for which protein sequences of 872–922 amino acids were deduced, the 62 different sof designations shown in Table 1 represent a remarkably variable set of related partial 190–470 residue Sof proteins that share 50–70% sequence identity. The fnbA product from Streptococcus dysgalactiae (Lindgren et al., 1993), which also functions as an OF (Courtney et al., 1999), shared approximately 35–40% sequence identity over this range (data not shown).
Each of the sof sequences determined was predicted to encode a membrane export signal peptide, with the first 10 residues (corresponding to amino acids 23–32 of Sof proteins encoded by the sequences with accession nos U02290/X88303 and AF019890) totally conserved in the majority of the isolates. Based upon sequence differences and predicted cleavage sites, there was a total of 12 different predicted signal peptides of 29–53 residues in length.
One striking feature found in all of the various Sof peptides was an abundance of N-terminal serine and threonine residues, which comprised about 50% of the first 100–150 residues. This region lies outside of the putative enzymic domains of these proteins and displays a remarkable degree of sequence diversity. It is also interesting that three residues (corresponding to Sof2967 Q67, N114 and E120; see accession no. AF139749) are totally conserved among all of the known Sof protein sequences. The functional significance of these shared features of the Sof N-terminal region lying amino-proximal to known Sof functional domains remains to be determined (Fig. 2).
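The serine/threonine content of an N-terminal region can be computed with a few lines of Python. This is a minimal sketch; the function name and the default window of 100 residues are illustrative choices, and the input is assumed to be a deduced mature Sof sequence in one-letter code.

```python
def ser_thr_fraction(protein: str, n_residues: int = 100) -> float:
    """Fraction of Ser (S) and Thr (T) among the first n_residues residues.

    If the sequence is shorter than n_residues, the whole sequence is used.
    """
    window = protein.upper()[:n_residues]
    if not window:
        raise ValueError("protein sequence must not be empty")
    return sum(residue in "ST" for residue in window) / len(window)
```

A value near 0.5 over the first 100–150 residues would correspond to the roughly 50% serine plus threonine content described above.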
The only classically OF- M/emm type examined that yielded sof-specific PCR products (both F2+R3 and F2+R5 generated) was M12/emm12. This is consistent with previous data demonstrating the presence of sequences in an M type 12 strain hybridizing under high stringency to a sof gene probe (Rakonjac et al., 1995). We have not explored the basis of the OF- phenotype in these strains. We were unable to amplify sof gene sequences from reference strains and/or clinical isolates corresponding to the emm/M types 1, 3, 5, 6, 15, 18, 33, 41, 43, 56, 64, 69 and 86, which are all commonly associated with an OF- phenotype (Fraser & Colman, 1985; Podbielski et al., 1991; Colman et al., 1993; Whatmore et al., 1994; http://www.cdc.gov/ncidod/biotech/strep/strepindex.html). We found sof12 amplicons from strains isolated in the US and South America that had the same 5' sequence as the sof amplicon from the CDC M12 reference strain (Tables 1 and 2). DdeI digests of the 3 kb F2+R5-generated amplicon from 30 random emm12 isolates in the CDC collection shared an identical six-band profile (data not shown), further demonstrating that the entire sof12 gene is highly conserved among emm12 isolates.
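When amplicon sequences are available, a restriction profile of this kind can be predicted in silico. The Python sketch below cuts a linear sequence at DdeI sites (recognition sequence C^TNAG, cleaving after the first C) and returns fragment lengths in order along the molecule. The function name is hypothetical, and the sketch ignores gel resolution, so co-migrating or very small fragments could yield fewer visible bands than predicted fragments.

```python
import re

# DdeI recognizes CTNAG and cuts between C and T on the top strand.
# The site is its own reverse complement, so one strand is enough.
_DDEI_SITE = re.compile(r"(?=CT[ACGT]AG)")
_CUT_OFFSET = 1  # cut position relative to the start of the site


def ddei_fragment_lengths(amplicon: str) -> list:
    """Return fragment lengths (5' to 3') of a linear DdeI digest."""
    seq = amplicon.upper()
    cuts = [m.start() + _CUT_OFFSET for m in _DDEI_SITE.finditer(seq)]
    edges = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(edges, edges[1:])]
```

Two amplicons with the same ordered list of fragment lengths would be expected to give indistinguishable digest profiles, although identical profiles do not by themselves prove identical sequences.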
sof 5' sequence types usually predict AOF types
Of 113 strains that we attempted to type for AOF, 75 were typable. Nine of the 160 strains included in this study failed to produce detectable OF and could not be tested. AOF specificities of 73 of the 75 AOF-typable strains were directly predictable by sof gene sequences that were identical or nearly identical to the M type reference strain (Table 1). The AOF types 2, 4, 9, 11, 22, 25, 27G, 28, 44, 48, 49, 58, 59, 60, 61, 62, 63, 66, 68, 73, 75, 76, 77, 78, 81, 87, 89, 92 and ST2967 were only found in reference strains and/or clinical isolates with the corresponding designated sof sequence types (Table 1). In four instances, AOF NT strains were found within a given sof type for which corresponding AOF typing sera were available. These included the only sof13L strain, one of three sof4 strains and two of five sof75 strains. The basis for this nontypability is not known, although among the sof75 strains it did not appear to be due to gross alterations of the sof structural gene, since DdeI restriction digest profiles of F2+R5-generated 3 kb amplicons showed identical seven-band profiles.
In four instances, identical AOF types were found between strains with different emm types and/or M serotypes. The only strains that typed as AOF75 were of the emm types emm75, emm84 and st1815 (Table 1), which correlated with perfect or nearly perfect sequence identity over the 5' 621 bases. Similarly, identical sof sequence and DdeI digestion data found for all five of the emm76 and emm85 strains (four of which were AOF type 76) indicate the identity or near identity of the sof genes among these strains. A third example of multiple emm types sharing the same AOF type and 5' sof sequence was found with emm type st1160 and st2967 strains (Table 1). Finally, the emm type st833 isolate was AOF type 90, in agreement with its partial sof product differing from the corresponding Sof90 sequence by only a five-residue deletion in the serine-rich region amino-proximal to the enzymic domain (Table 1), and the indistinguishable DdeI profiles shared between the two sof amplicons (data not shown). Nearly identical sequence to sof90 was also found in a strain with the emm type st6735 (Table 1).
Sof9 and Sof44 confer distinct AOF types, but share identity over their N-terminal 43%
In only one instance were identical 5' sof sequences for the first 180–270 codons found between strains belonging to two distinct AOF types. The sof9 and sof44 sequences were identical over their 5' 342 codons. However, in agreement with the clearly distinct AOF specificity between AOF type 9 and 44 strains, the sof9 and sof44 genes were found to abruptly diverge after residue 343 of the predicted mature Sof9 product (Fig. 3a). The sequence of Sof9 residues 343–810, corresponding to the C-terminal 445 residues of the 695 aa Sof2 enzymic domain (Courtney et al., 1999), was distinct from the equivalent region of Sof44 (residues 343–794). The distinct AOF types conferred by Sof9 and Sof44 suggest that type-specific AOF epitopes may only reside in the C-terminal 450 residues of the enzymic domain.
Analogous to the AOF type 61 situation, SS1457 (sof1207) was found to be AOF type ST2967, while the Sof2967 and Sof1207 mature N-terminal residues (1–782 and 1–808 respectively) were only 68% identical over their entire overlap. Closer analysis revealed striking similarity (91% identity) between the two proteins over a 107-residue region within their putative enzymic domains (Fig. 3c). In contrast, for 15 other available GAS Sof proteins for which this sequence was available, their corresponding 107-residue regions shared only 20–58% identity. These Sof proteins included four from previous studies (Rakonjac et al., 1995; Courtney et al., 1999) and 12 from the present study, including Sof11, Sof12, Sof28, Sof44, Sof61, Sof77, Sof81, Sof82, Sof87, Sof88, Sof1482 and Sof4539 (see accession nos in Table 1). Additionally, Sof9 was found to share 84% identity with Sof2967 over this 107-residue region (corresponding to Sof9 residues 356–462; compare Fig. 3a and 3c); however, the region also included seven nonconservative substitutions dispersed along the length of the Sof2967/Sof9 overlap, in contrast to only two nonconservative substitutions found in the Sof2967/Sof1207 overlap. At this point it appears logical to speculate that all strains within a given AOF type may share strong homology in the region corresponding to this 107-residue domain associated with AOF type 2967.
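Region-restricted comparisons of this kind can be reproduced with a short Python sketch. It assumes the two regions correspond residue for residue without internal gaps and uses 1-based inclusive coordinates, as in the text; the helper names are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def extract_region(protein: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError("range lies outside the sequence")
    return protein[start - 1:end]


def region_identity(seq_a: str, range_a: tuple,
                    seq_b: str, range_b: tuple) -> float:
    """Percent identity between two equal-length, ungapped regions."""
    a = extract_region(seq_a, *range_a)
    b = extract_region(seq_b, *range_b)
    if len(a) != len(b):
        raise ValueError("regions must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)
```

As a consistency check on the coordinates quoted above, `span_length(356, 462)` gives 107, matching the stated length of the region in Sof9.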
A portion of the region putatively encoding the fibronectin-binding repeats (Fig. 2), in the seven sof genes for which these longer sequences (2·7–2·8 kb) were obtained (sof61, sof1482, sof3875, sof9, sof44, sof1207 and sof2967), was, as expected, highly conserved with the corresponding regions of other known sof products (Kreikemeyer et al., 1995; Rakonjac et al., 1995; Courtney et al., 1999).
Concordance between sof and emm types
For the majority of strains, the 5' 189–258 codon sof gene sequence was either identical or highly similar to the corresponding sof sequence found in the reference strain of the same emm type (Table 1). For example, for both emm22 and emm28 strains, 6/6 strains contained identical 5' 657–696 bp sof sequences (data not shown, depicted as identical amino acid sequences in Table 1). For the majority of the specific emm type reference strains shown in Table 2, dating from as long ago as 1949, their deduced Sof amino acid sequences were >99% identical to 1–5 recent clinical isolates with the corresponding emm type (Table 1). Analogous to what has been seen with M types and corresponding emm sequence types (Whatmore et al., 1994), for the majority of strains within specific AOF types there appears to be surprisingly little allelic variation of sof genes within the common 5' variable region analysed. For the majority of given sof gene comparisons, identical amino acid sequences corresponded to identical nucleotide sequences, with few examples of silent base substitutions. Although any base substitution was uncommon within the various sof gene designations assigned in Table 1, there were more base substitutions resulting in missense mutations than in silent substitutions. Additionally, deletions/insertions of 1–14 codons, or two single-base deletions resulting in short frame-shifts, were not common (Tables 1 and 2). The observed deletions were often associated with tandem homologous repeats, analogous to those seen in the GAS emm and sic deletion alleles (Hollingshead et al., 1997; Mejia et al., 1997).
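The distinction between silent and missense substitutions can be made explicit with a small Python sketch that compares two in-frame coding sequences of equal length codon by codon, using the standard genetic code. The function name is hypothetical, counts are per changed codon rather than per changed base, and insertions, deletions and frame-shifts are outside its scope.

```python
# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: _AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(_BASES)
    for j, b2 in enumerate(_BASES)
    for k, b3 in enumerate(_BASES)
}


def classify_codon_changes(reference: str, variant: str) -> dict:
    """Count codons whose change is silent, missense or nonsense."""
    if len(reference) != len(variant) or len(reference) % 3:
        raise ValueError("need in-frame sequences of equal length")
    counts = {"silent": 0, "missense": 0, "nonsense": 0}
    for pos in range(0, len(reference), 3):
        ref_codon = reference[pos:pos + 3].upper()
        var_codon = variant[pos:pos + 3].upper()
        if ref_codon == var_codon:
            continue
        ref_aa, var_aa = CODON_TABLE[ref_codon], CODON_TABLE[var_codon]
        if ref_aa == var_aa:
            counts["silent"] += 1      # same amino acid encoded
        elif var_aa == "*":
            counts["nonsense"] += 1    # new stop codon
        else:
            counts["missense"] += 1    # different amino acid encoded
    return counts
```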
PFGE profiles from strains with sof/emm combinations of the same designations were very similar to PFGE profiles from the majority of randomly selected strains within the same emm type, indicating that these particular emm types consist mainly of highly genetically related strains (Table 2). Not surprisingly, strains with unusual sof gene associations for a given emm type also differed in their PFGE patterns from the major pattern observed within an emm type (Table 2, see strains 2920-97 and 4835-96). Strains sharing both highly conserved sof and emm genes also usually shared related T agglutination patterns, although exceptions are evident in Table 1 (for example, see strain 826-97 compared to other emm2, sof2 strains; 6039-99 compared to other emm89, sof89 strains; and the two emm59, sof59 strains SS1454 and SS913).
Table 3 summarizes the number of isolates within various sof-positive emm types that we have identified during the last 3 years. In general, for the emm types with 10 or more isolates listed in Table 3, this reflects their relative isolation frequency compared to the other sof-positive emm types in our ongoing population-based GAS surveillance within the US (Beall et al., 1997, 1998; Zurawski et al., 1998; http://www.cdc.gov/ncidod/biotech/strep/strepindex.html). It is evident that there is usually a specific 5' sof sequence type most commonly associated with a given frequently occurring emm type, and it is also evident that these sof genes are sometimes conserved between strains of different genetic backgrounds (reflected by different emm types and T agglutination patterns).
Distinct 5' sof sequences found within the same emm type
In several examples, strains within highly conserved 5' emm sequence types were characterized by having distinct AOF types and/or 5' sof sequence types. These types included the emm27L/77 and emm44/61 strains described above, which were further distinguishable by differing T agglutination types (Table 1). A third sof sequence found in an emm44/61 strain was the unique sof3930 sequence, which correlated with AOF nontypability. The other examples of non-concordant emm/sof associations were found within the emm types emm4 (AOF4/sof4 and AOF NT/sof2920), emm11 (AOF11/sof11 and AOF25/sof25), emm25 (sof25, sof4958 and sof75), emm68 (AOF68/sof68, AOF NT/sof4470 and AOF NT/sof4438), emm81 (AOF81/sof81, AOF NT/sof1658, AOF NT/sof1965), emm88 (sof88 and sof1482), emm89 (AOF89/sof89, AOF NT/sof4835), st4935 (sof4935, sof1881, AOF27G/sof27G) and st448 (sof448, sof3894). When strains of the same emm type were of different 5' sof sequence types, it is probable that this would correlate with dissimilar PFGE profiles, indicating divergent genetic lineages, as in the two examples shown in Table 2.
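The non-concordant associations listed in this paragraph can be tabulated programmatically. The Python sketch below records only the emm/sof pairs named in the paragraph (it is not a transcription of Table 1, and it omits the emm27L/77 and emm44/61 strains, whose sof types are not all listed here) and reports emm types carrying more than one sof type, as well as sof types found in more than one emm type. The function names are hypothetical.

```python
from collections import defaultdict

# (emm type, sof type) pairs as named in the paragraph above.
PAIRS = [
    ("emm4", "sof4"), ("emm4", "sof2920"),
    ("emm11", "sof11"), ("emm11", "sof25"),
    ("emm25", "sof25"), ("emm25", "sof4958"), ("emm25", "sof75"),
    ("emm68", "sof68"), ("emm68", "sof4470"), ("emm68", "sof4438"),
    ("emm81", "sof81"), ("emm81", "sof1658"), ("emm81", "sof1965"),
    ("emm88", "sof88"), ("emm88", "sof1482"),
    ("emm89", "sof89"), ("emm89", "sof4835"),
    ("st4935", "sof4935"), ("st4935", "sof1881"), ("st4935", "sof27G"),
    ("st448", "sof448"), ("st448", "sof3894"),
]


def group_pairs(pairs, key_index: int) -> dict:
    """Group pairs by emm type (key_index=0) or by sof type (key_index=1)."""
    groups = defaultdict(set)
    for pair in pairs:
        groups[pair[key_index]].add(pair[1 - key_index])
    return dict(groups)


def with_multiple_partners(groups: dict) -> dict:
    """Keep only keys associated with more than one partner type."""
    return {key: sorted(vals) for key, vals in groups.items() if len(vals) > 1}
```

Grouping by sof type shows that, within this list, only sof25 is shared between two emm types (emm11 and emm25).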
DISCUSSION |
---|
It is important to note that although emm is often referred to as the M protein gene, the term is sometimes applied to any one of up to three emm-like genes that lie at the vir locus. In this work emm refers to the specific gene amplified by primers 1 and 2 (Whatmore et al., 1994), and it is this specific gene that has been shown in several strains to encode the protein that evokes M-type-specific antibodies (Hollingshead et al., 1986; Robbins et al., 1987; Miller et al., 1988; Mouw et al., 1988; Dale et al., 1993, 1996). Nonetheless, at present it has not been established that the emm gene provides the basis of M type specificity in all GAS strains. This is especially true of OF+ strains, in which emm is usually flanked by two additional emm-like genes situated at the vir locus (Haanes et al., 1992; Hollingshead et al., 1993; Podbielski, 1993). This study may be the first fairly extensive analysis of the circumstantial coincidence of specific emm sequences and M serotypes in OF+ clinical isolates.
This work provides further strong circumstantial evidence that it is the emm gene that encodes M type specificity. The first observation is that in a diverse group of GAS including 64 of 68 M-typable strains, the specific M type correlated with a highly specific emm sequence type. Second, in many instances the same M and corresponding emm type was found among strains judged to be of differing genetic backgrounds on the basis of differing sof genes and T agglutination patterns (Table 1). Third, in three circumstances, identical M types and corresponding identical emm sequences were found within distinct M type reference strains (see M44/M61, M27L/M77 and PT1658/M81). In each of these three examples, one serotype was established many years prior to the later one, suggesting that inadvertently the later M type reference strain may not have been exhaustively tested against all previous M typing sera. The M27L/M77 and PT1658/M81 results are straightforward and expected on the basis of identical emm genes. The basis of the apparent contradiction of emm44/61 (sof44 T5/27/44) strains having dual M44/M61 specificity while the emm44/61 (sof61 or sof3930, T11/12) strains had solely M61 type specificity is unknown. Since the M type 44 and 61 emm genes have not been entirely sequenced, it is possible that M44 contains epitopes that are not present in M61, or that even another emm-like gene encodes M44-specific epitopes. It should be noted that the M-type reference strains for M44 and M61, besides having identical 5' emm sequences encoding at least their first 85 mature residues, also have identical 5' enn sequences which map immediately downstream of emm (Whatmore et al., 1995). The fourth and perhaps strongest line of circumstantial evidence provided here that emm encodes the M-serotype-specific epitopes came from four sets of strains that shared M type identity or partial identity with M type reference strains (Fig. 1, see st1160 and M2, st448 and M49, st2147 and M59, M68.1 and M68). Although these strains had 5' emm gene sequences with significant differences from the M-type reference emm sequences, these emm genes obviously shared with them identical potential epitope-encoding sequences.
There have been few studies concerning the genetic diversity of strains within GAS M serotypes. Historically, M serotypes have been treated as indicative of strain types. An earlier study documented the identical emm types shared among various reference strains and demonstrated their differing genetic backgrounds (Whatmore et al., 1994). This study also clearly demonstrates that presenting M serotypes as strain types is an oversimplification. Besides the M type reference strains discussed above, we have found current clinical isolates with identical M serotypes that vary in their sof gene sequences and associated AOF types, emm gene sequences and PFGE profiles (Tables 1–3). There have been multiple studies that have assumed M serotypes based upon AOF types (for one example see Colman et al., 1993). This approach is possibly valid for the majority of isolates obtained in developed countries, although at this point it is not possible to be certain. Between roughly August 1999 and November 1999 we analysed more than 80 additional clinical isolates from patients within the US, including two or more isolates within the frequently encountered emm types 2, 4, 11, 12, 22, 27L/77 (T13), 28, 48, 58, 75, 82, 87 and 89. In each isolate, there was perfect agreement of emm and sof sequence designations as determined either by direct sequencing or by comparison of the highly conserved emm and sof restriction profiles. However, in this study we found strains within the M types 2, 11 and 61 that correlate with AOF types ST2967, 25 and 44 respectively, clearly indicating that M serotypes should not be inferred from AOF typing (Table 1). To our knowledge, this is the first report of strains with M types associated with more than one AOF type. These data are corroborated in strains 1160-99 (M2, AOF2967), 4808-96 (M11, AOF25) and SS511 (M44/61, AOF44), by the presence of distinct sof and emm gene sequences that are nearly identical to the corresponding serotyping-reference-strain gene sequences. It must be re-emphasized that this remarkable diversity of sof types within defined emm types would not be expected from a random study of strains within given emm types, but is a direct result of our attempts to include genetically diverse GAS strains through examining strains with unusual T type/emm associations and from developing countries, where we have previously found a large degree of strain diversity (Jamal et al., 1999; Facklam et al., 1999).
It appears that continued sequence-based analysis of heterologous sof and emm gene sets that confer identical serological specificity may aid in the identification of the specific epitopes responsible for M type and AOF type specific reactions. The identical Sof 655-residue sequence immediately N-terminal to the fibronectin-binding repeats shared between the three AOF61 strains, as well as the highly conserved 107-residue region shared between sof1207 and sof2967, are fully consistent with the previously mapped Sof2 enzymic domain (Fig. 3a, c). This indicates the likelihood that critical type-specific AOF epitopes of these proteins reside within Sof regions corresponding to Sof9 residues 343–810. Since Sof9 residues 356–462 correspond to the apparently critical Sof2967 residues 341–447, it is possible that this 107-residue region represents the sole region determining AOF type specificity. Further work, possibly involving the use of purified protein fragments and site-directed mutagenesis, is required to clarify this. It is also possible that additional short regions that are conserved among all Sof proteins contain epitopes critical for the AOF reaction. Work involving simple AOF sera absorptions with heterologous Sof proteins should address this possibility.
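The residue arithmetic behind the 107-residue correspondence can be checked with a short sketch. The function names are hypothetical, and the position mapping assumes that the two regions align without gaps, which is an illustrative assumption rather than something stated in the text.

```python
from typing import Tuple

Region = Tuple[int, int]  # inclusive (start, end) residue numbers


def region_length(start: int, end: int) -> int:
    """Number of residues in the inclusive range start..end."""
    return end - start + 1


def map_position(pos: int, src: Region, dst: Region) -> int:
    """Map a residue number from src to dst.

    Assumes an ungapped, equal-length alignment (illustrative assumption).
    """
    if region_length(*src) != region_length(*dst):
        raise ValueError("regions differ in length; ungapped mapping is invalid")
    if not src[0] <= pos <= src[1]:
        raise ValueError("position lies outside the source region")
    return pos - src[0] + dst[0]


# Residue ranges quoted in the text.
SOF9_REGION: Region = (356, 462)
SOF2967_REGION: Region = (341, 447)

print(region_length(*SOF9_REGION))     # 107
print(region_length(*SOF2967_REGION))  # 107
print(map_position(400, SOF9_REGION, SOF2967_REGION))  # 385
```

Both ranges span 107 residues and are offset by 15 positions, which matches the stated correspondence between the Sof9 and Sof2967 numbering.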
This work clearly indicates that the probable basis of the OF- phenotype in most classically OF- emm/M types is simply the absence of the sof gene, which is consistent with previous work demonstrating the absence of sof-hybridizing sequences in several OF- M types (Rakonjac et al., 1995) and the absence of sof sequences in the type M1 GAS genome (see http://www.genome.ou.edu/strep.html). We were unable to amplify sof sequences from various OF- strains (with the exception of emm12 strains), although the possibility exists that the primers used do not anneal with sof sequence types present in some OF- strains. Due to the variability of OF activity within certain strains, we find that sof amplification is much more reliable than OF testing for the general deduction of whether an isolate has a classical OF+ or OF- emm type (Johnson & Kaplan, 1993). Typically, OF- strains, including M/emm types 1, 3, 5, 6, 12, 18, 33 and 56, which were emm types found in some of the sof-negative strains referred to in this study, are designated class I GAS due to their M protein reactivity with defined monoclonal antibodies associated with class I M proteins (Bessen et al., 1989). These strains typically have emm and emm-like gene arrangements at the vir locus categorized as one of the patterns A, B or C based upon their number and their peptidoglycan-spanning-domain sequence (Bessen et al., 1996). The emm12 isolates are the only class I and/or pattern A–C strains known at this time that have been associated with a specific sof sequence.
The results shown in this work indicate that sof or emm sequence-based analysis is generally more discriminating than serological analysis for subtyping strains. For example, strains within M serotypes 2 and 59, and within AOF types 61 and ST2967, could be further subdivided by emm and sof sequence differences. There are many examples of AOF NT and M NT strains listed in Table 1, although all were sof and emm typable (with the single exception of the emm NT sof2 strain 1588-99). Even the sof9 and sof44 amplicons, which share the identical sequence over their first 1026 bases, are readily distinguishable by further sequence comparison that can be easily obtained with universal sof sequencing primers. Also, the apparent full-length conservation among many sof genes should allow their identification through conserved sof amplicon restriction profiles.
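The two comparisons described above, finding where two amplicons first diverge and comparing restriction profiles, can be sketched as follows. This is a minimal illustration: the sequences are invented toy strings, not sof sequences, the function names are hypothetical, and the GGCC site (cut after the second G, as for HaeIII) is an example choice rather than the enzyme used in this study.

```python
from typing import List


def shared_prefix_length(a: str, b: str) -> int:
    """Count the leading bases that are identical in both sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def fragment_lengths(seq: str, site: str, cut_offset: int) -> List[int]:
    """Fragment lengths (largest first) after cutting a linear sequence
    at every occurrence of `site`, cut_offset bases into the site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted((hi - lo for lo, hi in zip(bounds, bounds[1:])), reverse=True)


# Toy amplicons (hypothetical): identical for the first 11 bases, then divergent.
amplicon_a = "ATGGCCTTAAGGCCTTTA"
amplicon_b = "ATGGCCTTAAGACCTTTA"

print(shared_prefix_length(amplicon_a, amplicon_b))  # 11
print(fragment_lengths(amplicon_a, "GGCC", 2))       # [8, 6, 4]
print(fragment_lengths(amplicon_b, "GGCC", 2))       # [14, 4]
```

In this toy case a single base difference downstream of the shared prefix removes one recognition site, so the two amplicons give different fragment patterns even though their first 11 bases are identical.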
ACKNOWLEDGEMENTS |
---|
REFERENCES |
---|
Beall, B., Facklam, R., Hoenes, S. & Schwartz, B. (1997). A survey of emm gene sequences from systemic Streptococcus pyogenes infection isolates collected in San Francisco, California; Atlanta, Georgia; and Connecticut in 1994 and 1995. J Clin Microbiol 35, 1231-1235.
Beall, B., Facklam, R. R., Elliott, J. A., Franklin, A. R., Hoenes, T., Jackson, D., Laclaire, L., Thompson, T. & Viswanathan, R. (1998). Streptococcal emm types associated with T agglutination types and the use of conserved emm gene restriction fragment patterns for subtyping group A streptococci. J Med Microbiol 47, 893-898.
Bessen, D. E. & Hollingshead, S. K. (1994). Allelic polymorphism of emm loci provides evidence for horizontal gene spread in group A streptococci. Proc Natl Acad Sci USA 91, 3280-3284.
Bessen, D., Jones, K. F. & Fischetti, V. A. (1989). Evidence for two distinct classes of streptococcal M protein and their relationship to rheumatic fever. J Exp Med 169, 269-283.
Bessen, D. E., Sotir, C. M., Readdy, T. L. & Hollingshead, S. K. (1996). Genetic correlates of throat and skin isolates of group A streptococci. J Infect Dis 173, 896-900.
Colman, G., Tanna, A., Efstratiou, A. & Gaworzewska, E. T. (1993). The serotypes of Streptococcus pyogenes present in Britain during 1980-1990 and their association with disease. J Med Microbiol 39, 165-178.
Courtney, H. S., Hasty, D. L., Li, Y., Chiang, H. C., Thacker, J. L. & Dale, J. B. (1999). Serum opacity factor is a major fibronectin-binding protein and a virulence determinant of M type 2 Streptococcus pyogenes. Mol Microbiol 32, 89-98.
Dale, J. B. (1999). Group A streptococcal vaccines. Infect Dis Clin North Am 13, 227-243.
Dale, J. B., Chiang, E. Y. & Lederer, J. W. (1993). Recombinant tetravalent group A streptococcal M vaccine. J Immunol 151, 2188-2194.
Dale, J. B., Simmons, M., Chiang, E. C. & Chiang, E. Y. (1996). Recombinant, octavalent group A streptococcal M protein vaccine. Vaccine 14, 944-948.
Dale, J. B., Cleary, P. P., Fischetti, V. A., Kasper, D. L., Musser, J. M. & Zabriskie, J. B. (1997). Group A and group B streptococcal vaccine development: a round table presentation. Adv Exp Med Biol 418, 863-868.
Dillon, H. C. & Dillon, M. S. A. (1974). New streptococcal serotypes causing pyoderma and acute glomerulonephritis types 59, 60, and 61. Infect Immun 9, 1070-1078.
Facklam, R., Beall, B., Efstratiou, A. & 13 other authors (1999). Demonstration of emm typing and validation of provisional M types for group A streptococci. Emerg Infect Dis 5, 247-253.
Fischetti, V. A. (1989). Streptococcal M protein: molecular design and biological behavior. Clin Microbiol Rev 2, 285-314.
Fraser, C. A. M. & Colman, G. (1985). Some provisional types among Streptococcus pyogenes. In Recent Advances in Streptococci and Streptococcal Diseases: Proceedings of the IX Lancefield Symposium on Streptococci and Streptococcal Diseases, pp. 35-36. Edited by Y. Kimura, S. Kotami & Y. Shiokaswa. Bracknell, UK: Reedbooks.
Gooder, H. (1961). Association of a serum opacity reaction with serological type in Streptococcus pyogenes. J Gen Microbiol 25, 347-352.
Haanes, E. J., Heath, D. J. & Cleary, P. P. (1992). Architecture of the vir regulons of group A streptococci parallels opacity factor phenotype and M protein class. J Bacteriol 174, 4967-4976.
Hollingshead, S. K., Fischetti, V. A. & Scott, J. R. (1986). Complete nucleotide sequence of type 6 M protein of the group A streptococcus. J Biol Chem 261, 1677-1686.
Hollingshead, S. K., Readdy, T. L., Yung, D. L. & Bessen, D. E. (1993). Structural heterogeneity of the emm gene cluster in group A streptococci. Mol Microbiol 8, 707-717.
Hollingshead, S. K., Fischetti, V. A. & Scott, J. R. (1997). Size variation in group A streptococcal M protein is generated by homologous recombination between intragenic repeats. Mol Gen Genet 207, 196-203.
Jamal, F., Pit, S., Facklam, R. & Beall, B. (1999). New emm (M protein gene) sequences obtained from group A streptococci isolated from Malaysian patients. Emerg Infect Dis 5, 182-183.
Johnson, D. R. & Kaplan, E. L. (1993). A review of the correlation of T-agglutination patterns and M-protein typing and opacity factor production in the identification of group A streptococci. J Med Microbiol 38, 311-315.
Johnson, D. R. & Kaplan, E. L. (1996). Laboratory Diagnosis of Group A Streptococcal Infections. Bahrain: World Health Organization.
el-Kholy, A. M., Sorour, A. H., Rotta, J. & Guirguirs, N. (1973). Group A beta hemolytic streptococci in skin lesions among an Egyptian school children population. J Hyg Epidemiol Microbiol Immunol 17, 316-322.
Kreikemeyer, B., Talay, S. R. & Chhatwal, G. S. (1995). Characterization of a novel fibronectin-binding surface protein in group A streptococci. Mol Microbiol 17, 137-145.
Krumwiede, E. (1954). Studies on a lipoproteinase of group A streptococci. J Exp Med 100, 629-638.
Lancefield, R. C. (1962). Current knowledge of the type specific M antigens of group A streptococci. J Immunol 89, 307-313.
Lindgren, P. E., McGavin, M. J., Signas, C., Guss, B., Gurusiddappa, S., Hook, M. & Lindberg, M. (1993). Two different genes coding for fibronectin-binding proteins from Streptococcus dysgalactiae: the complete nucleotide sequences and characterization of the binding domains. Eur J Biochem 214, 819-827.
Maxted, W. R., Widdowson, J. P. M., Fraser, C. A., Ball, L. & Bassett, D. C. J. (1973). The use of the serum opacity reaction in the typing of group A streptococci. J Med Microbiol 6, 83-90.
Mejia, L. M., Stockbauer, K. E., Pan, X., Cravioto, A. & Musser, J. M. (1997). Characterization of group A Streptococcus strains recovered from Mexican children with pharyngitis by automated DNA sequencing of virulence-related genes: unexpectedly large variation in the gene (sic) encoding a complement-inhibiting protein. J Clin Microbiol 35, 3220-3224.
Miller, L., Gray, L., Beachey, E. & Kehoe, M. (1988). Antigenic variation among group A streptococcal M proteins: nucleotide sequence of the serotype 5 M protein gene and its relationship with genes encoding types 6 and 24 M proteins. J Biol Chem 263, 5668-5673.
Mouw, A. R., Beachey, E. H. & Burdett, V. (1988). Molecular evolution of streptococcal M protein: cloning and nucleotide sequence of the type 24 M protein gene and relation to other genes of Streptococcus pyogenes. J Bacteriol 170, 676-684.
Nielsen, H., Engelbrecht, J., Brunak, S. & von Heijne, G. (1997). Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng 10, 1-6.
Pack, T. D. & Boyle, M. D. (1995). Characterization of a type IIo group A streptococcal immunoglobulin-binding protein. Mol Immunol 32, 1235-1243.
Podbielski, A. (1993). Three different types of organization of the vir regulon in group A streptococci. Mol Gen Genet 237, 287-300.
Podbielski, A., Melzer, B. & Lutticken, R. (1991). Application of the polymerase chain reaction to study the M protein(-like) gene family in beta-hemolytic streptococci. Med Microbiol Immunol 180, 213-227.
Rakonjac, J. V., Robbins, J. C. & Fischetti, V. A. (1995). DNA sequence of the serum opacity factor of group A streptococci: identification of a fibronectin-binding repeat domain. Infect Immun 63, 622-631.
Robbins, J. C., Spanier, J. G., Jones, S. J., Simpson, W. J. & Cleary, P. P. (1987). Streptococcus pyogenes type 12 M protein gene regulation by upstream sequences. J Bacteriol 169, 5633-5640.
Saravani, G. A. & Martin, D. R. (1990). Opacity factor from group A streptococci is an apoproteinase. FEMS Microbiol Lett 56, 35-39.
Tenover, F. C., Arbeit, R. D., Goering, R. V., Mickelson, P. A., Murray, B. E., Persing, D. H. & Swaminathan, B. (1995). Interpreting chromosomal DNA restriction patterns produced by pulsed-field gel electrophoresis: criteria for bacterial strain typing. J Clin Microbiol 33, 2233-2239.
Whatmore, A. M., Kapur, V., Sullivan, D. J., Musser, J. M. & Kehoe, M. A. (1994). Non-congruent relationships between variation in emm gene sequences and the population genetic structure of group A streptococci. Mol Microbiol 14, 619-631.
Whatmore, A. M., Kapur, V., Musser, J. M. & Kehoe, M. A. (1995). Molecular population genetic analysis of the enn subdivision of group A streptococcal emm-like genes: horizontal gene transfer and restricted variation among enn genes. Mol Microbiol 15, 1039-1048.
Widdowson, J. P., Maxted, W. R. & Grant, D. L. (1970). The production of opacity in serum by group A streptococci and its relation with the presence of the M antigen. J Gen Microbiol 61, 343-353.
Zurawski, C. A., Bardsley, M. S., Beall, B., Elliott, J. A., Facklam, R., Schwartz, B. & Farley, M. M. (1998). Invasive group A streptococcal disease in metropolitan Atlanta: a population-based assessment. Clin Infect Dis 27, 150-157.
Received 19 October 1999; revised 6 January 2000; accepted 14 January 2000.