Department of Biosciences, The Karolinska Institute, Novum, SE-14157 Huddinge, Sweden and 1 Biology and Biotechnology Research Program, Lawrence Livermore National Laboratory, L-441, PO Box 808, Livermore, CA 94550, USA
* To whom correspondence should be addressed. Tel: +46 8 6089254; Fax: +46 8 6081501; E-mail: bo.lambert{at}cnt.ki.se
Abstract |
---|
Abbreviations: HPRT, hypoxanthine-guanine phosphoribosyl transferase; MF, frequency of mutation
Introduction |
---|
Epidemiological studies have provided associations between human cancer morbidity and environmental exposures and life style, but many of these associations have not yet been mechanistically explained. Multiple mutations are implicated in carcinogenesis, and somatic mutagenesis is one possible link between environmental exposure and cancer disease. Results from a recent study of the influence of dietary factors on the frequency of somatic in vivo mutation provided a mechanistic support for a cancer-protective effect of vegetables and fruit by modulation of somatic mutagenesis (1).
A mutation has a certain degree of specificity in that it may bear the signature of the type of damage that induced it, be it a spontaneous mistake during normal DNA replication or repair, some endogenous metabolite, or an environmental chemical or radiation exposure. The spectrum of gene-specific mutations in a tissue, i.e. the frequency distribution of different types of mutation along a defined nucleotide sequence in DNA, could provide information about the aetiology of mutations. Results from studies of the p53 tumour suppressor gene-specific mutational spectrum in various human tumours from different regions of the world have provided evidence of the environmental factors implicated in skin, liver and lung carcinogenesis (2). Similarly, the spectrum of somatic mutation in somatic cells from the individuals of a healthy population could serve as a useful in vivo marker of past and present exposure to genotoxic agents, and help to explain why some specific environmental factors are associated with an increased cancer risk.
Several studies of HPRT (hypoxanthine-guanine phosphoribosyl transferase) gene mutations in human cultured cells and T-lymphocytes in vivo have provided evidence that age, exposure and genetic factors influence mutation frequency (MF). An increased MF with increasing age in normal healthy people is generally observed [reviewed in (3,4)]. Moreover, certain occupational exposures (5,6) and life style factors such as smoking [reviewed in (3,4)] have been associated with an increased MF, while the intake of specific dietary items seems to have a protective effect against mutations and cancer (1). Some inherited polymorphisms of genes involved in metabolism and DNA repair have also been shown to influence the HPRT MF [e.g. (5,6)].
In order to study the possible influence of environmental and life style factors on somatic mutagenesis, we have identified 161 HPRT mutations in T-lymphocytes in a population of 50 healthy Russians, and compared the base substitutions in this Russian spectrum with previously established spectra of base substitution mutations in populations from USA (7) and Sweden (8,9).
Materials and methods |
---|
The USA and Swedish populations, which are included for comparison, have also been described previously. The USA population comprised healthy smokers and non-smokers from the Raleigh-Durham area of North Carolina (7). The Sweden 1 population comprised a group of healthy garage workers, laboratory personnel and mechanics, all non-smokers. The garage workers had been occupationally exposed to diesel exhausts, and as a result had increased levels of aromatic DNA adducts, but no increase of HPRT MF compared with the laboratory personnel and mechanics who served as controls (6). The Sweden 2 population was collected as healthy controls for lung cancer patients, and comprised smokers as well as non-smokers. The mean age of this population was higher than that of any of the other populations (12). All individuals included in the two Swedish spectra were either working or living in the county of Stockholm. The composition of the populations with respect to age, gender and smoking habits is shown in Table I.
For each donor, 15 thioguanine-resistant clones were expanded by dilution from one well to 48–96 wells, depending upon the size and vigour of the clone in the initial well. All wells of a clone were pooled and harvested after 7–28 days. An aliquot of the cells was lysed for DNA analysis and the rest frozen in a controlled freezing chamber and stored in liquid nitrogen.
The mutants were analysed for HPRT deletions with PCR-based methods aiming at detection of retention, loss or change of eight gene fragments containing the exons and flanking intron sequences of genomic HPRT DNA. In the population of healthy controls who have been further analysed in the present work, 19% of mutants were found to contain a genomic HPRT deletion (14). Mutant clones that showed no detectable change in the deletion analysis and contained a sufficient amount of DNA for further study were selected for analysis of point mutations by reverse transcriptase-PCR and DNA sequencing methods. Frozen aliquots of these clones in dimethyl sulfoxide medium were sent in two separate shipments in dry ice containers to Stockholm in 1999, 4 years after they had been collected and stored. Upon arrival, samples in one of the shipments were partly thawed, and only a few of these clones were suitable for analysis. The second shipment was in good condition. The cell pellets were thawed, washed in 5 ml 1x phosphate-buffered saline (PBS), diluted with 4 ml PBS and redistributed into four tubes. One tube was used directly for RNA isolation, while the three other tubes were put into the −80°C freezer.
RNA was isolated with a Purescript kit (GENTRA Systems). cDNA synthesis was carried out for 1.5 h at 37°C in 4 µl M-MLV RT 5x buffer (Promega) containing 500 µM of each dNTP (Promega), 1.6 µM of reverse primer Y3, 1 U/µl RNAsin (Pharmacia) and 2.5 U/µl M-MLV reverse transcriptase (Promega). Reverse transcriptase-PCR was carried out as described (9), except that biotinylated primers were not used. The nested PCR product was cleaned up using a MicroSpin column (S-400 HR) (Amersham Pharmacia Biotech).
Cycle sequencing was performed with Big Dye v.2 (Applied Biosystems): 4 µl Big Dye Terminator mix, 2 µl of 5x sequencing buffer, 1 µl of 20 µM primer and 11 µl nested PCR product (400 ng of double-stranded cDNA) was added to 20 µl of water. Cycle sequencing was run at 96°C for 3 min, 96°C for 10 s, 50°C for 5 s and 60°C for 4 min, for 25 cycles. The PCR product was cleaned by ethanol precipitation. Sequencing and primers were as in Podlutsky et al. (9). The reaction was run on a 377A Automated Sequencer (Applied Biosystems), and the sequences were analysed using Sequence Navigator and Edit View (Applied Biosystems).
Data analysis
The probability that two or more base substitutions in a set of random mutations would occur at the same site was calculated using the Poisson distribution and the Bonferroni correction to account for multiple comparisons (15). The calculations are based on the assumptions that (i) all observed mutations are independent and (ii) there are 300 mutable sites in the hprt coding sequence (8,9,16). For a set of 94 simple base substitutions, as in the present work, the probability of observing five or more mutations at any single site is <0.006, while four mutations in one position yields a P-value of 0.09. For the compiled spectrum of 382 mutations, the probability of observing eight or more mutations in one position is <0.025.
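The hotspot thresholds quoted above can be approximated with a short calculation. The sketch below is one reading of the procedure, not the authors' program: it assumes mutations fall uniformly on 300 equally mutable sites, takes the Poisson upper tail for a single site, and applies the Bonferroni correction by multiplying by the number of sites. The function names are illustrative.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below


def hotspot_p_value(k, n_mutations, n_sites=300):
    """Bonferroni-corrected probability of seeing k or more mutations
    at any single site, assuming n_sites equally mutable sites."""
    lam = n_mutations / n_sites          # expected mutations per site
    return min(1.0, n_sites * poisson_tail(k, lam))


if __name__ == "__main__":
    for n, k in [(94, 4), (94, 5), (382, 7), (382, 8)]:
        print(n, k, round(hotspot_p_value(k, n), 4))
```

Under these assumptions the calculation gives values in line with the text: five or more hits among 94 mutations fall below 0.006, four hits give roughly 0.09, and eight or more hits among 382 mutations fall below 0.025.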
Mutational spectra in the three study populations were compared with the Monte Carlo method of Adams and Skopek (15) and the program described by Cariello et al. (17). Two spectra were compared at a time, and positions showing no mutations were not used in the calculations. All P-values were based on 30 000 iterations. A P-value of <0.05 means that the spectra are different in a pairwise comparison, but since six comparisons were made, a Bonferroni correction for multiple comparisons with a corrected significance level of 0.0083 (0.05/6) was applied.
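For readers unfamiliar with this kind of test, the following is a simplified sketch of a Monte Carlo comparison of two spectra in the spirit of the method cited above. It is not the program of Cariello et al.; it assumes each spectrum is a mapping from site to mutation count, shuffles the pooled mutations between the two groups with the group sizes held fixed, and estimates how often a random table is no more probable than the observed one. Sites with no mutation in either spectrum never enter the table. All names and the toy counts are hypothetical.

```python
import math
import random
from collections import Counter


def log_table_weight(counts_a, counts_b):
    """Log hypergeometric probability of a 2 x K table, up to a constant
    fixed by the margins (proportional to 1 / product of cell factorials)."""
    return -sum(math.lgamma(c + 1) for c in counts_a) \
           - sum(math.lgamma(c + 1) for c in counts_b)


def compare_spectra(spec_a, spec_b, iterations=30000, seed=0):
    """Estimate the P-value for the hypothesis that both spectra are
    samples from one underlying distribution of mutations over sites."""
    rng = random.Random(seed)
    sites = sorted(set(spec_a) | set(spec_b))
    pooled = []
    for site in sites:
        pooled.extend([site] * (spec_a.get(site, 0) + spec_b.get(site, 0)))
    n_a = sum(spec_a.values())
    observed = log_table_weight(spec_a.values(), spec_b.values())
    hits = 0
    for _ in range(iterations):
        rng.shuffle(pooled)                      # random relabelling, margins fixed
        sim_a = Counter(pooled[:n_a])
        sim_b = Counter(pooled[n_a:])
        if log_table_weight(sim_a.values(), sim_b.values()) <= observed + 1e-9:
            hits += 1
    return hits / iterations


if __name__ == "__main__":
    toy_a = {197: 8, 617: 5, 118: 4}             # hypothetical counts
    toy_b = {197: 6, 617: 4, 146: 5}
    p = compare_spectra(toy_a, toy_b, iterations=2000)
    bonferroni_level = 0.05 / 6                  # six pairwise comparisons
    print(p, p < bonferroni_level)
```

With four spectra there are six pairwise comparisons, which is where the corrected level of 0.05/6 ≈ 0.0083 comes from.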
Results |
---|
The aim of the present study was to compile a mutational spectrum of single base substitution mutations for comparisons with previously established mutational spectra in healthy populations from USA (7) and Sweden (8,9). The HPRT mutant clones studied in this work included only those which had shown no detectable change in the previous deletion analysis (14). Moreover, since smoking is one factor that may influence the mutational spectrum, the intention was to include equal numbers of smokers and non-smokers. Owing to exhaustion of material, and problems during shipping, storage and thawing of the samples, these criteria were not fully achieved. As shown in Table I, mutant clones from 25 current smokers and 17 non-smokers were included. Three former smokers and five donors for whom smoking data were missing were also included to bring the total number of donors to 50. The average age and HPRT MF in this subset of Russian controls were similar to the original, complete control population. Within the present study population, smokers and non-smokers showed similar means for age and HPRT MF (Table I).
Numbers and types of mutations
A mutation was identified in 176 clones from the 50 individuals studied. In some donors, more than one mutant clone with the same mutation was identified. Identical mutations in different clones from one donor may be replicates of one original mutation, or they may represent separate, unique mutational events. Since no attempts were made to characterize these mutants further, each distinct mutation was counted only once per individual in the spectrum analysis, but the mutant clones with identical mutations are listed in Tables II–V. There were a total of 161 mutations after discounting these clonal replicates (Table II). The mean number of mutations per donor was 3.26. In 46 donors, between 1 and 5 mutations were collected. In 4 donors, the number of mutations/donor ranged from 7 to 17 (Figure 1).
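The counting rule described above (each distinct mutation counted once per donor) amounts to collapsing identical donor and mutation pairs. A minimal sketch, assuming each mutant clone is recorded as a `(donor_id, mutation)` tuple; the donor labels and the ASCII `>` notation for the substitution arrow are illustrative only.

```python
def count_unique_mutations(clones):
    """Count mutations after collapsing possible clonal replicates,
    i.e. identical mutations found in several clones from one donor."""
    return len(set(clones))


# Hypothetical records: two clones from donor_A carry the same mutation.
clones = [
    ("donor_A", "197G>A"),
    ("donor_A", "197G>A"),   # possible clonal replicate, counted once
    ("donor_A", "617G>A"),
    ("donor_B", "197G>A"),   # same mutation in another donor, counted separately
]

print(count_unique_mutations(clones))
```

The same mutation in two different donors is kept as two events, because it cannot be a clonal replicate.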
Mutations affecting splicing functions
In 44 mutants (including 4 possible clonal replicates), the HPRT cDNA had either lost one or several exons, or contained intron sequences, indicating mutations affecting the splicing functions (Table III). A single exon was missing in most of these mutants. Two mutants had a duplication of exon 2+3 and exon 6, respectively. In several mutants, cryptic splice sites were used in introns 1 and 5 and in exons 8 and 9. All these types of mutation have been described previously (20,21), and no attempts were made to further characterize the underlying change in genomic DNA. One mutant appeared to have two independent mutations; in addition to the loss of exon 6 there was also a 5 nt deletion in exon 2 (Table III, 1052ns/90). This mutant is also listed among the deletion/insertions in Table IV.
Deletion/insertion mutations
In 28 mutants (including one possible clonal replicate), the HPRT cDNA was found to contain small deletions and insertions ranging from 1 to 60 nt, which are listed in Table IV. Two mutants contained two changes: one was classified as compound, and the other one as complex. The first one (1052ns/90) showed a loss of the entire exon 6 in addition to a 5 nt deletion in exon 2; these were two apparently independent changes, as already mentioned above. The complex mutation (Table IV, 1379u/50) comprised one base substitution and one dinucleotide deletion separated by 5 nt in exon 7. Each of these changes is likely to give rise to a TG-resistant phenotype. The base substitution, 488T→C, predicts a change of residue 162Leu→Ser, and has been reported previously in the Human HPRT Mutation Database (22), and in T-cells in vivo (7). The dinucleotide deletion gives rise to a stop codon in the ninth codon downstream from the deletion point. However, since these two changes are located so close to each other, it is most likely that they are part of the same complex mutational event.
Of the remaining 26 mutants, 13 were deletions or insertions of 1 nt (±1 nt), and 13 were deletions of 4–60 nt (Table IV). There were 9 deletions of 1 nt as compared with 4 insertions of 1 nt, and both types of change were more common among smokers than non-smokers. There were no deletion/insertion mutations in exons 1 and 5, which are the two shortest exons, but as many as 10 (36% of all) were in exon 2, which is twice as many as expected from a random distribution according to exon length. The other exons showed between 1 and 5 mutations each, with a distribution close to expectation. Thus, it seems that exon 2, and especially the 5' half of the exon, is a region particularly prone to undergo deletion/insertion mutagenesis, as also observed and discussed in our previous work (23,24).
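The expectation referred to above can be illustrated with a few lines of arithmetic. The sketch assumes that the 28 deletion/insertion mutants are distributed in proportion to coding-exon length, and uses the commonly cited coding lengths of the nine HPRT exons; these lengths are an assumption not stated in this article, although they sum to the 657 bp coding region and agree with the exon boundaries mentioned in the text (for example, exon 2 ending at position 134 and exon 9 starting at position 610).

```python
# Assumed coding length (bp) of each HPRT exon; sums to the 657 bp coding region.
EXON_LENGTHS = {1: 27, 2: 107, 3: 184, 4: 66, 5: 18, 6: 83, 7: 47, 8: 77, 9: 48}


def expected_by_length(total_events, lengths=EXON_LENGTHS):
    """Expected number of events per exon if events fall at random
    in proportion to exon length."""
    coding_length = sum(lengths.values())
    return {exon: total_events * n / coding_length for exon, n in lengths.items()}


expected = expected_by_length(28)       # 28 deletion/insertion mutants
ratio_exon2 = 10 / expected[2]          # 10 observed in exon 2

print(round(expected[2], 2), round(ratio_exon2, 2))
```

On these assumptions about 4.6 events would be expected in exon 2, so the 10 observed are roughly twice the expectation, consistent with the statement in the text.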
Interestingly, there were two identical mutations in exon 8, with a breakpoint within the hypothetical palindromic sequence spanning nucleotide positions 533–557, which was previously pointed out as a possible mutational hotspot region in patients with lung cancer (25).
Base pair substitutions
In total, 94 single base pair substitutions were identified: 50 transitions and 44 transversions comprising 62 different substitutions at 55 different nucleotide positions. There were 85 missense and 9 nonsense mutations, all of which are listed in Table V with position, type of change, sequence context and predicted amino acid change. Different types of substitution were observed at seven positions: 3G→A or C, 74C→G or T, 131A→G or T, 136A→G or C, 197G→A or T, 368C→G or A, and 606→C or T.
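The split into transitions and transversions follows from the chemical class of the two bases: a change within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine/pyrimidine exchange is a transversion. A minimal sketch of that rule, with an illustrative function name:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError(f"not a valid base substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Position 3 shows both kinds of change: G to A and G to C.
print(classify_substitution("G", "A"))
print(classify_substitution("G", "C"))
```

Applied to position 3, the G to A change is a transition and the G to C change is a transversion.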
Surprisingly many of these mutations were novel in that they have not been previously reported to occur. Out of the 62 different base substitutions in Table V, as many as 8 (13%) are new additions to the 279 single base pair substitutions that are included in the human HPRT Mutation Data Base (22) (annotated as new in Table V), and as many as 21 (34%) are not included among the 169 different kinds of base substitutions that we have reported previously in human T-cells in vivo (7–9) (annotated as new T in Table V). However, 2 of these 21 mutations (Table V, 104T→A and 533T→G) are included among 48 different kinds of base substitutions that were detected in a study of T-cell mutations in Russian twins (26).
One of the 8 new mutations, 430C→T, creates a stop codon, 143Gln→Term. The other seven mutations are all missense: two (130G→C, 614T→A) occur in positions where different kinds of base substitutions had been reported previously, and 5 (136A→C, 136A→G, 410T→G, 479T→A, 487T→G) occur in positions where no base substitution mutations have been reported before. When these results are added to the database of presently known human HPRT mutations, the number of positions in the 657 bp coding region of the human HPRT gene that can give rise to missense or nonsense mutation by single nucleotide substitution amounts to 292, which corresponds to 44% of the total number of nucleotides. Although this is one further step towards saturation, the HPRT mutational spectrum is not yet completed, since there are still 14 possible nonsense mutations that have not been reported so far.
It is also interesting that, among the new mutations, there were some that changed one of the first or last two nucleotides of an exon, and still produced cDNA with no signs of exon skipping. These mutations affected the second nucleotide of exon 3 (136A→G and 136A→C), the second nucleotide of exon 7 (487T→G) and the first nucleotide of exon 8 (533T→G). The absence of a splicing effect of other mutations in the first or last position of an exon, such as the last nucleotide of exon 2 (del134G), the first nucleotide of exon 6 (403G→C), and the first (del610C) and second (611A→G) nucleotide of exon 9, has been discussed before (21).
The types of base substitutions are summarized in Table VI. Transitions predominated over transversions among non-smokers while the reverse was true among smokers (current, ex or others). Although the difference between smokers and non-smokers was not significant (P = 0.2 for subtype distribution and P = 0.1 for all transitions versus all transversions), this finding is consistent with our previous observations of a higher frequency of GC→TA and AT→CG transversions, and lower frequency of GC→AT transitions, among smokers than non-smokers (7,9).
In total, there are 382 mutations at 126 different positions in the four spectra together, but only 11 positions are mutated in all spectra, and only five site-specific mutations occur in all four spectra (Figures 2 and 3). The site-specific mutations are 3G→A, 143G→A, 197G→A, 508C→T and 617G→A. The 11 common positions that are mutated in all four spectra are 119G→T or A, 208G→A or T, 464C→G or T, 539G→A or T, 568G→A or T or C and 611A→T or G, plus the five mentioned above.
In the compiled spectrum of 382 single base pair substitutions, any position with 8 or more mutations is significantly different from a random distribution (see Materials and methods). There are eight such sites in the compiled spectrum, 3G, 74C, 143G, 146T, 197G, 508C, 611A and 617G (Figure 2). All but two of these hotspot positions are mutated in all three country spectra. The exceptions are 74C (with 9 mutations, but none in the Sweden 1 spectrum) and 146T (with 10 mutations, but none in the Russian spectrum). In the separate spectra, five mutations are needed for a position to qualify as a hotspot (see Materials and methods). In the Russian spectrum, positions 197G and 617G are significant hotspots with 8 and 5 mutations, respectively. In the USA spectrum, 197G, 617G and 508C are hotspots (7). Hotspots in the Sweden 1 spectrum are 146T and 197G, and 143G, 197G and 617G in the Sweden 2 spectrum. Thus, there is a considerable overlap between the four spectra with respect to mutational hotspots.
In contrast, there are several positions that seem to distinguish one spectrum from the other. In the Russian spectrum, there are four 118G→A transitions and four 611A→T transversions. The latter mutation was also detected in a study of mutants recovered from the Russian spectrum by Curry et al. (26). None of these mutations occur in any of the other spectra. In the USA spectrum, there are two transitions (C→T) and two transversions (C→G) at 551C, a position that is not mutated in any of the other spectra. In the Sweden 1 spectrum, there are five 146T→C transitions, a significant hotspot that does not occur in any of the other spectra. In the Sweden 2 spectrum, there are 11 transitions 143G→A, again a significant hotspot, which shows at most two mutations of the same kind at this position in the other spectra. Thus, these mutations could possibly be related to more regional environmental exposures. Analyses of spectra overall presented below appear to rule out age and smoking status as being responsible for the differences between the Sweden 1 and Sweden 2 spectra.
Between 27 and 44% of the mutations that occur in each spectrum are unique, in that they occur only in that spectrum and not in any of the others. The Russian spectrum contains more unique mutations and unique mutated positions than any of the other spectra (Table VII). The degree of similarity between two spectra was evaluated by counting overlapping mutations, i.e. mutations that are identical in a pairwise comparison of two spectra. As shown in Table VIII, there is more overlap of mutations between the USA spectrum and the three other spectra, and less overlap between the two Swedish spectra and the Russian spectrum.
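Counting unique and overlapping mutations reduces to set operations on the kinds of mutation seen in each spectrum. The sketch below illustrates the idea with small subsets of mutations named in the text; the sets are deliberately incomplete and are not the data of Tables VII and VIII. The function names and the ASCII `>` notation are illustrative.

```python
def overlap(spectrum_a, spectrum_b):
    """Mutations that are identical in a pairwise comparison of two spectra."""
    return set(spectrum_a) & set(spectrum_b)


def unique_to(name, spectra):
    """Mutations found in the named spectrum and in none of the others."""
    others = set().union(*(set(s) for n, s in spectra.items() if n != name))
    return set(spectra[name]) - others


# Illustrative subsets only, built from mutations mentioned in the text.
spectra = {
    "Russia": {"197G>A", "617G>A", "118G>A", "611A>T"},
    "USA": {"197G>A", "617G>A", "551C>T", "551C>G"},
    "Sweden 1": {"197G>A", "617G>A", "146T>C"},
}

print(sorted(overlap(spectra["Russia"], spectra["USA"])))
print(sorted(unique_to("Russia", spectra)))
```

Because the comparison is on the kind of mutation rather than on counts, a mutation seen once and one seen ten times contribute equally to the overlap.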
Discussion |
---|
The comparison of the mutational spectra between the three populations from Russia, USA and Sweden revealed considerable similarities, both with respect to the overall distribution of mutations, and the hotspot positions. The particular strength with this comparison, in contrast to earlier analyses of mutational spectra compiled in silico from many smaller datasets produced in various laboratories [e.g. (4)], is that all the data were obtained in two laboratories with similar criteria for selection of study population and analytical procedures and methods.
The extensive overlap between the three mutational spectra from different parts of the world strongly supports the view that a great part, perhaps a predominating part, of the mutations are caused by endogenous factors inherent to human physiology and metabolism, rather than by some more or less specific life style factors or environmental exposures. For example, positions 197 and 617 were hotspots in four and three of the study populations, respectively, whose spectra were compared in this study. At both positions, G to A transitions predominated 4-fold over G to T transversions. The existence of these hotspots and the specific mutations that occur at these sites suggest a consistency of frequency of damage formation, misrepair and/or misreplication across the populations, phenomena to which the shared (TG)TGTG sequence context may contribute. Hence, our results indicate that 197 and 617 are hotspots for multiple mutagenic events, and in multiple populations, as also observed by others (16,22,28). This result is consistent with the sequence (context) having features that invite more damage and/or poorer repair and/or replication errors. In this context, it is of interest that neither position 197 nor 617 is frequently substituted among inherited mutations in Lesch-Nyhan patients (29).
On the other hand, the presence of some frequently mutated positions or significant hotspots in one but not the other spectra, e.g. positions 118G, 143G, 146T, 551C and 611A, may reflect the influence of modulating factors or perhaps represent direct fingerprints of some specific environmental exposures to mutagenic agents. At the DNA level, such agents would be expected to induce a sequence-specific damage formation for which there is a potential for misrepair or misreplication. The large number of sites in the HPRT coding sequence that inactivate the HPRT protein (8,16,30) enhances the potential to detect differences in spectra.
In view of the diversity of these spectra, and the wide distribution of mutations along the HPRT coding sequence with as many as 127 mutated positions, and 37 of these showing more than one kind of base substitution, it is of interest that some pairwise comparisons did not reveal differences of spectra despite differences among study groups for age and smoking, and others revealed differences of spectra despite similarities in these variables. There was no statistical difference between either the USA spectrum and the Russian spectrum, or the USA spectrum and the two Swedish spectra. In contrast, the two Swedish spectra and the Russian spectrum were significantly different, even when applying the Bonferroni correction factor for multiple comparisons. An explanation for these apparent discrepancies, in part, is that the statistical method used for analysis of mutational spectra (18,20) compares only two spectra at a time and only at base positions where mutations have occurred in at least one of the two groups. For instance, the two Swedish spectra may not share so many base positions, but each of them could have a considerable number of positions that are mutated in the US spectrum as well. Seeking explanations for both similarities and differences in mutation spectra is necessary for understanding the strengths and limitations of assessing mutation spectra in addition to analysing covariates for MF. This quest will be complex, requiring new statistical tools for identifying the elements in mutation spectra that make them distinctive, continued research into the variables that affect the frequency and nature of mutations at specific sequences, and an ongoing iteration between epidemiological detection of key exposures and the detailed mechanisms of somatic mutagenesis.
Possible explanations for the differences between spectra could be age and/or smoking, factors which differed more between the Swedish and Russian populations, than between the Russian and USA populations (see Table I). However, neither of these factors seemed to explain the difference between the two Swedish populations. The spectrum of mutations in smokers within the Sweden 2 population was not different from that in non-smokers of similar age in the same population, and the spectrum of mutation in the older non-smoking subjects in the Sweden 2 population was not different from the spectrum of mutation in the young non-smoking subjects of the Sweden 1 population. Thus, age and smoking did not seem to influence the spectrum of base substitution mutations in these two Swedish populations, which is in agreement with previous observations in other studies that were not able to detect significant effects of smoking or age on the HPRT mutational spectrum in T-cells (4,7). The apparent lack of influence of age and smoking on the mutational spectrum could be attributable to a similarity in the net mutagenic effect of two different but highly complex exposures associated with ageing and smoking. A number of agents can lead to the same mutation, as discussed below.
The difference between the Swedish and Russian mutational spectra is not primarily associated with the prominent and significant hotspot positions, but with the occurrence of new and unique mutations at sites with low MF. As mentioned above, in the Russian spectrum with 94 mutations, there were many new mutations, and many that were not detected in either the USA or the Swedish spectra. Moreover, Curry et al. (26) observed many new HPRT mutations in T-cells from Russian twins. These authors (26) also found the spectrum of mutations in Russian twins to be significantly different from an age-matched western mutant dataset that included data from Burkhart-Schultz et al. (7) and Podlutsky et al. (8) used in the present work. However, the spectra studied by Curry et al. (4,26) included not only base substitutions, but also other categories of mutation, which makes the result difficult to interpret. When the 55 single base substitution mutations in Russian twins reported by Curry et al. (26) were compared with the present Russian set of 94 mutations, the two spectra turned out to be significantly different with a P-value of <0.0001 (Table IX). Moreover, the Russian twin spectrum of Curry et al. (26) was also significantly different from the USA spectrum (P < 0.0001, n = 149) and the Sweden 1 and Sweden 2 spectra (P = 0.002, n = 142 and P < 0.0001, n = 162, respectively). The 6 sets of Russian twins studied by Curry et al. (26) may have had distinct exposures that contribute to these differences.
In conclusion, these comparisons of mutational spectra of single base substitutions in the HPRT gene in T-cells of populations from Russia, USA and Sweden have demonstrated an overall similarity in the type, frequency and distribution of mutations. The results suggest that most mutations are induced by mechanisms that are inherent to human metabolism and little influenced by differences in life style or environmental exposures. In this respect the HPRT mutation spectra results are consistent with the analysis of mutations in genes associated with the development of cancer (31,32). However, similarities of mutation spectra do not rule out differences in aetiology. The same mutation may be induced by multiple distinct events, some associated with endogenous agents, others with exogenous agents. For example, G→T transversions may be associated with 8-hydroxyguanine, an oxidatively damaged base arising from normal metabolism or from oxidative agents in cigarette smoke, or with exogenous exposures to a variety of agents that produce bulky adducts such as polycyclic aromatic hydrocarbons in cigarette smoke, or aflatoxin B1 [reviewed in (33)]. Such an overlap in spectrum may explain the increase in HPRT MF associated with smoking despite little difference in mutation spectrum.
The differences in mutation spectra, which do exist in spite of the overall similarities, are intriguing (Figure 3). A minor fraction of the mutations are certainly caused by factors that are different between Russia and Sweden, as well as between populations living in the same region within Sweden. Smoking does not seem to be involved in causing this difference, since the Russian spectrum did not differ from the spectrum in the USA population, with similar smoking habits and tobacco products as in Sweden. Age, another well-documented factor, which is associated with an increase in the frequency of HPRT mutations in T-cells, is also not likely to be responsible for the difference, because there was no statistical difference between the mutational spectra of the non-smokers in the two Swedish populations with >25 years difference in mean age. Thus, other factors causing mutational differences between populations need to be explored.
Acknowledgments |
---|
References |
---|