Molecular modeling of the major adduct of (+)-anti-B[a]PDE (N2-dG) in the eight conformations and the five DNA sequences most relevant to base substitution mutagenesis

Richard E. Kozack and Edward L. Loechler1

Department of Biology, Boston University, 2 Cummington Street, Boston, MA 02215, USA


    Abstract
 Top
 Abstract
 Introduction
 Materials and methods
 Results
 Discussion
 References
 
The potent mutagen/carcinogen 7R,8S-dihydroxy-9S, 10R-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene [(+)-anti-B[a]PDE], which is the activated form of benzo[a]pyrene (B[a]P), is able to induce different kinds of mutations (G->T, G->A, etc.). One hypothesis for this is that different mutations are induced depending upon the conformation of its major adduct ([+ta]-B[a]P–N2-dG) when bypassed during DNA replication. Based on molecular modeling, there appear to be at least 16 potential conformations that the major adduct [+ta]-B[a]P–N2-dG can adopt in dsDNA. Regarding base substitution mutagenesis, eight conformations are most likely to be relevant. In two conformations the dG moiety of the adduct is base paired with its complementary dC and the B[a]P moiety is in the minor groove. In two others the dG moiety of the adduct is in the Hoogsteen orientation and the B[a]P moiety is in the major groove. There are four base displaced structures in which the B[a]P moiety of the adduct is stacked with the surrounding base pairs, two with dG in the major groove and two with dG in the minor groove. Using a simulated annealing protocol, these eight conformations were evaluated in five different DNA sequence contexts (5'-TGC-3', 5'-CGT-3', 5'-AGA-3', 5'-CGG-3' and 5'-GGG-3'); the latter were chosen because they may be particularly revealing about mutagenic mechanism based on studies with [+ta]-B[a]P–N2-dG and (+)-anti-B[a]PDE. For each conformation and each sequence context, 25 simulated annealing runs were conducted by systematically varying several parameters (such as the initial annealing temperature) based on a protocol established recently. The goal of this work was to exclude conformations that are clearly inferior. Three conformations are virtually always high in energy, including the two Hoogsteen oriented species and one of the base displaced species with dG in the major groove. Remarkably, the remaining five conformations are often quite close in energy and are deemed most likely to be relevant to mutagenesis (see accompanying paper).

Abbreviations: (+)-anti-B[a]PDE, 7R,8S-dihydroxy-9S,10R-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene; B[a]P, benzo[a]pyrene; [+ta]-B[a]P–N2-dG, the major adduct of (+)-anti-B[a]PDE formed by trans addition of N2-dG to (+)-anti-B[a]PDE; ds, double-stranded; {kappa}1, constraint force applied to the hydrogen bonds in a base pair during initial conjugate gradient minimization; {kappa}2, constraint force applied to the hydrogen bonds in a base pair during simulated annealing; PAH, polycyclic aromatic hydrocarbon; ss, single-stranded; t, annealing time; T0, initial annealing temperature; {tau}, molecular dynamics time step.


    Introduction
 Top
 Abstract
 Introduction
 Materials and methods
 Results
 Discussion
 References
 
We have been studying mutagenesis by 7R,8S-dihydroxy-9S,10R-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene [(+)-anti-B[a]PDE], which is the activated form of benzo[a]pyrene (B[a]P), as well as by its major adduct [+ta]-B[a]P–N2-dG (1–11; Figure 1Go). B[a]P is a polycyclic aromatic hydrocarbon (PAH), a class of substances which are produced by incomplete combustion, and is a potent mutagen and carcinogen (1220).



View larger version (38K):
[in this window]
[in a new window]
 
Fig. 1. A pictorial representation of the 16 classes of conformations that [+ta]-B[a]P–N2-dG (upper center) appears capable of adopting in dsDNA. The key is shown toward the upper left. The B[a]P moiety of [+ta]-B[a]P–N2-dG is depicted as a half open and half solid rectangle with the solid portion representing the a-face, which is indicated on the [+ta]-B[a]P–N2-dG structure toward the upper center of this Figure. The dG moiety of the adduct is a rectangle with a G towards its center. The base pair on the 3'-side of the adduct is always located toward the top of the adduct and the base pair on the 5'-side of the adduct is always located toward the bottom of the adduct. The major groove is always oriented to the left and the minor groove to the right of the adduct. The abbreviations that represent each of the conformations is based on the designation of the moiety in the groove. For example, BPmi5 indicates that the B[a]P moiety of the adduct is in the minor groove and pointing toward the base on the 5'-side of the adduct. Gmi5 indicates that the dG moiety is in the minor groove, however, in this case, the 5 indicates that the a-face of the B[a]P moiety of the adduct is pointing toward the base on the 5'-side the adduct bond. BPmi5 and BPmi3 have Watson–Crick base pairs, while BPma5 and BPma3 have the Hoogsteen orientation. The four base displaced conformations have the B[a]P moiety stacked with surrounding base pairs and the dG moiety either in the major (Gma5 and Gma3) or minor (Gmi5 and Gmi3) groove. Note that the a-face of the B[a]P moiety (solid surface) differs in (for example) Gma5 versus Gma3 by being selectively stacked with the base on the 5'- versus 3'-side, respectively. The base paired and base displaced conformations retain the same number of steps in the DNA helix as unadducted DNA. Four double displaced conformations are also shown (G/BPma5, G/BPma3, G/BPmi5 and G/BPmi3) and in each case they have both the B[a]P and the dG moiety of the adduct in a groove. Such structures might be expected to be precursors of a –1 frameshift mutation when at a dsDNA–ssDNA junction. Four intercalated conformations are also indicated (Ima5, Ima3, Imi5 and Imi3) and in this case the 5 and 3 indicate the direction of the a-face and not on which side the B[a]P moiety is intercalated. Such structures might be expected to be precursors of a +1 frameshift mutation when at a dsDNA–ssDNA junction.

 
In Escherichia coli, we have shown that base substitutions, frameshifts, insertions and deletions were all induced by (+)-anti-B[a]PDE. For mutations at G:C base pairs alone, G:C->T:A (57%), G:C->A:T (23%) and G:C->C:G (20%) mutations were each significant (3,4). The nature of the mutation seemed to be influenced by the sequence context of the adduct. Adduct site-specific mutational studies with the major adduct, [+ta]-B[a]P–N2-dG, reinforced this conclusion, in that G->T mutations dominated in a 5'-TGC-3' sequence context (2) and G->A mutations dominated in both a 5'-CGT-3' (9,10) and 5'-AGA-3' sequence context (21), while a mixture of G->T, A and C mutations were all prevalent in a 5'-CGG-3' sequence context (6).

These findings lead to the question: how can a single adduct induce different kinds of mutations and why does the pattern change in different sequence contexts? Our working hypothesis for this is that an adduct can adopt multiple conformations, each conformation can cause a different kind of mutation and adduct conformation can be controlled by factors such as DNA sequence context (1,4,10). The notion that an adduct can adopt multiple conformations has some direct experimental support. By NMR it was determined that the pyrene moiety of a [+ta]-B[a]P–N2-dG adduct in a 5'-TGC-3' sequence context is in the minor groove and pointing toward the base on the 5'-side, however, there is a second, minor conformation as well, although it could not be delineated (22). This conformation of [+ta]-B[a]P–N2-dG was also observed in a fully duplex structure in a 5'-CGC-3' sequence context; however, when the base complementary to the adducted dG was removed, the conformation became base displaced (discussed in ref. 23). Fluorescence studies also suggest that this adduct can adopt multiple conformations in duplex DNA, although the conformations are not determinable by this technique (2426). Furthermore, as (for example) the stereochemistry of the B[a]P–N2-dG adduct is varied, a remarkable variety of different structures become dominant as determined by NMR, and in a number of cases minor species are also observed (21,27–30, reviewed in ref. 31).

The study of adduct conformational complexity alone will be inherently difficult and developing an understanding of the relationship between conformations and mutations will be even more complex, both because multiple mutations are possible and because a minor conformation might be responsible for mutagenesis. One means of approaching this issue is by molecular modeling/computational chemistry. Molecular modeling has proved to be valuable for the study of carcinogen adducts. For example, the lowest energy conformations for both [+ta]- and [–ta]-B[a]P–N2-dG, as established by NMR (27,28), were actually predicted using computational means by Singh et al. (31), who have also contributed signficantly to the development of the final NMR structures for B[a]P–N2-dG adducts (23,2730). Computational studies of B[a]P–N2-dG adducts have been undertaken by a variety of other groups and a variety of models have emerged (summarized in ref. 31).

Recently, we developed a molecular dynamics-based, simulated annealing protocol (11) and used it to study [+ta]B[a]P–N2-dG in a 5'-CGC-3' sequence context, for which an NMR structure was known (23,27). Simulated annealing involves adding energy to a system and then gradually removing it in a process that can be thought of as imitating heating the system at some initial high temperature and then cooling it down to absolute zero (32,33). This method, in contrast to simple minimization, does not require a constant decrease in potential energy during the simulation and can allow the molecule to escape from local minima in search of even deeper minima via the input of kinetic energy in the high temperature dynamics. However, the procedure can be gentle enough with a suitable protocol to allow the adduct to explore a conformational region without escaping the input conformational class and without becoming grossly distorted. A four step procedure was followed: the adduct was docked, after which the structure was subjected to an initial conjugate gradient minimization, followed by simulated annealing and a final conjugate gradient minimization. To explore a conformational class more extensively, six parameters were varied in different runs, including the length of the DNA helix, the initial annealing temperature (T0), the annealing time (t), the molecular dynamics time step ({tau}) and two base pairing constraint parameters (11). While there is no single set of optimum parameters, reasonable, low energy structures were obtained with the values t {approx} 40 ps (or longer), T0 {approx} 750 K, {tau} {approx} 1.0 fs and with a helix length of 7 bp and herein the parameters were varied to bracket these values. One inherent weakness that we wish to note is that the force field employed (CHARMm) was optimized for B-DNA-like structures (34,35), which may result in a bias against conformations that are relatively non-B-DNA-like. It is difficult to place a quantitative value on this issue, but it is important to acknowledge and remember the concern.

Using simple computer graphics techniques it appears to us that there are 16 classes of conformations that [+ta]B[a]P–N2-dG could adopt in double-stranded (ds)DNA when considering the same number of bases in each strand and with a dC moiety in the strand opposite the adduct. These 16 are represented pictorially in Figure 1Go. Eight of these conformations seemed most likely to be relevant to base substitution mutagenesis, since they retain the same number of steps in the DNA helix, while the other eight conformations seemed more likely to be associated with frameshift mutagenesis. In the case of each of the conformations, Figure 1Go shows the [+ta]-B[a]P–N2-dG adduct toward the middle with the base pair on the 3'-side of the adduct toward the top and with the base pair on the 5'-side of the adduct toward the bottom. In addition, the major groove is to the left and the minor groove to the right. The nomenclature for the conformations in Figure 1Go is explained in Results.

Twenty-five unique sets of simulated annealing parameters were used to evaluate the eight conformations in the five different DNA sequence contexts that appear likely to be most relevant to [+ta]-B[a]P–N2-dG base substitution mutagenesis. The relative energies of each of these 1000 conformations is reported. The five sequence contexts were chosen based on our mutagenesis work. In four, [+ta]-B[a]P–N2-dG has been studied site-specifically. 5'-TGC-3' (2), 5'-CGT-3' (9,10) and 5'-AGA-3' (21) were studied for their mutational selectivity and gave principally G->T, G->A and G->A mutations, respectively. 5'-CGG-3' is the major base substitution hotspot and gave a more even mixture of G->T, A and C mutations (6). Although we have not studied [+ta]-B[a]P–N2-dG adduct site-specifically in a 5'-GGG-3' sequence context, it is a hotspot for (+)-trans-B[a]PDE mutagenesis and G->T, A and C mutations are observed (3,4).

While it would be tempting to try to speculate on which conformation is lowest in energy based on our calculations, we believe that modeling is probably not sufficiently accurate for that purpose. On the other hand, we felt that it would be possible to eliminate certain conformations if they appear to be generally inferior based on a variety of criteria, and three conformations (Gma5, Gma3 and BPma3) appear to fit this category. Interestingly, the remaining five conformations (i.e. BPmi5, BPmi3, Gma5, Gmi5 and Gmi3) are calculated to be relatively similar in energy (often <~5 kcal/mol difference), suggesting that each may be present at a sufficient level to contribute to the mutagenic mechanism. It is important to emphasize that only BPmi5 has been observed by physical means to date with [+ta]-B[a]P–N2-dG (22,23,27); however, this does not mean that the other conformations are not present as minor species and, of course, it is possible that a minor species could be important to mutagenesis.

One additional concern is the question of the relevance of our study of [+ta]-B[a]P–N2-dG opposite dC in a duplex oligonucleotide to mutagenesis, which actually involves a transition state during incorporation of a base other than dC at a single-stranded (ss)DNA–dsDNA junction in the presence of a DNA polymerase. This issue is discussed in an accompanying paper (36).


    Materials and methods
 Top
 Abstract
 Introduction
 Materials and methods
 Results
 Discussion
 References
 
The four step modeling procedure used in this paper was identical to that described previously (11) and as outlined below.

Eight conformations were studied herein and a four step procedure was used. The starting structure (Step 1) for the BPmi5 conformation was developed as described previously (11). BPmi3 was developed from BPmi5 via rotation of the B[a]P moiety in the minor groove about the adduct bond, using what was known about this conformation based on an NMR study with [–ta]-B[a]P–N2-dG (27). The two Hoogsteen oriented conformations (BPma5 and BPma3) were developed following rotation of the dG moiety of [+ta]-B[a]P–N2-dG about the glycosylic bond. The B[a]P moiety, which is in the major groove, was positioned by rotating about the adduct bond such that van der Waals contacts with other bases in the major groove were minimized. The four base displaced conformations (Gma5, Gma3, Gmi5 and Gmi3) were all built by trial and error from BPmi5 by manipulating the B[a]P and dG moieties of the adduct into the requisite orientations. In the case of Gma5 and Gmi3, this was guided by what was known about these conformations based on NMR studies with [+ca]-B[a]P–N2-dG (29) and [–ca]-B[a]P–N2-dG (30), respectively. In all cases the initial conformations were developed to maintain proper bond lengths, but with less regard for proper bond and torsion angles, which were rectified during the initial conjugate gradient minimization (Step 2).

Each of these eight conformations of [+ta]-B[a]P–N2-dG was studied in duplex heptamers, which is a size that we had previously concluded was optimal for these molecular modeling studies (11), although we note that, ultimately, we report the energies for only the innermost 5 bp for reasons discussed (11). The five DNA sequence contexts most relevant to base substitution mutagenesis were used and in each case the sequence context is exactly identical to that of our adduct site-specific and/or random mutagenesis work. [+ta]-B[a]P–N2-dG induced principally G->T mutations (>95%) in a 5'-TGC-3' (5'-CCTGCAG-3) sequence context (2). [+ta]-B[a]P–N2-dG induced principally G->A mutations (>80%) in a 5'-CGT-3' (5'-GCCGTCA-3') sequence context (9,10) and a 5'-AGA-3' (5'-TTAGAGA-3') sequence context (21). [+ta]-B[a]P–N2-dG induced a more even mixture of G->T, A and C mutations in a 5'-CGG-3' (5'-CGCGGCC-3') sequence context (6) and a 5'-GGG-3' (5'-AAGGGAG-3') sequence context (3,4). The latter result is less definitive since it is actually based on our random mutagenesis studies with (+)-anti-B[a]PDE.

For each conformation in each sequence context, 25 simulated annealing runs were performed (Step 3). Each run had a different set of parameters, which varied from a set of canonical parameters as reported in Table IGo. The canonical parameters were T0 = 750 K, t = 40 ps, {tau} = 1.0 fs, {kappa}1 = 300 kcal/mol/Å and {kappa}2 = 300 kcal/mol/Å, which were determined to be most optimal in our previous study (11). T0 is the initial annealing temperature, after which the temperature is continually and uniformly reduced to absolute zero. The annealing time (t) is the total time for cooling. The molecular dynamics time step is {tau}; in standard molecular dynamics, {tau} is chosen to be as large as possible while still enabling accurate numerical integration of the force equations; however, our main goal was to find minimum energy structures, so we decided to increase {tau} at the expense of strict numerical accuracy in order to obtain longer annealing times. Since there were some van der Waals contacts to be resolved during the initial stages of structure refinement, constraints on the atoms participating in Watson–Crick hydrogen bonding were applied in some cases. A harmonic potential was employed with force constants {kappa}1 (during the initial conjugate gradient minimization) and {kappa}2 (during simulated annealing) employing equilibrium distances from canonical B-DNA (1.80 Å for O···H bonds and 1.87 Å for N···H bonds in G:C base pairs and 1.81 Å for O···H bonds and 1.86 Å for N···H bonds in A:T base pairs (37). Table IGo lists the 24 parameter sets that were used for each of the conformations. The twenty-fifth run was a `wild card', which was included because we thought we might be able to evaluate the trends in the energies from the other 24 runs and predict a set of parameters that might give a relatively low energy result. In each case the wild card run used many of the same parameters that gave the lowest energy conformation and the parameters were varied toward those for the next lowest energy conformations. In no case did the wild card run succeed in generating a new lowest energy conformation. A conjugate gradient minimization without constraints was performed to give the final conformation (Step 4).


View this table:
[in this window]
[in a new window]
 
Table I. Simulated annealing parametersa used to study [+ta]-B[a]P–N2-dG in eight conformations and in five different DNA sequence contexts
 

    Results
 Top
 Abstract
 Introduction
 Materials and methods
 Results
 Discussion
 References
 
Convention for naming the conformations
Based on simple principles, it appears that [+ta]-B[a]P– N2-dG can adopt at least 16 classes of conformations in dsDNA, which are represented pictorially in Figure 1Go, which has a key in the upper left corner. The dG moiety is represented as a rectangle with a G, while the B[a]P moiety is a rectangle, half open and half solid. The solid portion represents the `a-face', which is the face that is normally viewed when [+ta]-B[a]P–N2-dG is drawn (Figure 1Go). The base pairs on the 3'- and 5'-sides of the adduct are always located toward the top and the bottom of the adduct, respectively. The major groove is always oriented to the left and the minor groove to the right of the adduct.

The eight conformations most likely to be potentially relevant to base substitution mutagenesis, since they preserve the number of steps in the adduct-containing strand that would be bypassed by a DNA polymerase when copying the template strand, are in the two left-most columns. The other eight conformations have either one fewer or one more step in the template strand and are more likely to be relevant to frameshift mutagenesis. We hasten to add that we cannot be certain of this assessment, but it simply represents our best guess currently and serves as a starting point for an analysis of the conformations potentially relevant to base substitution mutagenesis.

To make it easier to refer to these conformations, we have developed a simple set of abbreviations. Four of these structures have the dG moiety of the adduct stacked with the surrounding base pairs and the B[a]P moiety is in a groove (Figure 1Go, first column). The other four have the B[a]P moiety of the adduct stacked with the surrounding base pairs and the dG moiety is in a groove; the latter four are called base-displaced structures (Figure 1Go, second column). The nomenclature is based on the designation of the moiety in the groove. For example, BPmi5 indicates that the B[a]P moiety of the adduct is in the minor groove and pointing toward the base on the 5'-side of the adduct. This is the dominant conformation in several sequence contexts as determined in NMR studies (22,23,27). Gmi5 indicates that the dG moiety is in the minor groove; however, in this case, the `5' designates that the a-face of the B[a]P moiety of the adduct (Figure 1Go) is pointing toward the base on the 5'-side. In the case of the [+ta]-B[a]P–N2-dG adduct, this is equivalent to saying that the adduct bond (directionality C10 of B[a]P->N2-dG) is pointing toward the base on the 5'-side. The directionality of the a-face and the adduct bond do not always coincide. We wish to note two additional issues regarding these abbreviations. The a-face and the adduct bond have opposite orientations in the case of other B[a]P–N2-dG stereoisomers (e.g. [–ta]-B[a]P–N2-dG). In these cases, we believe that the a-face orientation is more important than the adduct bond orientation, so our naming convention is dictated by the orientation of the a-face and not the adduct bond. This tendency is also followed with respect to the intercalated structures in Figure 1Go. For example, Imi5 is named based on the orientation of its a-face, which is toward the base on the 5'-side, and not based on the fact that it is actually stacked between the B[a]P moiety and the base on the 3'-side.

The eight conformations most relevant to base substitution mutagenesis
Two of the eight conformations (BPmi5 and BPmi3) have the dG of the adduct in a Watson–Crick base pair with dC and have the B[a]P moiety in the minor groove pointing toward the base on the 5'- or 3'-side, respectively. Two other conformations (BPma5 and BPma3) have an anti->syn rotation about the glycosylic bond of the dG moiety of the adduct giving Hoogsteen oriented pairing, which moves the B[a]P moiety into the major groove where it can point toward the base on either the 5'- or 3'-side, respectively. There are no other ways to accommodate the B[a]P moiety in a groove without significantly distorting the DNA. Two conformations (Gma5 and Gma3) are base displaced and have the B[a]P moiety stacked in the helix with the dG moiety of the adduct in the major groove. There are only two ways that the planar B[a]P moiety can stack: with the a-face pointing toward the base on the 5'- or the 3'-side, respectively. Finally, two conformations (Gma5 and Gma3) are base displaced with the dG moiety of the adduct in the minor groove and with the a-face of the B[a]P moiety base stacked and either pointing toward the base on the 5'- or the 3'-side, respectively.

Simulated annealing protocol
The four step computational method (see Materials and methods) utilized in the work described herein was identical to that used previously (11): conformation construction (Step 1), minimization (Step 2), simulated annealing (Step 3) and final conjugate gradient minimization (without constraints). Twenty-five runs were conducted on each of these 40 conformations in Step 3. Each annealing differed via a systematic variation of annealing parameters based on our previous work (11). The parameters included: initial temperature (T0), length of the cooling period (t), length of the molecular dynamics time step ({tau}) and constraints between the two strands of the DNA molecule (Table IGo). In each case, one of the 25 simulated annealings was a wild card run, in which we evaluated the trends for that particular conformation and picked a set of parameters that we thought might give a low energy conformation (see Materials and methods). Following simulated annealing, a conjugate gradient minimization was performed in which constraints were released (Step 4). A representative example of each of the eight classes of conformations can be found in Figures 2 and 3GoGo of Kozack et al. (36).



View larger version (12K):
[in this window]
[in a new window]
 
Fig. 2. Relative energies for the eight conformations most relevant to base substitution mutagenesis (see text) in the 5'-TGC-3' DNA sequence context. Each curve represents one of the conformations as shown pictorially in Figure 1Go. For a given conformation, the plot shows the energy (ordinate) following final minimization of each of the 25 structures when simply ranked from lowest to highest energy (numbered 1–25 on the abscissa) and equally spaced apart. (Top) BPmi5 (solid line) and BPmi3 (dashed line). (Upper middle) Gma5 (solid line) and Gma3 (dashed line). (Lower middle) BPma3 (solid line) and BPma5 (dashed line). (Bottom) Gmi3 (solid line) and Gmi5 (dashed line).

 


View larger version (12K):
[in this window]
[in a new window]
 
Fig. 3. Relative energies for the eight conformations in the 5'-CGT-3' DNA sequence context. The key is as in Figure 2Go.

 
Data presentation
To evaluate more easily the energies of the 1000 structures that were generated (i.e. 25 annealing runs for eight conformations in five DNA sequence contexts), the data are plotted as shown in Figures 2–6GoGoGoGoGo. For a given conformation and sequence context, the plots show the energy (ordinate) of the 25 structures when simply ranked from lowest to highest energy (numbered 1–25 on the abscissa) and equally spaced apart. The data for the eight conformations for each sequence context are divided into four panels, each showing two curves, as follows: 5'-TGC-3' (Figure 2Go), 5'-CGT-3' (Figure 3Go), 5'-AGA-3' (Figure 4Go), 5'-CGG-3' (Figure 5Go) and 5'-GGG-3' (Figure 6Go).



View larger version (12K):
[in this window]
[in a new window]
 
Fig. 4. Relative energies for the eight conformations in the 5'-AGA-3' DNA sequence context. The key is as in Figure 2Go.

 


View larger version (12K):
[in this window]
[in a new window]
 
Fig. 5. Relative energies for the eight conformations in the 5'-CGG-3' DNA sequence context. The key is as in Figure 2Go.

 


View larger version (12K):
[in this window]
[in a new window]
 
Fig. 6. Relative energies for the eight conformations in the 5'-GGG-3' DNA sequence context. The key is as in Figure 2Go.

 

    Discussion
 Top
 Abstract
 Introduction
 Materials and methods
 Results
 Discussion
 References
 
We have previously shown that a molecular dynamics-based, simulated annealing protocol is a reasonably simple method to obtain energy minima that are much lower than those obtained through simple minimization protocols, such as conjugate gradient (11). In addition, by systematically changing a variety of parameters in a set of simulated annealing runs for a single starting structure, we are able to generate a significant number of different but related structures. The molecular dynamics version of the simulated annealing procedure provides enough input kinetic energy into a structure to permit it to escape local potential energy barriers without escaping from the original structural class and without becoming grossly distorted. In fact, in the 1000 simulated annealing runs described herein, a structure changed its conformational class in only 13 instances [these are discussed in the accompanying paper (36)]. Nevertheless, there was significant flexibility within any particular conformational class based on the variety of conformations that emerged (data not shown) and the large range of energies for any particular curve (Figures 2–6GoGoGoGoGo).

We plotted our data as shown in Figures 2–6GoGoGoGoGo for several reasons. First, it provides a means of looking at the entire data set for any particular conformation and DNA sequence context. Second, we feel that the slopes of the lines may provide useful information. Each point in the plots in Figures 2–6GoGoGoGoGo represents a conformation in its own local energy minimum. It seems reasonable to imagine that if the energy minimum can only be accessed through a relatively narrow region of conformational space (i.e. it is in a steep energy well), then it will be relatively harder to find this minimum and fewer examples will be found. It also seems reasonable to imagine that a relatively steeper line in Figures 2–6GoGoGoGoGo, especially at lower energies, may indicate a relatively narrow energy well, since this conformational region of low energy was only accessed in a relatively small number of simulated annealing runs. A concrete example of this point is discussed in the section on the 5'-TGC-3' DNA sequence context.

Evaluation of the energies of the five sequence contexts
5'-TGC-3'.
The four lowest energy structures in the 5'-TGC-3' sequence context were Gmi5, BPmi5, BPmi3 and Gma5. In fact, the data for each of these were virtually superimposable throughout most of the curves shown in Figure 2Go, and Gmi5 and BPmi5 only emerge as having the lowest energy conformations because in each case a single conformation was much lower in energy than might be expected based upon a simple projection of the rest of its curve (Figure 2Go). Accordingly, there are three reasons why it is impossible to assign lowest energy status to Gmi5 and/or BPmi5 and in fact any of these four structures might potentially be lowest in energy. First, there is uncertainty about the accuracy of the calculated energies, which is very similar (within ~4 kcal/mol) in the case of three of these conformations (Gmi5, BPmi5 and BPmi3). Second, there is only one structure each for Gmi5 and BPmi5 that make them appreciably lower in energy than BPmi3 or Gma5, which raises the possibility that if a few more annealing runs were conducted with BPmi3 and Gma5, then something similar would eventually happen for these also. Third, the energy values in Figure 1Go do not take entropy into consideration. The fact that there is only one example of a relatively low energy species for Gmi5 and BPmi5 may indicate a relatively steep energy well and there may be a compensating entropy penalty paid for entering this energy well, which would tend to dampen the preference when free energy is considered.

In conclusion, it appears that Gmi5, BPmi5, BPmi3 and Gma5 are all candidates to be relatively low energy conformations. Since the goal of this work is to exclude conformations that are relatively less reasonable, Gma3, Gmi3, BPma5 and BPma3 are less likely to be important. BPmi5 was shown to be the lowest energy conformation by NMR in a similar 5'-TGC-3' DNA sequence and there was a minor conformation whose structure could not be determined (22).

5'-CGT-3'.
The lowest energy structure for the 5'-CGT-3' sequence context is BPmi3, although the data for Gmi3 are very similar (Figure 3Go). In addition, the curves for BPmi3 and Gmi3 were virtually superimposable on the curve for BPmi5 except at the very lowest energy conformation, leading to the same concerns as discussed for 5'-TGC-3'. The remaining five conformations, Gmi5, Gma5, Gma3, BPma5 and BPma3, are all much higher in energy and less likely to be relevant to mutagenesis.

5'-AGA-3'.
The lowest energy structure for the 5'-AGA-3' sequence context is Gmi5 by a considerable amount, while BPmi5, Gmi3 and BPmi3 are next lowest and within ~4 kcal/mol of each other. Gma5, Gma3, BPma5 and BPma3 are all significantly higher in energy.

5'-CGG-3'.
Although the lowest energy structure in the 5'-CGG-3' sequence context is Gma5, it and BPmi5, Gmi3 and Gmi5 are all within ~5 kcal/mol. Importantly, both Gma5 and Gmi5 have steep slopes near the low energy end of their curves, which means that the kinds of caveats mentioned in the discussion of 5'-TGC-3' above need to be considered. However, it is clear that BPmi3, BPma5, BPma3 and Gma3 are all relatively higher in energy and less likely to be significant. Based on preliminary NMR results, an oligonucleotide containing [+ta]-B[a]P–N2-dG in the identical 5'-CGG-3' DNA sequence context has BPmi5 as its lowest energy conformation (N.E.Geacintov and D.J.Patel, personal communication).

5'-GGG-3'.
The results for 5'-GGG-3' are the most ambiguous in that all of the conformations are relatively similar in energy. All of the conformations are within ~8 kcal/mol of each other, with the exception of Gma5, which is universally high in energy. In addition, many of the lowest energy values for a particular conformation represent points that deviate downward from the general slope of their curves, most notably with the lowest energy conformation for Gmi5. This raises the issue discussed in the 5'-TGC-3' sequence context.

General thoughts on the conformations
In evaluating the conformations, it was possible to identify factors that contributed to some of the conformations being relatively inferior. In contrast, it was difficult to identify principles that distinguish the relatively more subtle differences between the better conformations. This conclusion also contributed to our thinking that the goal of this work should not be on trying to predict the lowest energy conformation, but rather to assess which conformations are relatively high in energy and, thus, can probably be excluded from our thinking about mutagenesis. In this regard, we have decided that conformations that differ by <~5 kcal/mol are potentially indistinguishable in energy. On the other hand, a conformation that is >~10 kcal/mol higher in energy is probably unlikely to be significant, even given the uncertainties of the calculations. These cut-offs are admittedly subjective, although they seem conservative.

The following example illustrates why it is difficult to identify simple, over-arching principles that favor the relatively low energy conformations. We had predicted that a simple trend should emerge when comparing BPmi5 versus BPmi3, which both have the B[a]P moiety of the adduct in the minor groove. In the minor groove, an A:T base pair is less bulky than a G:C base pair, which has an extra amino group, so we predicted that the bulky B[a]P moiety would prefer to point in the direction of the less bulky A:T base pair. In fact, this expectation is met for 5'-TGC-3' (Figure 2Go), where BPmi5 is lower in energy, and for 5'-CGT-3' (Figure 3Go), where BPmi3 is lower in energy. However, one might expect BPmi5 and BPmi3 to be more similar in energy in sequences with either G:C or A:T base pairs on both sides. This is not the case: BPmi5 is lower in energy for the sequence contexts 5'-AGA-3', 5'-CGG-3' and 5'-GGG-3'. This probably reflects the fact that [+ta]-B[a]P–N2-dG has some inherent preference for the BPmi5 conformation, which has certainly been observed experimentally, in that BPmi5 was observed to be the lowest energy conformation by NMR in the sequence contexts 5'-CGC-3', 5'-TGC-3' and 5'-CGG-3' (21,22,26; N.E. Geacintov and D.J.Patel, personal communication).

Gma3, BPma5 and BPma3 were relatively high in energy in virtually all of the DNA sequence contexts. The reasons for these trends are more explicable.

A comparison of Gma3 with Gma5 (Figure 7Go) illustrates why the former is often high in energy. (Gma3 is somewhat competitive in energy only in a 5'-GGG-3' sequence context; Figure 6Go) The sugar–phosphate backbone for Gma3 is kinked. In contrast, the sugar–phosphate backbone for Gma5 is relatively B-DNA-like in spite of displacement of the dG moiety into the major groove. In Gma3 the dG moiety of the adduct impinges on the base on the 3'-side of the adduct, while the hydroxyl groups on the saturated ring of the B[a]P moiety impinge on the base on the 5'-side of the adduct. In contrast, the dG moiety and the hydroxyl groups in Gma5 project unfettered into the major groove. In summary, Gma3 is a very distorted structure [see Figure 2Go in the accompanying paper (36)].



View larger version (24K):
[in this window]
[in a new window]
 
Fig. 7. Stereo view of the lowest energy versions of the conformations Gma5 and Gma3 to illustrate why the former is lower in energy (see text). The view is from the major groove with the base pair on the 5'- and 3'-side of the adduct toward the bottom and top, respectively.

 
Hoogsteen oriented G:C pairs are less stable than their corresponding normal base pairs, because they do not form as many hydrogen bonds, they do not have the proper distance between the N9 position of the purine and the N1 position of the pyrimidine and they do not stack as well (37,38). Thus, perhaps, it is not surprising that the B[a]P-containing Hoogsteen oriented conformations, BPma5 and BPma3, are also relatively less stable.

Note that we studied Hoogsteen oriented conformations that did not have a trapped proton between N7-dG of the adduct and N3-dC. One cannot study the corresponding protonated species and compare it with the other conformations, since its internal energy is different because of the extra N···H bond and the positive charge. However, we can consider the energies of the non-protonated Hoogsteen oriented conformations and ask whether protonation is likely to lower the energy sufficiently to make them energetically competitive with the other low energy conformations. Generally speaking, pKa perturbations are unlikely to exceed 6 pKa units based on theoretical considerations (3942) and a pKa somewhat above 9.5 has been estimated for a protonated Hoogsteen G:C base pair in a triple helix (43). Taking pKa {approx} 10, a protonated Hoogsteen base pair would be ~103-fold more stable (~2.2 kcal/mol) than the deprotonated Hoogsteen pair at pH 7. The lowest energy, deprotonated Hoogsteen conformations were on average ~13–15 kcal/mol higher in energy than the average of the lower energy conformations based on the results in Figures 2–6GoGoGoGoGo. (There is one exception, Gma3 in the sequence context 5'-GGG-3'.) Because we estimate that the protonated Hoogsteen base pair is no more than ~2.2 kcal/mol more stable than the deprotonated Hoogsteen orientation at pH 7.0, the protonated Hoogsteen base pairs must still be much higher in energy than the lower energy conformations. In fact, one would have to argue that a protonated Hoogsteen base pair in a B[a]P adduct would have to have a pKa value of ~17 for it to be competitive energetically. While there is no information available in this regard, we know of no precedent for a conformational constraint raising an acid dissociation constant by a factor of >~1013, given a pKa of ~3.5 for N3C (42). We also note that no Hoogsteen orientations involving B[a]P–N2-dG adducts have been detected to date, at least when paired with dC (23).

Finally, we wish to note one important caveat to all of the work that we have been doing. The force field employed (CHARMm) was optimized using B-DNA structures (34,35), which may result in a bias against conformations that are relatively non-B-DNA-like. It is hard to know how to integrate this issue into the work that we have done, but it is important to remember this potential concern.

Summary
In principle, [+ta]-B[a]P–N2-dG can adopt at least 16 classes of conformations in DNA. Eight of these seem more likely to be relevant to frameshift mutagenesis. Of the remaining eight, three conformations (Gma3, BPma5 and BPma3) are calculated to be significantly higher in energy. Accordingly, we are focusing our attention on the remaining five conformations (BPmi5, BPmi3, Gma5, Gmi3 and Gmi5), because we think they are most likely to be relevant to base substitution mutagenesis. It should be noted that BPmi5 (27), BPmi3 (28), Gmi3 (29) and Gma5 (30) were the dominant conformations observed by NMR in the case of the stereoisomers [+ta]-, [–ta]-, [+ca]- and [–ca]-B[a]P–N2-dG, respectively, leaving only one of these five (Gmi5) as not having been observed experimentally, although it is important to state that, to date, only BPmi5 has been definitively identified for [+ta]-B[a]P–N2-dG itself in fully duplex DNA (22,27).


    Notes
 
1 To whom correspondence should be addressed Email: loechler{at}bio.bu.edu Back


    Acknowledgments
 
We are grateful to Suse Broyde for kindly providing us with NMR coordinates and constraints. We also thank the Scientific Computing and Visualization group at Boston University for allocations of computational resources. This work was supported by NIH grant CA50432.


    References
 Top
 Abstract
 Introduction
 Materials and methods
 Results
 Discussion
 References
 

  1. Loechler,E.L. (1995) How are potent bulky carcinogens able to induce such a diverse array of mutations? Mol. Carcinogen., 13, 213–219.[ISI][Medline]
  2. Mackay,W., Benasutti,M., Drouin,E. and Loechler,E.L. (1992) Mutagenesis by the major adduct of activated benzo[a]pyrene, (+)-anti-BP-N2-Gua, when studied in an Escherichia coli plasmid using site-directed methods. Carcinogenesis, 13, 1415–1425.[Abstract]
  3. Rodriguez,H. and Loechler,E.L. (1993) Mutational spectra of the (+)-anti-diol epoxide of benzo[a]pyrene in a supF gene of an Escherichia coli plasmid: DNA sequence context influences hotspots, mutational specificity and the extent of SOS enhancement of mutagenesis. Carcinogenesis, 14, 373–383.[Abstract]
  4. Rodriguez,H. and Loechler,E.L. (1993) Mutagenesis by the (+)-anti-diol epoxide of benzo[a]pyrene: what controls mutagenic specificity? Biochemistry, 32, 373–383.
  5. Drouin,E. and Loechler,E.L. (1993) Mutagenesis by the (+)-anti diol epoxide benzo[a]pyrene does not significantly involve AP-sites: evidence that the complexity of the mutational spectra is due to adduct conformational polymorphism. Biochemistry, 32, 6555–6562.[ISI][Medline]
  6. Jelinsky,S.A., Mao,B., Geacintov,N.E. and Loechler,E.L. (1995) The major, N2-Gua adduct of the (+)-anti-benzo[a]pyrene diol epoxide is capable of inducing G->A and G->C, in addition to G->T, mutations. Biochemistry, 34, 13545–13553.[ISI][Medline]
  7. Rodriguez,H. and Loechler,E.L. (1995) Are base substitution and frameshift mutagenesis pathways interrelated? An analysis based upon studies of mutagenesis by the (+)-anti diol epoxide of benzo[a]pyrene. Mutat. Res., 326, 29–37.[ISI][Medline]
  8. Drouin,E. and Loechler,E.L. (1995) The major adduct of the (+)-anti diol epoxide benzo[a]pyrene can be unstable in double stranded DNA. Biochemistry, 34, 2251–2259.[ISI][Medline]
  9. Shukla,R., Liu,Y., Geacintov,N. and Loechler,E.L. (1997) The major, N2-dG adduct of (+)-anti-B[a]PDE shows a dramatically different mutagenic specificity (predominantly G->A) in a 5'-CGT-3' sequence context. Biochemistry, 36, 10256–10261.[ISI][Medline]
  10. Shukla, R, Jelinsky,S., Liu T., Geacintov,N.E. and Loechler,E.L. (1997) How stereochemistry affects mutagenesis by N2-dG adducts of B[a]PDE: configuration of the adduct bond is more important than of the hydroxyl groups. Biochemistry, 36, 13263–13269.[ISI][Medline]
  11. Kozack,R. and Loechler,E.L. (1997) Molecular modeling of the conformational complexity of (+)-anti-B[a]PDE-adducted DNA using simulated annealing. Carcinogenesis, 18, 1585–1593.[Abstract]
  12. Harvey,R.G. (1991) Polycyclic Aromatic Hydrocarbons: Chemistry and Cancer. Cambridge University Press, Cambridge, UK.
  13. Phillips,D.H. (1983) Fifty years of benzo[a]pyrene. Nature, 303, 468–472.[ISI][Medline]
  14. Singer,B. and Grunberger,D. (1983) Molecular Biology of Mutagens and Carcinogens. Plenum Press, New York, NY.
  15. Conney,A.H. (1982) Induction of microsomal enzymes by foreign chemicals and carcinogens by polycyclic aromatic hydrocarbons. Cancer Res., 42, 4875–4917.[ISI][Medline]
  16. Dipple,A. (1985) Polycyclic aromatic hydrocarbon carcinogens. In Harvey,R.G. (ed.) Polycyclic Aromatic Hydrocarbons and Carcinogenesis. American Chemical Society Press, Washington, DC, pp. 1–17.
  17. Jones,P.W. (1982) Polynuclear aromatic hydrocarbons. In Bowman,M.G. (ed.) Handbook of Carcinogens and Hazardous Substances. Marcel Dekker, New York, NY, pp. 573–639.
  18. Grasso,P. (1984) Carcinogens in food. In Searle,C.E. (ed.) Chemical Carcinogens, 2nd Edn, ACS Monograph 182. American Chemical Society Press, Washington, DC, pp. 1205–1239.
  19. Cheng,S.C., Hilton,B.D., Roman,J.M. and Dipple,A. (1989) DNA adducts from carcinogenic and noncarcinogenic enantiomers of benzo[a]pyrene dihydrodiol epoxide. Chem. Res. Toxicol., 2, 334–340.[ISI][Medline]
  20. Sayer,J.M., Chadha,A., Agarwal,H.S.K., Yeh,H.J.C., Yagi,H. and Jerina,D.M. (1991) Covalent nucleoside adducts of benzo[a]pyrene 7,8-diol 9,10-epoxides: structural reinvestigation and characterization of a novel adenosine adduct on the ribose moiety. J. Org. Chem., 56, 20–29.[ISI]
  21. Shukla,R., Geacintov,N. and Loechler,E.L. (1999) The major N2dG adduct of (+)-anti-B[a]PDE induces G->A mutations in a 5'-AGA-3' sequence context. Carcinogenesis, 20, in press.
  22. Fountain,M.A. and Krugh,T.R. (1995) Structural characterization of a (+)-trans-anti-benzo[a]pyrene–DNA adduct using NMR, restrained energy minimization and molecular dynamics. Biochemistry, 34, 3152–3161.[ISI][Medline]
  23. Cosman,M., Hingerty,B.E., Luneva,N., Amin,S., Geacintov,N.E., Broyde,S. and Patel,D.J. (1996) Solution conformation of the (–)-cis-anti-benzo[a]pyrenyl-dG adduct opposite dC in a DNA duplex: intercalation of the covalently attached BP ring into the helix with base displacement of the modified deoxyguanosine into the major groove. Biochemistry, 35, 9850–9863.[ISI][Medline]
  24. Suh,M., Jankowiak,R., Ariese,F., Mao,B., Geacintov,N.E. and Small,G.J. (1994) Flanking base effects on the structural conformation of the (+)-trans-anti-benzo[a]pyrene diolepoxide adduct to N-2-dG in sequence-defined oligonucleotides. Carcinogenesis, 15, 2891–2898.[Abstract]
  25. Marsch,G.A., Jankowiak,R., Suh,M. and Small,G.J. (1994) Sequence dependence of benzo[a]pyrene diol epoxide–DNA adduct conformation distribution: a study by laser-induced fluorescence/polyacrylamide gel electrophoresis. Chem. Res. Toxicol., 7, 98–109.[ISI][Medline]
  26. LeBreton,P.E., Huang,C.-R., Fernando,H., Zajc,B., Lakshman,M.K., Sayer,J.M. and Jerina,D.M. (1995) Multiple fluorescence lifetimes for oligonucleotides containing single, site-specific modifications at guanines and adenines corresponding to trans addition of the exocyclic amino groups to (+)-(7R,8S,9S,10R)- and (–)-(7S,8R,9R,10S)-7,8-dihydroxy-9,10-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene. Chem. Res. Toxicol., 8, 338–348.[ISI][Medline]
  27. Cosman,M., de los Santos,C., Fiala,R. et al. (1992) Solution conformation of the major adduct between the carcinogen (+)-anti-benzo[a]pyrene diol epoxide and DNA. Proc. Natl Acad. Sci. USA, 89, 1914–1918.[Abstract/Free Full Text]
  28. de los Santos,C., Cosman,M., Hingerty,B., Ibanez,V., Margulis,L.A., Geacintov,N.E., Broyde,S. and Patel,D.J. (1992) Influence of benzo[a]pyrene diol epoxide chirality on solution conformations of DNA covalent adducts: the (–)-trans-anti-[BP]G·C adduct structure and comparison with the (+)-trans-anti-[BP]G·C enantiomer. Biochemistry, 31, 5245–5252.[ISI][Medline]
  29. Cosman,M., de los Santos,C., Fiala,R. et al. (1993) Solution conformation of the (+)-cis-anti-[BP]dG adduct in a DNA duplex: intercalation of the covalently attached benzo[a]pyrenyl ring into the helix and displacement of the modified deoxyguanosine. Biochemistry, 32, 4145–4155.[ISI][Medline]
  30. Geacintov,N.E., Cosman,M., Hingerty,B.E., Amin,S., Broyde,S. and Patel,D.J. (1997) NMR solution structures of stereoisomeric polycyclic aromatic carcinogen–DNA adducts: principles, patterns and diversity. Chem. Res. Toxicol., 10, 111–131.[ISI][Medline]
  31. Singh,S.B., Hingerty,B.E., Singh,U.C., Greenberg,S.P., Geacintov,N.E. and Broyde,S. (1991) Structure of the (+)- and (–)-trans-7,8-dihydroxy-anti-9,10-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene adducts to guanine-N2 in a duplex dodecamer. Cancer Res., 51, 3482–3492.[Abstract]
  32. Kirkpatrick,S.Jr, Gellat,C.D. and Vecchi,M.P. (1983) Optimization by simulated annealing. Science, 220, 671–680.[ISI]
  33. Geman,S. and Geman,D. (1984) Stochastic relaxation, Gibbs distributions and Bayesian restoration of images. IEEE Trans. Pattern Anal. Machine Intell., 6, 721–741.[ISI]
  34. Nilsson,L. and Karplus,M. (1986) Empirical energy functions for energy 1minimization and dynamics of nucleic acids. J. Comp. Chem., 7, 591–616.[ISI]
  35. Falsafi,S. and Reich,N.O. (1993) Molecular dynamics simulations of B-DNA: an analysis of the role of initial molecular configuration, randomly assigned velocity distribution, long integration times and nonconstrained termini. Biopolymers, 33, 459–473.[ISI][Medline]
  36. Kozack,R.E., Shukla,R. and Loechler,E.L. (1999) A hypothesis for what conformation of the major adduct of (+)-anti-B[a]PDE (N2-dG) causes G->T versus G->A mutations based upon a correlation between mutagenesis and molecular modeling results. Carcinogenesis, 20, 95–102.[Abstract/Free Full Text]
  37. Saenger,W. (1984) Principles of Nucleic Acid Structure. Springer-Verlag, New York, NY.
  38. Loechler,E.L. (1991) Molecular modeling in mutagenesis and carcinogenesis. Methods Enzymol., 203, 458–476.[ISI][Medline]
  39. Gilson,M.K. and Honig,B.H. (1987) Calculation of electrostatic potentials in an enzyme active site. Nature, 33, 84–86.
  40. Strenberg,M.J.E., Hayes,F.R.F., Russell,A.J., Thomas,P.G. and Fersht,A.R. (1987) Prediction of electrostatic effects of engineering of protein charges. Nature, 33, 86–88.
  41. Honig,B. and Nicholls,A. (1995) Classical electrostatics in biology and chemistry. Science, 268, 1144–1149.[ISI][Medline]
  42. Rajasekaran,E., Jayaram,B. and Honig,B. (1994) Electrostatic interaction in aliphatic dicarboxylic acids: a computational route to the determination of pKa shifts. J. Am. Chem. Soc., 116, 8238–8240.[ISI]
  43. Asensio,J.L., Lane,A.N., Dhesi,J., Bergqvist,S. and Brown,T. (1998) The contribution of cytosine protonation to the stability of parallel DNA triple helices. J. Mol. Biol., 275, 811–822.[ISI][Medline]
Received June 10, 1998; revised August 11, 1998; accepted September 2, 1998.