Consulting in Drug Design GbR, Gartenstraße 14, D-16352 Basdorf, Germany and
1 Istituto Superiore di Sanita', Laboratory of Comparative Toxicology and Ecotoxicology, Viale Regina Elena 299, I-00161 Rome, Italy
Abstract
Abbreviations: BRM, carcinogenic potency in mice; BRR, carcinogenic potency in rats; QSAR, quantitative structure-activity relationship; SAR, structure-activity relationship.
Introduction
The level of use and industrial importance of the aromatic amines have stimulated an enormous number of investigations, both epidemiological and experimental (3,7-9). The molecular determinants of the mechanisms of action of the aromatic amines have also been studied with methods based on structure-activity relationship (SAR) and quantitative structure-activity relationship (QSAR) concepts, mostly regarding their mutagenic properties (1). We have recently presented the first detailed QSAR analysis of the molecular determinants that rule the carcinogenic potency gradation within the subset of carcinogenic non-heterocyclic aromatic amines (1).
The following QSAR models were derived for the gradation of carcinogenic potency in carcinogenic non-heterocyclic aromatic amines (BRM, carcinogenic potency in mice; BRR, carcinogenic potency in rats):
[Equation 1]
[Equation 2]
The terms in the equations point to the physicochemical determinants that govern the carcinogenic potency gradation, whereas the signs (+ and −) indicate the direction of the effects (increasing or decreasing). The key factor for carcinogenic potency is hydrophobicity, expressed by logP. Both BRM and BRR increase with increasing hydrophobicity. In the case of BRM the influence of hydrophobicity is stronger for compounds with one amino group [characterized by the indicator variable I(monoNH2)] than for compounds with more than one amino group [characterized by the indicator variable I(diNH2)] (see the different coefficients 0.88 and 0.29). For BRM, electronic factors also play a role: potency increases with increasing EHOMO and with decreasing ELUMO. Such effects seem to be less important for BRR: no electronic terms occur in Equation 2. Carcinogenic potency also depends on the type of the ring system: aminobiphenyls [indicator variable I(Bi)] and, in the case of BRR, also fluorenamines [indicator variable I(F)] are intrinsically more active than anilines or naphthylamines. A bridge between the rings of the biphenyls decreases potency [I(BiBr)]. Steric factors are involved in the case of BRM but cannot be detected in the case of BRR. BRM strongly decreases with bulk in the positions adjacent to the functional amino group, and bulky substituents at the nitrogen and in position 3 also decrease potency. The latter effects are, however, not so important. In the case of BRR, R = (Me)NO strongly enhances potency (compounds with this substituent have no measured value for BRM). [The variables I(monoNH2), I(diNH2), I(Bi), I(F), I(BiBr) and I(RNNO) are called indicator variables, and indicate the presence (value = 1) or absence (value = 0) of the feature that they are coding for.
For example, the presence of only one NH2 group in the molecule implies that I(monoNH2) = 1 and I(diNH2) = 0; thus only the coefficient of I(monoNH2) (0.88) contributes to the value taken by BRM.]
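The way the indicator variables switch between the two logP coefficients can be illustrated with a minimal sketch. It computes only the hydrophobic contribution to BRM from the two coefficients quoted above (0.88 and 0.29); the remaining terms of Equation 1 are not reproduced, and the function name and the logP value are hypothetical choices made for illustration.

```python
def hydrophobic_term_brm(log_p: float, n_amino_groups: int) -> float:
    """Hydrophobic contribution to BRM only (all other Equation 1 terms omitted)."""
    if n_amino_groups < 1:
        raise ValueError("an aromatic amine needs at least one amino group")
    # Indicator variables: exactly one of the two is 1, the other is 0.
    i_mono = 1 if n_amino_groups == 1 else 0
    i_di = 1 if n_amino_groups > 1 else 0
    return 0.88 * log_p * i_mono + 0.29 * log_p * i_di


# Hypothetical logP value, used for illustration only.
print(hydrophobic_term_brm(2.0, 1))  # only the 0.88 coefficient contributes
print(hydrophobic_term_brm(2.0, 2))  # only the 0.29 coefficient contributes
```

For the same logP, the mono-amino compound receives roughly three times the hydrophobic contribution of the compound with more than one amino group, which is the difference the two coefficients express.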
The two equations model the gradation of potency of the carcinogenic aromatic amines, but not all aromatic amines are carcinogenic. Even though Equations 1 and 2 have high descriptive power for the gradation of potency, they did not describe inactive compounds correctly: in fact, we found that the non-carcinogenic amines were predicted to be weakly active by applying the two equations (1). In other terms, the models did not discriminate between weakly active amines and inactive amines. This indicates that yes/no activity depends, to some extent, on different molecular properties than the gradation of carcinogenic potency within the active compounds; this has been already shown for the mutagenic activity and potency of aromatic amines by Benigni et al. (2).
In the present work discriminant functions are generated which can separate carcinogenic from non-carcinogenic aromatic amines. The derived discriminant functions point to the molecular determinants that make the difference between the two sets of aromatic amines, thus lending themselves to scientific rationalization. In addition the present and previous QSAR models, combined together, provide a reliable tool for estimating the carcinogenicity of yet untested aromatic amines. Predictions can now be made in two steps: (i) yes/no prediction with the help of these discriminant functions; and (ii) for compounds predicted to be carcinogenic, prediction of the degree of carcinogenic potency from the QSARs presented in Benigni et al. (1).
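The two-step scheme can be sketched as a small piece of glue code. This is only an outline under stated assumptions: the classifier and the potency model are passed in as functions, and the toy stand-ins used in the example are hypothetical, not the published discriminant functions or QSAR equations.

```python
from typing import Callable, Mapping, Optional

Descriptors = Mapping[str, float]


def predict_two_step(
    descriptors: Descriptors,
    is_carcinogenic: Callable[[Descriptors], bool],
    potency: Callable[[Descriptors], float],
) -> Optional[float]:
    """Step (i): yes/no call. Step (ii): potency, only for predicted actives."""
    if not is_carcinogenic(descriptors):
        return None  # predicted inactive: no potency estimate is made
    return potency(descriptors)


# Hypothetical stand-ins for illustration, NOT the published models.
def toy_classifier(d: Descriptors) -> bool:
    return d["score"] > 0.0


def toy_potency(d: Descriptors) -> float:
    return 1.0 + d["logP"]


print(predict_two_step({"score": 0.5, "logP": 2.0}, toy_classifier, toy_potency))
print(predict_two_step({"score": -0.5, "logP": 2.0}, toy_classifier, toy_potency))
```

Keeping the two steps separate mirrors the observation that the potency equations alone would predict inactive amines to be weakly active.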
Materials and methods
Biological data
The rodent carcinogenicity data were derived from two sources. The primary source was the results of the National Cancer Institute/National Toxicology Program (NCI/NTP) carcinogenicity experiments, for which we considered Selkirk and Soward's compilation (11) and, for the more recent experiments, the individual NTP Technical Reports. The updated NTP database is also available on the internet at http://ntp-server.niehs.nih.gov/htdocs/pub.html. Only for chemicals not included in the NCI/NTP database did we use the carcinogenicity database established by Gold et al. (10). This database is also available at http://potency.berkeley.edu/cpdb.html. In the case of chemicals present in both databases and tested in more than one laboratory, we used the NTP data, because of the well controlled and constant protocols used by the NTP (11).
Out of the 82 chemicals listed in Table I, three were not considered in the analyses for structural reasons (see above). Out of the 79 chemicals actually considered, the majority (55) were derived from the NTP database and 24 from Gold's database. The chemicals tested by NTP were (numbered as in Table I) 7, 9, 10, 12-53, 55-58, 73, 74, 76 and 79-81.
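The source-selection rule described above (NTP data first, Gold's database only as a fallback) can be written as a short sketch. The function name and the boolean flags are hypothetical; only the priority rule and the counts 55, 24 and 79 come from the text.

```python
from typing import Optional


def select_source(in_ntp: bool, in_gold: bool) -> Optional[str]:
    """Prefer NCI/NTP data; fall back to the Gold et al. database."""
    if in_ntp:
        return "NTP"  # NTP wins even when the chemical is in both databases
    if in_gold:
        return "Gold"
    return None  # no rodent carcinogenicity data available


# Counts reported in the text: 55 NTP chemicals plus 24 from Gold's database.
ntp_count, gold_count = 55, 24
print(ntp_count + gold_count)  # chemicals actually analysed
print(select_source(in_ntp=True, in_gold=True))
```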
All biological data are summarized in Table I, which presents carcinogenicity scores for the four rodent experimental groups (rat, mouse, male, female) and for overall carcinogenicity. The scoring for the experimental groups is from 1 to 4 ("no evidence", "equivocal evidence", "some evidence", "clear evidence"); for the overall value it is from 1 to 3 (inactive, equivocal or borderline, active). In the overall carcinogenicity scores, we relied on the traditional NCI/NTP classification, which considers as a carcinogen any chemical that has at least "some evidence" of carcinogenicity in one experimental system. The scoring of 1-4 for the experimental groups follows the recent classification by NCI/NTP. For older experiments, the carcinogens (no distinction between "some evidence" and "clear evidence") were given a score of 4. These scores are only ordinal scores and were not used as quantitative values in the mathematical analyses. Moreover, in order to construct the QSAR models only on the most reliable results, the "equivocal evidence" chemicals were discarded. The "some evidence" carcinogens form too small a class, so they were not considered in the discriminant analyses of the individual experimental groups. The carcinogenicity classes were therefore defined as follows: (i) overall carcinogenicity: class 1 (inactive compounds), score 1; class 2 (active compounds), score 3; (ii) rat and mice carcinogenicity: class 1 (inactive compounds), score 1; class 2 (active compounds), score 4.
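The mapping from ordinal scores to the two classes used in the discriminant analyses can be summarized in a brief sketch. The function names are hypothetical; the score-to-class assignments follow the definitions above, and `None` marks compounds left out of an analysis.

```python
from typing import Optional


def overall_class(score: int) -> Optional[int]:
    """Overall score: 1 -> class 1 (inactive), 3 -> class 2 (active)."""
    # Score 2 (equivocal or borderline) is excluded from the analysis.
    return {1: 1, 3: 2}.get(score)


def group_class(score: int) -> Optional[int]:
    """Experimental-group score: 1 -> class 1 (inactive), 4 -> class 2 (active)."""
    # Scores 2 (equivocal evidence) and 3 (some evidence) are excluded.
    return {1: 1, 4: 2}.get(score)


print([overall_class(s) for s in (1, 2, 3)])
print([group_class(s) for s in (1, 2, 3, 4)])
```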
Discriminant analysis
Non-elementary discriminant analysis (12) was used, resulting in canonical discriminant functions of the general form
[Equation 3]
Results
Overall carcinogenicity
The discriminant function
[Equation 4]
[Equation 5]
In contrast to the QSAR equations describing carcinogenic potency within the active compounds (Equations 1 and 2), no logP term appears in these functions, so that hydrophobicity does not appear to be a key factor in class separation. There is, however, a significant multiple correlation between logP and EHOMO, ELUMO, MR6 and I(BiBr) with r = 0.73, so that some hydrophobic effect might be hidden behind these parameters. The probability of a compound being assigned to the active class increases with decreasing values of EHOMO and increasing values of ELUMO, the contribution of ELUMO being more important. Clearly, this effect is opposite to that seen in Equation 1. This is also true for bulk in the meta-position: according to Equation 1, bulk in this position decreases the carcinogenic potency of active amines while, according to the discriminant functions in Equations 4 and 5, bulk in the meta-positions supports the assignment of compounds to the active class. As explained in the Discussion, these opposite contributions in the two models are likely to point to the existence of optimal values of the parameters for maximum activity.
With respect to the ortho position, the probability of a compound being carcinogenic increases with MR6. According to the conventions used, the appearance of a substituent in position 6 implies 2,6-disubstitution. There are four compounds of this type in the data set with either 2,6-Me2 or 2,6-Cl2 (nos 10, 15, 72 and 77), all of which belong to the active class. The real meaning of the MR6 term is that this substitution pattern obviously supports carcinogenic activity, which is in keeping with the statement "substitution of a chloro group or methyl group or methoxy group ortho to the amino group often enhances potency" (7). Again, the situation for the gradation of potency (Equation 1) is different: there, bulk in this position appears to be unfavorable for the degree of carcinogenic potency in mice. L(R) and B5(R) in Equations 4 and 5 cannot be replaced by MR(R). This combination is not easy to interpret as both variables are highly correlated (r = 0.97; this is the only disturbing correlation in the data set). B5(R) can be replaced by L(R)², indicating a possible optimum in substituent length at L(R)0 ≈ 3.1. This optimum is close to the length of COMe [L(R) = 4.06]. A possible explanation for the steric terms for the amino substituents then is that bulk in this position is unfavorable except for the COMe substituent. Obviously, the key factors governing yes/no activity and the gradation of potency within the active compounds are different; for a more detailed discussion of this point, see Discussion. Finally, other things being equal, the probability of being active decreases if a compound belongs to the group of anilines or if a bridge between the two rings in the biphenyl compounds occurs, and increases if NO2 groups are present.
Rat carcinogenicity
Compound nos 50 and 65 behave as outliers and drastically reduce the sharpness of class separation. For this reason they were eliminated. The following discriminant function achieves a highly significant separation of classes for female rat carcinogenicity (without nos 50 and 65):
[Equation 6]
The correct reclassification rate of discriminant function (6) amounts to 91.1% (Class 1, 93.3%; Class 2, 88.5%; misclassified compounds: nos 2, 16, 41, 47 and 66) with a fairly stable cross validation (all compounds: 80.4%; Class 1, 76.7%; Class 2, 84.6%). If the omitted compounds (nos 50 and 65) are included, the discriminant function remains stable but the correct reclassification rate is reduced to 84.5%.
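How the per-class and overall reclassification rates relate can be checked with simple arithmetic. The class sizes below (30 and 26 compounds) are not stated in the text; they are inferred here as the counts consistent with the reported percentages and with the five misclassified compounds, and should be read as an assumption.

```python
def rate(correct: int, total: int) -> float:
    """Percentage of correctly reclassified compounds, rounded to one decimal."""
    return round(100.0 * correct / total, 1)


# Inferred counts (assumption): 2 of 30 and 3 of 26 compounds misclassified.
class1_total, class1_correct = 30, 28
class2_total, class2_correct = 26, 23

print(rate(class1_correct, class1_total))  # Class 1 rate
print(rate(class2_correct, class2_total))  # Class 2 rate
print(rate(class1_correct + class2_correct, class1_total + class2_total))  # overall
```

The overall rate is the count-weighted combination of the two class rates, so it always lies between them.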
Substitution at the amino nitrogen can also be described by a simple indicator variable instead of Verloop parameters without loss of separating power now including di-substituted derivatives (without compound nos 50 and 65):
[Equation 7]
Replacing L(R) with I(NR) has no effect on the other terms of the discriminant function and leads to almost the same correct reclassification (all compounds: 94.1%; Class 1, 94.1%; Class 2, 86.2%), but cross validation shows a poorer result (all compounds: 73.0%; Class 1, 70.6%; Class 2, 75.9%). If MR(R) is used instead of I(NR), two more compounds (one from each class) are misclassified (correct reclassification rate 87.1%).
For male rat carcinogenicity a good separation of classes is achieved by the discriminant function in Equation 8 (without compound nos 32 and 78):
[Equation 8]
The correct reclassification rate amounts to 91.7% (Class 1, 92.9%; Class 2, 90.6%; misclassified compounds: nos 13, 27, 36, 42 and 75) with a good result for cross validation (all compounds: 83.3%; Class 1, 82.1%; Class 2, 84.4%). If the omitted compounds (nos 32 and 78) are included in the analysis, the resulting discriminant function does not change but the correct reclassification rate drops to 87.1%.
L(R) in Equation 8 can be replaced by MR(R) without loss of separating power; the resulting discriminant function, Equation 9, now also describes NR2 compounds (without no. 75):
[Equation 9]
The results obtained for male and female rats resemble each other and are similar to the overall carcinogenicity results. Of key importance for class separation are electronic properties as expressed by EHOMO and ELUMO, the type of ring system, and substitution in the ortho-position as well as at the amino nitrogen. The probability of a compound being assigned to the active class increases with increasing values of ELUMO, decreasing values of EHOMO, decreasing bulk of substituents in position 2 (ortho-position), decreasing length (or bulk) of substituents at the amino nitrogen and increasing number of aromatic rings (anilines have a distinctly lower probability of being active than biphenyls, fluorenes or naphthalenes). Another important feature promoting carcinogenic potency is the occurrence of an amino group in the ortho-position to the functional amino group. Of lesser importance are the variables I(diNH2), I(BiBr), MR5, and the cross product logP*I(diNH2). The I(diNH2) term seems to indicate that compounds with more than one amino group are intrinsically a little less active than compounds with only one amino group. This effect is counterbalanced by increasing hydrophobicity. Since the logP*I(diNH2) term outperforms the I(diNH2) term, the general message is that compounds with more than one amino group have an increased probability of being carcinogenic with increasing hydrophobicity. Other things being equal, the occurrence of a bridge between the two aromatic rings in biphenyls seems to decrease, and bulk in the meta position seems to increase, carcinogenic potency.
It becomes obvious again that the key factors differentiating between active and inactive compounds on the one hand, and those governing potency within the group of active compounds on the other, are different. The most pronounced differences are with respect to the importance of hydrophobicity and the directionality of electronic effects (see Discussion).
Mice carcinogenicity
In contrast to the overall results and to the rat carcinogenicity, only a very moderate class separation (reclassification rate 74%) with no stability in cross validation was achieved for mice carcinogenicity if all compounds were included. Therefore compounds had to be omitted not only in order to obtain an improved result, but also since no meaningful result at all would otherwise be obtained. In fact, the experience accumulated from years of study in the QSAR field has indicated that outliers are most often compounds for which the experimental data are unreliable, or that do not follow the same mechanism of action as other chemicals in the set (12,13). The omitted compounds were no. 34 for female and male mice plus nos 50 and 66 in the case of female mice and nos 42 and 75 in the case of male mice. For female carcinogenicity, the description of N-substitution by L(R) and B5(R) was not essential, so that only the discriminant function with the simple indicator I(NR) is presented. For male mice, on the contrary, replacement of L(R) and B5(R) by simpler descriptors brings about a sharp loss of separating power; therefore, only the discriminant function with these variables will be presented (only four N-disubstituted compounds were lost this way).
For female mice carcinogenicity, the following discriminant function reclassifies 85.7% of the compounds correctly (Class 1, 87.9%; Class 2, 83.3%; misclassified compounds: nos 7, 30, 38, 42, 49, 69, 70, 75 and 77) and is of acceptable stability in cross validation (all compounds: 81.0%; Class 1, 84.8%; Class 2, 76.7%):
[Equation 10]
[Equation 11]
The results were similar to the overall results and to the rat carcinogenicity results, with the difference that electronic effects, as expressed by EHOMO and ELUMO, were of smaller importance. In the case of male mice carcinogenicity, a correct reclassification rate of 79.6% was still obtained if EHOMO and ELUMO were eliminated from the discriminant function in Equation 11, and for female mice carcinogenicity no electronic terms were observed. There is, however, a significant multiple correlation between the variables appearing in Equation 10 and EHOMO (r = 0.64) as well as (ELUMO − EHOMO) (r = 0.52), so that some electronic effect may well be hidden behind these variables. The direction of the electronic effect for male mice carcinogenicity is the same as observed for the other carcinogenicity scales (the probability of a compound being assigned to the active class increases with decreasing values of EHOMO and increasing values of ELUMO) and, thus, again opposite to the effect observed for the gradation of potency in mice (Equation 1). Very important for class separation is the type of ring system: I(An) alone already reclassifies >80% of the Class 1 compounds (but only ~50% of the Class 2 compounds) correctly, so that anilines have a distinctly lower probability of being active than biphenyls, fluorenes or naphthalenes. Hydrophobicity and the number of amino groups also influence carcinogenicity, but this influence is only of secondary importance. For male mice, the probability of a compound being assigned to the active class decreases if more than one amino group is present; this effect is counterbalanced by increasing hydrophobicity of the poly-amino compounds. In the case of female mice, the probability of being active increases with hydrophobicity both for compounds with only one amino group and for compounds with more than one; this effect is more pronounced in the latter group of compounds. The differences between the discriminant functions in Equations 10 [the terms logP*I(monoNH2) and logP*I(diNH2) occur but not the term I(diNH2)] and 11 [the terms logP*I(diNH2) and I(diNH2) occur but not logP*I(monoNH2)] are probably due to fluctuations in the data structure, as these variables are highly correlated in the data sets for both female and male mice.
For female mice, substitution at the amino nitrogen is unfavorable for (decreases) carcinogenic potency, as was also seen for female and male rats. For male mice, however, the situation is somewhat more complex and corresponds to the picture obtained for overall carcinogenicity. With regard to mono-substitution at the amino nitrogen, long substituents are unfavorable while substituent width seems to be allowed. The B5(R) term in the discriminant function in Equation 11 can be replaced by an L(R)² term without changing anything else, indicating a length optimum for the N-substituent at about L(R)0 ≈ 3.4. As it is, of course, not possible to increase B5(R) substantially without also increasing L(R), the combination of L(R) and B5(R) and the combination of L(R) and L(R)² do, in fact, tell the same story. The length optimum is close to the length of COMe (L = 4.06) (see Discussion).
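Why a combination of a linear and a squared length term implies an optimum can be shown with a short derivation. The coefficients a and b below are generic symbols introduced for illustration; they are not the values of Equation 11, which are not reproduced here.

```latex
% Generic sketch: w is the discriminant score, a and b are hypothetical coefficients.
\begin{align}
  w &= \dots + a\,L(R) + b\,L(R)^{2} \\
  \frac{\partial w}{\partial L(R)} &= a + 2b\,L(R) = 0
  \quad\Longrightarrow\quad
  L(R)_{0} = -\frac{a}{2b}
\end{align}
% The stationary point is a maximum when b < 0 (and then a > 0 for a positive optimum).
```

With a negative coefficient on the squared term, the score rises with substituent length up to L(R)0 and falls beyond it, which is how an optimum of about 3.4 can be read from the two fitted coefficients.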
Another important feature promoting carcinogenic potency is the occurrence of an amino group in the ortho-position to the functional amino group. Substitution (bulk) in the meta-position supports carcinogenic activity, as was also seen for the other carcinogenicity scales; again, this effect is opposite to that reflected by Equations 1 and 2. In the ortho-position a similar effect is seen for male mice carcinogenicity as already observed for the overall carcinogenicity. The MR6 term codes for 2,6-Me2 or 2,6-Cl2 substitution, and this pattern seems to be advantageous for carcinogenic potency, in keeping with the evidence presented in Equation 7. Finally, a bridge between the phenyl rings in the biphenyls seems to decrease carcinogenicity. The important point again is that the effects governing yes/no activity and the gradation of potency within the active compounds are different (see Discussion).
Discussion
Another difference exists with respect to substitution at the amino nitrogen. The effect for the gradation of potency is only moderate, while a strong effect exists for yes/no activity. The general tendency is that bulk at the nitrogen blocks carcinogenic activity. However, for overall and male mice carcinogenicity a length optimum exists which is close to the length of the COMe group. This probably reflects the fact that the first step of metabolic activation involves N-hydroxylation and/or N-acetylation (7).
According to Equation 1, bulk in the ortho-position decreases potency. This is also seen in the discriminant functions for rat carcinogenicity (MR2 term) but not in those for overall and mice carcinogenicity, where bulk in position 6 increases the probability of a compound falling into the active class (MR6 term). The results in Equation 1 show that it depends on the nature of an ortho-substituent whether it will decrease or increase carcinogenic potency. According to the conventions used, the appearance of a substituent in position 6 implies 2,6-disubstitution. There are four compounds of this type in the data set with either 2,6-Me2 or 2,6-Cl2 (nos 8, 13, 65 and 70), all of which belong to the active class with respect to overall and male mice carcinogenicity. The real meaning of the MR6 term is that this substitution pattern obviously supports carcinogenic potency, which is in keeping with the statement "substitution of a chloro group or methyl group or methoxy group ortho to the amino group often enhances potency" (7). An inhibition is reported only for larger substituents.
Equation 1 (but not Equation 2) also shows a steric effect in the meta-position: bulk decreases potency. For yes/no activity, on the contrary, a supporting effect of bulk in this position is observed in the discriminant functions for all carcinogenicity scales. It must be noted that the variation of bulk in the meta position(s) is only limited, so that the meta bulk terms are not very well supported. A possible explanation for the apparently different effects of bulk in meta with respect to yes/no activity and the gradation of mice carcinogenicity may be that bulk in this position affects metabolism and the interaction of the ultimate carcinogen with its target in a different way.
A further difference between the QSARs for the gradation of potency within active compounds and yes/no activity regards the type of aromatic ring system, which has a very strong effect for yes/no activity but is of only small importance in the case of the gradation of rat carcinogenicity (Equation 2) and of no importance for the gradation of mice carcinogenicity (Equation 1). For all carcinogenicity scales, the probability of a compound being assigned to the active class is smallest if only one aromatic ring is present (anilines).
As for the gradation of potency, the presence of a bridge between the two phenyl rings in biphenyls is also unfavorable for yes/no activity. A smaller effect influencing yes/no activity, but not the gradation of carcinogenic potency, is the activity supporting role of a NH2 group adjacent to the functional amino group.
QSAR models in perspective
A remarkable aspect of the present result is that the QSAR models are in keeping with, and can be interpreted based on what is known about, the mechanisms of action of aromatic amines. Aromatic amines have to be metabolized to reactive electrophiles in order to exert their carcinogenic potential. For aromatic amines and amides, this typically involves an initial N-oxidation to N-hydroxyarylamine and N-hydroxyarylamide (7,9). This is in agreement with the importance of the chemical reactivity parameters (EHOMO and ELUMO) in the QSAR models. In particular, EHOMO is a parameter for oxidation reactions. Moreover, steric factors (bulk, shape) are critical in the interaction between enzymes and chemicals to be metabolized (13). All the above parameters appear to make the difference between the amines that can be processed by the cellular machinery and those that cannot. On the contrary, hydrophobicity [which is normally a fundamental parameter for transport and ease of interaction with the enzymes (13)] appears to have a primary role for the gradation of the carcinogenic potency (Equations 1 and 2), but not for setting the threshold between carcinogenic and non-carcinogenic amines. Interestingly, the mutagenic properties of the aromatic amines also pointed to a similar picture: the patterns of molecular determinants for the potency and the yes/no activity were different, and were analogous to those found here for their carcinogenicity (2,14). This evidence also represents an indirect proof of the similarity of the mechanisms by which the aromatic amines act in Salmonella typhimurium (mutagenicity) and in rodents (carcinogenicity).
Another point to be remarked is that the QSAR models obtained can be used directly for estimating the carcinogenicity of non-heterocyclic aromatic amines for which experimental carcinogenicity data are not available. With the QSARs in Benigni et al. (1) and the present results, a two-step prediction of carcinogenicity of aromatic amines seems to be possible: (i) step 1, yes/no activity from the discriminant functions; and (ii) step 2, if the answer from step 1 is yes, prediction of the degree of potency from the Hansch equations in Benigni et al. (1). Thus, the QSAR models can contribute to the following: the direct synthesis of safer chemicals; estimation of the risk posed by amines in the environment; setting priorities for further experimentation, thus also reducing the use of experimental animals. Even though the mathematical models provide estimations and cannot replace the experimental results (when necessary), the goodness of fit of the present models points to a remarkable level of reliability for their practical use.
A critical aspect of QSAR modeling is the availability of a sufficient number of good quality data, and the fulfillment of certain requirements, such as sampling the chemicals in such a way that the chemical descriptors are poorly intercorrelated, in order to get clearer responses from the analysis (12,13). Unfortunately, these requirements are seldom fulfilled in toxicological QSAR analyses, notably in the case of rodent carcinogenicity results: the bioassays are too expensive and time consuming, and they are planned according to criteria (extent of use of the chemicals, specific scientific interest) different from those typical of QSAR. One has to use the data that are available in the literature. In the present work, we were in a far better situation than most of the other QSAR studies of carcinogenicity, since the class of aromatic amines is, by far, the most extensively bioassayed: the number of chemicals was sufficient for a thorough QSAR analysis, and out of the 79 chemicals actually analysed, the great majority (55) were derived from the same laboratory and were generated with the same protocol by the NTP. At the same time, we are aware that the different experimental origin of the remaining 24 chemicals (retrieved from Gold's database) may add some noise to the data set. However, the QSAR modelling was largely successful. The QSAR models were both good from a statistical point of view and, most important, coherent and meaningful from a mechanistic point of view. The cogency of our results was further supported by the fact that the QSAR models for the rodent carcinogenicity of the aromatic amines were quite similar to those obtained previously by us and other investigators (e.g. Corwin Hansch) for Salmonella mutagenicity (2,14). This is in agreement with the accepted notion that the basic steps of the action mechanism of aromatic amines are similar in the two experimental systems. This means that the QSAR modelling was able to highlight the general trends underlying the action of the aromatic amines, and was not confused by the experimental noise.
The successful modeling of in vivo data provided in this and in our previous paper deserves further comment. Whereas experimental results from in vitro systems are generally considered reliable enough for building models, the quality of in vivo data is often questioned. On the contrary, the robustness and interpretability of our results show that in vivo data can be successfully modelled. The possibility demonstrated in this paper of defining not only the molecular determinants of the gradation of potency, but also the existence of a marked chemical difference between carcinogenic and non-carcinogenic aromatic amines has further implications for the rodent carcinogenicity assay. Contrary to the claims that many positive results in the bioassay are artifacts due to nonspecific toxic effects of the high doses employed (15), the chemical difference between active and inactive amines strongly supports the wide range of arguments that toxicity has no, or a minor, role in the bioassay results (16-18). Overall, this evidence supports the reliability of the traditional rodent carcinogenicity assay.