1 Department of Pharmaceutical Biosciences, Pharmacology, Uppsala University, Box 591, SE-751 24 Uppsala and 2 Melacure Therapeutics AB and Pharmaceutical Chemistry, Uppsala University, Uppsala, Sweden
Abstract |
---|
Keywords: melanocortin receptors/melanocyte stimulating hormone/MSH peptides/protein-ligand interactions/proteo-chemometrics
Introduction |
---|
In a recent study, we developed an approach for analysing such receptor-ligand interaction data that uses multivariate mathematical modelling (Prusis et al., 2001a). The approach relied on the assignment of descriptors for the physicochemical properties of the receptor proteins and the ligands, and it yielded surprisingly robust models that gave useful information on the interactions of the ligand with the receptors, as well as revealing the presence of intra-receptor interactions. However, these trials involved the analysis of only four chimeric peptides on a set of chimeric receptors. We were therefore interested in evaluating the approach on a larger set of peptides with wider structural variation. In the present study we have used the new approach to analyse some of our previously published data (Schioth et al., 1998) on the interaction of linear and cyclic melanocortin peptides with wild-type and chimeric melanocortin MC1 and MC3 receptors. Here too, we obtained surprisingly good, statistically validated models. We have termed our novel approach proteo-chemometrics. It was recently also used to analyse the binding of the TRH peptide to melanocortin receptors (Prusis et al., 2001b) as well as the binding of low molecular weight organic compounds to α1-adrenoceptors (Lapinsh et al., 2001). Proteo-chemometrics promises to be a very effective method for the analysis of protein-ligand interactions.
Materials and methods |
---|
Description of receptors
The wild-type MC1 and MC3 receptors and the MC1/MC3 receptor chimeras from a previous study from our group (Schioth et al., 1998) were described essentially using the binary approach described in another of our recent studies (Prusis et al., 2001a). In brief, the receptor chimeras had been constructed by combining four sub-sequences of the wild-type MC1 and MC3 receptors, which are G-protein coupled receptors with seven transmembrane α-helices. The four sub-sequences were termed parts A, B, C and D. Part A included the N-terminus and the first transmembrane region (TM1), part B the TM2, TM3 and first extracellular loop (EL1), part C the TM4, TM5, half of TM6 and EL2, and part D the other half of TM6, TM7 and EL3 of the receptors. Each receptor was coded using a vector of four binary numbers. Each number in the vector corresponded to one of the parts A-D. If a part was taken from the MC1 receptor, its binary number was set to -1. If the part was taken from the MC3 receptor, the number was set to +1. Together with the descriptors for peptides described in the next section, these descriptors are collectively referred to as 'ordinary descriptors'.
Description of peptides
In our original study six peptides had been tested for their binding to the MC1, MC3 and MC1/MC3 receptors (Schioth et al., 1998). The sequences of these peptides are shown in Table I. Each peptide was described with a vector containing five binary numbers. The first was termed Nle and was set to +1 if the peptide contained an Nle residue in position 4; otherwise it was set to -1. The second, Met, was set to +1 when the peptide contained a Met residue in position 4; when it did not, it was set to -1. The third, Iod, was set to +1 if the Tyr2 residue was iodinated; otherwise it was given a value of -1. The fourth, D-Phe, was set to +1 if the Phe7 was a D-isomer and to -1 otherwise. The fifth, Cycle, was set to +1 if the peptide was cyclized with a Cys4-Cys10 bridge; otherwise it was set to -1. The binary descriptors are further indicated in Table I. Together with the descriptors for receptors described in the previous section, these descriptors are collectively referred to as 'ordinary descriptors'.
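The binary coding described above can be illustrated with a minimal Python sketch. The function names and the example chimera are hypothetical and are not taken from the original study. The receptor sign convention (MC1 coded as -1, MC3 as +1) is an assumption inferred from the way the coefficient signs are interpreted in the Discussion, and the α-MSH and NDP-MSH feature sets are written from their standard compositions rather than copied from Table I.

```python
# Descriptor order follows the text: five peptide terms, four receptor parts.
PEPTIDE_DESCRIPTORS = ("Nle", "Met", "Iod", "D-Phe", "Cycle")
RECEPTOR_PARTS = ("A", "B", "C", "D")


def encode_peptide(features):
    """Return +1 for every descriptor listed in `features`, -1 otherwise."""
    unknown = set(features) - set(PEPTIDE_DESCRIPTORS)
    if unknown:
        raise ValueError(f"unknown peptide descriptors: {sorted(unknown)}")
    return [1 if name in features else -1 for name in PEPTIDE_DESCRIPTORS]


def encode_receptor(origins):
    """Map each part A-D to -1 (from MC1) or +1 (from MC3).

    The sign convention is an assumption inferred from the Discussion.
    """
    code = {"MC1": -1, "MC3": 1}
    return [code[origins[part]] for part in RECEPTOR_PARTS]


# alpha-MSH: Met in position 4, L-Phe7, not iodinated, linear.
alpha_msh = encode_peptide({"Met"})
# NDP-MSH: Nle in position 4 and D-Phe in position 7.
ndp_msh = encode_peptide({"Nle", "D-Phe"})

wild_type_mc1 = encode_receptor(dict.fromkeys(RECEPTOR_PARTS, "MC1"))
# Hypothetical chimera: part A from MC1, parts B-D from MC3.
chimera = encode_receptor({"A": "MC1", "B": "MC3", "C": "MC3", "D": "MC3"})
```

One row of the X matrix for a peptide-receptor pair would then be the receptor vector followed by the peptide vector, giving nine ordinary descriptors per observation.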
The receptor-ligand binding activities were the Ki (in the case of [125I]-NDP-MSH, the Kd) values reported in our previous study (see Table I in Schioth et al., 1998). The negative logarithms of these values (pKi) were calculated (herein termed binding data) and, together with the receptor and peptide descriptions, used in the creation of PLS models for peptide binding. We also calculated a selectivity measure for each peptide-receptor combination. These measures (herein termed selectivity data) were obtained by subtracting the pKi value for each peptide-receptor pair from the corresponding peptide-MC1 receptor pKi value. The transformed data were used for the PLS analysis of selectivity, in a similar fashion to the analysis of peptide binding.
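As a small worked example of the two transformations, the sketch below computes pKi and the selectivity measure in Python. The Ki values are hypothetical and assumed to be expressed in mol/L; the function and variable names are illustrative only.

```python
import math


def pki(ki_molar):
    """Negative base-10 logarithm of a Ki value given in mol/L."""
    return -math.log10(ki_molar)


def selectivity(pki_mc1, pki_receptor):
    """pKi at the wild-type MC1 receptor minus pKi at the receptor of interest."""
    return pki_mc1 - pki_receptor


# Hypothetical Ki values (mol/L) for one peptide on two receptors.
ki = {"MC1": 1e-9, "chimera": 1e-7}

pki_values = {name: pki(value) for name, value in ki.items()}
sel = {
    name: selectivity(pki_values["MC1"], value)
    for name, value in pki_values.items()
}
```

With this definition the selectivity of every peptide on the wild-type MC1 receptor is zero, and a positive value means that the peptide binds the receptor in question more weakly than it binds MC1.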
Partial least squares in projection
The receptor and peptide descriptors were correlated with the corresponding binding and selectivity data using PLS. PLS is a statistical method that correlates a matrix of descriptors X with a matrix of dependent variables Y. The method is based on the transformation of the multivariate spaces of X and Y to new matrices of lower dimensionality that are correlated to each other. The dimensions of these new matrices are called latent variables. The reduction of dimensionality of the X and Y matrices is accomplished by principal component analysis (PCA)-like decompositions (Wold et al., 1987) that are slightly tilted to achieve maximum correlation between the latent variables of the X matrix and the latent variables of the Y matrix (Geladi and Kowalski, 1986). Models were also generated with cross-terms formed by multiplying the ordinary descriptors with each other. Prior to the PLS analysis, the X and Y data were mean centred and scaled to unit variance.
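The construction of cross-terms and the scaling step can be sketched with NumPy as follows. This is an illustration under stated assumptions: all pairwise products of the nine ordinary descriptors are formed (the text does not state whether any pairs were excluded), the ±1 rows are randomly generated placeholders rather than the study's data, and the helper names are hypothetical.

```python
import itertools

import numpy as np


def add_cross_terms(X, names):
    """Append the product of every pair of columns to X."""
    columns = [X]
    all_names = list(names)
    for i, j in itertools.combinations(range(X.shape[1]), 2):
        columns.append((X[:, i] * X[:, j])[:, None])
        all_names.append(f"{names[i]}*{names[j]}")
    return np.hstack(columns), all_names


def autoscale(M):
    """Mean centre each column and scale it to unit variance."""
    M = np.asarray(M, dtype=float)
    centred = M - M.mean(axis=0)
    std = M.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    return centred / std


names = ["A", "B", "C", "D", "Nle", "Met", "Iod", "D-Phe", "Cycle"]

# Placeholder data: 12 random observations coded as -1/+1.
rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(12, len(names)))

X_full, full_names = add_cross_terms(X, names)
X_scaled = autoscale(X_full)
```

Nine ordinary descriptors give 36 pairwise products, so the expanded matrix has 45 columns. Because every descriptor is ±1, a cross-term such as A*D equals +1 when both parts carry the same code and -1 when they differ.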
The PLS models were validated with regard to fitted R2 and cross-validated Q2 values (Eriksson et al., 1997). A model is considered acceptable for biological data if R2 > 0.7 and Q2 > 0.4 (Lundstedt et al., 1998). In addition to cross-validation, models were also validated using permutation validation with 10 validation rounds (Eriksson et al., 1997). The permutation validation gives R2 and Q2 intercepts, iR2 and iQ2. These values are estimates of the R2 and Q2 values of completely randomized data. The iQ2 value should preferably be below zero and the iR2 considerably smaller than the R2 value of the model (Eriksson et al., 1997). Moreover, Q2 was used to determine how many PLS latent variables should be included in the PLS models (Wold, 1978), as defined in the SIMCA 7.0 manual (SIMCA, 1998).
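To make the R2 and Q2 statistics concrete, the following sketch fits a single-response PLS model with the textbook NIPALS algorithm and computes R2 from the fitted values and Q2 from leave-one-out predictions. It is not the SIMCA implementation: the leave-one-out scheme, the synthetic data and all function names are assumptions made for illustration, and permutation validation is not shown.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """NIPALS PLS1 on centred X and y; returns the regression vector b."""
    Xr, yr = X.astype(float).copy(), y.astype(float).copy()
    W, P, c = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)        # weight vector
        t = Xr @ w                    # X scores (one latent variable)
        tt = t @ t
        p = Xr.T @ t / tt             # X loadings
        c_a = (yr @ t) / tt           # y loading
        Xr -= np.outer(t, p)          # deflate X
        yr -= c_a * t                 # deflate y
        W.append(w)
        P.append(p)
        c.append(c_a)
    W, P, c = np.array(W).T, np.array(P).T, np.array(c)
    return W @ np.linalg.solve(P.T @ W, c)


def r2_q2(X, y, n_components):
    """Fitted R2 and leave-one-out cross-validated Q2."""
    xm, ym = X.mean(axis=0), y.mean()
    b = pls1_coefficients(X - xm, y - ym, n_components)
    fitted = (X - xm) @ b + ym
    ss_total = ((y - ym) ** 2).sum()
    r2 = 1 - ((y - fitted) ** 2).sum() / ss_total

    press = 0.0  # predictive residual sum of squares
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        xm_i, ym_i = X[keep].mean(axis=0), y[keep].mean()
        b_i = pls1_coefficients(X[keep] - xm_i, y[keep] - ym_i, n_components)
        press += (y[i] - ((X[i] - xm_i) @ b_i + ym_i)) ** 2
    return r2, 1 - press / ss_total


# Synthetic data: 30 observations, 5 descriptors, small noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0]) + 0.1 * rng.normal(size=30)

r2, q2 = r2_q2(X, y, n_components=2)
```

Repeating the calculation for an increasing number of components and stopping when Q2 no longer improves is one simple way to choose the number of latent variables, in the spirit of the criterion cited above.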
Addition of cross-terms caused a considerable increase in the number of descriptors in the X matrix, some of which might not be relevant to the explanation of the studied activity, resulting in lower quality models. One of the methods that can be used to improve models in such cases is orthogonal signal correction (OSC) (Wold et al., 1998). OSC starts by calculating the principal component of the X matrix orthogonal to the studied activity. The extracted component is then subtracted from the X matrix, yielding a new corrected X' matrix. This new matrix can be used to extract more components that are orthogonal to the studied activity (Wold et al., 1998). When the desired number of orthogonal components has been subtracted from the original X, the new X' matrix can be used for PLS analysis. In the present study, we elected to extract two orthogonal components from the models that included cross-terms, because the extraction of more components did not result in considerable improvements. OSC was not applied to models where only ordinary descriptors were used.
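The idea of removing y-orthogonal variation from X can be sketched in a few lines of NumPy. Note that this is a simplified, non-iterative variant in the style of direct orthogonalization, not the iterative OSC algorithm of Wold et al. (1998) used in the study; the function name and the synthetic data are hypothetical.

```python
import numpy as np


def remove_orthogonal_components(X, y, n_components):
    """Subtract from centred X its largest components that are orthogonal to y."""
    Xc = X.astype(float).copy()
    y = y.astype(float)
    for _ in range(n_components):
        # Part of X that carries no covariance with y.
        X_orth = Xc - np.outer(y, y @ Xc) / (y @ y)
        # First principal-component loading of that part.
        _, _, vt = np.linalg.svd(X_orth, full_matrices=False)
        p = vt[0]
        t = X_orth @ p          # score vector, orthogonal to y by construction
        Xc -= np.outer(t, p)    # remove the component from X
    return Xc


# Synthetic centred data for illustration.
rng = np.random.default_rng(2)
X = rng.normal(size=(20, 6))
X -= X.mean(axis=0)
y = X[:, 0] + 0.1 * rng.normal(size=20)
y -= y.mean()

X_filtered = remove_orthogonal_components(X, y, n_components=2)
```

Because each removed score vector is orthogonal to y, the covariance between the columns of X and y is unchanged by the filter, while the total variation in X decreases.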
External prediction
The current approach was also validated using external prediction (Eriksson et al., 1997). The data set was divided into two equal parts; one part was called the 'training set' and the other the 'test set'. PLS models were then created using the training set, and the models obtained were used to predict the binding activities of the test set. The observations included in the training set were selected using 30 principal components from a PCA, which together explained all the variation of the X matrix (including cross-terms). The distances between every pair of observations in this 30-dimensional PCA space were then calculated. First, the two most distant observations were chosen. Next, for each of the remaining observations the product of its distances to the two chosen observations was calculated, and the observation yielding the largest product was chosen. The process was then repeated: the products of the distances of all remaining observations to the three chosen observations were calculated, and the observation yielding the largest product was again chosen. This was continued until all 30 observations for the training set had been chosen. This procedure was adopted so that the training set covered as much of the descriptor space as possible.
The 30 peptide-receptor combinations selected out of the 60 were: for [125I]-NDP-MSH the 1(6)3, 3(4)1, 1(1)3, 1(1)3(4)1, 1(4)3(6)1 and MC3 receptors; for NDP-MSH the MC1, 1(6)3 and 3(6)1 receptors; for α-MSH the MC1, 1(6)3, 3(6)1, 1(1)3, 1(1)3(4)1, 1(4)3(6)1 and MC3 receptors; for [Nle4]-α-MSH the 1(6)3, 3(4)1, 1(1)3, 1(4)3(6)1 and MC3 receptors; for cCDC the 1(4)3, 3(4)1, 1(1)3(4)1, 1(1)3(6)1 and MC3 receptors; and for cCLC the 1(6)3, 3(4)1, 1(1)3 and 1(4)3(6)1 receptors. [Notations for chimeric receptors are as given in Schioth et al. (1998).]
The PLS models were calculated from the training set only, for both affinity and selectivity, using just ordinary terms, ordinary and cross-terms, or ordinary and cross-terms with OSC. The goodness of the external prediction was characterized by calculating eQ2. The eQ2 was calculated identically to the Q2, with the difference that only the predicted pKi values for the test set data were used.
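The training-set selection and the external Q2 can be sketched as follows. The function names are hypothetical, the input is assumed to be a matrix of PCA scores with one row per observation, and the reference mean in eQ2 is taken over the observed test-set values, which is an assumption since the text does not specify it.

```python
import numpy as np


def select_training_set(scores, n_train):
    """Greedy selection that spreads the chosen points over the score space."""
    diff = scores[:, None, :] - scores[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))  # pairwise Euclidean distances

    # Start with the two most distant observations.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    chosen = [int(i), int(j)]

    while len(chosen) < n_train:
        remaining = [k for k in range(len(scores)) if k not in chosen]
        # Product of the distances to every observation chosen so far.
        products = [np.prod(dist[k, chosen]) for k in remaining]
        chosen.append(remaining[int(np.argmax(products))])
    return chosen


def external_q2(y_observed, y_predicted):
    """1 - (squared prediction error) / (variation of the observed test values)."""
    y_observed = np.asarray(y_observed, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    press = ((y_observed - y_predicted) ** 2).sum()
    ss_total = ((y_observed - y_observed.mean()) ** 2).sum()
    return 1 - press / ss_total


# Toy example: four observations on a line.
toy_scores = np.array([[0.0], [1.0], [2.0], [10.0]])
toy_selection = select_training_set(toy_scores, n_train=3)
```

In the toy example the two end points are picked first. The third pick is the point at 2, whose distance product (2 × 8 = 16) exceeds that of the point at 1 (1 × 9 = 9), which shows how the product criterion favours observations far from all of those already chosen.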
Results |
---|
The results of the PLS modelling are summarized in Table II. As indicated by the R2 and Q2 values, very good models were obtained when either the affinity or the selectivity was related to the ordinary descriptors of peptides and receptors. When cross-terms were added, the R2 and Q2 values increased somewhat, and after applying OSC the models improved substantially, with R2 and Q2 becoming 0.97 and 0.91 for affinity, and 0.91 and 0.83 for selectivity, respectively. Moreover, permutation validations indicated acceptable iR2 and iQ2 values (Table II). Figure 1 shows the correlation between observed and calculated pKi values for the final affinity and selectivity models.
External prediction
In order to demonstrate the predictive ability of the proteo-chemometrics approach, 30 experiments out of the total 60 were selected and used to create models that were evaluated for their ability to predict the activities of the remaining 30 peptide-receptor combinations (see Materials and methods). The results from this analysis are summarized in Table III and shown graphically in Figure 3. As can be seen from the table, good models were obtained for the partial data set; essentially the same pattern of model improvement was seen as reported above for the modelling of the full data set. Moreover, models for the partial data set predicted the remaining data excellently (Table III). For the best models (i.e. the models that included cross-terms and applied OSC) the external predictability, eQ2, amounted to 0.85 and 0.68 for binding and selectivity, respectively.
Discussion |
---|
In the present study, we chose to describe the peptides and receptors using a binary approach. Thus, assigning a value of +1 described all the physico-chemical properties of a part selected from one of two particular structures, whereas assigning a value of -1 described all the physico-chemical properties of the part selected from the other structure. In the case of receptors, this approach could be taken because each of the receptor sub-sequences had been taken from either the MC1 receptor or the MC3 receptor. The same approach was also used to describe the presence or absence of a feature, such as when the term Cycle was used to describe the presence or absence of a cycle in a peptide.
In the present study, we chose to represent the original data either as the logarithm of the binding affinity, or as the logarithm of receptor selectivity. This approach enabled analysis to be done both in terms of affinity and selectivity. As the two ways of expressing the data actually represent linear transformations of each other, the results derived from the two data sets show some resemblance, although notable differences also exist.
The success of our present and previous studies indicates the generality of our approach, which we have elected to term proteo-chemometrics. As used herein, the proteo-chemometrics analysis consisted of two steps. The first was to obtain a good statistically valid model that correlates the physico-chemical descriptions of variants of receptors and peptides with their binding interactions. The second was to interpret the results. The latter step includes understanding the meaning of the descriptors and interpreting the PLS coefficients in terms of their absolute values and directions.
Each ordinary descriptor accounts for the interaction of its corresponding moiety with other parts of the receptor and/or peptide. The cross-terms need a more complicated explanation. We first interpret the ordinary receptor terms and ordinary peptide terms, and then discuss how to interpret the cross-terms.
Interpretation of PLS coefficients of ordinary descriptors
For both the affinity and selectivity models, the largest absolute values for the PLS coefficients were found for the A, B and D terms, whereas the C term showed very small values. These results clearly indicate that other parts of peptides and/or receptors interact with parts A, B and D of the receptors in such a way that effects are seen on the ligand binding. In contrast, only minor interactions of this kind must take place with the C part of the receptor. All the PLS coefficients for the A, B and D variables were negative in the affinity model, which means that if these parts were taken from the MC1 receptor, the affinity for all peptides would (on average) increase. On the other hand, the positive PLS coefficients for these parts of the selectivity model show that selecting any of these parts from the MC3 receptor increases the selectivity measures for the peptides (on average), something which is, of course, expected.
The interpretation of peptides' ordinary terms is similar to the interpretation of ordinary receptor terms. Each of the peptide descriptors depicts changes in the peptide structure. Therefore, their terms model the interactions between the respective part of the peptide with other parts of the receptors and peptides. Because of the small size of the peptides, it seems reasonable to assume that the most important interactions are between the peptides and the receptors rather than inter-peptide interactions.
For both the binding and selectivity models the PLS coefficients for the Cycle, Nle, D-Phe and Iod terms had high absolute values, whereas the PLS coefficient for the Met term was smaller than average. Thus, these results indicate that the introduction of a methionine in position 4 of MSH peptides has very little influence on the peptide's affinity or selectivity. The positive PLS coefficients in the affinity model and the negative PLS coefficients in the selectivity model for the Nle, D-Phe and Iod terms indicate, on the other hand, that the introduction of D-Phe7, Nle4 and/or iodination of tyrosine generally enhances the peptides' binding affinity, while reducing their selectivity. Moreover, the PLS coefficient of the Cycle term was negative for the affinity model and positive for the selectivity model, indicating that cyclization of the peptides reduces their binding affinity but enhances their selectivity. Interestingly, however, if one inspects the data visually, one finds that iodinated NDP-MSH generally binds with a lower affinity than NDP-MSH, apparently violating one of the foregoing conclusions. The apparent discrepancy can be resolved by inspecting the cross-terms, which indeed predict that the binding affinity of [125I]-NDP-MSH is lower than that of NDP-MSH. A full discussion of this point is given in the section on peptide-peptide cross-terms.
Interpretation of PLS coefficients of receptor-receptor cross-terms
Interpreting cross-terms is a more complex issue. For example, a positive PLS coefficient for the cross-term for two receptor descriptors indicates that if both receptor parts are taken from the same receptor's sequence, they would enhance the activity being studied, whereas if they were taken from two different receptors' sequences they would reduce it. A negative PLS coefficient would of course reverse the direction of the predicted activity change. PLS coefficients of cross-terms between two receptor parts could therefore be assumed to model interactions between amino acids of two different parts that showed sequence variations.
In both the affinity and selectivity models the PLS coefficient of the A*D cross-term had a higher than average absolute value. Interestingly, the same cross-term was found to be important for TRH peptide binding to chimeric MC1/MC3 receptors (Prusis et al., 2001b). The PLS coefficient of the B*D cross-term also had a higher than average absolute value in the affinity model. In the selectivity model the absolute value of the PLS coefficient for this cross-term was below the average, but it was still quite high. The B*D cross-term was also found to be important in our previous study where we analysed the interaction of chimeric MSH peptides with MC receptor variants (Prusis et al., 2001a). However, in that study the A*D cross-term seemed to be of very minor importance, and was removed by the variable selection procedure applied there to improve the models (Prusis et al., 2001a). It may be noted that the chimeric peptides used in that study had been varied in the N-terminal and C-terminal regions of the MSH peptides, whereas the central part remained unchanged (Prusis et al., 2001a). In contrast, our present set of peptides was varied in the N-terminal and central parts of the peptides. Thus, it is tempting to speculate that interactions of receptor parts B and D are more important for the binding of the N-terminal and possibly C-terminal parts of the MSH peptides, whereas interactions of receptor parts A and D are more important for the binding of the central region of the peptides. Moreover, the positive PLS coefficients for the A*D and B*D cross-terms in the affinity model of the present study, and for the B*D cross-term of our previous study (Prusis et al., 2001a), indicate that selecting the corresponding receptor parts from the same MC1/MC3 receptor subtype increases the affinity for the MSH peptides. The negative PLS coefficient in the selectivity model indicates that selecting the corresponding receptor parts from the same MC1 or MC3 receptor makes the receptor more 'MC1 receptor-like' with respect to peptide selectivity. The teleological explanation for this would be that the natural combination of parts makes a receptor 'MSH receptor-like', whereas the non-natural combinations introduce artificial effects that cause deviations from this pattern.
Interpretation of PLS coefficients of peptide-receptor cross-terms
By analogy with the discussion above, the presence of a cross-term for a receptor part and a peptide part would indicate that specific interactions take place between the changing parts of the peptides and the receptors. Several such peptide-receptor cross-terms with receptor parts A, B and D were found to have large absolute values of their PLS coefficients, both for the affinity and the selectivity models. However, none of the absolute values of the PLS coefficients of such cross-terms involving part C were found to be above or even close to the average of the absolute values of all the PLS coefficients. These data therefore give additional support for specific interactions of peptide parts with certain parts of the receptor. Inspection of the particular cross-terms (Figure 2) would indicate that the Met/Nle, D/L-Phe and Tyr amino acids in the melanocortin peptides interact at a location situated between parts A, B and D of the receptors.
Interpretation of PLS coefficients of peptide-peptide cross-terms
The interpretation of cross-terms between peptide descriptors seems to be somewhat different from that of the cross-terms discussed in the previous section. We will first concentrate on the Met*D-Phe cross-term, which is the largest such cross-term for both the affinity and selectivity models (Figure 2). In order to understand its meaning we must investigate for which peptides it is positive and for which it is negative. An inspection of Table I shows that the Met*D-Phe cross-term takes positive values for the [Nle4]-α-MSH and cCLC peptides, whereas it takes negative values for the other four peptides used herein. This means that α-MSH is grouped together with all the peptides containing the D-Phe amino acid. In view of the fact that α-MSH lacks the D-Phe in position 7, and that all the D-Phe containing peptides lack methionine in position 4, the high value for the PLS coefficient of the Met*D-Phe cross-term might be interpreted as reflecting changes in the peptides' conformational space, rather than interactions between amino acids. As the PLS coefficient of the Met*D-Phe cross-term is retained with a negative value in the affinity model, one might assume that conformational effects occur in α-MSH that make it similar to the peptides having the D-Phe amino acid and, thus, that this effect enhances the peptide affinity.
Three cross-terms involving Iod had large absolute values for their PLS coefficients in the affinity model. These cross-terms were Nle*Iod, Iod*D-Phe and Iod*Cycle. By analysing these terms in an analogous way to that presented above, it is found that the large value for the PLS coefficient of the Nle*Iod cross-term indicates a similarity in the conformational space of [125I]-NDP-MSH with the peptides that lack the Nle amino acid. Similarly, the large value for the cross-term Iod*D-Phe indicates the presence of a similarity in the conformational space of [125I]-NDP-MSH with the peptides where the Phe7 is in its L optical isomeric form. Likewise, the large value for the Iod*Cycle cross-term indicates a similarity in the conformational space of [125I]-NDP-MSH with the cyclic peptides. Moreover, the directions of the PLS coefficients for the three major cross-terms indicate that the affinity of [125I]-NDP-MSH would decrease, thus reversing the positive effect explained by the ordinary descriptor Iod. Thus, it is most likely that the lower affinity of [125I]-NDP-MSH compared to NDP-MSH is caused by conformational changes of the peptides. However, the small absolute values of the PLS coefficients of the Nle*Iod, Iod*D-Phe and Iod*Cycle cross-terms in the selectivity model indicate that the conformational changes in the peptides' structure induced by iodination of Tyr2 do not have any great impact on selectivity.
In the selectivity model there was another cross-term between peptide parts that had an absolute value for its PLS coefficient larger than average: D-Phe*Cycle. However, for the affinity model the absolute value of the PLS coefficient for this cross-term was just below the average (Figure 2). This term might indicate either that the cCDC peptide is more similar in its conformational space to the peptides containing the L-Phe7 isomer, or that the cCLC peptide is more similar in its conformational space to the peptides containing the D-Phe7 isomer. The direction of the PLS coefficients for this term indicates either that cyclization of the peptides containing the D-Phe7 isomer increases selectivity and decreases affinity, or that cyclization of the peptides containing the L-Phe7 isomer decreases selectivity and increases affinity. However, the absolute values of the PLS coefficients for the D-Phe*Cycle cross-term indicate that the effects on selectivity are generally larger than those on affinity.
Conclusions
We have shown here that the proteo-chemometrics approach yields robust models that cross-validate well, including the rigorous test afforded by external prediction. The method seems to give highly useful information on the molecular interactions of peptides with their receptors, both with respect to their affinity and for the formation of peptide selectivity. Our analysis indicates that it is possible to obtain detailed information on the interactions of the amino acids in peptides with regions of peptide receptors. Moreover, our data indicate that some insight may be obtained into how changes in the conformational space of peptides lead to alterations in their biological activity. The molecules included in the analysis could also be subjected to any degree of modification, which would likely lead to a considerable increase in the resolution of the models. Thus, a task of high priority is to assess the limits of the present approach through further experimentation.
References |
---|
Geladi,P. and Kowalski,B.R. (1986) Anal. Chim. Acta, 185, 1-17.
Lapinsh,M., Prusis,P., Gutcaits,A., Lundstedt,T. and Wikberg,J.E.S. (2001) Biochim. Biophys. Acta, 1525, 180-190.
Lundstedt,T., Seifert,E., Abramo,L., Thelin,B., Nyström,Å., Pettersen,J. and Bergman,R. (1998) Chemometr. Intellig. Lab. Syst., 42, 3-40.
Prusis,P., Muceniece,R., Andersson,P., Post,C., Lundstedt,T. and Wikberg,J.E.S. (2001a) Biochim. Biophys. Acta, 1544, 350-357.
Prusis,P., Muceniece,R., Lundstedt,T. and Wikberg,J.E.S. (2001b) Lett. Pept. Sci., 7, 225-228.
Schioth,H.B., Yook,P., Muceniece,R., Wikberg,J.E.S. and Szardenings,M. (1998) Mol. Pharmacol., 54, 154-161.
SIMCA 7.0 (1998) Manual. Umetri, Umeå.
Wold,S. (1978) Technometrics, 20, 397-405.
Wold,S., Esbensen,K. and Geladi,P. (1987) Chemometr. Intellig. Lab. Syst., 2, 37-52.
Wold,S., Antti,H., Lindgren,F. and Öhman,J. (1998) Chemometr. Intellig. Lab. Syst., 44, 175-185.
Received March 26, 2001; revised December 5, 2001; accepted January 4, 2002.