Assessing the role of tryptophan residues in the binding site

Uttamkumar Samanta and Pinak Chakrabarti¹

Department of Biochemistry and Bioinformatics Centre, Bose Institute, P-1/12, CIT Scheme VIIM, Calcutta 700 054, India


    Abstract
 
Instead of looking at the interfacial area as a measure of the extent of a protein–protein recognition site, a new procedure has been developed to identify the importance of a specific residue, namely tryptophan, in the binding process. Trp residues which contribute more towards the free energy of binding have their accessible surface area reduced, on complex formation, for both the main-chain and side-chain atoms, whereas for the less important residues the reduction is restricted only to the aromatic ring of the side chain. The two categories of residues are also distinguished by the presence or absence of hydrogen bonds involving the Trp residue in the complex. A comparison of the observed change in the accessible surface area with the value calculated using an analytical expression provides another way of characterizing the Trp residues critical for binding and this has been used to identify such residues involved in binding non-proteinaceous molecules in protein structures.

Keywords: accessible surface area/molecular recognition/protein-protein complexes/substrate binding/tryptophan


    Introduction
 
The recognition and association between macromolecules are fundamental to the functioning of biological systems. The affinity between two molecules for the formation of non-covalent complexes can be quantified on a structural basis (Janin, 1995a,b). For example, in protein–protein complexes, such as those between protease and inhibitor or antibody and antigen, the interface covers an area of ~1500 Å² and contains ~10 hydrogen bonds (Chothia and Janin, 1975; Janin and Chothia, 1990). The two surfaces have good shape and electrostatic complementarity (Norel et al., 1994; Jones and Thornton, 1996; McCoy et al., 1997). Because these studies analyze the surface as a whole, they do not provide much insight into the contributions of individual residues to binding. To probe experimentally the energetic contributions of individual side chains to protein binding, alanine scanning mutagenesis (Wells, 1991) has been used to remove individual side chains selectively from an interface. Recently, Bogan and Thorn (1998) compiled a database of 2325 alanine mutants for which the change in free energy of binding upon mutation to alanine (ΔΔG) has been measured. They found that the free energy of binding is not evenly distributed across interfaces; instead, there are hotspots (ΔΔG ≥ 2 kcal/mol) of binding energy made up of a small subset of residues in the dimer interface. Of all the amino acid residues found in the interface, tryptophan (Trp) has the highest likelihood of being in a hotspot.

In this context, it would be of interest to see if there is any structural or binding feature in the three-dimensional structure of a complex that one can use to distinguish a Trp residue in the hotspot from another which is energetically less important. Such characteristics can then be used to assess the importance of Trp in a protein in the binding of other non-proteinaceous molecules, such as carbohydrate, cofactor, substrate or drug.

We have recently analyzed the environment of Trp residues (the aromatic part of the side chain, in particular) in protein structures, the nature of the interacting residues (partners) and the exponential dependence of the accessible surface area of the Trp residue on its number of partners (other protein residues in contact with Trp) (Samanta et al., 2000). As atoms buried at protein–protein interfaces are close-packed like the protein interior (Lo Conte et al., 1999), the aforementioned features of Trp residues in proteins should also be transferable to the residues in the interface region. Consequently, one should be able to assess the role of Trp in binding by finding the change in the number of its partner residues on complex formation and the associated loss in its accessible surface area, by looking at other elements of its environment and by comparing the results with those found within protein structures. This paper is an anatomy of Trp residues in energetic hotspots and in other, less important regions of protein–protein interfaces, as well as those involved in the binding of other small molecules.


    Materials and methods
 
Information on the Trp residues which are at the protein–protein interface, as revealed by the crystallographic analysis of the heterodimeric complex, was obtained from the file interface.xls in http://motorhead.ucsf.edu/~thorn/hotspot (Bogan and Thorn, 1998). Depending on their contribution towards the free energy of binding, these were classified as being or not being in hotspots. Only the complexes for which both thermodynamic and crystallographic data are available could be used; these are given in Table I. As outlined by Samanta et al. (2000), any residue with an atom within 4 Å of any Trp atom was considered a partner. In protein–protein complexes the partner residues are provided by both molecules, whereas consideration of only the parent molecule (containing the Trp residue) gave the partners before complexation.
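The partner definition above (any residue with an atom within 4 Å of any Trp atom) can be illustrated with a minimal Python sketch. The tuple layout of the atom records, the function name partner_residues and the coordinates are assumptions made for illustration only; they are not taken from the original analysis, and the treatment of the boundary case (exactly 4 Å) is a guess.

```python
from math import dist

CUTOFF = 4.0  # contact distance in Å, as stated in the text

def partner_residues(trp_atoms, other_atoms, cutoff=CUTOFF):
    """Return the residues that have at least one atom within `cutoff` of any Trp atom.

    trp_atoms   : list of (x, y, z) coordinates of the Trp atoms
    other_atoms : list of (chain_id, residue_number, (x, y, z)) for all other atoms
    (both layouts are hypothetical, chosen only for this sketch)
    """
    partners = set()
    for chain_id, res_num, xyz in other_atoms:
        if (chain_id, res_num) in partners:
            continue  # residue already counted as a partner
        if any(dist(xyz, t) <= cutoff for t in trp_atoms):
            partners.add((chain_id, res_num))
    return partners

# Made-up coordinates: a Trp in chain B and atoms from chains A and B.
trp = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)]
others = [
    ("B", 170, (0.0, 3.5, 0.0)),  # same chain, in contact
    ("A", 43, (5.0, 0.0, 0.0)),   # other chain, 3.6 Å from the second Trp atom
    ("A", 44, (0.0, 0.0, 9.0)),   # other chain, too far away
]

# Partners before complexation: only the parent molecule (chain B here).
before = partner_residues(trp, [a for a in others if a[0] == "B"])
# Partners in the complex: atoms from both molecules.
after = partner_residues(trp, others)
print(len(before), len(after))
```

Restricting the atom list to the parent chain mirrors the way the text obtains the partner count before complexation.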


Table I. Trp residues in the interface and their partner residues
 
For the analysis of the role of Trp in binding small molecules (termed substrates in this paper), all non-proteinaceous molecules (excluding water) in contact with Trp residues were identified for a selected set of 180 protein structures from the Protein Data Bank (PDB) (Sussman et al., 1998); the methodology and the files used are given in Samanta et al. (2000). The solvent-accessible surface area (ASA) was computed using the program ACCESS (Hubbard, 1991), which is an implementation of the Lee and Richards (1971) algorithm. The solvent probe size was 1.4 Å and the default van der Waals radii in the program were used in all calculations. Hydrogen bonds involving the donor sites (>NH groups in the main chain and at the side-chain NE1 position) and the acceptor site (main-chain O atom) of Trp residues and complementary sites in other protein residues or substrates were identified first by noting such groups within a cutoff distance of 3.5 Å and then by checking them visually on a graphics terminal.
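The distance screen that precedes the visual check of hydrogen bonds can be sketched as follows. This is only an illustration under stated assumptions: the caller is assumed to supply the complementary donor and acceptor atoms of the partner residues or substrate, the labels and coordinates are invented, and the visual inspection step described in the text is not reproduced.

```python
from math import dist

HBOND_CUTOFF = 3.5       # Å, the cutoff distance given in the text
TRP_DONORS = {"N", "NE1"}  # main-chain and side-chain >NH groups of Trp
TRP_ACCEPTORS = {"O"}      # main-chain carbonyl oxygen of Trp

def hbond_candidates(trp_atoms, partner_donors, partner_acceptors,
                     cutoff=HBOND_CUTOFF):
    """List candidate hydrogen bonds as (trp_atom, partner_label, distance).

    trp_atoms         : dict mapping Trp atom name -> (x, y, z)
    partner_donors    : list of (label, (x, y, z)) for donor atoms elsewhere
    partner_acceptors : list of (label, (x, y, z)) for acceptor atoms elsewhere
    (all input layouts are hypothetical, chosen only for this sketch)
    """
    hits = []
    for name, xyz in trp_atoms.items():
        if name in TRP_DONORS:
            pool = partner_acceptors  # a Trp donor needs an acceptor
        elif name in TRP_ACCEPTORS:
            pool = partner_donors     # the Trp acceptor needs a donor
        else:
            continue                  # other Trp atoms are not considered
        for label, p_xyz in pool:
            d = dist(xyz, p_xyz)
            if d <= cutoff:
                hits.append((name, label, round(d, 2)))
    return hits

# Made-up example: NE1 lies 2.9 Å from an acceptor; O is 4.0 Å from a donor.
trp_atoms = {"N": (0.0, 0.0, 0.0), "NE1": (10.0, 0.0, 0.0),
             "O": (0.0, 5.0, 0.0), "CA": (1.0, 1.0, 1.0)}
acceptors = [("Asp43:OD1", (12.9, 0.0, 0.0))]
donors = [("Lys44:NZ", (0.0, 9.0, 0.0))]
candidates = hbond_candidates(trp_atoms, donors, acceptors)
print(candidates)
```

Candidates returned by such a screen would still need the geometric inspection that the authors carried out on a graphics terminal.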


    Results and discussion
 
Trp residues in protein interface

Table I lists the Trp residues which are, or are not, in hotspots, as elucidated by Bogan and Thorn (1998). The number of partner residues in contact with the Trp residue, considering either the whole residue or just the aromatic ring, before and after complex formation and the names of the partner residues are also provided. Figure 1 depicts a Trp in the interface and how its partners are disposed in the two subunits.



Fig. 1. Residues (Cα positions indicated) in contact with Trp169B in the PDB file 3hhr (details are available in Table I). The two subunits (and their residues) in the complex are drawn in different colours and the residues up to a distance of 15 Å from the Trp residue are used to draw the surface plot. The diagram was made using RASMOL (Sayle and Milner-White, 1995).

 
An analysis of the environment of the aromatic ring of Trp showed that the peak in the distribution of the number of protein residues in contact with the ring (the so-called partners) occurs at six (Samanta et al., 2000). The number of partners of the interface Trp residues coming from the same polypeptide chain is in the range 2–5, suggesting that the binding potential of these Trp residues is not completely satisfied in their parent molecule. On complex formation, this number increases to 6–8. In one case of a Trp residue not in a hotspot, the number of partners does not change at all. However, there is an important feature which distinguishes Trp residues in hotspots from those which are not. If one compares the observed values of the change in accessible surface area of Trp residues on complex formation, ΔASAw and ΔASAr in Table II, the former considering the whole Trp residue and the latter only the aromatic ring, it is found that the two sets of values are significantly different for Trp residues in a hotspot, but are nearly the same when Trp residues are not in a hotspot. This suggests that for non-hotspot Trp residues, the change in the accessible surface area on complex formation is essentially restricted to the indole part of the side chain, but for residues in a hotspot the change in accessibility is spread over the whole residue (including the main chain and the CB atom). Moreover, when not in a hotspot, Trp residues are not found to form any extra hydrogen bonds in the complex, whereas in two out of three hotspot residues there are hydrogen bonds engaging the Trp residue with the physiological partner molecule.
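The comparison of the whole-residue and ring-only changes can be made concrete with a short Python sketch. Since the ring atoms are a subset of the residue, the difference between the two changes is the ASA lost by the main-chain and CB atoms. The function names and all numerical values below are invented for illustration, and no numerical threshold separating the two classes is implied by the text.

```python
def delta_asa(asa_free, asa_complex):
    """Loss of accessible surface area (Å^2) on complex formation."""
    return asa_free - asa_complex

def non_ring_share(d_whole, d_ring):
    """Fraction of the whole-residue ASA loss that comes from atoms
    outside the aromatic ring (main chain and CB)."""
    if d_whole <= 0:
        return 0.0  # nothing buried on complexation
    return (d_whole - d_ring) / d_whole

# Made-up ASA values (Å^2) for two hypothetical Trp residues.
# Residue 1: burial spread over the whole residue (hotspot-like pattern).
d_w1 = delta_asa(80.0, 20.0)  # whole residue
d_r1 = delta_asa(50.0, 10.0)  # aromatic ring only
# Residue 2: burial confined to the ring (non-hotspot-like pattern).
d_w2 = delta_asa(45.0, 5.0)
d_r2 = delta_asa(42.0, 3.0)

print(non_ring_share(d_w1, d_r1), non_ring_share(d_w2, d_r2))
```

A share close to zero corresponds to the case where the two observed values are nearly the same; a substantial share corresponds to burial extending to the main chain and CB atom.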


Table II. Change in the accessible surface area of Trp residues in the interface on complex formation
 
Using the analytical expression relating the accessible surface area of a Trp residue and its number of partners (Samanta et al., 2000) (Figure 2), the expected values of the change in ASA of Trp residues on complex formation, ΔASAw and ΔASAr, can be calculated. These calculated values are, in general, smaller than the observed values irrespective of whether or not the Trp residue is in a hotspot (Table II). This suggests that a Trp residue at the interface is less buried than an average Trp residue in the protein structure and/or that, on complex formation, the residue becomes more buried than the increase in its number of partners would suggest. As discussed above, the observed value of ΔASAw is greater than that of ΔASAr for Trp residues in a hotspot and the calculated values reflect the same trend. In contrast, for residues not in a hotspot the trend in the calculated values is just the opposite (with one exception), i.e. ΔASAw (calc.) < ΔASAr (calc.); in one case both values are 0.0, as there is no change in the number of partners on complex formation.



Fig. 2. A plot of the equation $\mathrm{ASA} = 189.63\,e^{-0.36x}$, relating the variation of the accessible surface area, ASA (Å²), of the aromatic ring of a Trp residue to its number (x) of partners. If the number of partners increases from 4 to 7 on complex formation, the change in ASA (ΔASAr) can be calculated from the above equation and is shown. Depending on the initial and final values of ASA, the observed value of ΔASAr could be different. Both the observed and calculated values are provided in Table II. Instead of the aromatic ring only, the whole Trp residue can also be considered for the ASA calculation and for finding the number of partners; the corresponding equation is $\mathrm{ASA} = 246.64\,e^{-0.22x}$.
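As an illustration of how the calculated change in ASA follows from the two equations in the caption of Fig. 2, a minimal Python sketch is given here. The function names are assumptions introduced for this example; only the coefficients and the 4-to-7 partner example come from the text.

```python
from math import exp

def asa_ring(x):
    """Expected ASA (Å^2) of the aromatic ring of Trp with x partners."""
    return 189.63 * exp(-0.36 * x)

def asa_whole(x):
    """Expected ASA (Å^2) of the whole Trp residue with x partners."""
    return 246.64 * exp(-0.22 * x)

def delta_asa_calc(x_before, x_after, model):
    """Calculated loss of ASA when the number of partners goes from
    x_before (parent molecule) to x_after (complex)."""
    return model(x_before) - model(x_after)

# The example from the figure caption: ring partners increase from 4 to 7.
d_ring = delta_asa_calc(4, 7, asa_ring)
# No change in the number of partners gives a calculated change of 0.0.
d_none = delta_asa_calc(5, 5, asa_whole)
print(round(d_ring, 2), d_none)
```

Because the partner count around the ring and around the whole residue are tabulated separately, each equation is evaluated with its own value of x.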

 
Trp residues in substrate-binding site

Based on the above observations on Trp residues in the protein interface, we wanted to see if it is possible to assess the importance of Trp residues in the binding site of non-proteinaceous molecules in protein structures. For a residue to be important, the following two conditions have to be satisfied: ΔASAw (obs.) ≥ 2ΔASAw (calc.) and ΔASAr (obs.) ≥ 2ΔASAr (calc.). These conditions are only approximate: when applied to the residues in Table II, they would have missed one hotspot residue and would also have identified one non-hotspot residue as important. However, in the case of substrate binding these conditions should be more appropriate. The substrates are usually much larger than an average amino acid residue, whereas the ΔASA values are calculated assuming that substrate binding increases the number of partners by just one; the calculated values are therefore expected to be smaller than the values actually observed if the Trp residue is crucial for the binding of the substrate. The other criterion for an important residue is the existence of a hydrogen bond between Trp and the substrate molecule.
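The two ASA conditions can be written out as a small Python sketch, assuming (as the text does) that the substrate adds exactly one partner. The function names and the observed values used in the example are hypothetical; the exponential expressions are those given with Fig. 2, and the hydrogen-bond criterion is deliberately left as a separate check.

```python
from math import exp

def asa_ring(x):
    """Expected ASA (Å^2) of the Trp aromatic ring with x partners."""
    return 189.63 * exp(-0.36 * x)

def asa_whole(x):
    """Expected ASA (Å^2) of the whole Trp residue with x partners."""
    return 246.64 * exp(-0.22 * x)

def passes_asa_criterion(obs_w, obs_r, partners_w, partners_r):
    """Check ΔASAw(obs) >= 2 ΔASAw(calc) and ΔASAr(obs) >= 2 ΔASAr(calc).

    partners_w, partners_r: number of protein partners of the whole residue
    and of the ring before substrate binding; the substrate is counted as
    one additional partner when computing the calculated values.
    """
    calc_w = asa_whole(partners_w) - asa_whole(partners_w + 1)
    calc_r = asa_ring(partners_r) - asa_ring(partners_r + 1)
    return obs_w >= 2 * calc_w and obs_r >= 2 * calc_r

# Made-up observed values (Å^2) for a Trp with 8 partners (whole residue)
# and 6 partners (ring) before the substrate binds.
strongly_buried = passes_asa_criterion(30.0, 25.0, 8, 6)
weakly_buried = passes_asa_criterion(30.0, 10.0, 8, 6)
print(strongly_buried, weakly_buried)
```

A residue passing this test would then be examined for a hydrogen bond to the substrate, which the text treats as the other criterion.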

The formulae of all the substrate molecules used in our analysis and their atoms which are found in contact with the indole ring of Trp residues in different PDB files are shown in Figure 3. Information on the Trp residues, their partners, accessible surface areas and how these change on substrate binding is provided in Table III. In one respect these Trp residues are different from those in the protein interface. Whereas the latter residues have 2–5 partners (around the aromatic ring) in the parent molecule (Table II), the majority of the former residues have ≥6. The substrate molecules are of different shapes and sizes. Trp residues which are deemed to be important in substrate binding using the conditions on ASA are marked with dots in the last column of Table III. If in addition there is a hydrogen bond between the Trp residue and the substrate, the residue is likely to be important in substrate binding. One example is the binding of FMN by Trp57 in the structure 1rcf. The entry 1stp corresponds to the structure of streptavidin, which binds biotin with exceptionally high affinity (Kd = 10⁻¹⁵ M) (Green, 1975). There are three Trp residues in the binding site (Weber et al., 1989) and all are shown to be important, thus lending credence to the predictive power of our methodology. Moreover, aromatic–sugar stacking is a typical feature of protein–carbohydrate interactions (Vyas, 1991; Kadziola et al., 1998). In all the structures (1byb, 1cel, 1slt and 2gbp) where a carbohydrate molecule is bound, there is at least one Trp residue which is shown to be important. However, the decrease in the accessible surface area on substrate binding may not be the best criterion for judging the role of a residue in every case. For example, in the binding of the small sulfate ion (structure 1sbp), there is hardly any change in ASA and the formation of the hydrogen bond could be the deciding factor.

Another situation where the comparison of the observed and calculated values of ΔASA may not yield the right result is when the number of partners is atypically small, e.g. 0. In 4fxn, although the observed value is one of the highest in the table, the calculated value is also large and their difference is very small. Nevertheless, this procedure provides some guidelines as to the importance of a Trp residue in binding, which can then be corroborated by protein engineering experiments.





Fig. 3. Various types of substrates interacting with Trp residues. Atoms in contact with the indole ring of Trp (in all the different structures containing the substrate) are highlighted using squares. Each diagram is labelled by the name of the compound, its formula, the name of the PDB file containing it, its name and number (including the subunit identifier) in the file, the interacting Trp residue number and the atoms which are interacting with Trp. All the PDB files containing the substrate are shown.

 

Table III. Accessibility of Trp residues in the presence of various substrates
 
Conclusion

Depending on the magnitude of their contribution towards the binding energy, interface residues have been classified as being or not being in a hotspot (Bogan and Thorn, 1998). In this paper we analyzed whether it is possible to distinguish Trp residues in hotspots from those which are not, on the basis of crystal structure data. We find that for Trp residues not in hotspots, the change in accessible surface area of the Trp residue on complex formation is restricted to the indole ring, whereas for hotspot residues the change involves the whole residue. Although the former residues do not form hydrogen bonds with the physiological partner molecule, a hydrogen bond is usually formed by the latter residues. Depending on the change in the number of partner residues, it is possible to calculate the expected change in the accessible surface area of a Trp residue due to complex formation. The observed values are generally found to be greater than the calculated values. Similar comparisons between the observed and calculated values, together with the identification of any hydrogen bond linking Trp to the substrate molecule, provide a way to assess the importance of Trp residues in substrate-binding sites. Based on these encouraging results involving Trp, we are now in the process of extending the methodology to other residues.


    Notes
 
1 To whom correspondence should be addressed. E-mail: pinak@boseinst.ernet.in


    Acknowledgments
 
The authors are grateful to the Department of Biotechnology and the Council of Scientific and Industrial Research for financial support.


    References
 
Bogan,A.A. and Thorn,K.S. (1998) J. Mol. Biol., 280, 1–9.

Chothia,C. and Janin,J. (1975) Nature, 256, 705–708.

Green,N.M. (1975) Adv. Protein Chem., 29, 85–133.

Hubbard,S.J. (1991) ACCESS, a Program for Calculating Accessibilities. Department of Biochemistry and Molecular Biology, University College London, London.

Janin,J. (1995a) Biochimie, 77, 497–505.

Janin,J. (1995b) Proteins: Struct. Funct. Genet., 21, 30–39.

Janin,J. and Chothia,C. (1990) J. Biol. Chem., 265, 16027–16030.

Jones,S. and Thornton,J.M. (1996) Proc. Natl Acad. Sci. USA, 93, 13–20.

Kadziola,A., Sogaard,M., Svensson,B. and Haser,R. (1998) J. Mol. Biol., 278, 205–217.

Lee,B. and Richards,F.M. (1971) J. Mol. Biol., 55, 379–400.

Lo Conte,L., Chothia,C. and Janin,J. (1999) J. Mol. Biol., 285, 2177–2198.

McCoy,A.J., Epa,V.A. and Colman,P.M. (1997) J. Mol. Biol., 268, 570–584.

Norel,R., Lin,S.L., Wolfson,H.J. and Nussinov,R. (1994) Biopolymers, 34, 933–940.

Samanta,U., Pal,D. and Chakrabarti,P. (2000) Proteins: Struct. Funct. Genet., 38, 288–300.

Sayle,R.A. and Milner-White,E.J. (1995) Trends Biochem. Sci., 20, 374.

Sussman,J.L., Lin,D., Jiang,J., Manning,N.O., Prilusky,J., Ritter,O. and Abola,E.E. (1998) Acta Crystallogr., D54, 1078–1084.

Vyas,N.K. (1991) Curr. Opin. Struct. Biol., 1, 732–740.

Weber,P.C., Ohlendorf,D.H., Wendoloski,J.J. and Salemme,F.R. (1989) Science, 243, 85–88.

Wells,J.A. (1991) Methods Enzymol., 202, 390–411.

Received May 9, 2000; revised October 31, 2000; accepted November 9, 2000.