School of Biochemistry and Molecular Biology, Garstang Building, University of Leeds, Leeds LS2 9JT, UK
1 To whom correspondence should be addressed. E-mail: jackson{at}bmb.leeds.ac.uk
Abstract
Keywords: drug design/evolutionary trace/molecular recognition/protein interaction/residue conservation
Introduction
A large number of structures have been determined for the SH2 family of domains, including src, hck, fyn, lck, syk, abl, p85, plc and shp-2, revealing that many of the domains exhibit a conserved architecture, containing a five-stranded antiparallel β-sheet core, sandwiched between two α-helices, additionally extended by a β-strand and a small triple-stranded β-sheet (Waksman et al., 1992). The preferred sequence of the pTyr-containing peptide ligand is pTyr-X1-X2-X3, where X1, X2 and X3 refer to the first (pTyr + 1), second (pTyr + 2) and third (pTyr + 3) positions within the peptide sequence. Structural studies show that the phosphorylated ligand binds perpendicularly to the five-stranded antiparallel β-sheet core, characteristically interacting with two well-defined pockets, which play a role in ligand recognition. The first of these pockets is the basic pTyr-binding pocket, containing several positively charged residues including a vital arginine (Arg32) from the SH2 signature motif, Phe-Leu-Val-Arg-Glu-Ser (FLVRES). It has been shown in src that mutation of Arg32 prevents virtually all binding of pTyr-containing ligands to the domain (Bibbins et al., 1993). In addition, it has been demonstrated that the phosphate is involved in hydrogen-bonding interactions with Ser34, Glu35 and Thr36 and makes hydrophobic contacts with the side chain of Lys60. Previously, the specificity of the interaction has largely been attributed to the hydrophobic second binding pocket, which is formed by two loop regions and accommodates the X3 residue. Indeed, mutations to the residues that make up the second cavity have been shown to result in changes to ligand binding specificity and activity (Marengere et al., 1994; Bradshaw and Waksman, 1998), although biochemical studies of the pTyr-Glu-Glu-Ile (pYEEI) phosphopeptide binding to Src-like domains have shown that the X1 and X2 positions also remain important for high-affinity recognition of the peptide (Gilmer et al., 1994). There is also evidence to suggest that the pTyr + 4 position may contribute to the binding affinity in the N-terminal SH2 domain of shp-1 (Beebe et al., 2000).
Studies of the interactions of SH2 domain-containing proteins provide an insight into the complex mechanisms and functions of signal transduction. However, the understanding of the cellular mechanisms has been somewhat hindered, perhaps by the lack of known inhibitors of SH2 domains which are effective in cell-based assays (Sawyer, 1998). Additionally, mechanistic studies have been curbed by the difficulty in making quantitative measurements of the blockage of signal transduction pathways via such inhibitors and the associated downstream readouts. Despite some recent successes (Shakespeare et al., 2000; Vu, 2000), progress in the field of SH2 domain structure-based drug design has also been slower than first anticipated. This is to some degree because of the lack of bioavailability, but also perhaps as a result of the high degree of similarity between the domains, particularly in the region which binds the conventional pTyr to pTyr + 3 motif, which has provided the chief focus of many drug design strategies. Experiments involving screening of randomized phosphopeptide libraries (Songyang et al., 1993) led to the idea that SH2 interactions maintain specificity by interacting with particular amino acid sequences contained within the phosphorylated peptide. However, it has also been suggested that while the selected preferential motifs are those with the highest affinity, the difference in affinity between a specific and a non-specific interaction is not necessarily high, amounting to less than two orders of magnitude (Ladbury and Arold, 2000). It has been questioned whether this is sufficient to guarantee mutual exclusivity in signalling pathways in cells with more than one type of SH2 domain. Indeed, in the same study, an investigation into the surfaces of four SH2 domains in terms of charge and polarity revealed that there is little to distinguish between the binding sites of src, p85, syp and grb (Ladbury and Arold, 2000). Nevertheless, SH2 domains remain a highly desirable drug target, owing to the range of potential applications, for example inhibitors of src for regulating bone resorption, zap-70 inhibitors for immune suppression and inhibitors of grb2, a component of the oncogenic Ras pathway.
SH2 domains have been used as model systems in several studies where techniques for prediction of protein–protein interfaces have been developed. Casari et al. (1995) represented entire proteins and sequence residues as vectors in a generalized sequence space to predict residues involved in protein function. To assess the validity and accuracy of an evolutionary trace method that defines binding surfaces common to protein families, Lichtarge et al. (1996) identified functional epitopes and residues within the SH2 domain family critical to binding. In testing an algorithmic tool for the identification of functional regions in proteins by surface mapping of phylogenetic information, Armon et al. (2001) demonstrated general surface conservation of the SH2 domain binding site. It was observed that the typical SH2 domain binding site was well conserved and the conservation patch decreased in size as the clade size from a phylogenetic tree was increased. However, while these studies predict the general location of the phosphotyrosyl binding site with a good degree of accuracy, they do not directly describe the differences between SH2 domains in terms of specific interactions with peptides.
Here, we address the issue of ligand recognition and discrimination by SH2 domains in the context of crystallographic and peptide screening data. A conservation scoring method (Valdar and Thornton, 2001) is used to analyse the degree of conservation present in and around the SH2 domain phosphotyrosyl binding site. The domains are analysed within the context of groups clustered according to residue similarity at the peptide binding site. This reveals regions of conservation that are not present uniformly across the whole family, indicating that there are significant differences between groups at the amino acid level. This binding site conservation is group-dependent but is not restricted to residue positions contacting the pTyr-X1-X2-X3 motif, which is generally considered to be most important for high affinity recognition. In some groups, conservation spreads more widely across the binding face. In others it follows the trajectory of the known peptide binding site outside the pTyr to pTyr + 3 peptide positions. Additionally, conservation difference maps determine group-dependent clusters of conserved residues that are not seen when considering a larger experimentally determined group. Several of these clustered residues are involved in protein–ligand contacts outside the conventional pTyr to pTyr + 3 binding motif, challenging the notion that this motif is largely responsible for recognition and discrimination of ligands.
Materials and methods
Coordinates of SH2 domains were retrieved from the Protein Data Bank (Berman et al., 2000) following sequence searches in Sequences Annotated by Structure (SAS) (Milburn et al., 1998) and text-based searches in PDBSum (Laskowski et al., 1997). Where there was more than one SH2 domain per PDB file, structural alignments were performed on the domain Cα coordinates using a geometric hashing structural alignment algorithm (R.M.Jackson, unpublished work) to select the structure with the highest level of atomic similarity to the rest of the dataset. In the case where only one of the domains contained a ligand, that domain was chosen. All other domains were removed from each PDB file. PREPI (Islam, 1995) was used to transform the coordinates of the c-src representative structure so that the surface of the peptide binding site was oriented on the y-plane. The z-axis represents the depth of the binding site. The selected domains were then structurally aligned to the newly orientated c-src domain, resulting in the transformation of the domains to the same orientation, allowing superimposition of multiple ligands on a representative template. Sequences of structurally unresolved SH2 domains were obtained using text-based and sequence-based searches in SwissProt (Bairoch and Apweiler, 2000) and BLAST (Altschul et al., 1990) and converted to FASTA format where necessary. A complete sequence alignment of all the domains was created using ClustalX (Thompson et al., 1997) and edited using Jalview (Clamp, 1999).
Definition and clustering of binding site residues
Binding site residues were defined by determining protein–ligand contacts in the crystal structures. For a given SH2 domain, all PDB files containing ligands were used in the calculation and determination of protein–ligand atomic contacts. Protein residues with at least one atom within 5 Å of any ligand atom were included in the binding site definition. The ligand-contacting residues were mapped to the full sequence alignment of SH2 domains. The binding site is defined as any position at which at least 50% of the domains with bound ligands have at least one contact, resulting in the inclusion of 17 positions in the binding site definition. A second definition of the binding site was made, comprising any position at which at least 35% of the SH2 domains with bound ligands have at least one contact. This resulted in 24 alignment positions in the second binding site definition.
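The two-step definition above (a 5 Å contact test per structure, followed by a frequency cut-off across ligand-bound domains) can be sketched as follows. This is a minimal illustration and not the code used in the study: the function names, the tuple-based atom representation and the toy contact sets are hypothetical, and the mapping from residue numbers to alignment positions is assumed to have been done beforehand.

```python
from math import dist

def contacting_residues(protein_atoms, ligand_atoms, cutoff=5.0):
    """Residues with at least one atom within `cutoff` angstroms of any ligand atom.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples (hypothetical layout)
    ligand_atoms:  iterable of (x, y, z) tuples
    """
    ligand_atoms = list(ligand_atoms)
    contacts = set()
    for residue_id, xyz in protein_atoms:
        if residue_id in contacts:
            continue  # one contacting atom is enough for a residue
        if any(dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            contacts.add(residue_id)
    return contacts

def binding_site_positions(contacts_per_domain, min_fraction=0.5):
    """Alignment positions contacted in at least `min_fraction` of the domains.

    contacts_per_domain: dict mapping domain name -> set of alignment positions
    """
    n_domains = len(contacts_per_domain)
    counts = {}
    for positions in contacts_per_domain.values():
        for position in positions:
            counts[position] = counts.get(position, 0) + 1
    return sorted(p for p, c in counts.items() if c / n_domains >= min_fraction)

# Toy data only: four ligand-bound domains and the positions each one contacts.
toy_contacts = {
    "src": {38, 40, 118},
    "lck": {38, 40},
    "p85": {38, 119},
    "syk": {38, 118, 119},
}
site_50 = binding_site_positions(toy_contacts, min_fraction=0.5)
site_75 = binding_site_positions(toy_contacts, min_fraction=0.75)
```

Lowering `min_fraction` can only add positions, which is consistent with the 35% definition containing more alignment positions than the 50% definition.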
Binding site similarity scores were calculated across the whole SH2 domain family in an all-against-all comparison, using a binary scoring method. This systematically selects pairs of domains from a given alignment for comparison and scores identical pairs of residues at each alignment position. A total score is then calculated, producing a matrix of scores that represents how similar each SH2 domain binding site is to the binding sites of the rest of the family. Cluster analysis was performed on these scores with the OC cluster analysis program (Barton, 1993), using the means linkage method (UPGM). This was carried out for the ≥50% cut-off and ≥35% cut-off binding site definitions and the full SH2 domain sequences.
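As an illustration of the scoring and clustering steps, the sketch below counts identical residues over the binding site positions for every pair of domains and then merges clusters by mean pairwise similarity. It is not the OC program and makes several assumptions: all function names and the toy alignment are hypothetical, positions are 0-based string indices, gap-gap pairs are not counted as identities, and merging stops at an arbitrary similarity threshold rather than building a full tree.

```python
from itertools import combinations

def binary_similarity(seq_a, seq_b, positions):
    """Number of binding site positions with identical residues (gaps ignored)."""
    return sum(
        1 for p in positions
        if seq_a[p] == seq_b[p] and seq_a[p] != "-"
    )

def similarity_matrix(alignment, positions):
    """All-against-all scores as a dict keyed by (name_a, name_b) in both orders."""
    scores = {}
    for a, b in combinations(sorted(alignment), 2):
        score = binary_similarity(alignment[a], alignment[b], positions)
        scores[(a, b)] = scores[(b, a)] = score
    return scores

def average_linkage(names, scores, min_similarity):
    """Merge the two clusters with the highest mean pairwise similarity,
    repeating until no pair of clusters reaches `min_similarity`."""
    clusters = [frozenset([name]) for name in names]

    def mean_similarity(c1, c2):
        return sum(scores[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: mean_similarity(*pair))
        if mean_similarity(c1, c2) < min_similarity:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters

# Toy alignment restricted to five hypothetical binding site columns.
toy_alignment = {"src": "RESTL", "fyn": "RESTL", "p85": "RKSAL", "plc": "RKSAV"}
toy_positions = [0, 1, 2, 3, 4]
toy_scores = similarity_matrix(toy_alignment, toy_positions)
toy_groups = average_linkage(sorted(toy_alignment), toy_scores, min_similarity=4)
```

Running the same functions with `positions` set to every alignment column gives the full-sequence comparison, so the binding site and full-sequence groupings can be compared directly.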
Calculation and display of residue conservation
The calculations of residue conservation for each group were carried out using Scorecons (Valdar and Thornton, 2001). Scorecons calculates residue conservation at each position within a multiple sequence alignment. A value (Cons) between 0 and 1 is assigned to each alignment position, where 0 represents a position that is not conserved and 1 represents a completely conserved position. The Pairwise Exchange Table (PET91) of Jones et al. (Jones et al., 1998) is used to assess the diversity of residues at each alignment position. A weighted sum of pairwise similarities between residues at a given alignment position is then calculated. The function Cons(i) for position i within the alignment is defined as
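$$\mathrm{Cons}(i) = \frac{\sum_{j=1}^{N}\sum_{k>j}^{N} W_j W_k\,\mathrm{Mut}\bigl(s_j(i), s_k(i)\bigr)}{\sum_{j=1}^{N}\sum_{k>j}^{N} W_j W_k}$$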
where N is the number of aligned sequences; sj(i) and sk(i) are the residues at alignment position i of sequences sj and sk; Mut(a,b) measures the similarity between residues a and b according to the mutation data matrix. Wj is the average evolutionary distance between sj and the other aligned sequences. For further details, see Valdar and Thornton (Valdar and Thornton, 2001).
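A weighted sum-of-pairs score of this kind can be sketched in a few lines. The sketch assumes that the weighted sum is normalized by the sum of the weight products, so that the score lies between 0 and 1 whenever Mut does. The identity function `toy_mut` is a stand-in for the PET91-derived similarity, and the weights are supplied directly rather than derived from evolutionary distances; both are simplifications for illustration and the names are hypothetical.

```python
from itertools import combinations

def cons(column, weights, mut):
    """Weighted sum-of-pairs conservation for one alignment column.

    column:  residues s_1(i)..s_N(i) at alignment position i
    weights: sequence weights W_1..W_N
    mut:     function mut(a, b) returning a similarity in [0, 1]
    """
    numerator = 0.0
    denominator = 0.0
    for j, k in combinations(range(len(column)), 2):
        pair_weight = weights[j] * weights[k]
        numerator += pair_weight * mut(column[j], column[k])
        denominator += pair_weight
    return numerator / denominator if denominator else 0.0

def toy_mut(a, b):
    """Stand-in similarity: 1 for identical residues, 0 otherwise (not PET91)."""
    return 1.0 if a == b else 0.0

fully_conserved = cons("RRRR", [1, 1, 1, 1], toy_mut)
half_and_half = cons("RKRK", [1, 1, 1, 1], toy_mut)
down_weighted = cons("RRK", [1, 1, 2], toy_mut)
```

With a graded similarity matrix in place of `toy_mut`, conservative substitutions would lower the score less than radical ones, which is the purpose of using an exchange table rather than plain identity.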
Conservation scores can be displayed via a given representative PDB file for which the sequence is an exact match to one within the multiple alignment. Representative template structures for each group of domains were selected based on the method used to solve the structure, the resolution of the structure, stereochemical quality of the structure using Procheck (Laskowski et al., 1993) and whether ligands were bound. Conservation scores at each position in the alignment were mapped to the corresponding residue in the representative PDB file and the resulting files were viewed using Grasp (Nicholls et al., 1991), colouring the molecular surface by residue conservation. The colour scheme presented here ranges from blue, through white, to red, where blue represents maximal conservation and red zero conservation. To ensure that all surfaces are coloured on the same scale, two dummy atoms are added to each PDB file, with values of 1 and 0, representing the maximum and minimum conservation score, respectively.
Analysis of residue conservation
Mean conservation scores were calculated by adding scores for each position within the full alignment or part of an alignment (i.e. binding site) and dividing by the number of positions. The sequence alignment was annotated according to (i) residue conservation, (ii) residue accessibility and (iii) the presence of residues in the binding face of the domain. The criteria were (i) a conservation value of ≥0.65, (ii) a relative residue surface accessibility of ≥11%, calculated using Naccess (Hubbard and Thornton, 1993), and (iii) a z coordinate cut-off of 10 Å from the residue with the highest positive z coordinate to denote presence in the binding face of the domain. This combination is used to highlight residues that fit all three criteria, thereby defining highly conserved residues in the binding face of the domain.
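The mean score and the three-way filter can be expressed compactly, as in the sketch below. It rests on assumptions that the text does not settle: the thresholds are treated as inclusive, each residue is represented by a single z coordinate (how that value is derived from the atoms is left open), and all names and toy values are hypothetical.

```python
def mean_conservation(scores, positions=None):
    """Mean of per-position conservation scores over all or selected positions."""
    selected = list(scores) if positions is None else list(positions)
    return sum(scores[p] for p in selected) / len(selected)

def highlight_positions(cons, accessibility, z, cons_min=0.65,
                        acc_min=11.0, z_window=10.0):
    """Positions that are conserved, accessible and in the binding face.

    cons:          position -> conservation score (0 to 1)
    accessibility: position -> relative accessibility (%)
    z:             position -> residue z coordinate (angstroms), with the
                   binding site oriented so that z measures its depth
    """
    z_top = max(z.values())  # residue with the highest positive z coordinate
    return sorted(
        p for p in cons
        if cons[p] >= cons_min
        and accessibility[p] >= acc_min
        and z_top - z[p] <= z_window
    )

# Toy values for four alignment positions.
toy_cons = {1: 0.9, 2: 0.7, 3: 0.5, 4: 0.8}
toy_acc = {1: 30.0, 2: 5.0, 3: 40.0, 4: 20.0}
toy_z = {1: 12.0, 2: 11.0, 3: 10.0, 4: 0.0}
highlighted = highlight_positions(toy_cons, toy_acc, toy_z)
```

In the toy data only position 1 passes: position 2 is buried, position 3 is poorly conserved and position 4 lies more than 10 Å below the top of the binding face.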
Comparison of parent and group conservation
Differences in residue conservation scores between a group alignment, ConsG(i) and the parent alignment, ConsP(i), are given by subtracting the parent conservation score from the group score at each alignment position. Since conservation scores range from 0 to 1, it follows that the difference between two conservation scores potentially ranges from -1 to 1. These differences in conservation scores between alignments at each residue position are normalized on a scale of 0 to 1, using
$$\Delta\mathrm{Cons}_{\mathrm{norm}}(i) = \frac{\mathrm{Cons}_G(i) - \mathrm{Cons}_P(i) + 1}{2}$$
and displayed on a representative PDB structure as described above. A value from 0 to <0.5 represents a loss in conservation of the group relative to the parent. This is coloured grey through to white. A value from >0.5 to 1 represents a gain in conservation. This is coloured white through to green. To ensure that all surface difference maps are coloured on the same scale, two dummy atoms with values of 0 and 1 were added to each PDB file. Calculated differences in conservation score between a group and parent were mapped back to the sequence alignment. Residues with relative accessibilities of ≥11% showing a gain in conservation score of ≥20% relative to the parent group that are also present on the surface of the binding face of the domain were highlighted.
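The difference-map arithmetic can be sketched as below. A linear rescaling is assumed, so that a raw difference of -1 maps to 0, no change maps to 0.5 and +1 maps to 1, matching the colour ranges described. The 20% gain is interpreted here as a raw difference of at least 0.20 on the 0 to 1 conservation scale, which is an assumption, as are the function names, the inclusive thresholds and the toy values.

```python
def normalised_difference(cons_group, cons_parent):
    """Rescale a group-minus-parent difference from [-1, 1] onto [0, 1]."""
    return (cons_group - cons_parent + 1.0) / 2.0

def classify(normalised):
    """Colour class of a normalised difference value."""
    if normalised < 0.5:
        return "loss"       # grey through to white
    if normalised > 0.5:
        return "gain"       # white through to green
    return "no change"      # white

def gained_positions(group, parent, accessibility, binding_face,
                     gain_min=0.20, acc_min=11.0):
    """Accessible binding-face positions whose conservation rises in the group.

    group, parent: position -> conservation score (0 to 1)
    accessibility: position -> relative accessibility (%)
    binding_face:  set of positions already assigned to the binding face
    """
    return sorted(
        p for p in group
        if p in binding_face
        and accessibility[p] >= acc_min
        and group[p] - parent[p] >= gain_min
    )

# Toy values for three alignment positions.
toy_group = {1: 0.9, 2: 0.6, 3: 0.95}
toy_parent = {1: 0.5, 2: 0.55, 3: 0.4}
toy_access = {1: 20.0, 2: 50.0, 3: 3.0}
gained = gained_positions(toy_group, toy_parent, toy_access, binding_face={1, 2, 3})
```

Keeping the rescaling separate from the thresholding means the same normalised values can be written to the structure file for colouring and reused for the sequence annotation.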
Results
All SH2 domain–ligand contacts based on the available data are shown in Figure 1, mapped to the full sequence alignment. The general binding site is highlighted and is defined as positions at which at least 50% of the ligand-containing structures have at least one protein–ligand contact. Additional sites are highlighted where at least 35% of the ligand-containing structures have at least one protein–ligand contact. Thus, the SH2 domain binding site can be described by 17 (≥50% having contacts) or 24 (≥35% having contacts) positions within the complete sequence alignment. The locations of most of the residues involved in protein–ligand contacts appear to be consistent throughout the dataset. Additionally, the identities of many of the residues themselves are sequence conserved, in particular the G/S-L motif at alignment positions 118–119. The so-called SH2 domain signature FLVRES sequence is at positions 35–40, although it appears that within the given cut-off distance of 5 Å, only the second half of the motif (RES) is directly involved in the ligand interaction as defined here. Two additional amino acids at positions 41–42, usually E and S/T, that follow the RES motif are also frequently involved in the interaction. The highly conserved arginines at positions 13 and 38 can also be seen. The greatest diversity in terms of the binding sites seems to be between positions ~105 and 126 (Figure 1), where there is inconsistency in the number of residues involved in the interaction and the specific location of these contacting residues within the alignment.
Based on the 50% contact binding site definition (see Materials and methods), the family of SH2 domains has been divided into five major groups according to amino acid similarity. Figure 2 demonstrates the clustering of the domains into groups A–E and subgroup A'. A study of mean conservation scores (Table I) indicates that when divided into these groups, whole SH2 domain sequences show a higher degree of conservation (0.72, 0.37, 0.43, 0.42 and 0.40 for groups A'–E) than the mean conservation of the entire dataset (0.33). The conservation becomes much more pronounced for the ≥50% and ≥35% contact definitions of the binding site. The highest level of mean conservation is seen in group A' and the lowest in group B. Importantly, the mean conservation score is much greater in the defined binding sites than in complete sequences, confirming that these sites are more conserved than the rest of the domain.
The grouping described above is supported by clustering based on the 35% contact conservation level (not shown). Only minor differences are apparent between the 35 and 50% contact conservation trees. For example, csk, which is perhaps an outlier of group B, branches with abl in group A. Group E (the phospholipase C-γ1 and γ2 N- and C-terminal domains) branches into two pairs where the N-terminal domains are clustered within group A. However, despite these discrepancies, the groups described herein can be considered stable, as all other SH2 domains remain within the groups on both trees. There are also similarities between the clustering of 50% contact conserved binding regions and that of full SH2 domain sequences (not shown). For example, groups C and D and most of group A remain intact in the full-sequence tree. This might be expected since it has been noted in a related study (S.J.Campbell and R.M.Jackson, unpublished work) that sequence identities and root mean square deviations relating to structural comparisons tend to be higher within groups than between groups. However, using full sequences in the clustering process failed to group each of the zap-70 and syk domains with csk (group B). Most notable is group E, where the plc domains are once again grouped as two separate pairs within the tree, suggesting that group E is the least stable of the groups. The nck domains, which might be considered to be outliers of group A, were also grouped separately within the tree.
Residue conservation within groups
Mapping surface conservation to the molecular surface shows the degree of similarity between binding sites within the five main groups A'–E (Figure 3). Figure 3a shows the conservation patterns when all 37 of the domains from the dataset are included in the scoring. The rotations through 90°, 180° and 270° demonstrate that when the whole family is considered, the main region of conservation is restricted to the area immediately proximal to the pTyr binding pocket, rather than covering a greater proportion of the molecule.
The sequence alignment (Figure 1) has been annotated in terms of evolutionary conservation, solvent accessibility and inclusion in the binding face of the domain (see Materials and methods). Thus, when considering the three criteria together, the annotations show the distribution of surface residues on the binding face of the molecule within each group that are at least 65% conserved. Positions that display all three characteristics are shown in bold. Many of the protein–ligand contacts (underlined) are found to be within these regions, showing that using surface conservation in conjunction with accessibility data can accurately predict functional residues. It follows that the regions that fit the three criteria correspond closely with the ≥50% and ≥35% binding site definitions, highlighted in black and grey. However, there also appear to be positions that are not involved in the binding site definitions. Those of note (see Figure 1) include alignment position 4 in groups B and D, position 6 in groups A, B and E, 14–15 in groups B and C, 20 in groups C and E, 45–48 in groups A, B and C, 60 in groups A and E, 122–123 in group D and alignment position 124 in group C. These might be potential candidate sites for investigation of binding by site-directed mutagenesis.
Locating diversity within the binding site
Binding site diversity between groups and a larger parent group (e.g. all SH2 domains) can be investigated using difference maps of surface conservation. Difference scores between a group and its parent are given by subtracting the parent conservation score from the group conservation score at each alignment position and normalizing the resulting difference. Here we investigate differences between groups C, D and E by comparison of each with a parent group consisting of all three (Figure 4). These groups were selected to form a parent group because all except two of the sequences (shptp 2 and corkscrew C-terminal SH2 domains) are described as related in a study by Songyang et al. (Songyang et al., 1993), where the SH2 domain family was classified according to in vitro phosphopeptide recognition (see Discussion). The results of the study (Figure 4) reveal the surface location of regions which are more conserved in the groups than in the parent, where green represents a gain in conservation of a group alignment position relative to the parent (i.e. a positive difference in conservation score), grey represents a loss in conservation (i.e. a negative difference in conservation score) and white represents no change. It is apparent that within each of the groups C, D and E, the size of the conserved region is larger than in the parent group, showing as expected that the groups are more conserved than their parent background. The difference maps reveal the surface locations of residues that are more (or less) conserved within the groups than the parent group and provide an indication of the extent to which there is a difference in conservation. The phosphotyrosine binding pocket is coloured white in all three difference maps, indicating that there is little change between group and parent in what is already a well conserved binding pocket. However, in all three groups, areas immediately adjacent to this site are more conserved than the parent background. An increase in conservation is also clearly present in residue clusters that surround the main area of intensive ligand contact. This is particularly evident in group E. At these locations each group contains residues which are more highly conserved than in the combined group, suggesting that regions surrounding the area of phosphopeptide contact may be important for functional discrimination in the different groups. This pattern challenges the notion that SH2 domains recognise only a linear phosphopeptide sequence, as it is evident that the conserved regions are more extensive than those regions involved in binding the pTyr-X1-X2-X3 motif, which is generally considered most important for high affinity recognition.
Discussion
Since the intention was to group the family according to similarity of phosphopeptide binding sites, we used crystallographic structures and contact data to study the SH2 domain binding site. The observed differences between grouping by full sequences and grouping by binding sites merit the study of the domains in terms of binding site residues only. However, this raises the question of the extent to which the two differ. The Pfam (Bateman et al., 2002) phylogenetic tree of SH2 domains shows that most of the group B, C and D SH2 domains cluster together. Many of the group A domains cluster on the Pfam tree, but this is not unexpected since we have already shown that many of the group A sequences are highly similar. The observed similarities between the clustering using binding sites only and either the full sequence clustering or the Pfam database show that, to a first approximation, whole sequence is sufficient to give similar groups in the case of the SH2 domains. Clearly, it remains to be seen if this is a general phenomenon.
Comparison with experimental screening
Songyang et al. classified the SH2 domain family, grouping the domains according to in vitro phosphopeptide recognition (Songyang et al., 1993). A series of experiments were performed in which phosphopeptide libraries consisting of randomized pTyr-containing peptides of the general sequence Gly-Asp-Gly-pTyr-X1-X2-X3-Ser-Pro-Leu-Leu-Leu were used to determine optimal binding sequences for specific SH2 domain binding sites. The X1, X2 and X3 positions were randomized and 22 recombinant SH2 domains were used to screen the library. It was found that the binding of the peptides is dependent on pTyr recognition and that different SH2 domains preferentially bind to different sequence motifs at X1, X2 and X3. SH2 domains were divided into four groups depending on the amino acid at the fifth residue in β-strand D (βD5, alignment position 72 in Figure 1); domains within a group were shown experimentally to select phosphopeptides with similar sequences. The first group (group 1) preferentially binds a pTyr-hydrophilic-hydrophilic-Ile/Pro sequence and is further split into two subgroups, 1a (src, fyn, lck, fgr, lyn, yes, hck and d-src) and 1b (syk N- and C-termini, zap-70 N- and C-termini, atk, abl, csk, nck, sem5/grb2). Members of group 1a that were investigated selected phosphopeptides with the general motif pTyr-hydrophilic-hydrophilic-Ile/Pro. From this group, src, fyn, lck and fgr (from the src family) selected pTyr-Glu-Glu-Ile as the optimal peptide. Nck, abl and sem5/grb2 from group 1b all have an aromatic residue at βD5 but the other residues predicted to contact the ligand side chains are distinct from those in the src family. Crk, nck and abl selected Pro at the X3 binding position. Vav was listed alone in group 2. Group 3 SH2 domains (p85α and p85β N- and C-termini, plc-γ1 and plc-γ2 N- and C-termini, corkscrew N-terminal, shptp1 N- and C-termini and shptp2 N-terminal SH2 domains) were shown to be selective for a pTyr-hydrophobic-X-hydrophobic sequence. Finally, group 4 (shptp2 C-terminal, corkscrew C-terminal and shc) contains SH2 domains that exhibit distinct amino acids at the βD5 position.
The groupings of Songyang et al. can be directly compared with those presented here and have been included in brackets alongside each protein in Figure 2. We have also included for comparison surface conservation maps (Figure 3c) using the technique described previously. It is apparent that group A corresponds closely to group 1a of Songyang et al., with all group 1a domains included in the group A cluster. However, group A' shows a higher degree of conservation, as expected due to the more select nature of the subgroup. Group B is similar to group 1b of Songyang et al., both groups including the zap-70 and syk C- and N-termini and csk. The surface diagrams also display a similar region of conservation. However, three group 1b proteins (nck1, nck2 and abl) have been transposed to group A on the basis of binding residue similarity. Atk and sem5 have here been re-classified as miscellaneous. The most significant difference between the classification schemes is between groups C, D and E and group 3 of Songyang et al. Together, groups C, D and E correspond to group 3. However, from the results of the clustering, it appears that each of these is sufficiently distinct to warrant separation into different groups. This is evident in the conservation patterns, where the conserved region is seen to increase significantly in each of the groups relative to their parent group. The differences between group 3 and groups C, D and E are shown in Figure 4 and are discussed above. Cbl and shc remain in the miscellaneous class, while the shptp2 and csw C-termini have been transposed from miscellaneous group 4 to group C. These findings are confirmed by the latter half of the mean conservation scores study (see Table I). The groups which are equivalent to Songyang et al.'s scheme (i.e. A' to group 1a, B to group 1b, C, D and E to group 3) all show higher mean levels of conservation, indicating that these groups are more robust. Overall, this level of qualitative agreement between the theoretical and experimental methods indicates that the method described here is able to generate important functional information about SH2 domain binding sites independently of experimental screening methods. This could be applied to other homologous families.
Diversity and specificity in the SH2 domain family
Classification of a protein family into groups provides a framework to study its members. It serves as a starting point for examining similarities and diversities within a large family. Conservation studies can reveal functionally important regions within protein structures, which relate to the interactions of proteins within a wider assembly. Until a large family is divided into groups it is difficult to draw useful conclusions about the similarity or diversity between family members. The investigation of SH2 domains within the classification scheme described here demonstrates that there are conserved regions within groups that are not present throughout the family as a whole. This diversity, located in the known binding interface, may be important in the recognition of ligands.
It should be appreciated that in the present study, SH2 domain–ligand interactions have been investigated in isolation rather than in entire assemblies or protein complexes. It should also be noted that the results have been obtained using crystal structures of SH2 domains bound to peptides and other small ligands rather than whole proteins as occurs in vivo. Structures containing ligands were unavailable for the entire dataset and those that were available may not be representative of the rest of the SH2 domain family. Thus the clustering method described here is based upon SH2 domains with available ligand-bound structures and the residues where ligand atoms have bound. It is likely that some of the binding sites will have been defined more completely than others.
Nevertheless, several conclusions can be drawn from the study. The changing patterns of conservation between groups and lack of conservation throughout the whole binding region suggest that while the phosphotyrosine binding site is generally conserved across the family, regions proximal to this site can be considered diverse between groups. Indeed, the work of Songyang et al. probes only the three residue positions pTyr + 1 to pTyr + 3. Our study suggests that there is binding site residue conservation within groups of similar domains outside these areas that might correspond to the conservation of a protein–protein interface between the SH2 domain and a phosphorylated protein. This observation challenges the notion that SH2 domains recognize only a short linear phosphotyrosyl peptide motif in vivo.
References
Armon,A., Graur,D. and Ben-Tal,N. (2001) J. Mol. Biol., 307, 447–463.
Bairoch,A. and Apweiler,R. (2000) Nucleic Acids Res., 28, 45–48.
Barton,J.G. (1993) OC – A Cluster Analysis Program. European Bioinformatics Institute, Cambridge.
Bateman,A., Birney,E., Cerruti,L., Durbin,R., Etwiller,L., Eddy,S.R., Griffiths-Jones,S., Howe,K.L., Marshall,M. and Sonnhammer,E.L. (2002) Nucleic Acids Res., 30, 276–280.
Beebe,K.D., Wang,P., Arabaci,G. and Pei,D. (2000) Biochemistry, 39, 13251–13260.
Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P. (2000) Nucleic Acids Res., 28, 235–242.
Bibbins,K.B., Boeuf,H. and Varmus,H.E. (1993) Mol. Cell. Biol., 13, 7278–7287.
Bradshaw,J.M. and Waksman,G. (1998) Biochemistry, 37, 15400–15407.
Casari,G., Sander,C. and Valencia,A. (1995) Nature Struct. Biol., 2, 171–178.
Clamp,M. (1999) Jalview. European Bioinformatics Institute, Cambridge.
Gilmer,T., Rodriguez,M., Jordan,S., Crosby,R., Alligood,K., Green,M., Kimery,M., Wagner,C., Kinder,D. and Charifson,P. (1994) J. Biol. Chem., 269, 1711–1719.
Grucza,R.A., Bradshaw,J.M., Fütterer,K. and Waksman,G. (1999) Med. Res. Rev., 19, 273–293.
Hubbard,S.J. and Thornton,J.M. (1993) NACCESS. Department of Biochemistry and Molecular Biology, University College London.
Islam,S.A. (1995) PREPI. Imperial Cancer Research Fund, London.
Jones,D.T., Taylor,W.R. and Thornton,J.M. (1998) Comput. Appl. Biosci., 8, 275–282.
Ladbury,J.E. and Arold,S. (2000) Chem. Biol., 7, R3–R8.
Laskowski,R.A., MacArthur,M.W., Moss,D.S. and Thornton,J.M. (1993) J. Appl. Crystallogr., 26, 283–291.
Laskowski,R.A., Hutchinson,E.G., Michie,A.D., Wallace,A.C., Jones,M.L. and Thornton,J.M. (1997) Trends Biochem. Sci., 22, 488–490.
Lichtarge,O., Bourne,H.R. and Cohen,F.E. (1996) J. Mol. Biol., 257, 342–358.
Marengere,L.E., Songyang,Z., Gish,G.D., Schaller,M.D., Parsons,J.T., Stern,M.J., Cantley,L.C. and Pawson,T. (1994) Nature, 369, 502–505.
Milburn,D., Laskowski,R. and Thornton,J. (1998) Protein Eng., 11, 855–859.
Nicholls,A., Sharp,K. and Honig,B. (1991) Proteins: Struct. Funct. Genet., 11, 281–296.
Sadowski,I., Stone,J.C. and Pawson,T. (1986) Mol. Cell. Biol., 6, 4396–4408.
Sawyer,T.K. (1998) Biopolymers, 47, 243–261.
Shakespeare,W. et al. (2000) Proc. Natl Acad. Sci. USA, 97, 9373–9378.
Songyang,Z. et al. (1993) Cell, 72, 767–778.
Thompson,J.D., Gibson,T.J., Plewniak,F., Jeanmougin,F. and Higgins,D.G. (1997) Nucleic Acids Res., 24, 4876–4882.
Valdar,W.S.J. and Thornton,J.M. (2001) Proteins: Struct. Funct. Genet., 42, 108–124.
Vu,C.B. (2000) Curr. Med. Chem., 7, 1081–1100.
Waksman,G. et al. (1992) Nature, 358, 646–653.
Received September 30, 2002;