Evolutionary trace analysis of TGF-ß and related growth factors: implications for site-directed mutagenesis

C. Axel Innis, Jiye Shi and Tom L. Blundell

Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK


    Abstract
 
The TGF-ß family of growth factors contains a large number of homologous proteins, grouped in several subfamilies on the basis of sequence identity. These subgroups can be combined into three broader groups of related cytokines, with marked specificities for their cellular receptors: the TGF-ßs, the activins and the BMPs/GDFs. Although structural information is available for some members of the TGF-ß family, very little is known about the way in which these growth factors interact with the extra-cellular domains of their multiple cell surface receptors or with the specific protein inhibitors thought to modulate their activity. In this paper, we use the evolutionary trace method [Lichtarge et al. (1996) J. Mol. Biol., 257, 342–358] to locate two functional patches on the surface of TGF-ß-like growth factors. The first of these is centred on a conserved proline (P36 in TGF-ßs 1–3) and contains two amino acids which could account for the receptor specificity of TGF-ßs (H34 and E35). The second patch is located on the other side of the growth factor protomer and surrounds a hydrophobic cavity, large enough to accommodate the side chain of an aromatic residue. In addition to two conserved tryptophans at positions 30 and 32, the main protagonists in this potential binding interface are found at positions 31, 92, 93 and 98. Several mutagenesis studies have highlighted the importance of the C-terminal region of the growth factor molecule in TGF-ßs and of residues in activin A equivalent to positions 31 and 94 of the TGF-ßs for the binding of type II receptors to these ligands. These data, together with our improved knowledge of possible functional residues, can be used in future structure–function analysis experiments.

Keywords: mutagenesis/prediction/protein evolution/receptor binding motifs/TGF-ß family


    Introduction
 
The TGF-ß family of growth factors comprises over 40 structurally related polypeptides grouped in 10 subfamilies, all of which play vital roles in the development, homeostasis and repair of most tissues in multicellular organisms (Massagué, 1998). Members of this cytokine family signal through a group of membrane-bound serine/threonine kinases, collectively known as the TGF-ß receptor family. These transmembrane proteins can be further classified into two subfamilies, type I and type II, on the basis of both function and structural conservation (Table I). Members of the TGF-ß subgroup do not appear to share their receptors with activins or BMPs, and vice versa, although some cross-reactivity is observed between the activin subgroup and the BMPs. Note that the term BMP will be used to refer collectively to the BMPs (bone morphogenetic proteins) and GDFs (growth and differentiation factors) found in vertebrates, together with their invertebrate homologues.


Table I. Some type I and II protein serine/threonine kinase receptors
 
Members of the TGF-ß family of growth factors are synthesized as precursor molecules consisting of an N-terminal signal peptide, a pro-domain of variable size and a 110–140 residue long mature domain characterized by six invariant cysteines. The latter is released upon proteolytic cleavage of the precursor at an RXXR dibasic site, a reaction most likely to be catalysed by members of the furin protease family (Barr, 1991). In most cases, the active protein is a homodimer of ~25 kDa, covalently linked by an inter-chain disulphide bond, though a few notable exceptions to this rule have been observed (Jones et al., 1992; McPherron and Lee, 1993; Hazama et al., 1995; Xu et al., 1995).

To date, only five TGF-ß and related growth factors have been characterized structurally: the atomic coordinates for TGF-ß2 (Daopin et al., 1993; Schlunegger and Grütter, 1993), TGF-ß3 (Mittl et al., 1996), BMP7/OP1 (Griffith et al., 1996) and BMP2 (Scheufler et al., 1999) were determined by X-ray crystallography, while a model of TGF-ß1 was calculated from NMR restraints (Hinck et al., 1996). In each case the protomer is a thin, elongated and slightly curved molecule containing a structurally conserved motif known as a cystine knot (Figure 1). The latter is best described as a narrow eight-membered ring, comprising two intra-chain disulphide bonds, with a third cystine passing through the ring.



Fig. 1. Topology diagram of a typical TGF-ß-like ligand, based on the three-dimensional structures of BMP-2, BMP-7, TGF-ß1, TGF-ß2 and TGF-ß3. Secondary structure motifs depicted in black are found in all five structures. Strand ß5a (dark grey) is only found in BMP-2, helix ß4 (light grey) is found in BMP-7 and helices α1/α2 (white) are common to TGF-ß subfamily members. Disulphide bridges are depicted as dotted lines.

 
The current working model for receptor activation involves the recruitment of pairs of type I and type II receptor molecules by a dimeric ligand to form a hetero-hexameric signalling complex. The two receptor types are required for the normal initiation of intracellular signalling pathways and the twofold symmetry of the ligand is consistent with the proposed stoichiometry. Ligand–receptor association is known to occur either in a sequential or in a cooperative fashion. Sequential binding is typical of TGF-ß and activin receptors: the ligand first associates with the type II receptor to form an inactive complex, which then becomes active by recruiting a type I receptor. On the other hand, a fully cooperative interaction is characteristic of BMP receptors. Type I and type II receptors of this kind can bind to their ligand individually with low affinity, but the formation of a tight complex requires both types of receptors. It is nevertheless worth noting that a sequential mechanism does not exclude a certain degree of cooperativity; the main difference lies in the incapacity of the type I receptor to bind the ligand alone.

Only a limited number of functionally important amino acids have been identified in TGF-ß and related growth factors. The influence of segment deletions, residue replacements and isoform chimeras on the binding affinity of TGF-ßs for their cognate type II receptor (TßR-II) was studied, thus highlighting the importance of the C-terminal region (residues 83–112) in the growth factor molecule (Qian et al., 1996). Site-directed mutagenesis experiments were also performed on activin-ßA and two amino acids involved in the binding of the activin molecule to its type II receptor were identified: D27 and K102 (Wuytens et al., 1999). Additional information comes from the molecular characterization of naturally occurring mutations with clearly marked phenotypes, most of which tend to emphasize the importance of the six conserved cysteines and of other key buried residues in determining the overall fold and activity of the growth factor (Mason, 1994; Wittbrodt and Rosa, 1994; McPherron et al., 1997; Thomas et al., 1996, 1997).

However, the impact of a large number of single or cumulative mutations on the binding of TGF-ß-like growth factors to both receptor types and to a series of structurally diverse protein inhibitors remains to be tested. Since mutating all residues randomly would be an incredibly time-consuming process, ways of narrowing down the choice of possible mutational targets must be sought. In this paper we discuss the use of the evolutionary trace (ET) method, a sequence–structure analysis technique described by Lichtarge et al. (1996), to identify potential targets for mutagenesis in TGF-ß and related growth factors, with the aim of finding receptor binding specificity determinants for the three major subgroups in the family: the TGF-ßs, the activins and the BMPs.

Briefly, a 'trace' is generated by comparing the consensus sequences for groups of proteins which originate from a common node in a phylogenetic tree and are characterized by a common evolutionary time cut-off (ETC), and by classifying each residue as one of three types: absolutely conserved, class-specific and neutral. Here, 'class-specific' denotes residues occupying a strictly conserved location in the sequence alignment, but differing in the nature of their conservation between various subgroups. The information obtained by the ET method can then be mapped on to known protein structures, thus allowing us to identify clusters of important amino acids and to distinguish between buried and exposed residues. The strength of the ET method lies in its flexibility: depending on the ETC value for which a trace is generated, it is possible to maximize the specificity of the analysis over its sensitivity and vice versa. In other words, ET analysis allows for a wide range of 'functional resolution'.


    Materials and methods
 
Sequences for the mature domains of 108 growth factors belonging to the TGF-ß superfamily were obtained from the Swiss Protein Sequence Data Bank (Bairoch and Apweiler, 1996). Of these, only 58 were retained for the evolutionary analysis. Proteins which were discarded fell into three categories:

  1. Polypeptides known to be functionally unrelated, which signal through tyrosine kinase receptors instead of serine/threonine kinases (glial cell derived neurotrophic factor, neurturin).
  2. Sequences with 25% or less sequence identity relative to the rest of the family (e.g. Müllerian inhibiting substance/anti-Müllerian hormone).
  3. Redundant sequences; human or mouse homologues were retained preferentially.

The complete sequences of these polypeptides were aligned manually, using the six invariant cysteines of the TGF-ß family as anchor points. Special care was taken in ensuring that gaps were not inserted into areas of known secondary structure (Figure 2). A Phylip distance matrix based on sequence identity was generated for the 58 non-redundant TGF-ß related sequences by the ClustalX program (Thompson et al., 1997) and was input into the Kitsch algorithm to build a rooted phylogenetic tree (Baum, 1989).
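The idea of an identity-based distance matrix can be illustrated with a minimal Python sketch. It is not the ClustalX implementation: the function names, the toy sequences and the decision to skip any column containing a gap are all assumptions made for illustration.

```python
from itertools import combinations


def identity_distance(seq_a, seq_b, gap="-"):
    """Return 1 - fractional identity, ignoring columns where either sequence has a gap.

    Skipping gapped columns is an assumption; other conventions exist.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 1.0  # no comparable columns: treat as maximally distant
    return 1.0 - matches / compared


def distance_matrix(alignment):
    """Build a symmetric distance lookup keyed by (name, name) pairs."""
    names = list(alignment)
    dist = {(n, n): 0.0 for n in names}
    for a, b in combinations(names, 2):
        d = identity_distance(alignment[a], alignment[b])
        dist[(a, b)] = d
        dist[(b, a)] = d
    return dist


# Hypothetical toy alignment, not taken from Figure 2.
toy_alignment = {
    "seqA": "CKRH-LYVDF",
    "seqB": "CKKHELYVSF",
    "seqC": "CRRRPLYIDF",
}
matrix = distance_matrix(toy_alignment)
```

A matrix of this kind is what a distance-based tree-building program takes as input; the exact file format expected by Kitsch is not reproduced here.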



Fig. 2. Sequence alignment of the 58 TGF-ß and related sequences used for the ET analysis. Secondary structure motifs below the alignment correspond to those shown in Figure 1. Conserved cysteine residues are boxed.

 
The ET analysis was then carried out using TraceSuite, a series of algorithms developed in-house (unpublished results). First, TraceGroup was used to split the phylogenetic tree into 10 evenly distributed partitions, named P01–P10 in order of increasing ETC (Figure 3). For each partition, a trace procedure was completed automatically by the TraceSeq and TraceScript algorithms. This comprised four steps:

  1. Protein sequences connected by a common node with evolutionary time greater than a given ETC value were clustered together by TraceGroup and input into TraceSeq.
  2. A consensus sequence was generated for each group to distinguish between conserved and non-conserved positions. In practice, TraceSeq tagged a residue with a '1' if it was conserved or with a '0' if it varied within the subgroup.
  3. A trace was generated by comparing the aligned consensus sequences for all the clusters associated with a given partition. Where all residues for a specific position had been tagged as '1', the one-letter residue name was used if the same amino acid was found in every single cluster; otherwise the position was assigned the symbol 'X' and was termed 'class-specific'. If an amino acid had been tagged with a '0' in at least one of the consensus sequences, the corresponding position in the trace was considered to be neutral.
  4. Script files for both Rasmol (Sayle and Milner-White, 1995) and MOLSCRIPT (Kraulis, 1991) were created by TraceScript to map the trace on to the surface of available structures and the various residue classes were colour-coded to distinguish between conserved and class-specific, as well as buried and exposed.

Residues were classified as buried if their side-chain solvent accessibility was lower than 30%.
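The consensus and trace steps described above can be sketched in a few lines of Python. This is an illustrative reconstruction from the text, not the TraceSeq source code: the function names, the use of '.' to mark neutral positions, the treatment of all-gap columns as neutral and the toy sequences are assumptions.

```python
def consensus(group, gap="-"):
    """Per-column consensus for one cluster of aligned sequences.

    Returns the residue where the column is invariant within the cluster
    (the '1' tag in the text) and None where it varies (the '0' tag).
    Invariant gap columns are treated as variable, which is an assumption.
    """
    result = []
    for column in zip(*group):
        if len(set(column)) == 1 and column[0] != gap:
            result.append(column[0])
        else:
            result.append(None)
    return result


def trace(groups):
    """Combine the cluster consensus sequences into a single trace string.

    One-letter code: conserved in every cluster with the same residue.
    'X': conserved within every cluster, but not the same residue (class-specific).
    '.': variable in at least one cluster (neutral; the symbol is hypothetical).
    """
    consensus_rows = [consensus(g) for g in groups]
    symbols = []
    for column in zip(*consensus_rows):
        if any(residue is None for residue in column):
            symbols.append(".")
        elif len(set(column)) == 1:
            symbols.append(column[0])
        else:
            symbols.append("X")
    return "".join(symbols)


# Two hypothetical clusters of toy sequences (not taken from Figure 2).
toy_groups = [
    ["CHEPWA", "CHEPWS"],
    ["CIAPWT", "CIAPWT", "CVAPWT"],
]
toy_trace = trace(toy_groups)
print(toy_trace)  # prints C.XPW.
```

In the toy example, the third column is invariant within each cluster (E in one, A in the other) and is therefore class-specific, whereas the second and last columns vary inside one cluster and are neutral.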



Fig. 3. Dendrogram containing 58 TGF-ß family growth factors. Partitions P01–P10 are shown as thin vertical lines. ETC increases from P01 to P10.
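As a rough illustration of how a partition line in such a dendrogram defines sequence groups, the sketch below cuts a rooted tree at a given ETC value. The `Node` class, the convention that node times are measured from the root, the spacing used in `even_cutoffs` and the toy tree are all assumptions; the actual TraceGroup procedure is unpublished and may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    time: float                     # evolutionary time of the node, measured from the root (assumed)
    name: str = ""                  # only set for leaves
    children: List["Node"] = field(default_factory=list)


def leaves(node):
    """Collect the leaf names below a node, left to right."""
    if not node.children:
        return [node.name]
    names = []
    for child in node.children:
        names.extend(leaves(child))
    return names


def cut(node, etc):
    """Group sequences whose common node lies beyond the ETC value.

    A node with time greater than the cut-off keeps all of its leaves together;
    a node at or before the cut-off is split into its children.
    """
    if not node.children or node.time > etc:
        return [leaves(node)]
    groups = []
    for child in node.children:
        groups.extend(cut(child, etc))
    return groups


def even_cutoffs(max_time, n=10):
    """Hypothetical choice of n evenly spaced cut-offs between 0 and max_time."""
    return [max_time * i / (n + 1) for i in range(1, n + 1)]


# Hypothetical toy tree with six leaves.
toy_tree = Node(0.0, children=[
    Node(0.3, children=[Node(1.0, "seq1"), Node(1.0, "seq2")]),
    Node(0.2, children=[
        Node(0.5, children=[Node(1.0, "seq3"), Node(1.0, "seq4")]),
        Node(0.6, children=[Node(1.0, "seq5"), Node(1.0, "seq6")]),
    ]),
])

low_etc_groups = cut(toy_tree, 0.1)   # two large groups
high_etc_groups = cut(toy_tree, 0.4)  # four smaller groups
```

Raising the cut-off splits the tree into more and smaller groups, which is the mechanism by which a higher ETC yields more class-specific positions in the resulting trace.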

 
The ET analysis was performed on two sets of protein sequences, as follows:
  1. Group 1: the entire TGF-ß family, comprising all 58 non-redundant sequences.
  2. Group 2: the bone morphogenetic protein (BMP) subgroup, comprising a total of 36 sequences.

Carrying out the analysis on two differently sized groups of sequences should allow us to assess the validity of the method and to ensure that the functional patches detected at the surface of these highly related proteins occupy similar regions of the molecule, regardless of functional differences.


    Results
 
Analysis of the mapped traces for partitions P01–P10 reveals clusters of potentially important residues appearing on the convex and concave surfaces of the protomers as the ETC value is increased (Figures 4 and 5). The first of these clusters, located on the convex surface of the molecule, is centred on P67, a conserved proline with a cis main-chain conformation. In the following analysis, italicized residue numbers will refer to the amino acid positions used in the alignment in Figures 2 and 4. The numbering systems used in the individual structures will be specified as subscripts when required. A second cluster is found in the groove formed by the two ß-sheets from one protomer and the α-helix from the other. It is centred on a pair of conserved tryptophan residues at positions 60 and 63.
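Because two numbering systems are in use (alignment positions and the residue numbers of each individual structure), converting between them is a routine step. The following minimal sketch shows one way to do this for a gapped sequence; the function name, its parameters and the toy sequence are hypothetical and do not reproduce the alignment in Figure 4.

```python
def native_number(aligned_seq, column, first_residue=1, gap="-"):
    """Map a 1-based alignment column to the residue number in the native sequence.

    Returns None if the sequence has a gap at that column.
    `first_residue` is the native number of the first non-gap residue (assumed known).
    """
    if not 1 <= column <= len(aligned_seq):
        raise IndexError("column outside alignment")
    if aligned_seq[column - 1] == gap:
        return None
    residues_before = sum(1 for c in aligned_seq[:column - 1] if c != gap)
    return first_residue + residues_before


# Hypothetical gapped sequence.
toy_row = "--AC-DEF"
print(native_number(toy_row, 4))  # prints 2
print(native_number(toy_row, 5))  # prints None
print(native_number(toy_row, 8))  # prints 5
```

The offset between the two numbering schemes is not constant along the chain, since every gap in the aligned sequence shifts it by one.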



Fig. 4. Traces for partitions P01–P10, aligned with the amino acid sequences of TGF-ßs 1–3, BMP2 and BMP7. Conserved residues are surrounded by boxes, class-specific residues are denoted by an X, solvent-accessible side chains are shaded and the N-terminal segments of BMP2 and BMP7 for which no structure is available are shown in italics. The numbering in this alignment was used throughout our analysis.

 


Fig. 5. Trace residues for partition P08 were mapped on to the surface of known TGF-ß and BMP structures. Colour coding is as follows: blue, conserved buried; green, conserved exposed; red, class-specific buried; yellow, class-specific exposed. Residues shown to be important for binding the type II receptor are also indicated: residues 132 and 134 (in blue) are thought to be important for binding of TGF-ßs to their type II receptor, residue 62 (in light green) was shown to be involved in the binding of activin A to its type II receptor and residue 140 (in orange) appears to play a role in both of these interactions.
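The four-colour scheme in this caption, combined with the 30% side-chain accessibility threshold given in Materials and methods, amounts to a simple lookup. The sketch below is illustrative only: the function name and the use of '.' for neutral positions are assumptions, and it does not generate Rasmol or MOLSCRIPT commands.

```python
BURIED_THRESHOLD = 30.0  # percent side-chain solvent accessibility, from Materials and methods


def residue_colour(trace_symbol, accessibility_percent):
    """Return the display colour for one position, or None for neutral positions.

    trace_symbol: one-letter code (conserved), 'X' (class-specific) or '.' (neutral;
    the '.' symbol is a hypothetical convention).
    """
    if trace_symbol == ".":
        return None
    buried = accessibility_percent < BURIED_THRESHOLD
    if trace_symbol == "X":
        return "red" if buried else "yellow"
    return "blue" if buried else "green"


print(residue_colour("P", 55.0))  # prints green
print(residue_colour("X", 10.0))  # prints red
```

A residue with exactly 30% accessibility is treated as exposed here, following the wording "lower than 30%".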

 
The main components of these functional clusters are visible as early as partition P01 and the number of trace residues increases slowly with ETC, thus making it difficult to define cluster boundaries accurately. Visual inspection of the mapped traces nevertheless led to choosing P08 as the partition displaying the highest ratio of functional resolution over random signal. This was a subjective choice and partitions with both lower and higher ETC values should also be considered, in order to extract the maximum amount of functional information. Traces derived for the set of 36 BMP sequences were similar to those derived for the larger group, although the noise levels observed were significantly increased owing to the smaller number of sequences used and the higher percentage of sequence identity relating them (results not shown).

Trace residues exposed on the convex surface of the protomer

A small cluster of exposed trace residues is found on the convex surface of the TGF-ß protomer. It includes a conserved proline at position 67, already visible in the trace for partition P01. In all known TGF-ß and BMP structures, this proline adopts a cis conformation, which appears to be stabilized by three hydrogen bonds linking the ends of the two ß-sheets in the protomer. As indicated by Schlunegger and Grütter (1993), the proline itself is not involved in hydrogen bonding and its cis conformation is difficult to justify in structural terms. It is possible that its unusual backbone geometry positions the neighbouring amino acids in such a way as to form a functional receptor binding site. Alternatively, a trans conformation could result in a potentially unstable structure by preventing the formation of hydrogen bonds between the tips of the two ß-sheets.

The most interesting trace residues belonging to this cluster are found at positions 65 and 66. These residues are located in strand ß3, just before the conserved cis-proline at the tip of finger 1, where they form main-chain hydrogen bonds with the carbonyl and amide groups of residue 129. The latter is a leucine in most cases, but replacement with other non-polar side chains is sometimes observed. There is a clear difference in the substitution patterns for residues 65 and 66 between the BMP/activin subgroup and the TGF-ß subfamily. In the former group, residue 65 is either an isoleucine (e.g. I57 in BMP7) or a valine (e.g. V33 in BMP2), thus retaining a medium-sized non-polar side chain. The corresponding residue in TGF-ßs is a histidine (H34), whose positively charged side chain is seen protruding into the solvent in all TGF-ßs. The presence of such a highly conserved residue in TGF-ßs could account for the fact that members of this subfamily do not bind BMP/activin receptors, presumably owing to the lack of negatively charged cavities capable of accommodating an imidazole ring on the surface of these molecules or to the absence of suitable groups for hydrogen bonding. Position 66 is occupied by a small side chain in most BMPs and activins, typically an alanine (e.g. A34 in BMP2 and A58 in BMP7) or a serine, although a glutamine and even a tyrosine have been observed in some members of the subgroup. In TGF-ßs, this position is occupied by a glutamic acid (E35). Once again, the resulting difference in charge could explain the segregation observed between BMP/activin and TGF-ß receptors for their ligands.

Additional trace residues of interest in this cluster occupy positions 53, 57 and 140. An aspartic acid is nearly always found at positions 53 and 57 (or a glutamic acid in the case of position 57) and there is considerable variation in the degree of solvent accessibility displayed by these residues in the five known TGF-ß family structures. It is therefore difficult to suggest the role played by these amino acids in determining the specificity of any particular ligand/receptor interaction. Position 140 is occupied by non-polar residues (leucine, isoleucine, valine or tyrosine) in all subgroups except the activins, where a positively charged lysine side chain is always present. These results are in agreement with mutagenesis work done on the activin subfamily, where a positive charge at this position in activin-ßA is shown to be essential for binding of the ligand to the activin type II receptor (Wuytens et al., 1999). Also, note that residues 53 and 57 were both mutated to alanine in activin-ßA, with no consequent changes in this protein's ability to bind the activin type II receptor (Wuytens et al., 1999).

Trace residues exposed on the concave surface of the protomer

In contrast with the group of trace residues discussed above, the cluster on the concave surface of the growth factor molecules is relatively compact and well defined. The main protagonists of this putative binding interface are the conserved aromatic residues at positions 60 and 63, which, together with amino acids 64, 130 and 143 (in strands ß3, ß7 and ß8, respectively), form a hydrophobic pocket large enough to accommodate a phenyl ring. Support for this hypothesis comes from the presence of an ordered dioxane molecule at this location in the 2.0 Å resolution structure of TGF-ß3 (Mittl et al., 1996).

A point to note about the location of the second cluster is its proximity to the region identified by Qian et al. (1996) as a putative type II receptor binding site on the surface of TGF-ß molecules. Experiments carried out by this group on TGF-ß1/TGF-ß2 chimeras show that amino acids 83–112 of TGF-ß1, in particular V92, R94 and V98 (alignment positions 132, 134 and 140, respectively), which are conserved between species but not among the various TGF-ß isoforms, are responsible for the high affinity binding of TGF-ß1 to TßRII. The aim of the study was not to determine the location of a general TßRII binding site on TGF-ßs, but to understand why the affinity of TßRII for its ligands is greater in the case of TGF-ßs 1 and 3 than it is for TGF-ß2. The proximity of this functional patch to residues identified in the second cluster therefore supports a model in which residues from fingertips 1 and 2 cooperate to form a general type II receptor binding site.

Comparison of the three-dimensional structures of TGF-ßs 1–3, BMP2 and BMP7 reveals major differences in the hydrophobic pocket formed by trace residues on the concave surface of the growth factor molecule (Figure 6). First, the tryptophan at position 63 shows a large degree of conformational flexibility among the various structures. In BMP2, the side chain of W31 is rotated by ~180° relative to its TGF-ß homologue W32, hence occluding the dioxane binding site seen in TGF-ß3. In BMP7, the same tryptophan occupies an intermediate position, with the aromatic ring almost perpendicular to the plane of the growth factor protomer. This leads to a wider but shallower pocket compared with that found in the TGF-ßs. We must bear in mind, however, that rotation of the tryptophan side chain probably occurs in vivo in all of these structures and that the crystal forms of the molecules have only managed to capture one of a range of possible conformations. It is possible that this bulky aromatic moiety acts as a flap to mask the pocket in the absence of an interacting molecule, but is displaced upon binding of the receptor to the ligand.



Fig. 6. Conserved hydrophobic pocket found on the concave surface of the BMP2, BMP7 and TGF-ß3 protomers. Residue numbers are the same as those in Figure 4.

 
Perhaps a more drastic change between the various structures is seen at position 143, where the phenol ring of a tyrosine in BMPs 2 and 7 is replaced by the non-polar side chain of a leucine residue in TGF-ßs 1–3, thus removing all polar groups from the inside of the pocket. Although by no means conclusive, these conformational differences provide us with ways of interpreting the observed receptor affinities for TGF-ßs, BMP2 and BMP7. Looking at the substitution patterns for residues lining the walls of the hydrophobic cavity, it is of interest to note that position 130 is reasonably well conserved on the whole, with phenylalanine being occasionally inserted instead of tyrosine. Finally, activins share a substitution pattern similar to that of the TGF-ßs at position 143, in which a medium-sized non-polar side chain replaces the bulkier tyrosine (and occasional histidine) found in the BMPs.

Additional trace residues on the concave surface of the protomer are found at positions 62, 81, 82, 95, 132, 133 and 144. In TGF-ßs, residue 62 is a lysine, whereas it is almost invariably replaced by an aspartic acid or a glutamic acid in BMPs/activins. Exceptions to this rule are GDFs 1 and 3, which have a lysine or an arginine at this position, and a few BMPs where residue 62 is replaced by a serine or an asparagine. The difference in charge and size of this residue's side chain between TGF-ßs and most BMPs/activins suggests a role in the establishment of a protein–protein interface. This is further supported by mutagenesis data showing that the equivalent residue in activin-ßA (D27) is involved in binding the activin type II receptor. Interestingly, mutating this residue to a lysine increases the affinity of activin-ßA for its type II receptor, although no attempt has been made to see if this mutant can bind TGF-ß receptors (Wuytens et al., 1999).

The functions of residues 81, 82 and 95 in the loop between strand ß5 and helix α3 are more difficult to assess. Position 81 is occupied by a tyrosine in TGF-ßs, a phenylalanine in most BMPs and leucine, alanine or serine in the activins. Residues found at position 82 are a leucine or an isoleucine for TGF-ßs, a proline in the majority of BMPs and a histidine or a tyrosine in activin. Position 95 is occupied by a glutamine or a threonine in TGF-ßs, an asparagine in BMPs and a phenylalanine in activin. None of these residues seems to play a role in dimerization, thus hinting at a possible contribution towards a binding interface. Note that H47 in activin-ßA (alignment position 82) was mutated to an alanine with no noticeable changes in its ability to signal through the activin type II receptor (Wuytens et al., 1999).

In this respect, residues 132 and 133 at the end of strand ß7 are more interesting, as they show clear differences in their substitution patterns between TGF-ßs and BMPs/activins. In the latter group, both have a marked polar character, with position 132 being frequently occupied by an aspartic acid and residue 133 alternating between a glutamate, an aspartate and an asparagine. In contrast, residues at these positions in the TGF-ßs are isoleucine or valine and glycine, respectively. The absence of a side chain at position 133 in the latter case leaves room for the side chain of residue 132, thus making this amino acid in TGF-ßs the structural equivalent of residue 133 in BMPs/activins. Following this logic, the change of a polar side chain for a non-polar side chain could account for the discrimination between TGF-ßs and BMPs/activins for their receptors. However, mutating either residue does not affect binding of activin-ßA to its type II receptor (Wuytens et al., 1999). As for some of the other trace residues mentioned above, this may suggest a possible involvement in the binding of the type I receptor.

The last amino acid to mention in the second trace residue cluster is found at position 144, which is invariably occupied by a serine in the TGF-ßs, a proline in the activins and generally retains a polar or charged character in the BMPs (Glu, Gln, Arg, Lys and even Pro). In all known structures, the side chain of this residue is fully accessible to the solvent, hence allowing it potentially to play a role in a binding interface. Amino acid 144 is yet another example of a trace residue whose mutation to alanine in activin-ßA did not perturb the binding of this growth factor to the activin type II receptor (Wuytens et al., 1999).


    Discussion
 
Known crystal structures for the TGF-ß family have relative overall root mean square deviations of <1.1 Å. In addition to the cystine knot motif, the protomer is characterized by two antiparallel ß-sheets (strands ß1–ß4 form sheet 1 and strands ß5–ß9 form sheet 2) and a four-turn amphipathic α-helix with its axis almost perpendicular to the sheets (helix α3). Hence the TGF-ß-fold is reminiscent of an open hand, with the ß-sheets acting as fingers 1 and 2, the cystine knot as the palm and the helix as a thumb/wrist.

There is an extended network of hydrogen bonds between the strands of each finger, but with the exception of a few residues at the fingertips, the two ß-sheets are too far apart to form this kind of non-covalent interaction. Instead, they are linked through the disulphide bridges of the cystine knot and stabilized by a combination of van der Waals contacts and hydrophobic interactions. A closer look at the amino acid sequences of known TGF-ß structures reveals the absence of a conventional hydrophobic core in the isolated protomer. In the dimer however, the non-polar side of the α-helix on one molecule is in contact with hydrophobic residues in the ß-sheet of the other molecule, resulting in two tightly packed hydrophobic cores related by twofold symmetry. The packing of the protomers is much less compact in the region surrounding the intermolecular disulphide bridge, leaving room for a few ordered water molecules at the dimer interface. Typically, ~20% of the protomer's total solvent-accessible area participates in the dimerization interface, suggesting that dimer formation is an energetically favourable process.

Most of the structural differences observed between the TGF-ßs and the BMPs lie within four regions of the polypeptide chain. One of these is the N-terminal segment, whose length and sequence vary greatly among members of the family. In the TGF-ß subgroup, it folds as an α-helix followed by an exposed loop and is attached to the core of the molecule via an additional disulphide bond. On the other hand, no interpretable electron density was available for this region in the structures of BMP2 and BMP7, suggesting that the N-terminus of these molecules is disordered. This does not, of course, exclude the possibility that residues forming the N-terminal segment become ordered upon binding of the receptor(s) or, in the case of BMP2, of heparin (Ruppert et al., 1996). Other variable regions include the loops at the ends of fingers 1 and 2 (residues 55–64 and 133–137, respectively) and the C-terminus of helix α3, all of which are prime candidates for determining receptor specificity.

To summarize our results, fingertip residues located on both surfaces of TGF-ß and related growth factor protomers most probably form a binding interface for a variety of molecules, including the extracellular domains of both receptor types and several inhibitory polypeptides. Solvent-accessible trace residues found at positions 62, 65, 66, 132 and 133 display clear differences in their substitution patterns between the TGF-ß subgroup and the BMPs/activins, thus behaving as likely candidates for the determination of receptor specificity in these two major groups. Of the amino acids mentioned above, residue 62 has been shown to play a role in the binding of activin-ßA to the activin type II receptor, but mutating residues 132 and 133 to alanine does not seem to affect the ability of activin to interact with its cognate receptor (Wuytens et al., 1999). Two residues of interest were surprisingly omitted from the mutagenesis experiments carried out so far: these are found at positions 65 and 66 and could potentially be responsible for the lack of cross-reactivity between ligands and receptors of the TGF-ß and BMP/activin subgroups. It would therefore be interesting to test the effect of mutating these residues on the receptor binding properties of specific growth factors and to see whether adding a histidine at position 65 or a glutamate at position 66 of BMPs/activins gives these ligands the ability to bind TGF-ß type II receptors. Engineering of these sites could be combined with amino acid replacements at residues 62, 132, 133 or 140 until a shift in receptor specificity is observed.

When considering the role of particular residues in a binding interface, we must keep in mind a few points. Indeed, not all residues will be as well behaved as amino acid 140, whose involvement in type II receptor binding has been confirmed for both TGF-ßs and activins and agrees with the ET analysis predictions. First, a residue which is important in determining the affinity or the specificity of an interaction inside a given subgroup may only be playing a minor role in a similar event for other groups of molecules. For instance, V92 (residue 132 in our numbering system) in the TGF-ßs is clearly involved in the binding interface for the TGF-ß type II receptor, but its loss in activin-ßA does not affect the equivalent process (Qian et al., 1996; Wuytens et al., 1999). Second, residues which do not even appear in the trace when the functional resolution is highest may nonetheless be taking part in an interaction, for example as minor determinants of affinity. One such residue is R94 (residue 134 in our numbering system) in TGF-ßs 1–3, whose replacement by a lysine in TGF-ß2 has been suggested to decrease the affinity of this growth factor for its type II receptor (Qian et al., 1996).

In the case of the TGF-ß family, we must remember that the variable N-terminal segment could potentially harbour specificity determinants, even though the ET analysis cannot reveal the existence of such residues. Also, we cannot exclude the possibility that some of the more conserved residues on the concave surface of the growth factor molecule, namely the two aromatics at positions 60 and 63, could play a structural role in addition to forming the core of a putative binding interface. Indeed, the small dimensions of TGF-ß and related growth factors, in particular the thickness of the protomer, suggest that some of the residues which are crucial in determining the overall fold of the molecule are likely to be solvent accessible. In our analysis, we have not mentioned any of the trace residues found at the dimer interface or forming part of the hydrophobic core, but it is worth keeping in mind that modifying such residues could alter the inter-subunit angle and hence affect the binding properties of the growth factor. Mutagenesis experiments must therefore be designed in such a way as to allow a clear distinction to be made between contributions to the overall fold and stability of the growth factor and contributions to its ability to interact with other proteins.

The role played by all of the exposed trace residues highlighted in our analysis should be further investigated by site-directed mutagenesis, along with the effect of cumulative mutations. Detailed studies of the three main TGF-ß family subgroups could enable us to understand the role of key residues in the formation of ligand–receptor complexes and hence help us design therapeutic agents for bone repair and the treatment of TGF-ß related diseases.


    Note added in proof
 
Following the initial submission of our manuscript, a crystal structure of the BMP-2–BRIA ectodomain complex was published by Kirsch et al. (2000). They describe the binding of two type I receptor molecules to a dimeric BMP-2 ligand and focus on residues present at the macromolecular interface, both on the receptor and on the ligand. An attempt is also made to define the binding epitope for type II receptors.

Overall, residues highlighted by our analysis correspond to the main protagonists in the interaction with type I receptors, while minor determinants of specificity only begin to appear in the traces for later partitions. Clusters of trace residues on the convex and concave surfaces of the ligand correspond, respectively, to the Knuckle and Wrist epitopes defined by Kirsch et al. Trace residues 81 and 82 (F49 and P50 in BMP-2), for which we could find no obvious role in our study, were shown experimentally to be the main determinants of specificity for type I receptors over type II. Indeed, a hollow capable of accommodating these residues is visible at the end of a long cavity on the surface of type I receptors, but is missing in type II receptors. It is also worth noting the importance of F85 from the BRIA extracellular domain, whose aromatic side chain becomes inserted into the hydrophobic cavity lined with trace residues 60, 63, 64, 130 and 143.

Finally, the additional mutagenesis data provided by Kirsch et al. stress the role played by residues W31, I62, L66 and Y103 (residues numbered 63, 98, 102 and 143, respectively, according to our system) and mostly agree with the results of our analysis, even though residues 98 and 102 are totally absent from the trace.
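Because two numbering schemes are in use here, a small lookup table makes the comparison easier to follow. The Python sketch below only restates the correspondences quoted in this note (BMP-2 numbering to our alignment numbering) and the trace positions named in it; it is not the complete trace, and the variable and function names are hypothetical.

```python
# BMP-2 residue number -> position in our alignment numbering,
# restricted to the residues quoted in this note.
BMP2_TO_ALIGNMENT = {31: 63, 49: 81, 50: 82, 62: 98, 66: 102, 103: 143}

# Alignment positions explicitly described as trace residues in this note.
# This is only the subset named here, not the full trace.
TRACE_POSITIONS = {60, 63, 64, 81, 82, 130, 143}


def split_by_trace(bmp2_residues):
    """Convert BMP-2 residue numbers to alignment positions and split them
    into those named as trace residues and those that are not."""
    in_trace, not_in_trace = [], []
    for residue in bmp2_residues:
        position = BMP2_TO_ALIGNMENT[residue]
        if position in TRACE_POSITIONS:
            in_trace.append(position)
        else:
            not_in_trace.append(position)
    return in_trace, not_in_trace


# Residues W31, I62, L66 and Y103 stressed by the mutagenesis data.
mutagenesis_residues = [31, 62, 66, 103]
print(split_by_trace(mutagenesis_residues))
```

With the four residues above, positions 63 and 143 fall among the named trace residues, whereas 98 and 102 do not, which matches the statement that the latter two are absent from the trace.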


    Notes
 
1 To whom correspondence should be addressed. E-mail: tom@cryst.bioc.cam.ac.uk


    Acknowledgments
 
We thank Dr Marko Hyvönen and Charlotte Deane for useful discussions on the TGF-ß structures and for reviewing the manuscript, and Dr Mark Williams for drawing our attention to the ET method. C.A.Innis was funded by the BBSRC (UK Biotechnology and Biological Sciences Research Council), the Cambridge European Trust and the Sackler Fund. J.Shi received financial support from the Cambridge Overseas Trust. T.L.Blundell is supported by a Wellcome Trust Programme Grant.


    References
 
Bairoch,A. and Apweiler,R. (1996) Nucleic Acids Res., 24, 21–25.

Barr,P.J. (1991) Cell, 66, 1–3.

Baum,B.R. (1989) Q. Rev. Biol., 64, 539–541.

Daopin,S., Li,M. and Davies,D.R. (1993) Proteins: Struct. Funct. Genet., 17, 176–192.

Griffith,D.L., Keck,P.C., Sampath,T.K., Rueger,D.C. and Carlson,W.D. (1996) Proc. Natl Acad. Sci. USA, 93, 878–883.

Hazama,M., Aono,A., Ueno,N. and Fujisawa,Y. (1995) Biochem. Biophys. Res. Commun., 209, 859–866.

Hinck,A.P. et al. (1996) Biochemistry, 35, 8517–8534.

Jones,C.M., Simon,C.D., Guenet,J.L. and Hogan,B.L. (1992) Mol. Endocrinol., 6, 1961–1968.

Kirsch,T., Sebald,W. and Dreyer,M.K. (2000) Nature Struct. Biol., 7, 492–496.

Kraulis,P. (1991) J. Appl. Crystallogr., 24, 946–950.

Lichtarge,O., Bourne,H.R. and Cohen,F.E. (1996) J. Mol. Biol., 257, 342–358.

Mason,A.J. (1994) Mol. Endocrinol., 8, 325–332.

Massagué,J. (1998) Annu. Rev. Biochem., 67, 753–791.

McPherron,A.C. and Lee,S.J. (1993) J. Biol. Chem., 268, 3444–3449.

McPherron,A.C., Lawler,A.M. and Lee,S.J. (1997) Nature, 387, 83–90.

Mittl,P.R.E., Priestle,J.P., Cox,D.A., McMaster,G., Cerletti,N. and Grütter,M.G. (1996) Protein Sci., 5, 1261–1271.

Qian,S.W., Burmester,J.K., Tsang,M.L.-S., Weatherbee,J.A., Hinck,A.P., Ohlsen,D.J., Sporn,M.B. and Roberts,A.B. (1996) J. Biol. Chem., 271, 30656–30662.

Ruppert,R., Hoffmann,E. and Sebald,W. (1996) Eur. J. Biochem., 237, 295–302.

Sayle,R.A. and Milner-White,E.J. (1995) Trends Biochem. Sci., 20, 374.

Scheufler,C., Sebald,W. and Hülsmeyer,M. (1999) J. Mol. Biol., 287, 103–115.

Schlunegger,M.P. and Grütter,M.G. (1993) J. Mol. Biol., 231, 445–458.

Thomas,J.T., Lin,K., Nandedkar,M., Camargo,M., Cervenka,J. and Luyten,F.P. (1996) Nature Genet., 12, 315–317.

Thomas,J.T., Kilpatrick,M.W., Lin,K., Erlacher,L., Lembessis,P., Costa,T., Tsipouras,P. and Luyten,F.P. (1997) Nature Genet., 17, 58–64.

Thompson,J.D., Gibson,T.J., Plewniak,F., Jeanmougin,F. and Higgins,D.G. (1997) Nucleic Acids Res., 25, 4876–4882.

Wittbrodt,J. and Rosa,F.M. (1994) Genes Dev., 8, 1448–1462.

Wuytens,G. et al. (1999) J. Biol. Chem., 274, 9821–9827.

Xu,J., McKeehan,K., Matsuzaki,K. and McKeehan,W.L. (1995) J. Biol. Chem., 270, 6308–6313.

Received March 20, 2000; revised October 5, 2000; accepted October 13, 2000.