From the Bristol-Myers Squibb Pharmaceutical Research Institute, Seattle, Washington 98121, the Department of Biochemistry, Oxford University, Oxford OX1 3QU, United Kingdom, and the § Department of Biological Structure, University of Washington, Seattle, Washington 98194
ABSTRACT |
---|
CD44 is a widely distributed cell surface protein that plays a role in cell adhesion and migration. As a proteoglycan, CD44 is also implicated in growth factor and chemokine binding and presentation. The extracellular region of CD44 is variably spliced, giving rise to multiple CD44 isoforms. All isoforms contain an amino-terminal domain, which is homologous to cartilage link proteins. The cartilage link protein-like domain of CD44 is important for hyaluronan binding. The structure of the link protein domain of TSG-6 has been determined by NMR. Based on this structure, a molecular model of the link-homologous region of CD44 was constructed. This model was used to select residues for site-specific mutagenesis in an effort to identify residues important for ligand binding and to outline the hyaluronan binding site. Twenty-four point mutants were generated and characterized, and eight residues were identified as critical for binding or to support the interaction. In the model, these residues form a coherent surface the location of which approximately corresponds to the carbohydrate binding sites in two functionally unrelated calcium-dependent lectins, mannose-binding protein and E-selectin (CD62E).
INTRODUCTION |
---|
CD44 is a type I transmembrane protein encoded by a gene containing 19 exons (1, 2). Ten of these exons are variably spliced (V1-V10), giving rise to multiple CD44 isoforms. All CD44 isoforms contain at their amino terminus a domain of ~100 residues, which is homologous to cartilage link protein domains (3, 4). The link homology domain of CD44 has been implicated in the hyaluronan (HA)1 binding activity of CD44 (5, 6). In different CD44 isoforms, polypeptides encoded by the variably spliced exons are inserted following exon E5. The functional relevance of different CD44 isoforms is still under investigation, but the following observations have been made. (a) Inclusion of exon V3 results in the modification of CD44 with heparan sulfate (HS) added to an SGSG site contained in this exon (7, 8). These CD44 isoforms can interact with HS-binding growth factors and chemokines. (b) Inclusion of exon V6 renders tumor cells expressing this CD44 isoform aggressively metastatic (9, 10). (c) Inclusion of variably spliced exons results in an increase in the number of O-linked carbohydrates in CD44 (11). This change in glycosylation has been proposed to modulate the ability of CD44 to bind HA and is consistent with the finding that N-linked glycosylation can also modulate HA binding (12, 13). The variably spliced region of CD44 is followed by a stalk encoded by exons E15 and E16, a hydrophobic transmembrane domain, and a cytoplasmic domain that can engage in intracellular signaling pathways (2).
CD44 is expressed by a large number of different cell types. Leukocytes predominantly express the standard form of CD44 (CD44H). This isoform contains no variably spliced exons and binds HA on activated leukocytes (3, 5, 6). This interaction has been shown to play an important role in leukocyte adhesion and migration at sites of inflammation (14). Activated macrophages and dendritic cells express CD44 isoforms containing exon V3 (8). Thus, CD44 is modified with HS and can bind and present HS-binding growth factors and chemokines. This allows these antigen-presenting cells to more efficiently amplify an ongoing immune response.
Although the three-dimensional structure of the ligand binding domain(s) of CD44 is currently unknown, attempts have been made previously to map the HA binding site. In an initial mutagenesis study on CD44, only one residue in the link homology domain, Arg-41, could be identified as critical for the interaction with HA (15). Recently, the solution structure of TSG-6 link domain was determined and found to be similar to the calcium-dependent (C-type) lectin fold (16) which was first determined for rat mannose-binding protein (MBP) (17, 18). TSG-6 provides a prototypic fold for the link protein superfamily to which CD44 belongs and allowed the generation of a comparative molecular model of CD44. This model was used to support a more extensive analysis of the CD44 ligand binding site.
Here, we report the generation of the CD44 model and its application in a mutagenesis analysis of the HA binding site. Twenty-four site-specific mutant proteins were generated, and eight residues were identified as important for HA binding. Together with the previously identified Arg-41, residues Tyr-42, Arg-78, and Tyr-79 form a cluster of residues critical for HA binding. In the model, these residues form a coherent surface with additional residues that support binding. The HA binding surface is extensive, consistent with the size of the ligand, and its location approximately corresponds to the carbohydrate binding sites in MBP and E-selectin.
MATERIALS AND METHODS |
---|
Model Building-- The CD44 model was built using the energy-minimized average TSG-6 NMR coordinates (16) as template and based on a structure-oriented sequence alignment of several link proteins and CD44 (16). Model building, computer graphics analysis, and energy minimization calculations were carried out with InsightII/Discover (MSI, San Diego, CA). The major secondary structure elements of TSG-6 and conserved residues were included in the model. Conservative residue replacements were carried out in similar conformations, and non-conservative substitutions were modeled using a rotamer search procedure (19). Regions including deletions in CD44 relative to TSG-6 (residues 37-40, 82-84, and 104-107) were modeled based on suitable backbone fragments extracted from the Brookhaven Protein Data Bank (20, 21). The stereochemistry at splice points was regularized manually. Other regions considered to be conformationally variable (55-58, 63-66, 98-102, and 109-112) were modeled by conformational search with CONGEN (22). Among the conformations within 3 kcal/mol of the energy minimum conformation, those with the lowest solvent-accessible surface were selected and included in the model.
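The loop selection rule described above (keep conformations within 3 kcal/mol of the energy minimum, then prefer the lowest solvent-accessible surface) can be sketched in a few lines of Python. This is only an illustration of the stated criterion, not the CONGEN procedure itself; the `Conformer` type, the function name, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conformer:
    name: str
    energy: float  # kcal/mol (hypothetical values below)
    sasa: float    # solvent-accessible surface area, in Å^2 (hypothetical)


def select_conformer(conformers, energy_window=3.0):
    """Return the conformer with the lowest solvent-accessible surface
    among those within `energy_window` kcal/mol of the energy minimum."""
    if not conformers:
        raise ValueError("no conformers supplied")
    e_min = min(c.energy for c in conformers)
    candidates = [c for c in conformers if c.energy - e_min <= energy_window]
    return min(candidates, key=lambda c: c.sasa)


# Hypothetical loop conformations, for illustration only.
loops = [
    Conformer("a", -120.0, 410.0),  # energy minimum
    Conformer("b", -118.5, 365.0),  # within 3 kcal/mol, smaller surface
    Conformer("c", -110.0, 300.0),  # smallest surface, but outside the window
]
print(select_conformer(loops).name)  # prints "b"
```

The energy filter is applied first so that a compact but energetically unfavorable conformation (such as "c" here) cannot be chosen.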
The stereochemistry and intramolecular contacts of the initially assembled model were refined by energy minimization with Discover until the maximum derivative of the energy function was ~4 kcal/Å. In these calculations, AMBER force field parameters (23), a distance-dependent dielectric constant, and a cutoff distance of 9.5 Å for non-bonded interactions were used. The stereochemistry and the sequence-structure compatibility of the model were assessed with PROCHECK (24) and ProsaII (25), respectively. ProsaII energy profiles were generated using a 50-residue window for energy averaging. Energy profiles were also calculated to assess intermediate models (e.g. after modeling of a deletion). Modeled segments were rejected if a notable increase in average residue interaction energies was observed. Molecular surfaces were calculated and displayed using GRASP (26). Superpositions of structures were generated with ALIGN (27). Using ALIGN, the
Construction and Expression of CD44 Mutant Proteins-- The desired mutations were introduced by overlap extension polymerase chain reaction as described (28). The cDNA encoding full-length human CD44H was cloned into the mammalian expression vector PD19, which contains the hinge and constant regions of human IgG2a. Oligonucleotides complementary to both strands of CD44 cDNA were synthesized, including the desired mutation. These primers were then used in polymerase chain reactions with the CD44 construct as template using the QuikChange site-directed mutagenesis kit (Stratagene, La Jolla, CA) according to the manufacturer's instructions. All constructs were verified by cDNA sequencing.
Wild type and mutant CD44-Ig (immunoglobulin) fusion proteins were produced from transiently transfected COS cells. The binding assays described below were designed for use of COS cell supernatants containing CD44-Ig wild type and mutant proteins and do not require highly purified proteins (11, 15). To concentrate soluble Ig fusion proteins, partial purification from the COS cell supernatants was carried out by protein A column chromatography as described previously (15). The proteins were eluted from the column with 4.0 M imidazole (pH 8.0) containing 1 mM magnesium and calcium chloride, then dialyzed extensively against phosphate-buffered saline (PBS). As reported previously, SDS-polyacrylamide gel electrophoresis analysis of CD44-Ig fusion proteins after this purification step shows an essentially homogeneous fusion protein preparation (11, 15). The Ig fusion protein concentrations were determined using an Ig constant region capture assay (28).
Binding of Wild Type and Mutant Proteins to Anti-CD44 Monoclonal Antibodies-- The immunoreactivity of the Ig fusion proteins with three anti-CD44 monoclonal antibodies (mAbs) was assayed by enzyme-linked immunosorbent assay (ELISA). The wells of 96-well plates (Immunon-2, Dynatech, Chantilly, VA) were coated with goat anti-human IgG constant region antibody (1:1000; Cappell) at 4 °C overnight, blocked with 1 × specimen diluent (Genetic Systems, Redmond, WA), then washed three times with PBS containing 0.05% Tween 20. The wells were then incubated with dilutions of wild-type or mutant CD44-Ig fusion proteins at concentrations of 16 ng/ml to 2 µg/ml, and washed again. Wells were incubated with the following anti-CD44 mAbs: BU75 (1 µg/ml; Ancell, Bayport, MN), MEM-85 (1:1000, Monosan), or A3D8 (1 µg/ml; Sigma). The wells were washed and incubated with horseradish peroxidase (HRP)-conjugated goat anti-murine Ig (γ and light) (1:5000, Biosource, Camarillo, CA) for 1 h at room temperature, then developed in chromogenic substrate (chromogen diluted 1:100 in citrate-buffered substrate, Genetic Systems). The absorbance was measured on an ELISA reader at dual wavelengths (450 and 630 nm). BU75 and MEM-85 are conformationally sensitive, as they do not show reactivity in Western blots. A3D8 is a blotting mAb and was used to monitor protein expression.
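For readers who wish to reproduce the read-out arithmetic, the following Python sketch shows one way to generate a dilution series spanning the stated range and to combine the two wavelength readings. Both parts rest on assumptions that the text does not state: the fivefold dilution factor is chosen only because it connects 2 µg/ml to 16 ng/ml in three steps, and subtracting the 630 nm reading from the 450 nm reading is a common convention for dual-wavelength ELISA readers rather than a documented step of this protocol. Function names and absorbance values are hypothetical.

```python
def dilution_series(start_ng_per_ml, fold, steps):
    """Concentrations (ng/ml) of a serial dilution starting at
    `start_ng_per_ml` and diluting `fold`-fold at each step."""
    return [start_ng_per_ml / fold**i for i in range(steps)]


def corrected_absorbance(a450, a630):
    """Signal at 450 nm minus the 630 nm reference reading
    (assumed convention; not specified in the protocol)."""
    return a450 - a630


# Assumed fivefold series from 2 µg/ml (2000 ng/ml) down to 16 ng/ml.
print(dilution_series(2000.0, 5, 4))  # prints [2000.0, 400.0, 80.0, 16.0]

# Hypothetical raw readings for a single well.
signal = corrected_absorbance(a450=1.25, a630=0.05)
print(round(signal, 3))
```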
Binding of Wild Type and Mutant Proteins to HA-- The ability of the wild type and mutant CD44-Ig fusion proteins to bind to HA was also assayed by ELISA. Immunon-2 plates (Dynatech) were coated with HA (10 µg/ml) overnight at room temperature in 50 mM sodium bicarbonate buffer (pH 9.6). The wells were washed three times with PBS containing 0.05% Tween 20, then blocked with 1 × specimen diluent (Genetic Systems) for 1 h at room temperature. The wells were incubated with dilutions of wild type or mutant CD44-Ig fusion proteins at concentrations of 10-50 µg/ml for 45 min at room temperature. The wells were washed three times, and then incubated with HRP-conjugated Fab goat anti-human Ig gamma chain (1:5000; Biosource) for 45 min at room temperature. After washing, bound HRP-conjugated antibody was assayed using chromogen in buffered substrate (Genetic Systems) as described above.
RESULTS AND DISCUSSION |
---|
CD44 Molecular Model-- CD44 modeling was based on the alignment of TSG-6 and CD44 sequences shown in Fig. 1A. Taking conservative mutations into account, the sequence similarity between CD44 and TSG-6 in the modeled region is 50%. Significant departures from the TSG-6 structure are predicted for regions including deletions and for loop conformations (Fig. 1B). Energy profiles were calculated for the CD44 model and the TSG-6 structure (Fig. 1C). The negative average energy values of the energy profiles and their shape similarity suggest that the sequence-structure compatibility of the model is comparable to TSG-6 and that significant errors in the core region of the model are absent. Thus, the CD44 model was judged to be sufficiently accurate to guide mutagenesis experiments and to analyze the HA binding site.
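The window averaging behind such energy profiles can be illustrated with a short Python sketch. It takes a list of per-residue interaction energies and returns the running average over a 50-residue window, as stated under "Materials and Methods." How ProsaII positions the window and treats the chain termini is not described here, so the centred window with truncation at the ends is an assumption, and the function name and input values are hypothetical.

```python
def window_average(energies, window=50):
    """Average per-residue energies over a sliding window of `window`
    residues. The window is centred on each residue and truncated at
    the chain termini (an assumed edge treatment)."""
    n = len(energies)
    profile = []
    for i in range(n):
        start = i - window // 2
        lo = max(0, start)
        hi = min(n, start + window)
        profile.append(sum(energies[lo:hi]) / (hi - lo))
    return profile


# Hypothetical per-residue energies; a uniformly negative input
# yields a uniformly negative profile.
profile = window_average([-1.0] * 120)
print(all(value < 0 for value in profile))  # prints True
```

In such a profile, a model segment whose averaged energies rise notably above those of the template would be a candidate for rejection, in line with the criterion given in the methods.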
Residue Selection and Mutagenesis-- In the model, residue Arg-41, which was identified previously as critical for HA binding (15), maps to a fully exposed position in an extended loop connecting the first β-strand and α-helix. With the predicted location of Arg-41 as a starting point, 17 CD44 residues were selected in an effort to screen the surface of the CD44 link homology domain for residues important for HA binding. A total of 24 CD44 point mutant proteins were generated, including conservative and non-conservative changes at several positions. Twenty-one mutant proteins were expressed in quantities sufficient for further characterization. These mutant proteins were first tested for their ability to bind to two conformationally sensitive anti-CD44 mAbs. This was done to assess the gross structural integrity of the mutant proteins. Then, the HA binding activity of the mutant proteins was assayed.
Binding Experiments with Conformationally Sensitive mAbs-- The results of all mutagenesis and binding experiments are summarized in Table I. Representative mAb binding profiles are shown in Fig. 2. In general, we found that mAb MEM-85, which effectively blocks HA binding to CD44, mirrors the HA binding properties (see below) of mutant proteins. Some, but not all, mutations that abolished HA binding also abolished MEM-85 binding. These findings indicated that the MEM-85 epitope and the HA binding site in CD44 closely overlap. However, this correlation was not observed for mAb BU75, suggesting that its epitope is either more distant from the HA binding site or that its binding is not affected by these mutations. For example, mutant protein R41A, which bound to mAb BU75 at wild type levels, did not bind to either mAb MEM-85 or HA. The mAb binding characteristics of mutants outside the putative MEM-85 epitope region strictly correlated. Any mutant protein that bound mAb MEM-85 also bound mAb BU75. Therefore, mutant proteins were considered structurally perturbed, if the binding to both mAbs was at least partially affected (e.g. Q65S). The mAb binding experiments suggested that 17 of 21 tested mutant proteins were conformationally sound (Table I). Fig. 2 shows that mutant S112R (which does not affect HA binding, see below) binds slightly better to mAb MEM-85 than wild type CD44. The effect is subtle, but the mutation S112R may slightly increase the avidity of the CD44-mAb interaction.
HA Binding Experiments-- Representative HA binding experiments are shown in Fig. 3. The HA binding activity of different mutant proteins could be classified as either comparable to wild type, reduced (intermediate), or undetectable. All binding experiments are summarized in Table I. Residue Asn-100 is one of four possible N-linked glycosylation sites (Asn-57, Asn-100, Asn-110, and Asn-120) in the link homology domain of CD44. Drastic mutations of Asn-100 to alanine or arginine affected HA binding but not mAb binding. It is not known whether Asn-100 is glycosylated, but the results suggest that Asn-100 contributes, directly or indirectly, to HA binding.
Residues Important for HA Binding-- The conclusions drawn from the characterization of mutant proteins are summarized in Table II, which shows a classification of the mutated residues according to their importance for either structural integrity or HA binding. This classification was based on the mAb and HA binding characteristics of the 21 expressed mutant proteins. Eight new residues (Lys-38, Tyr-42, Lys-68, Arg-78, Tyr-79, Asn-100, Asn-101, and Tyr-105) were identified as important for the CD44-HA interaction. Three of these residues (Tyr-42, Arg-78, and Tyr-79) were considered critical, as their mutation completely abolished HA binding. The binding characteristics of these mutant proteins were equivalent to the previously characterized R41A mutant (15), which was also tested for comparison. Three residues (Phe-34, Gln-65, and Phe-119) contribute to the structural integrity of CD44, while four residues (Arg-46, Lys-54, Ser-112, and Tyr-114) were, on the basis of our experiments, not important for structure or HA binding.
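The classification logic described in the preceding sections can be written out as a small decision rule. The Python sketch below is one reading of that logic, not a reproduction of Table II: a mutant is treated as structurally perturbed when binding to both conformationally sensitive mAbs is affected, and otherwise its HA binding level decides between "critical," "supportive," and "not important." The enum, function name, and category labels are hypothetical, and the reduced mAb levels entered for Q65S are illustrative, since the text states only that binding to both mAbs was at least partially affected.

```python
from enum import Enum


class Level(Enum):
    WILD_TYPE = "comparable to wild type"
    REDUCED = "reduced"
    UNDETECTABLE = "undetectable"


def classify(mem85, bu75, ha=None):
    """Assign a mutant to a category from its mAb and HA binding levels."""
    # Binding to both conformationally sensitive mAbs affected:
    # the mutant is considered structurally perturbed.
    if mem85 is not Level.WILD_TYPE and bu75 is not Level.WILD_TYPE:
        return "structural integrity"
    if ha is None:
        raise ValueError("HA binding level needed for a conformationally sound mutant")
    if ha is Level.UNDETECTABLE:
        return "critical for HA binding"
    if ha is Level.REDUCED:
        return "supports HA binding"
    return "not important"


# R41A: wild type BU75 binding, no MEM-85 or HA binding (from the text).
print(classify(Level.UNDETECTABLE, Level.WILD_TYPE, Level.UNDETECTABLE))
# Q65S: both mAbs affected (exact levels here are illustrative).
print(classify(Level.REDUCED, Level.REDUCED))
# S112R: mAb and HA binding not impaired (from the text).
print(classify(Level.WILD_TYPE, Level.WILD_TYPE, Level.WILD_TYPE))
```

Checking the mAb criterion first reflects the order of the experiments: loss of HA binding is only interpretable for mutants that are conformationally sound.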
The Putative HA Binding Site-- The mutated residues were mapped on the CD44 model (Fig. 4A). Residues important for structural integrity map to the same region close to the carboxyl terminus of the link homology domain. Mutation of these residues may affect the structure of the link domain directly and/or compromise the association of the polypeptide encoded by CD44 exon E5, at least part of which is required for overall stability and ligand binding (15). Residues Tyr-42, Arg-78, and Tyr-79 closely map to Arg-41 and form a cluster of residues in CD44 that is critical for both HA and mAb MEM-85 binding. This is consistent with the finding that MEM-85 effectively blocks HA binding. Residues that substantially contribute to HA binding extend the HA binding site beyond the cluster of critical residues. Two residues not important for binding and three potential N-linked glycosylation sites (Asn-57, Asn-110, and Asn-120) map to positions distant from the binding site.
Comparison of Residues in CD44 and TSG-6-- The location of the binding surface in CD44 approximately corresponds to the binding site proposed for TSG-6 (16). However, there are some significant differences. Of the residues identified as important in our study, only Tyr-42 is conserved in TSG-6. Arg-41 in CD44 corresponds to Lys-11 in TSG-6. Similarly, residue Lys-38 in CD44 corresponds to an arginine in TSG-6 and the K38R mutation in CD44 affects HA binding (Table I). None of the other CD44 residues important for HA binding are conserved in TSG-6; critical residues Arg-78 and Tyr-79 both correspond to alanines in TSG-6 (Fig. 1A). It follows that the details of the protein-carbohydrate interaction may substantially differ in these proteins, despite corresponding binding site locations.
Comparison with Carbohydrate Binding Sites in C-type Lectins-- The predicted location of the HA binding site in CD44 was compared with the carbohydrate binding sites in MBP and E-selectin. MBP is a serum protein that binds mannose expressed on the surface of pathogens and plays a major role in primitive innate immune responses in mammals (30). The selectins are a family of type I transmembrane proteins that includes three members, E-, P-, and L-selectin (31, 32). The selectins are predominantly expressed on endothelial cells (E- and P-selectins) or leukocytes (L-selectin) and recognize sialylated Lewis X (-like) tetrasaccharide structures via their amino-terminal extracellular C-type lectin domains (31, 32). Selectin-ligand interactions play a critical role in triggering the initial interaction between leukocytes and vascular endothelium in the course of an inflammatory reaction (31). The three-dimensional structures of the C-type lectin domains of MBP (17) and E-selectin (33) display a high degree of structural similarity and contain a conserved calcium binding site that is critical for carbohydrate binding.
We have superimposed the CD44 model on the E-selectin structure and compared the locations of residues in CD44 and E-selectin (33, 34) that are important for ligand binding. Fig. 5A shows that the binding sites overlap, suggesting that these proteins utilize corresponding regions for the recognition of diverse carbohydrate structures. The locations of tyrosine residues in CD44 (Tyr-79 and Tyr-105) and E-selectin (Tyr-48 and Tyr-94), which are critical for carbohydrate binding, correspond closely. Fig. 5B shows a side-by-side comparison of the carbohydrate binding sites in MBP (18), E-selectin, and CD44. Despite sharing the C-type lectin fold, MBP and E-selectin are functionally distinct. The structures of link proteins are more distantly related to the C-type lectins. Nevertheless, the binding sites map to equivalent regions in these proteins. The binding surfaces increase with the size of the recognized ligands from mannose (MBP) to sialylated Lewis X (E-selectin) to HA (CD44). In MBP, protein-carbohydrate interactions are essentially limited to the conserved calcium coordination sphere (18), while surface residues in the vicinity of the conserved calcium are critical for ligand binding to the selectins. In contrast to MBP and E-selectin, carbohydrate binding to CD44 is not calcium-dependent and involves a larger surface area. The functional role of MBP in primitive immune responses implies that it is an ancient molecule. Thus, link protein domains may have diverged from the C-type lectin fold.
Conclusions-- A combined modeling and mutagenesis study has identified CD44 residues important for HA binding and has made it possible, despite the inherent limitations, to generate a three-dimensional outline of the CD44 ligand binding site. Although TSG-6 and CD44 share similar structures and common ligands, the majority of residues important for HA binding to CD44 are not conserved in TSG-6, suggesting the presence of specific interactions. The putative HA binding surface in CD44 is extensive and in part corresponds to the carbohydrate binding site in E-selectin. Comparison of the binding sites in MBP, E-selectin, and CD44 is thought to provide an example for the evolution of carbohydrate-binding protein surfaces.
ACKNOWLEDGEMENTS |
---|
We thank Gary Carlton for help in generating Fig. 1 and Debby Baxter for help in the preparation of the manuscript.
FOOTNOTES |
---|
* The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
¶ To whom correspondence should be addressed. Present address: MDS Panlabs, Computational Chemistry, 11804 N. Creek Pkwy. S., Bothell, WA 98011-8805. Tel.: 425-487-8297; Fax: 425-487-8262; E-mail: jbajorath{at}panlabs.com.
1 The abbreviations used are: HA, hyaluronan; HS, heparan sulfate; mAb, monoclonal antibody; MBP, mannose-binding protein; PBS, phosphate-buffered saline; ELISA, enzyme-linked immunosorbent assay; HRP, horseradish peroxidase.
REFERENCES |
---|