Primary Structure and Expression of Matrilin-2, the Closest Relative of Cartilage Matrix Protein within the von Willebrand Factor Type A-like Module Superfamily*

(Received for publication, September 16, 1996, and in revised form, December 23, 1996)

Ferenc Deák‡, Dorothea Piecha§, Csanád Bachrati‡, Mats Paulsson§ and Ibolya Kiss‡

From the ‡Institute of Biochemistry, Biological Research Center of the Hungarian Academy of Sciences, P. O. Box 521, Szeged H-6701, Hungary and the §Institute for Biochemistry, Medical Faculty, University of Cologne, D-50931 Cologne, Germany


ABSTRACT

A mouse cDNA encoding a novel member of the von Willebrand factor type A-like module superfamily was cloned. The protein precursor of 956 amino acids consists of a putative signal peptide, two von Willebrand factor type A-like domains connected by 10 epidermal growth factor-like modules, a potential oligomerization domain, and a unique segment, and it contains potential N-glycosylation sites. A sequence similarity search indicated the closest relation to the trimeric cartilage matrix protein (CMP). Since they constitute a novel protein family, we introduce the term matrilin-2 for the new protein, reserving matrilin-1 as an alternative name for CMP. A 3.9-kilobase matrilin-2 mRNA was detected in a variety of mouse organs, including calvaria, uterus, heart, and brain, as well as fibroblast and osteoblast cell lines. Expressed human and rat cDNA sequence tags indicate a high degree of interspecies conservation. A group of 120-150-kDa bands was, after reduction, recognized specifically with an antiserum against the matrilin-2-glutathione S-transferase fusion protein in media of the matrilin-2-expressing cell lines. Assuming glycosylation, this agrees well with the predicted minimum Mr of the mature protein (104,300). Immunolocalization of matrilin-2 in developing skeletal elements showed reactivity in the perichondrium and the osteoblast layer of trabecular bone. CMP binds both collagen fibrils and aggrecan, and because of the similar structure and complementary expression pattern, matrilin-2 is likely to perform similar functions in the extracellular matrix assembly of other tissues.


INTRODUCTION

Multidomain or mosaic proteins play an important role in the diverse functions of the extracellular matrix (ECM)1 in various tissues (1). Cartilage matrix protein (CMP) (2-4) is an abundant structural component of the ECM in some types of hyaline cartilage. It binds to aggrecan, the large cartilage proteoglycan (5, 6), and to cartilage collagen fibrils (7), and thereby it may serve to connect the two major macromolecular networks. CMP is a homotrimeric glycoprotein of about 50-kDa subunits (2-4), which appear as three connected ellipsoids on electron microscopy of the native mature protein (8). The primary structure of the monomer has been determined from the nucleotide sequence of chicken cDNA and genomic clones (3, 4), and it has also been confirmed in the human and mouse (9, 10). After cleavage of the signal peptide, each subunit consists of two von Willebrand factor type A (vWFA)-like domains separated by an epidermal growth factor (EGF)-like module and followed by a COOH-terminal domain. The latter one has been shown recently to play a role in the trimer assembly via coiled coil formation (8, 11). CMP expression is restricted to particular zones in the growth plate (10, 12, 13).

CMP is one of the simplest members of the vWFA-like module superfamily, a diverse group of proteins sharing high sequence similarity over a segment, which was first identified as the repeated type A domain of von Willebrand factor and has since been found not only in plasma proteins but also in plasma membrane and ECM proteins (14). Crystal structure analysis of an integrin vWFA-like domain has revealed a classic α/β "Rossmann" fold and suggested a metal ion-dependent adhesion site, which is conserved in other vWFA-like modules and can be involved in binding protein ligands (15, 16).

Some of the major constituents of the cartilaginous matrix were found in structurally and genetically related forms in the ECM of other tissues. For example, versican, which is widely expressed in vascular and avascular connective tissue, and the brain-specific neurocan and brevican also bind hyaluronan and show structural similarity to the cartilage-specific aggrecan (reviewed in Ref. 17). Since one of our CMP-specific antisera showed immunostaining not only in cartilage but in the perichondrium as well (12), it raised the question of whether a closely related gene product is functioning in other tissues. To test this hypothesis, we used a chicken CMP probe to isolate cross-hybridizing clones from a mouse epiphyses cDNA library. Here we report on the deduced primary structure of the cloned novel protein, which is a close relative of CMP. It is encoded by a distinct gene and differs from CMP both in structure and tissue specificity. The possible function of the novel protein is discussed.


MATERIALS AND METHODS

cDNA Library Construction and Screening

Poly(A)+ RNA was prepared from the epiphyses and covering tissues of newborn BALB/c mice by affinity chromatography on oligo(dT)-cellulose (Invitrogen). The first cDNA strand was initiated by random hexamer primers and supplied with an oligo(dA) tail. The second strand was primed with an oligo(dA)-tailed XhoI linker. The double-stranded cDNA was supplied with an EcoRI adaptor, cleaved with XhoI, and inserted into the λ-ZAP II vector (Stratagene). After in vitro packaging, 1 × 10⁶ primary phages were amplified, and plaque lifts were hybridized to the insert of pCMP6 (4) in 0.9 M NaCl, 50 mM sodium phosphate, pH 7.4, 5 mM EDTA, 0.1% SDS, 0.1% bovine serum albumin, 0.1% Ficoll, 0.1% polyvinylpyrrolidone, and 100 µg/ml herring sperm DNA at 60 °C. Filters were washed with 0.9 M NaCl, 90 mM trisodium citrate, pH 7.0, 0.1% SDS, and 0.05% sodium pyrophosphate at 53 °C.

The 5'-end of the cDNA was cloned following reverse transcription-coupled polymerase chain reaction and rapid amplification of cDNA ends (18). Primer 1 (Table I) was annealed to poly(A)+ RNA from L929 cells and elongated by SuperScript II RNase H- reverse transcriptase (Life Technologies, Inc.). Part of the cDNA was specifically amplified between primers 2 and 4. The latter primer was designed to cover a conserved sequence in the first vWFA-like module of chicken and human CMP. After annealing to the cDNA at 48 °C for 2 min, primer 4 was elongated by Pyrococcus furiosus DNA polymerase. Then primer 2 was added, and 30 cycles of amplification (94 °C, 1 min; 55 °C, 1 min; 72 °C, 3 min) were performed. The polymerase chain reaction product was isolated from agarose gel, cleaved by SalI, and inserted into the SalI-SmaI sites of pBluescript IISK+. Another fraction of the first cDNA strand was supplied with a poly(A) tail using terminal deoxynucleotidyl transferase. The linker primer TLT was hybridized to the poly(A)-tailed cDNA and elongated for 40 min. The cDNA ends were amplified as above in two consecutive reactions, using first primer T and primer 5 and then products obtained by linker primer L and the gene-specific primer 6. After cleavage by XhoI, the rapid amplification of cDNA ends products were cloned in the SalI-SmaI sites of the vector. Several independent clones were sequenced to correct for mutations during amplification.

Table I. Oligonucleotide primers used in this study

Name                Sequence
Primer 1            5'-TTCTGGAACACTGAGGTCAAGGACTCAA-3'
Primer 2            5'-TGAAATTGGCCACCAGGAA-3'
Primer 3            5'-AACACAGGCATCCTGATCTTTGCCAT-3'
Primer 4            5'-GTGTTTGTCAGCTCT-3'
Primer 5            5'-CTTGACCGTGCTGCCATATT-3'
Primer 6            5'-GGAGCAGACCTACTCGGGTAA-3'
Linker primer TLT   5'-CCAGCGAGCAGAGTGACGAGGACTCGAGCTCAAGC(T)17-3'
Primer T            5'-CCAGCGAGCAGAGTGACG-3'
Linker primer L     5'-CGAGGACTCAAGC-3'
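
As a rough illustration of how the primers in Table I can be characterized, the sketch below applies the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) °C, a common approximation for short oligonucleotides. The rule and the helper names `wallace_tm` and `gc_fraction` are assumptions introduced here for illustration; the text does not state how annealing temperatures were chosen.

```python
def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Two primers copied from Table I.
PRIMERS = {
    "Primer 2": "TGAAATTGGCCACCAGGAA",
    "Primer 4": "GTGTTTGTCAGCTCT",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, Tm ~ {wallace_tm(seq)} C, GC = {gc_fraction(seq):.2f}")
```

The estimate ignores salt and primer concentration, so it is only a first orientation and not a substitute for empirical optimization.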

Sequence Analysis

Both cDNA strands were sequenced with Sequenase version 2.0 (U. S. Biochemical Corp.) either directly in plasmids using synthetic primers or after subcloning in M13 phages. Nucleotide sequence analyses were performed with the programs of the Genetics Computer Group package (19). Human (w42930, f08289, r13439, w05292, r02486, n39494, w32485, w04784, n47632, n53823, t94707, n52700, and a27272) and rat (r47063) expressed sequence tags were identified using the cDNA sequence of pCRP12 in a BLAST search (20) of the National Center for Biotechnology Information (Bethesda, MD) data bases, aligned, and assembled. The human cDNA clone 323380 was obtained from the Reference Library (ICRF, Berlin-Dahlem, Germany) and sequenced to resolve ambiguities. Sequence alignments and phylogenetic trees were constructed by the CLUSTAL W program (21) and the Phylogeny Inference Package version 3.5c (22), respectively.

Cell Cultures and RNA Analysis

The mouse fibroblastic cell lines L929, WEHI 164, and NIH 3T3 and the rat osteoblast cell line UMR-1 obtained from American Type Culture Collection (Rockville, MD) were cultivated in Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum (Life Technologies) and used for RNA and protein analyses. Total RNA was prepared from guanidinium thiocyanate extracts of various organs of newborn and adult mice and cell lines using the RNA isolation kit of Stratagene. For Northern analysis 5-10-µg aliquots were electrophoresed, blotted, and hybridized as described (3).

Production of Antiserum

A 1093-base pair BamHI-EcoRI fragment encoding the second vWFA-like module and the COOH end of the protein was inserted in the pGEX-KT vector (Pharmacia Biotech Inc.) in frame with the glutathione S-transferase gene. After introduction into Escherichia coli SURE cells and induction with 1 mM isopropyl-1-thio-β-D-galactopyranoside, bacteria were lysed, and the fusion peptide was purified on a glutathione-Sepharose affinity adsorbent (Pharmacia Biotech) followed by gel filtration on a column of Superose 12 (Pharmacia Biotech). Fractions enriched for the protein were used to immunize rabbits.

Immunoblot Analysis and Immunohistochemistry

Cultures of WEHI 164 and UMR-1 cells were grown to confluency. The cell layers were washed, and the medium was replaced with Dulbecco's modified Eagle's medium containing only 1% fetal calf serum. After 48 h the media were harvested and submitted to 4-15% SDS-PAGE under reducing conditions. Proteins were transferred electrophoretically to nitrocellulose and developed with a dilution of the antiserum to the matrilin-2-glutathione S-transferase fusion protein, followed by a swine anti-rabbit IgG-peroxidase complex and the ECL chemiluminescence procedure (Amersham Corp.) as described by the suppliers. Immunohistochemistry was performed as described previously (8), using the matrilin-2 antiserum together with a swine anti-rabbit IgG-peroxidase complex and 3-amino-9-ethylcarbazole as substrate on unfixed cryosections from adult mouse.


RESULTS

Isolation of cDNA Clones for Matrilin-2, a CMP-related Novel Mouse Protein

In an attempt to clone CMP-related genes, a mouse epiphysis cDNA library was screened with a cDNA probe for chicken CMP. Among 10 cross-hybridizing clones, characterization of pCRP2, pCRP12, and pCRP18 revealed an open reading frame of 792 triplets (Fig. 1). Since preliminary analysis of the deduced sequence indicated only 53% identity with that of mouse CMP (10), it could not have arisen from the same gene via alternative splicing. Thus the cDNA clones encoded a novel protein.


Fig. 1. Comparison of the modular structure of CMP and matrilin-2 precursors. Vertical lines, regions encoded by cDNA clones indicated. +, location of a segment rich in positively charged residues at the amino terminus of the secreted protein. Arrowheads 1-6, positions of primers 1-6 (Table I) used in cloning and polymerase chain reaction.


The complete protein coding region was cloned in two consecutive steps. Since Northern analysis indicated the expression of an mRNA of the same size both in mouse limb and the mouse cell line L929 (see Fig. 4), poly(A)+ RNA isolated from this cell line was used as a template for further library constructions. To clone reverse transcription-coupled polymerase chain reaction fragments, we used gene-specific primers 1 and 2 in combination with primer 4, specific to a conserved sequence in the first vWFA-like domain of chicken and human CMP (Fig. 1). Several cDNA clones were obtained, and sequencing of pCRP190-pCRP195 showed their identity with pCRP12 in the overlapping regions. Additional 5'-end cDNA clones were produced by the rapid amplification of cDNA ends technique (18) and sequenced, and the two longest ones, pCRP207 and pCRP233, are depicted in Fig. 1. A composite nucleotide sequence of 3571 base pairs was obtained from the overlapping clones (Fig. 2A). The translation start site was assigned to the most upstream of the three in-frame ATG triplets, at nucleotide positions 251-253, because its flanking sequence best matched the consensus motif for the translation initiation site (23), and because functional translation start sites have been observed to be very seldom preceded by in-frame ATG triplets (23). Although the 250-base pair cDNA segment located upstream of the first ATG does not contain an in-frame stop codon, it most likely represents an untranslated sequence, since its translation would result in a very unusual amino acid sequence with long stretches of glycine, alanine, and leucine. The 3'-untranslated region of 453 nucleotides includes two putative polyadenylation signals. The nucleotide sequence thus defines an open reading frame of 956 amino acids (Fig. 2A) and a protein precursor with a predicted Mr of 106,800. The first 23 residues correspond to a putative signal peptide in agreement with the (-3,-1) rule of von Heijne (24).
Its cleavage would result in a mature secreted protein with a minimum Mr of 104,300 and a more complex primary structure compared with that of CMP (Fig. 1). Thus the novel protein includes a pair of putative vWFA-like modules (vWFA1 and vWFA2) and a COOH-terminal domain, which have 47.4, 52.9, and 33.3% sequence identity, respectively, with the corresponding mouse CMP domains. Contrary to CMP, however, it carries: (i) 10 EGF repeats with an average sequence identity of 46% to the EGF module of CMP; (ii) a unique segment, which has not been identified in other proteins to date; and (iii) a group of positively charged amino acids between residues 24 and 39 preceding the first vWFA-like domain (Fig. 2A). The deduced amino acid sequence contains two NX(S/T) consensus sequences for potential N-glycosylation. One of them is located at the end of the first vWFA-like domain, whereas the other is in the unique segment. Apart from this, the unique segment contains an SG motif that matches the (E/D)XSGXX consensus chondroitin sulfate attachment site proposed by Bourdon (25). The primary structure, including the lack of a transmembrane segment and the presence of a putative secretory signal peptide, suggests that the cDNA clones encode a CMP-related ECM protein that may also form oligomers via the potential COOH-terminal coiled coil domain (see Fig. 3C). We therefore propose the name matrilin-2 for the novel protein and reserve the name matrilin-1 for CMP.
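
The two consensus motifs named above, NX(S/T) for potential N-glycosylation and (E/D)XSGXX for chondroitin sulfate attachment, can be located in a protein sequence with simple pattern matching. The following is a minimal sketch under stated assumptions: the function name `find_motifs` and the toy sequence are hypothetical and are not taken from the matrilin-2 sequence, and the patterns follow the consensus exactly as written in the text, with X standing for any residue.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
# X in the consensus is treated as "any residue", as written in the text.
N_GLYC = re.compile(r"(?=(N[A-Z][ST]))")           # NX(S/T)
CS_SITE = re.compile(r"(?=([ED][A-Z]SG[A-Z]{2}))")  # (E/D)XSGXX


def find_motifs(protein: str, pattern: re.Pattern) -> list:
    """Return (1-based start position, matched residues) for each hit."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


# Hypothetical toy sequence for illustration only.
toy = "MKTNVSAEASGLLNQTQ"

print("N-glycosylation sequons:", find_motifs(toy, N_GLYC))
print("Chondroitin sulfate consensus:", find_motifs(toy, CS_SITE))
```

A match only marks a potential site. Whether a given site is actually modified has to be shown experimentally.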


Fig. 4. Distribution of matrilin-2 mRNA in various mouse organs and cell lines. A, Northern hybridization of 5 µg of total RNA samples from total limbs of day 11.5 mouse embryos (lane 1) and epiphyseal cartilage of newborn mice (lane 2). B, relative abundance of matrilin-2 mRNA in mouse organs and various cell lines as determined by Northern analysis. 5 and 10 µg of total RNA were tested from adult mouse organs (lanes 1-11) and cell lines (lanes 12-15), respectively, as indicated above the lanes. Filters were hybridized consecutively with pCRP12 cDNA (MTR) and a chicken DNA fragment for rRNA (28S). Exposure times for autoradiography were 5 days with a screen for the matrilin-2 probe and 6 h without a screen (lanes 1-11) or 6 h with a screen (lanes 12-15) for the rRNA probe, respectively.



Fig. 2. Nucleotide and deduced amino acid sequences of matrilin-2. A, complete sequence of mouse matrilin-2 precursor as determined from overlapping cDNA clones of Fig. 1. Arrowhead, predicted propeptidase cleavage site. Positively charged amino acids at the amino terminus of the secreted protein are bold face. Potential N-linked glycosylation sites are underlined; a consensus motif for chondroitin sulfate attachment is double underlined. Dotted lines, putative polyadenylation signals. B, partial nucleotide and amino acid sequence of human matrilin-2.



Fig. 3. Multiple alignments of amino acid sequences from matrilin-2 and other proteins. Sequences were aligned by the CLUSTAL W multiple alignment program (21) using the default parameters. Module borders were delineated according to exon borders in the CMP genes (4, 9). The consensus was calculated using a threshold value of 0.75. In the name of the modules Gg, Hs, Mm, and Rn reflect chicken, human, mouse, and rat origin, respectively, whereas MTR and TSP1 refer to matrilin-2 and thrombospondin-1, respectively. Lower case letters in the consensus line, conservation of amino acid characters: aliphatic (a), aromatic (r), polar (p), and small (s). Mouse matrilin-2, chicken (4), and human CMP (9) and chicken (26) and human (27) thrombospondin-1 are numbered from the first amino acid of the protein precursor, mouse CMP (10) and human and rat matrilin-2 are numbered from the first codon identified. A, sequence alignment of the vWFA1 (A1) and vWFA2 (A2) modules of CMP and matrilin-2. The locations of the α-helices and β-sheets determined from averaged secondary structure predictions of 75 modules (15) are indicated above the alignments by the arrow ranges β1-α12. Structure predictions by the Chou-Fasman and Garnier methods for the vWFA1 module of mouse matrilin-2 are shown underneath (b, β-sheet; h, α-helix; t, turn). The metal ion-dependent adhesion site (*) conserved in vWFA-like modules (16) and conserved hydrophobic moieties (■) thought to contribute to the module core (15) are denoted. B, sequence alignment of the EGF modules of matrilin-2 and CMP. egf1-egf10, EGF repeats from mouse matrilin-2 numbered from the amino terminus. C, sequence alignment of the α-helical coiled coils of CMP, matrilin-2, and thrombospondin-1. Hydrophobic amino acids at positions 1 and 4 of the heptad repeats are bold face; polar moieties at positions 5 and 7 are shaded. D, alignment of the unique sequence of matrilin-2 from the mouse, human and rat. Potential N-linked oligosaccharide and chondroitin sulfate attachment sites are underlined and double underlined, respectively.


A data bank search indicated that this protein has not been identified previously. In the GenBank EST Division, however, several human-expressed sequence tags were found that had significant similarity to the coding region of mouse matrilin-2. An open reading frame of 313 amino acids (Fig. 2B) with 87% identity to the COOH-terminal region of mouse matrilin-2 was identified and confirmed by sequencing the longest cDNA clone. A short rat cDNA sequence tag encoding the putative oligomerization and unique modules of the protein was similarly found. These observations provide independent evidence that the gene is also expressed in human and rat cells.

Matrilin-2 is Related to Members of Other Protein Families

Multiple alignment of the vWFA-like domains between CMP and matrilin-2 confirmed a high degree of sequence similarity (Fig. 3A). A pair of cysteines located at both ends of the module, the five residues composing the metal ion-dependent adhesion site (16), and six hydrophobic residues reported to be highly conserved in other proteins at positions 17, 60, 100, 123, 153, and 163 (15) are also conserved in matrilin-2. The predicted secondary structures of both vWFA-like modules of matrilin-2 are in remarkably good agreement with the previously determined secondary structure of the vWFA-like domain (15, 16) (Fig. 3A). The potential N-linked oligosaccharide attachment site was found in the vWFA1 domain of matrilin-2 at a different position than in CMP.

The EGF-like modules in both CMP and matrilin-2 are of the B type, which do not contain potential Ca2+ binding motifs (28) and differ only by a single amino acid in length. Sequence alignment showed full conservation of each cysteine as well as a glycine and a lysine at positions 31 and 39, respectively, without insertion of gaps (Fig. 3B). Furthermore, from the 25 highly conserved residues, 20 are also present in CMP from three different species.
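
Full conservation of a residue, such as each cysteine in the EGF modules, corresponds to an alignment column in which every sequence carries the same residue. The sketch below shows one way to list such columns. The function name `conserved_columns` and the three short aligned strings are hypothetical illustrations, not the alignment of Fig. 3B.

```python
def conserved_columns(alignment: list) -> dict:
    """Map 1-based column index to residue for columns identical in all rows.

    Rows must be pre-aligned strings of equal length; '-' marks a gap.
    Columns containing any gap are not reported as conserved.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    conserved = {}
    for index, column in enumerate(zip(*alignment), start=1):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            conserved[index] = column[0]
    return conserved


# Hypothetical toy alignment for illustration only.
toy_alignment = [
    "CAPGCKH",
    "CSDGCEH",
    "CTPGC-H",
]

print(conserved_columns(toy_alignment))
```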

Although the sequence identity of the COOH-terminal domains between matrilin-2 and CMP is below the overall value (49%) for the two proteins, structural motifs characteristic of coiled coil α-helices (29) can clearly be recognized (Fig. 3C). Within the heptad repeats, positions 1 and 4 are preferentially occupied by aliphatic moieties, and positions 5 and 7 are filled with polar residues. Alignment of this part of matrilin-2 with the trimerization modules of CMP and thrombospondin-1 indicates further conservation of residues in addition to the structural similarity. Immediately upstream of the heptad repeats, two closely spaced cysteines, which were shown to stabilize the homotrimers in thrombospondin-1 (30) and CMP (31), are also fully conserved in matrilin-2.
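
The heptad pattern described here can be checked mechanically: for a chosen register, count how often positions 1 and 4 hold a hydrophobic residue and how often positions 5 and 7 hold a polar one. The following is a minimal sketch, not the method used for Fig. 3C. The residue classes, the function names `heptad_score` and `best_frame`, and the toy repeat are assumptions made for illustration.

```python
HYDROPHOBIC = set("AILMVFWY")  # assumed residue class, chosen for illustration
POLAR = set("DEKRNQSTH")       # assumed residue class, chosen for illustration


def heptad_score(seq: str, frame: int) -> tuple:
    """Score a heptad register.

    frame is the 0-based index of the residue treated as heptad position 1.
    Returns (fraction of positions 1 and 4 that are hydrophobic,
             fraction of positions 5 and 7 that are polar).
    """
    core_hits = core_total = polar_hits = polar_total = 0
    for i in range(frame, len(seq)):
        position = (i - frame) % 7 + 1
        if position in (1, 4):
            core_total += 1
            core_hits += seq[i] in HYDROPHOBIC
        elif position in (5, 7):
            polar_total += 1
            polar_hits += seq[i] in POLAR
    core = core_hits / core_total if core_total else 0.0
    polar = polar_hits / polar_total if polar_total else 0.0
    return core, polar


def best_frame(seq: str) -> int:
    """Return the register (0-6) with the highest combined score."""
    return max(range(7), key=lambda frame: sum(heptad_score(seq, frame)))


# Hypothetical idealized repeat: L at position 1, I at 4, E at 5, K at 7.
toy = "LSGIEAK" * 3
print(best_frame(toy), heptad_score(toy, best_frame(toy)))
```

A high score in one register is suggestive of a coiled coil but does not by itself establish the oligomerization state.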

In mouse matrilin-2, the unique sequence of 75 residues located between the second vWFA-like module and the coiled coil α-helix contains a potential glycosaminoglycan attachment site, which is not conserved in humans and rats (Fig. 3D). Sequence analysis of the expressed sequence tags data base has revealed variations among the human sequence tags. The stretch of 20 triplets missing from the rat sequence was also absent from one human tag. Apparently, this segment is subject to alternative splicing.

When the sequences of the vWFA1, vWFA2, EGF1, or COOH-terminal modules of matrilin-2 were used as a query in sequence similarity searches with two programs, BLASTP (20) and FASTA (32), it was found that the closest relatives of those are the corresponding modules in CMP. This suggests that the two proteins have evolved from a common ancestral gene. Computer analysis, using the Fitch-Margoliash algorithm of the Phylogeny Inference Package (22) for the construction of evolutionary trees, revealed a closer evolutionary relationship of the corresponding vWFA-like modules between matrilin-2 and CMP than between the vWFA1 and vWFA2 modules within either protein (data not shown). This indicates that the duplication of the vWFA-like modules preceded the separation of the genes for CMP and matrilin-2. Construction of the evolutionary tree for the EGF modules revealed that the first EGF repeat of matrilin-2 is more distantly related to the other repeats within the same molecule than to the EGF module of CMP (data not shown). However, the other EGF repeats of matrilin-2 have higher degrees of sequence similarity with each other, suggesting that the latter ones have started to diverge from each other after the separation of the ancestor genes of CMP and matrilin-2.
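
The inference about the order of events rests on comparing pairwise distances: if corresponding modules of two proteins are closer to each other than the two modules within one protein, the module duplication predates the gene split. The sketch below computes only the percent identity and the simple distance matrix that distance-based tree methods take as input. It is not the Fitch-Margoliash algorithm itself, and the sequence names and toy aligned strings are hypothetical.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identical residues over aligned columns without gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def distance_matrix(aligned: dict) -> dict:
    """Symmetric distances d = 1 - identity/100 for every pair of names."""
    distances = {}
    for first, second in combinations(aligned, 2):
        d = 1.0 - percent_identity(aligned[first], aligned[second]) / 100.0
        distances[(first, second)] = distances[(second, first)] = d
    return distances


# Hypothetical toy data: module A1 from proteins p and q, module A2 from p.
toy = {
    "A1_p": "ACDEFGHIKL",
    "A1_q": "ACDEFGHIKV",
    "A2_p": "ACNEYGH-KL",
}

d = distance_matrix(toy)
# In this toy case the same module from two proteins is closer than the
# two different modules within one protein.
print(d[("A1_p", "A1_q")] < d[("A1_p", "A2_p")])
```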

The Matrilin-2 Gene is Expressed in a Variety of Organs and Cell Lines

Distribution of matrilin-2 mRNA in various mouse organs was studied by RNA blot hybridization. A strong band of 3.9 kilobases was detected in limbs of day 11 mouse embryos, but it was not visible in the epiphyseal cartilage samples of newborn mice (Fig. 4A). A transcript of the same size was also found in high abundance in the calvaria, uterus, and heart and in lower abundance in skeletal muscle, the brain, and skin (Fig. 4B). However, by this method the transcript was barely detectable or undetectable in the trachea, femur, lung, spleen, and kidney. Since the mRNA for matrilin-2 was found in a wide variety of tissues, but not in cartilage, its expression pattern clearly differed from that of CMP. To test whether the broad tissue distribution is due to the expression in loose connective tissue cells present in different organs, mouse fibroblastic cell lines were also studied by Northern analysis. Cell lines L929, WEHI 164, and NIH 3T3, which originated from mouse C3H/An connective tissue, a BALB/c mouse fibrosarcoma, and a NIH Swiss mouse embryo, respectively, all expressed the 3.9-kilobase matrilin-2 mRNA, thus supporting this hypothesis (Fig. 4B). In addition, the mRNA was present in the rat osteoblast cell line, UMR-1 (Fig. 4B).

To gain information about the size and localization of the protein, two expressing cell lines, WEHI 164 and UMR-1, were selected for immunochemical studies. In media from both cell lines, the antiserum raised against the matrilin-2-glutathione S-transferase fusion protein reacted specifically with a group of bands migrating, after reduction, at a position corresponding to a molecular mass of 120-150 kDa (Fig. 5). These bands were not seen without prior reduction, indicating that they had been part of a high molecular mass complex, which under the conditions used did not enter the gel or transfer from the gel to the nitrocellulose (data not shown). The molecular mass of matrilin-2 monomers observed on SDS-PAGE is in approximate agreement with the predicted molecular mass, when the possibility of glycosylation is considered. Indeed, the staining of a group of bands between 120 and 150 kDa indicates a heterogeneity, which could be due to differences in glycosylation and/or alternative splicing.


Fig. 5. Immunoblot analysis of cell culture media. Medium was collected from UMR-1 (lane 2) and WEHI 164 (lane 3) cells after 48 h of conditioning and applied to 4-15% SDS-PAGE gels after reduction. After transfer to nitrocellulose, matrilin-2 was visualized by use of the antibody raised against the matrilin-2-glutathione S-transferase fusion protein. A heterogeneous set of specifically stained bands is seen migrating between 120 and 150 kDa. A strong band at 50-70 kDa is seen also in the lane of molecular mass standards (lane 1) and represents a nonspecific titer, presumably directed to keratins. Autoantibodies to keratins are often found in rabbits, and keratins frequently contaminate the chemicals used in SDS-PAGE.


The antiserum was used for immunohistochemical localization of matrilin-2 in sections of mouse tissues. In preliminary experiments, expression was seen in the matrix adjacent to many mesenchymal but not muscle cells (data not shown), demonstrating that the expression pattern of matrilin-2 is distinct from that of CMP. Therefore, we focused our attention on the skeletal elements, in which CMP has a very characteristic distribution. In sections of tracheal cartilage the perichondrium but not the cartilage matrix proper was stained (Fig. 6, B and C). In sections of trabecular bone the osteoblast layer was positive (Fig. 6A). This indicates that the expression pattern of matrilin-2 and CMP is complementary in the skeletal elements and agrees with the results from immunoblotting (Fig. 5), showing the presence of the protein in conditioned medium from fibrosarcoma cells and an osteoblastic cell line.


Fig. 6. Immunolocalization of matrilin-2 in adult mouse tissues. A, in the vertebral body (vb, upper right) matrilin-2 is detected in the osteoblast layer, which lines the bone trabeculae. The intervertebral disc (id, lower left) is devoid of staining. B, the perichondrium surrounding the tracheal cartilage is positive for matrilin-2, whereas the cartilage proper is negative. C, control section of tracheal cartilage treated with preimmune serum does not show any staining. Bar, 40 µm in A and 20 µm in B and C.



DISCUSSION

This study reports on the molecular cloning and the complete coding sequence of the matrilin-2 gene from the mouse and a partial sequence from the human. Neither the gene nor the protein product has been described previously. Evidence is provided that the gene is expressed in a variety of mouse and human organs as well as mouse and rat cell lines, and it encodes a protein, which is secreted into the extracellular space. The 87% identity over a stretch of 313 amino acids between mouse and human matrilin-2 indicates that the protein is a functionally important novel component of the extracellular matrix in a broad range of mammalian tissues and organs.

Data base analyses both at nucleic acid and amino acid levels have revealed that matrilin-2 belongs to the vWFA-like superfamily. Several lines of evidence indicate that it is the closest relative of CMP: (i) the two proteins share three modules of a considerable degree of sequence similarity; although the vWFA-like domain has been shown to reshuffle with a great variety of modules in different proteins, no other members of the superfamily have been reported to contain the oligomerization module, and only the Caenorhabditis elegans ynx3 protein includes EGF modules (14-16); (ii) since the order of the related modules is also identical in CMP and matrilin-2, it is very unlikely that their genes have originated through convergent evolution; and (iii) data base searches indicate that the closest relatives of all of the three putative matrilin-2 modules are the corresponding ones of CMP; that is, the closest evolutionary relationship was found even for the least conserved putative oligomerization domains between matrilin-2 and CMP. From these data we conclude that the two proteins belong to the same protein family, which we now refer to as the matrilins. The strikingly similar domain structures and the close evolutionary relationship supported by the construction of phylogenetic trees suggest that the genes for CMP (matrilin-1) and matrilin-2 have evolved via the duplication of a common ancestor gene encoding duplicated vWFA, single EGF, and putative oligomerization modules.

Our data show that the matrilin-2 gene is transcribed in a variety of mouse organs and cell types. Its expression level varies within a broad range, the mRNA being most abundant in the calvaria, uterus, and heart and less abundant or not detectable in others. Immunostaining revealed specific reactivity in the perichondrium and other connective tissue cells as well as osteoblasts. Preliminary in situ hybridization experiments also support a broad expression pattern.2 The human-expressed sequence tags were derived from embryonic heart, lung, brain, senescent fibroblasts, and multiple sclerosis lesions of adult patients, which confirms expression in the connective tissue of different organs. Although the gene activity was demonstrated in fibroblast and osteoblast cell lines, it requires further studies to identify all matrilin-2 expressing cell types in vivo clearly.

The strikingly similar domain structure and the complementary expression pattern of CMP and matrilin-2 suggest that the two proteins perform similar functions in the organization of different forms of ECM. Marked differences were noticed in the level of CMP gene expression depending on the chondrocyte differentiation stages both in vivo (10, 12, 13) and in various culture systems (33) and also depending on the forms of cartilage (34). The low level in articular cartilage and in the resting zone of the growth plate is in accordance with the low abundance of the CMP coding sequences in the RNA used for library construction in this article. Apart from cartilage, CMP expression has been reported only in the notochord and certain structures of the eye (12, 13, 35), suggesting a very specialized function. In fact, CMP was reported to bind both to aggrecan (5, 6) and type II collagen fibrils (7) as well as to form a collagen-independent filamentous network (36). Its binding to aggrecan apparently involves covalent cross-linking, which increases with age (6). If the interaction of CMP with the type II collagen fibrils and the aggrecan-hyaluronan network takes place simultaneously, it implies an important bridging function between the two major macromolecular networks. Such a role in the organization of the cartilaginous ECM may explain variations in the abundance of CMP in different forms of cartilage (34). It is not known which domain of CMP is involved in these interactions and what the molecular mechanism is. The vWFA-like domains are major candidates for macromolecular interactions, since they have been shown to bind to a versatile group of ligands, including platelet glycoprotein Ib and collagen in von Willebrand factor (37), collagen, heparin, and hyaluronan in type VI collagen (14, 38), and ICAM-1, iC3b, and fibrinogen in integrins (16).

The primary structure of matrilin-2 indicates that it may play a similar role in the organization of the ECM in other tissues. The presence of a putative secretory signal peptide and the lack of a transmembrane domain in the coding region suggested an extracellular protein, which was confirmed by the immunological detection of the secreted protein in the cell culture media. Although other tissues do not have such a high proportion of ECM as cartilage, they also contain collagens and large aggregating proteoglycans in their ECM. Therefore, it is possible that matrilin-2 evolved to play a bridging role between these macromolecular networks in other tissues. The coiled coil domain was shown to form trimers in CMP (8), thrombospondin-1, and thrombospondin-2 (26), whereas it forms pentamers in thrombospondin-4 and cartilage oligomeric matrix protein (39). Further studies are required to determine whether matrilin-2 forms trimers or other oligomers via its putative coiled coil domain, but the results from SDS-PAGE under nonreducing conditions (not shown) demonstrate that it is not a monomeric protein.

The predicted structure of matrilin-2 may serve as the molecular basis for further interactions. The presence of the unique segment may confer the potential to interact with other ECM components that are absent from cartilage. The broad size interval of the 120-150-kDa protein bands reacting specifically with the matrilin-2-glutathione S-transferase antiserum is indicative of extensive glycosylation and/or attachment of glycosaminoglycan side chains to a variable extent. Consensus motifs for N-glycosylation and glycosaminoglycan attachment were found in matrilin-2, supporting this assumption. The location of the N-glycosylation motif in the unique segment is conserved between mouse and rat. Furthermore, the positively charged amino terminus may enable matrilin-2 to interact electrostatically with negatively charged polymers, e.g. hyaluronan or other glycosaminoglycans.

The expression pattern of matrilin-2 in mouse and the presence of matrilin-2 cDNA among sequence tags originating from various human and rat tissues indicate an important role in the ECM of mammals. If matrilins perform an essential function in the organization of the ECM, then further members of this protein family with slightly different domain structures can be predicted to exist. These proteins may interact with the collagen and proteoglycan components of the specialized ECM of other tissues, such as brain, spleen, and lung.


FOOTNOTES

*   This work was supported by Grants OTKA T896, E012081, and CO38 from the Hungarian National Scientific Research Foundation (to I. K.), joint Grant 7UNPJO38527 from the Swiss National Foundation (to M. P. and I. K.), and joint Grant I/71 654 from Volkswagen-Stiftung (to M. P. and I. K.). The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequences reported in this paper have been submitted to the GenBank™/EMBL Data Bank with accession numbers U69262 and U69263.


   To whom correspondence should be addressed. Tel.: 36-62-432-232; Fax: 36-62-433-506.
1   The abbreviations used are: ECM, extracellular matrix; vWFA, von Willebrand factor type A; CMP, cartilage matrix protein; EGF, epidermal growth factor; PAGE, polyacrylamide gel electrophoresis.
2   F. Deák and D. Studer, unpublished data.

ACKNOWLEDGEMENTS

We are grateful to Dr. N. Hauser for contributions in the early phases of this work, Dr. R. Wagener for pointing out the existence of expressed sequence tags with homology to matrilin-2, Dr. A. Aszódi for providing the mouse CMP sequence before publication, Dr. L. Patthy for critical reading of the manuscript, and Dr. L. Módis for valuable discussion. We also thank I. Fekete, A. Simon, and I. Kravjár for the excellent technical assistance and A. Borka and M. Tóth for the artwork.


REFERENCES

  1. Engel, J., Efimov, V. P., and Maurer, P. (1994) Development (Camb.) (suppl.) 35-42
  2. Paulsson, M., and Heinegård, D. (1981) Biochem. J. 197, 367-375
  3. Argraves, W. S., Deák, F., Sparks, K. J., Kiss, I., and Goetinck, P. F. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 464-468
  4. Kiss, I., Deák, F., Holloway, R. G., Jr., Delius, H., Mebust, K. A., Frimberger, E., Argraves, W. S., Tsonis, P. A., Winterbottom, N., and Goetinck, P. F. (1989) J. Biol. Chem. 264, 8126-8134
  5. Paulsson, M., and Heinegård, D. (1979) Biochem. J. 183, 539-545
  6. Hauser, N., Paulsson, M., Heinegård, D., and Mörgelin, M. (1996) J. Biol. Chem. 271, 32247-32252
  7. Winterbottom, N., Tondravi, M. M., Harrington, T. L., Klier, F. G., Vertel, B. M., and Goetinck, P. F. (1992) Dev. Dyn. 193, 266-276
  8. Hauser, N., and Paulsson, M. (1994) J. Biol. Chem. 269, 25747-25753
  9. Jenkins, R. N., Osborne-Lawrence, S. L., Sinclair, A. K., Eddy, R. L., Jr., Byers, M. G., Shows, T. B., and Duby, A. D. (1990) J. Biol. Chem. 265, 19624-19631
  10. Aszódi, A., Hauser, N., Studer, D., Paulsson, M., Hiripi, L., and Bosze, Z. (1996) Eur. J. Biochem. 236, 970-977
  11. Beck, K., Gambee, J. E., Bohan, C. A., and Bächinger, H. P. (1996) J. Mol. Biol. 256, 909-923
  12. Aszódi, A., Módis, L., Páldi, A., Rencendorj, A., Kiss, I., and Bosze, Z. (1994) Matrix Biol. 14, 181-190
  13. Mundlos, S., and Zabel, B. (1994) Dev. Dyn. 199, 241-252
  14. Colombatti, A., and Bonaldo, P. (1991) Blood 77, 2305-2315
  15. Perkins, S. J., Smith, K. F., Williams, S. C., Haris, P. I., Chapman, D., and Sim, R. B. (1994) J. Mol. Biol. 238, 104-119
  16. Lee, J.-O., Rieu, P., Arnaout, M. A., and Liddington, R. (1995) Cell 80, 631-638
  17. Iozzo, R. V., and Murdoch, A. (1996) FASEB J. 10, 598-614
  18. Frohman, M. A. (1993) Methods Enzymol. 218, 340-356
  19. Genetics Computer Group. (1994) Program Manual for the Wisconsin Package, Version 8, Genetics Computer Group, Madison, WI
  20. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) J. Mol. Biol. 215, 403-410
  21. Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) Nucleic Acids Res. 22, 4673-4680
  22. Felsenstein, J. (1989) Cladistics 5, 164-166
  23. Kozak, M. (1989) J. Cell Biol. 108, 229-241
  24. von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690
  25. Bourdon, M. A. (1990) in Extracellular Matrix Genes (Sandell, L. J., and Boyd, C. D., eds), pp. 166-170, Academic Press, Inc., New York
  26. Lawler, J., Duquette, M., and Ferro, P. (1991) J. Biol. Chem. 266, 8039-8043
  27. Lawler, J., and Hynes, R. O. (1986) J. Cell Biol. 103, 1635-1648
  28. Handford, P. A., Mayhew, M., Baron, M., Winship, P. R., Campbell, I. D., and Brownlee, G. G. (1991) Nature 351, 164-167
  29. Cohen, C., and Parry, D. A. D. (1990) Proteins Struct. Funct. Genet. 7, 1-15
  30. Sottile, J., Selegue, J., and Mosher, D. F. (1991) Biochemistry 30, 6556-6562
  31. Haudenschild, D. R., Tondravi, M. M., Hofer, U., Chen, Q., and Goetinck, P. F. (1995) J. Biol. Chem. 270, 23150-23154
  32. Pearson, W. R., and Lipman, D. J. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 2444-2448
  33. Muratoglu, S., Bachrati, C., Malpeli, M., Szabó, P., Neri, M., Dozin, B., Deák, F., Cancedda, R., and Kiss, I. (1995) Eur. J. Cell Biol. 68, 411-418
  34. Paulsson, M., and Heinegård, D. (1982) Biochem. J. 207, 207-213
  35. Stirpe, N. S., and Goetinck, P. F. (1989) Development (Camb.) 107, 23-33
  36. Chen, Q., Johnson, D. M., Haudenschild, D. R., Tondravi, M. M., and Goetinck, P. F. (1995) Mol. Biol. Cell 6, 1743-1753
  37. Sadler, J. E. (1991) J. Biol. Chem. 266, 22777-22780
  38. Specks, U., Mayer, U., Nischt, R., Spissinger, T., Mann, K., Timpl, R., Engel, J., and Chu, M.-L. (1992) EMBO J. 11, 4281-4290
  39. Hauser, N., Paulsson, M., Kale, A. A., and DiCesare, P. E. (1995) FEBS Lett. 368, 307-310

©1997 by The American Society for Biochemistry and Molecular Biology, Inc.