©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Differential Expression of an Acidic Domain in the Amino-terminal Propeptide of Mouse Pro-alpha2(XI) Collagen by Complex Alternative Splicing (*)

(Received for publication, June 13, 1994; and in revised form, October 13, 1994)

Noriyuki Tsumaki (§) and Tomoatsu Kimura

From the Department of Orthopaedic Surgery, Osaka University Medical School, Suita 565, Japan

ABSTRACT

We isolated and sequenced genomic and cDNA clones encoding the complete amino-terminal portion and the 5`-untranslated region of mouse pro-alpha2(XI) collagen mRNA. Fourteen exons encoded the amino-terminal propeptide, which was divided into three consecutive domains (a long globular domain, an amino-terminal triple helical domain, and a telopeptide domain). The long globular domain was further divided into an upstream basic subdomain and a downstream highly acidic subdomain, as is the case for the amino-terminal propeptides of pro-alpha1(V) and pro-alpha1(XI) collagens. We also demonstrated that the primary transcript undergoes complex alternative splicing. Three consecutive exons (exons 6, 7, and 8) encoding most of the acidic subdomain showed alternative splicing, which dramatically affected the structure of the amino-terminal propeptide of pro-alpha2(XI) collagen. Using the reverse transcription-polymerase chain reaction, we analyzed the expression of these exons in various tissues and in developing limb buds of mice. The pro-alpha2(XI) transcripts were abundant in cartilage, but most of them lacked the 3-exon sequences encoding the acidic domain. Most other tissues also contained mRNAs that corresponded to longer splice variants, including exons 6-8. The differential expression of specific domains of pro-alpha2(XI) collagen may be important in modulating interactions between various components of the extracellular matrix and/or may influence heterotypic collagen assembly.


INTRODUCTION

The fibrillar component of hyaline cartilage consists of several types of collagen. Type II collagen is the major component of the fibrils, whereas a quantitatively minor type IX collagen is associated with their surfaces (1, 2, 3). Type XI collagen is another component of cartilage fibrils that seems to be localized in the interior of the fibrils (4, 5). The type XI collagen molecule is composed of three distinct polypeptide subunits: alpha1(XI), alpha2(XI), and alpha3(XI) (6). The alpha3(XI) chain is believed to be a post-translational variant of the alpha1(II) collagen gene product (7), whereas the alpha1(XI) and alpha2(XI) chains are distinct gene products that are closely related to the alpha1(V) chain (8, 9, 10). Type XI collagen is predominantly found in cartilage. However, transcripts of the alpha1(XI) chain have also been detected in a variety of non-cartilaginous cells and tissues (11, 12, 13). In addition, the alpha1(XI) chain has been detected in bone in association with the alpha1(V) and alpha2(V) chains (14). Furthermore, the fibrils in the vitreous humor are assembled from molecules containing the alpha1(XI) and alpha2(V) chains (15). It has therefore been suggested that type V and XI collagens are not separate collagen types but are part of a larger collagen family (15).

The functions of type V/XI collagen are still obscure. However, emerging evidence suggests that type V collagen forms heterotypic fibrils with the more abundant type I collagen and that it may influence fibrillogenesis by controlling the lateral growth of the fibrils through the coassembly process (16, 17). This function of type V collagen is presumably based on the particularly slow and/or limited processing of its amino-terminal propeptide (18, 19) (in this paper, the entire amino-terminal portion between the signal peptide and the start of the major triple helix is referred to as the N-propeptide).(^1) Similarly, type XI collagen coassembles with type II collagen and may regulate the diameter of cartilage collagen fibrils (4, 5). The N-propeptide domain of type XI collagen seems to be at least partly retained after proteolytic processing (20, 21).

In the present study, we extended our cloning experiments on pro-alpha2(XI) collagen. We investigated the genomic structure coding for the N-propeptide and most of the major triple-helical domain of the mouse pro-alpha2(XI) chain. In addition, we obtained evidence that the N-propeptide domain of pro-alpha2(XI) collagen is differentially expressed because of alternative RNA splicing. The differentially expressed domain is highly acidic and is encoded by three exons that are expressed in various combinations, potentially increasing the functional versatility of type XI collagen.


EXPERIMENTAL PROCEDURES

Isolation of Genomic Clones Containing the Mouse Pro-alpha2(XI) Collagen Gene

A mouse genomic library consisting of 9-23-kb fragments of adult 129sv mouse DNA cloned in the vector EMBL3 was the kind gift of Drs. Hitoshi Niwa and Ken-ichi Yamamura (Kumamoto University School of Medicine). An additional 129sv mouse genomic library in the Lambda FIX II vector was purchased from Stratagene. The insert of a human pro-alpha2(XI) collagen cDNA clone, KTh 93 (9), was used as a probe to screen these mouse genomic libraries. Library filters were screened with hybridization probes using an ECL direct nucleic acid labeling and detection system (Amersham). Phage purification and recombinant DNA isolations were performed using standard methods (22). After preliminary restriction mapping of positive clones, fragments were subcloned into pBluescript II SK (Stratagene) for nucleotide sequence analysis. In some instances, several deletion series were prepared with a deletion kit for kilo sequencing (Takara). Sequence analysis of double-stranded DNA was carried out using Taq Dye Primer and DyeDeoxy Terminator cycle sequencing kits (Applied Biosystems) and an Applied Biosystems 373A DNA Sequencer.

cDNA Cloning

Poly(A) RNA was extracted from the thorax of 1-day-old mice with a Micro-Fast Track mRNA Isolation Kit (Invitrogen) and was used as the template for rapid amplification of cDNA ends (RACE). 5`-RACE was accomplished using a kit from Life Technologies, Inc., which was based on the method of Frohman (23). First-strand cDNA synthesis was primed using specific primer 1 (5` AATCGCCTGGGCCTGGGCCTCCTG 3`), based on the partial exon sequence of a genomic subclone coding for part of the amino-terminal telopeptide, which we later found to be exon 13 (see Fig. 2 for the positions of the primers and numbering of the exons). After poly(dC) tailing with terminal deoxynucleotidyl transferase, the cDNA was used as the template in the polymerase chain reaction (PCR). This was performed employing a mixture of an anchor primer (5` GGCCACGCGTCGACTAGTACGGGIIGGGIIGGGIIG 3`, supplied in the Life Technologies, Inc. kit) and specific primer 2 (5` TCACCCCCACTACTGCCAAAC 3`), which was complementary to the sequence of exon 13. Uracil DNA glycosylase cloning sequences were added to the 5` ends of both primers. Initial denaturation at 94 °C for 5 min and equilibration at 80 °C was followed by 33 PCR cycles of 95 °C for 1 min, 59 °C for 1 min, and 72 °C for 1.5 min, with final extension at 72 °C for 10 min. Aliquots of the PCR products were electrophoresed on a 1.5% agarose gel. A major band of approximately 1,200 bp and several other minor bands were recovered from the gel, cloned into pAMP1 (Life Technologies, Inc.), and sequenced as described above.


Figure 2: Schematic representation of alpha2(XI) N-propeptide and the alignment of two cDNA clones with exon-intron organization. The two cDNAs have the same sequence at their 5` and 3` ends. pRAC2-28 contains a 321-bp deletion relative to pRAC1-15. At the top is the domain structure of alpha2(XI) N-propeptide encoded by pRAC1-15. SP, BS, AS, TH, and TP indicate the signal peptide, basic subdomain, acidic subdomain, amino-terminal triple helical domain, and amino-terminal telopeptide, respectively. Vertical arrows indicate the position of cysteinyl residues in the basic subdomain. The locations of the primers used to obtain cDNAs are indicated by horizontal half-arrows. The bottom part of the figure shows the location of exons 1-14 in the gene. As discussed under ``Results,'' exons 6-8 (hatched boxes) are alternatively spliced exons encoding an acidic subdomain.



Northern Blot

An RNA sample was prepared from the whole skeletal tissues of 3-week-old mice as described above. Approximately 3 µg of the poly(A) RNA was electrophoresed in 0.8% agarose gels in the presence of formamide/formaldehyde, transferred onto nylon filters (Hybond N; Amersham), and hybridized to the appropriate probes.

Reverse Transcription (RT)-PCR

The tissue distribution of alpha2(XI) collagen mRNA was studied by RT-PCR. Total RNA was extracted with guanidine thiocyanate from the rib cartilage, skeletal muscle, brain, heart, eye, calvaria, and liver of 3-week-old mice using a previously described method (24). Total RNA was also extracted from the forelimb buds of 10.0-14.0-day mouse embryos. The RT-PCR was performed as described previously (25) with slight modifications. Briefly, 5 µg of total RNA was transcribed into first-strand cDNA with a random hexamer primer using Moloney murine leukemia virus reverse transcriptase (Life Technologies, Inc.). Part of the cDNA was amplified by the PCR using a half-aliquot of cDNA as a template and Taq DNA polymerase (Perkin-Elmer/Cetus). The following two specific oligonucleotide primers were used for the reactions: primer A (5` CAGACTCAGAAGCCTCACAG 3`, nt 709-728) and primer B (5` TCCCTCTACAAACATACCAG 3`, complementary to nt 1178-1197) (see Fig. 2 and Fig. 3). The PCR was performed with 5 pmol of each primer in a 50-µl reaction and involved 28 cycles of denaturing at 95 °C for 1 min, annealing at 57 °C for 1 min, and extension at 72 °C for 2 min. A 983-bp fragment of glyceraldehyde-3-phosphate dehydrogenase cDNA (26) was also amplified as a control. After amplification, aliquots of the reaction mixtures were analyzed by electrophoresis on a 2.0% agarose gel, blotted onto a nylon filter, and hybridized with exon-specific oligonucleotide probes as described below.
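The primer coordinates given above can be checked with a few lines of Python. This is only an illustrative sketch, not part of the original protocol: the function names are hypothetical, and the only inputs taken from the text are the two primer sequences and their nucleotide positions in Fig. 3. It confirms that each primer is 20 nt long, derives the sense-strand sequence that primer B anneals to, and computes the size of the full-length product spanning nt 709-1197.

```python
# Primer sequences and coordinates as reported in the text (numbering of Fig. 3).
PRIMER_A = "CAGACTCAGAAGCCTCACAG"   # sense primer, nt 709-728
PRIMER_B = "TCCCTCTACAAACATACCAG"   # antisense primer, complementary to nt 1178-1197

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def amplicon_length(first_nt: int, last_nt: int) -> int:
    """Length of a product covering first_nt..last_nt inclusive."""
    return last_nt - first_nt + 1


# Sense-strand sequence of nt 1178-1197 implied by primer B.
primer_b_target = reverse_complement(PRIMER_B)

# Product expected when exons 6-8 are all present between the primers.
full_length_product = amplicon_length(709, 1197)

if __name__ == "__main__":
    print(primer_b_target, full_length_product)
    print(round(gc_fraction(PRIMER_A), 2), round(gc_fraction(PRIMER_B), 2))
```

The computed full-length product agrees with the 489-bp band assigned to the transcript containing exons 6-8 in the RT-PCR analysis.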


Figure 3: Combined nucleotide sequences of the mouse pro-alpha2(XI) cDNAs, pRAC2-28 and pRAC1-15, and of the 5` portion of alpha2(XI) genomic clones as well as the amino acid sequence of the conceptual translation product. Capital letters indicate the coding sequences, whereas lowercase letters signify noncoding sequences. The numbering of nucleotides and amino acid residues begins with the start of the putative signal sequence. The third and fourth rows show human and bovine amino acid sequences (32, 47). Identical amino acid residues are indicated by short lines. Dots indicate a stretch of the human sequence which is not available. Asterisks indicate sequences absent in the human cDNA. Vertical lines mark the beginning and end of the human and bovine sequences. The positions of exon-intron splice junctions are indicated by triangles above the nucleotide sequence. The open arrow represents the putative signal peptidase cleavage site. The amino-terminal triple helical domain and part of the major triple helical domain are shown by boxed sequences. The alternatively spliced domain encoded by exons 6-8 is shown by a dashed box. Putative N-proteinase cleavage sites in the mouse sequence are underlined. Imperfections in the Gly-Xaa-Yaa triplet structure are underlined by double lines. The cysteine residues are indicated by vertical arrows, and the putative tyrosine sulfation sites are circled.



Hybridization Probes

Oligonucleotide probes specific for exon 6 (oligo 6, 5` GGCTGTCCCCGTAGTCATCA 3`), exon 7 (oligo 7, 5` AGGAATGGCAAGGGACTGGA 3`), and exon 8 (oligo 8, 5` GAGGGCAGGGCCAAGCTCGGTCTCCTCACG 3`) were used to detect alternatively spliced pro-alpha2(XI) mRNAs. An 18-nt oligonucleotide probe (oligo 5/9, 5` GTGGGCAACCTGCTTCTG 3`) corresponding to 9 nt at the 5` end of exon 9 and 9 nt at the 3` end of exon 5 was used to detect pro-alpha2(XI) collagen mRNA with deletion of exons 6-8 (i.e. with exon 5 connected to exon 9). For Northern blotting, exon-specific oligonucleotide probes were 3` end-labeled with [alpha-P]dATP using terminal deoxynucleotidyl transferase and hybridized in Rapid-hyb buffer (Amersham) according to the manufacturer's instructions. The insert from a mouse pro-alpha2(XI) cDNA clone was also used as a probe after random primer labeling (Megaprime DNA labeling system, Amersham). Southern hybridization of the RT-PCR products was performed using an ECL 3`-oligolabeling and detection system (Amersham) according to the manufacturer's instructions.


RESULTS

Identification of Genomic Clones

Four positive genomic clones were selected with the previously characterized cDNA clone coding for the human alpha2(XI) collagen chain (9). Two of them (NT1 and NT4) were found to be identical by restriction mapping (Fig. 1). Partial nucleotide sequences obtained from subclones of NT1 revealed that they contained exons coding for triple helical sequences (Gly-Xaa-Yaa triplets). The sequence of 24 exons in NT1 covering 1455 bp was initially determined referring to consensus splice junction sequences and by comparison with the reported sequences of other collagen chains. Among these, 18 exons coded for triple-helical domains and 17 of them were 54 or 108 bp long (data not shown), indicating a fibrillar collagen motif (27). A comparison of the deduced amino acid sequence with the known sequences for human fibrillar collagens demonstrated the highest similarity with the human alpha2(XI) chain (94%), whereas the similarity with other human fibrillar collagens, including alpha1(II), alpha1(V), alpha2(V), and alpha1(XI), was relatively low. In addition, the previously reported partial sequence of the mouse alpha2(XI) gene (28) was contained in clone NT1. Therefore, we concluded that NT1 was part of the mouse alpha2(XI) procollagen gene (col11a-2). Using subclones of NT1 as probes, we then isolated three overlapping genomic clones (NT6, NT8, and NT11) that collectively spanned 40 kb of mouse genomic DNA (Fig. 1). Sequencing of more than 37 exons suggested that the clones contained the entire col11a-2 sequence, which is localized within the major histocompatibility complex (28). NT6 and NT8 also contained part of the retinoid X receptor beta gene located upstream of col11a-2.(^2) The region of col11a-2 encoding the N-propeptide was completely sequenced and is described below in detail.


Figure 1: Genomic clones coding for mouse pro-alpha2(XI) collagen and the partial exon-intron structure of the gene. The positions of the genomic clones are shown at the top. B and S are restriction sites for BamHI and SalI, respectively. The locations of the identified exons (vertical lines) in the gene are shown below the restriction map. Dashed lines indicate the region which has not been sequenced. Exons coding for the major triple helical domain are bracketed.



Isolation of cDNA Clones Encoding the N-propeptide and 5`-Untranslated Portion of alpha2(XI) mRNA

Using the RACE protocol, we obtained several cDNAs which differed in size by agarose gel electrophoresis. Since a cDNA band of approximately 1,200 bp was the most abundant, it was first recovered from the gel and subcloned for sequencing. Nucleotide sequencing of the insert of a cDNA clone, pRAC2-28, demonstrated that it contained an open reading frame starting after 182 nucleotides of the 5`-untranslated sequence (Fig. 2 and Fig. 3). Following the ATG codon was a stretch of 27 amino acid residues that is typical of eukaryotic signal peptides (29). The 3` end of the pRAC2-28 insert contained the sequence of specific primer 2 located at the 3` end of the amino-terminal triple helical domain. Examination of the combined nucleotide sequence of cDNA clone pRAC2-28 and the genomic subclone from NT1 indicated that the distance between the putative signal peptidase cleavage site and the beginning of the major triple helical domain was 352 amino acid residues.

Since several minor 5`-RACE products were present in addition to the major cDNA clone, pRAC2-28, some of them were also subcloned and sequenced. Analysis of the longest cDNA clone, pRAC1-15, showed that it had exactly the same 5` and 3` sequences as pRAC2-28 (Fig. 2). However, pRAC1-15 also contained a region of 321 bp which was not present in pRAC2-28. These 321 nucleotides (nt 799-1119 in Fig. 3) found in pRAC1-15 did not change the reading frame and coded for an additional 107 amino acid residues. This additional sequence in the N-propeptide contained multiple tyrosine residues and was highly acidic, with a theoretical pI value of 3.1. Among the 13 tyrosine residues found in this region, at least 4 were embedded within sequences which fulfilled the consensus features for a tyrosine sulfation site (30). The inclusion of this domain made the configuration of the pro-alpha2(XI) N-propeptide quite similar to that of pro-alpha1(XI) or pro-alpha1(V) (10, 11, 31). Examination of the nucleotide-derived structure indicated that the longer form of mouse pro-alpha2(XI) N-propeptide was divided into three consecutive domains: a long globular domain, an interrupted collagenous domain (amino-terminal triple helical domain), and a short nonhelical segment (amino-terminal telopeptide). In addition, as noted for the pro-alpha1(XI) and pro-alpha1(V) chains, the globular domain of the pro-alpha2(XI) N-propeptide was divided into two subdomains, which were an upstream basic subdomain (theoretical pI = 10.2) containing 4 cysteine residues and a downstream highly acidic subdomain rich in tyrosine (Fig. 2 and Fig. 3). These results indicated that there are at least two distinct populations of mouse pro-alpha2(XI) collagen mRNA. The major transcript encodes a 352-amino acid-long N-propeptide, whereas the minor transcript encodes a 459-amino acid-long N-propeptide containing a highly acidic subdomain.
Comparison of the mouse pro-alpha2(XI) N-propeptide sequence with that of humans (32) revealed that the human sequence contained a similar 21-amino acid coding region located within the acidic subdomain encoded by pRAC1-15 (Fig. 3) and absent from pRAC2-28.

Mouse pro-alpha2(XI) collagen contains 11 putative N-proteinase cleavage sites (Ala-Gln or Pro-Gln), and some of them have no counterparts in the human or bovine sequences (Fig. 3). Although Ala-Ala-Gln at positions 468-470 most closely resembled the conserved sequences of fibrillar procollagen chains (11, 33), it lacked an associated phenylalanine at position -3, which is suggested to be critical for the action of N-proteinase (33).

Exon Structure and Alternative Splicing

The 321-bp difference between the two cDNA clones, pRAC1-15 and pRAC2-28, suggested the occurrence of alternative splicing of the primary gene transcript. Therefore, we analyzed the exons encoding the N-propeptide of alpha2(XI) collagen. Comparison of the nucleotide sequences of genomic clones NT1 and NT8 with that of alpha2(XI) cDNAs showed that the genomic clones contained exons encoding the alpha2(XI) N-propeptide. This region of the gene contained 14 exons (Fig. 2). Exon 1 encodes the 5`-untranslated region and the signal peptide. Exons 2-5 encode a basic domain containing 4 cysteine residues. The adjacent three 3` exons, 6, 7, and 8, encode an acidic domain rich in tyrosine residues. Then the amino-terminal triple helical domain is encoded by exons 9-13. As is the case for other fibril-forming collagen genes, the most 3` exon (exon 14) of the N-propeptide is a junction exon that contains coding information for both the amino-terminal telopeptide and the beginning of the major triple helical sequences. The 321-bp sequence in the longest cDNA clone, pRAC1-15, was thus encoded by three exons (exons 6, 7, and 8) of 78, 63, and 180 bp, respectively (Fig. 3). The coding sequence within exon 7 is also found in the reported human alpha2(XI) cDNA clone (32). However, sequences corresponding to exons 6 and 8 are not found in the human cDNA. These findings, together with the presence of a 3-exon sequence difference between the two mouse cDNAs, raised the possibility that still other alternative splicing events may occur in the primary transcript of col11a-2.

To detect the expression of each of the alternatively spliced exons, Northern blot analysis was performed. As shown in Fig. 4, the insert of the cDNA clone (pRAC2-28) hybridized to an RNA band migrating around 6.0-6.4 kb. Oligonucleotide probes specific for exons 6 and 8 also hybridized to bands that were almost identical with the band obtained using the cDNA probe. The probe specific for exon 7 did not show a clear positive signal under these hybridization and washing conditions. These results suggest that at least exons 6 and 8 were present in part of the mouse pro-alpha2(XI) transcripts. However, the size difference that was predicted to be contributed by exons 6-8 could not be resolved into separate bands by this agarose gel electrophoresis method.


Figure 4: Northern blot hybridization of pro-alpha2(XI) cDNA and exon-specific probes. The probes used were the random primer-labeled insert of clone pRAC2-28 and 3` end-labeled oligonucleotides (see ``Experimental Procedures'' and Fig. 5A for the positions of the exon-specific probes). Each lane contained approximately 3 µg of poly(A) RNA isolated from whole skeletal tissue. The positions of the RNA size markers are indicated on the left in kilobases.




Figure 5: RT-PCR analysis of alternative splicing of exons 6-8 of the pro-alpha2(XI) gene. First-strand cDNA prepared from the indicated tissues was subjected to the PCR using primers A and B (A). The amplification products were separated on 2% agarose gels and stained with ethidium bromide (B). Control PCR products of glyceraldehyde-3-phosphate dehydrogenase (G3PDH) are shown below B. C, Southern blots of the PCR products hybridized with oligonucleotide probes (see ``Experimental Procedures''). Probes specific for exons 5/9, 6, 7, and 8 were used to identify combinations of alternatively spliced exons. The predicted sizes of the PCR products with various exon combinations are shown on the left of the panel. Each designation represents the splice variant containing the indicated exons. Possible splice variants that may represent low level aberrant splicing are indicated in brackets. p.c., postcoitum.



RT-PCR Analysis of Alternative Splicing Affecting the N-propeptide of alpha2(XI) Collagen in Various Tissues

In order to detect individual mRNA populations and also to study whether there were tissue-specific differences in alternative splicing affecting the N-propeptide of alpha2(XI) collagen, total RNA isolated from various mouse tissues was analyzed by the RT-PCR. The region of interest was amplified using the 5` primer (primer A) corresponding to sequences in exon 5 and the 3` primer (primer B) corresponding to the complementary sequences bridging exons 9 and 10 (Fig. 5A). Several bands were detected in most of the RNAs analyzed (Fig. 5B). The farthest migrating band (168 bp) corresponds to RNA lacking exons 6-8 (designated as E59), whereas the band (489 bp) closest to the origin corresponds to RNA with these exons (designated as E56789; the underlining indicates the presence of alternatively spliced exons). These results indicated that the major mRNA species is E59, which is found mostly in cartilage. A low level of E59 was also detected in most other tissues of 3-week-old mice. Since the presence of several bands migrating between 168 and 489 bp suggested that there were also various other combinations of alternatively spliced exons, we performed Southern hybridization of the RT-PCR products with an oligonucleotide probe spanning exons 5/9 and probes specific for exons 6, 7, and 8. The results confirmed that most of the col11a-2 transcripts in cartilage lacked exons 6-8 (Fig. 5, B and C). In addition, various combinations of exons 6-8 could be detected in various tissues. The longest transcript (E56789; 489-bp bands that were positive with all of the oligo 6, 7, and 8 probes) containing exons 6-8 was detected in most tissues other than cartilage (Fig. 5C). E569 was also detected in various tissues and was most predominant in RNA isolated from the brain. These results indicate that E59, E569, and E56789 are major combinations of alternatively spliced exons in a variety of tissues. Furthermore, E589 seems to be present in RNA from the calvaria.
Other splice variants (such as E5689 and E5789) migrating between E59 and E56789 could also be present and are indicated in brackets in Fig. 5C. However, most of these amplification products could not be easily seen in the agarose gel and may represent low level aberrant splicing variants. When we examined the expression of col11a-2 in the developing forelimb bud (10-14 days postcoitum), the E59 level was found to increase markedly along with the progression of chondrogenesis (Fig. 5, B and C). The level of E569 also increased up to day 12 and decreased thereafter.
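The band sizes expected for each exon combination follow directly from the numbers reported in this paper: the 168-bp product for E59 and the sizes of exons 6, 7, and 8 (78, 63, and 180 bp). The short Python sketch below enumerates all eight possible combinations under that simple additive assumption. The function and variable names are illustrative only, and the sketch ignores the auxiliary splice site in exon 8, which would shorten a product by 6 bp.

```python
from itertools import combinations

# Values taken from the text: size of the product lacking exons 6-8 (E59)
# and the lengths of the three alternatively spliced exons.
E59_PRODUCT_BP = 168
EXON_BP = {6: 78, 7: 63, 8: 180}


def variant_name(included_exons) -> str:
    """Build a designation such as 'E569' from the included cassette exons."""
    return "E5" + "".join(str(e) for e in sorted(included_exons)) + "9"


def predicted_products() -> dict:
    """Map each possible exon 6-8 combination to its predicted product size (bp)."""
    products = {}
    for n_included in range(len(EXON_BP) + 1):
        for combo in combinations(sorted(EXON_BP), n_included):
            size = E59_PRODUCT_BP + sum(EXON_BP[e] for e in combo)
            products[variant_name(combo)] = size
    return products


if __name__ == "__main__":
    for name, size in sorted(predicted_products().items(), key=lambda kv: kv[1]):
        print(f"{name:8s} {size} bp")
```

Under these assumptions several intermediate products differ by only 15 bp (for example E579 at 231 bp and E569 at 246 bp), which is consistent with the need for exon-specific probes to assign bands on a 2% agarose gel.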

Proof of the identity of some of the PCR products (E59, E569, E589, E5689, and E56789) was obtained by subcloning and DNA sequencing of the amplification products (data not shown), and the exon-intron boundaries were further confirmed by comparison with the gene sequences (Fig. 6). The sequences at the 5` and 3` splice sites conformed well with the general splice consensus sequences (34, 35). An auxiliary splice site located 6 nt downstream from the 5` end of exon 8 was also used in some of the subclones from E589 (Fig. 6). Between 15 and 40 nt upstream from the 3` cleavage site of each intron, putative lariat branch point sequences were found (Fig. 6) (see Refs. 34 and 35 for review). The pattern of divergence from the consensus sequence, as such, did not seem to correlate with alternative splicing of exons 6-8.
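A search for putative branch points of the kind described above can be sketched as a simple pattern scan. The following Python example is an illustration under stated assumptions, not the procedure used in this study: it assumes the commonly cited mammalian branch point consensus ynyurAy (written with t for DNA; the exact consensus shown in Fig. 6 is not reproduced in the text), it measures distance as the number of intron nucleotides downstream of the branch adenosine, and the intron sequence in the usage example is invented.

```python
import re

# Assumed consensus: y n y t r A y (y = pyrimidine, r = purine, n = any base).
# A lookahead is used so that overlapping candidate motifs are all reported.
BRANCH_POINT = re.compile(r"(?=([ct][acgt][ct]t[ag]a[ct]))")


def find_branch_points(intron: str, min_dist: int = 15, max_dist: int = 40):
    """Return (distance, motif) pairs for consensus matches in the 3' window.

    distance is the number of intron nucleotides lying downstream of the
    branch adenosine (the sixth base of the motif).
    """
    intron = intron.lower()
    hits = []
    for match in BRANCH_POINT.finditer(intron):
        a_index = match.start(1) + 5
        distance = len(intron) - 1 - a_index
        if min_dist <= distance <= max_dist:
            hits.append((distance, match.group(1)))
    return hits


if __name__ == "__main__":
    # Invented toy intron: gt ... branch motif ... pyrimidine tract ... ag
    toy_intron = "gt" + "a" * 30 + "ctctaac" + "t" * 20 + "ag"
    print(find_branch_points(toy_intron))
```

Because the consensus is short and degenerate, such a scan will typically return several candidates per intron, so matches are best treated as putative sites, as in the text.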


Figure 6: Nucleotide sequences at the alternatively spliced exon-intron boundaries. Exon sequences are indicated by capital letters and intron sequences by lowercase letters. The consensus sequences of the 5` and 3` splice site and branch point (34, 35) are shown at the top. The putative lariat branch point sequences located 15-40 nucleotides upstream from the 3` cleavage sites of the introns are underlined. The six nucleotides at the 5` end of exon 8 were spliced out in some subclones of the E589 variant (denoted by italics). r, purine; y, pyrimidine; n, any base.




DISCUSSION

We isolated mouse genomic clones containing the entire col11a-2 and sequenced approximately 70% of its coding region, including the complete N-propeptide coding exons. Comparison of the gene sequence with cDNAs indicated that the mouse pro-alpha2(XI) N-propeptide was encoded by 14 exons. In addition, the results provided evidence that there are complex alternatively spliced forms of the mouse pro-alpha2(XI) collagen mRNA. At least four combinations of exons 6-8 (E59, E569, E56789, and E589) have been identified. Depending on the combination of these exons, the length of the alpha2(XI) N-propeptide could range from 352 to 459 amino acid residues. Inclusion of each of these exons not only changes the length but seems to dramatically affect the structure and nature of the N-propeptide, since each of the alternatively spliced exons encodes a highly acidic domain.
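The 352-459 residue range can be reproduced from the exon sizes reported under Results. The Python sketch below assumes only that each cassette exon adds its length divided by three to the 352-residue short form; it checks that every exon is a whole number of codons (so that any combination preserves the reading frame) and lists the predicted N-propeptide length for each combination. The names used are hypothetical.

```python
from itertools import combinations

SHORT_N_PROPEPTIDE_AA = 352           # form encoded by pRAC2-28 (exons 6-8 absent)
EXON_BP = {6: 78, 7: 63, 8: 180}      # exon sizes reported under Results


def codons(exon_bp: int) -> int:
    """Number of codons in an exon; raises if the exon would shift the frame."""
    if exon_bp % 3 != 0:
        raise ValueError("exon length is not a multiple of 3")
    return exon_bp // 3


def n_propeptide_lengths() -> dict:
    """Predicted N-propeptide length (residues) for every exon 6-8 combination."""
    lengths = {}
    for n_included in range(len(EXON_BP) + 1):
        for combo in combinations(sorted(EXON_BP), n_included):
            name = "E5" + "".join(str(e) for e in combo) + "9"
            extra = sum(codons(EXON_BP[e]) for e in combo)
            lengths[name] = SHORT_N_PROPEPTIDE_AA + extra
    return lengths


if __name__ == "__main__":
    for name, aa in sorted(n_propeptide_lengths().items(), key=lambda kv: kv[1]):
        print(f"{name:8s} {aa} residues")
```

For the four combinations identified here, this gives 352 (E59), 378 (E569), 412 (E589), and 459 (E56789) residues.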

Although the longest alpha2(XI) mRNA shows relatively low expression, it seems to encode an N-propeptide with a domain structure very similar to those of pro-alpha1(XI) and pro-alpha1(V) collagens. In this context, the pro-alpha1(XI), pro-alpha2(XI), and pro-alpha1(V) chains appear to constitute a distinct subclass within the type V/XI collagen family. However, as reported previously, the primary structure of each of the domains in the N-propeptide is not equally similar between these collagen chains (31). The primary structure of the acidic subdomain in the N-propeptide differs among the pro-alpha1(V), pro-alpha1(XI), and pro-alpha2(XI) chains in contrast to the similarities found in other regions. The occurrence of alternative splicing of the pro-alpha2(XI) chain generates further diversity and may increase the potential functional versatility of the N-propeptide subdomain.

Many genes are known to produce alternatively spliced mRNAs that each encode a different protein (36). In some instances, alternative splicing produces several forms of a protein that are necessary at different times in different tissues. Alternative splicing is also known to occur in collagen genes such as alpha1(II), alpha3(IV), alpha2(VI), alpha3(VI), and alpha1(XIII) (37, 38, 39, 40, 41, 42, 43). Our present findings thus add another example of alternative splicing in collagen genes. As demonstrated by RT-PCR analysis, complex alternative splicing of pro-alpha2(XI) collagen occurred in various tissues. Pro-alpha2(XI) mRNA without exons 6-8 was more abundant in cartilage, and mRNAs including each of these exons in various combinations were mostly found in non-cartilaginous tissues. It has been demonstrated previously that the pro-alpha1(II) mRNA, including exon 2, which codes for a cysteine-rich globular domain, is observed in non-cartilaginous tissues and prechondrogenic cells, whereas the mRNA without exon 2 is localized in cartilage and cells with a chondrogenic phenotype (40, 44, 45). The structures of the alternatively spliced domains in the N-propeptides of pro-alpha1(II) and pro-alpha2(XI) are not homologous. However, considering that procollagen alpha1(II) and alpha2(XI) mRNAs are coordinately expressed in many tissues (46) and that the pro-alpha1(II) gene also codes for the alpha3 chain of type XI collagen, it is tempting to assume that similar tissue-specific splicing mechanisms act on both of these mRNAs. We also speculate that the alpha1 chain of type XI collagen could undergo similar alternative splicing, since the domain structure (and possibly the gene structure) of the alpha1(XI) and alpha2(XI) N-propeptides is highly conserved.

Procollagen type XI has been suggested to undergo two-step proteolytic processing during its conversion to the matrix form, with the first step removing the carboxyl propeptides and the second step involving cleavage within the N-propeptides. Rotary shadowing data (20) have revealed a substructure within the N-propeptide domain consisting of a hinge, followed by a short rod-like stretch in the matrix form of type XI collagen. It is therefore unlikely that the putative N-proteinase cleavage sites found in the amino-terminal telopeptide of the alpha2(XI) chain and the alpha1(XI) chain (11) are utilized, since the entire N-propeptide would be removed if such a site were cleaved. It is more likely that cleavage of the alpha2(XI) propeptide occurs upstream of the amino-terminal triple helical domain. In support of this, the basic domain of the alpha2(XI) N-propeptide has been recovered as a continuous peptide (proline/arginine-rich domain; amino acid positions 28-245 in Fig. 3) from cartilage as a product of processing (47). It is therefore possible that Pro-Gln at positions 258-259 or 263-264 is the site cleaved by N-proteinase. Alternatively, yet another procollagen peptidase may be necessary for processing of the N-propeptide of the alpha2(XI) collagen chain.

The importance of these alternatively spliced domains of the alpha2(XI) N-propeptide remains unclear. However, it should be remembered that the alternatively spliced tyrosine-rich acidic subdomain corresponds to a region in pro-alpha1(V) collagen that has been implicated in the regulation of heterotypic type I + V collagen fibrillogenesis. This region of the alpha1(V) chain remains after procollagen processing and projects away from the major triple-helical axis with a short amino-terminal triple helical arm (17). Accordingly, the acidic subdomain resides on the surface of fibrils and may sterically prevent further deposition of collagen molecules (16, 17). In the type II + XI collagen fibrils, the persisting N-propeptide region of type XI collagen seems to play a homologous role (5, 20, 21). We propose that the differential expression of the acidic subdomain in the pro-alpha2(XI) collagen plays an additional regulatory role in type II + XI collagen fibrillogenesis. If such a domain were, at least transiently, expressed on the surface of the fibrils, it would modulate interactions with other molecules.


FOOTNOTES

*
This work was supported in part by Scientific Research Grant 06454428 from the Ministry of Education, Science and Culture of Japan. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number D38412.

§
To whom correspondence should be addressed: Dept. of Orthopaedic Surgery, Osaka University Medical School, 2-2 Yamada-oka, Suita City, Osaka 565, Japan. Tel.: 81-6-879-3552; Fax: 81-6-879-3559.

(^1)
The abbreviations used are: N-propeptide, the amino-terminal propeptide of the procollagen molecule; RACE, rapid amplification of cDNA ends; PCR, polymerase chain reaction; RT-PCR, reverse transcription-PCR; nt, nucleotide(s); bp, base pair(s); kb, kilobase pair(s); col11a-2, the gene encoding murine pro-alpha2(XI) collagen; N-proteinase, the enzyme that specifically cleaves the N-propeptide.

(^2)
N. Tsumaki and T. Kimura, manuscript in preparation.


ACKNOWLEDGEMENTS

We are very grateful to Dr. Toshihiko Hayashi for helpful discussions regarding this manuscript.


REFERENCES

  1. Eyre, D. R., Apon, S., Wu, J.-J., Ericsson, L. H., and Walsh, K. A. (1987) FEBS Lett. 220, 337-341
  2. van der Rest, M., and Mayne, R. (1988) J. Biol. Chem. 263, 1615-1618
  3. Vaughan, L., Mendler, M., Huber, S., Bruckner, P., Winterhalter, K. H., Irwin, M. I., and Mayne, R. (1988) J. Cell Biol. 106, 991-997
  4. Mendler, M., Eich-Bender, S. G., Vaughan, L., Winterhalter, K. H., and Bruckner, P. (1989) J. Cell Biol. 108, 191-197
  5. Petit, B., Ronzière, M. C., Hartmann, D. J., and Herbage, D. (1993) Histochemistry 100, 231-239
  6. Morris, N. P., and Bächinger, H. P. (1987) J. Biol. Chem. 262, 11345-11350
  7. Furuto, D. K., and Miller, E. J. (1983) Arch. Biochem. Biophys. 226, 604-611
  8. Bernard, M., Yoshioka, H., Rodriguez, E., van der Rest, M., Kimura, T., Ninomiya, Y., Olsen, B. R., and Ramirez, F. (1988) J. Biol. Chem. 263, 17159-17166
  9. Kimura, T., Cheah, K. S. E., Chan, S. D. H., Lui, V. C. H., Mattei, M.-G., van der Rest, M., Ono, K., Solomon, E., Ninomiya, Y., and Olsen, B. R. (1989) J. Biol. Chem. 264, 13910-13916
  10. Takahara, K., Sato, Y., Okazawa, K., Okamoto, N., Noda, A., Yaoi, Y., and Kato, I. (1991) J. Biol. Chem. 266, 13124-13129
  11. Yoshioka, H., and Ramirez, F. (1990) J. Biol. Chem. 265, 6423-6426
  12. Brown, K. E., Lawrence, R., and Sonenshein, G. E. (1991) J. Biol. Chem. 266, 23268-23273
  13. Nah, H.-D., Barembaum, M., and Upholt, W. B. (1992) J. Biol. Chem. 267, 22581-22586
  14. Niyibizi, C., and Eyre, D. R. (1989) FEBS Lett. 242, 314-318
  15. Mayne, R., Brewton, R. G., Mayne, P. M., and Baker, J. R. (1993) J. Biol. Chem. 268, 9381-9386
  16. Birk, D. E., Fitch, J. M., Babiarz, J. P., and Linsenmayer, T. F. (1988) J. Cell Biol. 106, 999-1008
  17. Linsenmayer, T. F., Gibney, E., Igoe, F., Gordon, M. K., Fitch, J. M., Fessler, L. I., and Birk, D. E. (1993) J. Cell Biol. 121, 1181-1189
  18. Fessler, J. H., and Fessler, L. I. (1987) in Structure and Function of Collagen Types (Mayne, R., and Burgeson, R. E., eds) pp. 81-103, Academic Press, Orlando, FL
  19. Birk, D. E., Fitch, J. M., Babiarz, J. P., Doane, K. J., and Linsenmayer, T. F. (1990) J. Cell Sci. 95, 649-657
  20. Thom, J. R., and Morris, N. P. (1991) J. Biol. Chem. 266, 7262-7269
  21. Morris, N. P., Keene, D. R., and Oxford, J. R. T. (1994) Trans. 40th Annu. Meet. Orthop. Res. Soc. 19, 423
  22. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  23. Frohman, M. A. (1990) in PCR Protocols: A Guide to Methods and Applications (Innis, M. A., Gelfand, D. H., Sninsky, J. J., and White, T. J., eds) pp. 28-38, Academic Press, New York
  24. Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J., and Rutter, W. J. (1979) Biochemistry 18, 5294-5299
  25. Nakata, K., Nakahara, H., Kimura, T., Kojima, A., Iwasaki, M., Caplan, A. I., and Ono, K. (1992) FEBS Lett. 299, 278-282
  26. Sabath, D. E., Broome, H. E., and Prystowsky, M. B. (1990) Gene (Amst.) 91, 185-191
  27. Yamada, Y., Avvidimento, V. E., Mudryj, M., Ohkubo, H., Vogeli, G., Irani, M., Pastan, I., and de Crombrugghe, B. (1980) Cell 22, 887-892
  28. Stubbs, L., Lui, V. C. H., Ng, L. J., and Cheah, K. S. E. (1993) Mammal. Genome 4, 95-103
  29. Perlman, D., and Halvorson, H. O. (1983) J. Mol. Biol. 167, 391-409
  30. Huttner, W. B. (1987) Trends Biochem. Sci. 12, 361-363
  31. Greenspan, D. S., Cheng, W., and Hoffman, G. G. (1991) J. Biol. Chem. 266, 24727-24733
  32. Zhidkova, N. I., Brewton, R. G., and Mayne, R. (1993) FEBS Lett. 326, 25-28
  33. Morikawa, T., Tuderman, L., and Prockop, D. J. (1980) Biochemistry 19, 2646-2650
  34. Green, M. R. (1991) Annu. Rev. Cell Biol. 7, 559-599
  35. Balvay, L., Libri, D., and Fiszman, M. Y. (1993) Bioessays 15, 165-169
  36. Breitbart, R. E., Andreadis, A., and Nadal-Ginard, B. (1987) Annu. Rev. Biochem. 56, 467-495
  37. Pihlajaniemi, T., Myllylä, R., Seyer, J., Kurkinen, M., and Prockop, D. J. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 940-944
  38. Chu, M.-L., Pan, T.-C., Conway, D., Kuo, H.-J., Glanville, R. W., Timpl, R., Mann, K., and Deutzmann, R. (1989) EMBO J. 8, 1939-1946
  39. Doliana, R., Bonaldo, P., and Colombatti, A. (1990) J. Cell Biol. 111, 2197-2205
  40. Ryan, M. C., and Sandell, L. J. (1990) J. Biol. Chem. 265, 10334-10339
  41. Saitta, B., Stokes, D. G., Vissing, H., Timpl, R., and Chu, M.-L. (1990) J. Biol. Chem. 265, 6473-6480
  42. Metsäranta, M., Toman, D., de Crombrugghe, B., and Vuorio, E. (1991) J. Biol. Chem. 266, 16862-16869
  43. Feng, L., Xia, Y., and Wilson, C. B. (1994) J. Biol. Chem. 269, 2342-2348
  44. Sandell, L. J., Morris, N., Robbins, J. R., and Goldring, M. B. (1991) J. Cell Biol. 114, 1307-1319
  45. Nah, H.-D., and Upholt, W. B. (1991) J. Biol. Chem. 266, 23446-23452
  46. Sandberg, M. M., Hirvonen, H. E., Elima, K. J. M., and Vuorio, E. I. (1993) Biochem. J. 294, 595-602
  47. Neame, P. J., Young, C. N., and Treep, J. T. (1990) J. Biol. Chem. 265, 20401-20408
