(Received for publication, May 15, 1996, and in revised form, December 26, 1996)
From the Institute for Molecular Science of Medicine, Aichi Medical University, Nagakute, Aichi 480-11, Japan
We investigated the occurrence of alternatively spliced forms (V0, V1, V2, and V3) of PG-M/versican, a large chondroitin sulfate proteoglycan, in developing chicken retinas using the reverse transcription-polymerase chain reaction. We characterized the PLUS domain, which is apparently unique to the chicken molecule and is regulated by alternative splicing. PG-M in chicken retinas consisted of four forms with (V0, V1, V2, and V3) and two forms without (V1 and V3) the PLUS domain (PG-M+ and PG-M−, respectively). The four forms of PG-M+ were found in all samples examined, but the occurrence of the two PG-M− forms was regulated developmentally. Genomic analysis revealed that the PLUS and CS-α domains are encoded by a single exon, and this exon has an internal alternative 5′-splice donor site, allowing alternatively spliced forms that do not include the 3′-end of the exon. Sequences corresponding to the chicken PLUS domain (plus) were not found in mouse and human and may have disappeared during evolution. Sequence similarity suggests that the PLUS domain corresponds to the keratan sulfate attachment domain of aggrecan and that it has a distinct function in the chicken eye.
PG-M, a large chondroitin sulfate proteoglycan, is a major extracellular matrix molecule located in the mesenchymal cell condensation regions of developing chicken limb buds (1). Its expression, however, is regulated in an inverse relationship to that of aggrecan, and PG-M disappears after cartilage development (2). PG-M is also transiently expressed in various embryonic tissues during morphogenesis and differentiation (3). Therefore, PG-M may play some regulatory roles in many biological events.
Our cDNA studies on the core proteins of mouse PG-M revealed four mRNA species designated PG-M(V0), PG-M(V1), PG-M(V2), and PG-M(V3) in order of length (4, 5). All have hyaluronan-binding domains at the amino terminus and two epidermal growth factor (EGF)-like domains, a lectin-like domain, and a complement regulatory protein (CRP)-like domain at the carboxyl terminus. The amino- and carboxyl-terminal regions show binding activity for hyaluronan (1, 6) and a C-type lectin-like activity (7), respectively. However, the four forms have different chondroitin sulfate attachment regions in the middle of the core proteins. The differences are generated by alternative and simultaneous usage of the two different domains for the chondroitin sulfate attachment region (CS-α and CS-β).
Versican was first identified in human fibroblasts by a cDNA study (8, 9). Homology analysis of the deduced amino acid sequence demonstrated that versican corresponds to the core protein of PG-M(V1) (4). Other forms (V0, V2, and V3) of human versican have since been identified (5, 10).
Although there are four forms of mouse and human PG-M/versican, PG-M(V2) and PG-M(V3) have not yet been identified in chicken (11). We reported that the chondroitin sulfate attachment region of chicken PG-M(V1) is longer than that of mouse PG-M(V1) (4), suggesting an extra domain between the hyaluronan-binding domains and the CS-β domain in chicken PG-M. Whether this domain is encoded by a single exon and whether its expression is regulated by alternative splicing remain to be examined.
In this study, we investigated the occurrence of multiple forms (V0, V1, V2, and V3) of PG-M in the developing chicken retina and found alternative splicing for this domain, which we have named the "PLUS" domain. Although the significance of diverse alternative splicing for PG-M is not known, each form of PG-M may have a unique function in this developing organ. Because PG-M and aggrecan are structurally similar, the PLUS domain might be related to the keratan sulfate attachment domain of aggrecan. We discuss the evolutionary significance of this finding.
Total RNAs were obtained from whole eyes of chicken embryos (White Leghorn) on days 5, 7, and 9 (designated E5, E7, and E9, respectively). Total retinal RNA was obtained from chicken embryos on days 14 and 20 (designated E14 and E20, respectively) and from adult chicken. RNA was extracted using guanidinium thiocyanate (12).
cDNA Libraries
Human fetal brain, fetal liver, cerebral cortex, and skeletal muscle and mouse brain, embryonic stem cell, and skeletal muscle cDNA libraries were obtained commercially (CLONTECH, Palo Alto, CA).
RT-PCR Amplification
Primers for RT-PCR amplifications were chosen from the published sequences of chicken PG-M to detect the specific portion of each splicing form. Reverse transcription was performed using three antisense primers (see Fig. 1A, e, j, and m; and Table I) and SuperScript II RNase H− reverse transcriptase (Life Technologies, Inc.) as recommended by the manufacturer. The first PCR amplifications were carried out using pairs of outer primers (see Fig. 1A, a, d, f, i, and l; and Table I). The second PCR was performed using the first PCR products as templates and pairs of inner primers (see Fig. 1A, b, c, g, h, and k; and Table I). Conditions for PCR amplification were as described (5). We amplified chicken genomic DNAs (CLONTECH) using the LA PCR kit (Takara Biomedicals, Kyoto, Japan) as recommended by the manufacturer. The final products were resolved by electrophoresis on a 1.2% (w/v) agarose gel or on 2% (w/v) NuSieve (3:1; FMC Corp. BioProducts, Rockland, ME).
PCR products were purified with the EasyPrep PCR Product Prep kit (Pharmacia Biotech, Uppsala). Purified DNAs were sequenced as described (5). The sequencing primers were identical to those used for the above RT-PCR amplifications.
Sequence Similarity Analysis of the PLUS Domain
The sequence of the PLUS domain was compared with the data base compiled by the European Bioinformatics Institute using the GENETYX-MAC computer program (Software Development Co., Tokyo). The deduced amino acid sequence was compared with other protein sequences in the data base compiled by the National Biomedical Research Foundation and the European Bioinformatics Institute.
The second RT-PCR amplifications performed using the inner primer pairs on E14 and E20 retinal cDNAs and on retinal cDNAs from adult chicken (1-year-old) generated one or two products (Table I and Fig. 1, A and B). Two products indicated the presence of two transcripts, one with and one without an exon of ~400 nucleotides corresponding to the PLUS domain. The shorter transcripts without the exon were found in PG-M(V1) and PG-M(V3) of E14 retina and in PG-M(V1) of adult retina. PG-M+ and PG-M− refer to PG-M with and without the PLUS domain, respectively. Four forms (V0, V1, V2, and V3) of PG-M+ were detected in all retinas (Fig. 1B). However, the occurrence of PG-M− was developmentally regulated. PG-M−(V1) was detected in E14 and adult retinas (Fig. 1B, lanes 3 and 15), but not in E20 retina (lane 8). PG-M−(V3) was detected in E14 retina (Fig. 1B, lane 5), but not in E20 and adult retinas (lanes 10 and 17). Since the primer pair b and c gave only a band corresponding to the product containing the PLUS domain (Fig. 1B, lanes 6, 11, and 18), there were no forms in which the exon for the hyaluronan-binding region was directly spliced to that for the CS-α domain. A summary of the variation of PG-M forms expressed in E14, E20, and adult retinas is shown in Table II.
We further examined the PG-M forms in E5, E7, and E9 whole eyes using RT-PCR to determine the relevance of PG-M to the developmental stage. We also examined the presence of PG-M−(V0) and PG-M−(V2). Since the retinas of these early embryos were too small to isolate, we analyzed whole eyes. The results revealed the presence of all forms (V0, V1, V2, and V3) of PG-M+ and two forms (V1 and V3) of PG-M− (Fig. 1C). These expression profiles were the same as those in E14 retina. The variation of PG-M forms expressed in E5, E7, and E9 whole eyes is summarized in Table II.
We compared the DNA sequences of the PCR products of PG-M+ and PG-M−. The results revealed alternative splicing of the PLUS domain, which was located between the hyaluronan-binding B′ domain (nucleotide 1183) and the CS-α domain (nucleotide 1598) for PG-M+(V0) and PG-M+(V2), between the hyaluronan-binding B′ domain and the CS-β domain (nucleotide 4379) for PG-M+(V1), and between the hyaluronan-binding B′ domain and the EGF-like domain (nucleotide 9905) for PG-M+(V3) (Fig. 2A). The results also showed that the PLUS domain consisted of 414 nucleotides (nucleotides 1184-1597 for PG-M+(V0)) (Fig. 2B). New termination codons and shifts of reading frames were not identified in these junctional regions (Fig. 2A). A computer-assisted sequence similarity search for the PLUS domain in nucleic acid and protein data bases did not identify other genes with significant homology.
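The coordinate arithmetic behind these statements can be checked with a minimal sketch. It assumes only the nucleotide positions quoted above for PG-M+(V0); the function names are hypothetical and are not part of the original analysis. An insert whose length is a multiple of three leaves the downstream reading frame unchanged, which is consistent with the absence of frame shifts at the junctions.

```python
# Coordinates quoted in the text for PG-M+(V0); 1-based and inclusive.
PLUS_START = 1184
PLUS_END = 1597


def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def preserves_reading_frame(insert_length: int) -> bool:
    """An inserted or removed segment keeps the downstream reading frame
    only if its length is a multiple of three nucleotides."""
    return insert_length % 3 == 0


plus_length = inclusive_length(PLUS_START, PLUS_END)
print(plus_length)                           # 414
print(preserves_reading_frame(plus_length))  # True
```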
Location of the Exon Coding for the PLUS Domain in the Chicken PG-M Gene
To confirm the presence of the exon for the PLUS domain, which we named plus, in the chicken PG-M gene, we performed PCR studies on chicken genomic DNA. Primers specific to exons for the hyaluronan-binding B′ domain and the CS-α domain were used together with internal primers to the exon for the PLUS domain to determine the position of the plus exon in the PG-M gene. Analysis of the PCR products indicated that the exon was ~12 kilobases downstream of the exon for the B′ domain (Fig. 3A, lane 2), but adjacent to that for the CS-α domain (lane 4). We then sequenced the PCR product amplified with the primer pair p and q. The results showed no intron between the PLUS and CS-α domain-encoding sequences, suggesting that they are encoded by a single exon (Fig. 3B). We named this domain "PLUS-α." The nucleotide sequence of the boundary region between the two domains is shown in Fig. 3B. Although there was an exon terminus-like sequence (AAG) in the 3′-terminal portion of the PLUS domain and a splicing donor site-like sequence in the 5′-terminal portion of the CS-α domain (Fig. 3B, boldface letters), there was no typical acceptor site-like sequence in the 3′-terminal portion of the PLUS domain. Such a sequence causes similar alternative splicing in other genes and is termed an internal alternative 5′-splice donor site (13-15).
Absence of the PLUS Domain in Human and Mouse cDNA Libraries
Sequences corresponding either to an internal alternative 5′-splice donor site or to the PLUS domain in human or mouse PG-M/versican have not been described (16, 17). To confirm that there is no PLUS domain in human and mouse PG-M, we amplified the relevant cDNAs from several cDNA libraries of various human and mouse tissues using appropriate primers (Table I and Fig. 4). The products showed a single band or no band (Fig. 4), confirming that cDNAs for the PLUS domain were absent in those cDNA libraries (no PG-M+ forms). Although cDNA libraries of mouse and human retinas were not examined, cDNAs for PG-M+ were found in cDNA libraries of various chicken tissues corresponding to those of the human and mouse tissues tested in the above experiment. Therefore, the PLUS domain may be unique to chicken PG-M.
This study revealed that there is an internal alternative 5′-splice donor site at the boundary of the PLUS and CS-α domains in the exon for the PLUS-α domain (Fig. 3B). The absence of a typical 3′-splice acceptor site at the boundary is the reason why the V0 and V2 forms of PG-M− are absent (Fig. 5). The exon for the PLUS-α domain functions as a single exon in the expression of the V0 and V2 forms of PG-M+, but this exon is spliced out in the expression of the V1 and V3 forms of PG-M− (Fig. 5). An internal alternative 5′-splice donor site in the exon for the CS-α domain functions like the beginning of an intron ("pseudo intron") in the expression of the V1 and V3 forms of PG-M+. During the splicing, the plus sequence remains as an exon for the V1 and V3 forms of PG-M+ (Fig. 5). Since we found that PG-M+(V1) is the major form of PG-M in 10-day chicken embryonic fibroblasts (11), this splicing event does not seem to be rare.
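The splicing logic described above can be illustrated with a small enumeration. This is a simplified sketch under stated assumptions, not a model taken from the original analysis: it considers only the central region of the core protein, treats the PLUS-α exon as having three possible outcomes (fully included, cut at the internal 5′-splice donor site, or skipped), and treats the CS-β exon as independently included or skipped. The dictionary and function names are hypothetical. Because no outcome yields CS-α without PLUS, the PG-M−(V0) and PG-M−(V2) forms never appear.

```python
from itertools import product

# Hypothetical outcomes for the PLUS-alpha exon, following the model in the text.
PLUS_ALPHA_OUTCOMES = {
    "full_exon": ["PLUS", "CS-alpha"],  # donor at the 3' end of the exon is used
    "internal_donor": ["PLUS"],         # internal 5' donor used; CS-alpha part removed
    "skipped": [],                      # the whole exon is spliced out
}

# Variant names by (has CS-alpha, has CS-beta), as implied by the junctions in the text.
VARIANT_BY_CS = {
    (True, True): "V0",
    (False, True): "V1",
    (True, False): "V2",
    (False, False): "V3",
}


def form_name(domains):
    """Name a form from the domains present in its central region."""
    variant = VARIANT_BY_CS[("CS-alpha" in domains, "CS-beta" in domains)]
    sign = "+" if "PLUS" in domains else "-"
    return f"PG-M{sign}({variant})"


def enumerate_forms():
    """Combine every PLUS-alpha exon outcome with inclusion or skipping of CS-beta."""
    forms = {}
    for outcome, include_beta in product(PLUS_ALPHA_OUTCOMES, (True, False)):
        domains = list(PLUS_ALPHA_OUTCOMES[outcome])
        if include_beta:
            domains.append("CS-beta")
        forms[form_name(domains)] = domains
    return forms


forms = enumerate_forms()
for name in sorted(forms):
    print(name, forms[name])
```

Under these assumptions the enumeration yields six forms, four with and two without the PLUS domain, matching the forms detected by RT-PCR.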
This study showed that all forms (V0, V1, V2, and V3) of PG-M+ were present in all samples examined. On the other hand, the V1 and V3 forms of PG-M− are expressed in a developmentally regulated manner and tend to be expressed at the earlier stages (Table II), suggesting that alternative splicing skipping the exon for the PLUS-α domain is regulated developmentally.
The size of the intron between exon VI (B′ domain) and exon VII (CS-α domain) in human or mouse PG-M is ~6 kilobases (16, 17). This study also showed that the size of the intron between the exons for the B′ and PLUS-α domains is ~12 kilobases in chicken PG-M/versican. Considering the difference, a region of the gene containing the PLUS domain sequence and an intron of ~6 kilobases might have been removed during evolution by some mechanism. The sequence similarity between the PLUS domain (414 nucleotides and 137 amino acids) and the first part of the mouse or human CS-α domain (the same numbers of nucleotides and amino acids) is fairly low, not only in nucleotide sequences (48.6 and 48.8% identity to human and mouse, respectively), but also in amino acid sequences (19.0 and 10.3% identity to human and mouse, respectively), which supports the above notion. However, it is still possible that the PLUS domain is not a separately defined domain, but is simply an alternatively spliced part of the chicken CS-α domain.
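Percent identity values such as those quoted above depend on how aligned positions are counted. The following minimal sketch shows one common convention, in which columns containing a gap in either sequence are ignored. It is an assumption for illustration only: the counting rule used by the GENETYX-MAC program is not stated in the text, and the two short sequences are toy data, not PG-M sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned columns that contain no gap.

    Both inputs must already be aligned and therefore have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy alignment: 7 gap-free columns, 6 of them identical.
toy_identity = percent_identity("ACGTAC-T", "ACCTACGT")
print(round(toy_identity, 1))  # 85.7
```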
Sequence similarity analysis has not identified any nucleotide or amino acid sequences in the data bases similar to those for the PLUS domain, except for some identity in the nucleotide sequence to the KS domain of aggrecan as discussed below. Since the PLUS domain was detected in chicken PG-M/versican, but not in human or mouse PG-M/versican as far as we investigated in available cDNA libraries, it is likely that this domain is unique to chicken PG-M/versican. Genomic analysis of human and mouse PG-M/versican has shown that the total number of exons is identical (15 exons) (16, 17). Comparisons of nucleotide sequences of the cDNAs and deduced amino acid sequences among chicken, human, and mouse PG-M/versican suggested that chicken PG-M/versican may have the same number of exons because the PLUS and CS-α domains are derived from a single exon (PLUS-α) (Fig. 3A).
Aggrecan contains several domains that are highly homologous to PG-M/versican (18-23). This structural similarity suggested that the PLUS domain might correspond to the KS attachment domain of aggrecan. Comparisons of the nucleotide and amino acid sequences between the PLUS domain and the KS attachment domain of human, rat, mouse, or chicken aggrecan revealed significant identity among their nucleotide sequences (44.1, 47.6, 57.4, and 51.6% to human, rat, mouse, and chicken domains, respectively) and amino acid sequences (20.0, 23.5, 23.5, and 40.0% to human, rat, mouse, and chicken domains, respectively). A comparison of the frequency of serine plus threonine residues (potential O-glycosylation sites) to the total amino acid residues of the respective domains between chicken PG-M/versican and human aggrecan also revealed significant similarity with respect to the potential for O-glycosylation between the PLUS domain of chicken PG-M/versican and the KS attachment domain of human aggrecan (Table III). Furthermore, the phylogenetic tree (Fig. 6) constructed as described by Saitou and Nei (24) suggests that the chicken KS domain is more closely related to the PLUS domain than it is to the human, mouse, and rat KS domains. The distance between a pair of sequences is the sum of the branch lengths. Thus, the PLUS domain of PG-M/versican could be considered to correspond to the KS attachment domain of aggrecan. Interestingly, PG-M/versican regulates molecular forms by alternative splicing of the PLUS, CS-α, and CS-β domains, while aggrecan does so by alternative splicing of the EGF and CRP domains (21, 25, 26). With regard to comparisons of the PLUS domain of PG-M/versican with the KS domain of aggrecan, two reports describe the relationship between exon boundaries and the functional domains of aggrecan (27, 28). According to Valhmu et al. (27), the KS domain is composed of two regions, KS-1 and KS-2. The former is encoded by exon 11 and is well conserved among various animal species (bovine, mouse, rat, and chicken), while the latter is composed of variable numbers of poorly conserved hexapeptide repeats and is encoded by the 5′-end of the large exon 12, which also encodes the CS-1 and CS-2 domains of aggrecan. Considering our finding that the PLUS and CS-α domains of PG-M/versican are encoded by a single exon, the PLUS domain appears to be rather comparable to the KS-2 domain. However, Li and Schwartz (28) seemed to limit the definition of the KS domain to the sequence encoded by exon 11 of the chicken gene.
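The serine plus threonine frequency used above as a measure of O-glycosylation potential (Table III) is a simple ratio, sketched here for illustration. The function name is hypothetical and the peptide is toy data, not a PG-M or aggrecan sequence; the sketch assumes one-letter amino acid codes.

```python
def ser_thr_fraction(protein: str) -> float:
    """Fraction of residues that are serine (S) or threonine (T),
    given a protein sequence in one-letter amino acid codes."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("S") + seq.count("T")) / len(seq)


# Toy peptide of 10 residues containing 2 serines and 2 threonines.
toy_fraction = ser_thr_fraction("MSTAPSGTLE")
print(toy_fraction)  # 0.4
```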
Roles of Multiple Forms of PG-M
Chondroitin sulfate proteoglycans in the retina have been extensively studied (29-44), and possible functions have been suggested (34, 35, 44-47). We demonstrated not only the presence of various forms of PG-M in the developing chicken retina, but also their developmental stage- and age-dependent variations by immunofluorescent staining with polyclonal and monoclonal antibodies to PG-M and by Northern blotting. This study showed for the first time the presence of the V2 and V3 forms of PG-M+ and the V1 and V3 forms of PG-M− mRNAs in the chicken retina. The tissue-, developmental stage-, and age-dependent expression of each PG-M form suggests that each plays specific roles.
We are grateful to Drs. S. Nishida and M. Iwaki (Aichi Medical University) and Dr. Y. Honda (Kyoto University) for continuous support and encouragement.