From the Collagen Research Unit, Biocenter, and Department of Medical Biochemistry, University of Oulu, Kajaanintie 52 A, FIN-90220 Oulu, Finland
ABSTRACT |
---|
The human gene for the α1 chain of type XV
collagen (COL15A1) is about 145 kilobases in size and
contains 42 exons. The promoter is characterized by the lack of a TATAA
motif and the presence of several Sp1 binding sites, some of which
appeared to be functional in transfected HeLa cells. Comparison with
Col18a1, which encodes the α1(XVIII) collagen chain
homologous with α1(XV), indicates marked structural homology spread
throughout the two genes. The mouse Col18a1 contains one
exon more than COL15A1, due to the fact that
COL15A1 lacks sequences corresponding to exon 3 of
Col18a1, which encodes a cysteine-rich sequence motif.
Twenty-five of the exons of the two genes are almost identical in size,
six of them contain conserved split codons, and the locations of the
respective exon-intron junctions are identical or almost identical in
the two genes. The homologous exons include the closely adjacent first pair of exons and the exons encoding a thrombospondin-1 homology found
in the N-terminal noncollagenous domain 1, which are followed by the
most variable part of the two genes, covering the C-terminal half of
their noncollagenous domain 1 and the beginning of the collagenous
portion, after which most of the exons are homologous. The lengths of
the introns are not similar in these genes, with two exceptions, namely
the first intron, which is very short, less than 100 base pairs, and
the second intron, which is very large, about 50 kilobases, in both
genes. It can be concluded that COL15A1 and
Col18a1 are derived from a common ancestor.
INTRODUCTION |
---|
The family of collagens is large, and the number of known
collagenous proteins is increasing. Nineteen genetically distinct vertebrate collagen types and more than 30 genes that encode their constitutive chains have been identified to date (1-4). The criteria for classification as collagen are that such proteins have at
least one triple-helical domain consisting of polypeptide chains with a
repeated Gly-X-Y sequence and are structural
components of the extracellular matrix. The collagen types have been
named with Roman numerals in the order of their discovery. The
fibril-forming collagens, types I, II, III, V, and XI, have a single,
uninterrupted triple-helical domain that is available for fibril
formation. The genes encoding these types are highly homologous (1,
5-7), and those for the three major types, COL1A1,
COL1A2, COL2A1, and COL3A1, are
characterized by 51-52 exons. Their triple-helical domain is coded for
by 41-42 exons, most of which are 54 bp¹ in size or multiples
thereof, and each exon begins with a complete codon for a glycine. The
class of nonfibril-forming collagens includes types IV, VI-X, and
XII-XIX, which all have one or more interruptions in the collagenous
sequence. The genes coding for this heterogeneous group are more
divergent in structure, and their numbers and sizes of exons can vary
considerably (1, 5, 6).
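The 54-bp pattern follows from simple arithmetic: each Gly-X-Y triplet spans three codons (9 bp), so an exon that begins with a complete glycine codon and whose length is a multiple of 9 bp encodes a whole number of triplets. The sketch below is illustrative only; the function name `gly_xy_triplets` is an assumption made for this example and does not come from the original work.

```python
from typing import Optional

BP_PER_CODON = 3
CODONS_PER_TRIPLET = 3  # Gly, X, Y
BP_PER_TRIPLET = BP_PER_CODON * CODONS_PER_TRIPLET  # 9 bp


def gly_xy_triplets(exon_bp: int) -> Optional[int]:
    """Return the number of Gly-X-Y triplets encoded by an exon whose
    length is a whole multiple of 9 bp, or None otherwise."""
    if exon_bp <= 0 or exon_bp % BP_PER_TRIPLET != 0:
        return None
    return exon_bp // BP_PER_TRIPLET


# 54 bp and its multiples, plus one length that does not fit the pattern.
for size in (54, 108, 162, 50):
    print(size, gly_xy_triplets(size))
```

A 54-bp exon therefore corresponds to six Gly-X-Y triplets, and a 108-bp exon to twelve.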
The complete primary structure of the human α1(XV) chain consists of
1388 residues, with the following domains: a 25-residue putative signal
peptide, a 530-residue N-terminal noncollagenous domain, a 577-residue
collagenous sequence, and a 256-residue C-terminal noncollagenous
domain (8). The collagenous sequence consists of nine collagenous
domains, which are separated by eight noncollagenous domains. Collagen
types XV and XVIII have been found to be homologous (8-13), and it has
been suggested that they should be called multiplexins
(multiple triple helix domains and
interruptions) (10). The N-terminal noncollagenous domains of both collagen chains contain sequence homology to thrombospondin, and seven of their collagenous domains are homologous, as are the
C-terminal noncollagenous domains.
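As a quick consistency check, the four domain lengths quoted above should add up to the full chain length, and nine collagenous domains imply eight interruptions between them. A minimal sketch, in which the dictionary keys are just labels chosen for this example:

```python
# Domain lengths (residues) of the human alpha1(XV) chain as given in the text.
domains = {
    "signal peptide": 25,
    "N-terminal noncollagenous domain": 530,
    "collagenous sequence": 577,
    "C-terminal noncollagenous domain": 256,
}

total_residues = sum(domains.values())
print(total_residues)  # 1388

# Nine collagenous domains are separated by one interruption fewer.
collagenous_domains = 9
interruptions = collagenous_domains - 1
print(interruptions)  # 8
```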
The exon-intron organization of the mouse type XVIII collagen gene has
recently been determined (14), and a partial structure corresponding to
the seven extreme 3' exons has been described for the gene encoding
human type XV collagen (8). The genes encoding the homologous collagens
are located on separate chromosomes, the human gene for the α1(XV)
collagen chain having been mapped to chromosome 9 (15) and its mouse
counterpart to chromosome 4 (16), whereas the α1(XVIII) collagen gene
is located on human chromosome 21 and mouse chromosome 10 (11).
We report here on the isolation of genomic clones for the human type XV collagen and characterization of the exon-intron organization of the entire gene. Comparison of the type XV collagen gene with that encoding type XVIII collagen reveals marked conservation in exon-intron organization, thus indicating that the two genes derive from a common ancestor. Analyses of the 5'-flanking sequence of the COL15A1 gene using a computer search for promoter elements and deletion constructs transfected into HeLa cells suggested a "housekeeping promoter" characterized by the lack of a TATAA motif and the presence of apparently functional Sp1 binding sites.
EXPERIMENTAL PROCEDURES |
---|
Isolation and Characterization of Genomic
Clones--
Radioactively labeled human cDNA clones for type XV
collagen were used as probes for screening human genomic libraries: a human lung fibroblast genomic library in the λFIX™
vector (944201; Stratagene), a human leukocyte genomic library in the
vector EMBL-3 (HL1006d; CLONTECH), a human
lymphocyte cosmid library in pWE15 (951203; Stratagene), and a human
genomic library in the cosmid vector PJB8 (a gift from Dr. Leena
Ala-Kokko, University of Oulu, Finland). The screenings were performed
under stringent conditions (17): hybridizations were carried out at
41 °C in 50% (v/v) formamide in 5× SSC (1× SSC = 0.15 M NaCl, 0.015 M sodium citrate, pH 6.8), 1%
(w/v) bovine serum albumin, 1% Ficoll (w/v), 1% polyvinylpyrrolidone
(w/v), 0.25 mg of denatured salmon sperm DNA/ml, and 0.1% (w/v) SDS.
The final washes for the filters were carried out in 0.5× SSC, 0.1%
SDS at 65 °C. The positive clones picked from the libraries were
analyzed by restriction enzyme mapping and Southern blotting, and
suitable restriction fragments were subcloned into the plasmid
pBluescript SK (Stratagene).
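The salt concentrations implied by the hybridization and wash buffers follow directly from the 1× SSC definition given above. A small sketch, with the helper name `ssc_millimolar` chosen here purely for illustration:

```python
# 1x SSC as defined in the text: 0.15 M NaCl, 0.015 M sodium citrate.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}  # millimolar


def ssc_millimolar(fold: float) -> dict:
    """Component concentrations (mM) of an SSC solution at the given strength."""
    return {name: conc * fold for name, conc in SSC_1X_MM.items()}


hybridization = ssc_millimolar(5)   # 5x SSC used for hybridization
final_wash = ssc_millimolar(0.5)    # 0.5x SSC used for the final washes
print(hybridization)
print(final_wash)
```

The tenfold drop in salt between hybridization and the final wash, together with the higher wash temperature, is what makes the wash the more stringent step.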
Nuclease S1 Protection--
Total RNA from cultured human skin
fibroblasts was isolated by guanidinium isothiocyanate-chloroform-phenol
extraction (19), and the S1 nuclease protection experiment was
performed as described (17, 20). A 574-bp
SacI-BanI fragment (nucleotides -469 to +112 in
Fig. 3) was 5'-end-labeled with T4 polynucleotide kinase and
[γ-³²P]ATP (3000 Ci/mmol, Amersham Pharmacia Biotech).
The double-stranded probe (3 × 10⁵ cpm) was
hybridized to 20 µg of total RNA from human skin fibroblasts in the
presence of 80% formamide, 40 mM Pipes, pH 6.4, 400 mM NaCl, and 1 mM EDTA at 67 °C for 15 h. After hybridization, 300 µl of buffer (280 mM NaCl, 50 mM sodium acetate, pH 4.5, 4.5 mM ZnSO4) was added, and the mixture was digested with 800 units of S1 nuclease (Boehringer Mannheim) at room temperature for 20 min. The protected fragments were analyzed on a 6% polyacrylamide sequencing gel. 20 µg of yeast tRNA was used as a negative control. The exact sizes of the protected fragments were determined by comparison with adjacent dideoxynucleotide sequencing reactions (21).
Nucleotide Sequencing and Sequence Analysis-- The nucleotide sequences were determined by the Sanger dideoxynucleotide chain termination method (21) either manually or using an automated DNA sequencer (Applied Biosystems). Vector-specific or sequence-specific 17-mer primers synthesized in an Applied Biosystems DNA synthesizer (Department of Biochemistry, University of Oulu) were used, and the nucleotide sequence data were analyzed by DNASIS (Amersham Pharmacia Biotech). Consensus sites for the binding of transcription factors were searched for in the Transcription Factor Data Base using the Sequence Analysis software package, Version 8.0 (Genetics Computer Group, Inc.).
Northern Blot Analysis-- Human adult multitissue Northern blots (7760-1 and 7759-1; CLONTECH) were hybridized under stringent conditions with ³²P-labeled probes covering 33 kb of intron 2 of the COL15A1 gene, following the manufacturer's protocol.
Deletion Constructs for Promoter Analysis--
Five deletion
constructs consisting of different lengths of 5'-flanking sequences of
the human type XV collagen gene were made. All of the fragments were
restriction enzyme-digested from a HindIII subclone derived
from a cosmid clone HG-23 (Fig. 1) and subcloned into the pGL2-Basic
Vector (Promega) upstream from the luciferase gene. An EspI
restriction site at the position +27 was utilized as a common 3'-end
for all of the constructs. A linker primer containing restriction sites
EspI-SalI-HindIII was attached to the
3'-end of all the constructs, and a HindIII site from a
pGL2-Basic Vector (Promega) was used in subcloning. As the 5'-ends of
the constructs, different restriction sites were used
(HindIII for del 1, HincII for del 2, XhoI for del 3, XbaI for del 4, and
SacI for del 5). Accordingly, the 5' subcloning position in
the vector depended on the construct, so that del 1 was subcloned as a
HindIII fragment, del 2 as a
SmaI-HindIII fragment, del 3 as a
XhoI-HindIII fragment, del 4 as a
NheI-HindIII fragment, and del 5 as a
SacI-HindIII fragment. Deletion constructs used in promoter
analysis consisted of the following fragments: del 1, bp -3598 to +27;
del 2, bp -2615 to +27; del 3, bp -1858 to +27; del 4, bp -1117 to +27;
and del 5, bp -474 to +27.
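The five constructs can be summarized as a small table of 5' restriction sites and coordinates. The sketch below takes the 5' ends as upstream (negative) coordinates relative to the transcription start site and assumes the usual convention that there is no position 0, so a fragment from -N to +27 is N + 27 bp long. The class and field names are hypothetical, introduced only for this example.

```python
from typing import NamedTuple


class DeletionConstruct(NamedTuple):
    name: str
    five_prime_site: str  # restriction site used at the 5' end
    start: int            # upstream coordinate relative to +1 (negative)
    end: int = 27         # common 3' end at the EspI site (+27)

    @property
    def length_bp(self) -> int:
        # Assumption: coordinates skip zero (..., -2, -1, +1, +2, ...),
        # so the promoter fragment length is |start| + end.
        return abs(self.start) + self.end


CONSTRUCTS = [
    DeletionConstruct("del 1", "HindIII", -3598),
    DeletionConstruct("del 2", "HincII", -2615),
    DeletionConstruct("del 3", "XhoI", -1858),
    DeletionConstruct("del 4", "XbaI", -1117),
    DeletionConstruct("del 5", "SacI", -474),
]

for c in CONSTRUCTS:
    print(f"{c.name}: {c.five_prime_site}, {c.start} to +{c.end}, {c.length_bp} bp")
```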
Cell Culture and Transfection Assays--
HeLa cells were
routinely maintained at 37 °C in Dulbecco's modified Eagle's
medium (Imperial) supplemented with 10% fetal calf serum, 50 µg of
ascorbate per ml, 2 mM glutamine, 100 units/ml of
penicillin, and 50 µg/ml of streptomycin. HeLa cells were transiently transfected with a liposome-based method (DOTAP liposomal transfection reagent kit, Boehringer Mannheim), according to the manufacturer's protocol. Briefly, the various luciferase deletion constructs (5 µg)
were transfected with 1 µg of pCMV-β-galactosidase plasmid (CLONTECH) to normalize for transfection
efficiencies. For cotransfection experiments, 5 µg of luciferase
plasmids were cotransfected with either 1 µg of the human Sp1
expression vector (pEVR2/Sp1 plasmid) or 1 µg of the control
expression vector (pEVR2/0 plasmid). Cells were harvested 24 h
after transfection, and luciferase activity was determined from cell
extracts using the luciferase assay system (Promega). The
β-galactosidase activity was measured using the β-galactosidase
enzyme assay system (Promega). To normalize transfection efficiency for
the cotransfection experiments, total DNA was extracted from each
sample, and dot-blot analysis was performed. The nitrocellulose membrane was
hybridized with a probe corresponding to a fragment of the luciferase
reporter gene. Densitometric scanning of the autoradiograms was
performed with the GelWorks 1D program (UVP Gel Documentation and
Analysis System, GDS8000). The pGL2-Basic vector and the pGL2-Control
vector (Promega) were used as negative and positive controls,
respectively. The human expression vector for Sp1 under control of the
CMV promoter, pEVR2/Sp1, was a gift of Dr. Suske (Institut für
Molekularbiologie und Tumorforschung, Marburg, Germany). The control
pEVR2/0 was obtained from the plasmid pEVR2/Sp1 lacking the Sp1
cDNA fragment. All plasmids used for transfection were purified by
the plasmid midi kit (Qiagen).
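Normalizing for transfection efficiency amounts to dividing each luciferase reading by the β-galactosidase activity measured in the same extract. The following is a minimal sketch of that calculation, not the authors' actual analysis; the function name and the readings are hypothetical and serve only to show the arithmetic.

```python
def normalized_luciferase(luciferase: float, beta_gal: float) -> float:
    """Luciferase activity per unit of beta-galactosidase activity."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luciferase / beta_gal


# Hypothetical readings in arbitrary units: (luciferase, beta-galactosidase).
readings = {
    "construct A": (1200.0, 3.0),
    "construct B": (900.0, 2.0),
}

normalized = {
    name: normalized_luciferase(luc, gal) for name, (luc, gal) in readings.items()
}
print(normalized)
```

In this made-up example, construct B gives the lower raw signal but the higher normalized activity, which is the kind of distortion the internal control is meant to correct.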
RESULTS AND DISCUSSION |
---|
Isolation and Characterization of Genomic Clones--
The
isolation and characterization of the genomic lambda clone HLF-15
(Fig. 1), which contains the seven extreme 3' exons of the human
α1(XV) chain gene, has been described previously (8). In order to
isolate additional clones, the same lambda library that yielded clone
HLF-15 was screened four times using different fragments of the human
type XV collagen cDNA (8) as probes. These screenings resulted in
the isolation of five new clones, HLF-3, HLF-5, HLF-13, HLF-17, and
HLF-18 (Fig. 1).
Identification of the Transcription Initiation Site and Sequences
of the 5'-Flanking Region of the Gene--
The transcription
initiation site of the gene was determined by S1 nuclease protection
analysis (Fig. 2). A double-stranded SacI-BanI DNA fragment corresponding to the
sequence -469 to +112 in Fig. 3 was
isolated and 5'-end-labeled with γ-³²P. This probe was
then hybridized to total RNA isolated from cultured human skin
fibroblasts. When the hybrids were subjected to nuclease S1 digestion,
three major protected fragments of sizes 120, 164, and 165 nucleotides
and nine minor bands were detected (Fig. 2). Comparison of the sizes of
the protected fragments with adjacent dideoxynucleotide sequencing
reactions indicated that a major transcription initiation site is
located at an adenosine (A) nucleotide and another at two thymidines
(T) 44-45 nucleotides upstream of this. Because the former showed the
stronger band and was accompanied by several other initiation sites, it
is marked +1 in Fig. 3.
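The 44-45 nucleotide spacing can be read directly from the fragment sizes. Because the probe carries its label at a fixed 5' end, every additional nucleotide of protection corresponds to a start site one nucleotide further upstream. A short sketch of that arithmetic (variable names are illustrative):

```python
# Sizes (nt) of the three major S1-protected fragments reported in the text.
major_fragments = [120, 164, 165]

# The shortest major fragment corresponds to the adenosine marked +1.
plus_one_fragment = min(major_fragments)

# Longer fragments map to start sites further upstream of +1.
upstream_offsets = [
    size - plus_one_fragment
    for size in major_fragments
    if size != plus_one_fragment
]
print(upstream_offsets)  # [44, 45]
```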
Deletion Analysis of the COL15A1 Promoter--
To investigate the
functional properties of the human COL15A1 promoter, we
performed reporter gene analysis using various deletion constructs. A
series of 5' deletions from bp -3598 to +27 was constructed from the
human promoter and linked to the luciferase reporter gene. Transient
transfection experiments were carried out in HeLa cells, which express
collagen type XV, in five independent experiments, each run in
duplicate (Fig. 4, A and B). After normalization to β-galactosidase activity, all
the promoter constructs exhibited similar luciferase activity.
Thus the shortest promoter fragment, from bp -474 to +27, is sufficient to give the entire promoter activity for the COL15A1 gene in HeLa cells.
Cotransfections with Sp1 Expression Vector--
Because the
sequence from bp -474 to the transcription start site was found to be
rich in G+C and to contain four potential Sp1 binding sites, we
investigated whether Sp1 has a potential role in the regulation of the
COL15A1 gene. The different deletion constructs were
cotransfected in HeLa cells with a human Sp1 expression vector or with
the corresponding vector without the Sp1 cDNA (control). Results
are expressed for each deletion construct as a ratio of the relative
luciferase activity obtained with the Sp1 expression vector to that
obtained with the control (Fig. 4C). Although basal luciferase activity obtained with the negative control pGL2-Basic vector was not changed, cotransfection with the Sp1 expression vector
increased the promoter activity of all constructs, from 5.5-fold for the
longest construct to 10.3-fold for the shortest one. These results
suggest that Sp1 is involved in the regulation of the human type XV
collagen gene.
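The fold induction reported for each construct is a simple ratio of the relative luciferase activity with the Sp1 expression vector to that with the empty control vector. A minimal sketch follows; the function name and the input values are hypothetical and are not data from this study.

```python
def fold_induction(with_sp1: float, with_control: float) -> float:
    """Ratio of relative luciferase activity with Sp1 to that with the control vector."""
    if with_control <= 0:
        raise ValueError("control activity must be positive")
    return with_sp1 / with_control


# Hypothetical relative luciferase activities (arbitrary units) for one construct.
example = fold_induction(with_sp1=80.0, with_control=10.0)
print(f"{example:.1f}-fold")
```

A ratio of 1 would mean that Sp1 overexpression had no effect on the construct, which is the behavior described for the pGL2-Basic negative control.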
Exon-Intron Organization of the Human Gene for the α1(XV)
Collagen Chain--
DNA sequencing of the genomic clones indicated
that the human type XV collagen gene consists of 42 exons and 41 introns (Fig. 1). Sequences were determined for all the exons, their
intron junctions, most of the intronic sequences of reasonable size, and about 3.6 kb of the 5'-flanking region of the gene. Exons 1-41 vary in
size from 36 to 548 bp, whereas the extreme 3' exon is 1119 bp in
length, containing 908 bp of 3'-untranslated sequences (Table I).
The introns vary in length between 89 bp and about 55 kb (Table I). The various overlapping genomic clones
covered the entire gene with the exception of introns 2 and 9, the
sizes of which were obtained by Southern blotting of genomic DNA. The exon-intron boundaries (Table II) agree with published consensus sequences for splice donor and acceptor sites (23). The donor site
following exon 6 is unusual in that the normally invariant GT
dinucleotide is replaced by GC. Fewer than 30 examples of GC donors
have been observed among the thousands of donor sites catalogued thus
far (23-25). Two other examples of GC splice donors in collagen genes,
COL4A1 and COL7A1, have been reported (26,
27).
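A splice donor can be classified by the first two nucleotides of the intron, which is how a GC donor such as the one following exon 6 stands out from the canonical GT. The sketch below illustrates the check and a few counts quoted in this section; the function name is an assumption, and the example sequences are invented rather than taken from COL15A1.

```python
def classify_donor(intron_start: str) -> str:
    """Classify a 5' splice donor by the first two nucleotides of the intron."""
    dinucleotide = intron_start[:2].upper()
    if dinucleotide == "GT":
        return "canonical GT"
    if dinucleotide == "GC":
        return "rare GC variant"
    return "noncanonical"


# Hypothetical intron 5' ends; these are not sequences from COL15A1.
for seq in ("GTAAGT", "GCAAGT", "ATATCC"):
    print(seq, classify_donor(seq))

# Counts taken from the text.
n_exons = 42
n_introns = n_exons - 1              # 41 introns separate 42 exons
last_exon_bp, utr_bp = 1119, 908
non_utr_bp = last_exon_bp - utr_bp   # part of the last exon upstream of the 3'-UTR
print(n_introns, non_utr_bp)
```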
Comparison of the Human α1(XV) and Mouse α1(XVIII) Collagen
Genes--
The human type XV and mouse type XVIII collagen genes are
of somewhat different sizes, the former being about 145 kb in size and
the latter about 102 kb, and they have 42 and 43 exons, respectively. They are highly similar in their exon-intron organization, but the
introns in the type XV gene are in most cases longer than those in the
type XVIII gene.
Conclusions-- The genes encoding the fibril-forming collagens range in size from 18 to 53 kb and consist of over 50 exons (1, 5, 6), whereas those encoding the nonfibril-forming collagens show more extensive heterogeneity in their genomic organization: they vary in size from 5 kb for COL10A1 (30) to 750 kb for COL5A1 (31) and in number of exons from 3 for COL10A1 (30) to 118 for COL7A1 (26). The present characterization of the complete exon-intron structure of the COL15A1 gene, showing it to be about 145 kb in size and to contain 42 exons, makes it one of the largest collagen genes, with a typically high number of exons.
The exon pattern of the COL15A1 gene differs markedly from that of the fibril-forming collagen genes, in which the triple helix is encoded predominantly by exons of 54 bp or multiples of this (1, 5, 6). Only one of the exons in the COL15A1 gene that code for purely collagenous sequences is 54 bp in size. In fact, none of the nonfibril-forming collagen genes characterized so far displays the 54-bp exon pattern observed in the fibril-forming collagen genes, whereas many of them, including COL15A1, typically contain 36- and 63-bp exons, in addition to exons of more variable length, encoding the interrupted collagenous sequences (1, 5, 6).
The 5'-flanking region of COL15A1 is characterized by the lack of a TATAA motif and the presence of several GC motifs. This renders the 5'-flanking region of COL15A1 similar to promoters of the "housekeeping genes," which are transcribed widely but at low RNA levels in many tissues. Several other collagen genes also contain multiple GC boxes as their main promoter elements, including the COL5A1, COL7A1, COL11A1, and COL11A2 genes (26, 32-34). In addition, the downstream promoter of COL6A2 (35) and the upstream promoter of Col18a1 (14), two collagen genes with alternate promoters, are also of this kind. Transient transfection experiments, which were performed on HeLa cells with 5' deletion constructs ranging from bp ...
ACKNOWLEDGEMENTS |
---|
We thank Ritva Savilaakso and Jaana Väisänen for expert technical assistance. We gratefully acknowledge Dr. G. Suske (Institut für Molekularbiologie und Tumorforschung, Marburg, Germany) for the pEVR2/Sp1 expression vector.
FOOTNOTES |
---|
* This work was supported by grants from the Health Sciences Council of the Academy of Finland, the Sigrid Juselius Foundation, FibroGen Inc. (South San Francisco, CA), and Suomalainen Lääkäriseura Duodecim. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequences reported in this paper have been submitted to the GenBank™/EMBL Data Bank with accession numbers AF052956-AF052975.
To whom correspondence should be addressed. Tel.: 358-8-5375800;
Fax: 358-8-5375810; E-mail: taina.pihlajaniemi@oulu.fi.
¹ The abbreviations used are: bp, base pair(s); kb, kilobase(s); PCR, polymerase chain reaction; Pipes, piperazine-N,N'-bis[2-ethanesulfonic acid]; del, deletion.
REFERENCES |
---|