(Received for publication, December 31, 1996, and in revised form, January 10, 1997)
From the Biocenter and Department of Biochemistry, University of Oulu, FIN-90570 Oulu, Finland
We report the isolation and characterization of cDNA clones for a novel isoform of lysyl hydroxylase (lysyl hydroxylase 2), a posttranslational enzyme of collagen biosynthesis. The open reading frame predicted a protein of 737 amino acids, including an amino-terminal signal peptide. The amino acid sequence has overall similarity of over 75% to the lysyl hydroxylase (lysyl hydroxylase 1) characterized earlier. This similarity is even higher in the carboxyl-terminal end of the molecules. Lysyl hydroxylase 2 contains nine cysteine residues, which are conserved in lysyl hydroxylase 1. Furthermore, the conserved histidines and aspartate residues required for lysyl hydroxylase activity are present in the sequence. Northern analysis identified a transcript of 4.2 kilobases, which was highly expressed in pancreas and muscle tissues. Expression of cDNA in insect cells using a baculovirus vector yielded proteins with lysyl hydroxylase activity and an antiserum against a synthetic peptide of the deduced amino acid sequence recognized proteins with molecular weights of 88 and 97 kDa in homogenates of the transfected cells.
Collagens are the most abundant proteins in the human body, and they are found essentially in all tissues. To date, 19 different collagen types have been identified (1-4). The biosynthesis of collagens is characterized by several posttranslational modifications, one of which is hydroxylation of lysyl residues. Hydroxylysine occurs in the Y position of the repeating X-Y-Gly triplets within the helical region of the collagen molecule. Hydroxylysine also occurs in the sequence of nonhelical telopeptide regions of some collagen molecules, when glycine is replaced by either serine or alanine (1, 2, 5). The amount of the hydroxylysyl residues varies considerably between the different collagen types. Additional variation is found within the same collagen type in different tissues and even within the same tissues in different physiological and pathological states (1, 2, 5). Hydroxylysyl residues have an important role in the structure and stability of collagens. The hydroxy groups participate in the formation of intermolecular cross-links and serve as sites of attachment for carbohydrate units that are unique to collagens (1-4, 6).
Lysyl hydroxylase (EC 1.14.11.4) catalyzes the hydroxylation of lysyl residues in collagens (1). The enzyme requires Fe2+, 2-oxoglutarate, O2, and ascorbate in the reaction. The active enzyme is a homodimer consisting of subunits with a molecular weight of about 85,000 (1) and is constitutively expressed in a variety of tissues (7, 8). The complete cDNA-derived amino acid sequence has been reported for the enzyme from chick (9), human (10, 11), and rat (12). The enzyme has also been characterized at the genomic level (7), and the first mutations in the lysyl hydroxylase gene have been characterized in patients with type VI of the Ehlers-Danlos syndrome (EDSVI) (13-17).
We report here the cloning and characterization of a novel human lysyl hydroxylase (lysyl hydroxylase 2), which is highly expressed in pancreas and muscle tissues. The similarity in the amino acid sequence is about 75% when compared with the human and chicken lysyl hydroxylase characterized earlier (lysyl hydroxylase 1).
A PS1-PS2 fragment (PS1, 5′-TTTCATTCATCTCTGATC-3′; PS2, 5′-TCTAAGTCAAGCGGAA-3′; sequences homologous to the lysyl hydroxylase cDNA sequence, obtained from the data base of expressed sequence tags, GenBank accession numbers T11367 and T18568) was generated by PCR from oligo(dT)-primed human kidney cDNA. The amplification was performed using Taq DNA polymerase (Amplitaq, Perkin-Elmer) in the presence of 50 mM KCl, 10 mM Tris-HCl, pH 8.3, 1.5 mM MgCl2, and 0.01% (w/v) gelatin (94 °C, 2 min; 94 °C, 40 s; 43 °C, 40 s; 72 °C, 2 min; 30 cycles; 72 °C, 7 min). The subcloned fragment was sequenced and used as a probe for library screening.
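The cycling conditions given in parentheses can be read as an initial denaturation, 30 three-step cycles, and a final extension. The following Python sketch encodes that reading and sums the programmed hold times. The dictionary layout and the function name `total_hold_seconds` are illustrative assumptions, not a thermocycler format, and the total ignores ramping between temperatures.

```python
# Cycling program for the PS1-PS2 amplification, as read from the text:
# 94 °C 2 min; 30 cycles of (94 °C 40 s, 43 °C 40 s, 72 °C 2 min); 72 °C 7 min.
# The layout below is an illustrative assumption, not an instrument file format.
PS1_PS2_PROGRAM = {
    "initial": [(94, 120)],                    # (temperature in °C, seconds)
    "cycle": [(94, 40), (43, 40), (72, 120)],  # denature, anneal, extend
    "cycles": 30,
    "final": [(72, 420)],
}


def total_hold_seconds(program):
    """Sum the programmed hold times; ramp times are not included."""
    initial = sum(seconds for _, seconds in program["initial"])
    per_cycle = sum(seconds for _, seconds in program["cycle"])
    final = sum(seconds for _, seconds in program["final"])
    return initial + program["cycles"] * per_cycle + final


if __name__ == "__main__":
    seconds = total_hold_seconds(PS1_PS2_PROGRAM)
    print(f"{seconds} s of programmed holds (about {seconds / 60:.0f} min)")
```

The same structure can hold the other programs listed in this section by changing the temperatures, hold times, and cycle count.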
The following oligonucleotide pairs and conditions were used in the amplifications to clone the 3′ end of the novel cDNA: AP1 (5′-CCATCCTAATAGACTCACTATAGGGC-3′, Marathon kit, Clontech) and PS22 (5′-GTACAATTGCTCTATTGAGTCA-3′), 94 °C, 10 min; 94 °C, 40 s; 55 °C, 40 s; 72 °C, 2 min; 30 cycles; 72 °C, 10 min. For cloning of the 5′ end, the following oligonucleotide pairs were used: AP2 (5′-ACTCACTATAGGGCTCGAGCGGC-3′, Marathon kit, Clontech) and PS38 (5′-TTGTTGAACTATACGGTTGACATA-3′), 95 °C, 5 min; 95 °C, 40 s; 61 °C, 40 s; 72 °C, 2 min; 30 cycles; 72 °C, 5 min; PS39 (5′-CACCTCTCCATTCTTCTCCTTG-3′) and oligo(dT) anchor primer (5′-GACCACGCGTATCGATGTCGACTTTTTTTTTTTTTTTTT-3′, 5′/3′ RACE kit, Boehringer Mannheim), 94 °C, 2 min; 94 °C, 20 s; 60 °C, 30 s; 72 °C, 2 min; 35 cycles; 72 °C, 10 min; PS41 (5′-TCGATGGAATCCATCACTTTCT-3′) and PCR anchor primer (5′-GACCACGCGTATCGATGTCGGAC-3′, 5′/3′ RACE kit, Boehringer Mannheim), 94 °C, 2 min; 94 °C, 20 s; 57 °C, 30 s; 72 °C, 2 min; 35 cycles; 72 °C, 10 min; PS46 (5′-CCGCACCCAGACAGGGATTC-3′) and PCR anchor primer, 94 °C, 2 min; 94 °C, 20 s; 60 °C, 30 s; 72 °C, 2 min; 35 cycles; 72 °C, 10 min.
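For orientation, the gene-specific primers listed above can be tabulated with their length, GC content, and a rough melting temperature. The sketch below uses the simple rule of thumb of 2 °C per A/T and 4 °C per G/C, which is only a crude estimate for short oligonucleotides. It is not stated in the text how the annealing temperatures were chosen, so the function names and the choice of this rule are assumptions made for illustration.

```python
# Gene-specific primers as printed in the text (written 5' to 3').
PRIMERS = {
    "PS1": "TTTCATTCATCTCTGATC",
    "PS2": "TCTAAGTCAAGCGGAA",
    "PS22": "GTACAATTGCTCTATTGAGTCA",
    "PS38": "TTGTTGAACTATACGGTTGACATA",
    "PS39": "CACCTCTCCATTCTTCTCCTTG",
    "PS41": "TCGATGGAATCCATCACTTTCT",
    "PS46": "CCGCACCCAGACAGGGATTC",
}


def gc_fraction(seq):
    """Fraction of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(seq):
    """Crude Tm estimate in °C: 2 per A/T plus 4 per G/C (short oligos only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
              f"estimated Tm about {rule_of_thumb_tm(seq)} °C")
```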
Fetal and adult human kidney (λgt11, Clontech) and adult human pancreas (λgt10, Clontech) cDNA libraries were screened using standard protocols (18). Hybridization was carried out for 16 h in 5 × SSPE (1 × SSPE = 0.15 M NaCl, 0.01 M NaH2PO4·H2O, 0.001 M EDTA, pH 7.4), 5 × Denhardt's, 50% formamide, 0.01% SDS, 100 mg/ml denatured herring DNA at 42 °C. The filters were washed twice in 2 × SSC, 0.5% SDS and once in 1 × SSC, 0.1% SDS for 15 min at room temperature.
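Because the hybridization buffer is specified as a multiple of the 1 × SSPE definition, the molar concentrations at 5 × follow by simple scaling. The short Python sketch below performs that arithmetic. The names `SSPE_1X_MOLAR` and `scale_buffer` are illustrative, and the calculation assumes that all three components scale linearly with the stated fold concentration.

```python
# 1 x SSPE composition as defined in the text (molar concentrations).
SSPE_1X_MOLAR = {
    "NaCl": 0.15,
    "NaH2PO4.H2O": 0.01,
    "EDTA": 0.001,
}


def scale_buffer(composition, fold):
    """Return the composition at the given fold concentration (linear scaling)."""
    return {name: round(molar * fold, 6) for name, molar in composition.items()}


if __name__ == "__main__":
    for name, molar in scale_buffer(SSPE_1X_MOLAR, 5).items():
        print(f"5 x SSPE, {name}: {molar} M")
```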
The 3′ end and the 5′ end of the novel cDNA were obtained using the Marathon cDNA amplification kit (Clontech) and the 5′/3′ RACE kit (Boehringer Mannheim).
To clone the 3′ end of the noncoding region of the novel cDNA, double-stranded cDNA was prepared using poly(A) RNA from human kidney and an oligo(dT) primer (Marathon kit) that contained two degenerate nucleotides following the poly(T) sequence. The Marathon cDNA adaptor was ligated to the blunt-ended double-stranded cDNA, and amplification was carried out by PCR using an adaptor-specific primer and a gene-specific primer obtained from the sequence of clone 44.
To clone the 5′ end of the novel cDNA, double-stranded cDNA was prepared using a sequence-specific primer. The Marathon cDNA adaptor was ligated to the blunt-ended double-stranded cDNA, and amplification was carried out using an adaptor-specific primer paired with a primer obtained from the sequence of clone 44. Using the 5′/3′ RACE kit, a single-stranded, tailed cDNA was converted to a double-stranded molecule by PCR using an oligo(dT) anchor primer paired with a sequence-specific oligonucleotide. Amplification was carried out using a PCR anchor primer paired with a sequence-specific primer.
DNA sequencing was performed by standard dideoxynucleotide sequencing using the T7 sequencing kit (Pharmacia Biotech Inc.), the Taq cycle sequencing kit (U. S. Biochemical Corp.), and the PRISM™ AmpliTaq FS dye terminator cycle sequencing kit (Perkin-Elmer). Part of the sequencing was carried out manually and part using an ABI PRISM 377 DNA sequencer (Perkin-Elmer). All clones were sequenced, despite the overlap in the sequence of the clones. The sequence of the ends of the molecule was confirmed by sequencing DNA fragments obtained with different oligonucleotide pairs in 5′/3′ RACE.
A human Multiple Northern blot containing poly(A) RNA from different tissues (Clontech) was hybridized for 16 h in 5 × SSPE, 10 × Denhardt's, 50% formamide, 2% SDS, 100 mg/ml denatured herring DNA at 42 °C using radioactively labeled clone 44 as a probe. The blot was washed in 2 × SSC, 0.05% SDS at room temperature for 40 min and then at 50 °C for 40 min.
Expression of the novel cDNA was carried out using a baculovirus transfer vector (18) in the BAC-TO-BAC™ Expression system (Life Technologies, Inc.). PCR was used to generate two different constructs using human kidney cDNA as a template. Construct 1 contained nucleotides 1 to 2272 and construct 2 contained nucleotides 27 to 2272. The PCR products were confirmed by sequencing. Insect cells were harvested 48 or 72 h after infection according to the protocol described for human lysyl hydroxylase (19).
Other Assays
Western blot analysis was carried out using a polyclonal antibody, produced in chicken, against the synthetic peptide NPRTLKILIEQNRKI (amino acids 399-413 of lysyl hydroxylase 2). The homogenates of baculovirus-infected cells were fractionated under reducing conditions by SDS/10% polyacrylamide gel electrophoresis, blotted onto an Immobilon membrane (Millipore), and incubated with the antibodies against the synthetic peptide. Bound antibodies were visualized using the ECL detection system (Amersham Life Science, Inc.) and x-ray film (Eastman Kodak Co.). Lysyl hydroxylase activity was assayed by a method based on the hydroxylation-coupled decarboxylation of 2-oxo-[1-14C]glutarate (20); the synthetic peptide ARGIKGIRGFSG or IKGIKGIKG was used as substrate.
The first cDNA clone (PS1-PS2) was generated by amplification of oligo(dT)-primed human kidney cDNA using oligonucleotides PS1 and PS2 as primers (Fig. 1). The amplification yielded a fragment of about 300 nucleotides in length. Human kidney and pancreas cDNA libraries were then screened using the PS1-PS2 clone as a probe. One positive recombinant was identified in the pancreas library (clone 44) and one in the kidney library (clone 6) (Fig. 1). Clone 44 was used as a probe to obtain four additional overlapping clones (Fig. 1, clones 5, 25, 48, and 86). The 5′ and 3′ ends of the novel cDNA were obtained using 5′ and 3′ RACE.
Nucleotide and Derived Amino Acid Sequences of the cDNA
The cDNA clones (Fig. 1) encode a polypeptide of 737 amino acids, beginning with a methionine codon, and cover 1290 nucleotides of the 3′-untranslated sequence. The sequence contains two internal EcoRI restriction sites, and the 3′-untranslated sequence contains two potential polyadenylation signals. A putative signal peptide is present at the amino-terminal end of the protein. The predicted molecular weight of the encoded polypeptide is 84,659 when calculated including the signal peptide (Fig. 1).
The nucleotides of the coding region of the novel cDNA sequence are 63 and 59% identical to the chicken (9) and human (10) lysyl hydroxylase sequences, respectively. There are no long regional stretches of identical nucleotide sequence in the molecules; the identical nucleotides are distributed evenly along the molecule. The 3′-untranslated region of the molecule is clearly different from that in the previously characterized human lysyl hydroxylase 1.
The amino acid sequence of the novel polypeptide is compared with the chicken (9) and human (10) lysyl hydroxylase sequences in Fig. 1. The translated product of the novel cDNA is 11 amino acids longer than the translation product of human lysyl hydroxylase. A high similarity of the amino acid sequence (overall similarity over 75%) was observed between the novel polypeptide and human and chicken lysyl hydroxylases. The COOH-terminal region is especially similar to lysyl hydroxylase. Furthermore, a 62-amino acid sequence in the central region of the molecule shows a high similarity (over 90% identity covering amino acids 414-475 of the novel molecule). Examples of particularly variable regions are a 13-amino acid sequence that is totally dissimilar (amino acids 283-295) and an 18-amino acid sequence (amino acids 340-357) with 11% identity to lysyl hydroxylase. The 50 amino acids at the amino terminus of the molecule have 56% identity to human lysyl hydroxylase. The nine cysteine residues conserved in lysyl hydroxylase amino acid sequences are also conserved in the novel polypeptide. The amino acid sequence contains seven potential attachment sites for asparagine-linked oligosaccharides; two of them are at the same positions as in the sequence of human lysyl hydroxylase.
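Regional identity values of this kind depend on how aligned positions are counted. As a minimal sketch, the Python function below computes percent identity over a pairwise alignment, counting only columns in which both sequences have a residue. This counting convention, the function name `percent_identity`, and the toy sequences are assumptions made for illustration; the alignment procedure used for Fig. 1 is not described in the text, and the function measures identity only, not similarity.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no residue pairs to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from the lysyl hydroxylase sequences.
    print(f"{percent_identity('MKT-LLA', 'MRTALLA'):.1f}% identity")
```

Under this convention, for instance, 2 identical positions in an ungapped 18-residue window would give about 11% identity.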
The carboxyl-terminal end of lysyl hydroxylase 1 is especially well conserved between chicken and human (10), having an identity of over 90% for the last 139 amino acid residues of the molecule. This finding suggests that this region of the molecule is important for the function of the enzyme. A search for conserved residues within the sequences of 2-oxoglutarate dioxygenases and a related dioxygenase, isopenicillin N synthase, suggested that two histidine-containing motifs in the carboxyl terminus, residues 656 and 708 in the human lysyl hydroxylase sequence, may function as ferrous binding ligands in the catalytic mechanism common to all 2-oxoglutarate dioxygenases (21, 22). Site-directed mutagenesis of prolyl 4-hydroxylase (23) and lysyl hydroxylase (19), enzymes belonging to the group of 2-oxoglutarate-dependent dioxygenases, demonstrated the importance of these histidine residues for the catalytic activity of the enzymes. Mutation analysis revealed three other functionally important histidines in the carboxyl-terminal portion of the lysyl hydroxylase polypeptide (19). It is remarkable that all these histidine residues are also conserved in the novel molecule (amino acids His-666, His-667, His-710, His-716, and His-718). Studies on human lysyl hydroxylase furthermore indicated that mutation of aspartate residues in the carboxyl-terminal region of the molecule causes dramatic inactivation of the enzyme (19); two of these residues (amino acids Asp-668 and Asp-684) are also found in the carboxyl-terminal region of the novel polypeptide. Asparagine-linked glycosylation has been shown to be a requirement for maximal lysyl hydroxylase activity (19, 24). One can speculate that this requirement also holds for the novel protein, because (i) the sequence contains many putative glycosylation sites and (ii) two putative glycosylation sites are highly conserved; at least one of these is glycosylated in lysyl hydroxylase expressed in insect cells, and both asparagines (corresponding to Asn-209 and Asn-696 of the novel sequence) are required for lysyl hydroxylase activity (19).
A Novel Polypeptide with Lysyl Hydroxylase Activity
To determine whether the novel polypeptide has lysyl hydroxylase activity, the novel cDNA was expressed in insect cells using a recombinant baculovirus (Table I). Two constructs were prepared for the experiments, one containing the whole coding region, including the signal peptide (construct 1), and the other containing the coding region but lacking the first nine amino acids of the signal peptide (construct 2). Recombinant proteins were harvested by homogenization of the cells in a buffer containing Nonidet P-40, and the insoluble pellet was further homogenized in a buffer containing glycerol (19). Activity measurements of the supernatants indicated that the majority of the lysyl hydroxylase activity was found in the Nonidet P-40 fraction. Both peptide substrates were hydroxylated by the recombinant proteins.
An antiserum was prepared against a synthetic peptide derived from the cDNA sequences of lysyl hydroxylase 2. The recombinant proteins produced by the constructs were analyzed by SDS-polyacrylamide gel electrophoresis and then immunostained using the peptide antibodies for detection. The peptide antibodies recognized protein bands with molecular weights of 88,000 and 97,000 in cells transfected with the constructs for lysyl hydroxylase 2 (not shown). The lower molecular weight band corresponds exactly to the size of an immunostained band found in kidney tissue and kidney cells (not shown). It is not known, however, if both molecular weight 88,000 and 97,000 proteins have lysyl hydroxylase activity. As indicated above, lysyl hydroxylase 2 contains many potential sites for glycosylation, and it is possible that the two forms on the immunoblot are due to variation in the glycosylation of proteins in insect cells.
Expression of the Novel Lysyl Hydroxylase in Various Tissues
Expression of the novel gene in various tissues was determined by Northern hybridization. A single band of 4.2 kilobases was obtained in the hybridization from heart, brain, placenta, lung, liver, skeletal muscle, kidney, and pancreas. The data indicate that the expression of the gene is highly regulated. A prominent hybridization signal was present in muscle tissue such as heart and placenta, as well as pancreas, whereas the signal was very faint in the lung (Fig. 2).
Conclusions
A novel isoform (lysyl hydroxylase 2) for lysyl hydroxylase has been described in this study. The molecule has a high identity to the lysyl hydroxylase characterized earlier (lysyl hydroxylase 1, Refs. 9-12), but the novel molecule clearly is a product of a different gene. This finding confirms the hypothesis of existence of isoforms for lysyl hydroxylase based on observations that hydroxylation of telopeptidyl lysyl residues may be under separate enzymic control distinct from that active toward lysyl residues within the helical regions of type I collagen (5, 25). The discovery of isoforms of lysyl hydroxylase is highly significant, because it may explain the great variation of lysine hydroxylation in collagen of different tissues in patients with EDSVI (26, 27), a disease resulting from the gene defect in lysyl hydroxylase 1. The isoforms are also in agreement with the finding that lysines in collagen types II, IV, and V in EDSVI patients were hydroxylated normally (27). Furthermore, the residual activity of EDSVI cells preferentially directed toward type IV collagen (28) suggests the presence of isoforms for lysyl hydroxylase. The presence of isoforms may also explain our previous observations that cells from an EDSVI patient producing a truncated form of lysyl hydroxylase 1, which lacks the highly conserved carboxyl-terminal portion of the molecule, nevertheless have detectable lysyl hydroxylase activity (13).
The nucleotide sequence reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number U84573.
Dr. Timo Hautala, Department of Internal Medicine, University of Iowa, is acknowledged for his help in the computer search.