(Received for publication, August 19, 1994; and in revised form, January 19, 1995)
We have isolated a number of recombinant clones from a human skin fibroblast cDNA library that contain extensive sequence homology to several coding domains within the human lysyl oxidase mRNA. Using one of these lysyl oxidase-like cDNAs, we obtained several overlapping genomic DNA recombinants. Restriction mapping and DNA sequence analysis revealed that the complete sequence of the lysyl oxidase-like mRNA was encoded by seven exons distributed throughout 25 kilobases of genomic DNA. Exons 2-6 encoded the region of greatest homology to lysyl oxidase. The size of these five exons, moreover, was exactly the same as the size of the corresponding exons within the lysyl oxidase gene. Northern blot analysis also revealed the concomitant appearance of lysyl oxidase and lysyl oxidase-like mRNA in several human tissues. It appears therefore that the genes encoding lysyl oxidase and a lysyl oxidase-like protein share a common evolutionary origin and may also be functionally related.
Lysyl oxidase is a copper-dependent enzyme responsible for the development of lysine-derived cross-links in the structural extracellular matrix proteins, collagen and elastin(1, 2) . This amine oxidase catalyzes allysine cross-links using lysine residues located within the telopeptide and collagenous domains of many procollagens (3) , the biosynthetic precursors of several collagen types. Desmosine and isodesmosine cross-links, in contrast, are the product of the lysyl oxidase-catalyzed deamination of lysine residues distributed throughout tropoelastin, the soluble precursor of insoluble elastin(1, 2) . While the products of lysyl oxidase catalysis have been well characterized, the mechanism(s) by which this enzyme interacts with both procollagen and tropoelastin substrates is unknown.
Lysyl oxidase has been isolated from a number of different
tissues from several phylogenetic species as an enzymatically active
extracellular matrix protein of 32 kDa(2) . Recently,
overlapping cDNA recombinants from chicken(4) ,
mouse(5) , rat(6) , and human(7, 8) tissues have been described that encode a 48-kDa
preprolysyloxidase. This enzymatically inactive precursor protein is
synthesized from a conserved, single copy gene which has been mapped,
in humans, to the long arm of chromosome 5 (7, 8) and, in
mouse, to chromosome 18 (9, 10, 11) . Both
human and mouse lysyl oxidase mRNAs are encoded by seven exons
distributed through 14 kb of both human and mouse genomic
DNA(12, 13, 15) .
Several years ago, Kagan and co-workers (16, 17) and Kuivaniemi (18) reported the presence of several chromatographic variants of lysyl oxidase in both human and bovine tissue. These enzymatically active variants were clearly not precursors of the mature enzyme nor were they derived from any obvious post-translational modification(s) of lysyl oxidase. Moreover, Kagan et al.(19) reported subtle but distinct differences in amino acid composition between isolated variants of lysyl oxidase, implicating the existence of multiple isoforms of the enzyme. Such isoforms could arise from separate lysyl oxidase genes or through alternate usage of exons within the known gene encoding lysyl oxidase and could provide a basis for understanding the mechanism of interaction of lysyl oxidase with multiple substrates.
We have
demonstrated very recently that exons within the lysyl oxidase gene are
not subject to alternate usage in a variety of human tissues. Kenyon et al.(20) , however, have reported
several overlapping cDNA clones that would encode a protein with
extensive amino acid sequence homology to the entire sequence of the
secreted form of lysyl oxidase. Although the function of this lysyl
oxidase-like mRNA is unknown, the existence of such an mRNA suggests
that multiple genes may encode structural and functional variants of
lysyl oxidase. In support of this hypothesis, this article reports the
complete structure of a multiexon gene which has a striking similarity
in exon-intron structure, exon sequence homology, and tissue-specific
expression to the human lysyl oxidase gene.
All
these oligonucleotides are derived from exon sequences flanking the
appropriate introns. Oligomers YK7, 24, 25, and 27 are all upstream
primers. Oligomers YK23, 26, 28, and 29 are downstream primers. The
sizes of intron 1 and intron 2 were determined by restriction mapping
of λ-phage recombinants and partial DNA sequencing of plasmid
subclones.
Alignment of the derived amino acid sequences for
human(7, 8) , rat(6) , mouse(5) , and
chicken (4) lysyl oxidases and the open reading frame of the
human lysyl oxidase-like cDNAs revealed a similar carboxyl-terminal
sequence that exhibited 76% homology between the human lysyl oxidase
and lysyl oxidase-like derived amino acid sequences (Fig. 1).
This homologous region encompassed the entire sequence of the mature
lysyl oxidase, including a copper-binding and other metal-binding
domains(22, 23, 24, 25) . The
copper-binding domain and the four histidine residues present within
this conserved sequence (WEWHSCHQHYH) that are involved in the copper
binding coordination complex are strictly conserved. Further, a growth
factor and cytokine receptor domain was also identified in both derived
amino acid sequences. The amino acid sequence
C-X-C-X-W-X-C-X-C (where X is a defined number of any amino acids)
is conserved in exons 5 and 6 of both genes, and it agrees with the
sequence C-X-C-X-W-X-C-X-C
that is a proposed extracellular ligand-binding domain for a number of
receptors for cytokines, prolactin, and growth
hormone(26, 27) .
Figure 1: Amino acid sequence alignment of the conserved domains of the human (H), rat (R), mouse (M) and chick (C) lysyl oxidase and the predicted lysyl oxidase-like (LOL) protein. Amino acids are indicated in single-letter code and numbered from the amino-terminal end of previously reported sequences for lysyl oxidase (8) and a lysyl oxidase-like (20) protein. Dashes indicate identical amino acids, and * identifies stop codons. Conserved cysteine residues, copper, and putative metal binding sequences are in shaded boxes. The four copper coordinating histidine residues within the copper-binding domain are in reverse font. T bars above the sequences illustrate exon-intron junctions and exon numbers determined for the human and mouse lysyl oxidase and the lysyl oxidase-like gene. A growth factor and cytokine receptor domain is underlined.
From a comparison of the derived amino acid sequences for the human lysyl oxidase-like protein and lysyl oxidases from several different phylogenetic species, it was evident that many of the substitutions within the carboxyl terminus of the lysyl oxidase-like protein are identical to amino acid variations within the analogous domains in lysyl oxidases from different species.
In contrast to the homology with the sequence of mature human lysyl oxidase, very little homology was evident in the region of the amino-terminal domain encoded by human lysyl oxidase-like mRNA. Similarly, no obvious homology was present between the 3'-untranslated regions of the lysyl oxidase and lysyl oxidase-like mRNAs.
Figure 2:
Structure of the human lysyl oxidase-like
gene. Comparison of the exon-intron structure of the lysyl oxidase-like (LOL) and lysyl oxidase (LO) genes. Non-conserved
exons are shown in hatched boxes and conserved exons by shaded boxes. Exons are numbered: Ex 1-7. E, EcoRI; P, PstI; H, HindIII; S, SacI. Relative positions of
genomic DNA inserts from the phage recombinants LOL-G1, G4, G7, and G11
are indicated. A 1-kb size marker is also indicated. Exact sizes of
exons (Ex) and introns (Int) are given in base pairs
in the table. The information summarizing the complete structure of the
human lysyl oxidase gene was obtained from two previous
reports(28) . The sizes of introns 1 and 2 within
the lysyl oxidase-like gene were derived from restriction mapping;
introns 3-6 were determined by PCR
analysis.
The overall structures of the lysyl oxidase and lysyl oxidase-like genes are very similar. Seven exons encode the 5'- and 3'-untranslated regions and the coding domains in each of the lysyl oxidase and lysyl oxidase-like mRNAs. Exons 2-6 encode the regions of greatest homology between the derived amino acid sequences of lysyl oxidase and the lysyl oxidase-like protein. The sizes of exons 2-6 are also exactly the same in both genes. In contrast, exon 1 is smaller in the lysyl oxidase gene than in the lysyl oxidase-like gene. Conversely, exon 7 is substantially larger in the lysyl oxidase gene than the corresponding exon in the lysyl oxidase-like gene. Exon 7 in the lysyl oxidase gene encodes a large 3'-untranslated region which has little homology to the smaller 3'-untranslated region in exon 7 of the lysyl oxidase-like gene. Similarly, exon 1 in the lysyl oxidase gene shares little homology with exon 1 in the lysyl oxidase-like gene.
Considerable divergence in size and sequence exists between all six introns in both genes. While intron sequence within each intron-exon junction in the lysyl oxidase-like gene conforms to the consensus sequence NCAG/GTRAGT characteristic of exon-intron junctions in eukaryotic genes(29) , no homology between intron sequences in both genes was apparent beyond these consensus sequences.
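As an illustration of how an intron can be checked against this consensus, the following minimal Python sketch expands the IUPAC codes (N = any base, R = A or G) into regular expressions. It assumes one particular reading of NCAG/GTRAGT: that NCAG is the 3' end of an intron and GTRAGT is the 5' start of an intron. The function names and the example intron strings are hypothetical and are not taken from the lysyl oxidase-like gene sequence.

```python
import re

# IUPAC nucleotide codes needed for the consensus quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def iupac_to_regex(pattern: str) -> str:
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in pattern.upper())


# Assumed reading of NCAG/GTRAGT: intron begins GTRAGT and ends NCAG.
DONOR = re.compile("^" + iupac_to_regex("GTRAGT"))
ACCEPTOR = re.compile(iupac_to_regex("NCAG") + "$")


def intron_matches_consensus(intron: str) -> bool:
    """Return True if both ends of the intron match the consensus."""
    seq = intron.upper()
    return bool(DONOR.search(seq)) and bool(ACCEPTOR.search(seq))


# Hypothetical intron sequences, for illustration only.
print(intron_matches_consensus("GTAAGTCCTTTTCTCTCCAG"))  # True
print(intron_matches_consensus("GTCCGTCCTTTTCTCTCCAG"))  # False
```

A strict match of this kind is only a first screen, since real junctions frequently deviate from the consensus at one or more positions.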
The positions of four of the six introns within the human lysyl oxidase-like gene resulted in split codons. The first intron interrupted codon 368 (a glycine codon) between the first and second nucleotides. The second intron interrupted codon 405 (encoding serine) between the second and third nucleotides. Similarly, the third intron interrupted codon 450 (a glutamine codon) between the second and third nucleotide. In contrast, the fourth intron interrupted the coding sequence between codons 502 (a glutamine codon) and 503 (a glycine codon), and intron 5 interrupted the coding sequence between codons 534 (a lysine codon) and 535 (a valine codon). Finally, the sixth intron resulted in a split codon 573 (encoding glutamine), interrupting the codon between the second and third nucleotides. These exon-intron junctions interrupt the lysyl oxidase codons at exactly the same positions.
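The codon positions given above can be turned into intron phases and into the coding lengths they imply for exons 2-6. The short Python sketch below does this arithmetic. It assumes that codon 1 is the initiating ATG and uses the usual phase convention (phase 0 between codons, phase 1 after the first nucleotide of a codon, phase 2 after the second). The exon lengths it prints are derived only from the positions stated in the text; they are not the measured values from the table in Fig. 2.

```python
# (intron number, last codon that begins upstream of the intron,
#  nucleotides of that codon lying upstream of the intron).
# A value of 3 means the intron falls between two complete codons.
INTRONS = [
    (1, 368, 1),  # glycine codon split after the first nucleotide
    (2, 405, 2),  # serine codon split after the second nucleotide
    (3, 450, 2),  # glutamine codon split after the second nucleotide
    (4, 502, 3),  # between codons 502 and 503
    (5, 534, 3),  # between codons 534 and 535
    (6, 573, 2),  # glutamine codon split after the second nucleotide
]


def coding_nt_before(codon: int, nt_upstream: int) -> int:
    """Number of coding nucleotides lying upstream of the intron."""
    return (codon - 1) * 3 + nt_upstream


positions = [coding_nt_before(codon, nt) for _, codon, nt in INTRONS]
phases = [pos % 3 for pos in positions]
# Coding length of exons 2-6: distance between consecutive intron positions.
exon_coding_lengths = [b - a for a, b in zip(positions, positions[1:])]

for (intron, _, _), phase in zip(INTRONS, phases):
    print(f"intron {intron}: phase {phase}")
for exon, length in zip(range(2, 7), exon_coding_lengths):
    print(f"exon {exon}: {length} nt implied")
```

Under these assumptions the phases come out as 1, 2, 2, 0, 0, 2. Because each intron of the lysyl oxidase gene interrupts its coding sequence at the same position within the codon, the phase pattern is the same in both genes.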
Figure 3: Transcription initiation sites determined by primer extension analysis. The mRNA start sites were determined using total RNA from human skin fibroblasts. Lane 1 is the primer extension reaction with primer YK 12. Lanes G, A, T, and C are sequencing reactions of unrelated DNA used for size determination.
The nucleic acid sequence of 1.2 kb of the 5'-flanking region of the lysyl oxidase-like gene did not reveal significant sequence homology (35.7%) to the corresponding region of the human lysyl oxidase gene. The lysyl oxidase-like gene promoter contains no typical TATA or CAAT box sequences. Potential transcription factor binding sites including four Sp1 sites, one TF-II-I motif, single Ap2 and Ap4 sites, and an octamer motif were identified. There is one GC box at -588 that overlaps with one of the Sp1 sites (TGGGCGGGGT). The presence of several recognition sequence elements for Hox proteins, zeste-Ubx (CGAGCG, CGCTCG), and zeste-white (CACTCA) suggests that the lysyl oxidase-like gene expression is under developmental regulation(30) . There are putative binding sites for the activator protein malT. The PU box (GAGGAA) that is a binding site for the ets-oncogene-related transcriptional activator is at -598. The nucleotide sequence TGTTCT at -83 is a recognition element of the glucocorticoid receptor, and it is indicative of steroid hormone regulation of the lysyl oxidase-like gene (31) (Fig. 4).
Figure 4: 5'-Flanking region of the human lysyl oxidase-like gene. The first nucleotide preceding the ATG codon is numbered -1. Potential regulatory consensus sequences and transcription factor binding sites are indicated by shaded boxes. Transcription initiation sites determined by primer extension analysis are indicated by arrowheads.
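A search of this kind can be sketched in a few lines of Python. The example below scans both strands of a 5'-flanking sequence for the exact motifs quoted in the text and reports positions in the numbering used in Fig. 4, where the nucleotide immediately preceding the ATG is -1. The function names, the motif labels, and the short test sequence are hypothetical; the sketch only does exact string matching and is not the search procedure used in the original analysis.

```python
# Motif sequences quoted in the text; the labels are illustrative.
MOTIFS = {
    "Sp1/GC box": "TGGGCGGGGT",
    "zeste-Ubx (CGAGCG)": "CGAGCG",
    "zeste-Ubx (CGCTCG)": "CGCTCG",
    "zeste-white": "CACTCA",
    "PU box": "GAGGAA",
    "glucocorticoid receptor element": "TGTTCT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motifs(flank: str, motifs=MOTIFS):
    """Find exact motif matches on both strands.

    `flank` must end at the nucleotide immediately preceding the ATG, so the
    last base is numbered -1. Each hit is (position, label, strand), where
    position is the 5'-most base of the match on the given (top) strand.
    """
    flank = flank.upper()
    hits = []
    for label, motif in motifs.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = flank.find(pattern)
            while start != -1:
                hits.append((start - len(flank), label, strand))
                start = flank.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical 18-nucleotide flank, for illustration only.
for hit in find_motifs("AAGAGGAATTTGTTCTAA"):
    print(hit)
```

Note that CGAGCG and CGCTCG are reverse complements of each other, so a single site is reported under both labels, once on each strand.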
Figure 5:
Northern blot analysis of RNA from
several human tissues. MTN blots containing poly(A) RNA isolated from several human tissues were used as described
under "Materials and Methods." Panel A,
autoradiogram obtained following incubation of the MTN blot with a
radiolabeled lysyl oxidase-like cDNA and exposed for 24 h. Panel
B, the detection of lysyl oxidase mRNAs in a separate MTN blot
exposed for 16 h. Panel C, the MTN blot used in panel A was stripped, rehybridized with an actin cDNA, and exposed for 10
min. The tissue source of each RNA sample on the MTN blots is indicated
at the top of each panel. The position of RNA molecular weight markers
(in kilobase pairs) is also indicated on each MTN
blot.
Gene duplication is a common mechanism for encoding similar but genetically distinct protein variants. The genes encoding tissue-specific isoforms of cytochrome c oxidase arose through duplication of an ancestral multiexon gene, and, consequently, an identical exon-intron structure was observed within these genes(34, 35) . Similarly, the globin gene cluster evolved from an ancestral gene by consecutive rounds of gene duplication(36) . Variation of intragenic conservation of exon size within the fibrillar collagen genes is also an example of a complex type of gene duplication(37, 38) .
It is evident that all or
part of the lysyl oxidase and lysyl oxidase-like genes shared a common
ancestor. While exons 2-6 are highly conserved in both sequence
and size in both genes, exons 1 and 7 have little sequence homology and
no similarity in size. The ancestral gene for lysyl oxidase and the
lysyl oxidase-like protein may have contained only exons 2-6.
Duplicated copies of this ancestral sequence could have maintained the
conserved sequence and size of these five exons while additional exons
were added to an evolving multiexon structure through a process of exon
shuffling. Alternatively, it is possible that duplication of an
ancestral gene involved all seven exons and six introns. Two of those
exons and all of the intron sequences may then have diverged in sequence and in
size. For example, the size of the first intron of the lysyl oxidase
gene is conserved in mouse, rat, and human, and the sequence contains
several transcription factor binding sites in all three species. The same intron in the lysyl oxidase-like gene differs in
size and has no sequence homology to the lysyl oxidase intronic
sequences, suggesting a different role for this part of the gene.
Most of the coding domain for the secreted, active form of lysyl oxidase is contained within exons 2-6. Ten of the 12 cysteine residues present in this catalytic domain in human, rat, mouse, and chicken lysyl oxidases are conserved. Six of these exist as disulfide bonds and are responsible for the functional conformation of the mature lysyl oxidase enzyme. The presence of these cysteine residues in the lysyl oxidase-like protein suggests that the tertiary structure of this protein is similar to that of lysyl oxidase.
The first exon of the lysyl oxidase gene encodes a 5'-untranslated region, a signal peptide and a propeptide sequence responsible for transport and secretion, and the first few amino acids of the functional, secreted lysyl oxidase. Most of this first exon exhibits the greatest divergence among different species. The amino-terminal end of the derived amino acid sequence encoded within exon 1 of the lysyl oxidase-like gene is homologous to the three domains characteristic of a signal sequence(40) . These include a positively charged amino-terminal domain (MALAR) (codons 1-5), a central hydrophobic region (GSRQLGALVWGACL) (codons 6-19), and a carboxyl-terminal region (CVLVHGQ) (codons 20-26). The 3'-end of exon 1 encodes a peptide sequence with little homology to the propeptide sequence of preprolysyl oxidase. However, the presence of putative endopeptidase cleavage sites between arginine residues at positions 85 and 86, 93 and 94, and 272 and 273 suggests that this peptide may function as a propeptide domain. Proteolytic processing at these sites would result in 49-, 48-, and 31-kDa protein products. The divergent sequence within the first exon of the lysyl oxidase-like gene may therefore encode a signal sequence and propeptide domain within a preprolysyl oxidase-like protein. These domains may regulate intracellular and extracellular transport of a secreted protein, although the pathway may differ from that of preprolysyl oxidase.
Exon 7 in both the lysyl oxidase and the
lysyl oxidase-like genes contains only five nucleotides of coding
sequence, a stop codon, and a 3'-untranslated region. In the lysyl
oxidase gene this region is 3.5 kb in length and contains several
differentially used polyadenylation signals that result in
multiple-sized lysyl oxidase mRNAs. Only one polyadenylation signal was observed
in exon 7 of the lysyl oxidase-like gene, and the position of this
consensus sequence within the 3'-untranslated region was consistent
with the appearance of a 2.3-kb lysyl oxidase-like mRNA. No other
polyadenylation signals were detected downstream of the observed
consensus sequence, and no higher molecular weight lysyl
oxidase-like mRNAs were detected by Northern blot analysis. Exon 7
seems therefore to be a far shorter exon in the lysyl oxidase-like
gene. The difference in size and sequence suggests either separate or
divergent evolutionary origins, as well as different potential effects on mRNA
stability and other post-transcriptional control mechanisms mediated by
3'-untranslated domains(41) .
Lysyl oxidase has been isolated
as an enzymatically active family of polypeptides of 32 kDa that appear
in several tissues and in several phylogenetic species as at least four
isoforms(17) . The mechanistic and functional basis for these
chromatographically isolated variants of lysyl oxidase is unknown. It
is clear, however, that these isoforms of lysyl oxidase individually
catalyze cross-link formation using both collagen and elastin
substrates(17) . Moreover, these isoforms are not
proteolytically processed precursors of one another nor do they seem to
be post-translationally modified variants of a secreted precursor
protein(16, 17, 18) . Kagan and co-workers (19) have shown, however, that subtle differences in amino acid
composition exist between the isolated isoforms, suggesting the
possible existence of genetically distinct variants of lysyl oxidase.
We have recently shown that such variants could not arise through
alternate usage of exons within the human lysyl oxidase gene. Therefore, it is possible that the lysyl oxidase-like gene could
encode a functional variant of lysyl oxidase. Concomitant appearance of
lysyl oxidase and lysyl oxidase-like mRNA in several human tissues
suggests that co-ordinate expression of both lysyl oxidase and the
lysyl oxidase-like gene could be important to overall lysyl oxidase
activity. It is well known, for example, that preparations of lysyl
oxidase readily aggregate in vitro, and cross-link activity is
routinely assayed in the presence of urea(16) . It is possible
therefore that in vivo lysyl oxidase is a complex of subunits
synthesized from at least two, but possibly as many as four, different
genes.
A possible subunit structure for lysyl oxidase would provide the basis for understanding the mechanism(s) whereby lysyl oxidase can mediate multiple catalytic functions. Variation in subunit composition, for example, may explain how lysyl oxidase interacts with procollagen and tropoelastin substrates. A subunit structure may also be important in understanding the recently reported role of lysyl oxidase in ras-mediated tumor suppression(5, 14, 39, 42) . The presence of a cytokine receptor domain in both lysyl oxidase and lysyl oxidase-like proteins may be particularly important to a multifunctional enzyme or enzyme complex that not only plays a significant role in the maintenance of extracellular matrix structure but also functions, either directly or indirectly, to maintain an equilibrium between cell proliferation and differentiation.