(Received for publication, May 17, 1995)
From the
Multimerin is a massive, soluble protein found in platelets and
in the endothelium of blood vessels. Multimerin is composed of variably
sized, disulfide-linked multimers, the smallest of which is a
homotrimer. Multimerin is a factor V/Va-binding protein and may
function as a carrier protein for platelet factor V. The cDNA for human
multimerin was isolated from
Multimerin is a large, soluble protein (1, 2) stored within platelet
Multimerin is one of the largest proteins found in platelets and
endothelial cells, with most of its multimers exceeding a million
daltons in size(1, 2, 3, 4) . A number
of parallels exist in the protein trafficking and storage of multimerin
and von Willebrand factor. Multimerin resembles von Willebrand factor
in its complex, disulfide-linked multimeric structure (2, 4) . However, unlike von Willebrand factor, which
is assembled from dimers, the smallest multimer of multimerin is a
400-kDa homotrimer (2) . Both proteins are stored within the
electron-lucent zone of platelet
In this report, we describe the isolation, sequencing, and deduced
amino acid sequence of human endothelial cell multimerin cDNA. These
studies identify multimerin as a unique protein, unrelated to von
Willebrand factor, with RGDS, EGF-like, and coiled-coil domains, and a
carboxyl-terminal region that resembles the trimeric, carboxyl-terminal
globular domains of complement C1q and collagens type VIII and X.
Lambda DNA was
purified from plate lysates(10) , digested with EcoRI,
and the multimerin cDNA clones were subcloned into the EcoRI
site of pGEM 7Zf+ (Promega).
Screening of 300,000 plaque-forming units from the
Clontech endothelial cell cDNA expression library yielded seven
multimerin immunoreactive clones (mm
Figure 2:
Nucleotide sequence and deduced amino acid
sequence of human endothelial cell multimerin. An in-frame stop codon
in the 5`-untranslated region is underlined. An internal EcoRI site is indicated. The 3`-untranslated region contains a
polyadenylation signal and terminates in a poly(A) tail. Amino acids
368-376 were confirmed by sequencing an internal peptide fragment
of multimerin. The location of the most probable signal peptide is
indicated. Potential N-linked glycosylation sites (dot) are noted, and cysteine residues are underlined. RGDS and EGF-like domains are indicated. A partial
EGF-like domain, lacking the first cysteine of the EGF consensus
sequence CXCXXXXXGXXC, is also
indicated.
Figure 1:
Multimerin
sequencing strategy. Arrows indicate the direction and
distance of sequencing. Clones subcloned as PCR fragments are indicated
with "pcr" in their name. The consensus clone
length was 4212 bp.
The radiolabeled mm
The sequence of the human endothelial
cell multimerin cDNA is shown in Fig. 2. The open reading frame
is preceded by a 71-bp noncoding region containing an in-frame stop
codon. The initiation ATG conforms to Kozak's consensus sequence
for initiation (14) and is followed by an open reading frame of
1228 codons and a 3`-noncoding region of 454 bp containing a
polyadenylation signal and a poly(A) tail.
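The segment lengths quoted above can be checked against the 4212-bp consensus length. The following Python sketch is only an arithmetic illustration. It assumes that the stop codon is not counted among the 1228 codons and that the 454-bp 3`-noncoding region includes the poly(A) tail; neither point is stated explicitly in the text.

```python
# Arithmetic check of the cDNA layout described in the text.
# Assumptions (not stated in the paper): the stop codon is counted
# separately from the 1228 codons, and the 3' noncoding region
# includes the poly(A) tail.

FIVE_PRIME_UTR_BP = 71     # noncoding region preceding the ATG
CODING_CODONS = 1228       # open reading frame, in codons
STOP_CODON_BP = 3          # one termination codon
THREE_PRIME_UTR_BP = 454   # 3' noncoding region


def cdna_length(utr5_bp, codons, stop_bp, utr3_bp):
    """Return the total cDNA length in base pairs."""
    return utr5_bp + codons * 3 + stop_bp + utr3_bp


total_bp = cdna_length(
    FIVE_PRIME_UTR_BP, CODING_CODONS, STOP_CODON_BP, THREE_PRIME_UTR_BP
)
print(total_bp)
```

Under these assumptions the segments sum to 4212 bp, which matches the length reported for the sequence derived from the overlapping clones.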
The multimerin cDNA encodes a protein
of 1228 amino acids with a calculated molecular mass of 138 kDa (Fig. 2). Analysis for the signal peptide cleavage site, using
Prosite (PC Gene), indicated that the most probable cleavage site was
between amino acids 19 and 20. The protein, minus the signal peptide,
has a predicted molecular mass of 136 kDa, which is in close agreement
with the 132-kDa nonglycosylated precursor identified by metabolic
protein labeling studies of Dami cell (3) and endothelial cell
multimerin. Analysis of the
multimerin protein for functional domains using MacVector and PC Gene
identified the adhesive motif RGDS (amino acids 186-189) (15, 16) and an EGF-like (17, 18) domain (amino acids 1065-1076).
Consensus sequences for a tyrosine sulfation site (amino acid 1038) (19, 20) and an asparagine hydroxylation site (21) (amino acid 1058) were identified adjacent to the EGF-like
domain. The protein contained 23 potential N-glycosylation
sites(22, 23, 24, 25) , which is in
close agreement with the 17-21 sites predicted by N-deglycosylation of endothelial cell and Dami cell
multimerin(3) . Search of the NCBI data
banks using the BLASTP algorithm and BLOSUM 62 matrix indicated that
multimerin was a novel protein. Assessment of the high scoring
homologous sequences identified similarities between multimerin and a
large number of proteins that contain EGF-like domains. The highest
scoring elements were Xotch (26) and its homologues in other
species. These homologues are transmembrane proteins, containing
multiple EGF-like domains, and are important for neurogenic
development. The highest scoring human proteins with homology to the
EGF-like domain of multimerin included TAN-1(27) , a homologue
of Xotch and Notch(28) , fibroblast proteoglycan core protein (29) , and coagulation factors IX (30, 31, 32) and
X(33, 34, 35) . Fig. 3(upper
panel) shows the comparison alignments for the EGF-like domain of
multimerin. An additional region of homology between multimerin and
Xotch, Notch, and TAN-1 proteins was identified spanning amino acids
265 and 303 of multimerin (Fig. 3, lower panel). This
region of multimerin contains three cysteine residues but lacks the
first cysteine in the EGF-like consensus sequence
CXCXXXXXGXXC. This domain is homologous with
EGF-like domains in Xotch, Notch, and TAN-1 and in proteoglycan core
proteins.
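The EGF-like consensus quoted above, CXCXXXXXGXXC, can be expressed as a simple pattern search. The Python sketch below is an illustration only and is not the method used by MacVector or PC Gene. The function name find_motif and the toy sequence are hypothetical, and the partial pattern is one possible reading of "lacking the first cysteine".

```python
import re

# EGF-like consensus as written in the text: CXCXXXXXGXXC (X = any residue).
EGF_CONSENSUS = re.compile(r"C.C.{5}G.{2}C")
# One reading of the partial domain: the same pattern without the first
# cysteine, so the match starts at the second cysteine.
PARTIAL_EGF = re.compile(r"C.{5}G.{2}C")


def find_motif(pattern, sequence):
    """Return 1-based (start, end) positions of all matches, including overlaps."""
    # A lookahead wrapper lets overlapping matches be reported.
    lookahead = re.compile("(?=(" + pattern.pattern + "))")
    hits = []
    for match in lookahead.finditer(sequence):
        start = match.start(1)
        hits.append((start + 1, start + len(match.group(1))))
    return hits


# Hypothetical toy sequence, not taken from multimerin.
toy = "AAACACDEFGHGKLCAAA"
print(find_motif(EGF_CONSENSUS, toy))
print(find_motif(PARTIAL_EGF, toy))
```

A pattern this short will match many unrelated sequences, so hits would still need to be judged against the alignments of the kind shown in Fig. 3.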
Figure 3:
Comparison alignments of multimerin with
proteins containing EGF-like domains. The upper panel shows
the comparison of the EGF-like domain in multimerin with the EGF-like
domains of the highest scoring, homologous protein sequences. The lower panel is a comparison of a domain of multimerin with
similarity to EGF-like domains in Xotch, Notch, and the human homologue
TAN-1. This region of multimerin lacks the first cysteine residue of an
EGF-like consensus sequence. Boxed residues indicate consensus
residues, and an asterisk indicates the conserved residues in an
EGF-like motif.
A number of proteins, including the rod-like tail of many
myosin heavy chains (from a variety of species and tissue
types)(36, 37, 38) ,
macrogolgin(39) , and the tpr oncogene(40) ,
showed homology with regions in the central portion of multimerin
between amino acids 317 and 1024. These homologous proteins are known to
contain coiled-coil structures. Multiple sequence comparisons using the
Clustal program (DNA Star) and a PAM 250 matrix revealed one region
where residues were conserved between multimerin (amino acids
476-498) and many myosin heavy chain sequences (Fig. 4).
The similarities between multimerin and a variety of coiled-coil
proteins suggested that there could be coiled-coil structures within
multimerin. This possibility was investigated using the sequence analysis
program PEPCOIL. The multimerin polypeptide sequence contained
regions of high probability for coiled-coil structures in the region of
the protein that was similar to other coiled-coil proteins. The
probable coiled-coil structures were located at amino acids
317-375, 400-445, 668-738, and 818-873 (Fig. 5).
Figure 4:
Comparison alignments of multimerin with
the rod-like tail of myosin heavy chains. Boxed residues
indicate consensus residues.
Figure 5:
The probability of coiled-coil structures
in multimerin. A number of highly probable coiled-coil regions are
identified between amino acids
317-873.
An additional region of significant homology was
observed in the carboxyl terminus of the multimerin sequence. This
region of multimerin was found to resemble the trimeric,
carboxyl-terminal, non-collagen-like, globular domain of several other
proteins. These included the A, B, and C chains of human and mouse
complement C1q protein(41, 42, 43, 44, 45, 46) and
human
Figure 6:
Comparison alignments of multimerin with
the COOH-terminal globular domains of complement C1q A, B, and C chains
and human
A schematic summary of the various multimerin domains is shown in Fig. 7. Based upon the multimerin constructs that contained the
epitope recognized by the monoclonal antibody JS-1, the JS-1 epitope
was localized to the region containing amino acids 961-1139, which
includes the EGF-like domain and a portion of the COOH-terminal
globular domain.
Figure 7:
Multimerin protein domains. Domains
identified by protein analysis and the location of the monoclonal
antibody JS-1 epitope are shown.
Because of the many similarities observed in the
multimeric structures and protein trafficking of multimerin and von
Willebrand factor (2, 4),
Our earlier
studies of multimerin biosynthesis identified a p-170 precursor
protein, containing high mannose-linked carbohydrate(3) . The
132-kDa polypeptide component of this p-170 multimerin precursor is in
close agreement with the size of the deduced amino acid sequence minus
its signal peptide (136 kDa). Based on these data, we have designated
the multimerin precursor protein (minus the signal sequence) as
promultimerin. During biosynthesis, the N-linked carbohydrates
on promultimerin are converted to complex forms to produce a p-186
protein in endothelial cells and a p-196 protein in Dami
cells(3) .
Figure 8:
Northern analysis of multimerin expression
in cultured cells and in human tissues. The expression of multimerin (upper panels) and actin (lower panels) mRNA in
isolated human cells and in human tissues is shown. The left panels show cultured cells and platelets (20 µg of total RNA/lane),
and the right panels are a multiple tissue Northern
(Clontech; 2 µg of poly(A)
The data presented in this report confirm that multimerin is a
unique protein. Further studies are required to define the precise
roles of multimerin's RGDS, coiled-coil, EGF-like, partial
EGF-like, and C1q-like domains and to determine how the multimerin
subunits are assembled into the large, disulfide-linked multimers. The
coiled-coil and globular domains are likely sites for interchain
associations. We postulate that the RGDS, EGF-like, and C1q-like
domains will prove to be important sites for the interaction of
multimerin with other proteins. Knowledge of the functions of these
domains and of multimerin's tertiary structure may provide
insights into the molecular mechanisms that control hemostasis.
Additional clues may be provided by investigations of individuals who
are deficient in multimerin.
The nucleotide
sequence(s) reported in this paper has been submitted to the
GenBank®/EMBL Data Bank with accession number(s)
U27109.
λgt11 endothelial cell libraries using
antibodies, and the isolated cDNA clones were used to obtain the full
sequence. The full-length multimerin cDNA was 4.2 kilobase pairs.
Northern analyses identified a 4.7-kilobase transcript in cultured
endothelial cells, a megakaryocytic cell line, platelets, and highly
vascular tissues. The multimerin cDNA can encode a protein of 1228
amino acids with the probable signal peptide cleavage site between
amino acids 19 and 20. The protein is predicted to be hydrophilic and
to contain 23 N-glycosylation sites. The adhesive motif RGDS
(Arg-Gly-Asp-Ser) and an epidermal growth factor-like domain were
identified. Sequence searches indicated that multimerin is a unique
protein. Analyses identified probable coiled-coil structures in the
central portion of the multimerin sequence. Additionally, the
carboxyl-terminal region of multimerin resembles the globular,
non-collagen-like, carboxyl-terminal domains of several other trimeric
proteins, including complement C1q and collagens type VIII and X.
α-granules (3) and endothelial cell Weibel-Palade bodies.
Following activation of these cells, multimerin is
released and binds to the cell surfaces of
platelets (1, 2, 3, 4) and
endothelial cells
and to the extracellular matrix. In
vivo, multimerin is restricted to megakaryocytes, platelets, and
the endothelium and subendothelium of blood vessels, and it is not
found in the plasma(3, 4) .
Recent studies
have identified multimerin as a factor V/Va-binding protein.
In resting platelet lysates, multimerin is complexed with
factor V, and immunoelectron microscopy studies indicate that factor V
and multimerin are stored together within platelet
α-granules.
However, following platelet activation and
the release of multimerin and factor V, these two proteins
dissociate.
These findings suggest that multimerin may play
a role in the storage and stabilization of platelet (but not plasma)
factor V and also indicate separate functions for these proteins on
activated platelets. The avid association of multimerin with activated
platelets (1, 4) and endothelial cells
suggests that there may be other functions, possibly adhesive,
for multimerin once it is released from intracellular stores.
The variable molecular weight of multimerin is due to differences
in the number of multimerin subunits comprising the
protein(1, 2, 4) . Multimerin is highly
glycosylated, with complex N-linked carbohydrate accounting
for about one-third of its molecular mass(3) . It is
synthesized as a p-170 protein (132-kDa polypeptide component)
containing high mannose, N-linked carbohydrates, which are
then converted to complex forms(3) . During biosynthesis,
interchain disulfide bonds form to generate homotrimers and larger
homomultimers(2, 3, 4) . Proteolysis of the
subunits occurs (without disrupting the multimeric structure), leading
to the stable p-155 subunit that is stored in
platelets(1, 2, 3, 4) .
α-granules and within the
Weibel-Palade bodies of endothelial cells(3) .
They
are constitutively secreted as small multimers, and their intracellular
stores are enriched in high molecular weight
multimers(5) .
However, unlike von Willebrand
factor, multimerin is not detectable in plasma and differs in its
subunit size and glycosylation (1, 3, 4).
cDNA Libraries
Human endothelial cell cDNA
libraries in λgt11 were obtained from Clontech (5` stretch human
endothelial cDNA library; Palo Alto, CA) and additional human umbilical
vein endothelial cell libraries (VII-91-4 and VII-91-5) were a generous
gift from J. Evan Sadler (6) (St. Louis, MO).
Screening of λgt11 cDNA Libraries
Libraries were screened for clones (7) expressing the multimerin protein
using both monoclonal and polyclonal anti-multimerin antibodies (1) (1:1000 dilution), alkaline phosphatase-conjugated
secondary antibodies (1:10,000 dilution, Promega, Madison, WI), and
nitro blue tetrazolium-5-bromo-4-chloro-3-indolyl phosphate (Sigma) for
detection. All of the clones identified by antibody screening were
immunoreactive with both monoclonal and polyclonal multimerin antisera.
Following cloning and sequencing of the initial isolates, the most 5`
clone was labeled and used to rescreen the λgt11 library for
full-length cDNA clones (8, 9).
PCR Amplification of Inserts
The multimerin cDNA inserts in λgt11 were amplified using PCR and
primers specific for λgt11, as described previously (11).
The inserts were subcloned and their flanking λgt11 sequences were
used to determine the orientation of the multimerin cDNAs (12).
These clones were also used for sequencing across an internal EcoRI site, using multimerin-specific sequencing primers.
DNA Labeling
Multimerin cDNA inserts were labeled
with [α-32P]ATP, using a random primer
labeling kit (U. S. Biochemical Corp.), for use in library screening
and Northern blotting. The 5` EcoRI fragments of the
full-length multimerin cDNA (mm
17) and of mm
4 were used for
Northern analyses.
DNA Sequencing
Double-stranded sequencing of
overlapping clones was performed using manual (dideoxy sequencing with
Sequenase, U. S. Biochemical Corp.) and automated sequencing of Qiagen
miniprep DNA (Qiagen, Chatsworth, CA). Automated DNA sequencing was
performed by the Central Facility of the Institute for Molecular
Biology and Biotechnology, McMaster University, using an Applied
Biosystems (model 373A) automatic DNA sequencer. Sequencing was done
using dyedeoxy terminator technology with cycle sequence Taq according to the manufacturer's instructions. Primers used
for sequencing included M13 universal forward and reverse primers.
Multimerin-specific primers were used to fill in gaps in the aligned
sequence of exonuclease III-deleted clones. PCR-derived clones were
used only for studies of orientation and for sequencing across the
internal EcoRI site.
Data Analyses
Sequence analyses and alignments
were performed using the MacVector and Assemblylign programs (Eastman
Kodak Co.) and PC Gene (IntelliGenetics, Inc., Mountain View, CA). Data
bank searches (NCBI at National Library of Medicine, Bethesda, MD:
nonredundant
PDB+SwissProt+SPupdate+PIR+GenPept+GPupdate) were
performed using the BLASTP algorithm. Further alignments of homologous
sequences were performed using the Clustal program (DNA Star Ltd.,
London) and a PAM 250 matrix. Analyses for coiled-coil structures were
performed by Dr. Seth Darst, Rockefeller Institute, using the program
PEPCOIL (Genetics Computer Group, Madison, WI).
Protein Purification and Sequencing
Multimerin was
purified from outdated platelet concentrates by affinity
chromatography, as described(1) . 15 µg of the purified
protein was used for preparative SDS-polyacrylamide gel electrophoresis
(reduced) and transblotted onto a polyvinylidene difluoride membrane
(Bio-Rad), following the manufacturer's instructions. The 155-kDa
multimerin subunit was localized using Ponceau Red, excised from the
membrane, and used to obtain internal amino acid sequence data. Protein
digestion (lysylendopeptidase C), peptide separation, and amino acid
sequencing were performed by the Harvard Microchemistry Facility
(Cambridge, MA) using an ABI 477A protein sequencer with a 120A PTH-AA
analyzer.
Northern Analyses
Northern analysis was performed
as described(9, 13) . RNA was isolated from first
passage endothelial cells, platelets (washed platelet
pellet (1) from 30 ml of whole blood), and from resting and
PMA-treated Dami cells (3) using TRIzol (Life Technologies,
Inc.). 20 µg of total RNA was loaded per lane (1% agarose gels),
and RNA markers (Promega) were used to determine the transcript sizes.
Hybridization was performed using 32P-labeled multimerin
cDNA (mm
4 and mm
17) to analyze the expression of multimerin
in cultured cells and in multiple tissues (multiple tissue Northern;
Clontech), using high stringency washes, as recommended by the
manufacturer. Multimerin expression was compared with actin (cDNA probe
supplied by Promega) as the control.
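The text states that RNA markers were used to determine transcript sizes but does not describe the calculation. A common approximation is that log10 of fragment size falls roughly linearly with migration distance, so a size can be read from a line fitted to the markers. The Python sketch below illustrates that idea only. The function names and the marker values are hypothetical and are not data from this study.

```python
import math


def fit_log_linear(markers):
    """Least-squares fit of log10(size_kb) against migration distance.

    markers: list of (distance_mm, size_kb) pairs.
    Returns (slope, intercept) of the fitted line.
    """
    xs = [distance for distance, _ in markers]
    ys = [math.log10(size) for _, size in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_size_kb(distance_mm, slope, intercept):
    """Estimate transcript size (kb) from its migration distance."""
    return 10 ** (slope * distance_mm + intercept)


# Hypothetical marker data (distance in mm, size in kb); not from the paper.
markers = [(10.0, 10.0), (20.0, 5.0), (30.0, 2.5), (40.0, 1.25)]
slope, intercept = fit_log_linear(markers)
print(estimate_size_kb(25.0, slope, intercept))
```

Real gels deviate from this relation at the extremes of the size range, so an estimate is most reliable when the band lies between two markers.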
Molecular Cloning and Sequencing of Full-length
Multimerin cDNA
The deduced protein sequence of multimerin was
investigated by cloning and sequencing of the multimerin cDNA. As
previous studies indicated synthesis of multimerin by endothelial
cells, human endothelial cell cDNA libraries in λgt11
were chosen for these studies. Because the NH2 terminus
sequence of multimerin was found to be blocked, antibodies were used
for screening. The full-length multimerin cDNA was predicted to
contain a 3.6-kbp open reading frame, based upon the 132-kDa
polypeptide component of the multimerin precursor(3) .
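The paper does not say how the 3.6-kbp prediction was derived from the 132-kDa polypeptide. One plausible route, sketched below in Python, uses the rule of thumb of about 110 Da per amino acid residue. That average residue mass and the function name estimate_orf_bp are assumptions introduced here for illustration.

```python
# Rough estimate of open reading frame size from polypeptide mass.
# Assumption: an average residue mass of about 110 Da, a common rule of
# thumb; the paper does not state how its estimate was obtained.

AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_orf_bp(polypeptide_mass_da, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Estimate coding-region length (bp) for a polypeptide of the given mass."""
    residues = round(polypeptide_mass_da / residue_mass_da)
    return residues * 3  # three nucleotides per codon


orf_bp = estimate_orf_bp(132_000)
print(orf_bp)
```

With these inputs the estimate is 3600 bp, in line with the 3.6-kbp figure quoted above.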
1-7). Sequencing of the
5` and 3` ends of mm
4, mm
5, and mm
7 identified overlap
in their sequences, and all three isolates were recognized by both
monoclonal (JS-1) and polyclonal multimerin antisera. All of the
λgt11 libraries screened were constructed using EcoRI
adapters. Digestion of mm
2 and mm
5 liberated two EcoRI fragments, indicating the presence of at least one
internal EcoRI site in the multimerin cDNA. To identify the
orientation of the expressed clones in
λgt11 and the number of
internal EcoRI sites, the complete mm
5 and mm
7
inserts were amplified using primer sites in the
λ arms (11, 12) and subcloned into pGEM. Sequencing of the
PCR-amplified inserts identified the orientation of the fragments and
indicated that only one internal EcoRI site was present in
mm
5. A 3` 729-bp fragment containing a polyadenylation signal site
and terminating in a poly(A) tail (Fig. 2) was identified. The
most 5` clone identified by antibody screening (mm
4, 1.4 kbp)
terminated at the internal EcoRI site. Exonuclease III
deletions of the overlapping clones (mm
4 and mm
7) were
created to fully sequence the cDNA fragments in both directions (Fig. 1).
4 cDNA was used to screen
the libraries VII-91-4 and VII-91-5 for a full-length cDNA clone. Two
clones were isolated that contained additional 5` sequence: mm
11
(isolated from VII-91-4; 2.1- and 0.75-kbp EcoRI fragments)
and mm
17 (isolated from VII-91-5; 3.7-kbp EcoRI
fragment). The first isolate, mm
11, was used to obtain additional
sequence 5` of mm
4, using multimerin-specific primers.
Subsequently, exonuclease III deletions were created from mm
17,
and the region 5` (and overlapping) of mm
4 and mm
11 was fully
sequenced in both directions. Sequencing of the 5` end of mm
17
identified an in-frame stop codon prior to an initiation codon. This
reading frame was consistent with the multimerin fusion proteins that
had been identified by antibody screening of the λgt11
library (12). These findings indicated that a complete cDNA
sequence had been obtained. The sequence derived from the overlapping
clones was 4212 bp in length.
Deduced Protein Sequence of Multimerin and Homology with
Other Proteins
The putative protein encoded by the multimerin
cDNA was compared with amino acid sequence data obtained from purified
platelet multimerin (Fig. 2). High confidence sequence data were
obtained from an internal peptide fragment of platelet multimerin.
These sequence data were in complete agreement with the predicted
protein sequence (amino acids 368-376), indicating that the cDNA
sequenced encodes multimerin. Kyte-Doolittle plots indicated that the protein
was hydrophilic, consistent with the partitioning of multimerin into
the aqueous phase of Triton X-114 platelet extracts(1) .
Alignment of the multimerin polypeptide sequence to itself using
MacVector and a PAM250 matrix failed to identify significant internal
repeats within the multimerin protein sequence.
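The hydrophilicity assessment mentioned above relies on Kyte-Doolittle hydropathy values averaged over a sliding window. The Python sketch below uses the published Kyte-Doolittle scale. The window length of 9, the function names, and the toy sequences are assumptions for illustration, since the text does not give the settings that were used.

```python
# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KD[residue] for residue in sequence]
    if window > len(scores):
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def mean_hydropathy(sequence):
    """Return the average hydropathy over the whole sequence."""
    return sum(KD[residue] for residue in sequence) / len(sequence)


# Hypothetical toy sequences, not taken from multimerin.
print(mean_hydropathy("KKDDEERS"))        # charged residues give a negative mean
print(hydropathy_profile("IIIII", window=5))
```

A predominantly negative profile corresponds to a hydrophilic sequence, which is the behavior the text reports for multimerin.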
The RGDS site was located in an
unglycosylated region of the molecule with a high local flexibility
score (MacVector, Karplus-Schulz analysis).
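Short motifs such as RGDS and potential N-glycosylation sites can be located with a direct scan of the protein sequence. The Python sketch below uses the standard N-linked sequon definition, Asn-X-Ser/Thr with X being any residue except proline. Whether MacVector and PC Gene applied exactly this rule is an assumption, and the function names and toy sequence are hypothetical.

```python
import re

# N-linked glycosylation sequon: Asn-X-Ser/Thr, where X is not Pro.
# The lookahead allows overlapping sequons to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


def find_rgds(sequence):
    """Return 1-based start positions of the adhesive motif RGDS."""
    return [
        i + 1
        for i in range(len(sequence) - 3)
        if sequence[i:i + 4] == "RGDS"
    ]


# Hypothetical toy sequence, not taken from multimerin.
toy = "MKNGTARGDSNPSLNVT"
print(find_sequons(toy))
print(find_rgds(toy))
```

A sequon match indicates only a potential site; as the text notes, the number of occupied sites was estimated separately by N-deglycosylation.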
α-1 collagens type VIII (47) and X (47, 48, 49, 50, 51). The
protein sequence comparisons for this domain indicate conservation of
hydrophobic and uncharged residues (Fig. 6). Electron microscopy
studies of these homologous proteins have shown that their
carboxyl-terminal domains form a globular-shaped
head(53, 54, 55) . In C1q, the globular
domain is assembled from the carboxyl-terminal regions of an A, a B,
and a C chain(56) . The globular domains of C1q and collagens
type VIII and X are implicated in protein interactions. This domain in
the complement C1q protein is known to interact with the Fc portion of
IgG and other complement activator molecules, leading to complement
activation(56) . In collagen type VIII, which is a heterotrimer
of α-1 and α-2 chains, the COOH-terminal globular domain is
implicated in the assembly of a mesh-like structure of collagen type
VIII molecules(54) . Collagen type X, a homotrimeric protein,
contains a similar COOH-terminal globular
domain(47, 48, 49, 50, 51) .
In collagen type X, the carboxyl-terminal globular domains associate to
form a hexagonal mesh of collagen molecules(55) .
α-1 collagens type VIII and X. Boxed residues
indicate consensus residues.
we had
anticipated that there could be similarities in their amino acid
sequences. However, von Willebrand factor was not identified in the
data bank searches for homologous proteins, and direct comparison of
the von Willebrand factor (57) and multimerin protein sequences
failed to identify significant homologies. While the propolypeptide
region has been postulated to be involved in the targeting of von
Willebrand factor to Weibel-Palade bodies (52, 58),
the lack of similarity in the sequences of multimerin and von
Willebrand factor suggests that other factors must account for their
similar trafficking in platelets and endothelial cells.
The site(s) of proteolytic cleavage that
produce the p-155 multimerin subunit (105-kDa polypeptide component) (3) that is stored within platelets (1) and endothelial
cells await identification. Because the EGF-like domain,
globular C1q-like domain, and JS-1 epitope are located at the carboxyl
end of the protein sequence, we postulate that the mature protein is
produced by cleavage in the NH2-terminal region.
Analyses of Multimerin RNA
Northern analysis of
cultured cells and human tissues identified a 4.7-kb transcript using
the multimerin cDNA probe (Fig. 8). Previous metabolic protein
labeling studies indicated that multimerin is synthesized by
endothelial cells and by Dami cells (3) (a
megakaryocytic cell line) after stimulation with PMA. The multimerin
transcript was identified in cultured endothelial cells, Dami cells,
and platelets. Comparison of resting and PMA-stimulated Dami cells
indicated increased expression of the multimerin mRNA following
stimulation of these cells with PMA. The identification of the
multimerin transcript in platelets indicates that endogenous
biosynthesis by megakaryocytes is the source of platelet multimerin.
Comparison of multiple tissues identified the highest expression of
multimerin in placenta, lung, and liver, three highly vascular tissues (Fig. 8). The detection of multimerin mRNA in vascular tissues
is in agreement with the localization of multimerin in vascular
endothelium in situ using immunohistochemistry.
RNA/lane). Resting (-PMA) and PMA-treated (+PMA) Dami cells
are shown for comparison. A 4.7-kb transcript is identified in
endothelial cells, PMA-treated Dami cells, platelets, and in lung,
placenta, and liver (highly vascular
tissues).
The technical assistance of Jane C. Moore and Zhili
Song is gratefully acknowledged. The authors thank Drs. William
Sheffield and Aled Edwards (McMaster University) for their helpful
discussions and Dr. Seth Darst (Rockefeller Institute) for performing
the analysis for coiled-coil structures.
©1995 by The American Society for Biochemistry and Molecular Biology, Inc.