(Received for publication, August 12, 1994; and in revised form, October 26, 1994)
From the
We describe here the isolation of a novel non-collagenous protein from the acid demineralization extract of bovine cortical bone. This 24-kDa protein is multiply phosphorylated at serine residues in Ser-X-Glu/Ser(P) sequences, a recognition motif for phosphorylation by the secretory pathway protein kinase, and we have termed this protein secreted phosphoprotein 24 (spp24).
The cDNA
structure of spp24 was determined by sequencing cDNA fragments obtained
by reverse transcription-polymerase chain reaction, 3'-rapid
amplification of cDNA ends, and screening a λgt11 cDNA library.
This cDNA sequence predicts a 200-residue initial translation product
which consists of a 20-residue signal sequence and the 180-residue
mature spp24. Northern blot analysis using the spp24 cDNA showed that
spp24 mRNA is in liver and bone but not in heart, lung, kidney, or
spleen. A search of existing protein sequences revealed that the
N-terminal 107 residues of mature spp24 are related in sequence to the
cystatin family of thiol protease inhibitors, which suggests that spp24
could function to modulate the thiol protease activities that are known
to be involved in bone turnover. Several of the proteins in the
cystatin family that are most closely related to spp24 are not only
thiol protease inhibitors but are also precursors to peptides with
potent biological activity, peptides such as bradykinin and the
neutrophil antibiotic peptides. It is therefore possible that the
intact form of spp24 found in bone could also be a precursor to a
biologically active peptide, a peptide which could coordinate an aspect
of bone turnover.
Bone is unusual among the extracellular matrices of vertebrates because it is continuously turned over throughout life. This turnover is mediated by the action of osteoblasts and osteoclasts, and serves to provide calcium derived from bone mineral for the maintenance of serum calcium homeostasis. Proteins secreted by bone cells can therefore have functions which range from the formation of the organic bone matrix and its mineralization to the removal of bone matrix by osteoclastic bone resorption and the coupling of resorption to formation.
In spite of
over 20 years of concerted effort, only a few proteins have been
isolated and characterized from bone matrix (1), and relatively
little is known about the function of these proteins in bone formation
and turnover. The non-collagenous bone matrix-derived proteins whose
amino acid sequences are presently known include the vitamin K-dependent
proteins bone Gla protein (osteocalcin) (2, 3) and
matrix Gla protein (MGP) (4, 5), the
RGD-containing putative cell adhesion proteins osteopontin (6) and bone sialoprotein (7), the small proteoglycans
decorin and biglycan (8, 9, 10), and
osteonectin (11, 12).
The objective of the present study was to identify new bone matrix-derived proteins and to evaluate their possible biological activities in bone formation and turnover. We report here the isolation and molecular cloning of a novel bone matrix protein of 24-kDa molecular mass whose N-terminal 107 residues are related in sequence to the cystatin family of thiol protease inhibitors. This protein is multiply phosphorylated at serine residues in Ser-X-Glu/Ser(P) sequences, the recognition motif for phosphorylation by the secretory pathway protein kinase (13) which has been observed in most secreted phosphoproteins(14) . We have termed this novel bone protein secreted phosphoprotein 24 (spp24).
The neutral pH-insoluble pellet was dissolved in 3 ml of
6 M guanidine HCl in 0.1 M Tris, pH 9, and applied to
a 2 × 150-cm column of Sephacryl S-100 HR equilibrated with the
same buffer at room temperature. The eluant fractions containing spp24
were pooled and further purified using a 4.6 mm × 25-cm C4
reverse phase HPLC column with a 2-h gradient from 0.1%
trifluoroacetic acid in water to 0.1% trifluoroacetic acid in 60%
acetonitrile at a flow rate of 1 ml/min.
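The gradient described above can be expressed as a simple function of run time. The following is a minimal sketch, assuming the gradient is linear over the full 2 h and ignoring any instrument delay volume between the pump and the column; the function names are hypothetical and are not part of the published method.

```python
def acetonitrile_percent(elapsed_min, gradient_min=120.0, final_percent=60.0):
    """Nominal % acetonitrile delivered by the pump at a given run time.

    Assumes a linear ramp from 0% to final_percent over gradient_min
    minutes, holding at final_percent afterwards. Trifluoroacetic acid
    stays at 0.1% throughout, so it is not modeled here.
    """
    if elapsed_min < 0:
        raise ValueError("elapsed time must be non-negative")
    fraction_done = min(elapsed_min / gradient_min, 1.0)
    return final_percent * fraction_done


def delivered_volume_ml(elapsed_min, flow_ml_per_min=1.0):
    """Cumulative volume pumped at a constant flow rate."""
    return elapsed_min * flow_ml_per_min
```

Under these assumptions the ramp rises by 0.5% acetonitrile per minute, and the whole gradient corresponds to 120 ml of mobile phase at 1 ml/min.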
SDS-polyacrylamide gel electrophoresis was performed under reducing conditions as described elsewhere (15) using 4-20% gradient gels (Novex, San Diego, CA).
The message
from the internal region to the 3'-end was sequenced after generating a
PCR fragment using 3'-rapid amplification of cDNA ends
(3'-RACE) (16). A specific internal sense primer
(5'-CGCTGCCACTGGTCCTCCAGCTCT-3') was synthesized which was located 45
bp upstream from the internal antisense degenerate primer. In addition,
a unique 23-base oligonucleotide adapter primer linked to
oligo(dT)17 (5'-ACGCGTCGACCTCGAGATCGATG-(dT)17-3'), and the
adapter primer (5'-ACGCGTCGACCTCGAGATCGATG-3') were used for the
3'-RACE system. A 370-bp cDNA fragment was produced from both bovine
periosteum and bovine liver total RNA and subsequently cloned and
sequenced as described above. Identical sequences were obtained for
this 370-bp fragment from bone and liver.
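The primers quoted above can be checked with a few lines of code. The sketch below is illustrative only: it confirms primer lengths, builds the adapter-oligo(dT) primer assuming a tail of 17 thymidines, and computes GC content and reverse complements. The helper names are hypothetical, and only the unambiguous bases A, C, G, and T are handled.

```python
# Primer sequences as printed in the text, written 5' -> 3'.
INTERNAL_SENSE = "CGCTGCCACTGGTCCTCCAGCTCT"
ADAPTER = "ACGCGTCGACCTCGAGATCGATG"

# Assumption: the oligo(dT) tail is 17 thymidines long.
ADAPTER_DT = ADAPTER + "T" * 17

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, primer in [("internal sense", INTERNAL_SENSE),
                         ("adapter", ADAPTER),
                         ("adapter-(dT)17", ADAPTER_DT)]:
        print(name, len(primer), round(gc_fraction(primer), 2))
```

The adapter is 23 bases long, as stated in the text, and the internal sense primer is a 24-mer.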
The 5'-end cDNA clone was
obtained by screening a bovine liver λgt11 cDNA library (Clontech,
Palo Alto, CA) with the 312-bp cDNA probe. Bacteriophages were plated
and transferred to nitrocellulose filters as described by the Clontech
protocol. Two replica filters were lifted from each plate. Following
hybridization to the ³²P-labeled bovine 312-bp cDNA probe,
eight positive phage plaques were isolated. Two of these contained the
5'-end 380-bp cDNA sequence as determined by PCR with the two
degenerate primers. cDNA inserts of these two clones were enzymatically
amplified by performing PCR with a specific antisense primer
(5'-CAGATAGGGGCTCAGTGACTGGGA-3') located 73 bp downstream from the
N-terminal sense degenerate primer and either a λgt11 forward or a
λgt11 reverse primer (Promega, Madison, WI). A 470-bp PCR product
was obtained from one clone and a 300-bp PCR product from the other
clone. The 5'-end PCR products were cloned and partially sequenced with
the TA Cloning System and the Version 2.0 DNA Sequencing Kit as
above.
Figure 1: Purification of spp24
and MGP by gel filtration over a Sephacryl S-100 HR column. The
proteins extracted from ground bovine bone by demineralization in 10%
formic acid were dried and the neutral pH-soluble proteins were removed
by washing repeatedly with 50 mM NH4HCO3. The water-insoluble proteins were
then dissolved in 6 M guanidine HCl with 0.1 M Tris-HCl at pH 9.0 and loaded onto a 2 × 150-cm Sephacryl
S-100 HR column equilibrated with the same buffer at room temperature.
Fraction volume, 3 ml. Inset, SDS-polyacrylamide gel
electrophoresis of partially purified spp24. Proteins were
electrophoresed on a 4-20% gradient gel and stained with
Coomassie Brilliant Blue. Lane 1, molecular mass standards; lane 2, 20 µg of the purified spp24 in pooled fractions
61-63. (See "Materials and Methods" for
details.)
Figure 2: Further purification of spp24 by reverse
phase high pressure liquid chromatography. 100 µg of the partially
purified spp24 in pooled fractions 61-63 from the gel filtration
shown in Fig. 1 was loaded directly onto a 4.6 mm × 25-cm
C4 column equilibrated with 0.1% trifluoroacetic acid at
room temperature. Bound proteins were subsequently eluted with a 2-h
linear gradient to 0.1% trifluoroacetic acid in 60% acetonitrile.
Fraction volume, 1.3 ml. Inset, SDS-polyacrylamide gel
electrophoresis of purified spp24. Proteins were electrophoresed on a
4-20% gradient gel and stained with Coomassie Brilliant Blue. Lane 1, molecular mass standards; lane 2, 10 µg
of the purified spp24 in pooled fractions
24-27.
As can be seen in Fig. 1, an
additional protein component is recovered in the gel filtration step in
a peak centered at fraction 55. When this fraction was subjected to
SDS-gel electrophoresis using 4-20% gradient gels, a single major
protein fraction was found of 38-kDa molecular mass (data not shown).
When this 38-kDa protein was subsequently transferred from the gel to a
poly(vinylidene fluoride) membrane and subjected to N-terminal protein
sequencing, a single N-terminal sequence was obtained,
YPQNWHHXSDLQHVILDKVGLQKIPKVREKT. A search of the non-redundant
data base of the NLM using the BLAST search program (19) revealed this 38-kDa bone protein to be nearly identical
in sequence to a recently reported 38-kDa protein isolated from bovine
cartilage which has been termed cartilage leucine-rich protein
(GenBank accession no. U08018). The identity of this
38-kDa bone matrix protein with the cartilage leucine-rich protein was
confirmed by the isolation of a cyanogen bromide peptide whose sequence (XNLVSLHLQHXQIREVAAGAF) is identical to residues 77-97
of the bovine cartilage leucine-rich protein.
Figure 3: Complete nucleotide sequence of spp24 cDNA and deduced amino acid sequence of the protein. Underlined sequences correspond to those determined by protein sequencing (see Table 1). The cleavage site of the putative signal sequence is indicated by the arrow at residue 20 and the N terminus of mature spp24 begins at residue 21. The stop codons are marked by asterisks. The polyadenylation signal sequence AATAAA is indicated in bold lettering.
Figure 4: Strategy for determining the sequence of
spp24 cDNA. (1) A 380-bp cDNA fragment which contains the
N-terminal portion of spp24 was obtained using RT-PCR with two
degenerate primers. (2) A 370-bp cDNA fragment which contains
the 3'-untranslated region of spp24 was generated using 3'-RACE with a
specific internal primer. (3) A 312-bp fragment of spp24 was
used for λgt11 cDNA library screening and Northern blot analysis. (4) A 470-bp cDNA fragment was obtained by screening a bovine
liver λgt11 cDNA library with the 312-bp cDNA probe and the
3'-region (243 bp) of this fragment was
sequenced.
Figure 5: Northern blot analysis of spp24 message
levels in bovine tissues. RNA was extracted from the indicated bovine
tissues and 40 µg of total RNA from each tissue was run on a 1.4%
formaldehyde-agarose gel, blotted onto a Nytran membrane, and
hybridized with a ³²P-labeled, 312-bp spp24 cDNA fragment (Fig. 4). Lane 1, bone periosteum; lane 2,
heart; lane 3, lung; lane 4, kidney; lane 5,
spleen; lane 6, liver. The migration positions of molecular
size markers in kilobases are indicated on the left.
We have described the isolation and amino acid sequence of a novel bone phosphoprotein of 24-kDa molecular mass which we have termed secreted phosphoprotein 24 (spp24). The purification procedures developed for the isolation of spp24 are based on its insolubility at neutral pH and its solubilization by 6 M guanidine HCl or by acidic pH. spp24 was first separated by precipitation from the neutral pH-soluble proteins in the acid demineralization extract of bovine bone and then purified to homogeneity by gel filtration over Sephacryl S-100 HR in 6 M guanidine HCl buffer followed by reverse phase HPLC using a C4 matrix in an acidic buffer. Two other neutral pH-insoluble proteins were also recovered from the gel filtration step, MGP and a 38-kDa protein that appears to be identical to a 38-kDa protein recently isolated from bovine cartilage and termed cartilage leucine-rich protein (see above).
To evaluate the possible
relationships between spp24 and other known proteins, the complete
200-residue spp24 sequence deduced from its cDNA structure was compared
with all presently known protein sequences in the non-redundant data
base of the NLM using the BLAST search program (19). This
search revealed the presence of a comparable level of sequence identity
between spp24 and cystatin domains 1 and 3 of human kininogen (23) and between spp24 and the precursor to the bovine
neutrophil antibiotic peptide bactenecin (24). The bactenecin
precursor and cystatin domains 1 and 3 of kininogen are known to be
related in sequence to the cystatin family of thiol protease
inhibitors (25, 26), and spp24 is accordingly compared
to two additional members of this family in Fig. 6, porcine
cathelin (26) and chicken cystatin (27). As can be seen
by analysis of this figure, the bactenecin precursor and cathelin are
more closely related to spp24 than to cystatin domains 1 and 3 of
kininogen or to chicken cystatin. Cystatin domains 1 and 3 of kininogen
and cystatin are also more closely related to spp24 than to bactenecin
precursor or to cathelin. It is therefore probable that spp24 is an
evolutionary intermediate which links cathelin, bactenecin precursor,
and the closely related precursors to the neutrophil antibiotic
peptides Bac5 (28) and indolicidin (29) with the
various cystatins and with the cystatin domains of kininogen. The
structure of a prototypical cystatin domain has been determined from
crystallographic studies of the 108-residue chicken cystatin and is a
compact structure with a 5-stranded β-sheet wrapped around a 5-turn
α-helix (30). If, as seems probable, the sequence
identities observed between spp24 and chicken cystatin reflect similar
polypeptide conformations, it seems likely that the entire 107-residue
region of spp24 between the N terminus of the mature protein and the
11-residue phosphoserine-rich sequence is folded into a cystatin-like
tertiary structure. Since the 4 cysteine residues in the cystatin
domain of spp24 lie at sequence positions known to be involved in
disulfide bonds in other members of the cystatin family, it is probable
that these 4 cysteine residues are likewise involved in disulfide bonds
and that these bonds join Cys-63 with Cys-94 and Cys-87 with Cys-105 in
the mature spp24 protein (Fig. 6).
Figure 6: Amino acid sequence homologies between spp24 and porcine cathelin, bovine bactenecin precursor, cystatin domains 1 and 3 of human kininogen, and chicken cystatin. Residue numbers refer to the sequence position in mature spp24. The related sequences are residues 1-92 of cathelin(26) , 24-126 of bactenecin precursor (24) , 23-127 of kininogen (cystatin domain 1)(23) , 268-371 of kininogen (cystatin domain 3)(23) , and 12-116 of chicken cystatin (27) . Identical amino acids are boxed.
Many members of the cystatin family have been shown to potently inhibit thiol proteases such as cathepsins and papain, and it is possible that the ability to inhibit thiol proteases is a feature of most proteins with a cystatin domain. Among the proteins most closely related in sequence to spp24, cathelin, chicken cystatin, and cystatin domain 3 of kininogen have been previously shown to inhibit thiol proteases(25, 26) . Although the bactenecin precursor has not itself been tested for its ability to inhibit thiol proteases, the closely related precursor to the neutrophil antibiotic peptide Bac5 has been reported to potently inhibit cathepsin L(28) . If spp24 is in fact a thiol protease inhibitor, the presence of spp24 in bone suggests that the target thiol protease is also found in bone. Several thiol proteases of the cathepsin family are in fact known to be expressed by bone cells(31) , and there is evidence to suggest that such thiol proteases may be released from osteoclasts to digest collagen and various non-collagenous proteins under the acidic conditions of osteoclast-mediated bone resorption(32) .
A second possible spp24 function is suggested by the observation that the cystatin domains most closely related in sequence to spp24, cystatin domain 3 of kininogen and the cystatin domain of the neutrophil antibiotic precursors, lie immediately N-terminal to a peptide segment which, when released by protease action, has potent biological activity. Cleavage of kininogen with kallikrein releases bradykinin, a potent vasodilator; cleavage of the bactenecin precursor of neutrophils yields the antibiotic dodecapeptide bactenecin; cleavage of the indolicidin precursor yields the tryptophan-rich antibiotic tridecapeptide indolicidin; and cleavage of the Bac5 precursor yields the 46-residue antibiotic polypeptide Bac5. The common location of these peptides immediately C-terminal to cystatin domains suggests that the proteolytic cleavages which release the active peptides may involve a common mechanism of substrate recognition that is based in part on the presence of a proximal cystatin domain. If an analogous proteolytic cleavage were directed by recognition of the cystatin domain of spp24, the resulting peptide would be derived from the C-terminal 62 residues of spp24, a region which is unrelated in sequence to any known protein. It is important to note that, in spite of the very high number of sequence identities between the cystatin domains of these neutrophil antibiotic precursors, there is no significant level of sequence identity between any of the antibiotic peptides themselves or between these peptides and bradykinin. It seems likely that the sequence of each biologically active peptide is different because the structural demands of the target binding site are themselves different in each case.
There are several intriguing
similarities between spp24 and fetuin which suggest that the proteins
could have similar mechanisms of action in bone. Both proteins are
synthesized by liver as well as bone (33) and accumulate in the
extracellular matrix of bone. Both proteins have cystatin domains, one
for spp24 and two for fetuin(34) . Both proteins contain
phosphoserine(35) . Finally, both proteins have an extended
C-terminal sequence following the last cystatin domain, a C-terminal
sequence which could arguably be a precursor to a biologically active
peptide. It is of interest to note that α2-HS
glycoprotein, the human analogue of fetuin, circulates in blood as a
two-chain molecule (36) and that the cleavages which generate
the two-chain form occur within the extended C-terminal sequence that
follows the last cystatin domain. The connecting peptide which is
removed by those cleavages contains a sequence which could be
phosphorylated by the SXE/S(P)-specific secretory pathway
protein kinase, the sequence SPSGE (residues 310-314) in the
protein (36) .
The identification of phosphoserine residues within the serine-rich sequence of spp24 identifies the protein as the third bone-derived phosphoprotein in which the location of phosphoserine residues has been established, the others being osteopontin (37) and matrix Gla protein (14) . The phosphorylation of serine residues in spp24 follows the recognition motif for serine phosphorylation that has been found in most secreted phosphoproteins, including MGP and osteopontin. All but one of the serine residues phosphorylated in spp24 have the negatively charged side chain of glutamate or phosphoserine in the n + 2 position in the consensus recognition sequence Ser-X-Glu/Ser(P), a substrate recognition pattern first observed in milk caseins and now identified in a wide variety of secreted phosphoproteins. The only phosphorylated serine that does not conform to this recognition motif is serine 130, which has a glycine residue in the n + 2 position in the sequence SSSSGSSSS. It is interesting to note that osteopontin also has a phosphorylated serine which has glycine in the n + 2 position in an analogous sequence, SSGSS(37) . It is possible that the small size of glycine may allow sufficient conformational flexibility to enable the SXE/S(P)-specific secretory pathway protein kinase to phosphorylate serines which have glycine in the n + 2 and phosphoserine in the n + 3 positions.
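The recognition pattern discussed above lends itself to a simple sequence scan. The following is a minimal sketch, not the authors' method: it lists serines that have Glu or Ser at the n + 2 position, and optionally also serines with Gly at n + 2 and Ser at n + 3, the relaxed pattern suggested for serine 130. Because a sequence string cannot record whether a downstream serine is actually phosphorylated, the output is a list of candidate sites only. The function name and the `allow_gly` flag are hypothetical.

```python
import re

# Canonical pattern: Ser at n with Glu (E) or Ser (S) at n + 2.
_CANONICAL = re.compile(r"S(?=.[ES])")
# Relaxed pattern: also accept Gly at n + 2 followed by Ser at n + 3.
_RELAXED = re.compile(r"S(?=.[ES]|.GS)")


def candidate_sites(seq, allow_gly=False):
    """Return 1-based positions of serines matching the recognition motif.

    A Ser at n + 2 (or n + 3 in the relaxed pattern) only completes the
    motif if it is itself phosphorylated, which this scan cannot know,
    so the result is an upper bound on possible sites.
    """
    pattern = _RELAXED if allow_gly else _CANONICAL
    return [m.start() + 1 for m in pattern.finditer(seq.upper())]
```

Applied to the short sequences quoted in the text, `candidate_sites("SPSGE")` returns positions 1 and 3, while the third serine of `"SSSSGSSSS"` is reported only when `allow_gly=True`, mirroring the exception noted for serine 130.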
We have noted previously that phosphoproteins secreted into the extracellular environment of cells are invariably partially phosphorylated at each target serine residue, while those phosphoproteins secreted into milk and saliva are fully phosphorylated (14) . This pattern of partial serine phosphorylation is also seen in spp24 (Table 2). We have speculated that such partial serine phosphorylation may reflect a role for phosphoserine residues in the regulation of phosphoprotein activity by modulation of the SXE/S(P)-specific protein kinase or of a phosphoprotein phosphatase(14) . Many secreted phosphoproteins have phosphoserine residues that are clustered in highly anionic sequences, including MGP and osteopontin. Spp24 follows this pattern, with a potential maximum net negative charge of 18 in an 11 residue span, assuming complete phosphorylation of all serine residues. It seems probable that phosphorylation of serine residues in this region will create sufficient charge repulsion to inhibit formation of any secondary structure and that this sequence could therefore act as an anionic spacer separating the cystatin domain of spp24 from its C-terminal domain. If this model is correct, regulated changes in the extent of serine phosphorylation in this region would provide a mechanism to alter the separation of the cystatin and C-terminal domains of spp24 and thereby modulate the activity of spp24 or its susceptibility to proteolytic cleavage.
The nucleotide sequence reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number U08018.