From the Departamento de Bioquímica y Biología Molecular, Facultad de Medicina, Universidad de Oviedo, 33006-Oviedo, Spain
ABSTRACT
A cDNA encoding a new cysteine proteinase
belonging to the papain family and called cathepsin F has been cloned
from a human prostate cDNA library. This cDNA encodes a
polypeptide of 484 amino acids, with the same domain organization as
other cysteine proteinases, including a hydrophobic signal sequence, a
prodomain, and a catalytic region. However, this propeptide domain is
unusually long and distinguishes cathepsin F from other proteinases of
the papain family. Cathepsin F also shows all structural motifs
characteristic of these proteinases, including the essential cysteine
residue of the active site. Consistent with these structural features, cathepsin F produced in Escherichia coli as a fusion
protein with glutathione S-transferase degrades the
synthetic peptide benzyloxycarbonyl-Phe-Arg-7-amido-4-methylcoumarin, a
substrate commonly used for functional characterization of cysteine proteinases. Furthermore, this proteolytic activity is blocked by
trans-epoxysuccinyl-L-leucylamido-(4-guanidino)butane,
an inhibitor of cysteine proteinases. The gene encoding cathepsin
F maps to chromosome 11q13, close to that encoding cathepsin W. Cathepsin F is widely expressed in human tissues, suggesting a role in
normal protein catabolism. Northern blot analysis also revealed a
significant level of expression in some cancer cell lines, opening
the possibility that this enzyme could be involved in degradative
processes occurring during tumor progression.
The cysteine proteinases are a widespread group of enzymes that
catalyze the hydrolysis of many different proteins and play a major
role in intracellular protein degradation and turnover (1, 2). These
proteolytic enzymes can be subdivided into more than 20 different
families, including the papain family, calpains, streptopains,
clostripains, viral cysteine proteinases, and caspases, the largest one
being that of papain (3). In fact, the papain family of cysteine
proteinases comprises a large number of enzymes from both prokaryotes
and eukaryotes, with representative members expressed in bacteria,
fungi, protozoa, plants, and humans (3, 4). In recent years, the number
of human cysteine proteinases belonging to the papain family has
considerably increased, and a total of 10 different family members have
been characterized at the amino acid sequence level. These human
cysteine proteinases include cathepsin B (5), cathepsin L (6, 7),
cathepsin H (8, 9), cathepsin S (10, 11), cathepsin C (11, 12), cathepsin O (13), cathepsin K (14, 15), cathepsin W (16), cathepsin L2
(17), and cathepsin Z (18). Structural analysis of these enzymes has
revealed that all of them contain a series of conserved features
including an essential cysteine residue in their active site. In
addition, it is well established that all these cysteine proteinases
are synthesized as preproenzymes, which are processed to the
corresponding proenzymes and targeted to the lysosomes by the mannose
6-phosphate signal attached to them. However, these enzymes differ in
tissue distribution and in some enzymatic properties, including
substrate specificities and pH stability. Functional analysis of these
proteinases has shown that in addition to their intracellular role in
protein recycling, they are involved in other normal processes such as antigen presentation (19), bone remodeling (20), and prohormone activation (21). In addition, it has been suggested that cysteine proteinases are involved in a variety of disease processes such as
pulmonary emphysema (22), osteoporosis (23), Alzheimer's disease (24),
rheumatoid arthritis (25), and cancer invasion and metastasis (26).
Therefore, these enzymes represent primary targets for the development
of inhibitors that could block their uncontrolled activity in these
pathological conditions.
As part of our work aimed at finding proteolytic enzymes that could
be of importance in tumor progression, we have recently identified
different cysteine proteinases of the papain family overexpressed in
human carcinomas from diverse sources. These proteases include
cathepsin O, originally cloned from a breast carcinoma (13), cathepsin
L2, overexpressed in breast and colon carcinomas (17), and cathepsin Z,
ubiquitously distributed in cancer cell lines and primary tumors and
characterized by an unusually short propeptide in its amino
acid sequence (18). We have also identified human bleomycin hydrolase,
a cytosolic cysteine proteinase distantly related to other members of
the papain family and involved in chemotherapy resistance (27, 28). In
this work, we describe the molecular cloning and complete nucleotide sequence of a cDNA encoding a new member of the papain family of
cysteine proteinases, which has been called cathepsin F and is
mainly characterized by a uniquely long propeptide domain in
its amino acid sequence. We also report the expression of the gene in
Escherichia coli and the functional characterization of the
recombinant enzyme. Finally, we determine the chromosomal location of
the cathepsin F gene and analyze its expression in human tissues and
cancer cell lines.
Materials--
A human prostate cDNA library, constructed in
Probe Preparation and cDNA Library Screening--
After
searching the GenBank™ data base of human expressed
sequence tags (ESTs) for sequences with homology to members of the papain family of cysteine proteinases, we identified a sequence derived
from an ovarian tumor cDNA
clone2 that could be a good
candidate to encode a new family member. Further searching of the EST
data base for sequences similar to H39591 led us to identify six overlapping ESTs, spanning around 680 bp, which were used to prepare a probe
for cloning a cDNA encoding this putative novel human
cysteine proteinase. To obtain this probe, we performed a PCR
amplification of DNA prepared from a prostate cDNA library using
two specific primers 5'-GGACTGTGGACAAGATGGAC and
5'-AGCTGTTCTTGATGGCCCA, whose sequence was derived from the overlapping
ESTs. The PCR reaction was carried out in a GeneAmp 2400 PCR system
from Perkin-Elmer for 35 cycles of denaturation (94 °C, 15 s),
annealing (56 °C, 15 s), and extension (72 °C, 15 s).
The PCR-amplified product was cloned and, after confirming its identity
by nucleotide sequence analysis, was used to screen a human prostate
cDNA library, according to standard procedures (29). Hybridization
to the radiolabeled probe was carried out for 18 h in 6× SSC (1× = 150 mM NaCl, 15 mM sodium citrate, pH 7.0),
5× Denhardt's (1× = 0.02% bovine serum albumin, 0.02%
polyvinylpyrrolidone, 0.02% Ficoll), 0.1% SDS, and 100 µg/ml
denatured herring sperm DNA at 65 °C. The membranes were washed
twice for 1 h at 65 °C in 0.1× SSC, 0.1% SDS and exposed to
autoradiography. After plaque purification, cloned inserts were excised
by EcoRI digestion and the resulting fragments subcloned
into the EcoRI site of pUC18. The isolated prostate cDNA
encoding a novel human cysteine proteinase was also used as a probe to
screen a mouse brain cDNA library following the same procedure
described above. Positive clones were isolated and characterized by
nucleotide sequence analysis.
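As a quick sanity check on the two probe primers quoted above, the following sketch computes their length, GC content, and a rough melting temperature. The Wallace rule used here (2 °C per A/T, 4 °C per G/C) is an assumption introduced for illustration. It is only a coarse approximation for oligonucleotides of this length, and nothing in the text indicates that it was used to choose the 56 °C annealing temperature.

```python
# Rough statistics for the two probe primers quoted in the text.
# The Wallace rule is an illustrative assumption, not the authors' method.
PRIMERS = {
    "forward": "GGACTGTGGACAAGATGGAC",
    "reverse": "AGCTGTTCTTGATGGCCCA",
}


def primer_stats(seq):
    """Return length, GC fraction, and Wallace-rule Tm (in degrees C)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return {
        "length": len(seq),
        "gc_fraction": gc / len(seq),
        "wallace_tm_c": 2 * at + 4 * gc,
    }


for name, seq in PRIMERS.items():
    print(name, primer_stats(seq))
```

Under this rule both primers fall within a few degrees of the annealing temperature given in the text, which is all that such an estimate can be expected to show.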
DNA and Protein Sequence Analysis--
DNA fragments selected
for nucleotide sequencing were inserted in the polylinker region of
phage vector M13mp19 and sequenced by the dideoxy chain termination
method using either M13 universal primer or cDNA specific primers
and the Sequenase Version 2.0 kit (U. S. Biochemical Corp.). All
nucleotides were identified in both strands. Sequence ambiguities were
solved by substituting dITP for dGTP in the sequencing reactions.
Computer analysis of DNA and protein sequences was performed with the
software package of the University of Wisconsin Genetics Computer Group
(30). A phylogenetic tree directed to examine the evolutionary
relationships between human cysteine proteinases was constructed using
the NEIGHBOR program, included in the PHYLIP software package (31). The
construction of the tree was done by the unweighted pair group method
using arithmetic averages. The phylogenetic distances were obtained
according to the method described by Kimura (32).
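The tree-building procedure described above can be sketched in a few lines. The example below is a minimal illustration, not a reimplementation of the NEIGHBOR program: it assumes that the Kimura distance is the commonly used protein approximation D = -ln(1 - p - 0.2p²), where p is the fraction of differing residues, and it clusters by the unweighted pair group method using arithmetic averages. The sequence labels and identity values are hypothetical and do not come from this work.

```python
import math
from itertools import combinations


def kimura_distance(identity):
    """Kimura's protein distance from fractional identity (0..1).

    Assumes the approximation D = -ln(1 - p - 0.2 * p**2), with p = 1 - identity.
    """
    p = 1.0 - identity
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0:
        raise ValueError("sequences too divergent for this approximation")
    return -math.log(arg)


def upgma(labels, dist):
    """Cluster labels by UPGMA.

    dist maps frozenset({a, b}) -> distance. Returns the topology as nested tuples.
    """
    sizes = {label: 1 for label in labels}  # cluster -> number of leaves
    d = dict(dist)
    while len(sizes) > 1:
        # Join the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = (a, b)
        # Distance to the new cluster is the size-weighted mean of the old ones.
        for c in sizes:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


# Hypothetical pairwise identities for four placeholder sequences.
labels = ["P1", "P2", "P3", "P4"]
identities = {
    ("P1", "P2"): 0.60, ("P3", "P4"): 0.55,
    ("P1", "P3"): 0.30, ("P1", "P4"): 0.30,
    ("P2", "P3"): 0.30, ("P2", "P4"): 0.30,
}
distances = {frozenset(pair): kimura_distance(v) for pair, v in identities.items()}
tree = upgma(labels, distances)
print(tree)
```

Branch lengths are omitted for brevity; only the order in which clusters are joined is returned.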
Construction of Expression Vectors and Expression in E. coli--
To prepare an expression vector suitable for production of
recombinant cathepsin F in E. coli, we first generated a
647-bp DNA fragment containing the coding sequence for the mature human cathepsin F by PCR amplification of the isolated full-length cDNA with primers 5'-ATGGCCCCACCTGAATGGGACT and
5'-TCAGTCCACCACCGCCGAG. The PCR reaction was carried out for 20 cycles of denaturation (95 °C, 30 s), annealing (60 °C,
30 s), and extension (68 °C, 1 min) using the
Expand™ Long Template PCR System (Roche Molecular
Biochemicals) to reduce the error frequency. The PCR product was
phosphorylated with T4 polynucleotide kinase, repaired with Klenow
fragment, and ligated to the expression vector pGEX-3X (Amersham
Pharmacia Biotech), previously treated with SmaI and
alkaline phosphatase. The resulting plasmid, called pGEX-3X CTSF, was
transformed into E. coli strain BL21(DE3), and the
transformed cells were grown in LB broth containing 100 µg/ml
ampicillin at 37 °C for 16 h, diluted 1/100 with the same
medium, and grown to an A600 of 1.0. Then,
isopropyl-1-thio-
Enzyme Activity Assays--
The enzymatic activity of purified
cathepsin F produced in E. coli was measured using 20 µM Z-Phe-Arg-AMC, Z-Arg-Arg-AMC, or Z-Arg-AMC as
substrates and following the procedure described by Barrett and
Kirschke (33) with minor modifications. Assays were performed at
30 °C, in 100 mM sodium acetate buffer, pH 5.5, containing 8 mM dithiothreitol, 2 mM EDTA, and
0.05% Brij 35. Substrate hydrolysis was monitored in a Cytofluor 2350 fluorometer (Millipore, Bedford, MA) at excitation and emission
wavelengths of 360 and 460 nm, respectively. For inhibition assays, the
reaction mixture was preincubated with 20 µM E-64 at
30 °C for 15 min, and the remaining activity was determined using
the fluorogenic substrate Z-Phe-Arg-AMC as above.
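One way to express the outcome of such an inhibition assay is as the residual activity, that is, the rate of fluorescence increase with inhibitor relative to the rate without it. The sketch below assumes that rates are estimated as least-squares slopes of fluorescence against time. The function names and all readings are hypothetical and serve only to illustrate the arithmetic.

```python
def slope(times, values):
    """Least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def residual_activity_percent(times, control, inhibited):
    """Rate with inhibitor as a percentage of the uninhibited rate."""
    return 100.0 * slope(times, inhibited) / slope(times, control)


# Hypothetical fluorescence readings (arbitrary units) at 5-min intervals.
minutes = [0, 5, 10, 15, 20]
without_inhibitor = [100, 200, 300, 400, 500]
with_inhibitor = [100, 102, 104, 106, 108]

print(residual_activity_percent(minutes, without_inhibitor, with_inhibitor))
```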
Chromosomal Mapping--
Total DNA from a panel of 24 monochromosomal somatic cell hybrids containing a single human
chromosome in a mouse or hamster cell line background was PCR-screened
for the presence of the genomic sequence flanked by the cathepsin F
specific primers 5'-GTGCTGATCAGAAGTGCTGCTGC and
5'-AGTTTCCTGGACATGGATAGGGAC. Amplification conditions were as follows:
35 cycles of denaturation (94 °C, 15 s), annealing (68 °C,
15 s), and extension (72 °C, 1 min). To more precisely determine the physical location of the cathepsin F gene within the
human genome, fluorescent in situ hybridization (FISH) of genomic DNA clones for cathepsin F was performed as described previously (34). Briefly, genomic clones were isolated from a human P1
artificial chromosome (PAC) genomic library screened by filter
hybridization with the full-length cathepsin F cDNA as probe. Two
independent clones were identified enclosing the cathepsin F gene as
demonstrated by PCR and Southern blot analysis. DNA from one of these
PAC clones (called 123H18) was obtained with the standard alkaline
lysis method and then used for FISH mapping. To do that, 2 µg of the
PAC DNA was nick-translated with biotin-16-dUTP and hybridized to
normal male metaphase chromosomes obtained from
phytohemagglutinin-stimulated cultured lymphocytes. Biotinylated probe
was detected using two avidin-fluorescein layers. Chromosomes were
diamidine-2-phenylindole dihydrochloride-banded, and images were
captured with a Zeiss Axiophot fluorescence microscope equipped with a CCD
camera (Photometrics).
Northern Blot Analysis--
Northern blots containing 2 µg of
poly(A)+ RNA of different human tissue specimens and cancer
cell lines were prehybridized at 42 °C for 3 h in 50%
formamide, 5× SSPE (1× = 150 mM NaCl, 10 mM
NaH2PO4, 1 mM EDTA, pH 7.4), 10×
Denhardt's, 2% SDS, and 100 µg/ml denatured herring sperm DNA.
After prehybridization, filters were hybridized with a full-length
cDNA for cathepsin F. After hybridization, filters were washed with
0.1× SSC, 0.1% SDS for 2 h at 50 °C and exposed to
autoradiography. RNA integrity and equal loading were assessed by
hybridization with an actin probe.
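When band intensities from such blots are compared between lanes, the target signal is usually divided by the loading-control signal first. The text does not state how signals were quantified, so the following is only a sketch of that normalization. The lane names and intensity values are hypothetical.

```python
def relative_expression(target_signal, loading_signal):
    """Normalize target bands to the loading control, then scale to the lowest lane."""
    ratios = {
        lane: target_signal[lane] / loading_signal[lane] for lane in target_signal
    }
    reference = min(ratios.values())
    if reference <= 0:
        raise ValueError("the lowest normalized signal must be positive")
    return {lane: ratio / reference for lane, ratio in ratios.items()}


# Hypothetical densitometry values (arbitrary units).
target = {"lane A": 4000.0, "lane B": 200.0}
actin = {"lane A": 1000.0, "lane B": 1000.0}

print(relative_expression(target, actin))
```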
Identification and Characterization of a cDNA Encoding Human
Cathepsin F--
As a first step toward identifying and characterizing new
human cysteine proteinases belonging to the papain family, we performed an analysis of the human EST data bases, searching for expressed sequences with significant similarity to those previously determined for human cathepsins. This computer search allowed us to identify several overlapping ESTs that, after translation, generated an open
reading frame with a significant degree of similarity to papain-like
cysteine proteinases. A cDNA containing part of these overlapping ESTs was PCR-amplified from DNA of a human prostate cDNA library and used as a probe to screen this library. After screening approximately 1 × 10^6 plaque-forming
units, DNA was isolated from 10 independent clones selected according
to their positive hybridization with the probe, and their nucleotide
sequences were determined. Computer analysis of the obtained sequences
confirmed that all of them derived from the same gene, potentially
encoding a new member of the papain family of cysteine proteinases.
Further analysis of the nucleotide sequence derived from the isolated
clone containing the largest insert revealed the presence of an open
reading frame coding for a protein of 484 amino acids and a predicted
molecular weight of 53,365 (Fig. 1).
To provide additional evidence that the isolated prostate cDNA
encoded a putative cysteine proteinase, we performed a detailed amino
acid sequence comparison between the identified sequence and those
present in the data bases. This analysis revealed that the highest
degree of identity was found with a cysteine proteinase from
Schistosoma mansoni (48%). Significant similarities were also found with the different human cysteine proteinases of the papain
family, with the percentage of identities ranging from 37% with
cathepsin L2 to 26% with cathepsin B. Furthermore, the identified
amino acid sequence exhibits the domain organization and structural motifs
characteristic of the papain-like cysteine proteinases (Fig.
2). Thus, it contains a stretch of
hydrophobic amino acids close to the initial methionine which likely
corresponds to the signal peptide found in all other family members. In
addition, the multiple amino acid sequence alignment between all human
cysteine proteinases of the papain family characterized to date (Fig.
2) also allows the identification of a proregion and a mature
proteinase domain in the identified protein sequence, as well as the
definition of the putative cleavage site between both domains. Thus, the
active processed form of the putative novel cysteine proteinase would start at the alanine residue located at position 271, since it immediately precedes the absolutely conserved proline residue located
at the +2 position in all family members. The amino acid sequence
alignment shown in Fig. 2 also allows the identification of the
putative active site Cys residue (at position 295) of the deduced amino
acid sequence as well as other residues proposed to be important for
the catalytic properties of cysteine proteinases, including the His-431
and Asn-451 residues (35, 36). Furthermore, the amino acid sequences
surrounding these three residues are also well conserved. Thus, the
N-terminal region contains the glutamine residue (at position 289) of
the oxyanion hole present in the structure of these enzymes as well as
the conserved tryptophan residue and hydrophobic segment immediately
adjacent to the active site cysteine residue (35, 36). In addition, the
C-terminal region contains a series of conserved aromatic and glycine
residues located around the histidine and asparagine residues of the
active site. Finally, the deduced amino acid sequence also contains
five potential sites of N-glycosylation (Asn-Glu-Thr at
position 160, Asn-Arg-Thr at 195, Asn-Phe-Ser at 367, Asn-Asp-Ser at
378, and Asn-Arg-Ser at 440). It is very likely that at least one of
them is actually glycosylated and carries the mannose
6-phosphate marker required for lysosomal targeting of these enzymes.
On the basis of all these structural characteristics, we can conclude that the isolated human prostate cDNA codes for a novel member of
the papain family of cysteine proteinases, which we propose to call
cathepsin F. A phylogenetic tree constructed to evaluate the
evolutionary relationships of human cathepsin F to other human cysteine
proteinases revealed that this novel protein is only distantly related
to other family members, the most closely related being cathepsin W
(Fig. 3).
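Potential N-glycosylation sites of the kind listed above can be located by scanning a protein sequence for the Asn-X-Ser/Thr sequon. The sketch below assumes the conventional rule that X may be any residue except proline, a restriction not stated in the text. The test peptide is hypothetical and is not the cathepsin F sequence.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return (1-based position of Asn, tripeptide) for each potential site."""
    protein = protein.upper()
    return [
        (m.start() + 1, protein[m.start():m.start() + 3])
        for m in SEQUON.finditer(protein)
    ]


# Hypothetical peptide containing three sequons and one excluded Asn-Pro-Ser.
peptide = "MKNETAANPSGGNRTLLNFSQ"
print(find_sequons(peptide))
```

Positions are reported with 1-based numbering, matching the convention used for the residues quoted in the text.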
After first submission of the present work, a paper describing a
sequence closely related to that described herein has been reported
(37). According to these authors, cathepsin F should be much smaller
than the protein identified in this work (302 versus 484 amino acid residues), and contrary to all known members of this
protease family, it would lack a hydrophobic signal sequence. Consequently, they propose that cathepsin F would be targeted to the
lysosomal compartment via an N-terminal signal peptide-independent lysosomal targeting pathway. However, an alternative explanation would
be that Wang et al. (37) have isolated a partial cDNA clone for this novel cysteine proteinase. In fact, a detailed comparison of both sequences shows that they are identical in the
3'-region, but the open reading frame identified in this work extends
more than 180 amino acids upstream from the methionine residue
considered by Wang et al. (37) as the first residue in
cathepsin F. In addition, and as can be seen in Figs. 1 and 2, the
sequence reported herein contains a hydrophobic signal sequence like
all the remaining cathepsins. This 19-residue leader peptide ends in a
sequence Ala-Val-Ala, which matches perfectly the Ala-X-Ala
motif found at the processing site of eukaryotic preproteins (38).
Computer analysis using the algorithm developed by Nielsen et
al. (39) confirmed that this site presented the highest
probability to be the processing peptide bond of the cathepsin F leader
sequence. Taken together, these data indicate that human cathepsin F
contains a bona fide signal sequence; and consequently, its domain
organization is identical to that previously described for all known
members of the papain family of cysteine proteinases. Nevertheless, the
amino acid sequence reported in this work for cathepsin F exhibits a
very long propeptide (251 residues) which distinguishes this enzyme
from all other family members, whose equivalent domains range from 206 to 41 residues in length (cathepsin C and cathepsin Z, respectively)
(Fig. 2).
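The domain boundaries quoted in the text can be checked for internal consistency with simple arithmetic. The sketch below takes the 484-residue open reading frame, the 19-residue leader peptide, and the mature enzyme starting at position 271, and assumes that the propeptide spans everything between the signal peptide and the mature enzyme. The class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self):
        return self.end - self.start + 1


# Values quoted in the text.
PREPROENZYME_LENGTH = 484
SIGNAL_PEPTIDE_LENGTH = 19
MATURE_START = 271
SHORTER_REPORTED_LENGTH = 302  # length proposed in the earlier report (37)


def cathepsin_f_domains():
    """Domain layout, assuming the propeptide fills the gap between the other two."""
    return [
        Domain("signal peptide", 1, SIGNAL_PEPTIDE_LENGTH),
        Domain("propeptide", SIGNAL_PEPTIDE_LENGTH + 1, MATURE_START - 1),
        Domain("mature enzyme", MATURE_START, PREPROENZYME_LENGTH),
    ]


for domain in cathepsin_f_domains():
    print(f"{domain.name}: residues {domain.start}-{domain.end} ({domain.length} aa)")

# Extra N-terminal residues relative to the shorter reported sequence.
print(PREPROENZYME_LENGTH - SHORTER_REPORTED_LENGTH)
```

The propeptide length obtained this way agrees with the 251 residues stated above, and the difference between the two reported lengths agrees with the extension of more than 180 amino acids.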
To provide additional information on the structural organization of
cathepsin F, studies were undertaken to identify and clone the murine
homolog of this cysteine proteinase. For this purpose, the cDNA
encoding human cathepsin F was used as a probe to screen a mouse brain
cDNA library. Nucleotide sequence analysis from clones selected
after positive hybridization to the radiolabeled probe revealed the
presence of an open reading frame coding for a protein showing 72%
identity with human cathepsin F (Fig. 2; accession number AJ131851).
This murine protein also contains a hydrophobic signal sequence as well
as a long propeptide domain, thus providing further evidence that these
structural features are not exclusive to the sequence reported herein
for human cathepsin F. Finally, it is remarkable that according to
preliminary studies,3 there
is a putative homolog of cathepsin F in Drosophila, whose amino acid sequence also exhibits a signal sequence and a long prodomain, confirming again that both domains are present in this novel
cysteine proteinase.
Expression of the Human Cathepsin F cDNA in E. coli and
Analysis of the Proteolytic Activity of the Purified Recombinant
Protein--
To examine further the possibility that the isolated
cathepsin F cDNA encodes a catalytically active cysteine
proteinase, studies were undertaken to produce the human protein in a
bacterial expression system following the strategy previously used to
produce other cysteine proteinases of the papain family (17, 18, 40). For this purpose, a 647-bp fragment encoding the predicted mature cathepsin F was PCR-amplified as described under "Experimental Procedures" and cloned in the polylinker region of the expression vector pGEX-3X. The resulting plasmid (pGEX-3X CTSF), whose identity was verified by nucleotide sequencing, was transformed into
E. coli BL21(DE3), and the transformed bacteria were
induced to produce the recombinant protein by treatment with
isopropyl-1-thio-
Physical Mapping of the Human Cathepsin F Gene--
To provide
additional information on the structural and evolutionary relationship
of human cathepsin F to other members of the papain family of cysteine
proteinases, we carried out studies directed to establish the
chromosomal localization of the cathepsin F gene. For this purpose, we
first used a PCR-based strategy to screen a panel of
somatic cell hybrid lines containing a single human chromosome in a
mouse or hamster background. As can be seen in Fig.
5A, positive amplification
results were obtained in hybrids containing human chromosomes 11 and
15. However, the somatic cell hybrids containing human chromosome 15 also carry fragments from chromosome 11 (42), strongly suggesting that the human cathepsin F gene maps to this latter chromosome. To establish
more precisely the chromosomal location of the cathepsin F gene in
chromosome 11, we carried out FISH analysis. To do that, we first
isolated PAC clones containing this gene by screening a genomic library
using as probe the full-length cDNA for cathepsin F. After
characterization of the isolated PAC clones by both Southern blot and
nucleotide sequencing analysis, DNA isolated from one of them (123H18)
was then employed in FISH experiments on human chromosome metaphase
spreads. After diamidine-2-phenylindole dihydrochloride banding of the
metaphase cells showing specific hybridization signals, fluorescent
spots corresponding to the biotinylated PAC clone were mapped to the
q13 region of chromosome 11 (Fig. 5B). Interestingly, the
gene encoding cathepsin W has also been mapped to this region of
chromosome 11 (43), strongly suggesting that cathepsins F and W could
be tightly linked in the human genome. The gene encoding cathepsin C
has also been located at the same region, but in a different band
11q14.1-14.3 (44), whereas the genes coding for the remaining
members of the family have been mapped to different chromosomes
(45-51).
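The reasoning used to interpret the somatic cell hybrid panel can be written as a small set operation: a chromosome is a candidate only if its material is present in every hybrid that gave a PCR product. The sketch below encodes the situation described in the text, in which the chromosome 15 hybrid also carries fragments of chromosome 11. The function name and data layout are hypothetical.

```python
def candidate_chromosomes(positive_hybrids, extra_content):
    """Chromosomes whose material is present in every PCR-positive hybrid.

    positive_hybrids: hybrids (named by their nominal chromosome) giving a product.
    extra_content: hybrid -> set of other chromosomes it is known to carry.
    """
    if not positive_hybrids:
        raise ValueError("at least one positive hybrid is required")
    contents = [
        {hybrid} | set(extra_content.get(hybrid, ())) for hybrid in positive_hybrids
    ]
    return set.intersection(*contents)


# Situation described in the text: hybrids 11 and 15 are positive,
# and the chromosome 15 hybrid also carries chromosome 11 fragments.
positives = {"11", "15"}
known_fragments = {"15": {"11"}}

print(candidate_chromosomes(positives, known_fragments))
```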
Expression Analysis of Cathepsin F in Human Tissues and Cancer Cell
Lines--
As a preliminary step to elucidate the potential role of
cathepsin F in human tissues, we examined by Northern blot analysis the
expression pattern of this enzyme in a wide variety of cells and
tissues including leukocytes, colon, small intestine, ovary, testis,
prostate, thymus, spleen, pancreas, kidney, skeletal muscle, liver,
lung, placenta, brain, and heart. After hybridization with a
radiolabeled probe specific for cathepsin F, a single transcript of
approximately 2.1 kb was observed with variable intensity in most
examined tissues (Fig. 6A).
The major sites of cathepsin F expression were skeletal muscle and
testis, whereas expression in leukocytes and thymus was virtually
undetectable. The widespread distribution of cathepsin F in human
tissues is consistent with a putative role for this enzyme in
the intracellular protein catabolism taking place in lysosomes from all
cell types. However, the wide variability observed in cathepsin F
expression in the different tissues analyzed in this work suggests that
in addition to its housekeeping role as a lysosomal digestive enzyme,
it may play a more specific role in those tissues like skeletal muscle and testis in which its relative levels are very high.
Finally, in this work we have addressed the possibility that cathepsin
F could be overexpressed by human cancer cell lines from different
sources, as already shown for other cysteine proteinases of the papain
family (26, 52). For this purpose, we first hybridized a Northern blot
containing poly(A)+ RNAs extracted from different cancer
cell lines (HL-60, HeLa, K-562, MOLT-4, Burkitt's lymphoma Raji,
colorectal adenocarcinoma SW480, lung carcinoma A549, and melanoma
G361) with the full-length cDNA for cathepsin F. As shown in Fig.
6B, high levels of a transcript identical in size (about 2.1 kb) to the one detected in normal tissues were observed in HeLa cells.
Lower levels of this transcript were also detected in melanoma, K-562,
and lung carcinoma cells.
The availability of EST data bases represents an excellent tool to
look for novel genes through computer search of short expressed DNA
sequences with nucleotide sequence similarity to genes of interest. In
this work, we have used this strategy as a first step to clone a new
member of the papain family of cysteine proteinases, which we have
called cathepsin F. The identification of this human protease was based
on the finding of a series of overlapping ESTs, whose sequence was
similar to previously characterized human cysteine proteinases. These
sequences were used to design a DNA probe that was PCR-amplified from a
human prostate cDNA and subsequently employed to screen a cDNA
library from the same tissue. This screening led finally to the finding
of a full-length cDNA coding for cathepsin F. Pairwise comparisons
for structural similarities between the identified amino acid sequence
for this protein and those for the remaining papain-like cysteine
proteinases confirmed that cathepsin F displays the same domain
organization as other family members. Thus, a signal peptide, a
propeptide domain, and a catalytic region can be identified in the
amino acid sequence deduced for this protein. The identification of
this signal sequence, which is also present in the mouse and
Drosophila homologs of cathepsin F (Fig. 2, and data not
shown), does not support the proposal of Wang et al.
(37), reported after submission of this manuscript, that
cathepsin F lacks a signal sequence. Furthermore, the catalytic domain
contains all structural motifs characteristic of cysteine proteinases,
including the nucleophilic cysteine residue involved in covalent
intermediate formation during peptide hydrolysis, as well as the
histidine and asparagine residues that constitute the catalytic triad
of these enzymes (35, 36). Consistent with these structural
characteristics, functional analysis of recombinant cathepsin F
produced in a bacterial expression system provided additional evidence
that the isolated cDNA codes for a catalytically active cysteine
proteinase. In fact, the purified recombinant protein exhibits a
significant proteolytic activity against fluorogenic substrates used
for assaying the enzymatic activity of these proteinases. In addition,
this degrading activity was abolished by inhibitors of cysteine
proteinases but not by inhibitors of any other class of proteolytic
enzymes. Nevertheless, this novel protease also contains in its amino
acid sequence some specific features. Of special interest in this
regard is the finding that its N-terminal propeptide domain is
extremely long when compared with those described for all the remaining
papain-like cysteine proteinases. According to structural properties,
the prosegments found in these enzymes can be classified into two
groups (53, 54). The first one contains cathepsin L-like
enzymes with prodomains of about 90 amino acids in length and bearing
two highly conserved motifs called ERFNIN and GNFD. The second group
comprises the cathepsins B from different sources and is characterized
by a smaller proregion of about 60 amino acids lacking the ERFNIN
consensus sequence. In addition, there are two cysteine proteinases
that cannot be classified into any of these groups. Thus, human
cathepsin C propeptide contains 206 amino acids (12), whereas the
recently described human cathepsin Z contains a proregion that is only 41 residues in length and lacks the above-mentioned conserved domains
(18). Human cathepsin F markedly deviates from all of them because its
prosegment contains 251 amino acids. Interestingly, both mouse and
Drosophila homologs of cathepsin F also exhibit a very long
prodomain (Fig. 2 and data not shown) indicating that it is a
characteristic feature of this enzyme. At present, the functional
significance of this extremely long prosegment is unknown. In this
regard, it is well established that the propeptide found in papain-like
enzymes acts as an intrinsic inhibitor of their proteolytic activity
(55). In addition, this region has also been found to be essential for
the proper folding of these enzymes, for stabilizing their structure
upon exposure to changes in pH, or for providing the structural markers
required for microsomal membrane binding or lysosomal targeting
(55-59). It is likely that the long prosegment of cathepsin F may play
some specific role in addition to those proposed for this domain of
papain-like cysteine proteinases.
In this work, we have also analyzed the chromosomal location of the
cathepsin F gene as well as its expression in normal and tumor cells.
According to both FISH and somatic hybrid mapping techniques, this gene
localizes to the long arm of chromosome 11, at 11q13. This position is
the same as that recently reported for the cathepsin W gene, indicating
that these genes are clustered in the human genome. Consistent with
these results, a phylogenetic tree constructed to analyze the
evolutionary relationships between all known human cysteine proteinases
of the papain family demonstrated that cathepsin F and cathepsin W are
closely related. In addition to its possible value in the context of
evolutionary studies of the human cysteine proteinases, knowledge of
the chromosomal location of the cathepsin F gene reported here may be
useful in the search for putative genetic diseases associated with this
gene. Interestingly, different studies have reported that the 11q13
region is frequently altered in diverse human tumors (60, 61).
Consequently, it will be of great interest to examine the possibility
that cathepsin F may be a target of these genetic abnormalities
associated with human carcinomas.
On the other hand, analysis of the expression of cathepsin F in human
tissues has provided some information about the putative functional
significance of this protein. Thus, the finding that it is expressed in
most normal tissues analyzed suggests a putative general role for this
enzyme in the lysosomal protein catabolism taking place in all cell
types. This expression pattern of cathepsin F classifies this enzyme
within the group of widely distributed cysteine proteinases such as
cathepsins B, L, H, O, and Z, as opposed to a series of recently
described family members including cathepsins K, S, W, and L2, which
appear to play highly specific roles in those tissues in which they are
overexpressed or even exclusively expressed (see Ref. 2 for a review).
Nevertheless, it is remarkable that cathepsin F expression levels in
normal tissues exhibit a large variability, and there are tissues such as skeletal muscle and testis, in which its mRNA levels are up to
20-fold higher than in others such as kidney and colon, which also
produce this novel protease, albeit at low levels. The finding of very
high levels of cathepsin F mRNA in skeletal muscle is of particular
interest in light of previous data reporting an essential role of
cysteine proteinases in muscle proteolysis in both normal and
pathological conditions, including some forms of muscular dystrophy
(62-64). Further studies will be required to evaluate the possibility
that cathepsin F could be responsible for the catabolism of specific
protein substrates in the muscle. On the other hand, its high level
expression in the testis is also suggestive of a role for this novel
cathepsin in fertilization processes, as proposed for other family
members including the recently described cathepsin L2 (17, 65).
Finally, the expression analysis of cathepsin F has also revealed the
presence of this enzyme in several human cancer cell lines, with
especially high levels in HeLa cells. This finding
suggests that cathepsin F may play some role in the progression of some
human carcinomas, thereby providing additional interest to the further
functional characterization of this proteinase.
In conclusion, we have identified and characterized a new human
cysteine proteinase of the papain family that shows similarities and
differences with the remaining family members previously described. Cathepsin F exhibits a signal sequence and all structural features of
cysteine proteinases as well as a profile of activity against fluorogenic substrates and sensitivity to inhibitors typical of these
enzymes. However, it shows an extremely long propeptide domain which
distinguishes this enzyme from other family members. Furthermore, its
high level expression in certain tissues such as skeletal muscle and
testis is suggestive of a specific activity in some physiological
processes taking place in these tissues. The availability of
recombinant cathepsin F and specific reagents for this new proteinase
generated in this work will be very helpful to evaluate its precise
functional role in the context of the increasingly complex pathways of
protein degradation and turnover in human tissues.
EXPERIMENTAL PROCEDURES
λgt11, and different Northern blots containing poly(A)+
RNAs prepared from diverse human tissues and cancer cell lines were
from CLONTECH (Palo Alto, CA). A high density
gridded human P1 artificial chromosome
(PAC) genomic library and a
panel of somatic cell hybrids containing a single human chromosome in a
rodent background were supplied by the Human Genome Mapping Resource
Center (Cambridgeshire, UK). Restriction endonucleases and other
reagents used for molecular cloning were purchased from Roche Molecular
Biochemicals (Mannheim, Germany). Synthetic peptides Z-Phe-Arg-AMC,
Z-Arg-Arg-AMC, and Z-Arg-AMC were from Bachem (Bubendorf, Switzerland),
and proteinase inhibitor E-64 was from Sigma. Oligonucleotides were
synthesized by the phosphoramidite method in an Applied Biosystems DNA
synthesizer (model 392A) and used directly after synthesis.
Double-stranded DNA probes were radiolabeled with
[α-32P]dCTP (3000 Ci/mmol) from Amersham Pharmacia
Biotech (Buckinghamshire, UK) using a commercial random-priming kit
from Amersham Pharmacia Biotech (Uppsala, Sweden).
β-D-galactopyranoside was added to a
final concentration of 1 mM, and the incubation was
continued for 3 h. Cells were collected by centrifugation, washed,
and resuspended in 0.05 volumes of PBS, lysed by using a French press
and centrifuged at 20,000 × g for 20 min at 4 °C.
The soluble extract was treated with glutathione-Sepharose 4B and
eluted with glutathione elution buffer (10 mM reduced
glutathione in 50 mM Tris-HCl, pH 8.0) following the
manufacturer's instructions.
RESULTS
Fig. 1.
Nucleotide and deduced amino acid sequence
of human cathepsin F cloned from a prostate cDNA library. The
active site residues characteristic of cysteine proteinases are
boxed. Arrows indicate the putative cleavage
sites between the signal sequence and the propeptide as well as between
the propeptide and the mature enzyme. EcoRI cloning sites
are underlined.
Fig. 2.
Comparison of the amino acid sequence of
human and mouse cathepsin F with that of other human cysteine
proteinases. The amino acid sequences of human cysteine
proteinases were extracted from the SwissProt data base, and the
multiple alignment was performed with the PILEUP program of the GCG
package (30). Numbering corresponds to the sequence of
cathepsin F. Residues that are common to all sequences are shown in
bold. Gaps introduced to optimize the alignment
are indicated by hyphens.
Fig. 3.
Schematic illustration of evolutionary
relationships between human cysteine proteinases. The phylogenetic
tree was constructed using the NEIGHBOR program of the PHYLIP software
package (31). The phylogenetic distances were obtained according to the
method described by Kimura (32).
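The distance calculation behind such a tree can be sketched as follows. This is a minimal illustration that assumes the method referred to is Kimura's approximation for protein sequences, D = -ln(1 - p - 0.2p^2), where p is the fraction of aligned positions that differ. The function name and the toy sequences are hypothetical and are not taken from the alignment in Fig. 2.

```python
import math


def kimura_protein_distance(seq_a: str, seq_b: str) -> float:
    """Kimura's approximate distance between two aligned protein sequences.

    Assumes both sequences come from the same multiple alignment, with
    gaps written as hyphens (as in Fig. 2).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Use only columns in which neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    # p: observed fraction of differing residues.
    p = sum(a != b for a, b in pairs) / len(pairs)
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(arg)


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, for illustration only).
    print(kimura_protein_distance("ACDE", "ACFG"))
```

A matrix of such pairwise distances is the kind of input that a neighbor-joining program takes to build a tree.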
β-D-galactopyranoside. Protein extracts
were prepared from the induced bacteria and analyzed by SDS-PAGE. As
shown in Fig. 4A, the bacteria
transformed with the recombinant plasmid contained a fusion protein of
about 52 kDa, which was not present in the control extracts, whereas
these control extracts contained a 29-kDa band corresponding to the parental glutathione S-transferase that was absent in the
recombinant bacteria. The fusion protein containing cathepsin F was
purified by affinity chromatography in a glutathione-Sepharose 4B
column, which was eluted with a reduced glutathione-containing buffer. The protein material present in the chromatographic eluate was analyzed
by SDS-PAGE, and as shown in Fig. 4A, a single band of the
expected size was detected. The fractions containing purified cathepsin
F were pooled, and their enzymatic activities against Z-Phe-Arg-AMC,
Z-Arg-Arg-AMC, and Z-Arg-AMC were examined. As can be seen in Fig.
4B, these analyses revealed that recombinant human cathepsin
F exhibits a significant proteolytic activity (3.75 µmol/min/mol
enzyme) against the synthetic peptide Z-Phe-Arg-AMC, which has been
defined as an optimal substrate for different cysteine proteinases
(41). This enzymatic activity is slightly higher than that obtained for
cathepsins L2 and Z produced in E. coli as fusions with
glutathione S-transferase and assayed under the same
experimental conditions (17, 18) (Fig. 4B). It is remarkable that, similar to these cathepsins, the proteolytic activity of recombinant cathepsin F against other fluorogenic substrates such as
Z-Arg-AMC and Z-Arg-Arg-AMC was extremely low or undetectable. Finally,
we examined the possibility that the degrading activity of recombinant
cathepsin F against Z-Phe-Arg-AMC was inhibited by specific inhibitors
of cysteine proteinases. In fact, this proteolytic activity was
completely abolished by E-64, a commonly used inhibitor of these
enzymes, whereas inhibitors of serine proteinases (phenylmethylsulfonyl
fluoride), aspartyl proteinases (pepstatin A), and metalloproteinases
(EDTA) did not show any significant effect (Fig. 4B, and
data not shown). According to these preliminary enzymatic analyses,
together with the above mentioned structural characteristics, we can
conclude that cathepsin F is a cysteine proteinase with the substrate
specificity and sensitivity toward inhibitors characteristic of these
enzymes.
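As an illustration of how a time course of substrate hydrolysis can be reduced to a rate and a degree of inhibition, the following sketch fits a straight line to product formed versus time. The function names and all numerical values are hypothetical; they are not the data of Fig. 4B, and the original analysis may have been carried out differently.

```python
def initial_rate(times_min, product_umol):
    """Least-squares slope of product formed versus time (umol/min)."""
    n = len(times_min)
    if n != len(product_umol) or n < 2:
        raise ValueError("need at least two matched time/product points")
    mean_t = sum(times_min) / n
    mean_p = sum(product_umol) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (p - mean_p)
              for t, p in zip(times_min, product_umol))
    return sxy / sxx


def percent_inhibition(rate_control, rate_inhibited):
    """Percentage of activity lost in the presence of an inhibitor."""
    if rate_control == 0:
        raise ValueError("control rate must be non-zero")
    return 100.0 * (1.0 - rate_inhibited / rate_control)


if __name__ == "__main__":
    # Hypothetical time course (minutes) and product released (umol).
    times = [0, 10, 20, 30]
    without_inhibitor = [0.0, 1.0, 2.0, 3.0]
    with_inhibitor = [0.0, 0.0, 0.0, 0.0]
    v0 = initial_rate(times, without_inhibitor)
    vi = initial_rate(times, with_inhibitor)
    print(v0, percent_inhibition(v0, vi))
```

Dividing such a rate by the amount of enzyme in the assay gives a specific activity in the units quoted above.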
Fig. 4.
Expression of cathepsin F in E. coli and analysis of its enzymatic activity.
A, 5-µl aliquots of bacterial extracts (pGEX-3X and
pGEX-3X CTSF), as well as 1 µl of purified fusion protein
(CTSF) were analyzed by SDS-PAGE. Arrows indicate
parental glutathione S-transferase (29 kDa) and fusion
protein (about 52 kDa). The size in kDa of the molecular size markers
(MWM) is shown at the right part of the figure.
B, recombinant cathepsin F and cathepsin L2 were
incubated with 20 µM Z-Phe-Arg-AMC, and the substrate
hydrolysis at 30 °C was monitored in the presence or in the absence
of 20 µM E-64, at the indicated times.
Fig. 5.
Chromosomal mapping of the human cathepsin F
gene. A, 100 ng of total DNA from the 24 monochromosomal somatic cell lines was PCR-amplified with primers
5'-GTGCTGATCAGAAGTGCTGCTGC and 5'-AGTTTCCTGGACATGGATAGGGAC as described
under "Experimental Procedures." pBR322 digested with
HaeIII (Marker V, Roche Molecular Biochemicals) was used as
a size marker. B, fluorescent in situ
hybridization with a biotinylated probe specific for human cathepsin F. Metaphase cells were counterstained with diamidine-2-phenylindole
dihydrochloride.
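Basic properties of the two PCR primers quoted in this legend can be checked with a short script. This is a sketch for illustration only: the constant names `PRIMER_1` and `PRIMER_2` and the helper functions are assumptions, and the legend does not state which primer is the sense or the antisense oligonucleotide.

```python
# Primer sequences as given in the legend to Fig. 5 (written 5' to 3').
PRIMER_1 = "GTGCTGATCAGAAGTGCTGCTGC"
PRIMER_2 = "AGTTTCCTGGACATGGATAGGGAC"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, primer in (("primer 1", PRIMER_1), ("primer 2", PRIMER_2)):
        print(name, len(primer), round(gc_fraction(primer), 2),
              reverse_complement(primer))
```

The reverse complement is the form in which a primer would appear on the opposite strand of the amplified fragment.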
Fig. 6.
Expression of the cathepsin F gene in human
tissues and cancer cell lines. A, 2 µg of
poly(A)+ RNA prepared from the indicated tissues were
analyzed by Northern blot hybridization with the full-length cDNA
for human cathepsin F. The positions of RNA size markers are shown.
Filters were subsequently hybridized with a human actin probe in order
to ascertain the differences in RNA loading among the different
samples. B, 2 µg of poly(A)+ RNA
prepared from the indicated tumor cell lines were hybridized with
the above-described probe specific for human cathepsin F. Filters
were finally hybridized with a human actin probe.
ACKNOWLEDGEMENTS
We thank Drs. M. Balbín and J. P. Freije for helpful comments and S. Alvarez for excellent technical assistance.
FOOTNOTES
* This work was supported in part by Grants SAF97-0258 from Comisión Interministerial de Ciencia y Tecnología, Glaxo-Wellcome, Spain, and EU-BIOMED II Grant BMH4-CT96-0017. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) AJ007331 and AJ131851.
Recipient of fellowships from Ministerio de Educación y
Ciencia (Spain) and Fuji-Chemical Industries (Takaoka, Japan), respectively.
§ To whom correspondence should be addressed: Dept. de Bioquímica y Biología Molecular, Facultad de Medicina, Universidad de Oviedo, 33006 Oviedo-Spain. Tel.: 34-985-104201; Fax: 34-985-103564; E-mail: clo{at}dwarf1.quimica.uniovi.es.
2 GenBankTM accession number H39591, deposited by Hillier, L., Clark, N., Dubuque, T., Elliston, K., Hawkins, M., Holman, M., Hultman, M., Kucaba, T., Le, M., Lennon, G., Marra, M., Parsons, J., Rifkin, L., Rohlfing, T., Soares, M., Tan, F., Trevaskis, E., Waterston, R., Williamson, A., Wohldmann, P., and Wilson, R., Washington University-Merck EST project.
3 I. Santamaría and C. López-Otín, unpublished results.
ABBREVIATIONS
The abbreviations used are: PAC, P1 artificial chromosome; bp, base pair(s); E-64, trans-epoxysuccinyl-L-leucylamido-(4-guanidino)butane; EST, expressed sequence tag; PAGE, polyacrylamide gel electrophoresis; PCR, polymerase chain reaction; Z-Phe-Arg-AMC, benzyloxycarbonyl-L-phenylalanyl-L-arginine-7-amido-4-methylcoumarin; FISH, fluorescent in situ hybridization.
REFERENCES