(Received for publication, September 27, 1994; and in revised form, November 18, 1994)
From the
The identification of proteinases of Porphyromonas gingivalis that act as virulence factors in periodontal disease has important implications in the study of host-pathogen interactions as well as in the discovery of potential therapeutic and immunoprophylactic agents. We have cloned and characterized a gene that encodes the 50-kDa cysteine proteinase gingipain or Arg-gingipain-1 (RGP-1) described previously (Chen, Z., Potempa, J., Polanowski, A., Wikstrom, M., and Travis, J. (1992) J. Biol. Chem. 267, 18896-18901). Analysis of the amino acid sequence of RGP-1 deduced from the cloned DNA sequence showed that the biosynthesis of this proteinase involves processing of a polyprotein that contains multiple adhesin molecules located at its carboxyl terminus. This finding corroborates previous evidence (Pike, R., McGraw, W., Potempa, J., and Travis, J. (1994) J. Biol. Chem. 269, 406-411) that RGP-1 is closely associated with adhesin molecules, and that high molecular weight forms of the proteinase are involved in the binding of erythrocytes.
Mammalian periodontal diseases result from complex interactions
between the host and a variety of anaerobic microorganisms. A number of
studies have suggested an important role for Porphyromonas
gingivalis in human periodontal tissue
destruction (1, 2, 3). Several potential
virulence factors, including the elaboration of proteinase activity,
have been identified for this
organism (4, 5, 6). Proteinases have been
proposed to play a major role in periodontal disease because of their
capacity to degrade protective host immunoglobulins and to hydrolyze host
proteins that provide amino acids required for growth, and because of their
participation in the destruction of host connective tissue (7, 8, 9, 10, 11, 12, 13).
Furthermore, there are reports indicating a direct involvement of
"trypsin-like" proteinases of P. gingivalis in its
binding to erythrocytes and extracellular matrix components. This
suggests that some of the P. gingivalis proteolytic enzymes
associated with the cell surface function as adhesins that mediate
bacterial adherence to host tissues (reviewed in Ref. 14).
Several groups have reported the cloning of proteinase genes from P.
gingivalis (15, 16, 17, 18, 19, 20).
We report here the molecular cloning of a gene that encodes gingipain-1
(RGP-1), a 50-kDa arginine-specific cysteine proteinase
described previously by Chen et al. (21) and found
predominantly in culture medium as 95- and 50-kDa proteins, or
associated with bacterial membranous fractions as 110- and
70-90-kDa forms.
Although the role of the RGP-1 proteinase in the development of periodontal disease is not yet fully clear, recent results have indicated that this proteinase is the major vascular permeability enhancement factor of P. gingivalis, resulting in gingival crevicular fluid production at sites of periodontitis caused by infection with this organism (22). It has been shown previously that 110- and 95-kDa RGP-1 protein complexes possess erythrocyte-binding properties (23), suggesting an association between proteolytic and hemolytic activities. Here we show, from analysis of the amino acid sequence deduced from the rgp1 gene sequence, that this proteinase/adhesin association is derived from the biosynthesis of RGP-1 as a polyprotein that contains multiple adhesin domains at the carboxyl terminus of the previously identified proteinase. By comparison of the proteinase domain of the polyprotein with the sequences of other cysteine proteinases, RGP-1 appears to be a member of a new family of pathogenic proteinases.
We show here that RGP-1, the major arginine-specific
cysteine proteinase from P. gingivalis, is synthesized as a
polyprotein that can function as an erythrocyte-binding protein through
the presence of multiple adhesin domains at its carboxyl terminus. Chen et al. (21) have determined the primary structure of
the amino terminus of RGP-1 by direct amino acid sequencing. This
sequence information was used to prepare a mixture of synthetic
oligonucleotides: primer GIN-1-32, a 32-mer coding for amino acids
2-8 of the mature protein (TPVEEKE); and primer GIN-2-30, a
30-mer coding for amino acids 25-32 of the mature protein
(KDFVDWKN). These primers were used to amplify from genomic DNA the
corresponding fragment of the rgp1 gene by PCR. The expected
105-base pair PCR product was cloned and sequenced. On the basis of
this sequence, GIN-8S-48, a unique 48-mer oligonucleotide probe
corresponding to the coding strand of rgp1, was synthesized
and used to screen a DASH DNA library constructed from BamHI-digested P. gingivalis genomic DNA. DNA
sequence analysis of positive clones indicated that the proteinase
domain was encoded by these 3.5-kbp clones. However, since no
translational termination (stop) codon was evident within the large open
reading frame encoding the proteinase, overlapping clones were isolated
from a size-selected PstI/HindIII plasmid library,
using a 20-mer oligonucleotide probe. Using this procedure, several
4.5-kbp-containing clones were obtained. In total, 7.8 kbp of
genomic DNA from BamHI to HindIII sites (Fig. 1A) was isolated and characterized. The composite
6327-base pair PstI/PvuII fragment of this genomic
DNA (Fig. 1A) was fully sequenced in both directions
and is described here, with the first base of the 5'-PstI site (Fig. 1A) assigned as base 1. Within this composite
sequence was found an open reading frame encoding a 1704-amino acid
sequence (Fig. 1B), with the 5'-most ATG initiation
codon at nucleotides 949-951. Between this ATG and the mature
RGP-1 sequence are an additional 8 in-frame methionine codons. The
exact ATG used for initiation of translation is currently unknown,
although the presence of a consensus TATA box (TATAAT) at nucleotides
889-894 suggests the 5'-most ATG as the strongest candidate.
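The primer design and the coordinates quoted above can be checked with a little arithmetic. The following Python sketch is illustrative only and is not the authors' procedure. It back-translates the two peptides into fully degenerate IUPAC oligonucleotides and locates the end of the open reading frame. It assumes the standard genetic code, that no codon-usage restriction was applied, and that the 1704-residue count starts at the 5'-most ATG; all function names are hypothetical. The actual primer sequences are not given here, and the 32- and 30-mers are longer than the 21 and 24 coding nucleotides computed below, so they presumably carried additional bases.

```python
# Illustrative sketch only; not the primer sequences or method used in the paper.
# Fully degenerate IUPAC codons (standard genetic code) for the residues
# that occur in the two peptides TPVEEKE and KDFVDWKN.
DEGENERATE_CODON = {
    "T": "ACN", "P": "CCN", "V": "GTN", "E": "GAR", "K": "AAR",
    "D": "GAY", "F": "TTY", "W": "TGG", "N": "AAY",
}

# Number of concrete bases represented by each IUPAC symbol used above.
IUPAC_FOLD = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "N": 4}


def back_translate(peptide):
    """Return the sense-strand degenerate oligonucleotide for a peptide."""
    return "".join(DEGENERATE_CODON[residue] for residue in peptide)


def degeneracy(oligo):
    """Count the distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in oligo:
        count *= IUPAC_FOLD[base]
    return count


def orf_span(start_nt, n_codons):
    """Return (last coding nt, (stop codon first nt, stop codon last nt)).

    Coordinates are 1-based and inclusive. Assumes the stop codon
    immediately follows the last sense codon.
    """
    last_coding = start_nt + 3 * n_codons - 1
    return last_coding, (last_coding + 1, last_coding + 3)


for peptide in ("TPVEEKE", "KDFVDWKN"):
    oligo = back_translate(peptide)
    print(peptide, oligo, len(oligo), degeneracy(oligo))

# Assumption: the 1704 residues are counted from the ATG at nt 949-951.
print(orf_span(949, 1704))
```

Under these assumptions the coding portions would be 1024-fold and 256-fold degenerate, and the reading frame would end at nucleotide 6060 with a stop codon at 6061-6063, inside the 6327-base pair fragment. The sketch returns sense-strand sequences only; a downstream PCR primer would be synthesized as the reverse complement.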
Figure 1: A, map of the cloned genomic DNA sequence encompassing the rgp1 gene. Only major restriction sites are indicated: B, BamHI; P, PstI; S, SmaI; A, Asp718; Pv, PvuII; H, HindIII. M13 subclones used for DNA sequencing are shown (arrows). The overlap at the 3'-PstI site was determined by sequencing a SmaI/BamHI plasmid clone. Also shown is a schematic representation of the RGP-1 polyprotein structure, including the proposed methionine used for translation initiation, the experimentally determined basic residue cleavage sites, and the experimentally determined active site cysteine residue of the previously identified 50-kDa mature RGP-1 (21). B, full deduced amino acid sequence of the RGP-1 polyprotein in single-letter code. Peptide sequences identified previously (23), or as part of the present work, are underlined. The first amino acid of the mature RGP-1 proteinase (tyrosine) is assigned as amino acid 1.
The most striking feature of the deduced protein sequence is the presence of multiple homologous sequences immediately carboxyl-terminal to the proteinase coding domain (Fig. 1), leading to a calculated molecular mass of 185.4 kDa for the encoded polyprotein. Within these sequences can be found peptides identified by Pike et al. (23) as the components of high molecular mass gingipain (HGP) that confer adhesion activity on the high molecular mass RGP-1 complex. The polyprotein sequence deduced from the gene sequence now allows exact delineation of the primary structure of the mature RGP-1 proteinase. The amino terminus (23) is derived from proteolytic processing at an arginine residue (Fig. 1A). The carboxyl terminus is derived by processing at Arg-492, which also releases the amino terminus of the 44-kDa HGP (HGP44, Fig. 1, A and B, underlined) determined by Pike et al. (23). Similar processing at Arg-1202 gives rise to the amino terminus of the 27-kDa HGP27 (Fig. 1, A and B, underlined), also found to be associated with RGP adhesion activity (23).
More recently, we have used high resolution
SDS-PAGE (27) to separate the 95-kDa form of RGP into 5 major
bands of 50, 44, 27, 17, and 15 kDa. Amino-terminal
sequence analysis confirmed the structures of the 50-, 44-, and 27-kDa
fragments reported previously (23). For the 17- and 15-kDa
fragments, the following amino termini were determined: PQSVWIERTVDL
and ADFTETFESSTHG, respectively. Thus, the amino terminus of the 17-kDa polypeptide (HGP17, Fig. 1, A and B, underlined) is
generated by cleavage after lysine residue 1044, most likely catalyzed by Lys-gingipain,
the other cysteine proteinase produced in large quantities by P.
gingivalis (23). The sequence of the 15-kDa fragment
(HGP15, Fig. 1, A and B, underlined)
reveals that processing occurs after Arg-909. The calculated molecular
mass of 53.9 kDa for RGP-1 is in good agreement with its mobility of
50 kDa on SDS-PAGE. Similarly, the 417-, 275-, 158-, and 135-amino
acid sequences of HGP44 (44.7 kDa), HGP27 (29.6 kDa), HGP17 (17.5 kDa),
and HGP15 (14.3 kDa), respectively, correlate well with their SDS-PAGE
mobilities. Together, the proteinase domain and adhesin/hemagglutinin
fragments would create a polyprotein of 159.9 kDa, while the high
molecular mass form of RGP (HGP) is only 95 kDa, indicating that the
secreted enzyme is most likely processed and assembled as a
non-covalent complex of the proteinase with different individual
adhesin/hemagglutinin domains.
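The cleavage sites and fragment lengths reported above can be cross-checked against one another. The sketch below is a consistency check rather than part of the original analysis. It assumes the amino- to carboxyl-terminal order RGP-1, HGP44, HGP15, HGP17, HGP27 (inferred from the cleavage sites), takes the length of mature RGP-1 as 492 residues (residue 1 to Arg-492), and assumes the polyprotein ends at the carboxyl terminus of HGP27, which the text does not state. The function name is hypothetical.

```python
# Consistency check on the reported domain lengths, cleavage sites, and masses.
# Order and the 492-residue RGP-1 length are inferred from the text (assumptions).
DOMAINS = [
    # (name, length in residues, calculated mass in kDa)
    ("RGP-1", 492, 53.9),
    ("HGP44", 417, 44.7),
    ("HGP15", 135, 14.3),
    ("HGP17", 158, 17.5),
    ("HGP27", 275, 29.6),
]

ORF_RESIDUES = 1704  # length of the deduced open reading frame


def layout(domains, first_residue=1):
    """Return (name, start, end) spans in mature-protein numbering."""
    spans = []
    start = first_residue
    for name, length, _mass in domains:
        end = start + length - 1
        spans.append((name, start, end))
        start = end + 1
    return spans


spans = layout(DOMAINS)
for name, start, end in spans:
    print(f"{name}: residues {start}-{end}")

# Sum of the one-decimal masses quoted in the text.
total_mass = round(sum(mass for _name, _length, mass in DOMAINS), 1)
print("summed mass (kDa):", total_mass)

# Residues of the reading frame upstream of mature residue 1,
# assuming the polyprotein ends with HGP27 (not stated in the text).
upstream = ORF_RESIDUES - spans[-1][2]
print("residues upstream of mature RGP-1:", upstream)
```

The domain ends fall at residues 492, 909, 1044, 1202, and 1477, matching the reported cleavages after Arg-492, Arg-909, Lys-1044, and Arg-1202. The one-decimal masses sum to 160.0 kDa, which agrees with the quoted 159.9 kDa within rounding, and under the stated assumption 227 residues of the 1704-residue reading frame would lie upstream of the mature amino terminus.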
Three large repeats of homologous sequence are located within three of the cleavage products (Fig. 2A). The first is found in the middle of HGP44, the second in HGP17, and the third in the carboxyl-terminal region of HGP27. Amino acid sequence identities within these 49-amino acid stretches vary between 76 and 96%, indicating similar roles for each sequence, possibly in non-covalent interactions with the proteinase domain, or in the adhesion activity of the high molecular mass complexes.
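Percent identity over an ungapped alignment is simply the fraction of aligned positions that carry the same residue. The sketch below illustrates the measure with short made-up sequences, not the actual repeats, which appear only in Fig. 2A. The function names are hypothetical, and the conversion to residue counts assumes that the quoted percentages were computed over ungapped 49-residue alignments.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions with identical residues (ungapped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def identical_positions(percent, length=49):
    """Approximate number of identical positions for a given percent identity."""
    return round(percent * length / 100)


# Toy sequences (hypothetical, for illustration only): 9 of 10 positions match.
toy_a = "ACDEFGHIKL"
toy_b = "ACDEFGHIKV"
print(percent_identity(toy_a, toy_b))

# Residue counts implied by the quoted range over 49 positions.
print(identical_positions(76), identical_positions(96))
```

Under that assumption, 76 and 96% identity correspond to roughly 37 and 47 identical positions out of 49.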
Figure 2: A, alignment of conserved areas within the adhesin domains of the RGP-1 polyprotein sequence. B, identification of the active site cysteine residue as Cys-185 by active site labeling and separation of V8 proteinase- and Asp-N-derived peptides. X refers to unidentifiable amino acid residues at a given Edman degradation cycle. During such analyses, these positions are characteristically associated with cysteine or modified amino acid residues in the polypeptide chain.
The mature RGP-1 sequence exhibited no similarity with
any cysteine proteinase reported previously except for the related
enzyme from the same organism, referred to as Lys-gingipain (23). Even in this case, the similarity is
limited to the sequence around His-211 and Asn-442, suggesting that
these residues, along with the active site cysteine residue (Cys-185),
determined by active site labeling (Fig. 2B), encompass
the catalytic triad. Interestingly, the lack of any similarity around
these sequences with cysteine proteinases other than Lys-gingipain
suggests that these bacterial proteinases represent a distinct branch
of this family of proteolytic enzymes. Because the catalytic apparatus of
RGP-1 might differ from that of other known cysteine proteinases,
the residues other than Cys-185 that are directly involved in the
hydrolysis of peptide bonds must be verified experimentally.
Molecular cloning of the rgp1 gene confirms previous findings that this arginine-specific proteinase is closely associated with adhesion activity. High trypsin-like activity has been shown previously to be an important virulence factor in P. gingivalis. Sequencing of the corresponding rgp1 gene of the virulent W50 strain of P. gingivalis revealed only two conservative amino acid changes in the mature enzyme sequence (data not shown). Thus, any involvement of RGP-1 in virulence would have to be due to its differential regulation, and enhanced expression in virulent strains. The availability of RGP-1 DNA sequences will now allow the further study of hemolysis of erythrocytes through the adhesin/hemagglutinin activities of this proteolytic polyprotein. Recombinant polypeptides can also be used for the development of potential immunoprophylactic and therapeutic agents against this human pathogen.
The nucleotide sequence reported in this paper has been submitted to the GenBank/EMBL Data Bank with accession number U15282.