©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
The Sequence and Conformation of Human Pancreatic Procarboxypeptidase A2
cDNA CLONING, SEQUENCE ANALYSIS, AND THREE-DIMENSIONAL MODEL (*)

(Received for publication, November 14, 1994; and in revised form, December 30, 1994)

Lluis Catasús (1)(§) Josep Vendrell (1) Francesc X. Avilés (1)(¶) Suzanne Carreira (2) Antoine Puigserver (2) Martin Billeter (3)(**)

From the (1)Departament de Bioquímica i Biologia Molecular, Unitat de Ciències and Institut de Biologia Fonamental, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain, the (2)Laboratoire de Biochimie et Biologie de la Nutrition, Faculté des Sciences St-Jérôme (CNRS-URA 1820), Université d'Aix-Marseille III, 13397 Marseille Cedex 20, France, and the (3)Institut für Molekularbiologie und Biophysik, Eidgenössische Technische Hochschule Hönggerberg, CH-8093 Zürich, Switzerland

ABSTRACT

A full-length cDNA clone coding for human pancreatic preprocarboxypeptidase A2 has been isolated from a λgt11 human pancreatic library. Expression clones were identified by specific interaction with antisera raised against the native protein. The open reading frame of the polynucleotide sequence is 1254 base pairs in length and encodes a protein of 417 amino acids. This cDNA includes a short leader signal peptide of 16 amino acids and a 94-amino acid-long activation segment. The amino acid sequence shows 89% identity to that of rat procarboxypeptidase A2, the only A2 form sequenced so far, and 64% identity to that of human procarboxypeptidase A1. The newly determined sequence was modeled to the three-dimensional crystal structures of both bovine carboxypeptidase A and porcine procarboxypeptidase A1 by a novel distance geometry approach. Biases in the modeling were avoided by relying exclusively on automatic procedures and by using random structures as starting points. Information taken from the known homologous structures refers only to the backbone since no explicit data describing the conformation of side chains were transferred. Ten structures of human carboxypeptidase A2 were determined on the basis of each of the two known crystal structures. The root-mean-square distance for the backbone atoms between the 10 structures and their mean for 237 selected residues is 0.7 Å when starting from the bovine protein and 0.8 Å for 251 selected residues when starting from the porcine protein. The 94-residue-long activation segment was also determined in the modeling based on the porcine zymogen; its structure is well defined but not its orientation with respect to the enzyme moiety. The model obtained for human procarboxypeptidase A2 is discussed with respect to the specificity and activation of the enzyme.


INTRODUCTION

The traditional classification of pancreatic carboxypeptidases (CPs) (^1) and their zymogens (procarboxypeptidases: pro-CPs) into the A and B forms (1) has changed in recent years after the identification of the A1 and A2 isoforms, first reported for the rat proteins (2, 3). The A1 isoform is equivalent to the forms previously known as A, of which the bovine (4) and porcine (5, 6) species are good representatives, and shows preference for aliphatic C-terminal residues of peptide substrates. The A2 isoform selectively acts on the bulkier aromatic C-terminal residues, and has only been characterized in depth for the enzyme form in the rat system (3, 7); however, no information about the three-dimensional structure of the proenzyme is available.

At the proenzyme level, these proteins show a higher degree of complexity due to their oligomeric association with proenzymes of serine proteases. These associations preferentially involve the precursors of the A1 form, while the A2 and B forms are generally found in the monomeric state (1). This has been clearly demonstrated for the human species (8). In this system, it was also found that the A2 zymogen shows inhibition properties and an activation mechanism closer to those of the B forms than to those of the A1 form, a surprising fact considering its closer homology to the latter in sequence and specificity.

Given the limited information about procarboxypeptidases A2 (pro-CPA2), studies on this isoform from species other than rat may help to confirm its differential character and evolutionary pathway and to understand the molecular reasons for its specific functional properties. With these aims in mind, we have cloned and sequenced a full-length cDNA for A2 from a λgt11 human pancreatic library, as reported in this paper.

A comparative analysis of the deduced amino acid sequence with those from other pro-CPs reveals amino acid substitutions that may well explain the properties of the human A2 zymogen and its active form. These sequence comparisons are further substantiated in the context of our earlier determinations of the primary structure of human pro-CPA1 (9) and of the three-dimensional structures of both porcine pro-CPA1 and pro-CPB (10, 11). Here, we extend this analysis by modeling the primary structure of human pro-CPA2 onto the known structures of bovine CPA (12, 13) and porcine pro-CPA1 (10), to which it shows 64 and 63% identity, respectively.

Comparison studies have shown that sequence identities between two proteins exceeding about 40% reliably indicate similar three-dimensional structures (14). Therefore, it is warranted to model a protein with a new sequence according to the known fold of a homologous protein, provided that the sequence homology is significant. Two important questions arise with respect to the modeled structure. 1) How do the substituted amino acid residues fold within the frame of the known structure, and how do the conserved residues and the backbone fold adapt to the new packing requirements? 2) How accurate is the three-dimensional model of the new protein, in particular in regions with substituted residues or with deletions or insertions? The modeling procedure presented and used here avoids biases by relying exclusively on automatic procedures and on random structures as starting points. Information taken from a homologous structure depends only on its backbone conformation. The local precision of the modeled structure is defined by the variations among the conformers resulting from the repeated application of the procedure to various random start structures.


EXPERIMENTAL PROCEDURES

Materials

Horseradish peroxidase-conjugated Human IgG adsorbed was purchased from Bio-Rad. [α-35S]dATP (>37 TBq/mmol) and [α-32P]dCTP (>110 TBq/mmol) were from Amersham Corp. DNA polymerase and T4 DNA ligase were purchased from Boehringer Mannheim. Restriction endonucleases and Klenow fragment of Escherichia coli were from Promega. The sequencing kit was from Pharmacia Biotech Inc. The human pancreatic cDNA library constructed in bacteriophage λgt11 was generously provided by Dr. M. E. Lowe (Washington University).

Preparation of Antibodies

Anti-(human pancreatic procarboxypeptidase) serum was obtained by immunizing rabbits (strain NZW) with 500 µg of human pro-CPAs and pro-CPB, purified as described previously (8). The resulting IgG were purified by ammonium sulfate precipitation followed by chromatography on a DEAE-Sepharose column equilibrated with a 20 mM potassium phosphate buffer (pH 6.8).

Library Screening

A partial-length CPA2 cDNA clone of 1200 bp was isolated from a human pancreatic cDNA library by immunoscreening performed with antibodies directed against pro-CPs using the procedure described in (15). The 1200-nucleotide-long CPA2 cDNA sequence was 32P-labeled by random priming (Promega kit) and used to probe the human pancreatic cDNA library. Phage plaques were transferred onto nylon filters (Amersham Corp.) as described in (16). The filters were hybridized at 42 °C in 6 × SSC (SSC: 0.15 M NaCl, 0.03 M sodium citrate pH 7.0), 5 × Denhardt's buffer (Denhardt's buffer: 0.02% Ficoll 400, 0.02% polyvinylpyrrolidone, 0.02% bovine serum albumin), 1% sodium dodecyl sulfate, 50% formamide, and 1 µg/ml denatured salmon sperm DNA. The filters were washed 4 times at room temperature for 5 min each with a solution containing 2 × SSC, 0.1% sodium dodecyl sulfate, and then twice under conditions of high stringency at 65 °C for 30 min in 0.1 × SSC, 0.1% sodium dodecyl sulfate. Positive plaques were purified, and bacteriophage DNA was prepared as described in (17). Two of these clones contained a 1331-bp full-length cDNA insert.

Nucleotide Sequence Analysis

One of the full-length cDNA inserts was subcloned into plasmid pUC9. The nucleotide sequence was determined by the dideoxy chain terminating method (18) with the M13 universal primer using a suitable kit from Pharmacia. The plasmid containing the pre-pro-CPA2 cDNA insert was digested with various restriction enzymes, and the obtained cDNA fragments were ligated into M13mp18 vector. The nucleotide sequence was determined in both orientations and across the limits of all cDNA pieces analyzed.

Modeling of the Three-dimensional Structure

For the modeling, structure information was extracted from a reference structure and used in calculations of a set of structures with the new sequence. In a first step, two lists of conformational constraints were assembled from the reference structure. The first list contained all distances shorter than 5.0 Å between the atoms N, Cα, Cβ, and C′. Note that the positions of the Cβ atoms do not depend on side chain conformations. These distances were turned into upper limits on distances by adding 0.5 Å; the resulting constraints thus allowed for some structural variation with respect to the reference structure. Additional distance constraints were used to enforce disulfide bridges and to ligate the zinc ion in carboxypeptidases to the neighboring histidines. These nonsequential bonds involved only residues of conserved sequence fragments. The second list consisted of constraints on all backbone dihedral angles φ and ψ, except for the two terminal angles. Each of these angles was constrained within ± 20° of the angle value in the reference structure. Both constraint lists were then modified to reflect the sequence of the protein with the unknown structure. During this step, constraints were eliminated if they no longer made sense with the new sequence, e.g. distance constraints to Cβ atoms in newly present glycines or angle constraints for φ-angles of newly present prolines. Similarly, all constraints to residues involved in insertions or deletions or to residues less than three positions away from these were removed. In the two crystal structures used for the present modeling, three cis peptide bonds occur in sequence fragments that are conserved among all carboxypeptidases considered here. These were also enforced in the modeled structures.
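
The constraint-extraction step described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the code used with DIANA: the data layout (a dictionary keyed by residue number and atom name), the function names, and the `excluded` set of residues near insertions or deletions are all assumptions made for the example.

```python
import math
from itertools import combinations

BACKBONE = ("N", "CA", "CB", "C")  # N, C-alpha, C-beta, C'
CUTOFF = 5.0        # Å: only shorter reference distances become constraints
TOLERANCE = 0.5     # Å: added to each reference distance
ANGLE_WINDOW = 20.0 # degrees allowed on either side of the reference angle


def upper_distance_limits(coords):
    """coords: {(resnum, atom): (x, y, z)} -> [(atom_a, atom_b, upper_limit)]."""
    atoms = sorted(key for key in coords if key[1] in BACKBONE)
    limits = []
    for a, b in combinations(atoms, 2):
        d = math.dist(coords[a], coords[b])
        if d < CUTOFF:
            limits.append((a, b, round(d + TOLERANCE, 2)))
    return limits


def dihedral_windows(angles):
    """angles: {(resnum, 'phi' or 'psi'): degrees} -> {(resnum, name): (low, high)}."""
    return {key: (value - ANGLE_WINDOW, value + ANGLE_WINDOW)
            for key, value in angles.items()}


def adapt_to_new_sequence(limits, windows, new_seq, excluded=frozenset()):
    """Drop constraints that no longer make sense for the new sequence.

    new_seq:  {resnum: three-letter residue code} for the protein being modeled.
    excluded: residue numbers in or near insertions/deletions (hypothetical input).
    """
    def atom_ok(atom):
        resnum, name = atom
        if resnum in excluded:
            return False
        # Glycine has no C-beta atom.
        return not (name == "CB" and new_seq.get(resnum) == "GLY")

    kept_limits = [(a, b, u) for a, b, u in limits if atom_ok(a) and atom_ok(b)]
    kept_windows = {
        (resnum, name): window
        for (resnum, name), window in windows.items()
        if resnum not in excluded
        and not (name == "phi" and new_seq.get(resnum) == "PRO")
    }
    return kept_limits, kept_windows
```

The two filtering rules mirror the examples given in the text (Cβ constraints for new glycines, φ constraints for new prolines); any further pruning criteria would have to be added explicitly.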

The two lists with distance and dihedral angle constraints and the new sequences were used as input for the distance geometry program DIANA (19). This program, which operates in the dihedral angle space and thus keeps all bond lengths and bond angles at their ideal values, first creates from the given sequence a structure with random conformation. This conformation is then modified by varying all dihedral angles in an attempt to satisfy all constraints from the input lists as well as the requirements of steric repulsion. DIANA was applied to 50 different random start structures using the standard protocol and one REDAC cycle (20). At the end, 3000 iterations of optimization with all distance constraints were added. During this last DIANA step, no dihedral angle constraints were enforced. This should allow for local rearrangements to accommodate the new packing requirements caused by the side chain substitutions. For the description of the newly modeled structure, the ten DIANA conformers that converged best, i.e. with the lowest residual violations of the constraints, were used.
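
The final selection step, keeping the ten best-converged conformers out of 50, amounts to ranking by residual violation. A minimal sketch follows; the scores are randomly generated placeholders, and the function name and the use of a single scalar score per conformer are assumptions rather than a description of DIANA's output format.

```python
import random


def select_best_conformers(residuals, keep=10):
    """residuals: {conformer_id: residual violation score}; lower is better."""
    ranked = sorted(residuals, key=residuals.get)
    return ranked[:keep]


# Hypothetical scores for 50 random start structures (not real data).
rng = random.Random(0)
scores = {i: rng.uniform(1.0, 20.0) for i in range(1, 51)}
best_ten = select_best_conformers(scores)
```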


RESULTS

cDNA Cloning and Sequence Analysis

Using polyclonal antibodies raised against pancreatic pro-CPs, a partial-length cDNA clone was isolated from a λgt11 human pancreatic library. The clone contained a 1200-bp insert coding for a protein possessing 89% identity with residues -68 to 309 of rat pro-CPA2 (3) and 64% identity with the same region in human pro-CPA1 (9). This clone was 32P-labeled by random priming and used to rescreen the same human pancreatic cDNA library at higher stringency.

After two rounds of rescreening, only six hybridization-positive clones were independently isolated from 10^5 recombinant phages; two of these clones contained large inserts. Their cDNAs were isolated at the preparative level, and the size of their inserts was shown to be 1331 bp by digestion with EcoRI restriction endonuclease followed by agarose gel electrophoresis. One of these large cDNA inserts was subcloned into the EcoRI site of the pUC9 vector and entirely sequenced in both orientations. A simple comparison of the amino acid sequence deduced for this clone with the sequences from other pancreatic pro-CPs indicates that this cDNA insert contains the full-length sequence of human pre-pro-CPA2. This was further confirmed by data base screening with the FASTA program (21). The nucleotide sequence and the corresponding amino acid sequence of the protein are shown in Fig. 1.


Figure 1: Nucleotide sequence and corresponding amino acid sequence of human pancreatic preprocarboxypeptidase A2. The signal and activation regions extend from residues -110 to -95 and from residues -94 to -1, respectively. The processing sites for the pre- and pro-peptides are designated by an arrow. The probable polyadenylation signal ATTAAA is underlined. Amino acids are labeled according to the numbering scheme of rat pro-CPA1 (9). Note the two deletions with respect to pro-CPA1 at positions 6 and 57.



The analyzed full-length cDNA insert contained standard 5′- and 3′-flanking regions, a poly(A) tail, and an open reading frame of 1254 bp coding for 417 amino acids. The size of the coding region coincides with that of the homologous A2 isoform from rat (3), the only species in which this form has been sequenced until now. The 5′-flanking region of the human full-length cDNA insert was only 4 bp in length, and the 3′-flanking region had a 48-nucleotide segment containing the consensus polyadenylation signal sequence ATTAAA located 18 nucleotides upstream from the poly(A) tail, which is at least 25 nucleotides in length.

The sequenced human pre-pro-CPA2 encodes a 16-amino acid signal peptide at its N terminus (residues -110 to -95 in Fig. 1), which is presumably cleaved during the expression of the inactive zymogen. This is deduced by comparison with the N-terminal sequences of human pro-CPs previously determined by one of our groups (8). It is worth stressing here that, in contrast to most eukaryotic signal peptides (22), the cleavage site is at a cysteine residue. Comparison with the sequences of active human enzymes (8) also allows us to deduce that the proteolytic activation of the A2 human zymogen occurs by tryptic cleavage of a 94-residue N-terminal fragment and generation of a 307-amino acid-long enzyme (residues -94 to -1 and 1 to 309 with two gaps, respectively, in Fig. 1).
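
The lengths quoted in the two preceding paragraphs are mutually consistent, which can be checked with simple arithmetic. The sketch below shows one reading under which the numbers reconcile; that the 1254-bp open reading frame counts the stop codon, and that the poly(A) tail is taken at its stated minimum of 25 nucleotides, are assumptions made here and are not stated in the text.

```python
# Segment lengths quoted in the text (amino acids).
SIGNAL_PEPTIDE = 16
ACTIVATION_SEGMENT = 94
ENZYME = 309 - 2  # residues 1 to 309 with two gaps

protein_length = SIGNAL_PEPTIDE + ACTIVATION_SEGMENT + ENZYME

# Assumption: the open reading frame length includes the stop codon.
orf_bp = 3 * protein_length + 3

# Assumption: poly(A) tail taken at its stated minimum length.
FIVE_PRIME_BP, THREE_PRIME_BP, POLY_A_BP = 4, 48, 25
insert_bp = FIVE_PRIME_BP + orf_bp + THREE_PRIME_BP + POLY_A_BP

print(protein_length, orf_bp, insert_bp)  # 417 1254 1331
```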

Sequence Comparisons

The amino acid sequence of human pro-CPA2 is compared in Fig. 2 with five other pancreatic pro-CPs, namely bovine A (4), rat A1 (2), human A1 (9), human B (22), and rat A2 (3), and with human mast cell pro-CPA (23). The deduced identities are 64, 61, 64, 42, 89, and 40%, respectively. The two intrachain disulfide bonds of the A2 catalytic domain, as well as the residues involved in catalysis, substrate binding, and coordination of the active site zinc atom, previously determined from crystallographic and kinetic studies on rat CPA2 (3, 7), are preserved in human CPA2.
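
Identity percentages of this kind are computed from a pairwise alignment. The following is a minimal sketch of one common convention (identical residues divided by aligned, non-gap positions); the paper does not state which denominator was used, so this choice is an assumption, and the function name and the short example strings in the note below are made up.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over positions where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)
```

For instance, `percent_identity("ACDE-G", "ACDFHG")` compares five gap-free positions, four of which match, and returns 80.0.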


Figure 2: Comparison of the amino acid sequence of human pro-CPA2 to those of bovine pro-CPA, both rat and human pro-CPA1, human mast cell pro-CPA, human pro-CPB, and rat pro-CPA2. The numbering system of the carboxypeptidase moiety is made according to the bovine A enzyme (4). Residues in the activation segments are written in italics and are preceded by an A, and the numbering system results from the alignment based on maximal coincidence of secondary structure elements, except in their C-terminal regions, where alignment is also based on maximal point identities. The sequence of reference in the latter case is that of porcine pro-CPB, and each insertion or deletion is considered to occupy one position (see (11)). The actual length of the activation segments is 94 residues for all the A forms and 95 for the B forms. Only the amino acids that differ from those of the bovine pro-CPA sequence are shown for the other proteins. Dashes represent amino acid deletions. Open circles above the alignments identify the two deletions in the enzyme moiety of the A2 forms. Asterisks are placed over the functionally important residues, which are discussed in the text. The arrow indicates the site of the primary tryptic activation cleavage.



The alignments shown in Fig. 2 confirm that the overall sequence homology in the activation segment regions is substantially lower than that in the enzyme moieties. When the human A2 activation segment is compared with the corresponding bovine A, rat A1, human A1, human B, rat A2, and human mast cell A activation segments, the following identity scores are obtained, respectively: 55, 51, 56, 23, 82, and 22%. These data confirm that the human A2 form is closer to the A1 form than to the B form. In particular, human A2 shows an important deletion at relative positions 45-48 in the activation segments, like the porcine A1 form, which corresponds to the region folded as a 3₁₀ helix in the three-dimensional structure of pro-CPB (11). The functional importance of these and other changes will be discussed below. These sequence comparisons also support the notion that mast cell pro-CPA is closer to pancreatic pro-CPB than to pancreatic pro-CPAs.

Structure Modeling

The three-dimensional structure of the human A2 form was modeled using two known crystal structures. First, a structure was derived for the 305 residues forming the enzyme part (residues 3 to 309 in Fig. 1) using the crystal structure of bovine CPA (12). This reference structure was determined at very high resolution, 1.54 Å (13), and it shares 64% sequence identity with human CPA2. A second model was derived for the complete proenzyme human pro-CPA2 (residues -94 to 309 in Fig. 1) using the crystal structure of porcine pro-CPA1 (10). The latter reference structure was determined to a resolution of 2 Å, and 63% of its residues are identical to the corresponding residues in human pro-CPA2. Two deletions in human CPA2 with respect to both crystal structures are located near the N terminus and in a surface loop at positions 6 and 57 in Fig. 1; they have little influence on the structure of the rest of the protein. In the following, the two models obtained with the procedure described under "Experimental Procedures" are presented jointly.

The list of upper distance constraints used for human CPA2 contains 4210 entries, and the one for human pro-CPA2, 5563 entries. These numbers include three constraints for the enforcement of each of the two disulfide bridges between cysteines 138 and 161, and cysteines 210 and 244. The former disulfide bridge is conserved in all CPAs, while the latter is observed only in the A2 forms. The corresponding residues in the crystal structures of the A1 forms are, however, separated by only a short distance (e.g. the distance from Thr210 C to Ile244 C in bovine CPA is 3.8 Å). A few additional distance constraints enforce the interaction of the zinc ion with histidines 69 and 196. Again, these 2 histidines are part of sequence fragments that are conserved in all CPAs considered here. The calculations with DIANA yielded for each constraint list 10 conformers with all residual violations of the distance and the van der Waals' constraints below 1.2 Å (with the exception of one van der Waals' constraint in one conformer), indicating that the human CPA2 sequence can assume the overall chain fold of bovine CPA and of porcine pro-CPA1. No upper distance constraint from the 4210-entry list, one upper distance constraint from the 5563-entry list, and exactly one van der Waals' constraint were violated by more than 0.5 Å in more than three conformers. Thus, no serious inconsistency of the distance constraints with the altered packing requirements due to the new sequence occurred.
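
The consistency check reported above (constraints violated by more than 0.5 Å in more than three conformers) can be expressed as a small filter. This is a hedged sketch: the input layout, a mapping from a constraint identifier to its residual violation in each conformer, and the function name are assumptions, not the format produced by DIANA.

```python
def persistent_violations(violations, threshold=0.5, max_conformers=3):
    """Return constraints violated by more than `threshold` Å
    in more than `max_conformers` conformers.

    violations: {constraint_id: [residual violation in Å, one per conformer]}
    """
    flagged = []
    for constraint_id, values in violations.items():
        count = sum(value > threshold for value in values)
        if count > max_conformers:
            flagged.append(constraint_id)
    return flagged
```

A constraint flagged by this filter is one that the new sequence systematically struggles to satisfy, as opposed to an isolated violation in a single conformer.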

Model Description

Table 1 summarizes the r.m.s. distance comparisons that describe the precision of the modeled human CPA2 and pro-CPA2 structures as well as their structural similarity to bovine CPA and porcine pro-CPA1. In the proenzyme model, the backbone r.m.s. distance of 1.49 Å for the complete protein drops to 1.16 and 0.99 Å if the two domains are considered separately; this is indicative of the structural independence between the N-terminal activation domain and the enzyme. This feature coincides with earlier observations according to which rather loose contacts occur between the two domains in both the porcine A and B forms of pro-CPs (10, 11). Both models are well defined with r.m.s. distance values around 1 Å for the backbone, even lower values if some surface loops are excluded (selected residues in Table 1), and values up to 1.6 Å when side-chain atoms are included. The r.m.s. distance values for the comparisons to the crystal structures of the A1 forms are somewhat larger, implying the presence of structural differences between the new modeled structures and the crystal structures, which served as starting points for the modeling. The differences between the two models amount to an r.m.s. distance value of 1 Å for the backbone; they are therefore of a size similar to the precision within each model and smaller than the differences to the crystal structures. For this reason, only the model for human pro-CPA2 is presented in the following. The only difference of significance between the two models affects the N-terminal 6 residues of the enzyme, which are not well defined in the model of human CPA2.
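
The precision measure used here, the r.m.s. distance between each conformer and the mean structure, can be illustrated as follows. The sketch assumes that the conformers have already been superposed (no fitting step is included) and that the reported figure is the average over conformers; both points, and the function names, are assumptions made for this example.

```python
import math


def mean_structure(conformers):
    """Average the coordinates atom by atom over all conformers."""
    n = len(conformers)
    n_atoms = len(conformers[0])
    return [
        tuple(sum(conf[i][k] for conf in conformers) / n for k in range(3))
        for i in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between two equally ordered atom lists."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def mean_rmsd_to_mean(conformers):
    """Average r.m.s. distance of the conformers to their mean structure."""
    mean = mean_structure(conformers)
    return sum(rmsd(conf, mean) for conf in conformers) / len(conformers)
```

Restricting the atom lists to backbone atoms, or to a subset of selected residues, gives the corresponding partial values.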



The backbone superposition of human pro-CPA2 and porcine pro-CPA1 in Fig. 3 shows that backbone deviations either occur in surface loops or are due to a different position of the activation domain relative to the enzyme but that the interior with the binding pocket is very similar. Thus the 153 side-chain replacements and the two deletions in human pro-CPA2 with respect to porcine pro-CPA1 require no significant change of the backbone structure. The residues involved in zinc binding and catalysis (His, Glu, Arg, His, and Glu) and substrate anchoring and positioning (Arg, Asn, Arg, and Tyr) are conserved among all CPAs, and therefore no significant structural difference occurs between the model and the crystal structure. Among other residues of importance for substrate binding such as Arg, Ser, Tyr, Ser, Met, Ile, Ile, Ala, Gly, Ser, Ile, Asp, Ala, and Phe, which are responsible for the enzyme specificity, only 3 are not conserved between human pro-CPA2 and porcine pro-CPA1 or bovine pro-CPA. Met203 of human pro-CPA2 corresponds to Leu in both A1 forms, Ser254 to Val in porcine pro-CPA1, and Ala268 to Ser in porcine pro-CPA1 and to Thr in bovine pro-CPA. In the model structure of human pro-CPA2, all 3 residues are well defined and next to each other (Fig. 4). Compared with the A1 forms, the flexible Met and the small Ser and Ala present a binding surface more capable of accommodating a bulky side chain from the substrate. The presence of an Ile residue at position 255 in the specificity site (7) confirms that the A types share this characteristic, in contrast to the Asp residue found at this position in the B types. This residue, located in the center of the binding pocket of the catalytic domain, seems to be a critical determinant for the different specificities for hydrophobic or for charged substrate residues.


Figure 3: Stereo view of the human pro-CPA2 model and the porcine pro-CPA1 crystal structure. The Cα traces of both structures are shown after superposition of the backbone of the enzyme part. The enzyme part of the model is drawn with a heavy line; its activation domain and the entire porcine pro-CPA1 are drawn with thin lines.




Figure 4: Stereo view of the binding pocket of human pro-CPA2 and porcine pro-CPA1. Shown are the backbone fragments 61-68, 141-145, 188-206, 236-257, and 261-271 of both molecules with thin lines and side chains of residues 203, 254, and 268 (Met, Ser, and Ala in human pro-CPA2; Leu, Val, and Ser in porcine pro-CPA1) with heavy lines.



As mentioned above, the major structural difference regarding the activation segment of human pro-CPA2 and porcine pro-CPA1 is its imprecise positioning with respect to the enzyme part. The largest difference occurs for the two helices on the outer face of the activation domain (Fig. 3). Also, the connecting helix between the activation and enzyme domains is not very well defined. Nevertheless, the superposition of the activation domains of the two structures (not shown) indicates coincidence of all important side chains within the structural spread of the human pro-CPA2 model.


DISCUSSION

The results shown here demonstrate that a pro-CPA2 form is encoded in the human genome. The abundance of its cDNA in a pancreatic library and the coincidence of its derived N-terminal sequence with that of a major procarboxypeptidase form previously isolated from the human pancreas (8) confirm that pro-CPA2 is abundantly expressed in this organ in humans. A similar reasoning can be applied to the A1 human form previously cloned by our laboratories (9). Taken together, all this confirms the former assignment of the zymogens (B2, B1, A2, A1, and A1 binary complex) separated by high performance liquid chromatography on DEAE columns, which was based on different biochemical properties (8).

The higher number of sequence identities found between human pro-CPA2 and rat pro-CPA2 (89% identity), as compared with human pro-CPA2 and human pro-CPA1 (64%), is in agreement with the proposal of Gardell et al. (3) about the formation of the two isoforms by gene duplication before speciation of mammals. A careful alignment of these sequences indicates several features that back the differentiation of A2 from A1, such as the particular occurrence of Cys210 and Cys244 in the former, both residues forming a disulfide bond in the rat A2 form (7) and being absent in the A1 forms. This disulfide bond is of importance, given that it may affect the conformation and dynamics of the active-site surface loop extending from residue 242 to 263. Another distinguishing structural feature is the pair of deletions at positions 6 and 57 in human pro-CPA2, also evident in the rat counterpart.

The occurrence of two carboxypeptidase A isoforms has only been described in the human and rat species. The lack of reports about the occurrence of A2 in bovine (the most studied system) does not guarantee its absence, even if it is taken into account that bovine CPA has a specificity that is mid-way between the A1 and A2 isoforms. Most likely, its apparent absence is due to the inability of the separation procedures to isolate the different forms at the protein level. The resolution of this problem probably requires the use of more efficient separation methodologies or of molecular genetics approaches applied to different vertebrate species besides bovine.

The comparison between human carboxypeptidases A2 and A1, rat A2, and bovine A forms (the latter two of known three-dimensional structures) clearly points out the conservation of the residues that are important for catalysis and for the delineation of the binding site cavity for small substrates or inhibitors (such as Gly-Tyr). Thus, the Zn ligands (His, Glu, His) and the residues involved in catalysis (Glu, Arg) and in substrate anchoring and positioning (Arg, Arg, Asn, Arg) are present in all of these molecules. Also, residues that form the surface loop and close the active site (Ile, Tyr, Ala) are always present. Differences may be expected at those residues forming the specificity pocket. The substitutions at positions 203, 253, 254, or 268 in the A1 forms by residues of smaller side chains in human pro-CPA2 should favor the enlargement of the specificity cavity to facilitate the recognition of substrates with bulkier aromatic residues, one of the characteristics of the A2 function. From these points of view, the previous hypothesis for the specificity basis of the rat A2 form (3) is fully backed in the human counterpart.

Sequence comparisons between the activation segment regions of different pro-CPs provide clues for their inhibition mechanisms and activation processes (1). These comparisons are shown in Fig. 5 for the human A2 form and the porcine A and B forms, of known tertiary structures (10, 11). Beforehand, it is interesting to indicate that the porcine enzymes are folded in those regions in an α/β open sandwich formed by two α-helices packed on one side of a four-stranded β-sheet plus two reverse turns; an extra 3₁₀ helix is found packed at the side of the two α-helices in the B form. When the three sequences are aligned on the basis of both secondary structure and residue conservation, it is found that the human A2 form exhibits 56 and 24% sequence identity with the A and B porcine forms, respectively.


Figure 5: Primary and secondary structure comparison between the activation segment (As) of porcine pancreatic pro-CPs A1 and B (10, 11) and human A2. The numbering system adopted is explained in Fig. 2. The actual length of the activation segments is 94 residues for the A forms and 95 for the B forms. The boxes mark the limits of the secondary structure elements observed, which are identified below the sequences.



Residue conservation at the activation segment regions of pro-CPs is therefore lower than that observed among the enzymes themselves. However, the substitutions do not modify the secondary structure propensities of those regions, as shown by sequence-based prediction and by homology modeling. As depicted in Fig. 5, the deletion of residues 45-48 with respect to the B form implies the loss of a 3₁₀ helix in A2, as is also the case in porcine A1. Given that in pro-CPB this region has been attributed the role of keeping the proenzyme devoid of activity toward small substrates (10), we may expect a change in this property in human A2. Indeed, this has been proved by kinetic analysis with Bz-Gly-L-Phe, which indicated that human pro-CPA2 presents a significant residual activity against this substrate (not shown).

Contradictory results seem to appear at the C terminus of the activation segment (residues -18 to -1 in Fig. 1, or residues 84-101 according to the alignment of Fig. 5). This region connects the activation domain to the enzyme moiety and is of prime importance in the proteolytic activation mechanism of pro-CPs (1, 24). The alignments of sequence and secondary structure elements (both predicted from the sequence and observed in the model for human pro-CPA2) indicate the presence of a long α-helix for this region. However, the activation process of human pro-CPA2 (8) is neither bimodal nor slower with respect to human pro-CPB, as observed when this region is structured in a long helix in the A1 forms (1, 10). In the present case, it seems that the long helix does not clamp the activation segment of human pro-CPA2 onto the active enzyme after being severed from it, as suggested for porcine pro-CPA1 (10). More detailed analysis of the three-dimensional structure at this region may give clues to help understand this behavior.


FOOTNOTES

*
This work was supported by Grant BIO92-0458 from the Ministerio de Educación y Ciencia, Spain (CICYT), and by Fundación Francisca de Roviralta and Centre de Supercomputació de Catalunya (CESCA). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) U19977.

§
Programa de Formación del Personal Investigador fellowship recipient (Ministerio de Educación y Ciencia, Spain).

¶
To whom correspondence should be addressed: Tel.: 34-3-5811315; Fax: 34-3-5812011.

**
Recipient of a visiting professor fellowship from Fundación Juan March (Spain).

(^1)
The abbreviations used are: CP, carboxypeptidase; pro-CP, procarboxypeptidase; bp, base pair(s); r.m.s. distance, root-mean-square distance.


REFERENCES

  1. Avilés, F. X., Vendrell, J., Guasch, A., Coll, M., and Huber, R. (1993) Eur. J. Biochem. 211, 381-389
  2. Clauser, E., Gardell, S. J., Craik, C. S., MacDonald, R. J., and Rutter, W. J. (1988) J. Biol. Chem. 263, 17837-17845
  3. Gardell, S. J., Craik, C. S., Clauser, E., Goldsmith, E. J., Stewart, C. B., Graf, M., and Rutter, W. J. (1988) J. Biol. Chem. 263, 17828-17836
  4. Le Huërou, I., Guilloteau, P., Toullec, R., Puigserver, A., and Wicker, C. (1991) Biochem. Biophys. Res. Commun. 175, 110-117
  5. Kobayashi, R., Kobayashi, Y., and Hirs, C. H. W. (1978) J. Biol. Chem. 253, 5526-5530
  6. Vendrell, J., Avilés, F. X., Genescà, E., SanSegundo, B., Soriano, F., and Méndez, E. (1986) Biochem. Biophys. Res. Commun. 141, 517-523
  7. Faming, Z., Kobe, B., Stewart, C.-B., Rutter, W. J., and Goldsmith, E. J. (1991) J. Biol. Chem. 266, 24606-24612
  8. Pascual, R., Burgos, F. J., Salvà, M., Soriano, F., Méndez, E., and Avilés, F. X. (1989) Eur. J. Biochem. 179, 609-616
  9. Catasús, L., Villegas, V., Pascual, R., Avilés, F. X., Wicker-Planquart, C., and Puigserver, A. (1992) Biochem. J. 287, 299-303
  10. Guasch, A., Coll, M., Avilés, F. X., and Huber, R. (1992) J. Mol. Biol. 224, 149-157
  11. Coll, M., Guasch, A., Avilés, F. X., and Huber, R. (1991) EMBO J. 10, 1-9
  12. Quiocho, F. A., and Lipscomb, W. N. (1971) Adv. Protein Chem. 25, 1-78
  13. Rees, D. C., Lewis, M., and Lipscomb, W. N. (1983) J. Mol. Biol. 168, 367-387
  14. Hobohm, U., Scharf, M., Schneider, R., and Sander, C. (1992) Protein Sci. 1, 409-417
  15. Mierendorf, R. C., Percy, C., and Young, R. A. (1987) Methods Enzymol. 152, 458-469
  16. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, p. 2.109, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  17. Benson, J. A. (1987) BioTechniques 2, 126-127
  18. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
  19. Güntert, P., Braun, W., and Wüthrich, K. (1991) J. Mol. Biol. 217, 517-530
  20. Güntert, P., and Wüthrich, K. (1991) J. Biomol. NMR 1, 447-456
  21. Lipman, D. J., and Pearson, W. R. (1985) Science 227, 1435-1441
  22. Yamamoto, K. K., Pousette, A., Chow, P., Wilson, H., El Shami, S., and French, C. K. (1992) J. Biol. Chem. 267, 2575-2581
  23. Reynolds, D. S., Gurley, D. S., Stevens, R. L., Sugarbaker, D. J., Austen, K. F., and Serafin, W. E. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 9480-9484
  24. Oppezzo, O., Ventura, S., Bergman, T., Vendrell, J., Jörnvall, H., and Avilés, F. X. (1994) Eur. J. Biochem. 222, 55-63
