©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Hydroxyarginine-containing Polyphenolic Proteins in the Adhesive Plaques of the Marine Mussel Mytilus edulis(*)

(Received for publication, May 24, 1995; and in revised form, June 14, 1995)

Vladimir V. Papov (2) Thomas V. Diamond (1) Klaus Biemann (2) J. Herbert Waite (1)(§)

From the (1)Marine Biology/Biochemistry Program, College of Marine Studies & Department of Chemistry, University of Delaware, Newark, Delaware 19716 and the (2)Department of Chemistry, 18-587, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

ABSTRACT

An unusual polymorphic protein family of nine or more variants has been isolated from the byssal adhesive plaques and foot of the marine mussel Mytilus edulis. In accordance with established terminology, the family is referred to as M. edulis foot protein 3 or simply Mefp-3. Variants of Mefp-3 have molecular masses of about 6 kDa, isoelectric points greater than 10.5, and an amino acid composition dominated by six amino acids: glycine, asparagine, 3,4-dihydroxyphenylalanine (Dopa), tryptophan, arginine, and an unknown basic amino acid. The latter has been isolated and identified as 4-hydroxyarginine using fast atom bombardment mass spectrometry and appropriate standards. The primary structure of variant Mefp-3F has been determined by peptide mapping using automated Edman sequencing in combination with fast atom bombardment and matrix-assisted laser desorption ionization mass spectrometry: ADYYGPNYGPPRRYGGGNYNRYNRYGRRYGGYKGWNNGWNRGRRGKYW where Y represents Dopa, and R represents hydroxyarginine. Notably, the 4 occurrences of RY are marked by a resistance to trypsin digestion. Although the conversion of tyrosines to Dopa is essentially complete, hydroxylation of arginines varies between 40 and 80%. In contrast to other mussel adhesive proteins such as Mefp-1 and -2 which have large numbers of highly conserved, tandemly repeated peptide motifs, Mefp-3 has only short sporadic repeats. The specific function of Mefp-3 in byssal adhesion is unknown.


INTRODUCTION

The adhesion of marine mussels to underwater surfaces is of scientific and technological interest because it is strong, durable, opportunistic, and not undermined by the presence of water (Waite, 1992). Since mussel adhesion is mediated by the byssus, an external bundle of quinone-tanned threads tipped with flattened adhesive pads or plaques, much recent research has focussed on characterizing those byssal proteins in closest proximity to the substrate surface. Attempts to extract soluble adhesive molecules directly from the plaques have met with little success due to their highly cross-linked nature. Recently, however, Diamond (1993) reported that the plaques deposited by mussels transferred to sea water at 4-8 °C had a greater proportion of extractable protein than those at 15-18 °C, thus suggesting that cross-linking might be temperature-dependent. At least four families of plaque proteins (6, 46, 70, and 120 kDa) have been detected following extraction and polyacrylamide gel electrophoresis in acid-urea (Diamond, 1993). All contain the post-translationally modified amino acid L-3,4-dihydroxyphenylalanine (L-Dopa). (^1)

The polyphenolic protein known as Mytilus edulis foot protein 1 (Mefp-1) was the first of the Dopa-containing byssal precursors to be characterized (Waite and Tanzer, 1981; Filpula et al., 1990). It has a mass of 120 kDa and consists of tandemly repeated decapeptides each containing 2 residues of lysine, 1-2 residues of Dopa (Waite, 1983; Laursen, 1992), 1-2 residues of trans-4-hydroxyproline, and 1 residue of trans-2,3-cis-3,4-dihydroxyproline (Taylor et al., 1994). Because of its highly adsorptive and surface-active behavior in vitro (Notter, 1988; Olivieri et al., 1992; Hansen et al., 1994), Mefp-1 has long been regarded as a key ingredient of mussel adhesion. Unfortunately, confirmation of this role has been dogged by the extreme insolubility of Mefp-1 in byssus (Diamond, 1993; Rzepecki et al., 1992). Although the presence of Mefp-1 in plaques has been demonstrated by immunohistochemical localization (Benedict and Waite, 1986), recent evidence suggests that the protein may in fact be distributed over the entire byssus as a natural coating or lacquer (Rzepecki et al., 1992). Of the other proteins known to be present in adhesive plaques, the recovery of three is much improved in plaques formed at cold temperatures. Mefp-2 (46 kDa) is a major plaque-specific constituent that consists of 11 tandem repeats of an epidermal growth factor motif 37-41 residues in length with the Dopa modifications limited to the non-epidermal growth factor N and C termini of the protein (Rzepecki et al., 1992; Inoue et al., 1995). The remaining two proteins, Mefp-3 and Mefp-4, are also prominent in cold-shocked plaques but until recently no other detailed information was available.

In this paper, we report on a unique 6-kDa family of Mefp-3 proteins that is synthesized and stockpiled in the mussel foot and then specifically deposited into the adhesive plaques of the byssus. These polypeptides resemble the other byssal precursor proteins in basicity and Dopa content, but are unprecedented in containing high levels of a new post-translational modification of arginine, namely, 4-hydroxyarginine. The latter contributed to the complexity of primary structure determination which was only solved with the application of matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS) (Karas and Hillenkamp, 1988).


MATERIALS AND METHODS

Protein Isolation from Mussel Plaques

Adhesive plaques (1,000-2,000) were produced by M. edulis maintained in 200-liter marine aquaria (4-7 °C) and tethered to Plexiglas sheets (18 × 24 inches) at densities of 20/sheet. Plaques were harvested from the sheets at 18-24-h intervals with a single-edged razor, washed with excess double-distilled water, and extracted with 1 ml of 5% (v/v) acetic acid in 8 M urea using mini-tissue grinders (Kontes, Vineland, NJ). The homogenate was centrifuged for 10 min in an Eppendorf Microfuge (15,000 × g), and the supernatant was dialyzed (M(r) cut-off 1,000) against 1 liter of 5% acetic acid at 4 °C and lyophilized for electrophoresis and C-8 HPLC. Those HPLC fractions containing P-2 and -3 were dissolved in distilled water to a final concentration of about 10 mg/ml. The samples were then S-alkylated by the following modification of the method of Hollecker (1990). To 40 µl of sample was added 20 µl of 0.1 M EDTA, pH 7.0. This was followed by the addition of 0.4 ml of fresh borate/ascorbate buffer (0.5 M sodium borate, 0.5 M ascorbic acid, pH 8.0), 20 µl of 1 M dithiothreitol, and 1.2 ml of 8 M urea. After an incubation of 30-40 min at room temperature, 300 µl of fresh 0.25 M iodoacetate in borate/ascorbate buffer at pH 8.0 was added to the reaction mixture, and the reaction was continued for another 15 min. The reaction was stopped by the addition of 20 µl of glacial acetic acid, and the mixture was placed on ice prior to C-8 HPLC.
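The reagent volumes listed for the S-alkylation step can be turned into approximate final concentrations. The following is a minimal sketch, assuming the volumes are simply additive; the helper `final_molar` and all variable names are illustrative and are not part of the published protocol.

```python
# Volumes (µl) taken from the S-alkylation protocol described above.
# Assumption: volumes are additive on mixing.
SAMPLE_UL = 40
EDTA_UL = 20
BUFFER_UL = 400          # borate/ascorbate buffer
DTT_UL = 20              # 1 M dithiothreitol stock
UREA_UL = 1200           # 8 M urea stock
IODOACETATE_UL = 300     # 0.25 M iodoacetate stock


def final_molar(stock_molar, added_ul, total_ul):
    """Concentration after diluting `added_ul` of stock into `total_ul`."""
    return stock_molar * added_ul / total_ul


# Reduction step: everything except iodoacetate.
reduction_total_ul = SAMPLE_UL + EDTA_UL + BUFFER_UL + DTT_UL + UREA_UL
dtt_mM = 1000 * final_molar(1.0, DTT_UL, reduction_total_ul)
urea_M = final_molar(8.0, UREA_UL, reduction_total_ul)

# Alkylation step: iodoacetate is added to the reduction mixture.
alkylation_total_ul = reduction_total_ul + IODOACETATE_UL
iodoacetate_mM = 1000 * final_molar(0.25, IODOACETATE_UL, alkylation_total_ul)

# Molar excess of alkylating agent over reductant (amounts, not concentrations).
iodoacetate_to_dtt = (0.25 * IODOACETATE_UL) / (1.0 * DTT_UL)

print(f"reduction volume: {reduction_total_ul} µl")
print(f"DTT: {dtt_mM:.1f} mM, urea: {urea_M:.2f} M")
print(f"iodoacetate: {iodoacetate_mM:.1f} mM ({iodoacetate_to_dtt:.2f}-fold over DTT)")
```

Under these assumptions the iodoacetate is present in several-fold molar excess over dithiothreitol, which is the usual requirement for the alkylation to proceed in the presence of residual reductant.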

Protein Isolation from Mussel Feet

Mussel feet were amputated from fresh exsanguinated mussels collected from local rafts and jetties and immediately frozen at -80 °C. Phenol glands were dissected from the feet as described previously (Waite, 1983). The glands in lots of 150-200 were forcibly extruded through a garlic press and extracted by homogenization using large (30 ml) Kontes tissue grinders and 100-150 ml of cold 5% acetic acid. The homogenates were centrifuged at 20,000 × g for 40 min in a refrigerated Sorvall 5B centrifuge. After discarding the supernatants, the pellets were rehomogenized on ice in 75 ml of 5% acetic acid in 8 M urea. This was again centrifuged at 20,000 × g for 40 min. Supernatants were pooled, adjusted to 33% (w/v) with ammonium sulfate (Schwarz/Mann) and stirred for 30 min at 7 °C before centrifuging at 15,000 × g for 40 min. To harvest protein not precipitated by 33% ammonium sulfate, the supernatant was dialyzed overnight against at least 4 liters of doubly distilled water or 0.7% (w/v) perchloric acid using dialysis tubing with a 1,000 molecular weight cut-off (Spectrum Industries, Los Angeles, CA). A white floc in the dialysis tubing was harvested by centrifugation for 60 min at 10,000 × g in a swinging bucket rotor (Sorvall HB4) and, following removal of the supernatant, redissolved in a small volume of 5% acetic acid in 8 M urea. Since water- or perchloric acid-insoluble floc is only partially soluble in acetic acid with urea, it typically required centrifugation on an Eppendorf Microfuge (10 min at 15,000 × g) prior to application to C-18 high performance liquid chromatography (HPLC). The 260 × 7 mm column was an RP-300 or OD-300 Aquapore (Applied Biosciences Inc.) eluted with a linear gradient of aqueous acetonitrile (Waite et al., 1992). Eluant was monitored continuously at 280 nm, and collected 1-ml fractions were assayed by amino acid analysis and electrophoresis following freeze-drying.

Preparation and Purification of Peptides

Trypsin and endoproteinase Lys-C (Boehringer Mannheim) digestion of Mefp-3 was performed under the following two reaction conditions. A, for trypsin and endoproteinase Lys-C digestion, about 1 mg of Mefp-3F was dissolved in 0.5 ml of 5 mM Tris ascorbate, pH 7.5 (prepared by adding equal volumes of 10 mM Trizma Base (Sigma) and 10 mM ascorbic acid) at a protease:protein weight ratio of about 1:100, under constant stirring at 22-24 °C for 18 h under 275 kilopascals of N(2). B, for trypsin digestion, Mefp-3F was dissolved in 0.5 ml of 0.15 M sodium borate, pH 8.5, all other conditions remaining the same as above. The progress of digestion in each case was monitored by removing 5-µl aliquots at 3-h intervals for acid-urea gel electrophoresis (see below). The digestion was terminated by addition of 0.3 ml of glacial acetic acid followed by freezing (-80 °C) and lyophilization. The freeze-dried residue was dissolved in 0.5 ml of 5% acetic acid. Resolution of the peptides was achieved by reversed-phase HPLC using a 260 × 4 mm Phenomenex or Microsorb C-18 column (Rainin Instruments, Woburn, MA). Eluting solvent was a linear gradient of aqueous acetonitrile (0-30%) with 0.1% trifluoroacetic acid.

Electrophoresis

Routine electrophoresis was done on polyacrylamide gels (7% acrylamide and 0.2% N,N′-methylenebisacrylamide) containing 5% acetic acid and 8 M urea (Panyim and Chalkley, 1969). This system is ideal for basic Dopa-containing proteins because it can be processed for protein or Dopa staining with equal facility. Proteins were stained with Serva Blue R (Serva Fine Chemicals, Westbury, NY), whereas Dopa was stained with either the Arnow reagents (Waite, 1983) or nitro blue tetrazolium redox cycling (Paz et al., 1991). Apparent molecular weight was determined by polyacrylamide gel electrophoresis in the presence of SDS and discontinuous Tris-glycine (Hoefer). Isoelectric focussing was done on PAGE in 8 M urea in the pH range 7-10. Proteins separated by acid-urea PAGE were horizontally transferred to polyvinylidene difluoride (Immobilon P, Millipore) by electrophoresis at 200 mA in 0.7% acetic acid for 40 min using a Genie Transfer Unit with platinized electrode plates (Idea Scientific, Minneapolis, MN). Transferred proteins were stained with 0.1% Serva Blue G-250 in 40% aqueous methanol with 7% acetic acid and destained with the same solvent minus the stain. Protein bands (usually in quintuplicate) were excised with a clean single-edge razor and prepared as below for hydrolysis and amino acid composition (Tous et al., 1989).

Amino Acid Analysis and Sequencing of Peptides

Peptides and proteins were hydrolyzed in 6 N HCl with 10% phenol and 10% trifluoroacetic acid in vacuo at 150 °C for 20 and 40 min to correct for the losses of certain amino acids (Tsugita et al., 1987). Recovery of tryptophan required hydrolysis in 4 N methanesulfonic acid (Simpson et al., 1972) or in 6 N HCl with 30% phenol in vacuo for 20 and 40 min at 165 °C (Muramoto and Kamiya, 1990). Routine amino acid analysis was by ion exchange HPLC and ninhydrin-based detection system (Beckman System 6300 Auto Analyzer) using a previously described gradient program (Waite, 1991). R was quantified using the molar color yield of arginine. Due to its coelution with ammonia, tryptophan was separately quantitated on System 6300 using a 40-min program of NaD (5% sodium chloride and 1.9% sodium citrate at pH 6) at a column temperature of 70 °C. The amino acid sequence of protein and peptides was derived by automated Edman degradation using a Porton Instruments Microsequencer (Porton, Tarzana, CA). Phenylthiohydantoin derivatives of amino acids were chromatographically separated according to a gradient program specified by Waite (1991). The elution position of phenylthiohydantoin-4-hydroxyarginine was determined by spotting sample glass fiber discs with authentic 4-hydroxyarginine and running for 2-3 cycles.

Isolation of Arginine Derivative

About 3-4 mg of Mefp-3 was hydrolyzed in 2 ml of 6 M HCl with 5% phenol in vacuo for 24 h at 110 °C. After this, the hydrolysate was flash-evaporated to dryness at 60 °C. The residue was taken up in distilled water and applied to a 4 × 200 mm HPLC ion exchange column (Beckman P/N 338076) and eluted isocratically at room temperature with NaD (5% NaCl with 1.9% sodium citrate from Beckman) buffer (flow 0.3 ml/min). Eluting fractions were monitored at 220 nm and manually assayed for the guanidino functionality using the Sakaguchi reaction (Litwack, 1960). Sakaguchi-positive fractions were examined by automated amino acid analysis (Beckman System 6300), and those containing the unknown amino acid were pooled, lyophilized at -80 °C, resuspended in 0.2 M acetic acid, and chromatographed on a column of Bio-Gel P-2 (80 × 1.5 cm) eluted with 0.2 M acetic acid. One-ml fractions were collected and monitored by conductivity, the Sakaguchi reaction, and amino acid analysis. Fractions containing the arginine derivative were freeze-dried and analyzed by fast atom bombardment mass spectrometry (FAB-MS) (Barber et al., 1981). 4-Hydroxyarginine and N-hydroxyarginine standards were kindly supplied by E. A. Bell (King's College, London) and P. L. Feldman (Glaxo Research Institute, Research Triangle Park, NC), respectively.

Mass Spectrometry

Arginine, the unknown hydroxyarginine, and the other arginine derivatives and peptides were ionized by FAB using glycerol as the matrix. A tandem mass spectrometer (JEOL HX110/HX110) having an E(1)B(1)E(2)B(2) configuration was used to generate the mass spectra (Sato et al., 1987). The first mass spectrometer (MS-1) was operated at a resolution of 1:1000 and was used to measure the mass of protonated (M + H)+ ions and to select the 12C-only species for collision-induced dissociation (CID), accomplished by introducing helium into the field-free region between MS-1 and the second mass spectrometer (MS-2). The resulting fragments were analyzed by MS-2 using a linked scan. MS-2 was also operated at a resolution of 1:1000 with the acceleration voltage set to 10 kV, and the collision cell voltage kept at 3 kV above ground. The cesium gun was operated at 20-25 kV and CID profile scans acquired with the JEOL Complement data system.

MALDI-TOF experiments were performed using a Vestec VT2000 LD-TOF (linear) mass spectrometer (Vestec Corp., Houston, TX). The MALDI matrix was prepared by dissolving alpha-cyano-4-hydroxycinnamic acid (10 mg/ml) in 50% acetonitrile. The Mefp-3 protein or peptides derived thereof were dissolved in this matrix solution to give a final concentration between 1 and 10 pmol/µl. About 1 µl of this solution was applied to the target plate and allowed to evaporate. The sample spots were irradiated using a Laser Science N(2) laser (LSI, Inc., Cambridge, MA). The laser (337 nm) has a pulse width of 8 ns and was operated at a repetition rate of 5 Hz. MALDI ionization generates protonated singly and doubly charged ions for the Mefp-3 protein (mostly singly charged ions for peptides) which were accelerated using either 30 or 35 kV accelerating voltage. The resolution was about 1:300 which was sufficient to allow mass assignment of the major peaks due to the different hydroxylation states of the Mefp-3 protein.

The Mefp-3F protein or peptides derived from enzymatic digestion of Mefp-3F (HPLC-purified) were dissolved in either 100 mM ammonium acetate (pH 4.0) or 100 mM ammonium citrate (pH 5.5) and digested with carboxypeptidase P (Boehringer Mannheim). Alternatively, aminopeptidase M (Boehringer Mannheim) in 50 mM sodium phosphate (pH 7.0) buffer was used for N-terminal sequence information. The reactions were performed at room temperature with the enzyme to substrate ratio between 1:10 and 1:100 by weight. Aliquots from the reaction solutions were taken at timed intervals and dissolved in alpha-cyano-4-hydroxycinnamic acid matrix solution before being analyzed by MALDI-TOF mass spectrometry.


RESULTS

Characterization of a Plaque-specific Protein

Newly secreted byssal threads and adhesive plaques of the mussel M. edulis contain a number of proteins that are extractable with 5% acetic acid, 8 M urea and separable by acid-urea-PAGE (Fig. 1A). Several of the plaque proteins (P-1 to P-4) have detectable levels of Dopa as indicated by redox cycling with nitro blue tetrazolium. Plaque protein 3 is the most mobile of the plaque-specific proteins and distinct from the others by its sky-blue metachromasy following staining with Coomassie Blue (Serva Blue R-250). P-3 was separated from P-2 (identical with Mefp-2) and resolved into a cluster of 4 peaks by C-8 reversed phase HPLC of a reduced and S-alkylated plaque extract (Fig. 2A). Each fraction under the peaks was subjected to amino acid analysis following hydrolysis (see P-3 in Table 1), and the N terminus of fraction 32 was sequenced (Table 2). The unique composition and N terminus, i.e. A(A/D)YYGPNYGPPR, indicate that P-3 is distinct from either of the previously purified Dopa-containing proteins associated with adhesion in the byssus of M. edulis (Rzepecki et al., 1992; Waite et al., 1985).


Figure 1: Polyacrylamide gel electrophoresis of mussel byssus and foot-derived proteins. A, byssus-derived proteins on acid-urea gels stained for protein (CB) and redox cycling (NBT): P, 23 µg of byssal plaque extract; T, 25 µg of byssal thread extract; M3, 9 µg of purified foot-derived protein 3 for comparison. B, foot (phenol gland)-derived proteins on acid-urea gels: AA, 5% acetic acid-extracted proteins (41 µg); UA, 5% acetic acid- and 8 M urea-extracted proteins (37 µg); PCA, perchloric acid-precipitated proteins (11 µg). Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS) and isoelectric focussing (IEF) of Mefp-3F (panels at right): apparent molecular mass determination by SDS-PAGE (standards used were rabbit phosphorylase b (92 kDa), bovine serum albumin (68 kDa), ovalbumin (43 kDa), carbonic anhydrase (31 kDa), soybean trypsin inhibitor (20 kDa), and lysozyme (14 kDa)); isoelectric focussing-PAGE of Mefp-3 in the range pH 7-10. Numbers below the lanes denote aliquots taken from HPLC fractions in Fig. 2B.




Figure 2: A, C-8 reversed phase separation of S-alkylated adhesive plaque-derived proteins. Inset, acid-urea gel electrophoresis of protein aliquots taken from fractions 27-45. Fraction 32 (arrow) was sequenced. B, resolution of dialysis-precipitated Mefp-3 by C-18 reversed phase HPLC. Sample load, 1.5 ml; flow rate, 1 ml/min. Stationary buffer, aqueous 0.1% trifluoroacetic acid; mobile buffer, 0.1% trifluoroacetic acid in acetonitrile. Full-scale absorbance at 280 nm is 0.1 (A) and 0.5 (B), respectively. Each graduation in % Acetonitrile equals 10%. Inset, acid-urea polyacrylamide gel of protein aliquots (5 µg) taken from fractions 30-45 under the elution profile. Electrophoretic variants are denoted A to J. Fractions pooled for bulk digestion and Edman sequencing are enclosed by a bracket. Fraction 39 (arrow) was selected for micro Lys-C digestion followed by MALDI analysis.







Isolation of a Matching Foot Protein

A cursory search for a P-3 match in the foot, necessitated by the low yield and labor-intensive protein extraction from the plaques, revealed that a similar protein or rather family of proteins resides in the phenol gland region of the foot. In conformity with current convention, this protein is abbreviated Mefp-3 for Mytilus edulis foot protein 3. Mefp-3 is liberated from phenol glands during homogenization only by including high urea (8 M) concentrations in the 5% acetic acid extraction buffer (Fig. 1B). This may explain previous failures to detect it (Waite and Tanzer, 1981; Rzepecki et al., 1992). By exploiting Mefp-3's high solubility in buffers containing ammonium sulfate and its tendency to precipitate in dilute perchloric acid or water, it can be prepared to >90% purity without reliance on any chromatography (Fig. 1B).

Protein Polymorphism and Composition

Chromatographic polishing of partially purified Mefp-3 was achieved by reversed-phase C-18 HPLC and resulted in an elution profile consisting of 4 major and at least 6 minor peaks and shoulders (Fig. 2B). Acid-urea-polyacrylamide gel electrophoresis of fractions 30-45 suggests that all of the peaks but one contain an electrophoretically heterogeneous mixture of protein variants denoted A to J (Fig. 2B, inset). Only fractions 37-39 would appear to contain electrophoretically homogeneous protein which is denoted as variant F. However, given the asymmetry of the peak over fractions 37-39, the actual purity of even these fractions remains suspect. Amino acid compositions of the major electrophoretically distinct species were determined following acid-urea-PAGE, transfer to polyvinylidene difluoride, and hydrolysis (Table 1). All are dominated by Asx, Gly, and Dopa to a total of about 60 mol %. In point of fact, Mefp-3 Dopa levels are nearly double those of P-3, but losses in the latter are to be expected in the oxidative environment of the plaque. A relatively high tryptophan content (4 mol %) is also present as is an unknown pre-arginine component R (Table 1). All species have an apparent pI > 10.5 as determined by isoelectric focussing in the presence of 8 M urea and Triton X-100 and an apparent molecular mass of 11 kDa as determined by discontinuous SDS-PAGE (Fig. 1B). A more accurate molecular weight determination was obtained by MALDI-TOF mass spectrometry which can routinely provide a mass measurement with an accuracy of ±0.1% using an external standard (Beavis and Chait, 1990). The MALDI-TOF mass spectrum of the major component of Mefp-3 variant F (fraction 39 in Fig. 2B, inset) was acquired utilizing bovine ubiquitin (M + H)+ = 8565.9 and (M + 2H)2+ = 4283.4 as an internal standard to further improve the mass accuracy (Fig. 3). The singly charged region of Mefp-3F is enlarged in the inset and shows that Mefp-3F produces a cluster of peaks that differ from the most highly hydroxylated form (M + H)+ = 6135.2 of the protein in multiples of about 16 mass units.


Figure 3: MALDI-TOF mass spectrum of the major component of Mefp-3 variant F (fraction 39 in Fig. 2B, inset). The peaks labeled STD (M + H)+ at m/z 8565.9 and STD (M + 2H)2+ at m/z 4283.4 are due to the singly and doubly charged ions, respectively, of the internal calibrant bovine ubiquitin. The inset represents an expansion of the singly charged Mefp-3F region of the mass spectrum which reveals the different hydroxylation states of the protein. The observed m/z value of the (M + H)+ ion of the most highly hydroxylated form is 6135.2 which agrees well with the (M + H)+ value 6136.31 calculated for the final amino acid sequence.
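The calculated (M + H)+ value and the 16-mass-unit spacing of the hydroxylation cluster can be checked with a short calculation. The sketch below assumes standard average residue masses, treats every tyrosine as Dopa (+O) and a variable number of arginines as hydroxyarginine (+O), and uses the Mefp-3F sequence given in the abstract. The function and variable names are illustrative only.

```python
# Standard average residue masses (Da) for the unmodified amino acids
# present in the Mefp-3F sequence.
AVG_RESIDUE = {
    "A": 71.0788, "D": 115.0886, "G": 57.0519, "P": 97.1167,
    "N": 114.1038, "K": 128.1741, "W": 186.2132,
    "Y": 163.1760,  # tyrosine; converted to Dopa by adding one oxygen
    "R": 156.1875,  # arginine; converted to hydroxyarginine by adding one oxygen
}
OXYGEN = 15.9994
WATER = 18.0153
HYDROGEN = 1.00794

# Sequence from the abstract (Y = Dopa, R = Arg or hydroxyarginine).
SEQ = "ADYYGPNYGPPRRYGGGNYNRYNRYGRRYGGYKGWNNGWNRGRRGKYW"


def mh_average(seq, n_hydroxyarg):
    """Average (M + H)+ with all Tyr as Dopa and n_hydroxyarg hydroxylated Arg."""
    backbone = sum(AVG_RESIDUE[res] for res in seq) + WATER + HYDROGEN
    return backbone + OXYGEN * (seq.count("Y") + n_hydroxyarg)


n_arg = SEQ.count("R")
# One entry per hydroxylation state, from 0 to all arginines hydroxylated.
ladder = [mh_average(SEQ, k) for k in range(n_arg + 1)]
fully_hydroxylated = ladder[-1]

# Relative deviation of the observed value (6135.2) from the calculated one.
rel_error_pct = 100 * (6135.2 - fully_hydroxylated) / fully_hydroxylated

print(f"{len(SEQ)} residues, {n_arg} Arg/HOArg sites")
print(f"calculated (M + H)+, fully hydroxylated: {fully_hydroxylated:.2f}")
print(f"deviation of observed 6135.2: {rel_error_pct:.3f}%")
```

With these assumptions the fully hydroxylated form comes out at about 6136.31, and the observed value lies well inside the ±0.1% accuracy quoted for MALDI-TOF; successive entries of `ladder` differ by one oxygen, about 16 mass units.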



Isolation of a New Amino Acid

Appearance of a prominent peak (R) eluting prior to arginine during amino acid analysis of Mefp-3 hydrolysates led us to attempt a purification and chemical characterization of this amino acid. Its isolation was easily accomplished in two steps by ion exchange chromatography (Fig. 4A) followed by gel filtration on Bio-Gel P-2 using 3-4 mg of acid-hydrolyzed Mefp-3 (Fig. 4B). Purity was routinely checked by amino acid analysis and paper chromatography (ethyl methyl ketone/propionic acid/water, 2:1:2, by volume on Whatman No. 1 filter paper) using the Sakaguchi reagent spray (Bell and Tirimanna, 1964; Litwack, 1960). R gave an intensely scarlet color with Sakaguchi in contrast to the orange-red of Arg. This indicates the presence of a guanidyl group and agrees with previous studies on hydroxyarginines by Bell and Tirimanna (1964). N-Hydroxyarginine was tested and found to be Sakaguchi-negative.


Figure 4: Purification of hydroxyarginine (R*) from hydrolysates of Mefp-3 by ion exchange (A) and gel filtration (B) chromatography. A, Beckman AA 20 × 0.4 cm column eluted with sodium citrate, pH 6, at 0.3 ml/min. Full-scale absorbance is 2.0 at 280 nm; deflection for nonaromatic amino acids probably reflects changes in the index of refraction. B, Bio-Gel P-2 (45 × 1.5 cm) eluted with 0.4 M acetic acid at room temperature. Flow rate was adjusted to 0.5 ml/min. Fractions (not collected until after 80 ml was eluted) were assayed for conductivity and by amino acid analysis.



Identification of 4-Hydroxyarginine

The new amino acid isolated from Mefp-3 produced an abundant (M + H)+ ion of m/z 191.2 by FAB mass spectrometry indicating a molecular weight of 190.2. This is 16 Da higher than that of arginine and, together with the Sakaguchi reaction, suggested that the unknown amino acid is a hydroxyarginine. The collision-induced dissociation (CID) mass spectra of arginine and the unknown hydroxyarginine are shown in Fig. 5, A and B, respectively. Two abundant ions (m/z 70 and 87) in the CID mass spectrum of arginine are not present to any appreciable extent in the CID mass spectrum of the hydroxyarginine, whereas the latter spectrum exhibits two new peaks at m/z 86 and 103 which correspond to a mass shift of 16 Da, respectively. On the other hand, both spectra contain an ion at m/z 73. The structures of the major arginine fragments were deduced and then verified by obtaining CID spectra from several different arginine derivatives of known structure. From the structures of the ions at m/z 73, 86, and 103 (shown in Fig. S1), it was determined that the unknown amino acid was 4-hydroxyarginine. Its identity was later confirmed by obtaining a CID mass spectrum (Fig. 5C) of authentic 4-hydroxyarginine (Bell and Tirimanna, 1964). In addition, R matches authentic 4-hydroxyarginine in Sakaguchi color, and behaves identically with 4-hydroxy-L-arginine which elutes at 79 min between ammonia and Arg on extended amino acid analysis and forms a unique phenylthiohydantoin-derivative eluting at 8-9 min (between His and Dopa) on C-18 HPLC following automated Edman degradation (results not shown). Although it is reasonable to suggest the L-configuration for the alpha-carbon in light of its susceptibility to trypsin cleavage (unless followed by Dopa), the absolute configuration at C-4 remains unknown.
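The mass arithmetic behind this assignment is easy to reproduce. The sketch below assumes rounded standard atomic weights and the elemental formulas C6H14N4O2 (arginine) and C6H14N4O3 (a monohydroxylated arginine); it also tabulates which of the fragment ions quoted above shift by 16 Da. It does not attempt to model the fragmentation itself, and all names are illustrative.

```python
# Rounded standard atomic weights (Da).
ATOMIC = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def average_mass(formula):
    """Average molecular mass from an element -> count mapping."""
    return sum(ATOMIC[element] * count for element, count in formula.items())


arginine = {"C": 6, "H": 14, "N": 4, "O": 2}
hydroxyarginine = {**arginine, "O": 3}   # one additional oxygen

mw_arg = average_mass(arginine)
mw_hoarg = average_mass(hydroxyarginine)
mh_hoarg = mw_hoarg + ATOMIC["H"]        # protonated molecule, (M + H)+

# Fragment ions (m/z) quoted in the text for the two CID spectra.
arg_fragments = {70, 73, 87}
hoarg_fragments = {73, 86, 103}

# Fragments that reappear 16 units higher carry the extra oxygen;
# fragments common to both spectra do not.
shifted = sorted(m for m in arg_fragments
                 if m not in hoarg_fragments and m + 16 in hoarg_fragments)
unshifted = sorted(arg_fragments & hoarg_fragments)

print(f"Arg {mw_arg:.1f}, HOArg {mw_hoarg:.1f}, (M + H)+ {mh_hoarg:.1f}")
print("shifted by 16:", shifted, "unshifted:", unshifted)
```

The calculated (M + H)+ of about 191.2 matches the observed ion, and the pattern of shifted and unshifted fragments is the kind of evidence the authors used to locate the hydroxyl group.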


Figure 5: CID mass spectra using FAB ionization of arginine (A), hydroxyarginine derived from Mefp-3 (B), and authentic 4-hydroxyarginine (C).




Figure S1: Scheme 1.



Peptide Mapping and Sequencing

To elucidate the primary structure of Mefp-3, the relatively homogeneous HPLC fractions 37-39 were initially selected for further characterization. These ostensibly contained a single electrophoretic variant F (Mefp-3F) with a single N-terminal sequence up to residue 11. Beyond this point, each incidence of phenylthiohydantoin-Arg is accompanied by another peak corresponding to phenylthiohydantoin-hydroxyarginine at 8.7 min. Digestion of Mefp-3F with trypsin in 0.1 M Tris-ascorbate resulted in 28 peaks (many more than a mass of 6 kDa could have accommodated) (Fig. 6A). Sequencing was undertaken but complicated by the coelution of related peptides, e.g. every peptide beginning with Dopa had a counterpart beginning with HOArg-Dopa, and every peptide ending with Arg had a counterpart ending with HOArg. These suggested the following: 1) the post-translational conversion of Arg to HOArg may be essentially random with an overall efficiency of <100%; 2) Arg/HOArg frequently occurs in pairs; and 3) while Arg-Dopa is trypsin-labile, HOArg-Dopa is not. In an attempt to reduce tryptic peptide heterogeneity, Mefp-3F was digested with trypsin but this time in 0.15 M borate, pH 8.0. This was observed to retard cleavage of Lys-Dopa and Arg-Dopa bonds due to Dopa's complexation of borate (Waite and Rice-Ficht, 1992). Thus, if HOArg-Dopa is naturally resistant to trypsin, and Lys-Dopa and Arg-Dopa can be rendered so in borate, then only the HOArg-X, Arg-X, and Lys-X (where X represents primary amino acids other than Dopa) bonds should be cleaved in the protein. As expected, trypsin-borate digestion of Mefp-3F resulted in fewer peptides (Fig. 6B) having sequences shown in Table 2. Wherever a basic residue is followed by Dopa, the peptide bond remains relatively resistant. TB-2 and TB-4 are exceptions that appear after 12-18 h and apparently are derived from TB-3 and TB-5, respectively. TB-7 proved to be a mixture of closely related peptides KYW and GKYW. These are the clearest suggestion, in fact, that the proteins included in pooled fractions 37-39 (Fig. 2B, inset) may have subtle variations in sequence even though they appear to be electrophoretically homogeneous. Peptides T-15 and TB-5 also exhibited sequence heterogeneity beyond residue 11 (data not shown). Digestion with endoproteinase Lys-C in 0.1 M Tris-ascorbate results in only three peptides (Fig. 6C). LC-1 yields only N-terminal Dopa during Edman sequencing. Since it also contains stoichiometric amounts of tryptophan as determined by amino acid analysis following acid hydrolysis, we conclude its sequence to be YW. This sequence is also apparent in peptides produced by trypsinization of Mefp-3F in borate (TB-7) and Tris-ascorbate (T-20). LC-2 was sequenced to residue 21 and matches the N terminus of Mefp-3F. LC-3 could not be sequenced beyond the eighth residue. A partial sequence can be proposed from these data and reveals uncertainty about the residues between GWNNGWNR and KYW at the C terminus and NYNRYN and RRYGGYK in the middle of Mefp-3F: ADYYGPNYGPPRRYGGGNYNRYN-RRYGGYKGWNNGWNR-KYW.
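The cleavage rules described in this section can be illustrated with a simple in silico digest. The sketch below is an idealization: it uses the complete Mefp-3F sequence from the abstract, assumes complete cleavage, treats every basic residue followed by Dopa as fully resistant in borate, and ignores partial hydroxylation and sequence microheterogeneity. The function `digest` and its parameters are hypothetical names, not part of the original work.

```python
# Complete Mefp-3F sequence from the abstract (Y = Dopa, R = Arg/HOArg).
SEQ = "ADYYGPNYGPPRRYGGGNYNRYNRYGRRYGGYKGWNNGWNRGRRGKYW"


def digest(seq, cleave_after, blocked_next=frozenset()):
    """Cleave C-terminal to residues in `cleave_after`, except when the
    following residue is in `blocked_next`."""
    peptides = []
    start = 0
    for i in range(len(seq) - 1):
        if seq[i] in cleave_after and seq[i + 1] not in blocked_next:
            peptides.append(seq[start:i + 1])
            start = i + 1
    peptides.append(seq[start:])   # C-terminal peptide
    return peptides


# Trypsin in borate: cleave after Lys/Arg/HOArg unless Dopa follows.
trypsin_borate = digest(SEQ, cleave_after={"K", "R"}, blocked_next={"Y"})

# Endoproteinase Lys-C in Tris-ascorbate: cleave after Lys, including Lys-Dopa.
lys_c = digest(SEQ, cleave_after={"K"})

print("trypsin/borate:", trypsin_borate)
print("Lys-C:", lys_c)
```

Under these assumptions the Lys-C digest gives exactly three peptides, ending in YW, and the trypsin-borate digest includes GWNNGWNR and GKYW, in line with the peptides named in the text. The real digests are more complex because hydroxylation is partial and some resistant bonds are cleaved slowly.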


Figure 6: Resolution of Mefp-3F-derived peptides by C-18 reversed phase HPLC. Sample load, 50 µl; flow rate, 1 ml/min. Stationary buffer, aqueous 0.1% trifluoroacetic acid; mobile buffer, 0.1% trifluoroacetic acid in acetonitrile. Full scale absorbance at 280 nm is 0.2. A, trypsin in 0.1 M Tris-ascorbate, pH 7.5. B, trypsin in 0.15 M borate, pH 8.0. C, endoproteinase Lys-C in 0.1 M Tris-ascorbate, pH 7.5.



The small size of peptides produced as well as the possibility of microheterogeneity severely limits the effectiveness of a classical Edman approach here. Thus, mass spectrometry was employed to clarify the sequence of peptides near the C terminus as well as confirm the overall suggested structure.

Sequence Determination by Mass Spectrometry

Aliquots from the tryptic, tryptic/borate, and endoproteinase Lys-C digests of Mefp-3F (fraction 39 in Fig. 2B, inset) were analyzed directly (without purification) by MALDI-TOF mass spectrometry. The mass spectrum (Fig. 7) of the Lys-C digest, in contrast to the complexity generated by the tryptic digests, gave only two major ion clusters with the individual ions in each cluster separated by 16 mass units as shown in more detail in the inset. However, it is apparent from the HPLC chromatogram of the Lys-C digest (Fig. 6C) that at least 3 major components are present in the digest mixture. The peptide assigned as LC-1 (calculated M(r) = 383.1) was not observed in the mass spectrum because small peptides are not efficiently ionized by MALDI and the low mass region is obscured by peaks due to the matrix. Instead, the HPLC fraction containing LC-1 was analyzed by FAB mass spectrometry and produced an (M + H)+ ion of m/z 384.1. A CID mass spectrum of this ion indicated that it is the dipeptide YW.
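The calculated M(r) quoted for LC-1 can be reproduced from elemental composition. The sketch below assumes monoisotopic atomic masses and the residue formulas C9H9NO3 for Dopa and C11H10N2O for tryptophan; the names used are illustrative.

```python
# Monoisotopic atomic masses (Da) and the proton mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def mono_mass(formula):
    """Monoisotopic mass from an element -> count mapping."""
    return sum(MONO[element] * count for element, count in formula.items())


dopa_residue = {"C": 9, "H": 9, "N": 1, "O": 3}    # tyrosine residue + O
trp_residue = {"C": 11, "H": 10, "N": 2, "O": 1}
water = {"H": 2, "O": 1}

# Dipeptide Dopa-Trp (written YW in the text): two residues plus water.
mr_yw = mono_mass(dopa_residue) + mono_mass(trp_residue) + mono_mass(water)
mh_yw = mr_yw + PROTON

print(f"M(r) of YW: {mr_yw:.2f}; (M + H)+: {mh_yw:.2f}")
```

The monoisotopic M(r) comes out at about 383.15, consistent with the quoted 383.1, and the protonated ion at about 384.16, close to the ion observed by FAB.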


Figure 7: MALDI-TOF mass spectrum of an aliquot from an endoproteinase Lys-C digest of the major component of Mefp-3F. The inset is an expansion of the (M + H) region of LC-2 showing peaks that are separated in mass according to hydroxylation state (the individual peaks in the cluster are 16 mass units apart).
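
The 16-unit spacing within each cluster is what one oxygen atom per hydroxylated arginine would produce. The following is a minimal sketch in Python, assuming that Dopa formation is complete so that the peaks within a cluster differ only in the number of hydroxyarginines. The function name is hypothetical, the average atomic mass of oxygen (15.9994) is used as an approximation, and the starting value 4183.25 is the calculated (M + H) reported in this paper for fully hydroxylated LC-2, which contains six arginine positions.

```python
O_AVG = 15.9994  # average atomic mass of oxygen (assumed constant for this sketch)


def hydroxylation_series(mh_fully_hydroxylated, n_arg):
    """Predict (M + H) for n_arg, n_arg - 1, ..., 0 hydroxylated arginines.

    Each arginine that is not hydroxylated lowers the mass by one oxygen.
    """
    return [round(mh_fully_hydroxylated - k * O_AVG, 2) for k in range(n_arg + 1)]


# LC-2 has six arginine positions; 4183.25 is the calculated (M + H)
# of its fully hydroxylated form as given in the text.
series = hydroxylation_series(4183.25, 6)
for n_hydroxy, mz in zip(range(6, -1, -1), series):
    print(f"{n_hydroxy} HOArg: m/z {mz:.2f}")
```

These are predicted peak positions only; the relative intensities within a cluster depend on the extent of hydroxylation, which this sketch does not model.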



The sequence of the peptide LC-3 was determined from the MALDI-TOF mass spectra of separate carboxypeptidase P and aminopeptidase M digests. These results verify the partial peptide sequence deduced by Edman degradation and, more importantly, resolve the Edman ambiguity in LC-3 beyond the eighth residue. After separation of LC-3 from LC-2 by reversed phase HPLC, the M(r) of the most highly hydroxylated component of LC-3 was found to be 1605.7 (using MALDI mass spectrometry with internal calibration). Fig. 8 shows the MALDI-TOF mass spectrum of peptide LC-3 incubated with carboxypeptidase P for 30 s. The C-terminal lysine has already been released by this time, and the mass spectrum now contains four major peaks: (M + H) = 1478.6, 1421.4, 1249.3, and 1077.1. The mass differences 128.1, 57.2, 172.1, and 172.2 correspond to the consecutive loss of K, G, R, and R, respectively. Further digestion with carboxypeptidase P finally liberates glycine (mass difference of 57.2) establishing GRRGK as a partial C-terminal sequence for peptide LC-3. A similar experiment with aminopeptidase M successfully cleaved LC-3 consecutively to the seventh residue (GWNNGWN). This sequence is part of the tryptic peptide GWNNGWNR (T-25 and TB-9 in Table 2) which strongly suggests that the eighth residue of LC-3 is R. It is of interest to note that this peptide was the only one in the tryptic digest of the extremely hydrophilic Mefp-3F protein that was sufficiently hydrophobic to ionize well enough by FAB to make it possible to acquire a CID spectrum from which its sequence could be deduced (Biemann, 1990). More importantly, the M(r) of LC-3 together with the molecular weights of the partial C-terminal and N-terminal sequences provided enough information to confirm that R is indeed at position 8 resulting in the sequence GWNNGWNRGRRGK for LC-3.


Figure 8: MALDI-TOF mass spectrum acquired after 30 s of carboxypeptidase P digestion of peptide LC-3 from Mefp-3F. The mass differences indicate that RRGK are the last four C-terminal residues of this peptide (undigested LC-3 has (M + H) = m/z 1606.7).
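
The ladder reading described for LC-3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study: the function name, the 0.3 mass unit tolerance, and the restriction of the lookup table to amino acids reported in Mefp-3 are all choices made here. The residue masses are standard average values, with hydroxyarginine (written R*) and Dopa (written Y*) taken as arginine and tyrosine plus one oxygen. The peak list is the set of (M + H) values quoted above for Fig. 8.

```python
# Average residue masses (Da). "R*" = 4-hydroxyarginine, "Y*" = Dopa,
# each taken as the parent residue plus one oxygen (15.9994).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "P": 97.12, "N": 114.10, "D": 115.09,
    "K": 128.17, "R": 156.19, "Y": 163.18, "R*": 172.19, "Y*": 179.18,
    "W": 186.21,
}


def read_ladder(peaks, tolerance=0.3):
    """Assign a residue to each gap between successive (M + H) peaks.

    `peaks` runs from the intact peptide downward. The result lists the
    residues in order of release (C-terminal residue first); a gap with
    no unique match within `tolerance` is reported as None.
    """
    released = []
    for heavier, lighter in zip(peaks, peaks[1:]):
        loss = heavier - lighter
        matches = [aa for aa, mass in RESIDUE_MASS.items()
                   if abs(mass - loss) <= tolerance]
        released.append(matches[0] if len(matches) == 1 else None)
    return released


# Undigested LC-3 followed by the four major peaks after 30 s of digestion.
peaks = [1606.7, 1478.6, 1421.4, 1249.3, 1077.1]
released = read_ladder(peaks)
c_terminal = released[::-1]  # reverse to read in N-to-C direction
print(released, c_terminal)
```

Restricting the table matters: lysine and glutamine differ by less than 0.05 mass units on average, so a table containing all twenty standard amino acids would leave the first loss ambiguous at this tolerance.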



Similarly, aliquots from the digestion of peptide LC-2 with carboxypeptidase P were also analyzed by MALDI-TOF-MS. The mass spectrum of the digest solution after 30 min is shown in Fig. 9. Each peak (highest hydroxylated form) is labeled with the amino acid(s), the mass(es) of which correspond to the mass differences observed from cluster to cluster. Following the initial cleavage of K and Y are two nonconsecutive cleavages of GG and RY, respectively. These data confirmed the sequence derived (peptides T-5, T-6, TB-2, and TB-3 in Table 2) by the Edman method. The remaining five clusters of ions define the C-terminal sequences of peptides T-15 and TB-5 (Table 2). The measured (M + H) values of the most hydroxylated forms are m/z 3238.1, m/z 3181.6, m/z 3001.9, m/z 2830.3, m/z 2716.0, which correlate well with the calculated values 3238.24 (-R), 3181.19 (-G), 3002.1 (-Y), 2829.82 (-R), and 2715.72 (-N), respectively. Further digestion of LC-2 with carboxypeptidase P consecutively cleaved six more amino acids to give the following C-terminal sequence for LC-2: GNYNRYNRYGRRYGGYK.


Figure 9: MALDI-TOF mass spectrum of peptide LC-2 incubated with carboxypeptidase P for 30 min. The most highly hydroxylated peak in each peptide cluster is labeled with the amino acid(s) that corresponds to the loss in mass (i.e. molecular weight minus H(2)O) of that amino acid.



Endoproteinase Lys-C digestion of Mefp-3F based on the final amino acid sequence is expected to produce three peptides LC-1, LC-2, and LC-3 (Fig. 10). The peptide LC-3 has a calculated M(r) value of 1605.7 which agrees well with that measured for the (M + H) ion (m/z 1606.1) in Fig. 8 using external calibration. The molecular weight of LC-1 has already been mentioned, while the (M + H) ion observed for the most hydroxylated component of LC-2 (Fig. 7C) has m/z 4182.9, which correlates well with the calculated value of 4183.25.


Figure 10: Sequence of Mefp-3F showing overlap of endoproteinase Lys-C (LC) and tryptic (TB) peptides. Dopa is denoted by Y and hydroxyarginine by R. ? denotes those sequences revealed only by MALDI-TOF mass spectra. Inset shows structure of RY suggesting the H-bond that might protect the underlying peptide bond from trypsin attack.



The C-terminal sequence KYW was confirmed by MALDI mass spectra of the carboxypeptidase P digest of Mefp-3F itself. Further digestion of Mefp-3F with carboxypeptidase P resulted in nonconsecutive losses (i.e. more than one residue) corresponding to RG followed by loss of RG. These data supported the amino acid sequence deduced from the mass spectrometric analysis of LC-3 and also provided the overlap information necessary to place LC-3 before LC-1 at the C terminus.

In conclusion, all these data together lead to the amino acid sequence shown in Fig. 10 for Mefp-3F. The m/z value calculated for the (M + H) ion of the most highly hydroxylated form of Mefp-3F is 6136.31 which corresponds well to the experimentally determined value (m/z 6135.2) mentioned earlier. The sequence of other variants, although incomplete, is expected to show subtle variations from Mefp-3F. This is hardly surprising in view of the similar amino acid compositions.
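
The calculated values quoted above can be checked against the final sequence. The following Python sketch assumes standard average residue masses, treats every Y as Dopa and every R as 4-hydroxyarginine (the fully hydroxylated form, following the notation of Fig. 10), and adds one water and one hydrogen to obtain (M + H). The function and constant names are illustrative and are not taken from the paper.

```python
O = 15.9994       # average atomic mass of oxygen
WATER = 18.01528  # average mass of H2O
H = 1.00794       # average atomic mass of hydrogen

# Average residue masses (Da), paper's notation: Y = Dopa, R = hydroxyarginine.
RESIDUE = {
    "G": 57.0519, "A": 71.0788, "P": 97.1167, "N": 114.1038,
    "D": 115.0886, "K": 128.1741, "W": 186.2132,
    "Y": 163.1760 + O,  # Dopa = tyrosine + one oxygen
    "R": 156.1875 + O,  # 4-hydroxyarginine = arginine + one oxygen
}


def mh_average(sequence):
    """Average-mass (M + H) of a fully hydroxylated peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER + H


LC2 = "ADYYGPNYGPPRRYGGGNYNRYNRYGRRYGGYK"
LC3 = "GWNNGWNRGRRGK"
LC1 = "YW"
MEFP_3F = LC2 + LC3 + LC1  # order established by the overlap data

for name, seq in [("LC-2", LC2), ("LC-3", LC3), ("Mefp-3F", MEFP_3F)]:
    print(f"{name}: (M + H) = {mh_average(seq):.2f}")
```

Under these assumptions the sketch gives values that agree with the calculated figures in the text for LC-2 (4183.25), LC-3 (M(r) 1605.7), and intact Mefp-3F (6136.31). The value 383.1 quoted for LC-1 is closer to a monoisotopic mass, so the average-mass function above is not the right comparison for that dipeptide.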


DISCUSSION

Mature byssal adhesive plaques are ordinarily intractable to extraction due to extensive protein cross-linking. When freshly secreted or perturbed by cold shock, however, they contain a small number of extractable proteins. One of these, P-3, and its foot-derived precursor, Mefp-3, are unusual in containing high levels of two intriguing post-translationally modified amino acids: Dopa and 4-hydroxyarginine. Dopa-containing proteins are widely distributed throughout the animal kingdom, including organisms from the following phyla: Chordata, Mollusca, Annelida, Platyhelminthes, and Cnidaria (reviewed by Waite (1990)). By and large, the proteins serve as precursors for natural adhesives and varnishes that undergo a curing process known as quinone-tanning. The functional effect of incorporating Dopa into the primary structure of proteins is 2-fold: Dopa adsorbs tenaciously to surfaces (Olivieri et al., 1992), and, following catalytic conversion to quinones by catecholoxidase, Dopa mediates protein cross-linking. The economy of serving both a cross-linking and surface coupling function has been noted previously (Waite et al., 1992).

4-Hydroxyarginine has been previously detected only as a free amino acid in the seeds of vetch (Bell and Tirimanna, 1963), lentils (Sulser et al., 1975), and in tissues of sea anemones (Makisumi, 1961) and sea cucumbers (Fujita, 1959), but never as a part of the primary structure of proteins. Neither its function nor the reason for its incomplete conversion from Arg in Mefp-3 is known at this time. There is one unique feature conferred by hydroxyarginine that is apparent from peptide mapping studies with trypsin. While the Arg-Dopa bond is cleaved by trypsin in Tris-ascorbate, it is not when Arg is converted to HOArg. The lability of HOArg-X linkages to trypsin when X is any primary amino acid other than Dopa suggests some interaction between hydroxyarginine and Dopa, e.g. hydrogen bonding that blocks enzyme access to the peptide bond (Fig. 10). Four HOArg-Dopa pairs exist in Mefp-3F. Like Dopa, arginine and presumably its hydroxylated derivative are also an asset for the molecular interactions indispensable for adhesion: Arg can be a hydrogen donor in as many as 5 hydrogen bonds in which the acceptors are usually backbone carbonyl groups (Borders et al., 1994). It is also involved in planar parallel stacking with aromatics that does not impede the hydrogen bonding capacity of arginine (Flocco and Mowbray, 1994). For these reasons, perhaps, Arg-rich proteins bind polyphenols avidly and are rather readily insolubilized by them (Meek and Weiss, 1979). Although no known sequence matches can be found for Mefp-3F, the RG-rich character of the protein is reminiscent of some RNA binding proteins (Burd and Dreyfuss, 1994).
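
The cleavage behavior described above can be expressed as a simple in silico digestion. The Python sketch below is a deliberately idealized model: it assumes every arginine is hydroxylated, treats Lys-C as cleaving after every K and trypsin as cleaving after every K or R except where R is followed by Dopa (Y), and ignores all other enzyme specificity rules. The function and parameter names are hypothetical. Because hydroxylation in the real protein is incomplete, an actual tryptic digest is expected to be more complex than this model predicts.

```python
# Paper's notation: Y = Dopa, R = hydroxyarginine (assumed fully hydroxylated).
MEFP_3F = "ADYYGPNYGPPRRYGGGNYNRYNRYGRRYGGYKGWNNGWNRGRRGKYW"


def digest(sequence, cleave_after, blocked_pairs=()):
    """Split `sequence` on the C-terminal side of residues in `cleave_after`.

    A bond is left intact when the residue pair spanning it is listed in
    `blocked_pairs` (e.g. "RY" for the trypsin-resistant HOArg-Dopa bond).
    """
    peptides, start = [], 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if sequence[i] in cleave_after and pair not in blocked_pairs:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


lys_c = digest(MEFP_3F, "K")
tryptic = digest(MEFP_3F, "KR", blocked_pairs=("RY",))

print("Lys-C:  ", lys_c)
print("Trypsin:", tryptic)
print("HOArg-Dopa pairs:", MEFP_3F.count("RY"))
```

In this model Lys-C yields the three peptides LC-2, LC-3, and LC-1, and the tryptic products include GWNNGWNR and the C-terminal dipeptide YW, both of which are reported in the text.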

Unlike the other two byssal precursors Mefp-1 and -2, Mefp-3 does not consist of long stretches of tandemly repeated peptides. There are, however, some suggestions of repetition: GWNNGWNR (TB-9) and RYGG (TB-3 and TB-5). Like Mefp-2, Mefp-3 occurs only in the byssal adhesive plaques. Future studies should address what specific role it plays there. Byssal adhesive plaques contain at least 4 different morphological domains when examined by electron microscopy: the 5-µm-thick lacquer on the plaque surface facing the sea water, the microcellular foam of the plaque interior, the primer mediating the interface between the foam and foreign surface, and the fibers from the thread embedded in the plaque (Benedict and Waite, 1986; Tamarin et al., 1976). So far, there is only enough evidence to correlate 2 of these with proteins, i.e. Mefp-1 with the lacquer and a short chain collagen with the fibers (Qin and Waite, 1995). P-3 or Mefp-3 may be associated with one or more of the other functional roles.


FOOTNOTES

*
This work was supported by National Institutes of Health Grants DE10042 (to J. H. W.) and GM05472 and RR00317 (to K. B.), and Office of Naval Research Grant N00014-89-J-3121 (to J. H. W.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
To whom correspondence and reprint requests should be addressed. Tel: 302-831-4441; Fax: 302-831-6335.

(1)
The abbreviations used are: L-Dopa, L-3,4-dihydroxyphenylalanine; HOArg, 4-hydroxyarginine; Mefp, M. edulis foot protein; HPLC, high performance liquid chromatography; MALDI, matrix-assisted laser desorption ionization; TOF, time-of-flight; FAB, fast atom bombardment; CID, collision-induced dissociation; PAGE, polyacrylamide gel electrophoresis.


REFERENCES

  1. Barber, M., Bordoli, R. D., Sedgwick, R. D., and Tyler, A. N. (1981) J. Chem. Soc. Chem. Commun. 1981, 325
  2. Beavis, R. C., and Chait, B. T. (1990) Anal. Chem. 62, 1836-1840
  3. Bell, E. A., and Tirimanna, A. S. L. (1963) Nature 197, 901-902
  4. Bell, E. A., and Tirimanna, A. S. L. (1964) Biochem. J. 91, 356-358
  5. Benedict, C. V., and Waite, J. H. (1986) J. Morphol. 189, 261-270
  6. Biemann, K. (1990) Methods Enzymol. 193, 455-479
  7. Borders, C. L., Broadwater, J. A., Bekeny, P. A., Salmon, J. E., Lee, A. S., Eldridge, A. M., and Pett, V. B. (1994) Protein Sci. 3, 541-548
  8. Burd, C. G., and Dreyfuss, G. (1994) Science 265, 615-621
  9. Diamond, T. V. (1993) Dopa Proteins from the Adhesive Plaques of Mytilus edulis. Master's thesis, University of Delaware, Newark, DE
  10. Filpula, D. R., Lee, S. M., Link, R. P., Strausberg, S. L., and Strausberg, R. L. (1990) Biotechnol. Prog. 6, 171-177
  11. Flocco, M. M., and Mowbray, S. L. (1994) J. Mol. Biol. 235, 709-717
  12. Fujita, Y. (1959) Bull. Chem. Soc. Jpn. 32, 439-442
  13. Hansen, D. C., Luther, G. W., and Waite, J. H. (1994) J. Colloid Interf. Sci. 168, 206-216
  14. Hollecker, M. (1990) in Protein Structure (Creighton, T. E., ed) pp. 145-153, IRL Press, Oxford
  15. Inoue, K., Takeuchi, Y., Miki, D., and Odo, S. (1995) J. Biol. Chem. 270, 6698-6701
  16. Karas, M., and Hillenkamp, F. (1988) Anal. Chem. 60, 2299-2301
  17. Laursen, R. A. (1992) in Results and Problems in Cell Differentiation 19 Biopolymers (Case, S. T., ed) pp. 55-74, Springer, Berlin
  18. Litwack, G. (1960) Experimental Biochemistry: A Laboratory Manual, Wiley, New York
  19. Makisumi, S. (1961) J. Biochem. (Tokyo) 49, 284-291
  20. Meek, K. M., and Weiss, J. B. (1979) Biochim. Biophys. Acta 587, 112-120
  21. Muramoto, K., and Kamiya, H. (1990) Anal. Biochem. 189, 223-230
  22. Notter, M. F. D. (1988) Exp. Cell. Res. 177, 237-246
  23. Olivieri, M. P., Baier, R. E., and Loomis, R. E. (1992) Biomaterials 13, 1000-1008
  24. Panyim, S., and Chalkley, G. R. (1969) Arch. Biochem. Biophys. 130, 337-346
  25. Paz, M., Flückinger, R., Boak, A., Kagan, H. M., and Gallop, P. M. (1991) J. Biol. Chem. 266, 689-692
  26. Qin, X.-X., and Waite, J. H. (1995) J. Exp. Biol. 198, 633-644
  27. Rzepecki, L. M., Hansen, K. M., and Waite, J. H. (1992) Biol. Bull. 183, 123-137
  28. Sato, K., Asada, T., Ishihara, M., Kunihiro, F., Kammei, Y., Kubota, E., Costello, C. E., Martin, S. A., Scoble, H. A., and Biemann, K. (1987) Anal. Chem. 59, 1652-1659
  29. Simpson, R. J., Neuberger, M. R., and Liu, T.-Y. (1976) J. Biol. Chem. 251, 1936-1940
  30. Sulser, H., Beyeler, M., and Sager, F. (1975) Lebensm.-Wiss. Technol. 8, 161-162
  31. Tamarin, A., Lewis, P., and Askey, J. (1976) J. Morphol. 149, 199-222
  32. Taylor, S. W., Waite, J. H., Ross, M. M., Shabanowitz, J., and Hunt, D. F. (1994) J. Am. Chem. Soc. 116, 10803-10804
  33. Tous, G. I., Fausnaugh, J. L., Akinyosoye, O., Lackland, H., Winter-Cash, P., Vitorica, F. J., and Stein, S. (1989) Anal. Biochem. 179, 50-55
  34. Tsugita, A., Uchida, T., Mewes, H. W., and Ataka, T. (1987) J. Biochem. (Tokyo) 102, 1593-1597
  35. Waite, J. H. (1983) J. Biol. Chem. 258, 2911-2915
  36. Waite, J. H. (1990) Comp. Biochem. Physiol. 97B, 19-29
  37. Waite, J. H. (1991) Anal. Biochem. 192, 429-433
  38. Waite, J. H. (1992) in Results and Problems in Cell Differentiation 19 Biopolymers (Case, S. T., ed) pp. 27-54, Springer, Berlin
  39. Waite, J. H., and Rice-Ficht, A. C. (1992) Molec. Biochem. Parasitol. 54, 143-152
  40. Waite, J. H., and Tanzer, M. L. (1981) Science 212, 1038-1040
  41. Waite, J. H., Housley, T. J., and Tanzer, M. L. (1985) Biochemistry 24, 5010-5014
  42. Waite, J. H., Jensen, R. A., and Morse, D. E. (1992) Biochemistry 31, 5733-5738
