ABSTRACT
A mussel (Mytilus galloprovincialis) cDNA encoding
Mgfp2, a major component of the adhesive plaque that anchors mussels
tightly to underwater surfaces, was isolated. It encoded a protein
consisting mainly of epidermal growth factor-like repeats and containing
tyrosine residues that will be converted to 3,4-dihydroxyphenylalanine
near the C and N termini. Amino acid residues important for cell-cell
interaction in other epidermal growth factor-like proteins were,
however, not conserved in the structure of Mgfp2. RNA blot analysis on
adult tissues showed foot-specific expression of this gene, while the
analysis on developing larvae showed that the expression starts with
formation of the foot. These results suggest that the function of Mgfp2
has been specialized to form the adhesive plaque.
Mussels inhabit the turbulent and inhospitable niches in the
intertidal zone. They adhere tightly to underwater surfaces using the
adhesive holdfast, the byssus, which is a bundle of threads that
terminate in an adhesive plaque (1). The role of the adhesive
plaque is to anchor byssal threads tightly to wet and irregular
surfaces of substrates with enough strength to withstand turbulent
waves. The major components of the plaque are two types of protein, both of
which incorporate varying proportions of 3,4-dihydroxyphenylalanine (DOPA) into their
primary structures (1). One component, foot protein 1, is an
adhesive protein containing more than 60 tandem repeats of a
"decapeptide" motif (AKPSYPPTYK) (2, 3) and is insolubilized through a series of processes
promoted by catechol oxidase present in the foot, which catalyzes the
conversion of peptidyl DOPA residues into peptidyl DOPA-quinones and
leads to quinone tanning (1). Tyrosine residues are converted to
peptidyl DOPA residues in an earlier intracellular process involving a
putative protein tyrosyl hydroxylase. The other major component, foot
protein 2, is a cystine-rich structural element of the plaque matrix,
but little is known about it because of its insoluble and
proteolysis-resistant nature. Recently, the foot protein 2 was purified
from the foot, the organ that synthesizes the byssus, and sequences of
peptide fragments obtained by trypsinization following reductive
alkylation were determined (4). Three types of peptide
fragments were found to be present, but neither the full sequence nor
distribution of each fragment in the whole protein was determined. In
this study, we isolated a cDNA clone encoding foot protein 2 from
the mussel Mytilus galloprovincialis to derive its complete primary
structure. The clone was found to encode a polypeptide containing epidermal
growth factor (EGF)-like repeats. The tissue specificity and
developmental stage specificity of the expression were also examined.
MATERIALS AND METHODS
Mussels and Their Larvae
Adult mussels (M.
galloprovincialis) about 4 cm in shell length were sampled at
Heita Bay, Iwate Prefecture, Japan. Mature mussels were purchased from
an aquaculture company in Mie Prefecture, Japan. They were placed in
seawater at room temperature until spawning occurred. Fertilized eggs
were collected and maintained at 9-15 °C in several 100-liter
polycarbonate tanks.
RNA Isolation
Total RNA was isolated
from adult tissues and larvae using the total RNA separator kit
(Clontech Laboratories). Poly(A)+ RNA was isolated from
the foot total RNA using the mRNA separator (Clontech Laboratories).
cDNA was synthesized using the cDNA synthesis kit plus (Amersham
Corp.).
PCR Amplification
The oligonucleotide primers
GA(T/C)GA(T/C)GA(A/G)GA(T/C)GA(T/C)TA(T/C)AC, encoding a part of the
N-terminal peptide (DDEDDYT), and
(T/C)TC(A/G)TC(A/G)TC(A/G)TC(A/G)TT(A/G)TA, corresponding to the antisense
strand of the sequence encoding a part of the putative C-terminal
peptide (YNDDDE) previously reported (4), were prepared, and a
cDNA pool obtained from the foot was screened by polymerase chain
reaction. The reaction mixture, containing 6 ng each of these primers, 1×
Tth reaction buffer (Toyobo), 200 µM dNTP, 1
µg cDNA, and 1 unit of Tth DNA polymerase (Toyobo), was
amplified for 30 cycles using the DNA thermal cycler 480
(Perkin-Elmer). Each cycle consisted of 30 s at 95 °C, 30 s at 42
°C, and 4 min at 70 °C.
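The relationship between the two degenerate primers and the peptides they target can be checked with a short script. The following is a minimal illustrative sketch in Python, not part of the original procedure: it expands each primer written in the parenthesized notation used above, translates every variant with the standard genetic code, and reports the peptides encoded. The helper names (`expand`, `translate`, `reverse_complement`) are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand(primer):
    """Return every concrete sequence encoded by a primer written like GA(T/C)."""
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", primer)
    choices = [alt.split("/") if alt else [base] for alt, base in tokens]
    return ["".join(p) for p in product(*choices)]


def translate(dna):
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


sense = "GA(T/C)GA(T/C)GA(A/G)GA(T/C)GA(T/C)TA(T/C)AC"
antisense = "(T/C)TC(A/G)TC(A/G)TC(A/G)TC(A/G)TT(A/G)TA"

# Peptides encoded by all variants of each primer (antisense read on the coding strand).
sense_peptides = {translate(s) for s in expand(sense)}
antisense_peptides = {translate(reverse_complement(s)) for s in expand(antisense)}

print(len(expand(sense)), sense_peptides)          # 64 {'DDEDDY'}
print(len(expand(antisense)), antisense_peptides)  # 64 {'YNDDDE'}
```

Each primer is a pool of 64 sequences. The sense primer ends with the first two bases (AC) of the threonine codon, so only DDEDDY appears in the translation of complete codons.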
Plaque Hybridization
The foot cDNA library of M. galloprovincialis (3) was screened with the
polymerase chain reaction fragment labeled with
[α-32P]dCTP. The cDNA obtained was subcloned into the
plasmid vector pBluescript II SK (Stratagene).
Sequencing
Nucleotide sequences were determined
using PRISM DyeDeoxy sequencing kit and 373A DNA sequencer (Applied
Biosystems).
Computer Analysis
Sequence analysis was carried
out with Genetyx Mac version 24.0.0 (Software Development Co. Tokyo).
RNA Blot Hybridization
Ten µg of total RNA
obtained from tissues of adult mussels or from larvae was
electrophoresed on a 1% agarose gel, transferred onto a nylon membrane,
and hybridized with the whole cDNA labeled with [α-32P]dCTP.
RESULTS AND DISCUSSION
Cloning of the Foot Protein 2 cDNA
Screening of the foot cDNA pool by polymerase chain reaction yielded an
amplified fragment of about 1.2 kilobases that contained all three types of
peptides previously reported (4). The foot cDNA
library of M. galloprovincialis (3) was then screened using
this fragment as a probe, and a clone carrying an insert of approximately
1.5 kilobases was isolated.
Structure of the Foot Protein 2
The sequence
analysis indicated that the clone contains an open reading frame of
1422 nucleotides ending with a stop codon (Fig. 1). A stop codon
preceding the first ATG codon was also found by a 5' rapid
amplification of cDNA ends experiment (data not shown). The open
reading frame encoded a polypeptide of 473 amino acids, which is
hereafter referred to as Mgfp2 (M. galloprovincialis foot
protein 2). The first 17 residues probably represent the signal peptide
because of the hydrophobicity of this region and because of the
N-terminal sequence of the mature protein reported
previously (4). The expected cleavage position is consistent
with the rule of von Heijne (5). No other distinct hydrophobic
region likely to be a transmembrane domain was observed in
the sequence. The mature sequence was divided into three distinct
regions: the short N-terminal region, the long central repetitive
region (each repeat 37-40 amino acids in length), and the short C-terminal
region (Fig. 2). The N- and C-terminal regions consisted mainly
of polar residues and several proline and tyrosine residues. These
regions contained characteristic acidic sequences, DDEEDD and DDDE,
respectively. Interestingly, the sequence YKPPVYKP is identical to a repeat
motif of a proline-rich cell wall protein of soybean (6), in addition to
being homologous to the repeat consensus of Mgfp1,
AKPSYPPTYK(P) (1, 2, 3). The repetitive
region, which occupies more than 90% of the mature peptide, consists of
11 repeats of the EGF-like motif, in which 6 cysteine residues are
arranged at a characteristic spacing (Fig. 2). Although the
length of the repeats was not constant, the positions of cysteine and
glycine residues in each repeat were highly conserved. Tyrosine
residues were also well conserved in each repeat, although some of them
were replaced with functionally similar phenylalanine residues, as
observed in other EGF-like peptides (7). The positions of proline,
lysine, and asparagine residues were also common to most of the 11 repeats (Fig. 2). The molecular mass of the mature Mgfp2 calculated from
the predicted amino acid sequence was 49.8 kDa, slightly larger than
that roughly estimated from peptide fragments in the previous study of Mytilus edulis (42-47 kDa) (4).
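The lengths quoted in this section can be cross-checked with simple arithmetic. The sketch below is illustrative only and assumes that the 1422-nucleotide open reading frame includes the stop codon, which is how the 473-residue figure is reproduced. All variable names are hypothetical.

```python
ORF_NT = 1422            # ORF length from the text, assumed to include the stop codon
SIGNAL_PEPTIDE_AA = 17   # predicted signal peptide length from the text
MATURE_MASS_DA = 49_800  # calculated mass of mature Mgfp2 from the text (49.8 kDa)

# An ORF must contain a whole number of codons.
assert ORF_NT % 3 == 0

codons = ORF_NT // 3                           # 474 codons in total
precursor_aa = codons - 1                      # the stop codon encodes no residue
mature_aa = precursor_aa - SIGNAL_PEPTIDE_AA   # length after signal peptide removal
mean_residue_mass = MATURE_MASS_DA / mature_aa

print(precursor_aa, mature_aa, round(mean_residue_mass, 1))  # 473 456 109.2
```

The resulting mean of about 109 Da per residue is close to the common rule of thumb of roughly 110 Da per amino acid residue, so the stated length and calculated mass are mutually consistent.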
Figure 1:
Nucleotide sequence and predicted amino
acid sequence of Mgfp2 cDNA. Nucleotides are numbered above the sequence. The termination codon is indicated by an asterisk. A possible N-glycosylation site is underlined. Underlined italic letters indicate the possible hydroxylation sites of
asparagine/aspartic acid.
Figure 2:
Amino acid sequence of Mgfp2 deduced from
cDNA sequence. The signal peptide, N-terminal region, repetitive region,
and C-terminal region are shown separately. In the repetitive region,
the sequences of all the repeats are aligned according to the positions of
conserved cysteine residues. Consensus amino acids (>50%
conservation) of the sequence repeats are boxed. Tyrosine
residues that are expected to be converted to DOPA co- or
post-translationally are underlined. The position within the
total Mgfp2 sequence of the first amino acid of each line is
indicated.
Homology of Mgfp2 to Other EGF-like Proteins
A
considerable number of proteins were found to be significantly
homologous to the EGF-like motif of Mgfp2 by computer analysis. The
proteins showing the highest homologies, shown in Fig. 3, were
other members of the EGF family, including differentiation
factors (8, 9, 10, 11, 12, 13, 14),
coagulation factors (15, 16), and extracellular matrix
components (17, 18). The positions of cysteine, glycine,
and tyrosine/phenylalanine residues were shared with most of the other
members, but overall similarities were moderate. EGF domains are thought
to have important roles in cell-cell interaction, but their functional
mechanisms are not yet completely understood. Several characteristic
elements in EGF-like structure have recently been identified in some
EGF-like proteins. N-glycosylation sites and the hydroxylation
sites of asparagine/aspartic acid have often been found in the EGF
family of proteins. Possible modification sites were also present in
Mgfp2: position 93 for N-glycosylation and positions 283 and
442 for aspartic acid and asparagine hydroxylation (Fig. 1). Some residues responsible for calcium binding have
been determined in coagulation factors, e.g. aspartic acid or
asparagine residues at positions 1 and 3 (numbering starts at the left in Fig. 3) (19). Amino acid residues
important for receptor binding and/or biological activities have also
been identified in some members of the EGF family, e.g. the
arginine residue at position 41, the tyrosine residue at position 37, and
several residues outside the region shown in Fig. 3 (20). These amino acid residues are, however, not
conserved in the EGF-like motif of Mgfp2. In addition, Mgfp2 contains
no possible transmembrane domain. Thus, no amino acid residues
suggestive of a role in cell-cell interaction were found in the EGF-like motif of
Mgfp2.
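The idea of recognizing an EGF-like motif by its six cysteines at a characteristic spacing can be sketched as a simple pattern scan. The example below is a hedged illustration only: the spacing bounds in `EGF_LIKE` are assumptions chosen for demonstration, not values taken from Fig. 2 or Fig. 3, and the `toy` peptide is an invented sequence, not a real Mgfp2 repeat.

```python
import re

# Loose six-cysteine pattern; the spacing bounds are illustrative assumptions.
EGF_LIKE = re.compile(r"C[^C]{3,8}C[^C]{3,8}C[^C]{4,12}C[^C]C[^C]{6,10}C")


def cysteine_gaps(peptide):
    """Return 1-based cysteine positions and the residue counts between them."""
    positions = [i + 1 for i, aa in enumerate(peptide) if aa == "C"]
    gaps = [b - a - 1 for a, b in zip(positions, positions[1:])]
    return positions, gaps


# Invented toy peptide with six cysteines (hypothetical, for demonstration only).
toy = "NPCLPNPCKNGGKCVYNGKTYKCKCPKGYYGKNCEK"
# Same peptide with the second cysteine replaced by alanine.
mutant = "NPCLPNPAKNGGKCVYNGKTYKCKCPKGYYGKNCEK"

positions, gaps = cysteine_gaps(toy)
match = EGF_LIKE.search(toy)

print(positions)                            # [3, 8, 14, 23, 25, 34]
print(gaps)                                 # [4, 5, 8, 1, 8]
print(match is not None)                    # True
print(EGF_LIKE.search(mutant) is not None)  # False
```

A scan of this kind only flags candidate regions by cysteine spacing. It says nothing about the other conserved positions (glycine, tyrosine/phenylalanine) or about function, which is why an alignment against known family members is still needed.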
Figure 3:
Alignment of Mgfp2 with other EGF-like
proteins. The consensus sequence of Mgfp2 is shown in this figure. The
letter X indicates a non-consensus amino acid. Other sequences
were obtained from the Swiss-Prot database. Hyphens indicate gaps
introduced to maximize homology. Residues homologous with the Mgfp2
consensus sequence are boxed. Numbers at the beginning of
every sequence indicate the position of the first amino acids of the
sequences within the total protein
sequences.
Positions of DOPA Residues
The unique structural
feature of Mgfp2, which is not observed in any other EGF-like proteins,
is that it has DOPA residues in its primary structure. It is possible
to identify positions of tyrosine residues that will be converted co-
or post-translationally to DOPA by aligning the previously reported
DOPA-containing peptides (4) with the whole sequence. It is
interesting that the putative DOPA residues are clustered at both ends of
the sequence (Fig. 2). This finding suggests that both ends are
located at the surface of the molecule in the tertiary structure, where
DOPA residues would be accessible to catechol oxidase, the converting
enzyme (2). The DOPA residues at the surface of the molecule
may play a role in interactions with other Mgfp2 molecules or with other
protein species, including foot protein 1. Thus, it is presumed that
the basic conformation of Mgfp2, which is relatively resistant to
proteolytic degradation, is formed mainly by the EGF-like structure, while the
DOPA-containing terminal regions promote interaction among
molecules and insolubilization through cross-linking of DOPA
residues.
Expression of Mgfp2 Gene in Developing Larvae
It
is important to know where the gene is expressed in order to infer its
function. To examine whether the Mgfp2 gene is expressed in tissues other than the foot,
RNA blot analysis of major tissues was performed. The Mgfp2 gene was
found to be transcribed only in the foot and not in the other
organs examined (Fig. 4a). RNA blot analysis of
developing larvae was also performed to examine the stage specificity
of Mgfp2 expression (Fig. 4b), because some
EGF-like genes known to promote cell differentiation are expressed
at specific stages of development (21). Mgfp2 mRNA
was undetectable during the free-swimming stages but became detectable at
the pediveliger stage, when the foot is formed.
Figure 4:
RNA blot analysis on adult tissues (a) and developing larvae (b) of M.
galloprovincialis. 10 µg of total RNA extracted from the foot,
mantle, adductor muscle, and gill of adult mussels (a) or that
from fertilized eggs within 3 h after fertilization, 3-day-old
trochophore, 4-day-old veliger, 10-day-old veliger, 32-day-old
pediveliger, 42-day-old pediveliger, and the foot of the young adult
that had attached to the wall of tanks with the byssus (b) was
electrophoresed in a formaldehyde gel, transferred onto a nylon
membrane and hybridized with whole Mgfp2 cDNA probe. Arrowheads indicate the positions of 28 and 18 S
rRNA.
Mgfp2 Is a Novel Class of EGF Family
The
structural features of Mgfp2 indicate that it is a secreted protein
containing DOPA residues but lacking amino acid residues implicated in
cell-cell interaction. RNA blot analysis of adult tissues
indicated that the expression of the Mgfp2 gene is foot-specific (Fig. 4a). Similar analysis on developing larvae
indicated that Mgfp2 gene expression starts with the foot formation (Fig. 4b). These structural and expression data suggest
that Mgfp2 functions only as the matrix protein of the adhesive plaque
and not as a growth or differentiation factor. Mgfp2 and other
EGF family genes are presumably derived from the same ancestral
gene, and only Mgfp2 has acquired the function of forming an insoluble
extraorganismic material by incorporating DOPA-containing sequences
into its structure. Thus, it seems appropriate to classify Mgfp2 as a
novel class within the EGF-like protein family.