(Received for publication, November 18, 1996, and in revised form, January 24, 1997)
From The Wellcome Trust Centre for Cell-Matrix Research, University of Manchester, 2.205, Stopford Building, Manchester M13 9PT, United Kingdom
It has been demonstrated previously that respiratory secretions contain three oligomeric, gel-forming mucins; one of these was identified as the product of the MUC5AC gene (1). Here we demonstrate that the other two mucins are glycoforms of the MUC5B gene product. This was accomplished by trypsin treatment of the purified reduced mucin subunit populations and N-terminal sequencing of the liberated peptides. The products of trypsin digestion were separated by gel filtration into high molecular weight mucin glycopeptides and low molecular weight tryptic peptides. The latter were fractionated by reverse phase chromatography, and four of the major peptides were sequenced. Three of these peptides were identical to and contiguous within a 51-amino acid sequence deduced from a cDNA clone (JER57) encoding a portion of the MUC5B mucin. The other peptide is also present within this sequence but showed identity in only 9 of its 10 residues. A polyclonal antiserum raised against one of these peptides was reactive with the two putative MUC5B glycoforms. Analysis of the high molecular weight glycopeptides indicated that the MUC5B subunit contained different types and lengths of glycosylated domains: one domain of Mr 7.3 × 10⁵, two domains of Mr 5.2 × 10⁵, and a third domain of Mr 2.0 × 10⁵. The amino acid composition of the larger two glycopeptides was similar in serine, threonine, and proline content but distinct from that of the smallest glycopeptide. Each of these domains in the mucin subunit is separated by a trypsin-sensitive region, and the relative abundance of the major peptides derived by proteolysis of these regions and their occurrence in a contiguous sequence suggest that they contain a common cysteine-rich motif.
Respiratory tract mucus is the principal barrier in the lung against chemical and pathological insult. The physical properties of this gel-like secretion are due largely to high molecular weight O-linked glycoproteins termed mucins. Respiratory mucins are polydisperse in mass (Mr 2-40 × 10⁶) and length (0.5-10 µm) and can be fragmented into their constituent subunits (Mr 2-3 × 10⁶) by reduction (1-6). Proteinase treatment of reduced subunits yields high molecular weight glycopeptides (Mr 300,000-500,000), and these fragments contain the majority of the O-linked glycans. The core protein of the reduced mucin subunit is thus composed of alternating oligosaccharide-rich proteinase-resistant domains and proteinase-sensitive "naked" domains.
Northern blot and in situ hybridization analyses have shown that at least eight mucin genes (MUC1, MUC2, MUC3, MUC4, MUC5AC, MUC5B, MUC7, and MUC8) are expressed in the respiratory tract (7-14). However, biochemical analysis of respiratory secretions demonstrates that only three major mucin populations comprise the bulk of the gel-forming species (1). These mucins, which differ in charge density and electrophoretic mobility, are all polymeric species that can be fragmented into subunits by reduction (1, 15), and using mucin-specific antisera, one of these has been identified as the MUC5AC mucin (1, 16). Immunohistochemistry demonstrated that this mucin was a product of the goblet cells (17), and this is in agreement with in situ hybridization data (11). Another of the major species, the least charged and least electrophoretically mobile, was shown to be a product of the submucosal glands (16). In situ hybridization has demonstrated that this is the site of synthesis of the MUC5B mucin in the respiratory tract (11), and thus it is likely that this mucin is the product of the MUC5B gene.
In this study we have generated tryptic peptides from the proteinase-sensitive naked regions of the core proteins of the two previously unidentified mucin populations (1) (termed here mucins X and Y) in an attempt to determine whether they are the products of novel or identified MUC genes. In addition we have investigated the structural organization of their subunits.
Trypsin, modified by reductive alkylation to
reduce autolysis, was purchased from Promega (Southampton, United
Kingdom). α-Cyano-4-hydroxycinnamic acid, substance P, and insulin
(bovine pancreas) were from Sigma. Sequencing-grade
trifluoroacetic acid was from Applied Biosystems (Warrington, United
Kingdom), and acetonitrile (high performance liquid chromatography
grade) was purchased from Rathburn Chemicals (Walkerburn, United
Kingdom).
Reduced subunits were prepared from mucins extracted from the mucus gel plug obtained postmortem from the lungs of an individual who died in status asthmaticus as described previously (1, 18). The reduced mucin subunit populations were purified by anion exchange chromatography on Mono Q as described previously (1) and then dialyzed against water and lyophilized.
Agarose Gel Electrophoresis
Reduced mucin subunits were electrophoresed in 1.0% (w/v) agarose gels in 40 mM Tris acetate, 1 mM EDTA, pH 8.0, containing 0.1% (w/v) SDS and then transferred to nitrocellulose by vacuum-blotting before detection using antibodies (1, 19).
Preparation of High Molecular Weight Glycopeptides and Tryptic Peptides
Reduced mucin subunits (3.5 mg) were dissolved in 450 µl of 0.1 M ammonium hydrogen carbonate, pH 8.0, and 1 µg of trypsin (50 µl) was added; after 24 h at 37 °C the digest was chromatographed on a Superose 12 HR 10/30 column in 0.1 M ammonium hydrogen carbonate, pH 8.0, at a flow rate of 0.4 ml/min. Fractions from the column were taken into five pools (TR-I to TR-V; see Fig. 2) that were further analyzed.
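The cleavage step above can be pictured with a small in-silico digest. The sketch below is illustrative only and assumes the commonly used specificity rule for trypsin (cleavage C-terminal to lysine or arginine, except when the next residue is proline). The function name `tryptic_digest` and the flanking residues in the test sequence are hypothetical; only the embedded peptide ELGQVVECSLDFGLVCR (TR-IV-C) comes from this paper.

```python
import re


def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence after K or R, except when followed by P.

    This is the textbook trypsin rule; real digests also show missed
    cleavages and are blocked by dense glycosylation, which is not modeled.
    """
    # Zero-width split point: preceded by K or R and not followed by P.
    fragments = re.split(r"(?<=[KR])(?!P)", sequence)
    # Drop the empty string produced when the sequence ends in K or R.
    return [fragment for fragment in fragments if fragment]


# Hypothetical stretch: invented flanks around the TR-IV-C sequence.
example = "AAK" + "ELGQVVECSLDFGLVCR" + "GGRPLK"
for peptide in tryptic_digest(example):
    print(peptide)
```

Under this rule the TR-IV-C sequence is released intact because it is preceded by a lysine and ends in arginine, whereas the arginine followed by proline in the invented C-terminal flank is left uncleaved.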
Reverse Phase Chromatography of Tryptic Peptides
Tryptic peptides were chromatographed at a flow rate of 240 µl/min on a µRPC C2/C18 PC 3.2/3 column eluted with 0.1% (v/v) trifluoroacetic acid (5 min) followed by a linear gradient of 0-30% (v/v) acetonitrile in 0.1% (v/v) trifluoroacetic acid (30 min) using the Pharmacia SMART system. Major peaks were analyzed by matrix-assisted laser desorption time of flight mass spectrometry (MALDI-TOF MS), and peptides were purified to homogeneity by re-chromatography on the column using shallower gradients centered around their elution point.
MALDI-TOF MS
Samples (1 µl) in 0.1% trifluoroacetic acid containing various proportions of acetonitrile were mixed with an equal volume of 50 mM α-cyano-4-hydroxycinnamic acid, applied to a TOFSpec target, and analyzed by MALDI-TOF MS in positive ion mode using a VG TOFSpec-E with substance P (mass, 1348.7 Da) and bovine insulin (mass, 5734.5 Da) as internal standards. The data generated were processed using the OPUS™ peak detection program.
N-terminal amino acid sequencing was performed on isolated peptides using an Applied Biosystems 476A protein microsequencer.
Polyclonal Antisera
As a general nonspecific mucin probe, a polyclonal antiserum raised against cervical mucin reduced subunits that recognizes protein epitopes on a range of reduced mucins was used (20). A polyclonal antiserum (MAN-5BI) was raised against a synthetic peptide, corresponding to the sequence of peptide TR-IV-C (ELGQVVECSLDFGLVCR), conjugated with keyhole limpet hemocyanin. In an enzyme-linked immunosorbent assay with the free peptide on the solid phase at 1 µg/well, the antiserum (incubated for 1 h at room temperature) had a titer of 1:1500. The titer is defined as the antiserum dilution giving an A405 of 0.5 (the midpoint of the sigmoidal curve) using a horseradish peroxidase-labeled secondary antibody (1 h at room temperature) with o-phenylenediamine (10 min) as the substrate. The antisera were used at a dilution of 1:1000 for Western blots.
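The titer definition above (the dilution giving an absorbance of 0.5) amounts to reading a crossing point off a dilution series. The following is a minimal sketch of one way to do that, assuming linear interpolation in log10(dilution) between the two measurements that bracket the threshold. The function name and the dilution series are hypothetical illustration values, not data from this study, and the paper does not state how the value of 1:1500 was read from the curve.

```python
import math


def titer_at_threshold(dilutions, absorbances, threshold=0.5):
    """Return the reciprocal dilution at which absorbance crosses threshold.

    dilutions are reciprocal dilutions (e.g. 1000 for 1:1000). Interpolation
    is linear in log10(dilution) between the two bracketing points.
    """
    points = sorted(zip(dilutions, absorbances))
    for (d1, a1), (d2, a2) in zip(points, points[1:]):
        brackets = (a1 - threshold) * (a2 - threshold) <= 0
        if brackets and a1 != a2:
            fraction = (a1 - threshold) / (a1 - a2)
            log_d = math.log10(d1) + fraction * (math.log10(d2) - math.log10(d1))
            return 10 ** log_d
    raise ValueError("threshold is not bracketed by the dilution series")


# Hypothetical series, for illustration only.
series_dilutions = [100, 1000, 10000]
series_absorbances = [0.9, 0.6, 0.2]
print(round(titer_at_threshold(series_dilutions, series_absorbances)))
```

A four-parameter logistic fit would follow the sigmoidal curve more faithfully; the interpolation here is only meant to make the definition concrete.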
Fractionation and Molecular Weight Determination of Mucin Glycopeptides (TR-I)
High molecular weight glycopeptides (pool TR-I) were chromatographed on a Superose 6 HR 10/30 column eluted with 4 M guanidinium chloride at a flow rate of 200 µl/min. The column effluent was passed through an in-line Dawn DSP laser photometer coupled to a Wyatt/Optilab 903 interferometric refractometer to measure light scattering and sample concentration, respectively (1), and the data were analyzed according to Zimm (21). Fractions across the glycopeptide distribution were taken into three pools (GP-I to GP-III; Fig. 9) that were re-chromatographed on the column and then desalted on a Hi-Trap column and lyophilized before determination of their amino acid compositions.
Amino Acid Analysis
Samples were hydrolyzed under nitrogen in 3 M HCl at 105 °C for 16 h, and the resulting amino acids were derivatized with phenylisothiocyanate and then separated by reverse phase chromatography using a 3-µm ODS2 column (1).
We and others have demonstrated that respiratory secretions from
normal individuals and from individuals with a variety of hypersecretory conditions contain two or three
major mucin species (1, 15-17). The reduced mucin subunits
corresponding to each of these populations were purified from mucus gel
obtained from an asthmatic individual by anion exchange chromatography
as described previously (1). The mucin subunit populations had
different electrophoretic mobilities on agarose gel
electrophoresis (Fig. 1), and one of these was
previously identified as the MUC5AC mucin (1). The object of this
investigation was to identify the other two mucin populations (termed
here mucin X and mucin Y).
Mucin Population X
To determine the identity of mucin
population X, its reduced subunits were fragmented with trypsin, the
resulting tryptic peptides were fractionated, and the major peptides were
sequenced by automated Edman degradation. Trypsin treatment yielded
five main peaks (TR-I to TR-V) after gel filtration chromatography on
Superose 12 (Fig. 2). Fraction TR-I contained high
molecular weight mucin glycopeptides (Mr
300,000-700,000), and fractions TR-II and TR-III contained lower
molecular weight glycosylated peptides (Mr
10,000-50,000). MALDI-TOF MS revealed that the majority of the low
molecular weight tryptic peptides (Mr
1,000-10,000) were present in fractions TR-IV and TR-V (Fig.
3, a and b). The spectra show that
although there are a large number of peptides generated by proteolysis,
there are a few major peptides present, i.e. those peptides
with masses 1038, 1129, 1147, 1685, and 1975 Da (Fig. 3). TR-IV was the
major peptide-containing fraction and, as expected from the elution position on Superose 12, had higher molecular weight components than
TR-V.
The tryptic peptides in fractions TR-IV and TR-V were separated by
reverse phase chromatography. Both samples showed a complex series
of peaks; however, it was evident that there was similarity in the
major peaks in the chromatograms (Fig. 4, a and b). Four of these peptides (TR-IV-A to TR-IV-C and TR-V-D; see
Fig. 4) were purified by re-chromatography on the reverse phase column, and their homogeneity was ascertained by MALDI-TOF MS (data not shown).
The masses of the four peptides A-D were 1038, 1685, 1975, and
1129 Da, respectively. It can be seen that the major peaks in the
reverse phase separation correspond to some of the major peaks observed
in the mass spectra of the unfractionated peptides (Fig. 3), indicating
that these four peptides are major products of trypsin digestion of
this mucin population. Peptides TR-IV-A, TR-IV-B, and TR-IV-C are also
present in the TR-V chromatogram, showing that the initial size
fractionation (Fig. 2) was not totally effective.
The primary sequence of each of the four peptides was determined by automated Edman degradation, and the data are presented in Table I. A search of the protein sequence data bases revealed a 100% sequence identity for peptides TR-IV-B, TR-IV-C, and TR-V-D and a 1-amino acid difference (arginine for glycine) in 10 residues for peptide TR-IV-A within a 51-amino acid sequence deduced from the JER57 cDNA clone (Fig. 5a), which codes for a part of the MUC5B mucin (22). This region of the MUC5B mucin and our peptides also seem homologous but not identical to two other human mucins, MUC2 and MUC5AC (Fig. 5b). In summary these data indicate that the core protein of mucin population X is encoded by the MUC5B gene.
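The data base comparison above reports peptides that match a deduced sequence either exactly or with a single substitution. A minimal sketch of that kind of ungapped comparison is given below; it slides a peptide along a target sequence and counts identical positions. The function name, the flanking residues in the target, and the single-substitution variant are hypothetical and serve only to illustrate the scoring; the JER57 sequence itself is not reproduced here, and the actual search tool used is not specified in the text.

```python
def best_ungapped_match(peptide: str, target: str) -> tuple[int, int]:
    """Return (start, identities) for the best ungapped placement.

    start is the 0-based offset in target; identities is the number of
    positions at which the peptide and target window carry the same residue.
    Returns (-1, -1) if the peptide is longer than the target.
    """
    best_start, best_identities = -1, -1
    for start in range(len(target) - len(peptide) + 1):
        window = target[start:start + len(peptide)]
        identities = sum(a == b for a, b in zip(peptide, window))
        if identities > best_identities:
            best_start, best_identities = start, identities
    return best_start, best_identities


# Hypothetical target: invented flanks around the TR-IV-C sequence.
target = "AAK" + "ELGQVVECSLDFGLVCR" + "GGK"
exact = "ELGQVVECSLDFGLVCR"
variant = "ELRQVVECSLDFGLVCR"  # invented single substitution (G to R)

for name, peptide in (("exact", exact), ("variant", variant)):
    start, identities = best_ungapped_match(peptide, target)
    print(name, start, f"{identities}/{len(peptide)}")
```

An exact match scores all residues as identical, whereas a peptide with one substitution scores one fewer, which is the form in which the result for TR-IV-A (9 of 10 residues) is reported.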
Mucin Y
Trypsin treatment of this mucin subunit population
yielded a similar gel filtration profile (data not shown) to that
observed for mucin population X (Fig. 2). The low molecular weight
tryptic peptides were pooled into a single fraction corresponding to
TR-IV and TR-V and analyzed by MALDI-TOF MS (Fig. 6).
The spectrum is similar to those presented for mucin X peptides (Fig.
3), and four of the major peptides (A-D) with masses of 1038, 1685, 1975, and 1129 Da, respectively, are also present (Fig. 6). In
contrast, other major signals (with masses of 1457 and 1549 Da) are
observed that were absent from the spectra for mucin X. However, we
have previously shown that this sample contains a minor amount of the MUC5AC mucin (1), and these peptides may arise from this mucin protein.
Reverse phase chromatography (Fig. 7) revealed a complex pattern of peaks; three of the major peaks occur in the same fractions as observed for the mucin X peptide chromatograms (Fig. 4). In addition, MALDI-TOF MS on these fractions reveals peptides of identical
mass to the mucin X peptides (data not shown). In summary these data
indicate that mucin Y is a different glycoform of the MUC5B gene
product.
To confirm this a polyclonal antiserum (MAN-5BI) raised against a
synthetic peptide corresponding to TR-IV-C was used to probe a Western
blot of an agarose gel separation of the three mucin subunit
populations (Fig. 8). It is apparent that subunit bands corresponding to mucin X and mucin Y (Fig. 1) are both reactive with
this antiserum, whereas the MUC5AC subunit is not.
Structural Organization of the MUC5B Reduced Subunit
The high molecular weight mucin glycopeptides (TR-I, see Fig. 2) were separated into three components (GP-I to GP-III) by gel filtration chromatography on Superose 6 (Fig. 9a), and their amino acid compositions are presented in Table II. Each glycopeptide has a high content of serine, threonine, and proline, and the data demonstrate that samples GP-I and GP-II are similar but different from GP-III, which has a much higher content of serine relative to threonine. The molecular weight distribution for the glycopeptides was determined by light scattering (Fig. 9a), and the average molecular weight values calculated for GP-I, GP-II, and GP-III were 7.3 × 10⁵, 5.2 × 10⁵, and 2.0 × 10⁵, respectively. From the light scattering data we were also able to deduce radii of gyration across the glycopeptide distribution (data not shown), and the relationship between the radius of gyration and molecular weight is presented (Fig. 9b). The slope of this plot is a shape-sensitive parameter, and the value of 0.78 is consistent with an almost rod-like structure for the glycopeptides. From integration of the refractive index increments across the three peaks, the amount of each glycopeptide was deduced, and in conjunction with their measured average molecular weight, a molar ratio of 1:2:1 was determined for the three components GP-I, GP-II, and GP-III, respectively.
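The final step above, converting the mass of each glycopeptide (from the integrated refractive index signal) into a molar ratio, is a division of each mass by its molecular weight. The sketch below illustrates this under stated assumptions: the molecular weights are those reported above, but the relative mass amounts are hypothetical values chosen for illustration, because the integrated peak areas are not given in the text.

```python
def molar_ratio(mass_amounts, molecular_weights):
    """Convert relative masses to molar amounts normalized to the smallest.

    moles of each component are proportional to mass / molecular weight.
    """
    moles = [mass / mw for mass, mw in zip(mass_amounts, molecular_weights)]
    smallest = min(moles)
    return [n / smallest for n in moles]


# Average molecular weights reported for GP-I, GP-II, and GP-III.
MOLECULAR_WEIGHTS = [7.3e5, 5.2e5, 2.0e5]

# Hypothetical relative masses (percent of total), for illustration only.
relative_masses = [37.0, 53.0, 10.0]

ratio = molar_ratio(relative_masses, MOLECULAR_WEIGHTS)
print([round(value, 2) for value in ratio])
```

With these illustrative masses the ratio comes out close to 1:2:1, showing that the middle-sized glycopeptide must account for roughly half of the glycopeptide mass for the reported stoichiometry to hold.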
Previously we have shown that three oligomeric mucin populations comprise the bulk of the gel-forming species in human respiratory secretions (1). After reduction these mucins can be separated by anion exchange chromatography and agarose gel electrophoresis (1, 15). One of these mucins was identified as the product of the MUC5AC gene, but the genetic identity of the other two was not ascertained (1). The mucin preparation studied here was shown to contain all three populations, and the aim of this study was to assign genetic identities to the two unknowns. Amino acid sequence data were obtained from N-terminal sequencing of four tryptic peptides, and these sequences were found, three identically and one with a single substitution, within a 51-amino acid sequence deduced from the cDNA clone (JER57) that encodes a small segment of the MUC5B mucin. In contrast, only partial similarity in sequence is observed between these peptides and regions of the human mucins MUC2 (24) and MUC5AC (25). On the basis of these findings, the similarity in the pattern of their tryptic peptides, and their reactivity with an antipeptide antiserum, we conclude that the two mucin populations X and Y are products of the MUC5B gene but represent different glycoforms. In support of this conclusion we have shown previously that these two mucin populations have different carbohydrate compositions and charge densities (1). The latter finding explains the difference in their electrophoretic mobility. The existence of two glycoforms of the MUC5B gene product raises questions as to whether they are both present in the normal situation, whether their level is changed between normal and diseased conditions, and whether they are the products of the same or different cells.
A mucin population corresponding to the least-charged variant of the MUC5B mucin was shown to be enriched in mucin preparations derived from human respiratory tract submucosal tissue (16), which, according to in situ hybridization data, is the site of synthesis of MUC5B mucin in the airways (11). This mucin was virtually absent from epithelial surface (i.e. goblet cell) mucin preparations that are enriched in the MUC5AC mucin (16), a finding that is also consistent with in situ hybridization data (11).
Hydrodynamic and electron microscopy studies led us to propose a model
for mucin subunits as containing proteinase-resistant, oligosaccharide-rich domains flanked by proteinase-sensitive, cysteine-containing, naked-protein regions (3, 4, 15, 20, 26, 27). The
oligosaccharide-rich domains correspond to the high molecular weight
mucin glycopeptides released by proteolysis of the reduced mucin
subunit. Here we show that the high molecular weight mucin
glycopeptides derived from the least-charged glycoform of the MUC5B
mucin can be separated on the basis of size into three different
components with an average Mr of 7.3 × 10⁵, 5.2 × 10⁵, and 2.0 × 10⁵ and a relative molar ratio of 1:2:1. If present in this
ratio in the intact reduced MUC5B subunit, then the glycosylated
domains would account for approximately 80% of the mass
(Mr 2.5 × 10⁶ (1)). Thus, we
can suggest a model for the MUC5B mucin subunit (Fig.
10) in which these glycosylated proteinase-resistant
domains are flanked by cysteine-containing regions of the protein core less substituted with glycan chains and thus more susceptible to
proteolysis after reduction. These naked domains account for approximately 20% of the mass of the subunit (i.e. Mr approximately 500,000)
and can be fragmented by trypsin into smaller peptides and
glycopeptides. From the relative abundance of a few major
peptides and the fact that these peptides are found in a
contiguous sequence one may propose that this cysteine-rich motif is
repeated with identical sequence numerous times in the molecule.
Interestingly a homologous motif is found repeated twice within the
MUC2 mucin core protein flanking the smaller of the two glycosylated
regions (25). By analogy we propose that this cysteine-rich motif
flanks the glycosylated domains in the MUC5B mucin subunit. Thus, our data provide some insight into the structure of the proteinase-sensitive naked domains (Fig. 10).
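The mass balance behind the proposed subunit model can be checked with a few lines of arithmetic. The sketch below uses only the values given in the text (domain molecular weights, the 1:2:1 molar ratio, and a subunit Mr of 2.5 × 10⁶) and makes one explicit assumption: that the molar ratio corresponds to one, two, and one copies of the respective domains per subunit. The function name is hypothetical.

```python
def glycosylated_mass_fraction(domains, subunit_mr):
    """Return (glycosylated mass, fraction of subunit mass).

    domains is an iterable of (molecular weight, copies per subunit).
    """
    glycosylated = sum(mr * copies for mr, copies in domains)
    return glycosylated, glycosylated / subunit_mr


# Values from the text; copy numbers assume the 1:2:1 ratio is per subunit.
DOMAINS = [(7.3e5, 1), (5.2e5, 2), (2.0e5, 1)]
SUBUNIT_MR = 2.5e6

glyco_mass, fraction = glycosylated_mass_fraction(DOMAINS, SUBUNIT_MR)
naked_mass = SUBUNIT_MR - glyco_mass
print(f"glycosylated: {glyco_mass:.3g} ({fraction:.0%}); naked: {naked_mass:.3g}")
```

The glycosylated domains sum to about 1.97 × 10⁶ (79% of the subunit), leaving about 5.3 × 10⁵ for the naked regions, in line with the approximate figures of 80% and 500,000 quoted above.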
Biochemical analyses of airway secretions have demonstrated that the MUC5AC (1) and MUC5B mucins account for the bulk of the gel-forming mucins in the respiratory mucus gel. Northern blot and in situ hybridization indicate that at least six other mucins, namely MUC1, MUC2, MUC3, MUC4, MUC7, and MUC8, are expressed in the airways (7, 9, 10, 12-14), but only the MUC2 mucin has been demonstrated to be of the large gel-forming type (28, 29). However, we and others have been unable to find significant quantities of the MUC2 mucin in respiratory tract secretions (1, 16, 18). Thus, the level of mRNA expression may be a poor indicator of the amount of a particular mucin in the gel, and our findings demonstrate the necessity of determining the amount of mucin in the secretion with biochemical methods, rather than relying solely on the level of mucin mRNA expression.
We thank Prof. Tim Hardingham for reading the manuscript and the Wellcome Trust for support.