Characterization of Gel-separated Glycoproteins Using Two-step Proteolytic Digestion Combined with Sequential Microcolumns and Mass Spectrometry*

Martin R. Larsen‡, Peter Højrup and Peter Roepstorff

From the Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, DK-5230 Odense, Denmark


    ABSTRACT
 
Protein glycosylation can be vital for changing the function or physicochemical properties of a protein. Abnormal glycosylation can lead to protein malfunction, resulting in severe diseases. Therefore, it is important to develop techniques for characterization of such modifications in proteins at a sensitivity level comparable with state-of-the-art proteomics. Whereas techniques exist for characterization of high abundance glycoproteins, no single method is presently capable of providing information on both site occupancy and glycan structure on a single band excised from an electrophoretic gel. We present a new technique that allows characterization of low amounts of glycoproteins separated by gel electrophoresis. The method takes advantage of sequential specific and nonspecific enzymatic treatment followed by selective purification and characterization of the glycopeptides using graphite powder microcolumns in combination with mass spectrometry. The method is faster and more sensitive than previous approaches and is compatible with proteomic studies.


Glycosylation is one of the most abundant post-translational protein modifications in nature. The biological role of glycosylation varies from conformational stability and protection against degradation to molecular and cellular recognition in development, growth, and cellular communication (1, 2). Several diseases have been associated with abnormalities in carbohydrate degradation and recognition (3, 4). The majority of glycosylated proteins are secreted or membrane-bound. Many recombinant proteins produced by the biotechnology industry are glycoproteins, e.g. cytokines or antibodies. These require a correct glycan structure for optimal biological function and to avoid triggering an immune response. Therefore, faster and more sensitive methods for characterizing glycoproteins are particularly important to understand their biological functions and to verify the structure of recombinant glycoproteins.

Four types of glycosylation are known: N-linked where the sugar is attached to asparagine residues in the consensus sequence Asn-Xaa-(Ser/Thr/Cys); O-linked where the sugar is attached to serine or threonine; glycosylphosphatidylinositol anchors, which are attached to the carboxyl terminus of certain membrane-associated proteins; and finally C-glycosylation, which has been found attached to tryptophan residues in certain membrane-associated and secreted proteins (5). Except for the latter two glycosylation types, each glycosylated amino acid may have a wide variety of different glycan structures attached, leading to a pronounced heterogeneity (microheterogeneity). The consequence is that for a given amount of protein there is a significantly lower amount of each glycosylated isopeptide compared with the non-modified peptide from a glycosylated protein. In addition, different sites may be only partially glycosylated (macroheterogeneity), resulting in more complex mixtures and ambiguous glycosylation site assignment. Mass spectrometric analysis of glycopeptides is further complicated by the fact that the signals for glycosylated peptides tend to be suppressed in the presence of non-glycosylated peptides.
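The N-linked consensus sequence described above lends itself to a simple pattern search. The following is a minimal sketch, not part of the published method: the function name `find_sequons` is hypothetical, and the exclusion of proline at the Xaa position is an assumption that the text does not state.

```python
import re

# Consensus from the text: Asn-Xaa-(Ser/Thr/Cys).
# Assumption (not stated in the text): Xaa is any residue except proline.
# The lookahead lets overlapping sequons (e.g. "NNTT") be reported separately.
SEQUON = re.compile(r"(?=(N[^P][STC]))")


def find_sequons(sequence):
    """Return (1-based position of Asn, matched tripeptide) for each sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Short peptides quoted later in the Results; positions are relative
    # to the peptide, not to the full protein sequence.
    for peptide in ("YNLT", "RFPNAT", "SHEDEFNVT"):
        print(peptide, find_sequons(peptide))
```

A match only marks a potential site; whether the site is occupied has to be established experimentally, which is the purpose of the method described here.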

The current techniques for glycoprotein characterization from low amounts of glycoprotein do not easily provide information on both the structure of the different glycans attached to the peptide and the site of attachment. Consequently, a combination of complementary techniques is needed to fully characterize a glycoprotein, resulting in a decrease in the overall sensitivity. Characterization of glycoproteins by MS is typically performed on glycopeptides isolated by liquid chromatography, by sequential treatment with specific endo-/exoglycosidases, and by monitoring the mass changes by MS (6). This strategy requires detection of the glycopeptide in the peptide mass map, availability of suitable glycosidases, and sufficient material for performing multiple digestion steps. The advantage of this method is that the specificity of the glycosidases gives information on the identity of the attached monosaccharides as well as the linkage type. This approach has only in a few cases been applied to low levels of glycoproteins available in electrophoretic gels (7, 8).

Other strategies for characterization of glycosylated proteins include a combined enzymatic and chemical release of the glycan structures of gel-separated proteins after blotting to polyvinylidene difluoride membranes (9). The blotted glycosylated proteins can be identified by matrix-assisted laser desorption/ionization (MALDI)-MS after on-blot tryptic digestion, and the glycans can be characterized by liquid chromatography-MS using, for example, graphitized carbon column liquid chromatography for desalting and concentration (10). This can be combined with enzymatic release of the glycan in buffer containing 50% H₂¹⁸O (11). The glycosylated peptide and the glycosylation site can be identified based on its isotope pattern. This method has been combined with lectin affinity capturing and liquid chromatography-tandem MS for the identification of 400 unique glycosylation sites in an extract of Caenorhabditis elegans proteins (12). A similar strategy involves enrichment of the glycopeptide by using hydrophilic interaction liquid chromatography (normal phase chromatography) columns followed by identification of the glycan site by treatment with a mixture of glycosidases that leaves a single GlcNAc residue on N-linked glycosylation sites. This technique was successfully applied to the identification of glycosylation sites in serum proteins (13).

A recent quantitative technique utilizes glycoproteins that are covalently linked to a solid support by hydrazine chemistry followed by proteolysis and stable isotope labeling of the bound glycopeptides (14). The labeled glycopeptides are released by peptide-N-glycosidase F digestion and subsequently identified and quantified using a liquid chromatography-tandem MS-based strategy.

We have recently reported the use of GELoader tip microcolumns packed with graphite powder for desalting and purification of low amounts of small or hydrophilic peptides or phosphopeptides in proteomics (15, 16). Here we describe a method that allows extensive characterization of a small amount of gel-separated N-linked glycoprotein. It utilizes a two-step proteolytic digestion combined with purification of the glycopeptides by sequential use of microcolumns packed with reversed-phase (RP) resin and graphite powder. The method allows protein identification, glycosylation site mapping, and partial structure analysis of the attached glycans. The method is fast and sufficiently sensitive to allow characterization of glycoproteins in gel-based proteomics.


    EXPERIMENTAL PROCEDURES
 
Material and Reagents—
Graphite material (catalog number C5510), proteinase K, ovalbumin, and α-cyano-4-hydroxycinnamic acid were from Sigma. GELoader tips were from Eppendorf (Hamburg, Germany). 2,5-Dihydroxybenzoic acid (DHB) was from Fluka (St. Louis, MO). Modified trypsin was from Promega (Madison, WI). Poros R2 chromatographic material was from Applied Biosystems (Framingham, MA). All reagents used in the experiments were sequence grade, and the water was from a Milli-Q system (Millipore, Bedford, MA).

SDS-PAGE—
One-dimensional gel electrophoresis was performed according to Laemmli (17) using the mini Protean II gel system (Bio-Rad). The proteins were dissolved in SDS sample buffer (0.5 M Tris-HCl, pH 6.8, 10% glycerol, 2% SDS, 20 mM dithiothreitol, 0.05% bromphenol blue), boiled for 2 min, and applied to a 12% separation gel. Electrophoresis was carried out at a constant voltage of 160 V. The separated proteins were visualized by staining with standard colloidal Coomassie Brilliant Blue.

Reversed-phase High Performance Liquid Chromatography (RP-HPLC)—
RP-HPLC was performed on an ÄKTA basic system (Amersham Biosciences) using a Jupiter C5 reversed-phase 250 x 4.6-mm column (particle size, 5 µm; pore size, 300 Å). Solvent A was 0.1% (v/v) trifluoroacetic acid, and solvent B was 90% (v/v) acetonitrile, 0.08% (v/v) trifluoroacetic acid. The gradient is indicated in Fig. 5.



 
FIG. 5. Separation and identification of the proteins co-purified with commercial ovalbumin. A, HPLC separation of 250 µg of commercial ovalbumin (Sigma catalog numbers A5503 and P01012) using a Jupiter C5 column (Phenomenex). The acetonitrile gradient is indicated on the figure. B, SDS-PAGE of the three different fractions (F1–F3) obtained from the HPLC separation. Lane M, molecular weight marker. C, MALDI peptide mass map of peptides derived from tryptic in-gel digestion of ovomucoid. D, MALDI peptide mass map of peptides derived from tryptic in-gel digestion of ovoglycoprotein.

 
In-gel Digestion—
In-gel digestion was performed as described previously (18). Briefly the excised gel plug was washed in digestion buffer (50 mM NH4HCO3, pH 7.8, acetonitrile (60:40)) and dried by vacuum centrifugation. Modified trypsin (8 ng/µl) dissolved in 50 mM NH4HCO3, pH 7.8 was added to the dry gel pieces and incubated on ice for 1 h. After removing the supernatant additional digestion buffer was added, and the digestion was continued at 37 °C for 4–12 h. A small aliquot of the supernatant from the digestion was used for protein identification by peptide mass mapping. The remaining peptide solution was used for proteinase K digestion.

Proteinase K Digestion—
The supernatant from the tryptic digestion of the glycoprotein was transferred to a new Eppendorf tube and 0.02 unit of proteinase K dissolved in water was added to the tube. The solution was incubated at 37 °C overnight. The resulting solution was sequentially desalted and concentrated using GELoader tip microcolumns packed with Poros R2 material and graphite powder as described below.

Desalting and Concentration of the Glycopeptides—
Microcolumns packed with either Poros reversed-phase R2 resin or graphite powder were prepared using GELoader tips as described previously (15, 19). The typical column length was 0.5 mm corresponding to a bed volume in the low nanoliter range. Both columns were washed with 10 µl of ultrahigh quality water. An aliquot of the proteinase K digestion mixture was applied to the Poros R2 microcolumn. Gentle air pressure was applied with a 1-ml disposable syringe to force the liquid through the column. The Poros R2 column was washed with 20 µl of 0.1% trifluoroacetic acid, and the bound peptides were eluted directly onto the MALDI target using 0.2 µl of a matrix solution consisting of α-cyano-4-hydroxycinnamic acid (10 g/liter) in 70% acetonitrile, 0.1% trifluoroacetic acid or 2,5-dihydroxybenzoic acid (10 g/liter) in 50% acetonitrile, 0.2% formic acid (FA). The flow-through from the R2 column was applied to the graphite powder microcolumn. This column was washed using 20 µl of Milli-Q water. The glycosylated peptides were eluted from the graphite powder microcolumn using 0.5 µl of a solution containing 30% acetonitrile and 0.2% FA directly onto a MALDI target. Subsequently 0.5 µl of matrix solution (DHB in 50% acetonitrile, 0.2% FA) was added to the eluted peptide solution on the target.

MALDI-MS—
MALDI-MS was performed using a Voyager STR system (PerSeptive Biosystems, Framingham, MA) equipped with delayed extraction. Spectra were obtained in positive reflector or linear ion mode and negative reflector ion mode using an accelerating voltage of 20 kV. The frequency of the laser was reduced to the lowest setting since we have experienced higher sensitivity for larger biomolecules with reduced laser frequency. The spectra were calibrated externally using a calibrant spot very close to the sample spot; this normally provided a mass accuracy of approximately 70 ppm.

MALDI tandem mass spectrometry was performed using a MALDI quadrupole time-of-flight mass spectrometer (Micromass, Manchester, UK). The spectra were acquired using the software MassLynx 3.5. The collision energy during tandem MS experiments was 70–100 eV. Argon was used as collision gas at an indicated manifold pressure of 3 × 10⁻⁵ millibar in the hexapole. The spectra were manually interpreted. The instrument was calibrated by multipoint calibration using polyethylene glycol 2000, resulting in mass accuracy below 50 ppm. Internal calibration of the fragment ion spectra using the oxonium ions (m/z 204.0873 and 366.1561) provided higher accuracy in the low mass area.
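The mass accuracy figures and the internal calibration on the two oxonium ions can be illustrated with a short calculation. This is a sketch under a simplifying assumption: it applies a linear correction in m/z through the two reference points, which is not necessarily what the instrument software does. The function names are hypothetical.

```python
# Reference oxonium ion m/z values quoted in the text.
OXONIUM_REFS = (204.0873, 366.1561)


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def two_point_calibration(observed_refs, true_refs=OXONIUM_REFS):
    """Return a function that maps observed m/z to corrected m/z.

    Assumption: a straight line through the two reference ions is an
    adequate correction over the low mass range.
    """
    (obs1, obs2), (true1, true2) = observed_refs, true_refs
    slope = (true2 - true1) / (obs2 - obs1)
    intercept = true1 - slope * obs1
    return lambda mz: slope * mz + intercept


if __name__ == "__main__":
    # Illustrative (made-up) observed positions of the two oxonium ions.
    correct = two_point_calibration((204.0975, 366.1744))
    print(round(ppm_error(204.0975, 204.0873), 1), "ppm before calibration")
    print(round(correct(204.0975), 4), round(correct(366.1744), 4))
```

By construction the two reference ions are mapped exactly onto their theoretical values, so the correction is most trustworthy for fragment ions lying between or near them.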

For peptide mass mapping and glycopeptide analysis by MALDI-MS, DHB in 50% acetonitrile was used as the matrix. This matrix complicates external calibration in normal MALDI time-of-flight MS experiments because the calibration depends on the size of the matrix crystals. Therefore, the mass accuracy in the MALDI-MS experiments presented here is in general lower than in the MALDI tandem MS experiments, where the size of the crystals does not strongly influence the calibration.


    RESULTS
 
Principles of the Method—
The overall strategy for characterization of glycosylated proteins separated by gel electrophoresis is illustrated in Fig. 1. There are two major steps that underpin the method: two-stage proteolytic digestion and purification of the glycopeptides by sequential use of Poros R2 RP and graphite powder microcolumns. For the two-stage proteolytic digestion the gel-separated glycoprotein is submitted to primary in-gel proteolysis using a sequence-specific endoproteinase, e.g. trypsin. A small aliquot of the derived peptide mixture is analyzed by MS, and the protein is identified based on a peptide mass map or partial sequence information. The remaining peptide solution is subjected to a secondary treatment with a nonspecific proteinase, which in our case was proteinase K. This cleaves the majority of the tryptic peptides into smaller peptides (typically <5 amino acid residues). The presence of a glycan structure on a peptide creates steric hindrance for the proteinase K, thereby leaving a small "peptide tag" attached to the glycan. This results in small glycopeptides that will have molecular masses typically >1200 Da. For the next phase of the method the remaining non-glycosylated peptides (peptides derived by autoproteolysis of the proteinase K or those resistant to proteinase K) are removed by sequential use of Poros R2 and graphite microcolumns. By passing the digest through a GELoader tip microcolumn packed with Poros R2 material the unwanted peptides are removed, but the glycopeptides pass through and are trapped on a second GELoader tip microcolumn packed with graphite powder. The glycopeptides are not retained on the R2 since they have a hydrophilicity that primarily resembles the glycan. The graphite-bound glycopeptides can efficiently be washed to remove low molecular weight contaminants and subsequently eluted using 30% acetonitrile, 0.2% FA. Elution with this solvent composition usually does not allow efficient elution of peptides (15). 
MS then provides an accurate molecular weight, and tandem MS reveals both the amino acid sequence of the peptide and a partial glycan structure.



 
FIG. 1. The strategy for site-specific characterization of glycoproteins separated by gel electrophoresis. See text for further details. MS/MS, tandem MS.

 
In cases where the glycan part is very big and creates more steric hindrance resulting in an increased length of the attached peptide, the glycopeptide can be partially retained on the Poros R2 column. By eluting this column with 5% acetonitrile those glycopeptides can be analyzed by MS. However, the binding is very dependent on the nature of the amino acids.

A similar strategy using nonspecific proteinases for trimming down glycopeptides has been described previously for purified glycoproteins (20, 21). These studies are based on Pronase, which is a mixture of several endo-/exoproteinases. However, incubation of Pronase at 37 °C results in a large number of peptides derived by "autoproteolysis" of the enzymes. We have found that the presence of such peptides significantly reduces the overall sensitivity of the method due to suppression effects in MS and saturation of the microcolumns.

To apply the strategy, the glycoprotein ovalbumin was chosen as model protein. Ovalbumin is a secreted 386-amino acid residue protein reported to have a single N-linked glycosylation site, Asn292. A great variation of glycan structures in ovalbumin has been described, including high mannose and complex/hybrid structures, some of which originate from contaminating proteins that co-purify with commercial ovalbumin (22). Commercial ovalbumin that was further purified by HPLC contains high mannose and hybrid glycan structures (22).

Eight picomoles of commercial ovalbumin were run on an SDS gel and stained with colloidal Coomassie Blue. The band was excised and submitted to in-gel tryptic digestion. A small aliquot (5%) of the digest supernatant was analyzed by MALDI-MS after desalting on a Poros R2 microcolumn. The resulting peptide mass map unambiguously identified chicken ovalbumin (Fig. 2A). An aliquot (30%) of the remaining peptide mixture was treated with 0.02 unit of proteinase K and applied to a microcolumn of Poros R2, and the column was washed with 0.1% trifluoroacetic acid. The bound peptides were eluted with DHB matrix solution and analyzed by MALDI-MS. The resultant peptide mass map is shown in Fig. 2B. The peptides marked with an asterisk are autoproteolytic products of proteinase K. The proteinase K treatment efficiently degraded the unglycosylated peptides. The flow-through from the R2 microcolumn was applied to a microcolumn packed with graphite powder, which was washed with water. The glycopeptides were eluted with 0.5 µl of 30% acetonitrile, 0.2% FA directly onto a MALDI target and mixed with 0.5 µl of DHB matrix. The resulting MALDI spectrum obtained in positive reflector ion mode shows several signals above 1500 Da where glycopeptides are expected to appear (Fig. 2C). Only signals for autodigestion products from proteinase K were detected, indicating that all glycopeptides were eluted with 30% acetonitrile, 0.2% FA. The distance between the signals in Fig. 2C corresponds to the typical mass difference for monosaccharides (162 and 203 Da for hexose and N-acetylhexosamine, respectively), showing the presence of a single glycopeptide with a heterogeneous glycan structure, resulting in several signals in MALDI-MS (See Fig. 4 for further details on the glycopeptide profile from ovalbumin). 
Subsequent elution of the graphite column using the DHB matrix in 50% acetonitrile revealed that the glycopeptides were eluted in the previous step and that only autoproteolysis products from proteinase K remain on the column (Fig. 2D).
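The recognition of a glycopeptide series from the spacing of its signals can be sketched as a pairwise comparison of peak positions. The residue masses below are calculated monoisotopic values (the text quotes the nominal 162 and 203 Da), and the peak list in the example is made up for illustration; `annotate_spacings` is a hypothetical helper, not software used in the study.

```python
HEX = 162.0528     # monoisotopic hexose residue, C6H10O5
HEXNAC = 203.0794  # monoisotopic N-acetylhexosamine residue, C8H13NO5


def annotate_spacings(peaks, tol=0.1):
    """Label each pair of peaks whose m/z difference matches one residue.

    Returns a list of (lower m/z, higher m/z, residue name) tuples.
    """
    peaks = sorted(peaks)
    hits = []
    for i, low in enumerate(peaks):
        for high in peaks[i + 1:]:
            delta = high - low
            for name, mass in (("Hex", HEX), ("HexNAc", HEXNAC)):
                if abs(delta - mass) <= tol:
                    hits.append((low, high, name))
    return hits


if __name__ == "__main__":
    # Illustrative peak list, not taken from Fig. 2C.
    for low, high, name in annotate_spacings([1000.00, 1162.05, 1365.13]):
        print(f"{low:.2f} -> {high:.2f}: +{name}")
```

The comparison assumes that all peaks in a series carry the same adduct; in positive ion mode, sodium and potassium adducts of the same glycopeptide would have to be grouped first.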



 
FIG. 2. Characterization of 8 pmol of gel-separated ovalbumin. A, MALDI peptide mass map of a small aliquot (2%) of the digestion supernatant desalted on a Poros R2 microcolumn and eluted directly onto the MALDI target using the matrix solution. Circles show the peptides assigned to ovalbumin. B, MALDI spectrum of one-third of the peptide solution after treatment with proteinase K and desalting on a Poros R2 microcolumn. The peptides were eluted as in A. C, MALDI spectrum of glycopeptides isolated after application of the flow-through of the R2 microcolumn on a microcolumn packed with graphite powder. The glycopeptides were eluted from the graphite microcolumn with 30% acetonitrile, 0.2% FA directly on the MALDI target and mixed with DHB matrix. The spectrum was obtained in positive reflector mode, and the glycopeptides were predominantly detected as (M + Na)+ and (M + K)+ ions. D, MALDI spectrum of the subsequent elution from the graphite microcolumn using the DHB matrix solution. The asterisks indicate peptides derived by autoproteolysis of proteinase K. PMF, peptide mass fingerprint.

 


 
FIG. 4. Overview of glycan profile of ovalbumin. The MALDI spectrum obtained in negative ion reflector mode of the glycopeptides obtained from ovalbumin after elution from the graphite column (same sample as shown in Fig. 2C) is shown. The glycopeptides are detected as (M – H) species. The glycan structures shown in the figure are based on prediction using the MALDI tandem MS results together with the GlycoMod software.

 
The mass of the attached glycopeptide(s) can be calculated either by subtracting the molecular weight of known glycan structures from the mass of the glycopeptides (e.g. by using the GlycoMod software (www.expasy.com)) or by tandem mass spectrometry. The latter strategy is illustrated below.
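The subtraction approach can be sketched as a small search over monosaccharide compositions. This is not the GlycoMod algorithm; it is a simplified illustration that considers only hexose and N-acetylhexosamine residues, with calculated monoisotopic residue masses and a hypothetical function name. The example uses the glycopeptide and peptide (M + H)+ values reported for ovalbumin in the Results.

```python
HEX = 162.0528     # monoisotopic hexose residue, C6H10O5
HEXNAC = 203.0794  # monoisotopic N-acetylhexosamine residue, C8H13NO5


def candidate_compositions(glycopeptide_mh, peptide_mh, tol=0.1, max_units=12):
    """List (n_hexnac, n_hex, error in Da) explaining the mass difference.

    Both inputs are (M + H)+ values, so the proton cancels on subtraction
    and the difference is the summed mass of the glycan residues.
    """
    glycan_mass = glycopeptide_mh - peptide_mh
    matches = []
    for n_hexnac in range(max_units + 1):
        for n_hex in range(max_units + 1):
            calculated = n_hexnac * HEXNAC + n_hex * HEX
            error = calculated - glycan_mass
            if abs(error) <= tol:
                matches.append((n_hexnac, n_hex, round(error, 3)))
    return matches


if __name__ == "__main__":
    # Values from the text: glycopeptides at m/z 1888.8 and 2132.8,
    # peptide (M + H)+ at 510.27.
    for mh in (1888.8, 2132.8):
        print(mh, candidate_compositions(mh, 510.27))
```

With these inputs the search yields two HexNAc plus six Hex for m/z 1888.8 and four HexNAc plus five Hex for m/z 2132.8, which is consistent with the high mannose and hybrid assignments made in the text. A composition alone does not distinguish isomeric monosaccharides or linkages.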

The glycopeptides were purified as above and eluted directly from the graphite microcolumn onto a target for analysis by MALDI quadrupole time-of-flight MS. The glycopeptide masses were selected in the quadrupole and submitted to collision-induced dissociation (CID). Fig. 3, A and B, shows the fragment ion spectra obtained from the glycopeptide signals at m/z 2132.8 Da ((M + H)+) and 1888.8 Da ((M + H)+), respectively. The fragment ions can in both cases be assigned to the loss of single monosaccharides from the glycopeptide (the carbohydrate composition of the fragment ions is illustrated), indicating that the charge is predominantly retained on the peptide (Y-type fragments (23)). In addition to the Y-type fragment ions a number of B-type oxonium ions are observed corresponding to the fragmentation of the glycan structure with the charge retained on the glycan. The mass of the peptide can easily be deduced when taking into account the common core structure of N-linked glycans, which is (GlcNAc)2(Man)3 (illustrated in Fig. 3). Thus, two signals in the lower mass range, corresponding to the peptide with one and two GlcNAc residues, respectively, and spaced by 203.09 Da, are indicative of the Y-type fragment ion series and consequently allow determination of the mass of the peptide. The mass of the peptide ((M + H)+ = 510.27 Da) can be assigned to the sequence 291YNLT294 in ovalbumin, consistent with previous findings. A fragment ion at m/z 569.25 in Fig. 3A, corresponding to (GlcNAc)2(Man)1, is indicative of a bisected hybrid glycan structure. The signal at m/z 1443.76 (Fig. 3A) can originate from multiple fragmentation pathways since it can either contain the bisected core structure or may contain HexNAc that is part of the antennary structure. Thus the glycopeptide at m/z 2132.8 Da contains a hybrid glycan structure, and the glycopeptide at m/z 1888.8 Da contains a high mannose glycan structure.
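The deduction of the peptide mass from the two low mass Y-type ions can be written out as a short search. This is a minimal sketch with a hypothetical function name. The fragment list in the example is illustrative: it was constructed from the reported peptide (M + H)+ of 510.27 rather than read from Fig. 3, and the calculated monoisotopic HexNAc residue mass (203.0794 Da) is used with a tolerance.

```python
HEXNAC = 203.0794  # monoisotopic N-acetylhexosamine residue, C8H13NO5


def peptide_mass_from_y_ions(fragments, tol=0.05, low_mass_limit=1000.0):
    """Infer candidate peptide (M + H)+ values from Y-type ions.

    Looks for two low mass fragments spaced by one HexNAc residue and
    interprets them as peptide + 1 GlcNAc and peptide + 2 GlcNAc, so the
    peptide mass is the lower signal minus one HexNAc residue.
    """
    low = sorted(mz for mz in fragments if mz <= low_mass_limit)
    candidates = []
    for i, first in enumerate(low):
        for second in low[i + 1:]:
            if abs((second - first) - HEXNAC) <= tol:
                candidates.append(round(first - HEXNAC, 2))
    return candidates


if __name__ == "__main__":
    # Illustrative fragments: two oxonium ions, peptide + GlcNAc,
    # peptide + 2 GlcNAc, and one larger Y-type ion.
    fragments = [204.09, 366.14, 713.35, 916.43, 1078.48]
    print(peptide_mass_from_y_ions(fragments))
```

If the bare peptide ion is also present in the spectrum, the search returns an additional, spurious candidate, so the result still needs the kind of manual inspection described in the text.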



 
FIG. 3. Fragmentation of glycopeptides from ovalbumin. A, MALDI low energy CID spectrum of the glycopeptide with (M + H)+ at m/z 2132.8. The fragment ions detected can be assigned as B-type oxonium ions (labeled with the carbohydrate composition) or Y-type fragments with charge retention on the peptide shown accordingly (23). The asterisk illustrates an ion at m/z 569.27 that corresponds to (GlcNAc)2(Man)1, a diagnostic ion for a bisected hybrid glycan. B, MALDI low energy CID spectrum of the glycopeptide with (M + H)+ at m/z 1888.8. The fragment ions detected can be assigned as B-type oxonium ions (labeled with asterisks) or Y-type fragments with charge retention on the peptide shown accordingly.

 
The same sample as in Fig. 2C was analyzed by MALDI-MS in negative ion reflector mode (Fig. 4). Here only (M – H) ions are detected, simplifying the spectrum significantly compared with positive ion mode where each glycopeptide is normally split into three different signals due to pronounced alkali metal adduct formation (i.e. sodium and potassium). The signals can all be assigned to the same peptide carrying different glycan structures (illustrated in Fig. 4). All 13 different glycan structures previously reported for HPLC-purified ovalbumin (22) were detected here. In addition, three low abundance but significant signals at m/z 1562.6, m/z 1765.7, and m/z 2292.9 were observed. They correspond to the same YNLT sequence carrying (GlcNAc)2(Hex)4, (GlcNAc)3(Hex)4, and (GlcNAc)4(Hex)6, respectively. The nature of the hexose cannot be assigned based on the mass of the glycan. These structures have not previously been assigned to ovalbumin purified by HPLC but were assigned to glycoproteins contaminating the commercial ovalbumin (22).
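The splitting of each glycopeptide into several signals in positive ion mode, and the single deprotonated signal in negative ion mode, follow directly from the ion masses. The sketch below computes them for the YNLT glycopeptides. The neutral peptide mass and the ion masses are calculated monoisotopic values, not figures from the text, and the function names are hypothetical.

```python
PROTON = 1.00728    # mass of H+
NA_ION = 22.98922   # mass of Na+ (atom minus one electron)
K_ION = 38.96316    # mass of K+ (atom minus one electron)
HEX = 162.0528      # monoisotopic hexose residue
HEXNAC = 203.0794   # monoisotopic N-acetylhexosamine residue

# Neutral monoisotopic mass of YNLT, calculated from residue masses plus
# water; the text reports its (M + H)+ as 510.27.
YNLT_NEUTRAL = 509.2486


def neutral_mass(n_hexnac, n_hex, peptide=YNLT_NEUTRAL):
    """Neutral glycopeptide mass for a HexNAc/Hex composition."""
    return peptide + n_hexnac * HEXNAC + n_hex * HEX


def ion_series(mass):
    """m/z of the singly charged species discussed in the text."""
    return {
        "[M+H]+": mass + PROTON,
        "[M+Na]+": mass + NA_ION,
        "[M+K]+": mass + K_ION,
        "[M-H]-": mass - PROTON,
    }


if __name__ == "__main__":
    # Compositions assigned in the text to the three additional signals.
    for composition in ((2, 4), (3, 4), (4, 6)):
        ions = ion_series(neutral_mass(*composition))
        print(composition, {k: round(v, 2) for k, v in ions.items()})
```

The calculated (M – H) values for the three compositions agree with the reported signals at m/z 1562.6, 1765.7, and 2292.9 to within 0.1.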

Characterization of the Contaminants in Commercial Ovalbumin—
Commercial ovalbumin was purified by HPLC, which resulted in two fully resolved peaks (Fig. 5A) in addition to the peak originating from ovalbumin. The two fractions were further subjected to SDS-PAGE, which after Coomassie Blue staining revealed one distinct band for each fraction (Fig. 5B). The two bands were excised and submitted to in-gel tryptic digestion. A small aliquot of the digestion supernatants was used to identify the two proteins as ovomucoid and ovoglycoprotein by MALDI-MS peptide mass mapping (Fig. 5, C and D, respectively). The identification was verified by MALDI tandem MS (data not shown). Following proteinase K digestion of the tryptic peptide mixtures the glycopeptides could be selectively purified using the method described above.

Fig. 6A illustrates the negative ion mode MALDI-MS peptide map obtained from the 30% acetonitrile eluate from the graphite microcolumn using ~5 pmol of tryptic peptides derived from ovomucoid. A number of signals are observed above 1200 Da corresponding to two different peptides carrying a number of different glycan structures (illustrated by stars and circles, respectively). The difference between the two series is 156 Da, which corresponds to the mass of an arginine residue, illustrating two cleavage variants obtained with proteinase K. The glycopeptide at m/z 2613.1 ((M + H)+) was submitted to CID using MALDI quadrupole time-of-flight tandem MS, and the resulting fragment spectrum is shown in Fig. 6B. The peptide mass can readily be deduced to m/z 705.37. Four peptides containing a consensus sequence of an N-linked glycosylation (underlined) can have this theoretical mass in the amino acid sequence of ovomucoid considering a mass accuracy of 50 ppm (31RFPNAT36, 198SNGTLTL204, 199NGTLTLS205, and 195VVESNGT201). The presence of the cleavage variant and small peptide fragment signals in the fragment ion spectrum (e.g. b3) indicate that the sequence of the peptide is 31RFPNAT36. This peptide carries more than 13 different glycan structures. The peak observed at m/z 788.42 originates from an internal fragmentation of the terminal GlcNAc residue attached to the peptide corresponding to a 0,2X0 type fragmentation. This results in a mass increase of the peptide by 83 Da; this type of fragmentation was observed in all MALDI tandem MS experiments performed in the present study.
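The ambiguity among the four candidate peptides can be reproduced by calculating their theoretical (M + H)+ values and comparing them with the deduced peptide mass within a 50 ppm window. The following is a sketch with hypothetical function names, using standard monoisotopic residue masses that do not appear in the text.

```python
# Monoisotopic residue masses (Da) for the residues that occur below.
RESIDUE = {
    "A": 71.03711, "E": 129.04259, "F": 147.06841, "G": 57.02146,
    "L": 113.08406, "N": 114.04293, "P": 97.05276, "R": 156.10111,
    "S": 87.03203, "T": 101.04768, "V": 99.06841,
}
WATER = 18.01056
PROTON = 1.00728


def mh_plus(peptide):
    """Theoretical monoisotopic (M + H)+ of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER + PROTON


def within_ppm(peptide, observed_mh, ppm=50.0):
    """True if the observed mass matches the peptide within the ppm window."""
    theoretical = mh_plus(peptide)
    return abs(observed_mh - theoretical) / theoretical * 1e6 <= ppm


if __name__ == "__main__":
    observed = 705.37  # peptide mass deduced from the fragment spectrum
    for peptide in ("RFPNAT", "SNGTLTL", "NGTLTLS", "VVESNGT"):
        print(peptide, round(mh_plus(peptide), 3), within_ppm(peptide, observed))
    # The second glycopeptide series is 156 Da lighter: loss of arginine.
    print(round(mh_plus("RFPNAT") - mh_plus("FPNAT"), 3))
```

All four candidates fall inside the 50 ppm window, which is why the mass alone is not decisive and the cleavage variant lacking arginine, together with the peptide fragment ions, is needed to settle on RFPNAT.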



 
FIG. 6. Characterization of glycopeptides from ovomucoid. A, MALDI mass spectrum of glycopeptides obtained from the 30% acetonitrile elution from the graphite column. The two glycopeptides series are marked with stars and circles, respectively. The spectrum was obtained in reflector negative ion mode. B, MALDI low energy CID spectrum of the glycopeptide with (M + H)+ at m/z 2613.06. Both B-type oxonium ions and Y-type fragment ions are detected. The mass of the peptide (underlined) corresponds to the sequence RFPNAT and the peptide fragment ion b3 is illustrated in the figure. Twelve different glycan structures are observed on the peptide with the largest (m/z 2773.1) having the following carbohydrate composition: (GlcNAc)2(Man)3(HexNAc)5(Hex).

 
A similar analysis was performed on the tryptic peptides derived from ovoglycoprotein. However, due to the presence of a high number of different glycopeptide signals in the mass spectrum of the initial 30% acetonitrile eluate (data not shown), the glycopeptides were stepwise eluted from the graphite microcolumn using increasing amounts of acetonitrile (9, 13, and 30% respectively). The positive ion mode MALDI-MS spectrum obtained from the 9% acetonitrile eluate resulted in one series of glycopeptides (Fig. 7A). MALDI tandem MS of the ion signal at m/z 1553.70 revealed a complex glycan structure attached to a peptide with the mass of 458.20 Da, corresponding to the sequence 106HNST109 in ovoglycoprotein (Fig. 7B). This peptide carries more than 10 different glycan structures with the smallest glycopeptide observed (m/z 1553.7) having the following carbohydrate composition: (GlcNAc)2(Man)3(HexNAc)1.



 
FIG. 7. Characterization of glycopeptides from ovoglycoprotein. A, MALDI mass spectrum of glycopeptides obtained from the 9% acetonitrile elution from the graphite column. One peptide carrying a variety of different glycans is observed. The glycopeptide carrying the smallest glycan (m/z 1553.6) has the following carbohydrate composition: (GlcNAc)2(Man)3(HexNAc)1. Potential carbohydrate compositions are added from this glycopeptide. The spectrum was obtained in reflector positive ion mode. B, MALDI low energy CID spectrum of the glycopeptide with (M + H)+ at m/z 1553.73. Both B-type oxonium ions and Y-type fragment ions are detected. The mass of the peptide (458.23 Da) is underlined. C, MALDI mass spectrum of glycopeptides obtained from the 13% acetonitrile elution from the graphite column. One peptide carrying a variety of different glycans is observed. The glycopeptide carrying the smallest glycan (m/z 1569.7) has the following carbohydrate composition: (GlcNAc)2(Man)3(HexNAc)1. Potential carbohydrate compositions are added from this glycopeptide. The spectrum was obtained in reflector negative ion mode. D, MALDI mass spectrum of glycopeptides obtained from the 30% acetonitrile elution from the graphite column. Two peptides containing the same glycosylation site, carrying a variety of different glycans, are observed. The glycopeptide carrying the smallest glycan (m/z 2170.8) in the most pronounced glycopeptide series has the following carbohydrate composition: (GlcNAc)2(Man)3(HexNAc)1. Potential carbohydrate compositions are added from this glycopeptide. The other ion series is marked with circles. The spectrum was obtained in reflector negative ion mode. E, MALDI low energy CID spectrum of the glycopeptide with (M + H)+ at m/z 2782.15. Both B-type oxonium ions and Y-type fragment ions are detected. The mass of the peptide is underlined, and the peptide fragment ions indicating the sequence EFNVT are illustrated in the figure.

 
Elution with 13% acetonitrile resulted in a number of signals in negative ion mode MALDI-MS (Fig. 7C) that all could be assigned to the peptide 89LNET92 in the amino acid sequence of ovoglycoprotein using the database software GlycoMod. This peptide carries more than 10 different glycan structures with the smallest glycopeptide observed (m/z 1569.7) having the following carbohydrate composition: (GlcNAc)2(Man)3(HexNAc)1.

Fig. 7D represents the negative ion mode MALDI spectrum of the glycopeptides eluting with 30% acetonitrile. Two series of signals are represented in this spectrum originating from two different glycopeptides carrying complex glycan structures. The predominant series is illustrated with the differences of hexoses or N-acetylhexosamines. The other series is marked with circles. The glycopeptide at m/z 2782.15 was analyzed by MALDI tandem MS, and the resulting fragment ion spectrum is shown in Fig. 7E. The fragmentation represents a complex glycan structure attached to a peptide with the mass of 1077.5 Da, which corresponds to the amino acid sequence 76SHEDEFNVT84 in ovoglycoprotein.

In addition, a series of fragments, i.e. peptide fragments from b4 to b8, in the low mass area of the fragment ion spectrum represent the partial fragmentation of this peptide (illustrated in Fig. 7E). The second glycopeptide was, after MALDI tandem MS, assigned to the amino acid sequence 78EDEFNVT84 containing the same glycosylation site (data not shown) in agreement with the observed similar glycosylation profile of the two glycopeptides. This N-linked glycosylation site carries more than 12 different glycan structures. The carbohydrate composition of the glycopeptide carrying the smallest observed glycan structure (m/z 2172.8) is (GlcNAc)2(Man)3(HexNAc)1. The composition of the larger glycan structures is illustrated in Fig. 7D.
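As a rough check on the assignments above, the expected m/z values can be recomputed from standard monoisotopic masses. The sketch below is illustrative only and was not part of the original analysis. The function names are hypothetical, and (GlcNAc)2(Man)3(HexNAc)1 is treated as three N-acetylhexosamines plus three hexoses because isomeric monosaccharides cannot be distinguished by mass.

```python
# Monoisotopic residue masses (Da) for the amino acids in the peptides above.
RESIDUE = {
    "S": 87.03203, "H": 137.05891, "E": 129.04259, "D": 115.02694,
    "F": 147.06841, "N": 114.04293, "V": 99.06841, "T": 101.04768,
    "L": 113.08406,
}
WATER = 18.010565
PROTON = 1.007276
HEX = 162.052824     # hexose residue (e.g. Man)
HEXNAC = 203.079373  # N-acetylhexosamine residue (e.g. GlcNAc)


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER


def glycopeptide_mass(sequence, n_hexnac, n_hex):
    """Neutral monoisotopic mass of a peptide carrying a single glycan."""
    return peptide_mass(sequence) + n_hexnac * HEXNAC + n_hex * HEX


def mz(neutral_mass, mode):
    """Singly charged m/z: '+' gives (M + H)+, '-' gives (M - H)-."""
    return neutral_mass + PROTON if mode == "+" else neutral_mass - PROTON


if __name__ == "__main__":
    # Peptide mass read from the tandem MS spectrum (Fig. 7E), as (M + H)+.
    print(round(mz(peptide_mass("SHEDEFNVT"), "+"), 2))
    # Smallest glycopeptides in the negative ion mode spectra (Fig. 7, C and D).
    print(round(mz(glycopeptide_mass("LNET", 3, 3), "-"), 2))
    print(round(mz(glycopeptide_mass("SHEDEFNVT", 3, 3), "-"), 2))
```

Under these assumptions the three calculated values fall within 0.1 Da of the reported 1077.5, 1569.7, and 2170.8. The value of 2172.8 quoted in the text for the smallest SHEDEFNVT glycopeptide would correspond to the (M + H)+ form of the same species.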


    DISCUSSION
 
To our knowledge, the strategy presented here is the first that will allow characterization of glycoproteins with respect to site-specific assignment of glycan structures at sensitivity levels compatible with current proteomic gel-based approaches. In the study of ovalbumin with as little as 8 pmol of ovalbumin applied on a gel we identified all of the 13 glycan structures previously observed in HPLC-purified ovalbumin as well as three additional glycan structures. From lower amounts of starting material (2.5 pmol separated by SDS-PAGE) we have successfully detected 11 of the 13 previously observed structures on ovalbumin (data not shown). Our result indicates that the three new structures on ovalbumin are unlikely to originate from contaminating proteins since the ovalbumin in this experiment was purified by SDS-PAGE, and the known contaminating proteins do not have the same migration distance on gels. It has been postulated that the glycosidic bond can undergo in-source fragmentation resulting in the appearance of minor glycan fragments in the MALDI spectrum (e.g. Ref. 24). However, this kind of fragmentation is normally associated with loss of sialic acid residues and has only to a smaller extent been shown for other residues in a glycan structure. Whether or not the presence of the three low abundance species is naturally occurring has to be further investigated using other methods.

The characterization of the contaminating proteins ovomucoid and ovoglycoprotein revealed four different glycosylation sites (one site in ovomucoid and three sites in ovoglycoprotein) and a total of 46 different glycopeptides. More sites than the ones observed have been described in both ovomucoid and ovoglycoprotein according to Glycosuite. However, the ovalbumin used here was purified by the manufacturer using ion exchange chromatography, which has very likely excluded some structural variants, for example those containing sialic acids. In addition, sialic acid-containing peptides require larger amounts for detection by MS, and thus the ideal characterization strategy of sialic acid-containing proteins in proteomics will be a combination of the method described above with two-dimensional gel electrophoresis, which is able to separate the different sialic acid forms into distinct spots on the gel.

The two-stage digestion described here, using trypsin and proteinase K, produces small glycopeptides that contain a peptide tag consisting of three to five amino acids. These glycopeptides resemble free glycans with respect to hydrophilicity and physicochemical properties as the attached peptide is generally hydrophilic due to the common consensus sequence for N-linked glycosylation (NX(S/T/C)). Because glycan structures in general are highly hydrophilic they can rarely be purified using normal reversed-phase chromatographic material. This is also observed for the small glycopeptides generated using the strategy described here. However, they can be efficiently purified using graphite powder microcolumns. The removal of interfering non-modified peptides, originating either from the glycoprotein or from contaminating proteins (e.g. keratin), using the Poros R2 RP column dramatically increases the sensitivity of the method by reducing the suppression effect caused by the presence of non-glycosylated peptides. The sensitivity of the method is, in addition, markedly improved by using nanoliter bed volume columns prepared in GELoader tips for concentration and desalting prior to mass spectrometric analysis.
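The consensus sequence mentioned above can be located in a known protein or peptide sequence with a short pattern search. The following is a minimal sketch, not part of the original workflow. The function name and the `first_residue` offset are hypothetical, and the exclusion of proline at position X is a commonly applied rule that is assumed here rather than taken from the text.

```python
import re

# N-X-(S/T/C), with X assumed to be any residue except proline.
# The lookahead allows overlapping sequons to be reported.
SEQUON = re.compile(r"(?=(N[^P][STC]))")


def find_sequons(sequence, first_residue=1):
    """Return (position of Asn, sequon) pairs.

    Positions are counted from first_residue, so a peptide that starts
    at residue 76 of the protein can be numbered in protein coordinates.
    """
    return [
        (match.start() + first_residue, match.group(1))
        for match in SEQUON.finditer(sequence)
    ]


if __name__ == "__main__":
    # The two ovoglycoprotein peptides assigned in Fig. 7, C and E.
    print(find_sequons("LNET", first_residue=89))
    print(find_sequons("SHEDEFNVT", first_residue=76))
```

Each of the two peptides contains a single sequon (NET and NVT, respectively), which is consistent with the short, hydrophilic peptide tags described above.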

In the present study, information on site specificity and glycan structure has been obtained either by the use of special software, like GlycoMod, or tandem MS. In the first strategy the experimental masses of the glycopeptides obtained by MS are matched to theoretical masses of glycan structures together with potential peptide masses from the protein. This program can be used since the protein sequence is already known from the first stage digestion, and thus the masses of the potential peptides are known. However, unambiguous assignment of the peptide and glycan composition using the mass only is highly dependent on the mass accuracy of the mass spectrometric analysis due to the large number of potential glycan structures that can occupy the site. Mass spectrometric analysis of large glycopeptides will normally result in low resolution and consequently in decreased mass accuracy in the mass spectrometric analysis and ambiguous assignment of the peptide and glycan compositions. The reduction in the size of the glycopeptide obtained using the current strategy allows an accurate, high resolution determination of the monoisotopic mass of the glycopeptides using most state-of-the-art mass spectrometers. This high accuracy allows in a single experiment correct assignment of the monosaccharide composition and the attached peptide without losing the specificity of the glycan structure and attachment site.
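The mass-matching idea can be sketched as a brute-force search over candidate peptides and monosaccharide compositions. This is only an illustration of the principle and does not reproduce GlycoMod or its scoring. The function names, the composition limits, and the restriction to hexose and N-acetylhexosamine are all assumptions made for brevity, and the candidate list contains only the three peptides named in this section.

```python
from itertools import product

# Standard monoisotopic residue masses (Da).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
HEX = 162.052824
HEXNAC = 203.079373


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER


def match_glycopeptides(observed_mz, peptides, tol_da, mode="-",
                        max_hexnac=8, max_hex=10):
    """List (peptide, n_HexNAc, n_Hex, error) combinations within tol_da.

    mode '-' treats observed_mz as (M - H)-, mode '+' as (M + H)+.
    """
    neutral = observed_mz + PROTON if mode == "-" else observed_mz - PROTON
    hits = []
    for pep in peptides:
        base = peptide_mass(pep)
        for n_hexnac, n_hex in product(range(max_hexnac + 1),
                                       range(max_hex + 1)):
            theoretical = base + n_hexnac * HEXNAC + n_hex * HEX
            error = theoretical - neutral
            if abs(error) <= tol_da:
                hits.append((pep, n_hexnac, n_hex, round(error, 3)))
    return hits


if __name__ == "__main__":
    candidates = ["LNET", "SHEDEFNVT", "EDEFNVT"]
    # Smallest glycopeptide in the 13% acetonitrile fraction (Fig. 7C).
    print(match_glycopeptides(1569.7, candidates, tol_da=0.2))
    # A deliberately exaggerated tolerance, to show how ambiguity grows.
    print(match_glycopeptides(1569.7, candidates, tol_da=15.0))
```

With the narrow tolerance only LNET carrying three N-acetylhexosamines and three hexoses matches, whereas the exaggerated tolerance also admits combinations involving the other peptides. A realistic search would include further monosaccharides and all peptides that proteinase K could generate, which makes the dependence on mass accuracy correspondingly stronger.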

When using the tandem MS strategy for assignment of the site and glycan composition, the heterogeneity in the size and nature of the peptide strongly influences the outcome of the analysis since the fragmentation depends on both components and on the size of the analyte. Large glycopeptides are normally much harder to fragment than smaller ones, and the fragmentation takes place both on the peptide and on the glycan structure. By reducing the size of the glycopeptide the fragmentation predominantly takes place on the glycan structure, providing an easily readable monosaccharide sequence and an accurate peptide mass. In addition, most tandem MS instruments do not have the capability to efficiently fragment analytes above 4000 Da.
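Reading a monosaccharide sequence from Y-type fragment ions amounts to labeling the mass differences between consecutive peaks. The sketch below illustrates this with a theoretical ladder generated in the code itself; no measured peak list is used. The function names are hypothetical, and the glycan is treated as a linear chain, which ignores the branching present in real N-glycans.

```python
HEX = 162.052824     # hexose residue mass (Da)
HEXNAC = 203.079373  # N-acetylhexosamine residue mass (Da)
DELTAS = {"Hex": HEX, "HexNAc": HEXNAC}


def y_ladder(peptide_mh, units_from_peptide):
    """Theoretical singly charged Y-type ions.

    Starts at the bare peptide (M + H)+ and adds one monosaccharide at a
    time, in the order given, up to the intact glycopeptide.
    """
    ladder = [peptide_mh]
    for unit in units_from_peptide:
        ladder.append(ladder[-1] + DELTAS[unit])
    return ladder


def read_sequence(peaks, tol_da=0.05):
    """Label each gap between consecutive peaks as Hex, HexNAc, or '?'."""
    ordered = sorted(peaks)
    labels = []
    for low, high in zip(ordered, ordered[1:]):
        gap = high - low
        label = next(
            (name for name, mass in DELTAS.items()
             if abs(gap - mass) <= tol_da),
            "?",
        )
        labels.append(label)
    return labels


if __name__ == "__main__":
    # Trimannosyl core, listed from the peptide outward.
    core = ["HexNAc", "HexNAc", "Hex", "Hex", "Hex"]
    # 1077.45 is the calculated (M + H)+ of SHEDEFNVT, used for illustration.
    ladder = y_ladder(1077.45, core)
    print([round(peak, 2) for peak in ladder])
    print(read_sequence(ladder))
```

The lowest peak in such a ladder gives the peptide mass directly, which is the quantity underlined in Fig. 7, B and E.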

One drawback with the proposed strategy is that information about the linkage type and the monosaccharide type cannot be obtained. This will require either sequential glycosidase treatment of the purified glycopeptides or studies by NMR. Both methods require considerably larger amounts of glycoprotein than used in the proposed strategy and therefore will not be applicable at sensitivity levels compatible with proteomic studies.

Using a nonspecific protease can in some cases result in difficulties in assignment of the correct peptide sequence even though the sequence of the protein is known. However, in many cases we have observed peptide fragments in the spectrum that can assist in assigning the correct sequence. In addition, if the glycopeptides are analyzed by an instrument that is capable of performing MS/MS/MS, the peptide can be selected for further fragmentation and thereby generate sequence information for correct assignment.

The method described here has only been tested for N-linked glycoproteins. However, it might be applied to O-linked glycoproteins provided that the glycan is of sufficient size to produce steric hindrance for the proteinase K action. It has previously been observed that even a single O-linked sugar residue is sufficient to block the action of carboxypeptidases (25), and it is likely that a similar effect will be observed with proteinase K. However, small O-linked glycans might contribute too little to the hydrophilicity of the glycopeptide resulting in altered retention in the two-stage separation procedure. Therefore the applicability of the method for general analyses of O-glycosylation needs to be investigated further.

The sensitivity of the method is sufficient for the initial investigation of the glycome of different types of cells or tissues in diseases or during different physiological stimuli; therefore, it will be an efficient tool in proteomic studies. Another field of application will be in the biotechnology industry where many recombinant proteins produced in cells are glycoproteins, e.g. factor VIII, erythropoietin, cytokines, or antibodies. These proteins will require that correct glycan structures are produced for optimal biological function and to avoid triggering an immune response. Our strategy will allow a fast and efficient screening for optimal host cells, growth conditions, etc. for production of glycoproteins with correct glycan profile without using time-consuming protein purification and characterization procedures.


    ACKNOWLEDGMENTS
 
We thank Professor Phillip J. Robinson for critical review of the manuscript.


    FOOTNOTES
 
Received, June 2, 2004

Published, November 18, 2004

Published, MCP Papers in Press, November 22, 2004, DOI 10.1074/mcp.M400068-MCP200

1 The abbreviations used are: MS, mass spectrometry; MALDI, matrix-assisted laser desorption/ionization; RP, reversed-phase; FA, formic acid; DHB, 2,5-dihydroxybenzoic acid; CID, collision-induced dissociation; HPLC, high performance liquid chromatography; Hex, hexose; HexNAc, N-acetylhexosamine.

* This work was supported by a grant to the Danish Biotechnology Instrument Center from the Danish Research Agency. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

‡ Supported by a Steno scholarship from The Danish Natural Science Research Council.

To whom correspondence should be addressed. Tel.: 45-6550-2475; E-mail: mrl@bmb.sdu.dk


    REFERENCES

  1. Varki, A. (1993) Biological roles of oligosaccharides: all of the theories are correct. Glycobiology 3, 97–130

  2. Moens, S., and Vanderleyden, J. (1997) Glycoproteins in prokaryotes. Arch. Microbiol. 168, 169–175

  3. Dennis, J. W., Granovsky, M., and Warren, C. E. (1999) Protein glycosylation in development and disease. Bioessays 21, 412–421

  4. Grunewald, S., Matthijs, G., and Jaeken, J. (2002) Congenital disorders of glycosylation: a review. Pediatr. Res. 52, 618–624

  5. Hofsteenge, J., Muller, D. R., de Beer, T., Loffler, A., Richter, W. J., and Vliegenthart, J. F. (1994) New type of linkage between a carbohydrate and a protein: C-glycosylation of a specific tryptophan residue in human RNase Us. Biochemistry 33, 13524–13530

  6. Stahl, B., Klabunde, T., Witzel, H., Krebs, B., Steup, M., Karas, M., and Hillenkamp, F. (1994) The oligosaccharides of the Fe(III)-Zn(II) purple acid phosphatase of the red kidney bean. Determination of the structure by a combination of matrix-assisted laser desorption/ionization mass spectrometry and selective enzymic degradation. Eur. J. Biochem. 220, 321–330

  7. Mortz, E., Sareneva, T., Haebel, S., Julkunen, I., and Roepstorff, P. (1996) Mass spectrometric characterization of glycosylated interferon-gamma variants separated by gel electrophoresis. Electrophoresis 17, 925–931

  8. Garner, B., Merry, A. H., Royle, L., Harvey, D. J., Rudd, P. M., and Thillet, J. (2001) Structural elucidation of the N- and O-glycans of human apolipoprotein(a): role of O-glycans in conferring protease resistance. J. Biol. Chem. 276, 22200–22208

  9. Wilson, N. L., Schulz, B. L., Karlsson, N. G., and Packer, N. H. (2002) Sequential analysis of N- and O-linked glycosylation of 2D-PAGE separated glycoproteins. J. Proteome Res. 1, 521–529

  10. Packer, N. H., Lawson, M. A., Jardine, D. R., and Redmond, J. W. (1998) A general approach to desalting oligosaccharides released from glycoproteins. Glycoconj. J. 15, 737–747

  11. Kuster, B., and Mann, M. (1999) 18O-Labeling of N-glycosylation sites to improve the identification of gel-separated glycoproteins using peptide mass mapping and database searching. Anal. Chem. 71, 1431–1440

  12. Kaji, H., Saito, H., Yamauchi, Y., Shinkawa, T., Taoka, M., Hirabayashi, J., Kasai, K., Takahashi, N., and Isobe, T. (2003) Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat. Biotechnol. 21, 667–672

  13. Hagglund, P., Bunkenborg, J., Elortza, F., Jensen, O. N., and Roepstorff, P. (2004) A new strategy for identification of N-glycosylated proteins and unambiguous assignment of their glycosylation sites using HILIC enrichment and partial deglycosylation. J. Proteome Res. 3, 556–566

  14. Zhang, H., Li, X. J., Martin, D. B., and Aebersold, R. (2003) Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 21, 660–666

  15. Larsen, M. R., Cordwell, S. J., and Roepstorff, P. (2002) Graphite powder as an alternative to reversed phase material for desalting and concentration of peptide mixtures prior to mass spectrometric analysis. Proteomics 2, 1277–1287

  16. Larsen, M. R., Graham, M. E., Robinson, P. J., and Roepstorff, P. (2004) Improved detection of hydrophilic phosphopeptides using graphite powder micro-columns and mass spectrometry: evidence for in vivo doubly phosphorylated dynamin I and dynamin III. Mol. Cell. Proteomics 3, 456–465

  17. Laemmli, U. K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680–685

  18. Nawrocki, A., Larsen, M. R., Podtelejnikov, A. V., Jensen, O. N., Mann, M., Roepstorff, P., Gorg, A., Fey, S. J., and Larsen, P. M. (1998) Correlation of acidic and basic carrier ampholyte and immobilized pH gradient two-dimensional gel electrophoresis patterns based on mass spectrometric protein identification. Electrophoresis 19, 1024–1035

  19. Gobom, J., Nordhoff, E., Mirgorodskaya, E., Ekman, R., and Roepstorff, P. (1999) Sample purification and preparation technique based on nano-scale reversed-phase columns for the sensitive analysis of complex peptide mixtures by matrix-assisted laser desorption/ionization mass spectrometry. J. Mass Spectrom. 34, 105–116

  20. Juhasz, P., and Martin, S. A. (1997) The utility of nonspecific proteases in the characterization of glycoproteins by high-resolution time-of-flight mass spectrometry. Int. J. Mass Spectrom. Ion Process. 169/170, 217–230

  21. An, H. J., Peavy, T. R., Hedrick, J. L., and Lebrilla, C. B. (2003) Determination of N-glycosylation sites and site heterogeneity in glycoproteins. Anal. Chem. 75, 5628–5637

  22. Harvey, D. J., Wing, D. R., Kuster, B., and Wilson, I. B. (2000) Composition of N-linked carbohydrates from ovalbumin and co-purified glycoproteins. J. Am. Soc. Mass Spectrom. 11, 564–571

  23. Domon, B., and Costello, C. (1988) Carbohydrate fragmentation rules/nomenclature. Glycoconj. J. 5, 397–409

  24. Mortz, E., Sareneva, T., Julkunen, I., and Roepstorff, P. (1996) Does matrix-assisted laser desorption/ionization mass spectrometry allow analysis of carbohydrate heterogeneity in glycoproteins? A study of natural human interferon-γ. J. Mass Spectrom. 31, 1109–1118

  25. Mirgorodskaya, E., Fierobe, H. P., Svensson, B., and Roepstorff, P. (1999) Mass spectrometric identification of a stable catalytic cysteinesulfinic acid residue in an enzymatically active chemically modified glucoamylase mutant. J. Mass Spectrom. 34, 952–957