Characterization of the Low Molecular Weight Human Serum Proteome
Radhakrishna S. Tirumalai,
King C. Chan,
DaRue A. Prieto,
Haleem J. Issaq,
Thomas P. Conrads and
Timothy D. Veenstra
From the SAIC-Frederick Inc., Laboratory of Proteomics and Analytical Technologies, Mass Spectrometry Center, National Cancer Institute at Frederick, Frederick, MD 21702-1201
ABSTRACT
Serum potentially carries an archive of important histological information whose determination could serve to improve early disease detection. The analysis of serum, however, is analytically challenging due to the high dynamic concentration range of constituent protein/peptide species, necessitating extensive fractionation prior to mass spectrometric analyses. The low molecular weight (LMW) serum proteome is that protein/peptide fraction from which high molecular weight proteins, such as albumin, immunoglobulins, transferrin, and lipoproteins, have been removed. This LMW fraction is made up of several classes of physiologically important proteins such as cytokines, chemokines, peptide hormones, as well as proteolytic fragments of larger proteins. Centrifugal ultrafiltration of serum was used to remove the large constituent proteins resulting in the enrichment of the LMW proteins/peptides. Because albumin is known to bind and transport small molecules and peptides within the circulatory system, the centrifugal ultrafiltration was conducted under solvent conditions effecting the disruption of protein-protein interactions. The LMW serum proteome sample was digested with trypsin, fractionated by strong cation exchange chromatography, and analyzed by microcapillary reversed-phase liquid chromatography coupled on-line with electrospray ionization tandem mass spectrometry. Analysis of the tandem mass spectra resulted in the identification of over 340 human serum proteins; however, not a single peptide from serum albumin was observed. The large number of proteins identified demonstrates the efficacy of this method for the removal of large abundant proteins and the enrichment of the LMW serum proteome.
A major goal of biomedical research is the determination of biomarkers whose measurement would effectively distinguish the onset of a defined disease state. For a biomarker (or set of biomarkers) to be clinically valuable it must not only be histologically specific but readily obtainable from the patient. While urine is widely used in diagnostic medicine, serum is potentially the most valuable specimen for biomarker elucidation (1). Because serum constantly perfuses tissues, it might be expected that the onset or presence of disease may be determined by measuring the altered presence or abundance of the constituent molecular species in serum. For example, increased serum levels of prostate-specific antigen (2) and CA125 (3) are routinely used for the detection of cancer in the prostate and ovary, respectively.
Serum is attracting increasing interest in proteomics, which is currently striving to broadly characterize its protein constituents. The expectation is that the characterization of the thousands of individual serum proteins/peptides will enable the discovery of an increasing number of reliable disease biomarkers. At first glance, serum presents many beneficial attributes for proteomic investigation because it has a high protein content (i.e. 60–80 mg/ml), with many of these proteins being secreted and shed from cells and tissues (4, 5). The protein content of serum, however, is dominated by a handful of proteins such as albumin, transferrin, haptoglobin, immunoglobulins, and lipoproteins (6). Unfortunately, serum proteins are present across an extraordinary dynamic range of concentration that is likely to span more than 10 orders of magnitude, which separates albumin from the rarest proteins now measured clinically (7). This large dynamic range exceeds the analytical capabilities of traditional proteomic methods, making the detection of lower abundance serum proteins extremely challenging. The reduction of sample complexity (e.g. by depleting the abundant proteins) is thus an essential first step in the analysis of the serum proteome.
Affinity methods (e.g. anti-human serum albumin antibody columns, protein A/G) have been developed to remove abundant proteins such as albumin and immunoglobulins from serum prior to mass spectrometric analysis (8, 9). One of the fundamental oversights of serum protein depletion methodologies, however, is that many important low molecular weight (LMW) proteins or peptides can be concomitantly removed by this sample preparation process as well. It is well known that albumin acts as a carrier and transport protein within the blood and binds physiologically important species such as hormones, cytokines, and lipoproteins (10). Because the affinity methods used to deplete highly abundant serum proteins target native proteins under nondenaturing conditions, these methods are also likely to remove the proteins or peptides bound to the target protein. Hence, an ideal fractionation/depletion method would completely remove highly abundant proteins but leave behind the peptides and proteins bound to them.
Low molecular weight human serum proteins, peptides, and other small components have been associated with pathological conditions such as cancer (11), diabetes (12), and cardiovascular and infectious diseases (13), and likely reflect the state of the underlying cell or tissue. A recent study that combined surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS) with advanced bioinformatic techniques to identify proteomic patterns in serum found five species with molecular masses below 2500 Da that allowed neoplastic disease to be distinguished from non-neoplastic disease within the ovary (11). These results suggest that the LMW serum proteome may contain an unexplored archive of histological information and provide useful biomarkers for disease detection. The identification of these low abundance protein biomarkers, however, is generally hampered by the presence of the more abundant proteins in serum.
A simple method for the removal of high molecular weight species from serum without the concomitant loss of LMW components has been developed in this study. This method employs centrifugal ultrafiltration using solvent conditions that serve to disrupt protein-protein interactions so that LMW components that may be bound to larger species are released and are free to pass through the molecular weight cutoff (MWCO) membrane. The LMW serum proteome was digested with trypsin, and the peptide mixture was initially fractionated by strong cation exchange (SCX) chromatography. Each of these fractions was further analyzed by microcapillary reversed-phase liquid chromatography coupled on-line with electrospray ionization tandem mass spectrometry (µLC-MS/MS). Complete bioinformatic analysis of the MS/MS spectra resulted in the confident identification of 341 human serum proteins, and remarkably no peptides originating from human serum albumin were identified in any of the fractions analyzed. The large number of proteins identified demonstrates the efficacy of our enrichment method combined with multi-dimensional fractionation and µLC-MS/MS analysis for the characterization of the LMW serum proteome. This new method provides a rapid and robust procedure to enrich for those components that have been shown to be discriminating factors in the early diagnosis of such diseases as ovarian cancer.
EXPERIMENTAL PROCEDURES
Materials
Centriplus centrifugal filters with a MWCO of 30,000 were purchased from Millipore (Bedford, MA); standard human serum was obtained from the National Institute of Standards and Technology (Gaithersburg, MD); Bis-Tris and Tricine gels were obtained from Invitrogen Corporation (Carlsbad, CA); sequencing grade-modified trypsin was obtained from Promega (Madison, WI); ammonium bicarbonate (NH4HCO3) was obtained from Sigma (St. Louis, MO); formic and trifluoroacetic acids were obtained from Fluka (Milwaukee, WI); and high pressure liquid chromatography-grade methanol and acetonitrile were obtained from EM Science (Darmstadt, Germany). Bond Elut C-18 reversed-phase solid-phase extraction columns were purchased from Varian (Walnut Creek, CA).
Centrifugal Serum Ultrafiltration
The centrifugal filter membranes were rinsed and used according to the manufacturer's specifications. Ten milliliters of human serum was diluted by the addition of 50 ml of 25 mM NH4HCO3, pH 8.2, 20% (v/v) acetonitrile, and applied onto a Centriplus centrifugal concentrator membrane (MWCO 30,000). The sample was centrifuged (Avanti J30I; Beckman Coulter, Fullerton, CA) at 3,000 x g in a JA-30.50 fixed-angle rotor (Beckman Coulter) until >90% of the input serum had passed through the membrane. The filtrate was lyophilized to dryness and resuspended in 1 ml of 25 mM NH4HCO3, pH 8.2. An aliquot of the resuspended filtrate was analyzed by SDS-PAGE using a 4–12% Bis-Tris gel or a 10–20% gradient Tricine gel. The remainder of the resuspended LMW filtrate was denatured by boiling for 5 min. After cooling to room temperature, the sample was reduced using 5 mM dithiothreitol with heating at 56 °C for 1 h followed by alkylation using 10 mM iodoacetamide at 25 °C for 1 h. Trypsin was added to the reduced and alkylated LMW filtrate at a protein-to-enzyme ratio of 50:1, followed by incubation overnight at 37 °C. While no chemical denaturant was added to the sample, the long incubation time should allow for a sufficiently high digestion efficiency that produces a complex mixture of peptides by which to characterize proteins within the LMW serum fraction (14). The digestion was stopped by acidifying with trifluoroacetic acid to a final concentration of 0.1%, and the sample was desalted using a Bond Elut C-18 reversed-phase solid-phase extraction column as per the manufacturer's protocol. The eluate from the solid-phase extraction column was lyophilized to dryness and resuspended in 500 µl of distilled and deionized water.
Based on the SDS-PAGE data (Figs. 2 and 3), all or most of the protein in the filtrate corresponds to the LMW fraction. A total of 1.07 mg of protein was recovered from the serum sample that had been ultrafiltered in the presence of acetonitrile. Assuming that the LMW fraction represents 1% of the total serum protein mass, an absolute yield of 6–8 mg could be expected when starting with 10 ml of serum. Therefore, the amount of recovered material represents a yield of 15–20%.
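The yield estimate can be retraced with a few lines of arithmetic. The sketch below assumes the 60–80 mg/ml total serum protein range quoted in the introduction and the 1% LMW share assumed in the text; the function name is ours and purely illustrative.

```python
def expected_lmw_mass_mg(serum_volume_ml, total_protein_mg_per_ml, lmw_fraction=0.01):
    """Expected LMW protein mass (mg), assuming lmw_fraction of all serum protein is LMW."""
    return serum_volume_ml * total_protein_mg_per_ml * lmw_fraction


recovered_mg = 1.07  # protein recovered in the acetonitrile ultrafiltrate (from the text)

# Lower and upper bounds of the expected LMW mass for 10 ml of serum
expected_low_mg = expected_lmw_mass_mg(10, 60)
expected_high_mg = expected_lmw_mass_mg(10, 80)

# Percent yield at each bound: dividing by the larger expectation gives the smaller yield
yield_min_pct = 100 * recovered_mg / expected_high_mg
yield_max_pct = 100 * recovered_mg / expected_low_mg

print(f"expected LMW mass: {expected_low_mg:.1f}-{expected_high_mg:.1f} mg")
print(f"yield: {yield_min_pct:.1f}-{yield_max_pct:.1f}%")
```

Under these assumptions the computed yield falls at roughly 13–18%, in line with the approximate 15–20% figure reported.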

FIG. 2. SDS-PAGE analysis of human serum before and after ultrafiltration. Serum was diluted 1:5 with 20 mM ammonium bicarbonate, 20% (v/v) acetonitrile, pH 8.2, and subjected to centrifugal ultrafiltration using 30,000 MWCO membranes. Aliquots of serum before and after ultrafiltration were subjected to SDS-PAGE and stained with Coomassie blue. Lane 1, molecular weight markers; lane 2, human serum albumin; lanes 3 and 4, unfiltered serum; lanes 5 and 6, serum filtrate.
FIG. 3. SDS-PAGE analysis of human serum ultrafiltered in denaturing and nondenaturing buffer. Serum was diluted 1:5 with either 20 mM ammonium bicarbonate, pH 8.2 (nondenaturing), or 20 mM ammonium bicarbonate, 20% (v/v) acetonitrile, pH 8.2 (denaturing), and subjected to centrifugal ultrafiltration. Aliquots of the serum ultrafiltrates were subjected to SDS-PAGE and stained with Coomassie blue. Lanes 1 and 6, molecular weight markers; lanes 2 and 3, serum ultrafiltrate (denaturing conditions); lanes 4 and 5, serum ultrafiltrate (nondenaturing conditions).
Surface Enhanced Laser Desorption/Ionization Analysis
Immobilized metal affinity capture (IMAC3) ProteinChip Arrays (Ciphergen Biosystems Inc., Palo Alto, CA) were activated with 10 mM HCl, washed with double distilled water and equilibrated with 50 mM sodium acetate, pH 4.5, containing 0.1% Triton X-100. The immobilized metal used was copper (Cu2+). The amino acid side chain with the highest chelation affinity is histidine; however, any protein with an accessible electron-donor group (e.g. cysteine, tryptophan) may bind to an IMAC surface. Aliquots of raw serum, serum ultrafiltered using nondenaturing conditions (i.e. 25 mM NH4HCO3, pH 8.2), and serum ultrafiltered using denaturing conditions (i.e. 25 mM NH4HCO3, pH 8.2, 20% (v/v) acetonitrile) were diluted 1:1 in 9 M urea, 2% 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonic acid, followed by a 1:5 dilution in 50 mM sodium acetate, pH 4.5, containing 0.1% Triton X-100. The diluted serum samples were then added to separate spots on the IMAC3 array surfaces using a bioprocessor (Ciphergen Biosystems Inc.). The samples were incubated for 1.5 h at room temperature with gentle agitation. The ProteinChip arrays were washed three times with 50 mM sodium acetate, pH 4.5, containing 0.1% Triton X-100, followed by a final double distilled water wash. The bioprocessor was subsequently removed and the ProteinChip arrays air-dried. Two 1-µl aliquots of a 30% α-cyano-4-hydroxycinnamic acid solution in 50% acetonitrile, 0.5% trifluoroacetic acid were added to each spot of the ProteinChip array and air-dried. ProteinChip arrays were placed in the Protein Biological System II TOF mass spectrometer (PBS-II, Ciphergen Biosystems Inc.), and mass spectra were recorded using the following settings: 91 laser shots/spectrum collected in positive ionization mode, laser intensity 175, detector sensitivity 8, detector voltage 1900, and a focus mass of 5,000 Da. The PBS-II TOF mass spectrometer was externally calibrated using the "All-In-One" peptide mass standard (Ciphergen Biosystems Inc.).
Strong Cation Exchange Chromatography
The digested LMW serum sample that had been ultrafiltered in the presence of acetonitrile was fractionated by SCX chromatography using an Agilent Model 1100 µLC system (Agilent Technologies, Palo Alto, CA) coupled on-line to an in-house manufactured laser-induced fluorescence detector. The separation was performed using a 150 mm x 1 mm Polysulfoethyl A (PolyLC, Columbia, MD) column at a flow rate of 50 µl/min. The column was equilibrated with solvent A (0.1% formic acid/25% acetonitrile (v/v)). Seventy microliters (150 µg) of the sample solution was loaded on the column and eluted with solvent B (0.5 M ammonium formate, pH 3.0/25% (v/v) acetonitrile) over 96 min using the following gradient conditions: 100% A for 5 min, 15% B at 45 min, 70% B at 80 min, and 100% B at 90 min, followed by a 6-min hold. Fractions were collected every minute. Every five fractions were pooled, lyophilized to dryness, and resuspended in 100 µl of 0.1% (v/v) formic acid prior to MS analysis.
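The SCX gradient can be written as a table of breakpoints and interpolated to give the solvent composition at any time point. The sketch below takes the breakpoints from the text and assumes linear ramps between them, which the text does not state explicitly; the names `GRADIENT` and `percent_b` are illustrative.

```python
# (time in min, % solvent B) breakpoints as described in the text
GRADIENT = [(0, 0), (5, 0), (45, 15), (80, 70), (90, 100), (96, 100)]


def percent_b(t_min):
    """Percent solvent B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-96 min gradient program")
    # Walk consecutive breakpoint pairs and interpolate within the matching segment
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 25, 45, 85, 96):
    print(f"{t:>2} min: {percent_b(t):.1f}% B")
```

Multiplying the returned percentage by 0.5 M gives the nominal ammonium formate concentration at that point in the run, again under the linear-ramp assumption.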
Microcapillary LC-MS/MS Analysis
Microcapillary reversed-phase LC was performed using an Agilent 1100 capillary LC system (Agilent Technologies) coupled on-line to an ion-trap mass spectrometer (IT-MS) (LCQ DecaXP, ThermoFinnigan, San Jose, CA). Reversed-phase separations of each sample were performed using 75 µm i.d. x 10 cm long fused silica capillary columns (Polymicro Technologies, Phoenix, AZ) that were slurry packed in house with 5-µm, 300-Å pore size Jupiter C-18 stationary phase (Phenomenex, Torrance, CA). After injecting 5 µl of sample, the column was washed for 20 min with 95% solvent A (0.1% formic acid in water, v/v), and the peptides were eluted using a linear gradient of 5% solvent B (0.1% formic acid in 100% acetonitrile, v/v) to 85% solvent B in 100 min at a constant flow rate of 0.5 µl/min.
The IT-MS was operated in a data-dependent mode in which each full MS scan was followed by three MS/MS scans where the three most abundant peptide molecular ions were dynamically selected for collision-induced dissociation (CID) using a normalized collision energy of 38%. The temperature of the heated capillary and electrospray voltage were 180 °C and 1.8 kV, respectively.
Data Processing and Analysis
Tandem MS spectra from the µLC-MS/MS analyses were searched against the human proteomic database using SEQUEST operating on a Beowulf cluster (ThermoFinnigan, San Jose, CA) (15). For a peptide to be considered a legitimate identification it had to achieve the charge state and proteolytic cleavage-dependent cross correlation (Xcorr) scores shown in Table I (9). A minimum delta correlation (DelCn) of 0.1 was required for an identification to be considered legitimate. To filter out false positives, all of the MS/MS spectra were searched against the human FASTA database, without removing any of the viral proteins from the database. These viral proteins account for over 100,000 unique entries within this database, compared with 21,000 human entries. Any MS/MS spectrum that was identified as originating from a viral protein was subsequently removed as a false positive, because the standard serum sample used in this study is known to be at least HIV and hepatitis free. The inclusion of these viral sequences in the database has the same effect as searching the data against databases from other organisms (9) or other approaches such as searching against reversed databases (16). Additionally, the tandem MS spectra that identified human proteins based on a single peptide were manually assessed for acceptable signal-to-noise and the presence of at least three consecutive b or y ion fragments.
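The filtering logic described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the `PSM` record, the cleavage labels, and the function names are hypothetical, and the Xcorr cutoffs in the example are placeholders because the Table I values are not reproduced here. Only the DelCn minimum of 0.1, the removal of viral hits, and the flagging of single-peptide proteins for manual review come from the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PSM:
    """One peptide-spectrum match (hypothetical record layout)."""
    protein: str
    peptide: str
    charge: int
    cleavage: str   # illustrative labels, e.g. "tryptic" or "partial"
    xcorr: float
    delcn: float
    is_viral: bool


def filter_psms(psms, xcorr_min, delcn_min=0.1):
    """Keep non-viral matches that pass the charge/cleavage-dependent Xcorr and DelCn cutoffs."""
    kept = []
    for p in psms:
        if p.is_viral:
            continue  # viral hits are treated as false positives
        cutoff = xcorr_min.get((p.charge, p.cleavage))
        if cutoff is not None and p.xcorr >= cutoff and p.delcn >= delcn_min:
            kept.append(p)
    return kept


def single_peptide_proteins(psms):
    """Proteins supported by only one unique peptide (candidates for manual inspection)."""
    peptides = defaultdict(set)
    for p in psms:
        peptides[p.protein].add(p.peptide)
    return sorted(name for name, peps in peptides.items() if len(peps) == 1)


# PLACEHOLDER cutoffs for illustration only; substitute the Table I values
example_cutoffs = {(2, "tryptic"): 2.0, (2, "partial"): 3.0}

example_psms = [
    PSM("protein_A", "PEPTIDEK", 2, "tryptic", 2.5, 0.20, False),
    PSM("protein_A", "ANOTHERK", 2, "tryptic", 2.4, 0.15, False),
    PSM("protein_B", "SINGLEK", 2, "partial", 3.2, 0.12, False),
    PSM("protein_C", "LOWDELCNK", 2, "tryptic", 2.6, 0.05, False),
    PSM("viral_X", "VIRALPEPK", 2, "tryptic", 4.0, 0.30, True),
]

accepted = filter_psms(example_psms, example_cutoffs)
needs_review = single_peptide_proteins(accepted)
print([p.peptide for p in accepted])
print(needs_review)
```

Keeping the cutoffs in a dictionary keyed by charge state and cleavage type mirrors the way the criteria are said to depend on both; the manual checks on signal-to-noise and consecutive b or y ions are left outside the sketch.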
TABLE I. Filter parameters used in the SEQUEST analysis of the MS/MS spectra generated for peptides within the LMW serum proteome
RESULTS
Albumin Depletion
Complex biological samples such as serum contain thousands of proteins and peptides that are present in a large dynamic concentration range from the common highly abundant proteins (such as albumin) to the extremely low abundance proteins (such as the vasoconstrictor peptide endothelin-1) (17). A major contributing factor to the analytical challenge of characterizing the serum proteome is that a single protein, albumin, comprises 50% of the protein content (Fig. 1; www.plasmaproteome.org). Indeed, only 10 proteins constitute 90% of the protein content of serum. Of the remaining 10%, 12 proteins make up 90% of this remaining total. In fact, only 1% of the entire protein content of serum is made up of proteins that are considered to be in low abundance and of great interest in proteomic studies in search of potential biomarkers. While the depletion of all 22 proteins considered to be highly abundant represents a difficult proposition, simply removing albumin would have a significant impact on the ability to characterize the serum proteome by MS.
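The abundance figures quoted here fit together as a short calculation. The sketch below uses exact fractions and only the percentages given in the text; the variable names are illustrative.

```python
from fractions import Fraction

top10_share = Fraction(90, 100)                        # 10 proteins: 90% of serum protein
next12_share = Fraction(90, 100) * (1 - top10_share)   # 12 proteins: 90% of the remaining 10%
top22_share = top10_share + next12_share               # combined share of the 22 abundant proteins
low_abundance_share = 1 - top22_share                  # everything else

print(f"next 12 proteins: {float(next12_share):.0%}")
print(f"top 22 proteins:  {float(top22_share):.0%}")
print(f"low abundance:    {float(low_abundance_share):.0%}")
```

The 12 proteins therefore contribute 9% of the total, so the 22 abundant proteins together account for 99% and the low abundance proteins for the remaining 1%, matching the caption of Fig. 1.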

FIG. 1. Pie chart representing the relative contribution of proteins within plasma. Twenty-two proteins constitute 99% of the protein content of plasma.
Several prefractionation approaches employing chromatographic adsorbents and immunoaffinity methods have been used to remove albumin (8, 18, 19). Many LMW proteins are poorly recovered, however, as albumin is known to act as a carrier and transport protein within blood and therefore is likely to bind many species of interest such as peptide hormones, cytokines, and chemokines (10). We have employed centrifugal ultrafiltration and have significantly depleted the large, highly abundant proteins such as albumin and enriched for LMW proteins. Taking into consideration the role of albumin as a transport protein in blood, centrifugal ultrafiltration was conducted using a buffer containing 20% acetonitrile to disrupt any potential protein-protein/peptide interactions. The serum samples were analyzed by SDS-PAGE before and after ultrafiltration. A significant depletion of albumin was seen in the ultrafiltrate as shown in Fig. 2. Indeed, no albumin could be detected by Coomassie staining (Fig. 2, lanes 5 and 6).
Low Molecular Weight Serum Proteome Enrichment
Because the depletion of albumin by ultrafiltration has the potential to concomitantly remove LMW proteins/peptides through noncovalent interactions with albumin, it is important to perform the ultrafiltration under buffer conditions that disrupt these possible associations. To ensure that the buffer conditions used above hindered potential interactions between LMW proteins/peptides and albumin, we examined the effect of omitting acetonitrile from the diluted serum sample prior to ultrafiltration. An aliquot of serum was diluted with 25 mM NH4HCO3, pH 8.2, to which acetonitrile had been added to a final concentration of 20% (v/v), while a second aliquot of serum was simply diluted with 25 mM NH4HCO3, pH 8.2. Both samples were ultrafiltered using YM30 filters and the ultrafiltrate was analyzed by SDS-PAGE. While the presence of acetonitrile had no effect on the ability to deplete albumin, it did have a drastic effect on the enrichment of LMW proteins, as shown in Fig. 3. The addition of acetonitrile (lanes 2 and 3) results in improved recovery of LMW species when compared with the sample to which no acetonitrile had been added (lanes 4 and 5). Similar SDS-PAGE patterns were seen under conditions wherein serum was ultrafiltered using 5 M urea in place of 20% acetonitrile (data not shown). This result suggests that the increase in recovery of the LMW fraction is a result of denaturation of the larger, more abundant proteins rather than an effect specific to acetonitrile.
A faint band was observed in lanes 2 and 3 of this gel. The experiment was repeated and these bands were excised from the gel, in-gel digested with trypsin, and the resultant peptides analyzed by µLC-MS/MS. This band was subsequently identified as the heavy chain of immunoglobulin G (molecular mass 55 kDa) via database searching of the resulting MS/MS spectra. These ultrafiltered serum samples were also analyzed by SELDI-TOF MS (20). The SELDI-TOF MS spectra of the serum samples that were ultrafiltered in the presence and absence of acetonitrile are shown in Fig. 4. The mass spectrum of the sample ultrafiltered in the presence of acetonitrile shows many more peaks than that obtained when serum is filtered without it. Compared with the SDS-PAGE results, however, the SELDI-TOF MS spectrum of the ultrafiltered serum shows few peaks above 6 kDa. While the reason for this discrepancy is not clear, we believe it may be due to ion suppression effects, as in our experience, highly abundant serum proteins that have high molecular weights do not produce intense signals in SELDI analysis. This phenomenon is independent of the protein chip surface used. Regardless, the SELDI-TOF MS results are consistent with the SDS-PAGE results in that both suggest the presence of acetonitrile in the filtration buffer is critical for enrichment of LMW serum proteins.

FIG. 4. SELDI-TOF MS analysis of human serum ultrafiltered in denaturing and nondenaturing buffer. Serum was diluted 1:5 with either 20 mM ammonium bicarbonate, pH 8.2 (nondenaturing), or 20 mM ammonium bicarbonate, 20% (v/v) acetonitrile, pH 8.2 (denaturing), and subjected to centrifugal ultrafiltration. Aliquots of the unfiltered serum and the ultrafiltrates were analyzed using SELDI-TOF MS. A, SELDI-TOF MS spectrum of serum; B, SELDI-TOF MS spectrum of serum ultrafiltered in the presence of 20 mM ammonium bicarbonate, pH 8.2 (nondenaturing); C, SELDI-TOF MS spectrum of serum ultrafiltered in the presence of 20 mM ammonium bicarbonate, 20% (v/v) acetonitrile, pH 8.2 (denaturing).
Proteins Identified In Serum
Although two-dimensional PAGE methods are widely used to characterize complex protein mixtures, proteolytic digestion and the combination of peptide separation by chromatography coupled on-line with tandem MS are ideally suited for the detection of small peptides and proteins. Importantly, multidimensional chromatographic fractionation decreases the complexity of the samples being analyzed resulting in increased peak capacity and hence relaxes the dynamic range constraint of the overall measurements. Therefore, to reduce the complexity of the individual samples that were ultimately analyzed by µLC-MS/MS, peptides derived from human serum ultrafiltrate were separated in a first dimension on a Polysulfoethyl A SCX column and 96 fractions were collected (Fig. 5). Every five fractions were pooled to give 20 subfractions that were subsequently analyzed by µLC-MS/MS.
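The pooling scheme is simple to reproduce. The sketch below groups 96 one-minute fractions into consecutive pools of five; it assumes the leftover 96th fraction forms its own pool, which the text does not spell out, and the function name is illustrative.

```python
def pool_fractions(n_fractions, pool_size):
    """Group fraction numbers 1..n_fractions into consecutive pools of pool_size."""
    fractions = list(range(1, n_fractions + 1))
    return [fractions[i:i + pool_size] for i in range(0, n_fractions, pool_size)]


pools = pool_fractions(96, 5)
print(len(pools))   # number of subfractions submitted to LC-MS/MS
print(pools[0])     # first pool
print(pools[-1])    # last pool, holding the leftover fraction
```

With 96 fractions and a pool size of five this gives 19 full pools plus one partial pool, consistent with the 20 subfractions reported.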
The MS/MS spectra from the µLC-MS/MS analyses were searched against the human proteomic database using SEQUEST (ThermoFinnigan), employing the parameters shown in Table I. Examples of the MS/MS spectra that resulted in positive identifications are shown in Fig. 6. Only one of the three spectra shown matched to peptides that possess tryptic cleavage sites at both the amino and carboxyl termini. This analysis resulted in the identification of 341 unique proteins in the human serum LMW proteome. The identified proteins are listed in the supplemental data table, which shows the identified protein, the number of times a peptide originating from that protein was identified, the number of unique peptides identified for that protein, as well as the peptide with the highest Xcorr score. The efficacy of the ultrafiltration method in depleting albumin, for example, is underscored by the fact that not a single peptide from this protein was identified in the entire analysis of the µLC-MS/MS data. The proteins identified in our study arise from a wide range of functional classes, as shown in Fig. 7. For example, many proteins anticipated to be present in serum, such as common circulatory proteins, coagulation and complement factors, transport and binding proteins, cytokines, growth factors, and hormones, were identified. A number of proteins not commonly associated with serum, such as transcription factors, nuclear proteins, channels, and receptors, were also identified, substantiating the observation that cell contents may be released into the bloodstream during necrosis, apoptosis, and hemolysis (9).

FIG. 6. Tandem MS spectra of selected peptides identified in the proteome analysis of the LMW proteome.
FIG. 7. Pie chart representing the relative numbers of proteins identified within the LMW serum proteome.
DISCUSSION
Low abundance proteins make up about 1% of the entire human serum proteome, with the remaining 99% comprising only 22 proteins. It is therefore imperative to deplete the abundant proteins as an essential first step in the characterization of serum by MS analyses. Prefractionation approaches employing chromatographic adsorbents (e.g. anti-HSA antibodies, Cibacron Blue) have been used to remove abundant proteins such as albumin; however, these methods likely result in the removal of LMW species bound to albumin (10). We used centrifugal ultrafiltration employing Centriplus 30 (YM30) membrane ultrafilters to deplete abundant proteins such as albumin and enrich for LMW proteins. The method described here allows for the rapid and efficient removal of albumin and other highly abundant proteins while minimizing the concomitant loss of LMW components potentially bound to highly abundant proteins. An earlier study by Georgiou et al. (21) reported that ultrafiltration failed to remove albumin and other high molecular weight proteins from human plasma. In our study the filtrate showed a radically different profile than the serum that was applied, with no detectable albumin in the filtrate. The filtrate in our study was highly enriched for LMW proteins as shown by SDS-PAGE, SELDI-TOF MS, and µLC-MS/MS. A detailed comparison of the conditions used in our study and that in which plasma was used shows potential reasons for this discrepancy. We diluted the serum using buffer conditions designed to disrupt protein-protein interactions, for example, thereby liberating albumin-bound species and allowing them to pass through the membrane. Indeed, our results illustrate the importance of using denaturing conditions, as much less enrichment for LMW proteins is observed when the ultrafiltration is conducted under nondenaturing solvent conditions. The previous study did not dilute the plasma prior to ultrafiltration.
In addition, the centrifugation in our study was conducted at low speed (i.e. 3,000 x g), whereas Georgiou et al. conducted their filtration at 12,000 x g. At this high centrifugal force, it is possible that the integrity of the membrane was compromised, thereby allowing high molecular weight components such as albumin to pass through. The solvent conditions used in our study are also amenable to other size fractionation methods such as size exclusion chromatography, which could result in a more defined low molecular weight fraction. Regardless of which size fractionation method is used, the results presented here indicate that the ability to deplete highly abundant components, such as albumin, while enriching for LMW proteins is highly dependent on the solvent conditions used.
While one of the purposes of this study was to develop an efficient fractionation method to enable the identification of components of the LMW serum proteome, the list of proteins presented in the supplementary information reveals the presence of many proteins that possess molecular masses much greater than 30 kDa. For example, the intact version of the first protein listed in this table has a molecular mass of 309 kDa. The identification of peptides from proteins with intact masses greater than 30 kDa is primarily due to the high protease content of serum. To confirm the presence of proteins with predicted high molecular masses in the LMW serum fraction, a LMW serum filtrate was separated by SDS-PAGE and selected bands were excised and identified by µLC-MS/MS. Indeed, peptides originating from proteins with molecular masses greater than 30 kDa were identified including kininogen (47,000 Da), apolipoprotein A-IV precursor (44,000 Da), α-2-HS-glycoprotein (41,000 Da), α2-antiplasmin precursor (54,000 Da), and complement factor B precursor (84,000 Da). Therefore, it appears that the LMW proteome of serum comprises many proteolytic fragments from larger proteins and cannot be assumed to be composed solely of intact proteins.
The SEQUEST results used to identify the proteins present in the LMW fraction were filtered by applying commonly used criteria to the Xcorr and DelCn scores (22). As shown in Table I, different criteria were employed depending on the enzyme constraint used. Although the LMW serum proteome was digested with trypsin, serum contains a significant number of proteases, resulting in a number of peptides having nontryptic (e.g. chymotryptic, elastic) termini (23). Indeed, in our analysis a significant number of serum proteases were identified. Because our analysis focuses on the LMW proteome, a larger relative percentage of amino- and carboxyl-terminal peptides is expected than if the entire proteome were analyzed. Our results are consistent with those presented in the study by Adkins et al., who also identified a large number of peptides containing nontryptic termini using SCX fractionation followed by µLC-MS/MS analysis and who identified most of their proteins based on a single peptide identification (9).
The most extensive characterization and cataloging of serum proteins to date is that of Adkins et al., in which 490 proteins were identified (9). Although a Protein A/G column was used to deplete immunoglobulins prior to MS analysis, a total of 35 immunoglobulin proteins (about 7% of the total number) were identified. In contrast, only six immunoglobulin proteins were identified in our study (about 2% of the total number), thus minimizing the repeated identification of previously well characterized proteins. Also, no albumin peptides were identified, suggesting the effectiveness of centrifugal ultrafiltration for high mass and highly abundant protein depletion. Furthermore, only four peptides were identified as arising from haptoglobin or transferrin, two other highly abundant serum proteins. It should be noted that in the absence of a complete understanding of the serum proteome, much less its LMW fraction, the presence or absence of a particular protein might be a matter of conjecture.
The overlap between the proteins identified in this study and those identified in the studies by Anderson et al. (7) and Adkins et al. (9) is 11 and 16%, respectively. This low degree of overlap is not entirely surprising, as there are several differences in the sample, sample preparation, and analysis among the compared studies. The study by Anderson et al. listed proteins from plasma that had been identified via two-dimensional PAGE combined with MS analysis (7). The study of Adkins et al., while using a separation and MS analysis approach similar to ours, proceeded with fractionation only after immunoglobulin depletion (9). The ultrafiltration used in our study to enrich for the LMW proteome would result in a vastly different protein content of the serum sample being analyzed. Proteins that were identified in this study as well as in the study by Anderson et al. (7) or Adkins et al. (9) are indicated in the supplementary data table.
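One way to compute an overlap of this kind is sketched below. The sketch assumes that each study is represented as a list of protein identifiers drawn from the same naming scheme, and that overlap is expressed as the share of the reference study's proteins that also appear in the other study. The text does not state which denominator was used, so this choice, like the function name `percent_overlap`, is an assumption.

```python
from typing import Iterable


def percent_overlap(reference_ids: Iterable[str], other_ids: Iterable[str]) -> float:
    """Percentage of the reference proteins that also occur in the other list."""
    reference = set(reference_ids)
    if not reference:
        raise ValueError("reference list is empty")
    shared = reference & set(other_ids)
    return 100.0 * len(shared) / len(reference)


# Made-up identifiers, used only to show the calculation.
this_study = ["PROT_A", "PROT_B", "PROT_C", "PROT_D"]
other_study = ["PROT_B", "PROT_D", "PROT_E"]
print(percent_overlap(this_study, other_study))  # prints 50.0
```

Because the result depends on which study serves as the reference, and on how identifiers from different databases are reconciled, overlaps reported by different groups are not always directly comparable.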
While serum is one of the most difficult proteome samples to characterize, the ability to do so promises rich information regarding the histological state of a patient, and its analysis using proteomic techniques is being counted on for the discovery of reliable disease biomarkers. Fortunately, our study, as well as the one by Adkins et al., shows that the majority of proteins identified in serum are secreted or shed by cells during signaling, necrosis, apoptosis, and hemolysis. In fact, only a small proportion of the identified proteins are what would be thought of as classical "blood" proteins, such as cytokines, hormones, growth factors, and coagulation and complement factors. Some of the noteworthy proteins we have identified in serum are the products of several oncogenes (genes that normally direct cell growth) that, if altered by heredity or environmental stress, can promote uncontrolled cell growth, as in cancer. Among the particularly interesting proteins are the 29-amino acid CRISPP peptide (cancer-associated serine protease protecting peptide), isolated earlier from plasma of cancer patients (24); a proteolytic fragment of the mut S homolog that has been associated with hereditary non-polyposis colon cancer (25); glioma pathogenesis-related protein (26), which is typically overexpressed in brain tumors; the proto-oncogene c-ets-1 (27); oncogene lbc (28); and the novel oncogene DJ-1, which has been shown to interact with c-myc and also to transform NIH 3T3 cells in cooperation with ras (29). In addition, proteins such as the angiotensinogen precursor, the prostatic tumor suppressor Kangai-1 antigen (30), and the cytokines interleukin-15 and leukemia inhibitory factor (31), the latter of which is present in serum at a concentration below 10 pg/ml (32), were also identified.
 |
FOOTNOTES
|
---|
Received, April 7, 2003, and in revised form, August 12, 2003.
Published, MCP Papers in Press, August 13, 2003, DOI 10.1074/mcp.M300031-MCP200
*S This project has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Contract No. NO1-CO-12400. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. 
1 The abbreviations used are: LMW, low molecular weight; µLC, microcapillary reversed-phase liquid chromatography; SELDI, surface-enhanced laser desorption/ionization; TOF, time-of-flight; MS, mass spectrometry; MWCO, molecular weight cutoff; MS/MS, tandem mass spectrometry; IT-MS, ion-trap mass spectrometer; CID, collision-induced dissociation; SCX, strong cation exchange.
By acceptance of this article, the publisher or recipient acknowledges the right of the U.S. Government to retain a nonexclusive, royalty-free license to any copyright covering the article. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
The on-line version of this article (available at http://www.mcponline.org ) contains supplemental data.
To whom correspondence should be addressed: Biomedical Proteomics Program, SAIC-Frederick Inc., National Cancer Institute at Frederick, P.O. Box B, Frederick, MD 21702-1201. Tel.: 301-846-7286; Fax: 301-846-6037; E-mail: veenstra@ncifcrf.gov
 |
REFERENCES
|
---|
1. Ardekani, A. M., Liotta, L. A., and Petricoin, E. F., 3rd (2002) Clinical potential of proteomics in the diagnosis of ovarian cancer. Expert Rev. Mol. Diagn. 2, 312–320
2. Grossklaus, D. J., Smith, J. A., Shappell, S. B., Coffey, C. S., Chang, S. S., and Cookson, M. S. (2002) The free/total prostate-specific antigen ratio (%fPSA) is the best predictor of tumor involvement in the radical prostatectomy specimen among men with an elevated PSA. Urol. Oncol. 7, 195–198
3. Whitehouse, C., and Solomon, E. (2003) Current status of the molecular characterization of the ovarian cancer antigen CA125 and implications for its use in clinical screening. Gynecol. Oncol. 88, S152–S157
4. Sasaki, K., Sato, K., Akiyama, Y., Yanagihara, K., Oka, M., and Yamaguchi, K. (2002) Peptidomics-based approach reveals the secretion of the 29-residue COOH-terminal fragment of the putative tumor suppressor protein DMBT1 from pancreatic adenocarcinoma cell lines. Cancer Res. 62, 4894–4898
5. Kennedy, S. (2002) The role of proteomics in toxicology: identification of biomarkers of toxicity by protein expression analysis. Biomarkers 7, 269–290
6. Turner, M. W., and Hulme, B. (1970) The Plasma Proteins: An Introduction, Pitman Medical & Scientific Publishing Co., Ltd., London
7. Anderson, N. L., and Anderson, N. G. (2002) The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell. Proteomics 1, 845–867
8. Sato, A. K., Sexton, D. J., Morganelli, L. A., Cohen, E. H., Wu, Q. L., Conley, G. P., Streltsova, Z., Lee, S. W., Devlin, M., DeOliveira, D. B., Enright, J., Kent, R. B., Wescott, C. R., Ransohoff, T. C., Ley, A. C., and Ladner, R. C. (2002) Development of mammalian serum albumin affinity purification media by peptide phage display. Biotechnol. Prog. 18, 182–192
9. Adkins, J. N., Varnum, S. M., Auberry, K. J., Moore, R. J., Angell, N. H., Smith, R. D., Springer, D. L., and Pounds, J. G. (2002) Toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry. Mol. Cell. Proteomics 1, 947–955
10. Burtis, C. A., and Ashwood, E. R. (2001) Tietz Fundamentals of Clinical Chemistry, 5th Ed., W. B. Saunders Company, Philadelphia, PA
11. Petricoin, E. F., Ardekani, A. M., Hitt, B. A., Levine, P. J., Fusaro, V. A., Steinberg, S. M., Mills, G. B., Simone, C., Fishman, D. A., Kohn, E. C., and Liotta, L. A. (2002) Use of proteomic patterns in serum to identify ovarian cancer. Lancet 359, 572–577
12. Basso, D., Valerio, A., Seraglia, R., Mazza, S., Piva, M. G., Greco, E., Fogar, P., Gallo, N., Pedrazzoli, S., Tiengo, A., and Plebani, M. (2002) Putative pancreatic cancer-associated diabetogenic factor: 2030 MW peptide. Pancreas 24, 8–14
13. Rubin, R. B., and Merchant, M. (2000) A rapid protein profiling system that speeds study of cancer and other diseases. Am. Clin. Lab. 19, 28–29
14. Wierenga, S., Zocher, M. J., Mirus, M. M., Conrads, T. P., Goshe, M. B., and Veenstra, T. D. (2002) A method to evaluate tryptic digestion efficiency for high-throughput proteome analyses. Rapid Commun. Mass Spectrom. 16, 1404–1408
15. Eng, J. K., McCormack, A. L., and Yates, J. R. (1994) An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 5, 976–989
16. Peng, J., Elias, J. E., Thoreen, C. C., Licklider, L. J., and Gygi, S. P. (2003) Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. J. Proteome Res. 2, 43–50
17. Nicolaidou, P., Georgouli, H., Matsinos, Y., Psychou, F., Messaritaki, A., Gourgiotis, D., and Zeis, P. (2003) Endothelin-1 in children with acute poststreptococcal glomerulonephritis and hypertension. Pediatr. Int. 45, 35–38
18. Dockal, M., Carter, D. C., and Ruker, F. (1999) The three recombinant domains of human serum albumin. Structural characterization and ligand binding properties. J. Biol. Chem. 274, 29303–29310
19. Rothemund, D. L., Locke, V. L., Liew, A., Thomas, T. M., Wasinger, V., and Rylatt, D. B. (2003) Depletion of the highly abundant protein albumin from human plasma using the Gradiflow. Proteomics 3, 279–287
20. Issaq, H., Conrads, T. P., Prieto, D., Tirumalai, R., and Veenstra, T. D. (2003) SELDI-TOF MS for diagnostic proteomics. Anal. Chem. 75, 148A–155A
21. Georgiou, H. M., Rice, G. E., and Baker, M. S. (2001) Proteomic analysis of human plasma: failure of centrifugal ultrafiltration to remove albumin and other high molecular weight proteins. Proteomics 1, 1503–1506
22. Washburn, M. P., Wolters, D., and Yates, J. R., III (2001) Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat. Biotechnol. 19, 242–247
23. Richter, R., Schulz-Knappe, P., Schrader, M., Standker, L., Jurgens, M., Tammen, H., and Forssmann, W. G. (1999) Composition of the peptide fraction in human blood plasma: database of circulating human peptides. J. Chromatogr. B Biomed. Sci. Appl. 726, 25–35
24. Cercek, L., and Cercek, B. (1992) Cancer-associated SCM-recognition, immunedefense suppression, and serine protease protection peptide. Part I. Isolation, amino acid sequence, homology, and origin. Cancer Detect. Prev. 16, 305–319
25. Fishel, R., Lescoe, M. K., Rao, M. R., Copeland, N. G., Jenkins, N. A., Garber, J., Kane, M., and Kolodner, R. (1993) The human mutator gene homolog MSH2 and its association with hereditary nonpolyposis colon cancer. Cell 75, 1027–1038
26. Murphy, E. V., Zhang, Y., Zhu, W., and Biggs, J. (1995) The human glioma pathogenesis-related protein is structurally related to plant pathogenesis-related proteins and its gene is expressed specifically in brain tumors. Gene 159, 131–135
27. Reddy, E. S., and Rao, V. N. (1988) Structure, expression and alternative splicing of the human c-ets-1 proto-oncogene. Oncogene Res. 3, 239–246
28. Toksoz, D., and Williams, D. A. (1994) Novel human oncogene lbc detected by transfection with distinct homology regions to signal transduction products. Oncogene 9, 621–628
29. Nagakubo, D., Taira, T., Kitaura, H., Ikeda, M., Tamai, K., Iguchi-Ariga, S. M., and Ariga, H. (1997) DJ-1, a novel oncogene which transforms mouse NIH3T3 cells in cooperation with ras. Biochem. Biophys. Res. Commun. 231, 509–513
30. Dong, J. T., Lamb, P. W., Rinker-Schaeffer, C. W., Vukanovic, J., Ichikawa, T., Isaacs, J. T., and Barrett, J. C. (1995) KAI1, a metastasis suppressor gene for prostate cancer on human chromosome 11p11.2. Science 268, 884–886
31. Tomida, M., Yoshida, U., Mogi, C., Maruyama, M., Goda, H., Hatta, Y., and Inoue, K. (2001) Leukaemia inhibitory factor and interleukin 6 inhibit secretion of prolactin and growth hormone by rat pituitary MtT/SM cells. Cytokine 14, 202–207
32. Wegner, N. T., and Mershon, J. L. (2001) Evaluation of leukemia inhibitory factor as a marker of ectopic pregnancy. Am. J. Obstet. Gynecol. 184, 1074–1076