A Proteomic Analysis of Human Hemodialysis Fluid

Henrik Molina‡,§, Jakob Bunkenborg‡,§, G. Hanumanthu Reddy||, Babylakshmi Muthusamy||, Paul J. Scheel** and Akhilesh Pandey‡,‡‡

From the ‡ McKusick-Nathans Institute for Genetic Medicine and Department of Biological Chemistry and Oncology and the ** Department of Nephrology, The Johns Hopkins University, Baltimore, Maryland 21205, the § Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense 5230, Denmark, and the || Institute of Bioinformatics, International Technology Park Ltd., Bangalore 560 066, India


    ABSTRACT
 
The vascular compartment is an easily accessible compartment that provides an opportunity to measure analytes for diagnostic, prognostic, or therapeutic indications. Both serum and plasma have been analyzed extensively by proteomic approaches in an effort to catalog all proteins and polypeptides. Limitations of such approaches in obtaining a comprehensive catalog of proteins include the fact that a handful of proteins constitute over 90% of plasma protein content and that the renal glomeruli filter out proteins and polypeptides that are smaller than 66 kDa from blood. We chose to study hemodialysis fluid because it contains a higher concentration of small proteins and polypeptides and is also simultaneously depleted of the most abundant proteins present in blood. Using gel electrophoresis in combination with LC-MS/MS, we identified 292 proteins of which greater than 70% had not been previously identified from serum or plasma. More than half of the proteins identified from the hemodialysis fluid were smaller than 40 kDa. We also found 50 N-terminally acetylated peptides that allowed us to unambiguously map the N termini of mature forms of the corresponding proteins. Several identified proteins, including cytokines, were only present as predicted transcripts in data bases and thus represent novel proteins. The proteins identified in this study could serve as biomarkers in serum using more sensitive methods such as ELISA with specific antibodies.


A comprehensive analysis of human serum and plasma has proven to be difficult, especially for low molecular weight and low abundance proteins, because of the wide range of concentrations with the 10 most abundant proteins constituting almost 90% of the serum proteome by mass (1). The dynamic range issue is exacerbated for low molecular weight species because the kidneys filter away molecules with molecular mass of less than 66 kDa (2). The abundant proteins such as albumin, immunoglobulins, and transferrin hamper identification of less abundant proteins. Although several methods including ultracentrifugation (3), immunodepletion, solvent extraction/precipitation (4), and size exclusion chromatography (5) have been tried for removal of abundant components, it is still difficult to completely eliminate the interference from residual amounts of these abundant proteins.

Hemodialysis fluid has previously been used as a source of polypeptides (6) due to its higher concentration of low molecular weight components, but no comprehensive list of constituents present in the hemofiltrate has yet been published. We chose to analyze hemodialysis fluid because it is greatly reduced in the protein content, from 70 g/liter in the plasma to ~70 mg/liter. This is because the filtration cutoff used during hemodialysis results in selective depletion of proteins greater than ~60 kDa. It has been shown that the concentration of albumin in the hemodialysis fluid is 5,000-fold lower compared with its normal concentration in serum (7).

Our strategy involved one-dimensional gel electrophoresis separation and in-gel digestion of proteins followed by LC-MS/MS for identifying proteins in the hemodialysis fluid. Using this approach, we identified 292 proteins of which 205 had never been previously reported in the serum or plasma. Western blot analysis of a subset of these proteins revealed that they were also present in normal serum, indicating that the sensitivity of detection might be the major reason why the majority of these proteins have never been identified previously in serum or plasma. We were also able to identify the N termini of a number of proteins based on peptide sequences that were acetylated at their N termini. A number of semitryptic peptides that were identified in this study were most likely derived from in vivo proteolysis. We were also able to identify a number of novel proteins including several cytokines in this analysis. The lack of a major overlap between the list of proteins identified in this study and previously reported proteins in serum or plasma reflects the difficulty of identifying these components using current proteomic methods. As demonstrated by our study, it is likely that more sensitive methods such as Western blotting or ELISA will be able to detect these proteins in serum or plasma.


    EXPERIMENTAL PROCEDURES
 
Sample Collection and Preparation and Reagents—
The hemodialysis fluid was obtained from a 68-year-old white male with acute renal failure following a coronary artery bypass surgery. The patient was not known to have any other diseases. Vascular access was obtained utilizing a dual lumen catheter in the femoral position. Continuous venovenous hemodialysis was performed with a Prisma® dialysis machine (Cobe Renal Intensive Care, Lakewood, CO). The dialysis membrane was an acrylonitrile and sodium methallyl sulfonate copolymer, AN69® (Cobe Renal Intensive Care). Sterile dialysate was prepared in 5.0-liter aliquots (140 meq/liter sodium, 111 meq/liter chloride, 3.5 meq/liter calcium, 3 meq/liter lactate, 1 meq/liter magnesium, 32 meq/liter bicarbonate, 2.0 meq/liter potassium), and the dialysis was carried out with a blood flow of 180 ml/min and a dialysate flow of 1.0 liter/hr. The spent dialysate was collected in 5.0-liter aliquots and analyzed after concentrating using a 3-kDa-cutoff filter (Centricon YM3, Millipore). The protein content in the hemodialysis fluid was measured using a modified Lowry protein assay kit (Bio-Rad). The concentrated hemodialysis fluid was run on precast NuPage 4–12% bis-tris and 10–20% Tricine minigels (Invitrogen). The gels were silver-stained (8), and between 20 and 25 visible protein bands were excised. Protein gel bands were reduced with dithiothreitol (Fluka, Buch, Switzerland) and alkylated with iodoacetamide (Sigma) before digestion with trypsin (Promega, Madison, WI) in 100 mM NH4HCO3 (Fluka). Solvents for liquid chromatography included heptafluorobutyric acid (Sigma), glacial acetic acid (Fisher Scientific), and HPLC-grade acetonitrile (J. T. Baker Inc.).

Trypsin Digestion and LC-MS/MS Analysis—
The excised gel slices were digested with trypsin as follows. The gel bands were washed twice in 0.1 M NH4HCO3, washed twice in 50% acetonitrile, and subsequently cut into 2 x 2-mm pieces. The gel pieces were shrunk using 100% acetonitrile, and proteins were reduced by addition of 0.1 M dithiothreitol followed by an incubation step at 56 °C for 45 min. The washing procedure described above was repeated, and proteins were alkylated by adding 55 mM iodoacetamide and incubating for 30 min at room temperature in the dark. After an additional wash and shrinkage, 10 ng/µl trypsin in 0.1 M NH4HCO3 sufficient to cover the gel pieces was added followed by an incubation on ice for 20 min. When the gel pieces were completely rehydrated, any excess trypsin solution was removed and replaced by 0.1 M NH4HCO3, and samples were incubated overnight at 37 °C. The digestion was stopped by adding 10 µl of glacial acetic acid, and the supernatant containing the tryptic peptides was harvested. An extraction step was carried out to recover the peptides from the gel slices by adding 50% acetonitrile and incubating at room temperature for 30 min. The supernatant was harvested again and pooled. The pooled peptide extracts were dried down to ~10 µl and subjected to LC-MS/MS analysis as follows. Samples were injected onto a 5-cm C18 trap column (inner diameter, 75 µm) packed with YMC ODS-A 5–15-µm beads (Kanematsu USA Inc., New York, NY) using an autosampler (1100 microwell plate autosampler, Agilent Technologies, Palo Alto, CA). The peptides were eluted from the trap column onto an analytical 10-cm C18 column (inner diameter, 75 µm) packed with Vydac MS218 5-µm beads (Vydac, Columbia, MD) with a gradient increasing from 10% solvent B, 90% solvent A (solvent A: 0.4% acetic acid, 0.005% heptafluorobutyric acid; solvent B: 90% acetonitrile, 0.4% acetic acid, 0.005% heptafluorobutyric acid) to 45% solvent B, 55% solvent A in 30 min. 
A flow of 4 µl/min during loading and 300 nl/min during elution was delivered by a nanoflow pump (Agilent Technologies 1100 nanopump). The LC setup was connected to either a quadrupole-time-of-flight mass spectrometer (QTOF API-US, Micromass, Manchester, UK) or an ion trap mass spectrometer (LC/MSD Trap XCT, Agilent Technologies) using nanoelectrospray sources from Proxeon (Odense, Denmark).
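As a rough illustration of the elution gradient described above, the following Python sketch computes the mobile phase composition at a given time. It assumes the ramp from 10% to 45% solvent B is linear, which the text does not state explicitly; the function names are illustrative only.

```python
def percent_b(t_min, start_b=10.0, end_b=45.0, duration_min=30.0):
    """Percent solvent B at time t_min, assuming a linear ramp (an assumption)."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time lies outside the gradient window")
    return start_b + (end_b - start_b) * t_min / duration_min


def percent_acetonitrile(t_min, acn_in_b=90.0):
    """Acetonitrile content (% v/v) of the mixed mobile phase.

    Solvent B contains 90% acetonitrile and solvent A contains none,
    as given in the text.
    """
    return percent_b(t_min) * acn_in_b / 100.0


if __name__ == "__main__":
    for t in (0, 15, 30):
        print(t, percent_b(t), percent_acetonitrile(t))
```

Under this assumption the acetonitrile content rises from 9% at the start of the gradient to 40.5% at 30 min.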

Western Blotting—
For Western blotting experiments, serum from a healthy person was first depleted of albumin, IgG, IgA, transferrin, haptoglobin, and antitrypsin using an Agilent Technologies multiple affinity removal kit. 20 µl of serum were diluted to 100 µl in Buffer A (Buffer A and Buffer B as supplied in the multiple affinity removal kit by Agilent Technologies) prior to loading onto the column at 250 µl/min in Buffer A. The elution from the column was monitored at 280 nm, and the depleted serum was collected in the interval from 2 to 4 min. Cleaning of the column was carried out by eluting the bound proteins as follows. 10 min after the injection, Buffer A was exchanged with Buffer B, and the flow rate was simultaneously increased to 1,000 µl/min. Following this, Buffer B at 1,000 µl/min was continued for 18 min after which the conditions were returned to the initial loading conditions. After conditioning the column under loading conditions for 5 min, the system was ready for a new depletion process.

Depleted serum was resolved by SDS-PAGE, and the proteins were transferred onto a nitrocellulose membrane. The membrane was blocked at 4 °C overnight with 5% BSA in phosphate-buffered saline containing 0.1% Tween 20. The membrane was incubated with the relevant antibodies for 2 h, washed, and incubated with horseradish peroxidase-conjugated secondary antibody for 1 h. The proteins were visualized using enhanced chemiluminescence detection according to the manufacturer’s instructions (Amersham Biosciences). The sources of primary antibodies were as follows: Cathepsin D, connective tissue growth factor, Galectin 3, and Lipocalin 2 (R&D Systems, Inc. Minneapolis, MN); Nucleophosmin 1 (Zymed Laboratories Inc.); α-defensins 1–3 (BD Biosciences); Cathepsin H (Serotec, Oxford, UK); and Cofilin 2 (Upstate Biotechnology, Lake Placid, NY). Conjugated secondary anti-mouse and anti-rabbit antibodies were from Amersham Biosciences, and anti-goat antibody was from Santa Cruz Biotechnology (Santa Cruz, CA).

Data Base Search and Analysis—
The mass spectrometry data files from individual LC-MS/MS experiments were merged and then searched against the RefSeq data base (human: 27,975 entries; July 28, 2004) using Mascot (Matrix Sciences Ltd., London, UK). The search parameters were as follows: mass accuracy of the monoisotopic precursor and peptide fragments was set to 1.5 and 0.5 Da, respectively, for the data acquired on the ion trap mass spectrometer and 0.2 and 0.2 Da, respectively, for the data acquired on the quadrupole-time-of-flight mass spectrometer. The following variable modifications were permitted: oxidation of methionine, histidine, and tryptophan residues; N-terminal acetylation of proteins; and cyclization of N-terminal glutamine. Two missed tryptic cleavages were allowed. Additional searches using data acquired on the quadrupole-time-of-flight mass spectrometer were performed with the following criteria: 1) semitryptic constraints, 2) N-acetylhexosamine modification of asparagine residues, and 3) hydroxylation of proline residues. For validation purposes, the retrieved peptide sequences were divided into three groups as follows: 1) peptides with a low score (quadrupole-time-of-flight data, <25; ion trap data, <30), 2) peptides with intermediate to high scores (quadrupole-time-of-flight data, >25; ion trap data, >30), and 3) proteins identified with only one intermediate or high scoring peptide. All of the low scoring peptides were discarded. All intermediate and high scoring peptides, as well as single peptide hits used to identify proteins, were manually validated and accepted only if the following criteria were met: 1) several consecutive y-ions, allowing for the absence of y-ions after proline and glycine; 2) the presence of the lower a- and b-ions; 3) no or few unassigned fragment ions; and 4) charge states of the precursor ion and fragment ions that are in accordance with the basic amino acids in the assigned peptide sequence. Supplemental Fig. 6 provides the MS/MS spectra of all proteins that were identified on the basis of a single peptide.
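The score-based triage described above can be sketched in Python as follows. This is only an illustration of the grouping step, not the authors' software: the tuple layout, the function names, and the handling of a score exactly equal to the cutoff (kept here) are assumptions, because the text specifies only "<" and ">". Manual inspection of the spectra is not modeled.

```python
# Score cutoffs below which a peptide counts as "low scoring" (from the text).
LOW_SCORE_CUTOFF = {"qtof": 25, "ion_trap": 30}


def keep_for_validation(score, instrument):
    """True if the peptide is not low scoring.

    A score exactly at the cutoff is kept; the text does not say how
    such ties were handled (assumption).
    """
    return score >= LOW_SCORE_CUTOFF[instrument]


def triage(hits):
    """Group surviving peptides by protein.

    hits: iterable of (protein, peptide, score, instrument) tuples
    (a hypothetical layout). Returns the peptides kept per protein and
    the sorted list of proteins that rest on a single peptide.
    """
    kept = {}
    for protein, peptide, score, instrument in hits:
        if keep_for_validation(score, instrument):
            kept.setdefault(protein, set()).add(peptide)
    single_peptide = sorted(p for p, peps in kept.items() if len(peps) == 1)
    return kept, single_peptide
```

Proteins returned in the second list correspond to the third group in the text, for which the individual MS/MS spectra are provided as supplemental material.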


    RESULTS AND DISCUSSION
 
Our strategy to characterize the hemodialysis fluid proteome by mass spectrometry using a gel LC approach was as follows. The hemodialysis fluid was first desalted and concentrated using a 3-kDa-cutoff filter. For fractionating the proteins and polypeptides in the hemodialysis fluid, the desalted sample was resolved by SDS-PAGE and silver-stained, and the bands were excised, digested with trypsin, and analyzed by LC-MS/MS. Supplemental Table 1 lists all the proteins that were identified in this study along with the number of peptides that matched each protein, molecular weight, and assignments of the approved Human Genome Organization (HUGO) gene symbols, wherever available. Supplemental Table 1 provides a count of how many gene products were actually identified from this sample as the count on the basis of proteins can be misleading because of the presence of a large number of protein isoforms.


 
TABLE I Proteins found in hemodialysis fluid that were not previously described in serum/plasma

 
Hemodialysis Fluid Proteome Versus the Plasma Proteome
Of the proteins identified in this study, 85% were smaller than 60 kDa, and more than half were <30 kDa. This is expected because of the hemodialysis process in which blood is filtered through a membrane that has a molecular mass cutoff of ~60 kDa, which should only allow proteins less than 60 kDa into the hemodialysis fluid. Fig. 1 shows a histogram of the number of proteins identified plotted against the corresponding molecular masses. Most of the proteins are found between 15 and 30 kDa. Superimposed on the histogram is a similar plot from the combined studies by Anderson et al. (9) and Chan et al. (10). Those two studies resulted in more than 2,100 proteins of which ~50% were below 65 kDa, and only 16% were below 30 kDa. Another distinguishing feature of the hemodialysis fluid data set is the absence of the 40–80-kDa hump that is conspicuous in the serum/plasma proteome data set. A subset of the serum/plasma proteins were identified by Tirumalai et al. (3), who used serum that had been filtered through a 30-kDa filter prior to analysis. In terms of the number of proteins identified, this data set (340 proteins) is comparable to our hemodialysis data set (292 proteins), although significant differences are observed. Comparing the molecular weight distribution of the proteins identified in these two studies, it is clear that a considerably greater fraction of the proteins found in the hemodialysis fluid (56% of proteins with molecular mass < 30 kDa) had a lower molecular weight than in the analysis by Tirumalai et al. (18% of proteins with molecular mass < 30 kDa).



 
FIG. 1. Distribution of molecular masses of proteins identified from hemodialysis fluid. Superimposed is the corresponding distribution for serum and plasma proteins described in the literature (9, 10). Proteins were first sorted into bins of 5 kDa each, and the number in each bin was plotted as a percentage of the total number of entries.
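The binning described in the legend can be sketched as below. This is a minimal Python illustration under the assumption that each bin is half-open, so that a protein of exactly 5 kDa falls into the 5–10-kDa bin; the legend does not specify the edge convention, and the function name is hypothetical.

```python
from collections import Counter


def mass_histogram(masses_kda, bin_width=5.0):
    """Percentage of entries per molecular mass bin.

    Returns a dict mapping (bin_start, bin_end) in kDa to the percentage
    of all entries in that bin. Bins are half-open, [start, end), which
    is an assumption.
    """
    counts = Counter(int(m // bin_width) for m in masses_kda)
    total = len(masses_kda)
    return {
        (i * bin_width, (i + 1) * bin_width): 100.0 * n / total
        for i, n in sorted(counts.items())
    }
```

Expressing each bin as a percentage rather than a raw count is what allows data sets of very different sizes to be superimposed on the same axes.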

 
We compared the known serum/plasma proteins with those identified in this study to determine the degree of overlap based on the genes that encoded the expressed proteins and their isoforms. We first mapped all the protein entries to HUGO-approved gene symbols wherever possible (www.gene.ucl.ac.uk/nomenclature/), which allowed us to eliminate redundancy and facilitated this comparison. Although the serum/plasma catalog contained more than 7 times as many protein entries, it was remarkable that 205 of 292 proteins found in the hemodialysis fluid were not reported in the serum/plasma proteome. Supplemental Fig. 1 shows a distribution of proteins identified in this study according to their molecular function. Table I lists the proteins that were identified from hemodialysis fluid but not previously reported in serum/plasma. Not surprisingly, several of these proteins are of low molecular mass, validating our use of hemodialysis fluid to identify smaller proteins and polypeptides. Fig. 2 shows a Venn diagram indicating the relative numbers of overlapping and non-overlapping gene products.



 
FIG. 2. A Venn diagram showing the degree of overlap based on the non-redundant set of genes that encoded the proteins identified in this study and those previously reported in serum/plasma (9, 10).
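A comparison of this kind reduces to set operations on non-redundant gene symbols. The Python sketch below is illustrative only: the function name is hypothetical, the normalization (trimming and upper-casing symbols) is an assumption, and the symbols in the usage note are placeholders rather than the actual data sets.

```python
def gene_overlap(dialysis_symbols, plasma_symbols):
    """Partition two lists of gene symbols into the regions of a Venn diagram.

    Duplicate symbols (e.g. several isoforms mapping to one gene) collapse
    into a single entry because sets are used. Empty entries, standing for
    proteins without an approved symbol, are ignored.
    """
    a = {s.strip().upper() for s in dialysis_symbols if s and s.strip()}
    b = {s.strip().upper() for s in plasma_symbols if s and s.strip()}
    return {
        "dialysis_only": a - b,
        "shared": a & b,
        "plasma_only": b - a,
    }
```

For example, `gene_overlap(["GENE1", "GENE2", "GENE2"], ["GENE2", "GENE3"])` yields one symbol in each of the three regions, with the duplicated placeholder `GENE2` counted once.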

 
It can be argued that the difference observed between serum and hemodialysis fluid is attributable to the underlying condition in the patient undergoing dialysis. In that case, the proteins specific to our list of hemodialysis fluid would not be observed in serum from normal individuals. It should be noted that the components that are normally filtered away by the kidneys are excreted in the urine. Therefore, we also compared our results with two proteomic analyses of the human urinary proteome (11, 12) with a total of 358 proteins. Of the proteins specifically found in hemodialysis fluid, 29 have been identified in urine from normal individuals. This indicates that a significant fraction of the proteins are indeed found in normal serum/plasma but have not been identified previously because of the technical limitations described above.

Western Blot Analysis to Confirm Expression in Normal Serum
To test whether the expression of the proteins found in hemodialysis fluid was also detectable in normal serum, we obtained antibodies that work in Western blotting experiments against a subset of proteins. As shown in Fig. 3, we were able to observe the expression of Galectin 3, Cofilin 2, Cathepsin H, α-defensins 1 and 3, Cathepsin D, Nucleophosmin 1, connective tissue growth factor, and Lipocalin 2 in normal serum. Thus, although we cannot formally rule out the possibility that some of the proteins are found in the serum because of the underlying disease of the patient undergoing hemodialysis, we think that the majority of the proteins identified in this study are normal components of serum/plasma and have not been previously identified due to their lower concentrations in the blood because of clearance by the kidneys.



 
FIG. 3. Western blot analysis to confirm expression in normal serum. The figure shows a Western blot analysis of serum from a healthy volunteer that was depleted of albumin, IgG, IgA, transferrin, haptoglobin, and antitrypsin. The antibodies used are directed against the proteins whose names are indicated above each panel.

 
Secreted Proteins Identified from the Hemodialysis Fluid
We identified several proteins that were either known or predicted to be secreted proteins. Below we will discuss a subset of these proteins that are likely to be especially interesting from a biological perspective. Many of them have not been previously identified in serum/plasma. All accession numbers are from the RefSeq data base.

A Novel Predicted Osteoblast Protein (FAM3C; NP_055703)—
This protein was initially identified from genomic data bases using structure-based methods in search of novel "four-helix bundle" cytokines. Four genes were identified in this family, although functional prediction has only been done for the protein encoded by a related gene, FAM3B, which inhibits basal insulin secretion. The novel protein encoded by the FAM3C gene is expressed in all the tissues examined but was named as a predicted osteoblast protein as it was initially observed in osteoblasts. It contains an N-terminal signal peptide and no transmembrane domain, which suggests that it is a secreted protein (13).

Fatty Acid-binding Protein 3 (FABP3) (NP_004093)—
This was originally identified by screening a human adult muscle λgt11 expression library with an antibody to muscle FABP (14). The same molecule was also identified as mammary-derived growth inhibitor based on its activity as a growth inhibitor in lactating bovine mammary gland. This protein plays a role in the intracellular transport of long-chain fatty acids and their acyl-CoA esters. FABP3 is a candidate tumor suppressor gene involved in breast cancer (15).

Antrum Mucosa Protein 18 kDa (NP_062563)—
This molecule, also designated as CA11, was initially isolated using differential display in human gastric cancer tissue. The expression of CA11 was observed to be down-regulated in gastric cancer tissue as compared with the normal gastric mucosa. Northern blot and RACE indicated that it is predominantly expressed in stomach and at low levels in the uterus and placenta (16). It has been suggested that the loss of its expression in gastric tissues may play an important role in gastric carcinogenesis (17). This protein is localized to the secretory granules of mucosal epithelium lining the stomach lumen (18).

Resistin (NP_065148)—
Resistin (resistance to insulin) belongs to a family of proteins that is involved in inflammatory processes and in regulating metabolism. Human resistin (also referred to as FIZZ3) was identified by searching sequence data bases with a related mouse protein called FIZZ1 (19). The expression of resistin is induced during adipogenesis, and it is normally secreted by adipocytes. Elevated resistin levels are seen in serum in both genetic and diet-induced obesity suggesting that it could potentially link obesity to diabetes (20).

Dermcidin (NP_444513)—
Dermcidin is a novel human antimicrobial peptide secreted by the sweat glands. Screening of a subtracted cDNA library of primary melanoma and benign melanocytic nevus tissues with cDNA arrays led to isolation of dermcidin. It has been shown to possess antimicrobial activity against Escherichia coli, Enterococcus faecalis, Staphylococcus aureus, and Candida albicans (21). It may also play a role in tumorigenesis by enhancing cell growth and survival in a subset of breast carcinomas (22).

α-Defensin 1 (NP_004075)—
Defensins are a family of microbicidal and cytotoxic polypeptides involved in host defense. α-Defensin 1 was identified by screening a cDNA library constructed from HL-60, a human promyelocytic leukemia cell line, with an oligonucleotide probe based on the C-terminal sequence of human neutrophil peptides (23). It was found in different tissues including bone marrow, blood, neutrophils, and plasma (24). α-Defensins have been shown to inhibit the replication of HIV-1 (25).

α-Defensin 3 (NP_005208)—
Several proteins secreted by activated CD8+ T cells from long term non-progressors with HIV-1 were identified. One of them was α-defensin 3 encoded by the DEFA3 gene (25). α-Defensins 1, 2, and 3 collectively account for much of the anti-HIV-1 activity of CD8 antiviral factor that is not attributable to β-chemokines (24). It is known to be expressed in bone marrow, leukocytes, and neutrophils (24, 26). All the family members are known to be present in plasma (27).

Thymosin β 10 (NP_066926)—
Thymosin β 10 was isolated from a kidney cDNA library and is an actin-sequestering protein (28). Thymosin β 10 has been shown to be a putative progression marker for human cutaneous melanoma (28). It is likely that this protein is released into the plasma after lysis of cells.

Chromosome 19 Open Reading Frame 10 (NP_061980)—
A novel secreted protein was identified in a murine system using an expression cloning strategy (29). We have identified the human ortholog of this murine secreted protein. The function of this protein is not known, although its sequence suggests that it is likely to be a cytokine.

Semitryptic Versus Full Tryptic Data Base Searching
All of the 292 proteins in this study were identified by searching data bases using tryptic constraints for peptides. However, one would also expect to observe fragments derived after proteolysis in vivo. Thus, we tested this by searching a subset of our data having the highest mass accuracy (data from the quadrupole-time-of-flight mass spectrometer) with semitryptic constraints. Searching our data with semitryptic instead of fully tryptic constraints resulted in a higher total score for nearly all of the proteins identified. However, for most of the entries, the higher score was a result of the contribution of low scoring peptides that did not pass our validation criteria. Semitryptic high scoring peptides that passed our manual validation are listed in Table II. Although some of the peptides presented in Table II could result from in-source fragmentation of labile proline-containing peptides (30), we expect that the majority arise from proteolytic cleavage events that occurred in vivo. Fig. 4A shows an example of an MS/MS spectrum of a semitryptic peptide derived from Gelsolin.
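The distinction between fully tryptic and semitryptic peptides can be made concrete with a short Python sketch. It is a simplification and not the search engine's implementation: a terminus is treated as tryptic if it follows or ends in lysine or arginine (or coincides with a protein terminus), the exception for cleavage before proline is ignored, and only the first occurrence of the peptide in the protein is considered. The protein sequence in the usage note is invented for illustration.

```python
def cleavage_class(protein, peptide):
    """Classify a peptide as 'tryptic', 'semitryptic', or 'nonspecific'.

    Simplified rule (assumption): trypsin cleaves C-terminal to K or R.
    """
    start = protein.find(peptide)  # first occurrence only
    if start < 0:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)
    # N-terminal side: peptide begins the protein or follows K/R.
    n_tryptic = start == 0 or protein[start - 1] in "KR"
    # C-terminal side: peptide ends the protein or ends in K/R.
    c_tryptic = end == len(protein) or peptide[-1] in "KR"
    if n_tryptic and c_tryptic:
        return "tryptic"
    if n_tryptic or c_tryptic:
        return "semitryptic"
    return "nonspecific"
```

With the made-up sequence `MAGSKLTDEFGHRVNPQAWK`, the peptide `LTDEFGHR` is classified as tryptic, whereas `TDEFGHR`, which lacks the N-terminal leucine, is semitryptic because only its C terminus is consistent with trypsin cleavage.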


 
TABLE II A list of semitryptic peptides identified from the hemodialysis fluid

Peptide sequences identified by mass spectrometry are bold and underlined.

 


 
FIG. 4. Tandem mass spectra of a semitryptic peptide, LSSHIANVER, from Gelsolin (A), an N-terminally acetylated peptide, Ac-ADKPDMGEIASFDK, from thymosin β 10 (B), an N-glycosylated peptide, NLEKNHexNAcSTKQEILAALEK, from prosaposin (C), and a peptide containing hydroxyproline, GSAGPHypGATGFPHypGAAGR, derived from Collagen I (D). The corresponding peptide sequence is indicated in each case. Ac refers to acetylation, Hyp refers to hydroxyproline, and HexNAc refers to N-acetylhexosamine.

 
Proteolytic Activity in Hemodialysis Fluid
Theoretically the hemodialysis fluid should not contain proteins greater in size than 60 kDa. Nevertheless it is possible for larger proteins to be detected in the hemodialysis fluid if they undergo proteolytic cleavage in vivo. Thus, we decided to examine the distribution of peptides in greater detail. Of the proteins greater than 90 kDa, more than half exhibited a grouping of several peptides in either the N or C terminus of the protein, suggesting the presence of a fragment or isoform. Smaller fragments of proteins can be the result of proteolytic activity in plasma. Fig. 5 shows five examples of such groupings within a larger protein. For most of the proteins shown in Fig. 5, all peptides are clustered in regions that constitute only 4–8% of the whole protein sequence. For the 260-kDa protein fibronectin 1 and the 189-kDa protein complement component 3, a clustering of peptides was found in the middle of the proteins. This observation could be explained if these proteins were cleaved twice resulting in the observed fragment. For plasminogen (93 kDa) and desmocollin 1 (100 kDa), a clustering of peptides was observed in the N-terminal part of the protein, whereas a C-terminal clustering was observed in the case of Perlecan (480 kDa). Furthermore all of the above mentioned proteins were identified from bands that were less than a third of their expected molecular masses. For instance, fibronectin 1 was identified from the 15–19-kDa gel band, and desmocollin 1 was identified from the 6–15-kDa gel band. It is also worth noting that although several proteolytic cleavage sites are located in the C-terminal region of plasminogen (31), our data suggest that a cleavage occurred in the N terminus of the protein to generate Angiostatin (32), an important player in the regulation of angiogenesis (33). The probability of clustering of five peptides within a region that is 5% of the total protein length is 10⁻⁶, strongly suggesting that we are observing true in vivo proteolytic events. Of the above described proteins, all were retrieved from protein bands migrating in the ~6–15-kDa region, except for complement component 3 that was retrieved from the ~60–80-kDa region. For the latter this observation is in support of a single cleavage rather than a multicleavage event and is in agreement with literature where C3 convertase has been reported to cleave complement component 3 at residue 748, generating C3a anaphylatoxin (34).
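The text does not state how the clustering probability was derived. As an order-of-magnitude check only, the Python sketch below evaluates two simple models, both of which assume that peptide positions are independent and uniformly distributed along the protein and ignore peptide length and detectability. The function names are illustrative.

```python
def p_all_in_fixed_window(n_peptides, window_fraction):
    """Probability that all peptides fall inside one prespecified window."""
    return window_fraction ** n_peptides


def p_all_in_any_window(n_peptides, window_fraction):
    """Probability that all peptides fall inside some window of the given width.

    Uses the distribution of the range of n independent uniform points on
    [0, 1]: P(range <= w) = n * w**(n - 1) - (n - 1) * w**n.
    """
    n, w = n_peptides, window_fraction
    return n * w ** (n - 1) - (n - 1) * w ** n


if __name__ == "__main__":
    print(p_all_in_fixed_window(5, 0.05))
    print(p_all_in_any_window(5, 0.05))
```

For five peptides and a 5% window, the two models give roughly 3 × 10⁻⁷ and 3 × 10⁻⁵, respectively, which bracket the figure quoted in the text. Under either model the clustering is very unlikely to arise by chance.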



 
FIG. 5. A distribution of peptides for five large proteins. The proteins shown in the figure were identified from bands corresponding to molecular masses that were significantly lower than the calculated molecular masses. The peptides corresponding to the proteins are indicated by short thick lines. comp., component.
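The extent of clustering shown in the figure can be quantified as the fraction of the protein sequence spanned by the identified peptides. The following Python sketch illustrates one way to do this; the 1-based inclusive coordinates, the function names, and the division of the protein into thirds to label the cluster position are all assumptions made for illustration.

```python
def peptide_span_fraction(protein_length, peptide_ranges):
    """Fraction of the protein covered by the region from the first to the
    last identified residue.

    peptide_ranges: list of (start, end) residue positions, 1-based inclusive.
    """
    first = min(start for start, _ in peptide_ranges)
    last = max(end for _, end in peptide_ranges)
    return (last - first + 1) / protein_length


def cluster_location(protein_length, peptide_ranges):
    """Label the cluster by the third of the protein holding its midpoint."""
    first = min(start for start, _ in peptide_ranges)
    last = max(end for _, end in peptide_ranges)
    midpoint = (first + last) / 2.0 / protein_length
    if midpoint < 1.0 / 3.0:
        return "N-terminal"
    if midpoint > 2.0 / 3.0:
        return "C-terminal"
    return "middle"
```

For a hypothetical 1,000-residue protein with peptides at residues 10–20, 35–50, and 60–69, the span fraction is 0.06 and the cluster is labeled N-terminal.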

 
Post-translational Modifications
Many proteins and peptides found in plasma and hemodialysis fluid have undergone a number of changes since the formation of nascent polypeptide chains. These changes have to be taken into account when interpreting the mass spectrometry data, partly to get a better description of the biological sample and partly to avoid false-positive identifications. The modifications can be grouped into enzyme-catalyzed changes and spontaneously occurring modifications. A large number of peptides were oxidized at methionine and tryptophan residues or exhibited pyroglutamate formation by cyclization of an N-terminal glutamine residue. It is difficult to infer any biological significance from these spontaneously occurring modifications, and they will not be discussed in detail. We searched the data from the quadrupole-time-of-flight mass spectrometer to identify some of the commonly occurring post-translational modifications.

Acetylation
The N terminus of most proteins in vivo is processed by aminopeptidases and N-acetyltransferases (35). Aminopeptidases generally cleave the initiator methionine if the penultimate residue has a small radius of gyration (e.g. Gly, Ala, Ser, Cys, Thr, Pro, or Val). N-terminal acetylation is a very common co- and post-translational process where an acetyl group is transferred from acetyl-CoA to the N-terminal α-amino acid of a protein (35). In our study, the acetylated N termini of 43 proteins were detected. The proteins and the corresponding acetylated peptides are listed in Table III. We were able to find two different N termini for peptidylprolyl isomerase A; in one instance, the initiator methionine was acetylated, and in another, the initiator methionine was removed and the next amino acid, valine, was acetylated. Fig. 4B shows an MS/MS spectrum of an acetylated peptide derived from thymosin β 10 protein. Our data on N-terminal acetylation are in good agreement with the preference of methionine aminopeptidases that the penultimate amino acid is small or unmodified. Of the 43 acetylated proteins, one N-acetylated peptide was found in the middle of a predicted protein (XP_371848), suggesting that this could be a wrongly predicted protein. Indeed a Blast analysis of orthologous proteins confirms our new assignment of the translational initiation site (see Supplemental Fig. 2), which is in agreement with sequence conservation in five different species beginning at this methionine. Supplemental Fig. 3 shows the MS/MS spectra of all peptides containing an acetylated residue.
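The aminopeptidase rule stated above can be written as a small Python function. This is a sketch of the general rule only, using the residue list given in the text; it does not predict whether the resulting N terminus is acetylated, and, as the two N termini observed for peptidylprolyl isomerase A show, real processing can be incomplete. The function name is illustrative.

```python
# Penultimate residues that permit removal of the initiator methionine
# (Gly, Ala, Ser, Cys, Thr, Pro, Val), as listed in the text.
SMALL_PENULTIMATE = set("GASCTPV")


def mature_n_terminus(sequence):
    """Return the sequence after predicted initiator methionine processing."""
    if len(sequence) > 1 and sequence[0] == "M" and sequence[1] in SMALL_PENULTIMATE:
        return sequence[1:]
    return sequence
```

For instance, a precursor beginning MADKPDMGEIASFDK is predicted to lose its initiator methionine, giving an N terminus that starts ADKPDMGEIASFDK, consistent with the acetylated thymosin β 10 peptide shown in Fig. 4B.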


TABLE III A list of N-terminally acetylated peptides and the corresponding proteins

 
N-Glycosylation of Peptides
Three peptides were found to be covalently modified with an N-acetylhexosamine moiety, which corresponds to a mass increase of 203 Da (Table IV and Supplemental Fig. 4). The CID fragmentation spectrum of the triply charged ion with m/z 711.396 is shown in Fig. 4C. The N-acetylhexosamine gives rise to an intense oxonium ion at m/z 204, and further fragmentation of the oxonium ion gives rise to ions with m/z 186, 168, 138, and 126 (see Fig. 4C, inset). Interpretation of the mass spectrum revealed that the second asparagine in the peptide was covalently linked to an N-acetylhexosamine moiety. This modified asparagine occurs in the glycosylation consensus sequence NX(S/T) where X can be any amino acid. This is an unusual modification because the common N-glycan is cotranslationally transferred to the asparagine residue en bloc as the oligosaccharide GlcNAc2Man9Glc3 and is seldom trimmed beyond the trimannosyl-chitobiose core. It is interesting that only three peptides were found with this modification and that all of them come from a single protein, prosaposin. Because N-glycosylation takes place only on asparagines in the glycosylation sequon NX(S/T), there will always be a serine or threonine nearby, so O-GlcNAc modification of that residue is at least a theoretically possible explanation for almost any such peptide (the exceptions being the rare instances where X is arginine or lysine and trypsin cleavage is not hindered by the glycan moiety). However, a doubly charged fragment ion in the spectrum, although of low intensity, shows that asparagine is the site of modification.
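
The mass arithmetic behind these observations can be illustrated with a minimal Python sketch. It is not part of the analysis performed in this study; the constant and function names are chosen here for illustration, and the monoisotopic masses are standard values rather than numbers taken from the text. The sketch converts the m/z of the triply charged precursor to a neutral mass and derives the HexNAc oxonium ion and its first two water-loss fragments. The ions at m/z 138 and 126 arise from further losses that are not modeled.

```python
PROTON = 1.007276    # mass of a proton, Da
WATER = 18.010565    # monoisotopic mass of H2O, Da
HEXNAC = 203.079373  # monoisotopic residue mass of N-acetylhexosamine (C8H13NO5), Da


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass of an [M + zH]z+ ion."""
    return (mz - PROTON) * charge


# Triply charged precursor ion reported in the text.
precursor = neutral_mass(711.396, 3)

# The oxonium ion is the protonated HexNAc residue.
oxonium = HEXNAC + PROTON

print(f"precursor neutral mass:    {precursor:.3f} Da")
print(f"mass without HexNAc:       {precursor - HEXNAC:.3f} Da")
print(f"HexNAc oxonium ion:        m/z {oxonium:.3f}")
print(f"after loss of one water:   m/z {oxonium - WATER:.3f}")
print(f"after loss of two waters:  m/z {oxonium - 2 * WATER:.3f}")
```

The three computed fragment values round to 204, 186, and 168, matching the first three ions listed above.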

Prosaposin is a 524-amino acid glycoprotein that gives rise to four saposins that are predominantly localized in the late endosomal/lysosomal compartment. A large number of endo- and exoglycosidases are also present in the lysosome, and it is possible that the extensive trimming of the glycan down to a single N-acetylglucosamine takes place there. Whether this trimming of the glycan structure has any biological significance is not known. It should be noted, however, that two cases of metachromatic leukodystrophy have been reported in which a mutation in the glycosylation sequon, either N215H (36) or T217I (37), led to a dysfunctional saposin B protein; this sequon is the one found in the identified peptide TNSTFVQALVEHVKEECDR, which emphasizes the importance of glycosylation.
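
As a small illustration of the sequon definition used here, the following Python sketch scans a peptide for NX(S/T) motifs. It is a hypothetical helper, not software used in this study; the function name find_sequons is invented for the example, and X is allowed to be any residue because that is the definition given in the text.

```python
import re

# N-X-(S/T) with X as any residue, following the definition used in the text.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[A-Z][ST])")


def find_sequons(peptide: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-(S/T) motif."""
    return [match.start() + 1 for match in SEQUON.finditer(peptide)]


# Prosaposin peptide identified in this study.
peptide = "TNSTFVQALVEHVKEECDR"
for position in find_sequons(peptide):
    # Print the Asn position and the three-residue motif that starts there.
    print(position, peptide[position - 1:position + 2])
```

For this peptide the only motif found is NST at the second residue. If that asparagine is taken to be Asn-215, the threonine two residues later is Thr-217, which is consistent with the N215H and T217I mutations both disrupting the same sequon.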

Proline Hydroxylation
A number of structural molecules in the extracellular matrix are known to undergo extensive post-translational modifications. Hydroxylation of prolines by prolyl hydroxylase is a common modification of collagen that confers structural stability to the collagen triple helix (38). We identified one proline hydroxylation site in fibrinogen, two sites in collagen α 2 type I, and six sites in collagen α 1 type I (see Fig. 4D for the MS/MS spectrum of a peptide containing two hydroxylated proline residues). Table IV lists all the hydroxylation sites identified in this study, and the MS/MS spectra are shown in Supplemental Fig. 5.


TABLE IV A list of peptides with other post-translational modifications

 
Conclusions
Using one-dimensional gel electrophoresis and LC-MS/MS, we have identified 292 proteins from hemodialysis fluid, more than half of which were smaller than 40 kDa. Analysis of the modified peptides led to identification of 43 N-terminally acetylated proteins and three proteins hydroxylated on prolines. We also found three peptides from prosaposin to be modified with a single HexNAc at two different glycosylation sequons. We were able to map the identified peptides onto larger proteins, which showed groupings of peptides within limited regions. A comparison of our results with previously published studies that examined serum and plasma proteomes showed that two-thirds of the proteins identified in this study had not been identified previously as components of serum or plasma. We attribute this mainly to two factors: the first is the greater dynamic range of protein concentrations in serum/plasma samples, and the second is the enrichment of lower molecular weight proteins in the hemodialysis fluid. The proteins identified in this study will allow further investigations into their detection in serum/plasma and their possible use as biomarkers of disease states.

This study presents the first comprehensive list of hemofiltrate proteins and an in-depth analysis of their post-translational modifications. This proteomic survey is by no means exhaustive, and there are probably many proteins in the low molecular weight region that we have not identified. Incomplete coverage of this kind is common in proteomic surveys. In the Anderson et al. (9) list, in which the literature and three proteomic studies were compared, 195 of 1,175 proteins were reported in more than one source, and only 46 were reported in all four. It is currently not possible to extrapolate from the hemofiltrate to plasma or urine constituents, but this study represents a small step toward mapping the enormous unknown of the human body fluid proteomes in health and disease.


    ACKNOWLEDGMENTS
 
We thank John Kloss for help with data base searching and Raghunath Reddy for creating helpful scripts. We thank Sun Microsystems for providing us a computer cluster under the Academic Equipment Grant Program.


    FOOTNOTES
 
Received, February 12, 2005

Published, MCP Papers in Press, February 20, 2005, DOI 10.1074/mcp.M500042-MCP200

1 The abbreviations used are: bis-tris, 2-[bis(2-hydroxyethyl)amino]-2-(hydroxymethyl)propane-1,3-diol; Tricine, N-[2-hydroxy-1,1-bis(hydroxymethyl)ethyl]glycine; FABP, fatty acid-binding protein; HIV-1, human immunodeficiency virus, type 1.

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

S The on-line version of this manuscript (available at http://www.mcponline.org) contains supplemental material.

Supported by a grant from the Danish Natural Sciences Research Council.

‡‡ Supported by Beckman Young Investigator and Sidney Kimmel Scholar awards. Serves as Chief Scientific Advisor to the Institute of Bioinformatics. The terms of this arrangement are being managed by The Johns Hopkins University in accordance with its conflict of interest policies. To whom correspondence should be addressed. Tel.: 410-502-6662; Fax: 410-502-7544; E-mail: pandey@jhmi.edu


    REFERENCES
 

  1. Anderson, N. L., and Anderson, N. G. (2002) The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell. Proteomics 1, 845–867

  2. Maack, T. (2000) in The Kidney: Physiology and Pathophysiology (Seldin, D. W., and Giebisch, G. G., eds) 3rd Ed., pp. 2235–2267, Lippincott Williams & Wilkins, Philadelphia, PA

  3. Tirumalai, R. S., Chan, K. C., Prieto, D. A., Issaq, H. J., Conrads, T. P., and Veenstra, T. D. (2003) Characterization of the low molecular weight human serum proteome. Mol. Cell. Proteomics 2, 1096–1103

  4. Alpert, A. J., and Shukla, A. K. (2003) in Association of Biomolecular Resource Facilities Conference, Denver, Colorado, February 10–13, 2003, Abstr. P111-W, Association of Biomolecular Resource Facilities, Santa Fe, NM

  5. Pieper, R., Gatlin, C. L., Makusky, A. J., Russo, P. S., Schatz, C. R., Miller, S. S., Su, Q., McGrath, A. M., Estock, M. A., Parmar, P. P., Zhao, M., Huang, S. T., Zhou, J., Wang, F., Esquer-Blasco, R., Anderson, N. L., Taylor, J., and Steiner, S. (2003) The human serum proteome: display of nearly 3700 chromatographically separated protein spots on two-dimensional electrophoresis gels and identification of 325 distinct proteins. Proteomics 3, 1345–1364

  6. Raida, M., Schulz-Knappe, P., Heine, G., and Forssmann, W. G. (1999) Liquid chromatography and electrospray mass spectrometric mapping of peptides from human plasma filtrate. J. Am. Soc. Mass Spectrom. 10, 45–54

  7. Schepky, A. G., Bensch, K. W., Schulz-Knappe, P., and Forssmann, W. G. (1994) Human hemofiltrate as a source of circulating bioactive peptides: determination of amino acids, peptides and proteins. Biomed. Chromatogr. 8, 90–94

  8. Shevchenko, A., Wilm, M., Vorm, O., and Mann, M. (1996) Mass spectrometric sequencing of proteins from silver-stained polyacrylamide gels. Anal. Chem. 68, 850–858

  9. Anderson, N. L., Polanski, M., Pieper, R., Gatlin, T., Tirumalai, R. S., Conrads, T. P., Veenstra, T. D., Adkins, J. N., Pounds, J. G., Fagan, R., and Lobley, A. (2004) The human plasma proteome: a nonredundant list developed by combination of four separate sources. Mol. Cell. Proteomics 3, 311–326

  10. Chan, K. C., Lucas, D. A., Hise, D., Schaefer, C. F., Xiao, Z., Janini, G. M., Buetow, K. H., Issaq, H. J., Veenstra, T. D., and Conrads, T. P. (2004) Analysis of the human serum proteome. Clin. Proteomics 1, 101–226

  11. Davis, M. T., Spahr, C. S., McGinley, M. D., Robinson, J. H., Bures, E. J., Beierle, J., Mort, J., Yu, W., Luethy, R., and Patterson, S. D. (2001) Towards defining the urinary proteome using liquid chromatography-tandem mass spectrometry. II. Limitations of complex mixture analyses. Proteomics 1, 108–117

  12. Pisitkun, T., Shen, R. F., and Knepper, M. A. (2004) Identification and proteomic profiling of exosomes in human urine. Proc. Natl. Acad. Sci. U. S. A. 101, 13368–13373

  13. Zhu, Y., Xu, G., Patel, A., McLaughlin, M. M., Silverman, C., Knecht, K., Sweitzer, S., Li, X., McDonnell, P., Mirabile, R., Zimmerman, D., Boyce, R., Tierney, L. A., Hu, E., Livi, G. P., Wolf, B., Abdel-Meguid, S. S., Rose, G. D., Aurora, R., Hensley, P., Briggs, M., and Young, P. R. (2002) Cloning, expression, and initial characterization of a novel cytokine-like gene family. Genomics 80, 144–150

  14. Peeters, R. A., Veerkamp, J. H., Geurts van Kessel, A., Kanda, T., and Ono, T. (1991) Cloning of the cDNA encoding human skeletal-muscle fatty-acid-binding protein, its peptide sequence and chromosomal localization. Biochem. J. 276, 203–207

  15. Phelan, C. M., Larsson, C., Baird, S., Futreal, P. A., Ruttledge, M. H., Morgan, K., Tonin, P., Hung, H., Korneluk, R. G., Pollak, M. N., and Narod, S. A. (1996) The human mammary-derived growth inhibitor (MDGI) gene: genomic structure and mutation analysis in human breast tumors. Genomics 34, 63–68

  16. Yoshikawa, Y., Mukai, H., Hino, F., Asada, K., and Kato, I. (2000) Isolation of two novel genes, down-regulated in gastric cancer. Jpn. J. Cancer Res. 91, 459–463

  17. Shiozaki, K., Nakamori, S., Tsujie, M., Okami, J., Yamamoto, H., Nagano, H., Dono, K., Umeshita, K., Sakon, M., Furukawa, H., Hiratsuka, M., Kasugai, T., Ishiguro, S., and Monden, M. (2001) Human stomach-specific gene, CA11, is down-regulated in gastric cancer. Int. J. Oncol. 19, 701–707

  18. Martin, T. E., Powell, C. T., Wang, Z., Bhattacharyya, S., Walsh-Reitz, M. M., Agarwal, K., and Toback, F. G. (2003) A novel mitogenic protein that is highly expressed in cells of the gastric antrum mucosa. Am. J. Physiol. 285, G332–G343

  19. Holcomb, I. N., Kabakoff, R. C., Chan, B., Baker, T. W., Gurney, A., Henzel, W., Nelson, C., Lowman, H. B., Wright, B. D., Skelton, N. J., Frantz, G. D., Tumas, D. B., Peale, F. V., Jr., Shelton, D. L., and Hebert, C. C. (2000) FIZZ1, a novel cysteine-rich secreted protein associated with pulmonary inflammation, defines a new gene family. EMBO J. 19, 4046–4055

  20. Banerjee, R. R., Rangwala, S. M., Shapiro, J. S., Rich, A. S., Rhoades, B., Qi, Y., Wang, J., Rajala, M. W., Pocai, A., Scherer, P. E., Steppan, C. M., Ahima, R. S., Obici, S., Rossetti, L., and Lazar, M. A. (2004) Regulation of fasted blood glucose by resistin. Science 303, 1195–1198

  21. Cunningham, T. J., Hodge, L., Speicher, D., Reim, D., Tyler-Polsz, C., Levitt, P., Eagleson, K., Kennedy, S., and Wang, Y. (1998) Identification of a survival-promoting peptide in medium conditioned by oxidatively stressed cell lines of nervous system origin. J. Neurosci. 18, 7047–7060

  22. Porter, D., Weremowicz, S., Chin, K., Seth, P., Keshaviah, A., Lahti-Domenici, J., Bae, Y. K., Monitto, C. L., Merlos-Suarez, A., Chan, J., Hulette, C. M., Richardson, A., Morton, C. C., Marks, J., Duyao, M., Hruban, R., Gabrielson, E., Gelman, R., and Polyak, K. (2003) A neural survival factor is a candidate oncogene in breast cancer. Proc. Natl. Acad. Sci. U. S. A. 100, 10931–10936

  23. Daher, K. A., Lehrer, R. I., Ganz, T., and Kronenberg, M. (1988) Isolation and characterization of human defensin cDNA clones. Proc. Natl. Acad. Sci. U. S. A. 85, 7327–7331

  24. Date, Y., Nakazato, M., Shiomi, K., Toshimori, H., Kangawa, K., Matsuo, H., and Matsukura, S. (1994) Localization of human neutrophil peptide (HNP) and its messenger RNA in neutrophil series. Ann. Hematol. 69, 73–77

  25. Zhang, L., Yu, W., He, T., Yu, J., Caffrey, R. E., Dalmasso, E. A., Fu, S., Pham, T., Mei, J., Ho, J. J., Zhang, W., Lopez, P., and Ho, D. D. (2002) Contribution of human α-defensin 1, 2, and 3 to the anti-HIV-1 activity of CD8 antiviral factor. Science 298, 995–1000

  26. Linzmeier, R., Michaelson, D., Liu, L., and Ganz, T. (1993) The structure of neutrophil defensin genes. FEBS Lett. 326, 299–300

  27. Ashitani, J., Nakazato, M., Mukae, H., Taniguchi, H., Date, Y., and Matsukura, S. (2000) Recombinant granulocyte colony-stimulating factor induces production of human neutrophil peptides in lung cancer patients with neutropenia. Regul. Pept. 95, 87–92

  28. McCreary, V., Kartha, S., Bell, G. I., and Toback, F. G. (1988) Sequence of a human kidney cDNA clone encoding thymosin β 10. Biochem. Biophys. Res. Commun. 152, 862–866

  29. Tulin, E. E., Onoda, N., Nakata, Y., Maeda, M., Hasegawa, M., Nomura, H., and Kitamura, T. (2001) SF20/IL-25, a novel bone marrow stroma-derived growth factor that binds to mouse thymic shared antigen-1 and supports lymphoid cell proliferation. J. Immunol. 167, 6338–6347

  30. Olsen, J. V., Ong, S. E., and Mann, M. (2004) Trypsin cleaves exclusively C-terminal to arginine and lysine residues. Mol. Cell. Proteomics 3, 608–614

  31. Ohyama, S., Harada, T., Chikanishi, T., Miura, Y., and Hasumi, K. (2004) Nonlysine-analog plasminogen modulators promote autoproteolytic generation of plasmin(ogen) fragments with angiostatin-like activity. Eur. J. Biochem. 271, 809–820

  32. Patterson, B. C., and Sang, Q. A. (1997) Angiostatin-converting enzyme activities of human matrilysin (MMP-7) and gelatinase B/type IV collagenase (MMP-9). J. Biol. Chem. 272, 28823–28825

  33. O’Reilly, M. S., Holmgren, L., Shing, Y., Chen, C., Rosenthal, R. A., Moses, M., Lane, W. S., Cao, Y., Sage, E. H., and Folkman, J. (1994) Angiostatin: a novel angiogenesis inhibitor that mediates the suppression of metastases by a Lewis lung carcinoma. Cell 79, 315

  34. Hugli, T. E. (1975) Human anaphylatoxin (C3a) from the third component of complement. Primary structure. J. Biol. Chem. 250, 8293–8301

  35. Polevoda, B., and Sherman, F. (2003) N-terminal acetyltransferases and sequence requirements for N-terminal acetylation of eukaryotic proteins. J. Mol. Biol. 325, 595–622

  36. Wrobe, D., Henseler, M., Huettler, S., Pascual Pascual, S. I., Chabas, A., and Sandhoff, K. (2000) A non-glycosylated and functionally deficient mutant (N215H) of the sphingolipid activator protein B (SAP-B) in a novel case of metachromatic leukodystrophy (MLD). J. Inherit. Metab. Dis. 23, 63–76

  37. Kretz, K. A., Carson, G. S., Morimoto, S., Kishimoto, Y., Fluharty, A. L., and O’Brien, J. S. (1990) Characterization of a mutation in a family with saposin B deficiency: a glycosylation site defect. Proc. Natl. Acad. Sci. U. S. A. 87, 2541–2544

  38. Pihlajaniemi, T., Myllyla, R., and Kivirikko, K. I. (1991) Prolyl 4-hydroxylase and its role in collagen synthesis. J. Hepatol. 13, Suppl. 3, S2–S7