ARTICLE

Serum Protein MALDI Profiling to Distinguish Upper Aerodigestive Tract Cancer Patients From Control Subjects

David Sidransky, Rafael Irizarry, Joseph A. Califano, Xianbin Li, Hening Ren, Nicole Benoit, Li Mao

Affiliations of authors: Department of Otolaryngology-Head and Neck Surgery, Head and Neck Cancer Research Division, The Johns Hopkins University School of Medicine, Baltimore, MD (DS, JAC, NB); Department of Biostatistics, The Johns Hopkins University Bloomberg School of Public Health, Baltimore (RI, XL); Department of Thoracic/Head and Neck Medical Oncology, University of Texas M. D. Anderson Cancer Center, Houston (HR, LM).

Correspondence to: David Sidransky, MD, Head and Neck Cancer Research, The Johns Hopkins University School of Medicine, 818 Ross Research Bldg., 720 Rutland Ave., Baltimore, MD 21205-2196 (e-mail: dsidrans@jhmi.edu) or Rafael Irizarry, Department of Biostatistics, The Johns Hopkins University, Bloomberg School of Public Health, 615 N. Wolfe St., Baltimore, MD 21205-2179 (e-mail: rafa@jhu.edu).


    ABSTRACT
 
Background: There are no reliable blood markers for the early detection and monitoring of aerodigestive tract tumors. Recent studies have suggested that serum protein patterns may be able to distinguish cancer patients from control subjects. Methods: We used matrix-assisted laser desorption and ionization (MALDI) mass spectroscopy to obtain serum protein patterns from patients with head and neck cancer (n = 99) or lung cancer (n = 92) and from control subjects (n = 143) at risk for the development of these cancers. From the mass spectra, we predicted the cancer status of patients using a simple classification procedure based on a t test feature selection and linear discriminant analysis (LDA). We cross-validated the data with 200 random data simulations to establish a range of the LDA tuning parameter, which was used to construct receiver operating characteristic (ROC) curves. Results: Average total protein levels were higher in case patients than in control subjects, although the differences were not statistically significant. Ten individual m/z peaks, from 5 to 111 kd, appeared frequently in head and neck cancer patients but not in control subjects. Using the 45 top predictors, selected by spectral mass and LDA, we observed that ROC curves differed from those expected under the null hypothesis, suggesting that spectral profiles from the sera of patients with head and neck cancer statistically significantly differed from the sera of control subjects. The model developed on head and neck cancer patients could also be used to identify patients with lung cancer. Conclusions: The pattern of protein spectra in total serum reliably distinguished cancer case patients from control subjects. Incorporation of MALDI assays into prospective longitudinal trials to assess the true predictive values of protein spectra in cancer detection is needed.



    INTRODUCTION
 
Lung cancer is one of the most common malignancies worldwide and is the leading cause of cancer death in humans (2). The high incidence and mortality of lung cancer is, in part, the result of frequent exposure to environmental factors such as smoking and the lack of effective early detection methods and treatments (3-6). Lung cancer is a complex disease that, on a pathologic level, is composed of two major histologic subtypes—small-cell and non–small-cell lung cancer. Non–small-cell lung cancers (NSCLCs) may be further divided into squamous cell carcinomas (SCCs), adenocarcinomas, large-cell carcinomas, and other subtypes. Surgical resection remains the only curative therapy for patients with NSCLC and is possible only for those with limited-stage disease. Thus, identifying lung cancer patients early in the disease process is a clinical and biologic challenge for improving survival in patients with NSCLC.

The use of tobacco products, particularly cigarette smoking, causes not only lung cancer but also head and neck cancer. Head and neck cancer has a high incidence of debilitating morbidity and high mortality in patients with advanced disease (8). In the United States, approximately 50 000 cases of head and neck cancer are diagnosed annually, the majority (90%) of which are SCCs (9). Although the most effective treatment for patients with head and neck cancer is surgical resection, more than 50% of all patients have advanced disease at the time of diagnosis. Consequently, a large percentage of patients who undergo surgical resection ultimately die of local or regional recurrence, suggesting that occult residual local or metastatic disease is often present at the time of diagnosis. Thus, early detection of head and neck cancer would potentially lead to more effective management of the disease.

Standard diagnostic techniques for both lung and head and neck cancer rely on direct or augmented visualization (11) and on imaging techniques such as spiral computed tomography or positron emission tomography (12-15). However, these techniques cannot be used to detect tumors smaller than approximately 0.5–1.0 cm² (representing approximately 10⁹ cells). This limitation has spurred continued interest in identifying reliable tumor markers that can be detected in serum or blood. Although various serum tumor markers for lung and head and neck cancer have been identified, none have been integrated into general clinical practice because, as a rule, these markers have lacked adequate sensitivity and specificity for the clinic (16). Therefore, it is important to develop new methods that provide sensitive and reliable diagnostic markers for both lung and head and neck cancer (17).

Recently, there has been great interest in trying to identify quantitative or qualitative differences in serum protein components between cancer patients and control subjects. The existence of such differences is postulated on the basis that, when cancer cell products, including tumor-specific proteins, enter the circulation, they change the profile of circulating serum and/or plasma proteins. Serum profiling (i.e., the characterization of proteins, peptides, and macromolecules from serum) by surface-enhanced laser desorption/ionization (SELDI) mass spectroscopy (19) coupled with statistical algorithms has been used to distinguish patients with cancer from control subjects and patients with benign conditions (20-23). Whether serum profiling will prove to be sensitive and reliable in the diagnosis of all cancers is unknown.

Here, we hypothesized that the protein spectra from patients with head and neck cancer would be different qualitatively and quantitatively from the protein spectra of control subjects. To test this hypothesis, we used matrix-assisted laser desorption and ionization (MALDI) mass spectroscopy to determine the serum protein profiles from patients with head and neck cancer or lung cancer and control subjects at risk for the development of these cancers. We analyzed the spectra using a simple classification procedure based on a t test feature-selection procedure and linear discriminant analysis (LDA). We then assessed whether the model could be used to distinguish lung cancer patients from the same control subjects.


    SUBJECTS AND METHODS
 
Study Population

A total of 341 serum samples were obtained from archived aliquots stored in the head and neck cancer tissue and human materials sample bank at The Johns Hopkins University. Samples were stored and maintained in the sample bank under approval by The Johns Hopkins University Institutional Review Board.

This study included sera from 191 cancer patients (99 patients with SCC of the head and neck and 92 patients with NSCLC), seven patients with pulmonary disease but free of cancer (confounding patients), and 143 control subjects. All sera from case patients were obtained at or after diagnosis but before treatment. The control subjects were chosen randomly from a community cancer-screening program that targeted individuals at increased risk of lung and head and neck cancers. Demographic information was collected by using an on-site questionnaire. This community cancer-screening program includes a higher percentage of individuals with a smoking and drinking history than the percentage found among the general population. We reasoned that these control subjects would be appropriate to define markers used in high-risk populations for eventual serum-screening approaches. All of the control subjects were believed to be free of cancer on the basis of clinical history and physical examination. However, no additional imaging approaches or routine marker assays were performed on the control subjects. Sera from control subjects and case patients were collected and stored in an identical fashion in the sample bank as aliquots at –80 °C. Patient demographic, tumor staging, and pathology information was collected from an institutional clinical database that was linked to the sample bank. All study participants provided informed consent.

Serum Preparation and Mass Spectrum Acquisition

Before the analysis, an aliquot of each serum was thawed at room temperature and mixed briefly by vortexing. To prepare the samples for MALDI mass spectroscopy, each serum sample was diluted 1 : 100 by first adding 5 µL of serum to 45 µL of a 1% solution of n-octyl β-D-glucopyranoside (a detergent that allows proteins to be isolated in their native state) (Sigma Aldrich, St. Louis, MO) and then adding 5 µL of this solution to 45 µL of distilled water. The final sample solution contained 1% serum in approximately 0.1% n-octyl β-D-glucopyranoside. A 50% saturated solution of sinapinic acid (Sigma Aldrich), to be used as the MALDI matrix, was prepared in 30% acetonitrile–0.1% trifluoroacetic acid. Equal volumes of a diluted serum sample and the matrix solution (0.5 µL each) were mixed and added to a stainless steel sample plate (specifically designed for the MALDI mass spectrometer) that contained defined areas for individual samples. The plate was shielded from strong light, and the samples were air-dried.
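As a quick arithmetic check of the two-step dilution, the following Python sketch tracks the serum and detergent fractions through both steps. The function and variable names are illustrative and not part of the original protocol, and ideal volumetric mixing is assumed.

```python
def dilute(fraction, sample_ul, diluent_ul):
    """Fraction of a component after mixing a sample into a diluent that lacks it."""
    return fraction * sample_ul / (sample_ul + diluent_ul)

# Step 1: 5 uL of serum into 45 uL of 1% detergent solution.
serum_step1 = dilute(1.0, 5, 45)        # serum is the 5 uL "sample"
detergent_step1 = dilute(0.01, 45, 5)   # detergent is carried by the 45 uL

# Step 2: 5 uL of the step-1 mixture into 45 uL of distilled water.
serum_final = dilute(serum_step1, 5, 45)
detergent_final = dilute(detergent_step1, 5, 45)

print(f"serum: {serum_final:.2%}, detergent: {detergent_final:.2%}")
```

Under these assumptions the serum fraction comes out at 1% and the detergent at 0.09%, which is consistent with the 0.1% figure quoted in the text after rounding.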

Mass spectroscopy was performed in a blinded fashion (i.e., with no knowledge of whether a sample was from a case patient or control subject) by using a Kratos AXIMA CFR (Shimadzu Biotech, Chestnut Ridge, NY) mass spectrometer operated in a linear mode. The following parameters were set for the data acquisition: mass range from 0 to approximately 180 000 d; laser power at 90; profile at 300; and five shots per spot. The instrument was calibrated using m/z ratios for the standards bovine serum albumin, aldolase, apomyoglobin, and cytochrome c. m/z ratios were determined for all data points.

Statistical Analysis

We treated patient status as a binary outcome (i.e., no cancer or cancer) denoted Y. For each individual, the mass spectrum contained 284 027 data points. Each data point was dissected from the mass spectrometer signal. We then simplified the data by considering only every 100th value in the individual spectra to considerably reduce the volume of data and the length of computing time. In preliminary evaluations of the data, this reduction in the volume of data did not affect the final results (data not shown). For spectral data, observations with high mean values tended to have larger variances than observations with low mean values. Thus, the spectral values were log-transformed to reduce the mean–variance dependence. Because we wanted to predict outcomes using the mass spectra, the log-transformed spectra were designated as predictors or covariates and denoted as X = (X_1, ..., X_2840).
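The data reduction described here can be sketched in a few lines of Python. This is a minimal illustration on synthetic intensities, not the authors' code: the function name, the choice to start at the 100th value, and the `floor` guard against non-positive readings are all assumptions, since the source does not state how such details were handled.

```python
import math
import random

N_POINTS = 284_027   # data points per spectrum, as stated in the text
STEP = 100           # keep every 100th value

def reduce_spectrum(intensities, step=STEP, floor=1e-6):
    """Keep every `step`-th intensity and log-transform it.

    `floor` is a hypothetical guard against log(0); it is not from the source.
    """
    # Starting at index step-1 keeps the 100th, 200th, ... values.
    kept = intensities[step - 1::step]
    return [math.log(max(v, floor)) for v in kept]

# Synthetic stand-in for one raw spectrum (random positive intensities).
rng = random.Random(0)
raw = [rng.uniform(1.0, 1000.0) for _ in range(N_POINTS)]

x = reduce_spectrum(raw)
print(len(x))  # 2840 predictors per subject
```

With this indexing convention, a spectrum of 284 027 points yields exactly 2840 predictors, matching the dimension of X given in the text.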

We used LDA with the spectral masses as predictors of outcome (24). To reduce the number of predictors, i.e., to make the calculations simpler, we used a simple feature-selection procedure that is based on the ratio of the across-group variance to the within-group variance (equivalent to the t test) comparing the values in control subjects with those in patients with head and neck cancer. We ranked all spectral masses by their absolute value of the t test and chose only the highest P (P = 45 top predictors, see below) to include in the LDA. To assess the predictive ability of our procedure, we used a cross-validation protocol in which we randomly chose two-thirds of the data as a training set and the other one-third as a test set to determine how well we could predict a cancer case. The 45 predictors were recalculated with each training set (n = 200).
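The feature-selection step can be illustrated as follows. This sketch assumes the pooled-variance form of the two-sample t statistic (the source says only that the criterion is equivalent to the t test), and the function names and toy data are hypothetical.

```python
import math

def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for two lists of values."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((v - ma) ** 2 for v in a)
    ssb = sum((v - mb) ** 2 for v in b)
    pooled_var = (ssa + ssb) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (ma - mb) / se if se > 0 else 0.0

def top_features(X, y, p):
    """Return the indices of the p features with the largest |t|.

    X is a list of samples (each a list of feature values); y holds 0/1 labels.
    """
    cases = [row for row, label in zip(X, y) if label == 1]
    controls = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        t = t_statistic([r[j] for r in cases], [r[j] for r in controls])
        scores.append((abs(t), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:p]]

# Toy data: feature 1 separates the groups; features 0 and 2 do not.
X = [[1.0, 5.0, 2.0], [1.2, 5.2, 1.9], [0.9, 4.9, 2.1],
     [1.1, 1.0, 2.0], [1.0, 1.1, 2.2], [0.8, 0.9, 1.8]]
y = [1, 1, 1, 0, 0, 0]
print(top_features(X, y, 1))
```

In a cross-validation setting the ranking would be recomputed on each training set only, as the text describes, so that the test set plays no part in choosing the predictors.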

We considered false-positive and true-positive results only in the test set. We created 200 cross-validation data sets by considering 200 randomly chosen groupings of the subjects. The average false-positive and true-positive rates, which were generated from the cross-validation data sets, were considered measures of the predictive ability of our procedure. To compute the expected false-positive and true-positive rates under the null hypothesis that the spectra lack predictive ability, we repeated this procedure after randomly permuting the outcomes (Y).
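The repeated-split protocol and its permutation-based null can be sketched as below. The classifier used here (`fit_midpoint`, a one-feature threshold) is a placeholder chosen only to keep the example short; it is not the LDA procedure of the paper, and the data are synthetic.

```python
import random

def split_indices(n, train_frac, rng):
    """Randomly split range(n) into training and test index lists."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = round(n * train_frac)
    return idx[:cut], idx[cut:]

def rates(y_true, y_pred):
    """Return (false-positive rate, true-positive rate) for 0/1 labels."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    neg = sum(1 for t in y_true if t == 0)
    pos = len(y_true) - neg
    return (fp / neg if neg else 0.0, tp / pos if pos else 0.0)

def repeated_holdout(X, y, fit, n_splits=200, train_frac=2 / 3,
                     permute=False, seed=0):
    """Average test-set FPR and TPR over random training/test splits.

    `fit(X_train, y_train)` must return a function mapping a sample to 0/1.
    With permute=True the labels are shuffled before each split, which
    estimates the rates expected when X carries no information about Y.
    """
    rng = random.Random(seed)
    fpr_sum = tpr_sum = 0.0
    for _ in range(n_splits):
        labels = list(y)
        if permute:
            rng.shuffle(labels)
        train, test = split_indices(len(y), train_frac, rng)
        predict = fit([X[i] for i in train], [labels[i] for i in train])
        fpr, tpr = rates([labels[i] for i in test],
                         [predict(X[i]) for i in test])
        fpr_sum += fpr
        tpr_sum += tpr
    return fpr_sum / n_splits, tpr_sum / n_splits

def fit_midpoint(X_train, y_train):
    """Placeholder classifier: threshold at the midpoint of the class means."""
    ones = [x[0] for x, t in zip(X_train, y_train) if t == 1]
    zeros = [x[0] for x, t in zip(X_train, y_train) if t == 0]
    m1, m0 = sum(ones) / len(ones), sum(zeros) / len(zeros)
    mid, sign = (m1 + m0) / 2, (1 if m1 >= m0 else -1)
    return lambda x: 1 if sign * (x[0] - mid) > 0 else 0

# Synthetic one-feature data: 30 "cases" and 30 "controls".
data_rng = random.Random(1)
X = [[data_rng.gauss(2.0, 1.0)] for _ in range(30)] + \
    [[data_rng.gauss(0.0, 1.0)] for _ in range(30)]
y = [1] * 30 + [0] * 30

observed = repeated_holdout(X, y, fit_midpoint)
null = repeated_holdout(X, y, fit_midpoint, permute=True)
print("observed (FPR, TPR):", observed)
print("null     (FPR, TPR):", null)
```

On informative data the observed true-positive rate should sit well above the false-positive rate, whereas under permuted labels the two rates should be close to each other.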

The false-positive rate (1 − specificity) and the true-positive rate (sensitivity) derived from the LDA can be altered by using a simple stochastic model. We assumed that the predictors X followed a multivariate normal distribution that was conditional on the binary outcome (Y). To predict Y for a particular value of X, we found the value of Y that maximized the posterior probability, which is the probability of being a case or a control given the observed spectra. We assigned a prior probability to each value of Y. These prior probabilities were used to control sensitivity and specificity. For example, if the prior probability of being a case patient was assumed to be 0, then the false-positive and true-positive rates would both be 0%. If the prior probability was assumed to be 1, then the false-positive and true-positive rates would both be 100%. We used the training data to estimate the parameters (mean and covariance matrix) associated with each of the conditional distributions (i.e., the probability of observing the spectra for a case patient or a control subject). As already noted, with LDA it is possible to set a tuning parameter that directly affects the balance between sensitivity and specificity (25). Therefore, we used the cross-validation results for a range of tuning parameters to construct receiver operating characteristic (ROC) curves. Because theoretical P values are intractable (due to the complicated nature of our procedure), we estimated a P value based on the 200 simulations.
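To make the role of the prior probability concrete, here is a minimal sketch of two-class LDA in plain Python. It assumes the textbook formulation with a pooled covariance matrix shared by both groups; the source does not spell out its exact estimator, and all function names and the toy data are illustrative.

```python
import math

def mean_vec(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def pooled_cov(groups, means):
    """Pooled within-group covariance matrix shared by both classes."""
    d = len(means[0])
    S = [[0.0] * d for _ in range(d)]
    for g, m in zip(groups, means):
        for r in g:
            for i in range(d):
                for j in range(d):
                    S[i][j] += (r[i] - m[i]) * (r[j] - m[j])
    dof = sum(len(g) for g in groups) - len(groups)
    return [[v / dof for v in row] for row in S]

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (M[i][n] - tail) / M[i][i]
    return w

def fit_lda(cases, controls):
    """Return predict(x, prior_case) maximizing the posterior probability."""
    m1, m0 = mean_vec(cases), mean_vec(controls)
    S = pooled_cov([cases, controls], [m1, m0])
    w = solve(S, [a - b for a, b in zip(m1, m0)])
    mid = [(a + b) / 2 for a, b in zip(m1, m0)]

    def predict(x, prior_case=0.5):
        if prior_case <= 0.0:   # a case is impossible a priori
            return 0
        if prior_case >= 1.0:   # a control is impossible a priori
            return 1
        # Log posterior odds of "case" versus "control".
        score = sum(wi * (xi - ci) for wi, xi, ci in zip(w, x, mid))
        score += math.log(prior_case / (1.0 - prior_case))
        return 1 if score > 0 else 0

    return predict

# Toy two-feature data (hypothetical values).
cases = [[2.0, 1.0], [3.0, 1.5], [2.5, 0.5], [3.5, 1.2]]
controls = [[0.0, 0.2], [1.0, 1.0], [0.5, -0.3], [1.5, 0.6]]
predict = fit_lda(cases, controls)
for prior in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(prior, sum(predict(x, prior) for x in cases + controls))
```

Sweeping the prior from 0 to 1 moves the classifier from calling no one a case to calling everyone a case, which is what allows an ROC curve to be traced from a single fitted model.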

To obtain the mean false-positive and true-positive rates per subject, we considered the number of times that correct and incorrect calls were made over the 200 simulations. We then compared these false-positive and true-positive rates across different groups of subjects stratified by the covariates sex, age, stage of disease, smoking history (pack-years), and history of alcohol use using the general linear methods function in R (26).


    RESULTS
 
We sought to test the hypothesis that serum proteins from cancer patients differ quantitatively or qualitatively from serum proteins from individuals without cancer (control subjects). To test this hypothesis, we selected a population of 191 patients with lung or head and neck cancer and 143 control subjects that included a higher frequency of individuals who smoked and/or drank than the frequency found among the general population (Table 1). Diluted serum samples were subjected to MALDI mass spectroscopy operated in a linear mode, with data acquired from 0 to 180 kd (14). We made no a priori assumptions about which proteins to analyze, i.e., based on mass cutoffs, fractionation schemes, or apparently large spectra peaks (m/z values).


Table 1. Demographics of the study population

 
Our first approach was to see whether average spectra differed between case patients and control subjects. We trained the procedure on two-thirds of the serum samples from the head and neck cancer patients and control subjects for average protein level across the spectra and then tested the predictor profile for the remaining one-third of the serum samples. In the test set, we classified as head and neck cancer patients those individuals whose protein mass spectrum was closer to the average spectrum of the case patients in the training set than to the average spectrum of the control subjects. The entire procedure was repeated 200 times. Although peak height and amplitude did not necessarily correspond with molecular concentrations in MALDI spectra (data not shown), we found that, on average, protein levels were higher in case patients than in control subjects and that there was some information in total protein levels to help distinguish case patients from control subjects. However, after the data were plotted on ROC curves, repeated simulations regularly overlapped with the null hypothesis (i.e., were not statistically significant) (Fig. 1). We concluded that an average protein level was not very informative: at a mean sensitivity of 70%, the false-positive rate was approximately 50%.
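The average-spectrum rule used in this first approach amounts to a nearest-mean classifier. The sketch below assumes squared Euclidean distance as the measure of closeness, which the source does not specify; the names and toy spectra are hypothetical.

```python
def mean_spectrum(spectra):
    """Pointwise average of a list of equal-length spectra."""
    n = len(spectra)
    return [sum(s[j] for s in spectra) / n for j in range(len(spectra[0]))]

def sq_distance(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def classify_by_nearest_mean(spectrum, case_mean, control_mean):
    """Return 1 (case) if the spectrum is closer to the case average, else 0."""
    closer_to_case = sq_distance(spectrum, case_mean) < sq_distance(spectrum, control_mean)
    return 1 if closer_to_case else 0

# Toy training spectra with three points each.
train_cases = [[3.0, 2.0, 4.0], [3.2, 2.2, 4.1]]
train_controls = [[2.0, 1.0, 3.0], [2.1, 1.2, 2.9]]
case_mean = mean_spectrum(train_cases)
control_mean = mean_spectrum(train_controls)

print(classify_by_nearest_mean([3.1, 2.0, 3.9], case_mean, control_mean))
print(classify_by_nearest_mean([2.0, 1.1, 3.1], case_mean, control_mean))
```

Because the rule compares whole spectra to two averages, it mostly reflects overall signal level, which is consistent with the limited discrimination reported for this approach.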



Fig. 1. Receiver operating characteristic (ROC) curves for observed data based on average protein prediction (solid line) and the null hypothesis (dashed line). Data were divided randomly 200 times into training and test sets, and ROC curves were generated. The thick dashed diagonal line represents the expected ROC curve under the null hypothesis, in which X and Y are independent and there is no information in the spectra about the outcomes. The gray dashed lines represent null permutations, whereas gray solid lines represent spectral data permutations. Spectral data included all the head and neck cancer patients and control subjects.

 
We then extracted information from the points along the entire mass spectra by treating the data as one continuous curve from 0 to 180 kd along the x-axis. Again, we did not preselect observable peaks, m/z values, or specific areas of the spectra. We selected the optimal number of spectral features to use in the LDA. For each value of P (number of features), the area under the ROC curves obtained using the cross-validation procedure described above was calculated. This provided a function of area under the curve on the y-axis and the number of covariates on the x-axis. The area under the ROC curve is a typical one-number summary of an ROC curve. However, because we were interested in procedures with high specificity, we considered the area under the curve for false-positive rates up to 10% (0.10). We plotted these areas against the number of features used by the LDA (Fig. 2). The maximum area under the ROC curve value occurred when we used 45 features (Fig. 2). We thus defined a feature-selection procedure that selects, as predictors in the LDA, the top 45 spectral masses in a ranking according to the absolute value of the t test. Next, we randomly chose two-thirds of the data to train the procedure and the other one-third to test the procedure. By considering false-positive and true-positive rates in only the test set, average rates in the test set provided a measure of prediction.
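The restricted area under the ROC curve used to choose the number of features can be computed with the trapezoidal rule. The following sketch is illustrative only: the function name and the example curve are made up, and linear interpolation at the 10% cutoff is an assumption.

```python
def partial_auc(fpr, tpr, max_fpr=0.10):
    """Trapezoidal area under an ROC curve for FPR in [0, max_fpr].

    `fpr` must be sorted in increasing order, with `tpr` aligned to it.
    """
    area = 0.0
    for i in range(1, len(fpr)):
        x0, x1 = fpr[i - 1], fpr[i]
        y0, y1 = tpr[i - 1], tpr[i]
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Interpolate the curve at the cutoff and stop there.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical ROC curve (not data from the study).
fpr = [0.0, 0.05, 0.2, 1.0]
tpr = [0.0, 0.5, 0.8, 1.0]
print(partial_auc(fpr, tpr))            # area for FPR up to 0.10
print(partial_auc(fpr, tpr, 1.0))       # full area under the curve
print(partial_auc([0.0, 1.0], [0.0, 1.0]))  # chance diagonal, FPR up to 0.10
```

Given one such area per candidate value of P, the selected number of features is simply the P with the largest area; for the chance diagonal the restricted area is 0.005, which gives a baseline for comparison.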



Fig. 2. Area under the receiver operating characteristic (ROC) curves for false-positive rates between 0 and 1 (solid line) and area under the ROC curves for false-positive rates between 0 and 0.10 (dashed line) plotted against number of features (P) used in linear discriminant analysis. Vertical lines show the maximum occurrence for each curve. Data include all head and neck cancer patients for each value of P. Area under the ROC curves was calculated using a cross-validation procedure (see "Subjects and Methods" section).

 
We first developed our procedure to distinguish sera from the 99 patients with head and neck SCC from the 143 control subjects. Our models accounted for the possibility that the procedure might be applied in settings where the prevalence of head and neck SCC differed from that used here [99/(99 + 143) = 41%]. We predicted outcomes for the test sets on the basis of training sets of randomly chosen divisions of the data as described above. To be sure that the predicted outcomes were not the result of mathematical artifacts, we repeated the procedure 200 times after randomly permuting the outcomes of Y. We calculated the specificity and sensitivity of each model across a range of cutoffs, generated an ROC curve for each of the 200 permutations, and averaged the ROC curves (Fig. 3). The average ROC curve was computed by averaging the true-positive rate associated with each false-positive rate. We found that, at the mean outcome with a sensitivity of 70% (0.70) at a specificity of 90% (0.90), the 200 permutations never intersected with the null hypothesis (P = .01, 95% confidence interval = 0.00 to 0.02). Because these ROC curves were always calculated on data independent from the data that generated the models, they reflect what would be expected in practice and demonstrate that our prediction model is statistically significantly better than the null hypothesis.
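Averaging the true-positive rate at each false-positive rate is often called vertical averaging of ROC curves. A minimal sketch follows, assuming linear interpolation between the points of each curve and a common grid of false-positive rates; the names and the two example curves are hypothetical.

```python
def interp(x, xs, ys):
    """Linear interpolation of ys over non-decreasing xs at point x."""
    if x <= xs[0]:
        return ys[0]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + frac * (ys[i] - ys[i - 1])
    return ys[-1]

def average_roc(curves, grid):
    """Average TPR across curves at each FPR value in `grid`.

    `curves` is a list of (fpr, tpr) pairs, one per simulation.
    """
    return [sum(interp(f, fpr, tpr) for fpr, tpr in curves) / len(curves)
            for f in grid]

# Two toy curves standing in for the 200 cross-validation curves.
curves = [([0.0, 1.0], [0.0, 1.0]),
          ([0.0, 0.5, 1.0], [0.0, 1.0, 1.0])]
grid = [0.0, 0.25, 0.5, 1.0]
print(average_roc(curves, grid))  # [0.0, 0.375, 0.75, 1.0]
```

The same function can be applied to the curves obtained after permuting the outcomes, giving an average null curve against which the observed average can be compared.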



Fig. 3. Average receiver operating characteristic (ROC) curves for observed data (solid line) and the null hypothesis (dashed line). Data were divided randomly 200 times into training and test sets, and ROC curves were generated. Thick dashed diagonal line represents the expected ROC curve under the null hypothesis, in which X and Y are independent and there is no information in the spectra about the outcomes. Gray dashed lines represent null permutations, whereas gray solid lines represent spectral data permutations. Spectral data included all data from the head and neck cancer patients and control subjects. Numbers shown on the curves represent the value of linear discriminant analysis tuning parameters that yielded specificity and sensitivity represented by the respective black squares and generated by the cross-validation procedure described in the "Subjects and Methods" section.

 
The model developed on the head and neck cancer case patients was then used to determine whether it could predict the presence of lung cancer. By using the optimal head and neck cancer model cutoff of 73% sensitivity and 90% specificity, we obtained sensitivities of 52% for lung SCC, 34% for lung adenocarcinomas, and 40% for lung large-cell carcinomas when the false-positive rate was 10% (Table 2). These sensitivities were unexpected, given the histologic diversity in this heterogeneous group of lung tumors and the fact that the model was developed by using head and neck cancers. Moreover, when the seven individuals without cancer but with confounding conditions (acute pneumonia or other inflammatory conditions) were included with the lung cancer patients (Table 2), the model predicted that all seven were free of disease, suggesting that certain comorbid conditions did not increase the false-positive rate (i.e., decrease the specificity).


Table 2. Key determinate prediction from head and neck model tested in serum samples from patients with lung cancer (n = 92) and individuals without cancer but with confounding diseases (n = 7)

 
We next assessed how specific demographic factors in the tested populations might have influenced the ability of our models to predict correctly who had cancer. By multivariable logistic regression analysis, we found no differences in prediction across disease stage, race/ethnicity, sex, or smoking history in either the head and neck or lung cancer populations (data not shown). For individuals with a history of heavy alcohol consumption, we found a trend toward a higher false-positive rate among the control group (data not shown). This analysis thus suggested the possibility of a slight overprediction for heavy alcohol exposure in our models.

We created a summary of the average spectra for head and neck cancer case patients and control subjects (Fig. 4). Sera from head and neck cancer case patients generally contained more total protein than sera from control subjects (Fig. 4, upper portion). The lower portion of Fig. 4 is a histogram distribution of individual points, demonstrating the number of times the points emerged as features during the 200 random divisions of the data. The most frequently appearing points correspond to positions where peaks appeared or disappeared in the head and neck cancer samples. One particular peak, at 111 kd, was different between sera from case patients and control subjects in all 200 simulations. Many peaks represent proteins of less than 70 kd (i.e., 5, 10, 12, 15, 20, 45, 47, 54, and 64 kd) and may represent interesting molecules and candidate serum cancer markers that merit further identification (28-30).



Fig. 4. Differences in average mass spectra between case patients (solid line) and control subjects (dashed line). Average spectra are derived from all 99 head and neck cancer patients and 143 control subjects. The frequency at which features were selected during the 200 random divisions of the data into training and test sets is shown in the bottom panel. The range of the y-axis (0%–100%) is for spectral peaks occurring in case patients but not control subjects.

 

    DISCUSSION
 
We sought to classify case patients and control subjects on the basis of mass protein spectra of their sera and found that using a simple feature-selection procedure combined with the LDA allowed us to predict the cancer status of case patients statistically significantly better than using chance alone. This procedure was not based on any prior feature selection from spectral peaks. Because we generated ROC curves and tested these on data independent of those used in the training sets, we believe the results reflect accurately what would be expected in clinical practice.

Recently, at least two other studies (20,21) have sought to identify differences in sera protein spectra between case patients and control subjects. Both studies used SELDI, a separation approach that preselects proteins in a sample by fractionation based on prebinding to different surfaces or chemical coatings, and analyzed only a subset of proteins over a small range of m/z ratios. In one study (20), sinapinic acid was used to fractionate serum proteins, but only proteins between 2 and 40 kd were examined. In the other study (21), proteins were ionized with α-cyano-4-hydroxycinnamic acid, and only proteins less than 20 kd were examined. By contrast, we used MALDI technology and studied proteins that ranged up to 180 kd in size. MALDI is more straightforward technically than SELDI because it requires no washing procedures, which simplifies the sample processing and may also reduce the possibility of introducing artifacts. We identified many protein peaks at higher molecular masses (e.g., 111 kd) in sera from cancer patients that were usually absent in sera from control subjects. We believe that these higher molecular mass peaks may represent intact circulating proteins and not fragments or nonprotein macromolecules, which may contribute to some of the lower molecular mass peaks. Moreover, because MALDI does not require modification of the proteins, we will be able to fractionate, digest, and identify the protein peaks to identify key molecules likely to play a role in lung and head and neck cancer tumorigenesis (28-30).

In general, control populations are limited by a low prevalence of pertinent risk factors for the disease to be tested. Our control patient population was selected to include patients at risk for aerodigestive tract cancers and, thus, our study provides information on the practical use of cancer detection by using MALDI. The cancer patients in our study represent typical patients who undergo attempts at curative treatment in an academic medical institution. Our lung cancer patient population also consisted of individuals with different histologic diagnoses, which is characteristic of NSCLC. One advantage of this diverse population was that it allowed us to optimize the LDA for SCC (or any specific histologic type) and to detect other histologic types of NSCLC (Table 2 and data not shown). These results suggest that a pattern of serum protein markers can ultimately be identified that will enable the detection of most major histologic types of lung cancer.

Our control population included individuals that were closely matched with the case patients and fit the demographic profile for those at increased risk of both head and neck and lung cancer. Approximately 33% of our control subjects were smokers and 15% were heavy drinkers, and thus the control subjects represent a reasonable target population for eventual cancer screening. When the heavy drinkers (more than one drink/day) were included in the models, the rate of false-positive results increased slightly (i.e., the models overpredicted). One explanation for this result is that the overfitting of the data could be associated with factors expressed in the sera of heavy alcohol drinkers (and perhaps smokers) that will continue to confound prediction algorithms in exposed populations. An alternative explanation is that patients at high risk for aerodigestive tract cancers may already harbor premalignant lesions or small occult cancers that will be discovered with time. It was reassuring to find that a small number of individuals with no history of drinking alcohol but with confounding conditions could be identified as free of cancer when they were included among the lung cancer population. A larger number of control subjects with clear risk factors and comorbid or confounding conditions will help validate serum protein patterns and ascertain the ultimate value of such tests. In addition, peak differences remain to be tested prospectively in an entirely independent population of case patients and control subjects.

Statistical analyses for protein spectral data continue to be developed. Currently, two common methods are cluster analysis and decision tree classification analysis. A recent example of cluster analysis for protein spectral data can be found in a study on ovarian cancer (21). The analysis was based on genetic algorithms first described by Holland (31) and on a cluster analysis attributed to Kohonen (32,33). In phase I of the data analysis (i.e., the training phase), the algorithm first identified a small subset of key values across the x-axis using an iterative searching process. A subset of values was judged important because the y amplitude patterns at the specific m/z values segregated case patients from control subjects. In phase II of the analysis (i.e., the test phase), only the key subset of m/z values identified in phase I was used to classify the unknown samples. Each unknown was then classified as a case patient, a control subject, or a new cluster (http://clinicalproteomics.steem.com). A recent example of a decision tree classification analysis for protein spectral data comes from the same group, in a study on prostate cancer (20). A decision tree classification analysis was used to split the dataset into two bins by using one rule or question at a time. This splitting of the dataset continued until terminal nodes or leaves were produced and further splitting provided no additional gain. The classification of terminal nodes was determined by the class of samples (case patients, control subjects, or subjects with benign disease) representing the majority of samples in that node. Protein peaks selected by this process to form splitting rules were the ones that achieved the maximum reduction of cost (or complexity) in the two descendant nodes. The area under the curve was then computed to identify the m/z peaks with the highest potential to discriminate between the major groups or classes of samples. A Bayesian approach was used to calculate the expected probabilities of each class in each terminal node. Cluster algorithms such as these do not necessarily take advantage of thoroughly training the data.
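The splitting rule used in decision tree classification can be illustrated with a minimal sketch. It is not the implementation used in the cited prostate cancer study: the cost measure (Gini impurity), the function names, and the toy intensities are all assumptions made for illustration. For each peak, the code tries every midpoint between observed intensities as a threshold and keeps the rule that most reduces impurity in the two descendant nodes.

```python
from typing import List, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of binary labels (1 = case patient, 0 = control subject)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(intensities: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, impurity reduction) for the best rule 'intensity <= threshold'."""
    parent = gini(labels)
    n = len(labels)
    values = sorted(set(intensities))
    best_threshold, best_gain = values[0], 0.0
    # Candidate thresholds are midpoints between consecutive distinct intensities.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [y for x, y in zip(intensities, labels) if x <= threshold]
        right = [y for x, y in zip(intensities, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - weighted
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain


def best_peak(spectra: List[List[float]], labels: Sequence[int]) -> Tuple[int, float, float]:
    """Return (peak index, threshold, impurity reduction) for the best single rule."""
    results = []
    for j in range(len(spectra[0])):
        column = [row[j] for row in spectra]
        threshold, gain = best_split(column, labels)
        results.append((gain, j, threshold))
    gain, j, threshold = max(results)
    return j, threshold, gain


# Hypothetical toy data: four samples, two peaks; only peak 0 separates the groups.
toy_spectra = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
toy_labels = [0, 0, 1, 1]
peak, threshold, gain = best_peak(toy_spectra, toy_labels)
```

A full tree would apply the same search recursively to each descendant node until no further split reduces impurity.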

To analyze our data, we searched for a procedure that could reliably predict the outcome Y given the protein spectra X. For this problem, we were more interested in correct predictions than in interpretation of model parameters, which made the typical logistic regression model impractical. We selected a procedure that could reliably predict outcomes between case patients and control subjects regardless of the interpretability of the individual parameters. Thus, we did not preselect key features that would likely raise the predictability but lower the reproducibility in different populations. The sensitivity of our procedure could be raised by various simple approaches, including better alignment of peaks from run to run and further refinement of key predictors (34). Moreover, we used only one-tenth of the data; more input data could result in further refinements of the model.
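The kind of procedure described here, a t test feature selection followed by a linear discriminant, can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: it uses a diagonal linear discriminant, a simplification of full LDA that ignores covariance between peaks, and all function names and toy values are hypothetical. The cutoff applied to the discriminant score plays the role of a tuning parameter.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence, Tuple


def t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch two-sample t statistic for one m/z position (needs >= 2 values per group)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def split_by_class(X: List[List[float]], y: Sequence[int], j: int):
    """Intensities at column j for case patients (1) and control subjects (0)."""
    cases = [row[j] for row, label in zip(X, y) if label == 1]
    controls = [row[j] for row, label in zip(X, y) if label == 0]
    return cases, controls


def select_features(X: List[List[float]], y: Sequence[int], k: int) -> List[int]:
    """Indices of the k columns with the largest absolute t statistic."""
    scores = []
    for j in range(len(X[0])):
        cases, controls = split_by_class(X, y, j)
        scores.append((abs(t_statistic(cases, controls)), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def fit_diagonal_lda(X, y, features) -> List[Tuple[int, float, float, float]]:
    """Per-feature class means and pooled variance; covariances are ignored."""
    model = []
    for j in features:
        cases, controls = split_by_class(X, y, j)
        pooled = ((len(cases) - 1) * variance(cases)
                  + (len(controls) - 1) * variance(controls)) / (len(X) - 2)
        # Guard against a zero pooled variance in degenerate toy inputs.
        model.append((j, mean(cases), mean(controls), max(pooled, 1e-12)))
    return model


def discriminant_score(model, spectrum: Sequence[float]) -> float:
    """Positive scores favor 'case'; shifting the cutoff trades sensitivity for specificity."""
    score = 0.0
    for j, mu_case, mu_control, pooled in model:
        midpoint = (mu_case + mu_control) / 2.0
        score += (mu_case - mu_control) * (spectrum[j] - midpoint) / pooled
    return score


# Hypothetical toy data: six spectra, three m/z positions; only column 1 is informative.
X = [[1.0, 10.0, 3.0], [2.0, 11.0, 1.0], [1.5, 12.0, 2.0],
     [1.2, 1.0, 2.5], [1.8, 2.0, 1.5], [1.4, 3.0, 3.5]]
y = [1, 1, 1, 0, 0, 0]
features = select_features(X, y, k=1)
model = fit_diagonal_lda(X, y, features)
```

When such a procedure is cross-validated, the feature selection step should be repeated inside each training split; selecting features on the full dataset first would bias the estimated error rates downward.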

Our study thus used a standard statistical analysis to test the hypothesis that serum mass spectra contain sufficient information to separate case patients from control subjects (24). Unlike other studies (19-23), which reported the best-case analysis, our study was a fundamental test of the hypothesis, the results of which gave a reliable estimate of what can be routinely achieved with such an experimental approach. Indeed, inspection of the ROC curves shows that they approach sensitivities and specificities similar to those reported in some earlier studies (20,21). Clearly, a number of statistical approaches can generate the relevant m/z peaks for group classification. Studies (19-23) on serum mass spectra published so far have all shown that several peaks are needed to separate case patients from control subjects. No single peak, and thus no single protein, is able to distinguish case patients from control subjects reliably. Serum-profiling approaches are a substantial advance over previous attempts to isolate and characterize a single protein marker to detect any of a variety of cancers. Indeed, investigators using a variety of molecularly based approaches in serum DNA have come to the same conclusion: more than one molecular target is needed in cancer detection (35).
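For readers less familiar with ROC curves, the following minimal sketch shows how sensitivity and false-positive rate (1 minus specificity) can be traced out by sweeping a cutoff over classifier scores. The function names and the toy scores are hypothetical and are not taken from the study data. Under the null hypothesis that the scores carry no information, the expected curve lies along the diagonal, with an area of about 0.5.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """(false-positive rate, sensitivity) pairs, one per distinct cutoff.

    Labels are 1 for case patients and 0 for control subjects; a sample is
    called positive when its score is at or above the cutoff.
    """
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    points = [(0.0, 0.0)]
    # Lower the cutoff through each observed score, highest first.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_controls, tp / n_cases))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores for three case patients and three control subjects.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(toy_scores, toy_labels)
area = auc(curve)
```

In this toy example, eight of the nine case-control pairs are ranked correctly, so the area is 8/9.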

Although single serum biomarkers for cancer such as the CA 125 antigen for ovarian cancer or prostate-specific antigen for prostate cancer have been identified, they have several limitations including limited sensitivity and recurring false-positive results. Studies have yet to identify protein biomarkers for lung or head and neck cancers that are suitable for evaluation in screening trials. We have now identified m/z peaks that may yield such markers in these diseases. Our study suggests that a number of informative patterns and key protein markers may be identified in the near future for different cancers using similar serum-based approaches. One benefit of such an approach is that MALDI and SELDI assays are known to be reliable and highly reproducible. Moreover, after the key proteins are identified, antibodies can be used to develop high-throughput, low-cost immunoassays. These assays can then be incorporated into prospective longitudinal trials to assess the true predictive values of these proteins in cancer detection (36).


    NOTES
Supported by Public Health Service grant 1 U01 CA84986 (to D. Sidransky) from the National Cancer Institute, National Institutes of Health, Department of Health and Human Services, as part of The Early Detection Research Network.

Cangen provided partial funding for the research described in this article. Under a licensing agreement between Cangen and The Johns Hopkins University, Dr. Sidransky is entitled to a share of royalties received by the University on sales of products described in this study. Dr. Sidransky and the University own Cangen stock, which is subject to certain restrictions under University policy. The terms of this arrangement are being managed by The Johns Hopkins University in accordance with its conflict of interest policies.


    REFERENCES

1 Cancer: causes, occurrence and control. IARC Sci Publ 1990;(100):1–352.

2 Wingo PA, Tong T, Bolden S. Cancer statistics, 1995. CA Cancer J Clin 1995;45:8–30.

3 Smart CR. Annual screening using chest x-ray examination for the diagnosis of lung cancer. Cancer 1993;72:2295–8.

4 Frost JK, Ball WC Jr, Levin ML, Tockman MS, Baker RR, Carter D, et al. Early lung cancer detection: results of the initial (prevalence) radiologic and cytologic screening in the Johns Hopkins study. Am Rev Respir Dis 1984;130:549–54.

5 Birrer MJ, Brown PH. Application of molecular genetics to the early diagnosis and screening of lung cancer. Cancer Res 1992;52(9 Suppl):2658s–64s.

6 Eddy DM. Screening for lung cancer. Ann Int Med 1989;111:232–7.

7 Blot WJ, McLaughlin JK, Winn DM, Austin DF, Greenberg RS, Preston-Martin S, et al. Smoking and drinking in relation to oral and pharyngeal cancer. Cancer Res 1988;48:3282–7.

8 Forastiere A, Koch W, Trotti A, Sidransky D. Head and neck cancer [published erratum appears in N Engl J Med 2002;346:788]. N Engl J Med 2001;345:1890–900.

9 American Cancer Society. Cancer facts & figures–1997. Atlanta (GA): American Cancer Society; 1997.

10 Palcic B, Lam S, Hung J, MacAulay C. Detection and localization of early lung cancer by imaging techniques. Chest 1991;99:742–3.

11 Lam S, Kennedy T, Unger M, Miller YE, Gelmont D, Rusch V, et al. Localization of bronchial intraepithelial neoplastic lesions by fluorescence bronchoscopy. Chest 1998;113:696–702.

12 Swensen SJ, Jett JR, Sloan JA, Midthun DE, Hartman TE, Sykes AM, et al. Screening for lung cancer with low-dose spiral computed tomography. Am J Respir Crit Care Med 2002;165:508–13.

13 Altorki N, Kent M, Pasmantier M. Detection of early-stage lung cancer: computed tomographic scan or chest radiograph? J Thorac Cardiovasc Surg 2001;121:1053–7.

14 Vansteenkiste JF. Imaging in lung cancer: positron emission tomography scan. Eur Respir J Suppl 2002;35:49s–60s.

15 Haberkorn U. Positron emission tomography (PET) of non-small cell lung cancer. Lung Cancer 2001;34(Suppl 2):S115–21.

16 Hansen M, Pedersen AG. Tumor markers in patients with lung cancer. Chest 1986;89(4 Suppl):219S–24S.

17 Mulshine JL. Opinion: screening for lung cancer: in pursuit of pre-metastatic disease. Nat Rev Cancer 2003;3:65–73.

18 Merchant M, Weinberger SR. Recent advancements in surface-enhanced laser desorption/ionization-time of flight-mass spectrometry. Electrophoresis 2000;21:1164–77.

19 Wright GL Jr, Cazares LH, Leung SM, Nasim S, Adam BL, Yip TT, et al. Proteinchip (R) surface enhanced laser desorption/ionization (SELDI) mass spectrometry: a novel proteomic technology for detection of prostate cancer biomarkers in complex protein mixtures. Prostate Cancer Prostatic Dis 1999;2:264–76.

20 Adam BL, Qu Y, Davis JW, Ward MD, Clements MA, Cazares LH, et al. Serum protein fingerprinting coupled with a pattern-matching algorithm distinguishes prostate cancer from benign prostate hyperplasia. Cancer Res 2002;62:3609–14.

21 Petricoin EF III, Ardekani AM, Hitt BA, Levine PJ, Fusaro VA, Steinberg SM, et al. Use of proteomic patterns in serum to identify ovarian cancer. Lancet 2002;359:572–7.

22 Valerio A, Basso D, Mazza S, Baldo G, Tiengo A, Pedrazzoli S, et al. Serum protein profiles of patients with pancreatic cancer and chronic pancreatitis: searching for a diagnostic protein pattern. Rapid Commun Mass Spectrom 2001;15:2420–5.

23 Li J, Zhang Z, Rosenzweig J, Wang YY, Chan DW. Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer. Clin Chem 2002;48:1296–304.

24 Fisher RA. The use of multiple measurements in taxonomic problems. Ann Eugen 1936;7:179–88.

25 Venables WN, Ripley BD. Modern applied statistics with S. 4th ed. New York (NY): Springer; 2002.

26 Ihaka R, Gentleman R. A language for data analysis and graphics. J Comput Graph Stat 1996;5:299–314.

27 Sobin LH, Hermanek P, Hutter RV. TNM classification of malignant tumors. A comparison between the new (1987) and the old editions. Cancer 1988;61:2310–4.

28 Srinivas PR, Verma M, Zhao Y, Srivastava S. Proteomics for cancer biomarker discovery. Clin Chem 2002;48:1160–9.

29 Petricoin EF, Zoon KC, Kohn EC, Barrett JC, Liotta LA. Clinical proteomics: translating benchside promise into bedside reality. Nat Rev Drug Discov 2002;1:683–95.

30 Pardanani A, Wieben ED, Spelsberg TC, Tefferi A. Primer on medical genomics. Part IV: expression proteomics. Mayo Clin Proc 2002;77:1185–96.

31 Holland JH, editor. Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence, 3rd ed. Cambridge (MA): MIT Press; 1994.

32 Ferran EA, Ferrara P. Topological maps of protein sequences. Biol Cybern 1991;65:451–8.

33 Kohonen T. The self-organizing map. Proc Inst Electr Electron Eng 1990;78:1464–80.

34 Chakravarti DN, Chakravarti B, Moutsatsos I. Informatic tools for proteome profiling. Biotechniques 2002;Suppl:4–10, 12–5.

35 Sidransky D. Emerging molecular markers of cancer. Nat Rev Cancer 2002;2:210–9.

36 Sullivan Pepe M, Etzioni R, Feng Z, Potter JD, Thompson ML, Thornquist M, et al. Phases of biomarker development for early detection of cancer. J Natl Cancer Inst 2001;93:1054–61.

Manuscript received April 3, 2003; revised September 4, 2003; accepted September 22, 2003.


Copyright © 2003 Oxford University Press (unless otherwise stated)