Histomathematical Analysis of Clinical Specimens: Challenges and Progress
Pathogenetics Unit, Laboratory of Pathology and Urologic Oncology Branch (GG,JWG,RFC,MAT,MRE-B), and Urologic Oncology Branch (WML), Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
Correspondence to: Michael R. Emmert-Buck, Pathogenetics Unit, Advanced Technology Center, Laboratory of Pathology and Urologic Oncology Branch, Center for Cancer Research, National Cancer Institute, 8717 Grovemont Circle, Bethesda, MD 20892-4605. E-mail: mbuck{at}helix.nih.gov
Summary |
---|
(J Histochem Cytochem 53:177–185, 2005)
Key Words: histomathematics, histopathology, mathematics, immunohistochemistry, prostate cancer
Introduction |
---|
Historically, investigators have semiquantitatively measured the level of one (or perhaps two) transcripts or proteins at a time in tissue sections using in situ hybridization or immunohistochemistry (IHC), respectively. More recently, the use of tissue microdissection has facilitated profiling of specific, dissected cell populations. This technique is a useful advance, but the analysis is limited to the relatively few cell types that are procured in a study, and a comprehensive view of histopathology is not obtained.
The molecular profiling field will advance most efficiently when investigators have a range of analysis technologies at their disposal, that is, a complete set of tools that can be utilized alone or in combination depending upon the particular goals of a study. Thus, in addition to developing new analysis methods, several groups are experimenting with multiplex expression measurements based on conventional methods such as IHC. The hope is to build and expand upon an established platform with which investigators are already experienced. Along these lines, we evaluated the feasibility of performing and analyzing multiplex immunohistochemical data from prostate tissue sections. Whole-mount prostate cases were utilized because each case has several different histological areas of interest that can be stained, examined, and analyzed simultaneously. This same approach can be used for studying many tissue types, including brain, developing embryos, or any organ exhibiting a disease process. The present study represents a step toward generating mathematical descriptions of histopathology.
Materials and Methods |
---|
Immunohistochemistry
Tissue sections were cut to 5-µm thickness and stained with various antibodies according to a standard IHC protocol (Zymed Histostain-Plus Kit; Zymed, South San Francisco, CA). Briefly, IHC reactions were carried out on non-pretreated sections according to the following conditions: sections were blocked for 10 min with blocking serum (Zymed). Sections were incubated for 1 hr at room temperature with one of the primary antibodies: prostate-specific antigen (PSA), monoclonal (cat. #MS-1406-PABX; NeoMarkers, Fremont, CA) dilution 1:25; HSP27 (heat shock protein), monoclonal (cat. #NCL-HSP27; Novocastra, Newcastle, UK) dilution 1:20; Pim-1, goat polyclonal (cat. #CACA sc-7856; Santa Cruz Biotechnologies, Inc., Santa Cruz, CA) dilution 1:100; histone H1, monoclonal (cat. #MS-628-PABX; NeoMarkers) dilution 1:20; P130/Rb2, monoclonal (cat. #MS-866-PABX; NeoMarkers) dilution 1:20; glyceraldehyde-3-phosphate dehydrogenase (GAPDH), monoclonal (cat. #4699-9555; Biogenesis, Kingston, NH) dilution 1:10; cytokeratin clone AE1/AE3, monoclonal (cat. #M3515; DAKO, Carpinteria, CA) dilution 1:50; anti-ubiquitin, monoclonal (cat. #13-1600; Zymed) dilution 1:10; vimentin, monoclonal (cat. #M0725; DAKO) dilution 1:15; CD3, rabbit anti-human (cat. #08-0102; Zymed) prediluted; cathepsin-D mouse anti-human (cat. #28-0002; Zymed) prediluted; S-100 rabbit anti-human (cat. #08-4046; Zymed) prediluted; smooth-muscle actin (SMA), rabbit anti-human (cat. #08-0106; Zymed) prediluted; pan ERK monoclonal (cat. #610123; BD Biosciences, San Jose, CA) dilution 1:20; phospho-p44/42 MAP kinase (ERK) polyclonal rabbit (cat. #9101S; Cell Signaling Technology, Inc., Beverly, MA) dilution 1:15; p53 monoclonal (cat. #M7001; DAKO) dilution 1:10; Ki-67 polyclonal rabbit (cat. #sc15402; Santa Cruz Biotechnologies, Inc.) dilution 1:5; p27Kip1 monoclonal (cat. #MS-256-PABX; NeoMarkers) dilution 1:20; p21WAF1 monoclonal (cat. #MS-230-PABX0; NeoMarkers) dilution 1:10; Her-2-c-erbB2 monoclonal (cat. #NCL-L-CB11; Novocastra) dilution 1:20; phospho-EGF receptor monoclonal (Upstate Biotechnology; Lake Placid, NY) dilution 1:10; caspase-3 rabbit polyclonal (cat. #67341A; Pharmingen, San Diego, CA) dilution 1:20; Akt rabbit polyclonal (cat. #9272; Cell Signaling Technology, Inc.) dilution 1:20; phospho Akt (cat. #9277S; Cell Signaling Technology, Inc.) dilution 1:10.
Sections were incubated for 10 min with a biotinylated secondary antibody, and the signal was detected by streptavidin-peroxidase using 3-amino-9-ethylcarbazole chromogen as a substrate for peroxidase. Slides were counterstained with hematoxylin. Positive reactions (i.e., positively stained cells) were identified by the presence of a red precipitate. Negative reactions (i.e., negative cells) were identified by the absence of a red precipitate and the presence of only blue counterstain.
Data Collection
Two pathologists (JG and RC) analyzed all IHC-stained tissue sections and photographed representative stained regions corresponding to four separate morphological areas:
The stained sections were photographed (constant tissue area of 0.1 mm²) at ×200 magnification with an Olympus microscope and a charge-coupled device (CCD) camera. The camera had a resolution of 2080 × 1542 pixels (CCD color Bayer mosaic; Q-Color-3, Olympus America Inc., Melville, NY). Seven images were taken per slide, one of each of the seven different morphological areas (Figure 1), for a total of 2100 images. Each pathologist took 1050 images: seven histologic areas per slide, multiplied by 15 antibodies, multiplied by 10 cases.
Images were examined simultaneously using the ACDSee program (ACD Systems of America; Miami, FL), and for each antibody the most intensely stained and the least intensely stained images were identified. The staining colors were saved by manually placing the arrow pointer on the red staining in the most intensely stained image (i.e., positively stained cells) and on the blue staining in the least intensely stained image (i.e., negative cells). Every image for each antibody was then screened in ImagePro against these saved staining intensities to obtain the total positive and negative cell counts per image. This manual adjustment was performed because the ImagePro RGB did not separate the colors sufficiently.
The ImagePro watershed separation was used for the image analysis. This is a method to separate objects and assist in the counting process. The size of the counted objects does change; however, because the same process was applied uniformly across all of the cases, it did not alter the obtained final results.
Every measurement was transferred to an MS Excel (Microsoft Excel 2000; Seattle, WA) spreadsheet, and the mean staining was calculated according to the formula [positive cells/total cells (positive + negative cells)].
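The per-image calculation above can be sketched in a few lines of Python. This is only an illustration of the stated formula, not the spreadsheet actually used; the function names are hypothetical, and returning `None` for an image with no counted cells is our assumption.

```python
def positive_fraction(positive_cells, negative_cells):
    """Return positive / (positive + negative) for one image.

    Returns None when no cells were counted (an assumption; the source
    does not say how empty images were handled).
    """
    total = positive_cells + negative_cells
    if total == 0:
        return None
    return positive_cells / total


def mean_staining(counts):
    """Average the per-image fractions over (positive, negative) pairs,
    skipping images in which no cells were counted."""
    fractions = [positive_fraction(p, n) for p, n in counts]
    valid = [f for f in fractions if f is not None]
    return sum(valid) / len(valid) if valid else None
```

For example, `positive_fraction(30, 70)` describes an image with 30 stained cells out of 100 counted cells.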
Data Analysis
The Principal Component Analysis (PCA) module of the Partek Pro software package (Partek Inc.; St Charles, MO) was used to analyze the results. The data were imported from Microsoft Excel to Partek Pro 5.1, and a PCA scatterplot graph was generated for the four different cell lineages (epithelium, nerve, stroma, inflammatory infiltrate). Overall, this comprised M observations with N components each; in our case, M = 70 (the number of patients multiplied by the seven histologic areas) and N = 15 (the number of antibodies in the study). To find the correlation between the different cell populations, we calculated the covariance matrix of the measurements.
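A covariance computation of this kind can be sketched as follows. The sketch assumes that each row is one observation (one histologic area of one case), that each column is one antibody, and that the sample covariance with divisor M − 1 is wanted. These are assumptions for illustration only, and the function name is hypothetical. Partek Pro's internal conventions are not documented here.

```python
def covariance_matrix(rows):
    """Sample covariance (divisor M - 1) between the columns of `rows`.

    `rows` is a list of M observations, each a list of N measurements.
    The result is an N x N symmetric matrix as nested lists.
    """
    m = len(rows)
    n = len(rows[0])
    # Column means, one per antibody.
    means = [sum(row[j] for row in rows) / m for j in range(n)]
    # Subtract the column mean from every measurement.
    centered = [[row[j] - means[j] for j in range(n)] for row in rows]
    return [
        [sum(c[j] * c[k] for c in centered) / (m - 1) for k in range(n)]
        for j in range(n)
    ]


# Toy data: 3 observations of 2 perfectly correlated measurements.
toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = covariance_matrix(toy)
```

With 70 observations and 15 antibodies, the same function would return a 15 × 15 matrix.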
This represents a multidimensional matrix that makes it difficult to show correlation between observations. Simplification is needed so that patterns are easier to interpret; thus, we applied PCA (Raychaudhuri et al. 2000; Lindsay 2002). PCA expresses data in such a way as to highlight similarities and differences among analysis groups, particularly in data of high dimensions where graphical representations are not possible. Another advantage of PCA is that one can compress large datasets, thereby reducing the number of dimensions, without much loss of information.
After generating a covariance matrix, the eigenvectors of the matrix were calculated. These are orthogonal base vectors (principal components) used to represent the data. There are, in fact, M such vectors; however, one typically works with the vectors that have the largest eigenvalues, because these explain most of the observed variance (PC #1, #2, and #3).
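To illustrate how a leading principal component can be extracted from a covariance matrix, the following sketch uses power iteration. This is a swapped-in textbook technique chosen because it needs no external library; it is not the algorithm used by Partek Pro. The function names, the iteration count, and the toy matrix are all hypothetical. The sketch also assumes a symmetric, positive semidefinite input and a starting vector that is not orthogonal to the leading eigenvector.

```python
import math


def leading_eigenpair(matrix, iterations=500):
    """Power iteration on a symmetric positive semidefinite matrix.

    Returns (eigenvalue, unit eigenvector) for the largest eigenvalue.
    """
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply the matrix by the current unit vector.
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break  # zero matrix, or the start vector lies in its null space
        vector = [x / norm for x in product]
        eigenvalue = norm
    return eigenvalue, vector


def explained_fraction(matrix, eigenvalue):
    """Share of total variance (the matrix trace) carried by one component."""
    trace = sum(matrix[i][i] for i in range(len(matrix)))
    return eigenvalue / trace


toy_cov = [[2.0, 1.0], [1.0, 3.0]]
value, direction = leading_eigenpair(toy_cov)
share = explained_fraction(toy_cov, value)
```

Further components could be obtained by removing the found direction from the matrix and repeating. A full eigendecomposition routine from a numerical library would normally be preferred in practice.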
One-way analysis of variance (ANOVA) (Lane and Nelder 1982) was applied to the data using Partek Pro. ANOVA was chosen as the statistical tool because it has the capability to perform comparisons for multiple factors, each of which can have several levels of information. Although other statistical tools are also capable of performing multiplex data analysis, ANOVA provided the capabilities that best fit the goals of the project.
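For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be sketched as below. This is a generic textbook computation and not Partek Pro's implementation. The function name is hypothetical, and the sketch assumes at least two groups with some within-group variation. Converting F to a p-value requires the F distribution and is omitted here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (mean - grand_mean) ** 2 for g, mean in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - mean) ** 2 for g, mean in zip(groups, group_means) for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

In the setting of this study, each group could hold the staining values of one antibody in one morphological area across cases.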
For normalization, the GAPDH and histone antibody measurements were combined in a Microsoft Excel spreadsheet. This specific combination of antibodies was the most uniform among all the groups as determined by analysis with the Partek Pro software. The mean value for GAPDH and histone served as the denominator. The numerator was the reading for the rest of the antibodies examined according to the following formula: [Mean value for a picture/(mean GAPDH value for same area + mean histone value for same area)]. Areas that were negative for GAPDH or histone were excluded from the normalization.
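The normalization formula can be written as a short function. This is a minimal sketch under two assumptions of ours. First, "negative for GAPDH or histone" is taken to mean a reading of zero. Second, such areas are reported as `None` rather than silently dropped. The function name is hypothetical.

```python
def normalize_to_housekeeping(value, gapdh_mean, histone_mean):
    """Divide a staining value by (GAPDH + histone) measured in the same area.

    Returns None when either housekeeping reading is not positive,
    mirroring the exclusion of such areas from the normalization.
    """
    if gapdh_mean <= 0 or histone_mean <= 0:
        return None
    return value / (gapdh_mean + histone_mean)
```

For instance, `normalize_to_housekeeping(0.5, 0.75, 0.25)` divides a staining value of 0.5 by a combined housekeeping reading of 1.0.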
Results |
---|
Fifteen histological sections from each case were stained by IHC. The antibodies (Abs) were selected based on cell specificity for different histological areas (Figure 1). Cytokeratin and PSA Abs were selected for normal glands, PIN, and carcinoma cells. SMA was used for stroma (composed of an admixture of smooth muscle and fibroblasts). Vimentin was used for mesenchymal-derived tissue. CD3 Ab was chosen for inflammatory infiltrate and S-100 Ab for nerve. Other antibodies were selected based on reports that they are differentially expressed in prostate cancer tissue including p130 (Claudio et al. 2002), Pim1 (Dhanasekaran et al. 2001), HSP 27 (Cornford et al. 2000), ERK, and pERK (Bell et al. 2003). Antibodies against housekeeping proteins (GAPDH, ubiquitin, and histone H1) were added for subsequent normalization of results.
For each Ab in the study, the percentage of positive cells was established as follows. First, the most intensely stained of the 10 cases was determined by the reviewing pathologists. The percentage of positive cells was then measured in this case using ImagePro, and this index case was subsequently used as a reference for the other nine patient samples in the study. This manual adjustment was performed because the ImagePro RGB did not separate the colors sufficiently. Figure 2 shows a representative case to demonstrate the quantification by the ImagePro program. The staining pattern was different for every antibody: cytoplasmic, membranous, or nuclear. Thus, the method for counting was adjusted to fit each antibody and was based on counting of stained cells.
Discussion |
---|
The first challenge encountered was the failure to detect many of the target proteins. Although the antibodies for the study were commercially available and advertised to perform well on human tissue samples, 9 of the 24 produced equivocal staining. The use of standard antigen-retrieval methods was not successful in improving performance. The failure to detect more than one-third of the proteins is of concern, especially in light of the fact that they are known to be "highly abundant" in at least a subset of the cell types present in the prostate sections. This points to a potentially significant problem for the field of proteomics, that is, the inability to measure low- and moderate-abundant proteins in a complex matrix such as a histological section. This difficulty is likely to become increasingly problematic. Many, if not most, of the genes newly discovered by the Human Genome Project are expressed at relatively lower levels of abundance than the "known genes/proteins"; hence, they have escaped discovery for the past several decades by investigators in the laboratory. These proteins likely will be even more difficult to measure with IHC than those in the present study and may not be amenable to this technique at all. Clearly, the inability to detect and accurately measure a significant fraction of the proteome in specific cell phenotypes represents a major hurdle for the proteomics community to overcome in the future.
A second difficulty in the present study was the requirement for multiple tissue sections, in other words, the need for a separate histological section for each antibody. This was problematic for each of the different cell types because they often exhibited significant changes as the tissue block was serially sectioned. For example, PIN foci are small, localized groups of cells that can vary qualitatively (based on histological grading) and/or quantitatively (appear/disappear) among the sections. These problems were partly solved by incorporating immediately sequential sections and by close coordination between the reviewing pathologists. Nonetheless, this was a difficulty in the current study and is an inherent problem of analysis strategies that require serial sections for expression measurements.
Further challenges were encountered in performing quantitative measurement of protein levels. Historically, IHC has been analyzed using a semi-quantitative grading scheme, for example, on a scale from 0 to 4 based on somewhat subjective parameters such as staining pattern, frequency, and/or intensity. O'Neill et al. (2004) previously showed that standard pathologist-based grading using a standard light microscope does, in fact, correlate with higher optical density as measured by a quantitative image analysis system. However, while a pathologist-based strategy is effective for answering general "yes/no" or "high/low abundant" questions, it is not effective for generating robust and multiplex expression profiles that can be analyzed in depth. We addressed this problem by utilizing digital images of the sections and employing densitometry to measure staining intensity. The two reviewing pathologists visually assessed each section and chose a representative area of IHC staining for subsequent imaging, generating a total of 2100 digital photomicrographs. To ensure that this process did not induce bias based on the pathologist's interpretation of representative staining patterns, we independently compared the results produced by each pathologist and found them to be in close agreement (Figure 7).
Densitometric measurement of staining profiles from images is dependent on background (nonspecific staining reaction) and on electronic settings, including camera and lamp settings, white balance, and slide thickness. A major decision for the investigator is in choosing threshold values. In one approach, images can be converted to gray scale and enhanced for analysis, as was done by Nieruchalska et al. (2003). However, this method poses a major disadvantage for IHC images because no discrimination between differently stained (colored) areas can be made. Alternatively, Patel et al. (2002) performed image-based analysis of IHC staining using automated threshold values. The computer program was set to identify positive (stained) or negative (not stained) areas. This was possible because the staining patterns generated with their protocol had a single detectable peak in the image analysis program, as opposed to images with intermingled RGB peaks. For our analyses of color IHC images, the transition from a color scale to a quantitative scale required several adjustments to allow for uniform measurements. The ImagePro software program does not allow one to set an automatic threshold for the color images; therefore, the range of staining intensity among the cases was precalculated using a pathologist's review. For each Ab, the most and least intensely stained sections among the patient set were selected, and their red and blue (respectively) staining properties measured and saved in the program. All cases were subsequently evaluated according to these saved staining parameters for each antibody in the study. Although this approach utilized a somewhat arbitrary selection of an index case as a comparator, we found it preferable to using the automated threshold features of ImagePro because positive cells are not always distinctively outlined, and converting the image to a gray scale can result in loss of positive staining. It is important to note that each method for image analysis initially needs to be evaluated and correlated with manual counting methods because many different parameters can affect the results, both quantitatively and qualitatively.
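The idea of evaluating every image against saved red and blue reference values can be illustrated with a simple nearest-color rule. This sketch is not ImagePro's algorithm, which is not described in this article. The reference colors, the tolerance, and the function name are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical reference colors sampled from the index images (RGB, 0-255).
RED_REFERENCE = (180, 40, 40)    # positive staining, most intensely stained image
BLUE_REFERENCE = (60, 70, 160)   # counterstain, least intensely stained image


def classify_pixel(rgb, positive_ref, negative_ref, tolerance=80.0):
    """Label a pixel by the nearer saved reference color in RGB space.

    Pixels farther than `tolerance` from both references are treated
    as background (an assumption made for this sketch).
    """
    d_pos = math.dist(rgb, positive_ref)
    d_neg = math.dist(rgb, negative_ref)
    if min(d_pos, d_neg) > tolerance:
        return "background"
    return "positive" if d_pos < d_neg else "negative"
```

A rule of this kind keeps red and blue distinct, which a gray-scale conversion would not. Any such scheme would still need to be checked against manual counts, as noted above.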
Quantification of IHC staining was another issue that needed to be addressed. There are several possible approaches that one can employ. For example, Cunnane et al. (1999) calculated IHC positivity based on the area occupied by surface-stained cells compared with the total area (as determined by the corresponding hematoxylin-stained section). In their study, this was possible because the counted cells were lymphocytes without abundant cytoplasm. However, it becomes problematic when cells with more cytoplasmic area are studied. De Boer et al. (2001) summarized ideas discussed in a workshop on image analysis and quantification of staining patterns. Participants in the workshop articulated that expressing [number of cells per unit area] is an accurate way of counting. In a different approach, Nabi et al. (2004) used an image analysis program to assess IHC results for nuclear staining of the androgen receptor. The program they utilized recognized nuclei automatically because of their shape and integrated the data such that mean optical density could be calculated. Alternatively, Bishop et al. (2002) analyzed membrane IHC staining by deducting the nuclear signal from the total cell staining. Similarly, Wang et al. (2001) used a space transformation proprietary technology to distinguish cell membrane staining from cytoplasmic staining using an automated imaging system. In our study, we counted the positively stained cells and then calculated their percentage from the total cell number in a given image. This approach was successful in quantifying both highly cellular tissues with minimal staining and low cellular stromal areas that were intensely stained, for both cytoplasmic and nuclear staining (De Boer et al. 2001).
To normalize expression levels of the non-housekeeping proteins, we divided each value (numerator) by the value produced by the combined staining of GAPDH and histone antibody IHC expression (combined as a denominator), because these readings were stable and evenly expressed among the different morphological groups. This allowed the staining data to be compared with internal controls in each cell population, similar to an investigator normalizing a measurement on an immunoblot using an internal housekeeping protein. This is a beneficial feature of multiplex protein analysis of histology as compared with a standard, "one-protein IHC approach." After normalizing each data point to GAPDH and histone, the differences between the groups became sharper, emphasizing the variance between them (Figure 4).
Once the images were analyzed, the resultant data were imported from Microsoft Excel spreadsheets directly to the Partek Pro program and grouped based on individual point variance in PCA scatterplots. Analysis of the full set of 15 antibodies with the PCA scatterplots generated four separate groups that varied in their expression pattern (different shape of wire mesh) and in their location along the x-, y-, and z-axes (Figure 3). Multiplex analysis was useful in that it provided optimal segregation of the various cellular types, including cancer and corresponding normal epithelium (Figure 5) (Jiang et al. 2002; Kwong et al. 2003; Iwafuchi et al. 2004).
In summary, production and analysis of high-throughput proteomic data sets from histology sections is likely to become an essential tool in basic research and clinical evaluation of normal cellular physiology and disease states. The present work indicates this is feasible using currently available laboratory tools. However, the study also points to several technical challenges that the proteomic community needs to overcome before genuine and comprehensive numerical descriptions of cellular phenotypes become a reality.
Literature Cited |
---|
Bell WC, Myers RB, Hosein TO, Oelschlager DK, Grizzle WE (2003) The response of extracellular signal-regulated kinase (ERK) to androgen-induced proliferation in the androgen-sensitive prostate cancer cell line, LNCaP. Biotechnol Histochem 78:11–16
Bishop JW, Marcelpoil R, Schmid J (2002) Machine scoring of Her2/neu immunohistochemical stains. Anal Quant Cytol Histol 24:257–262
Claudio PP, Zamparelli A, Garcia FU, Claudio L, Ammirati G, Farina A, Bovicelli A, et al. (2002) Expression of cell-cycle-regulated proteins pRb2/p130, p107, p27(kip1), p53, mdm-2, and Ki-67 (MIB-1) in prostatic gland adenocarcinoma. Clin Cancer Res 8:1808–1815
Cornford PA, Dodson AR, Parsons KF, Desmond AD, Woolfenden A, Fordham M, Neoptolemos JP, et al. (2000) Heat shock protein expression independently predicts clinical outcome in prostate cancer. Cancer Res 60:7099–7105
Cunnane G, Bjork L, Ulfgren AK, Lindblad S, FitzGerald O, Bresnihan B, Andersson U (1999) Quantitative analysis of synovial membrane inflammation: a comparison between automated and conventional microscopic measurements. Ann Rheum Dis 58:493–499
De Boer WI, Hiemstra PS, Sont JK, De Heer E, Rabe KF, Van Krieken JH, Sterk PJ (2001) Image analysis and quantification in lung tissue. Clin Exp Allergy 31:504–508
Dhanasekaran SM, Barrette TR, Ghosh D, Shah R, Varambally S, Kurachi K, Pienta KJ, et al. (2001) Delineation of prognostic biomarkers in prostate cancer. Nature 412:822–826
Gillespie JW, Best CJ, Bichsel VE, Cole KA, Greenhut SF, Hewitt SM, Ahram M, et al. (2002) Evaluation of non-formalin tissue fixation for molecular profiling studies. Am J Pathol 160:449–457
Iwafuchi H, Mori N, Takahashi T, Yatabe Y (2004) Phenotypic composition of salivary gland tumors: an application of principle component analysis to tissue microarray data. Mod Pathol 17:803–810
Jiang J, Ulbright TM, Zhang S, Eckert GJ, Kao C, Gardner TA, Koch MO, et al. (2002) Fas and Fas ligand expression is elevated in prostatic intraepithelial neoplasia and prostatic adenocarcinoma. Cancer 95:296–300
Kwong J, Lui K, Chan PS, Ho SM, Wong YC, Xuan JW, Chan FL (2003) Expression study of three secretory proteins (prostatic secretory protein of 94 amino acids, probasin, and seminal vesicle secretion II) in dysplastic and neoplastic rat prostates. Prostate 56:81–97
Lane PW, Nelder J (1982) Analysis of covariance and standardization as instances of prediction. Biometrics 38:613–621
Lindsay LI (2002) A tutorial on Principal Components Analysis. http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf (accessed February 20, 2004)
Nabi G, Seth A, Dinda AK, Gupta NP (2004) Computer based receptogram approach: an objective way of assessing immunohistochemistry of androgen receptor staining and its correlation with hormonal response in metastatic carcinoma of prostate. J Clin Pathol 57:146–150
Nieruchalska E, Strzelczyk R, Wozniak A, Zurawski J, Kaczmarek E, Salwa-Zurawska W (2003) A quantitative analysis of the expression of alpha-smooth muscle actin in mesangioproliferative (GnMes) glomerulonephritis. Folia Morphol (Warsz) 62:451–453
O'Neill PA, Shaaban AM, West CR, Dodson A, Jarvis C, Moore P, Davies MP, et al. (2004) Increased risk of malignant progression in benign proliferating breast lesions defined by expression of heat shock protein 27. Br J Cancer 90:182–188
Patel VM, Heinel LA, Provencio JJ, Vinall PE, Kramer MS, Rosenwasser RH (2002) Validation of image analysis for enzyme histochemical and immunocytochemical staining. Biotechnol Histochem 77:213–221
Raychaudhuri S, Stuart JM, Altman RB (2000) Principal components analysis to summarize microarray experiments: application to sporulation time series. Pac Symp Biocomput 5:452–463
Simone NL, Remaley AT, Charboneau L, Petricoin EF 3rd, Glickman JW, Emmert-Buck MR, Fleisher TA, et al. (2000) Sensitive immunoassay of tissue cell proteins procured by laser capture microdissection. Am J Pathol 156:445–452
Wang S, Saboorian MH, Frenkel EP, Haley BB, Siddiqui MT, Gokaslan S, Wians FH Jr, et al. (2001) Assessment of HER-2/neu status in breast cancer. Automated Cellular Imaging System (ACIS)-assisted quantitation of immunohistochemical assay achieves high accuracy in comparison with fluorescence in situ hybridization assay as the standard. Am J Clin Pathol 116:495–503
Winter EE, Goodstadt L, Ponting CP (2004) Elevated rates of protein secretion, evolution, and disease among tissue-specific genes. Genome Res 14:54–61
Yan H, Zhou W (2004) Allelic variations in gene expression. Curr Opin Oncol 16:39–43
Zellweger T, Ninck C, Mirlacher M, Annefeld M, Glass AG, Gasser TC, Mihatsch MJ, et al. (2003) Tissue microarray analysis reveals prognostic significance of syndecan-1 expression in prostate cancer. Prostate 55:20–29