William S. Rowe Division of Rheumatology and 1 Division of Pediatric Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, 2 Genome Center and Department of Biomedical Informatics, Columbia University, New York, NY 10032 and 3 Division of Rheumatology, Children's Hospital of Pittsburgh, Pittsburgh, PA 15213, USA.
Correspondence to: S. D. Thompson, Division of Rheumatology, 3333 Burnet Avenue, Cincinnati, OH 45229-3039, USA. E-mail: susan.thompson{at}cchmc.org
Abstract
Methods. Microarray data (Affymetrix U95Av2) from 26 PBMC and 20 SFMC samples collected from patients with active disease (classified by course according to ACR criteria) were analysed for expression patterns that correlated with disease characteristics. For comparison, PBMC gene expression profiles were obtained from 15 healthy controls. Real-time PCR was used for confirmation of gene expression differences.
Results. Statistical analysis of gene expression patterns in PBMC identified 378 probe sets corresponding to 342 unique genes with differing expression levels between polyarticular course patients and controls (t test, P<0.0001). The genes represented by these probe sets were enriched for functions related to regulation of immune cell functions, receptor signalling, and protein metabolism and degradation. Included in these probe sets was a group of CXCL chemokines with functions related to angiogenesis. Further analysis showed that, whereas angiogenic CXCL (ELR+) gene expression was elevated in polyarticular PBMC, expression of angiostatic CXCL (ELR−) chemokines was lower in polyarticular SFMC compared with corresponding pauciarticular samples (t test, P<0.05).
Conclusions. This pilot study demonstrates that juvenile arthritis patients exhibit complex patterns of gene expression in PBMC and SFMC. The presence of disease-correlated biologically relevant gene expression patterns suggests that the power of this approach will allow better understanding of disease mechanisms, identify distinct clinical phenotypes in disease subtypes, and suggest new therapeutic approaches.
KEY WORDS: Human, Juvenile, Rheumatoid arthritis, Chemokines, Inflammation, Polyarticular course, JAK/STAT
Introduction
Although more common in adults, juvenile-onset spondyloarthritides (JSpA), which are similar to but distinct from the rheumatoid arthritides, can begin during childhood and may initially have a phenotype indistinguishable from late-onset pauciarticular JRA, especially in the absence of enthesitis and axial involvement [5]. Individuals who develop axial disease frequently progress to ankylosing spondylitis (AS), as defined by the modified New York criteria [6]. When this occurs during childhood, these patients are referred to as juvenile-onset AS (JAS). The presence of enthesitis is a useful predictor for JSpA, and HLA-B27 is frequently positive. However, many children with JSpA do not go on to develop axial disease, and identification of those whose disease will progress is not currently possible. The ILAR classification system for JIA includes a category for enthesitis-related arthritis (ERA), which attempts to identify patients with a spondyloarthropathy who do not fulfill criteria for JAS.
These arthropathies are thought to be autoimmune in origin and to represent complex genetic traits, although the genes involved and the precise aetiologies are poorly defined. Current subtype categorization is incomplete, as shown by the fact that the response to therapy varies within subtypes. Although JRA and RA have an arthritis phenotype in common, the diseases differ in many clinical features and with respect to age of onset, the presence of autoantibodies and genetic markers, which are largely different between RA and JRA [1, 7–9]. For example, only about 5% of children with JRA are positive for immunoglobulin M rheumatoid factor (RF) compared with over 70% of adult RA patients. While the expression of a variety of genes has previously been shown to be modulated in JRA subpopulations, particularly in synovial fluid, additional differentially expressed genes in peripheral blood mononuclear cells (PBMC) and synovial fluid mononuclear cells (SFMC) are likely to contribute to variability in phenotype, long-term outcome and response to treatment. It is possible that these yet to be identified factors can be used to distinguish previously unrecognized subtypes of patients, which will allow more tailored treatment regimens. The development of microarray technology may enable the identification of these additional factors by facilitating the analysis of global gene expression patterns in small amounts of sample acquired from PBMC and SFMC.
We report observations from a pilot study demonstrating the utility of PBMC- and SFMC-derived gene expression patterns with respect to the childhood autoimmune arthritides, indicating the potential of this technology to differentiate those with the most severe polyarticular course from those with less severe disease or controls. The apparent power of this approach to aid in the clinical evaluation of these arthropathies is supported by both statistical correlation with disease entities and the occurrence of functional relationships within gene groups that are correlated with disease entities. In addition, gene expression analysis is shown to be effective in categorizing patients. Other studies have reported PBMC gene expression differences in RA [10], spondyloarthropathy [11] and systemic lupus erythematosus [12, 13], but this is among the first [14] reports using microarray gene expression analysis to specifically investigate the chronic childhood arthritides.
Materials and methods
Gene expression
For microarray analysis, biotinylated cRNA was synthesized from total RNA (Enzo, Farmingdale, NY, USA). After processing according to the Affymetrix GeneChip Expression Analysis Technical Manual (Affymetrix, Santa Clara, CA, USA), labelled cRNA was quantitated by hybridization to Affymetrix U95Av2 chips. Quality was assessed and expression values derived using Microarray Suite 5.0 (MAS 5.0; Affymetrix). For confirmation of expression patterns, real-time reverse transcription–PCR (Assays-on-Demand; Applied Biosystems, Foster City, CA, USA), using total RNA prepared separately from that for the microarray analysis, was performed for two genes: CXCL3 (part number Hs00171061_m1) and CXCL8 (part number Hs00174103_m1). Results were normalized to GAPDH (part number Hs99999905_m1).
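The text states only that real-time PCR results were normalized to GAPDH; it does not give the formula. One common way to do this is the comparative threshold-cycle (ΔCt) calculation, sketched below under the assumptions that amplification efficiency is close to 100% and that threshold cycles (Ct) are the raw readout. The function name and the Ct values are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target, ct_reference):
    """Target abundance relative to a reference gene, assuming the
    comparative Ct method: 2 ** -(Ct_target - Ct_reference).

    Assumes roughly 100% amplification efficiency (one doubling per cycle).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** -delta_ct


# Hypothetical Ct values for one sample (illustration only).
ct_cxcl8 = 24.0   # target gene
ct_gapdh = 20.0   # reference gene
print(relative_expression(ct_cxcl8, ct_gapdh))
```

A target that crosses threshold four cycles after GAPDH is, under these assumptions, present at one-sixteenth of the GAPDH level.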
Data analysis
Microarray expression values were analysed using algorithms available in GeneSpring 4.2.1 and 6.0 (Silicon Genetics, Redwood City, CA, USA). Chip-to-chip variation in expression intensities was controlled with per-chip normalization using the 50th percentile. Each probe set was then normalized such that the median expression for each gene was 1.0. Expression values were analysed by ANOVA and/or the t test with P values as indicated in the text. Real-time PCR results were analysed with the Mann–Whitney test. Fold change was calculated as the ratio of geometric means in all cases. Ontological classification was performed using the program classScore (http://microarray.cpmc.columbia.edu/classScore). Gene designations from the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov) were used.
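The two normalization steps and the fold-change calculation described above can be sketched as follows. This is a minimal illustration, not the GeneSpring implementation: it assumes each chip is a dictionary mapping probe set IDs to positive expression values, and all function names and numbers are hypothetical.

```python
import math
from statistics import median


def per_chip_normalize(chip):
    """Divide every value on one chip by that chip's 50th percentile."""
    chip_median = median(chip.values())
    return {probe: value / chip_median for probe, value in chip.items()}


def per_gene_normalize(chips):
    """Rescale each probe set so that its median across chips is 1.0."""
    probes = list(chips[0].keys())
    gene_median = {p: median(c[p] for c in chips) for p in probes}
    return [{p: c[p] / gene_median[p] for p in probes} for c in chips]


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def fold_change(group_a, group_b):
    """Ratio of geometric means (group_a relative to group_b)."""
    return geometric_mean(group_a) / geometric_mean(group_b)


# Hypothetical raw intensities for three chips and three probe sets.
raw = [
    {"g1": 2.0, "g2": 4.0, "g3": 8.0},
    {"g1": 1.0, "g2": 2.0, "g3": 8.0},
    {"g1": 3.0, "g2": 6.0, "g3": 9.0},
]
normalized = per_gene_normalize([per_chip_normalize(c) for c in raw])
print(normalized[0])
```

Using the geometric rather than the arithmetic mean makes the fold change symmetric on a log scale, so a two-fold increase and a two-fold decrease are treated as changes of equal size.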
Cross-validation
Sample identity was predicted by hold-one-out cross-validation. This algorithm (GeneSpring version 6) used 50 genes and eight nearest neighbours, with class assignment judged against a P value ratio cut-off of 0.2. The technique begins by removing one sample from the data set and identifying the 50 genes that best discriminate the two classes, polyarticular or control, considering only the remaining samples. Using these 50 genes, the eight samples closest to the held-out sample (measured by Euclidean distance) are identified. From these nearest neighbours, two P values are calculated, one for each class (control and polyarticular in our case), giving the likelihood of the eight closest samples belonging to that class given the proportion of each class in the data set. The P value ratio is the ratio (lowest P value)/(other P value). The sample is assigned to the class with the lowest P value when the P value ratio meets the 0.2 cut-off. The algorithm then starts again by removing a different sample and continues iteratively until every sample has been held out of one analysis and its identity has been predicted. Because each iteration derives a new list of 50 genes, no single gene list is available.
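The loop structure of this hold-one-out procedure can be sketched as below. This is not the GeneSpring algorithm: gene ranking by the absolute Welch t statistic is an assumption (the text does not say how the 50 discriminating genes are chosen), and a simple majority vote among the nearest neighbours stands in for the P value ratio rule. All function names are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic for two lists of expression values."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def top_genes(samples, labels, n_genes):
    """Indices of the n_genes with the largest |t| between the two classes."""
    first, second = sorted(set(labels))
    scores = []
    for g in range(len(samples[0])):
        a = [s[g] for s, lab in zip(samples, labels) if lab == first]
        b = [s[g] for s, lab in zip(samples, labels) if lab == second]
        scores.append((abs(welch_t(a, b)), g))
    return [g for _, g in sorted(scores, reverse=True)[:n_genes]]


def hold_one_out(samples, labels, n_genes=50, k=8):
    """Predict each sample's class from all the other samples."""
    predictions = []
    for i, held in enumerate(samples):
        train = [s for j, s in enumerate(samples) if j != i]
        train_labels = [lab for j, lab in enumerate(labels) if j != i]
        # Genes are re-selected on every iteration, without the held-out sample.
        genes = top_genes(train, train_labels, n_genes)
        held_point = [held[g] for g in genes]
        ranked = sorted(
            (math.dist(held_point, [s[g] for g in genes]), lab)
            for s, lab in zip(train, train_labels)
        )
        votes = [lab for _, lab in ranked[:k]]
        # Majority vote (assumption; replaces the P value ratio rule).
        predictions.append(max(sorted(set(votes)), key=votes.count))
    return predictions
```

Selecting the genes inside the loop, after the sample has been removed, is the point of the design: choosing them once from all samples would let information from the held-out sample leak into its own prediction.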
Results
Genes are expressed differentially between polyarticular JRA and control PBMC
The ability of microarrays to identify differentially expressed genes between juvenile arthritis subtypes and control samples in PBMC and SFMC was evaluated by ANOVA or t test (P<0.0001). Considering disease course subtypes defined by ACR criteria, the majority of differences in gene expression were found between polyarticular and control PBMC samples, while few differences were observed for other comparisons using this conservative statistical cut-off (data not shown). The paucity of differences in the other PBMC comparisons may reflect the more extreme nature of polyarticular disease, but differences would also be harder to detect in these comparisons because of the smaller numbers of pauciarticular (n = 5) and JSpA (n = 6) samples. For SFMC, few genes were identified with significant differences in expression between disease groups. This is not surprising because comparisons with control samples are not available, and only small sample sizes were again available for pauciarticular (n = 5) and JSpA (n = 5) samples.
Further analysis was limited to polyarticular JRA vs control PBMC comparisons, the origin of the majority of the expression differences. A comparison of global gene expression levels (t test, P<0.0001) between polyarticular (n = 15) and control (n = 11) PBMC identified 378 probe sets representing 342 unique genes with differential expression (Supplementary Table 2). Compared with the 0.77 probe sets expected by chance (7670 probe sets × 0.0001), the polyarticular vs control comparison represented a false-discovery rate of 0.2%. Inspection of the data for these 378 probe sets showed that 312 had decreased expression in polyarticular patients compared with controls, while 66 had increased expression. It should be noted that many of the genes determined as differentially expressed in this study had only a modest shift in level of expression. For example, the relative level of expression for MVK (mevalonate kinase) is on average 1.18-fold higher in polyarticular than in control, while YWHAQ (tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, theta polypeptide) has a slight decrease (polyarticular/control, 0.78).
The genes identified as part of the JAK-STAT cascade in the gene ontology consortium database that were also present in the list of 342 genes include: STAT1, STAT2, STAT5B, CCL2, AMSH and NMI (Table 2). Following the statistical identification of this category, STAT6 was individually identified in the list of 342 genes and was added to this category [16]. These genes are involved in immune cell regulation, including functions related to the Th1/Th2 dichotomy, STAT6 being critical to the Th2 cytokine signalling cascades. In contrast to the chemokines discussed below, the expression of most of these genes was lower in polyarticular samples than controls. Values in the pauciarticular and JSpA samples were intermediate between polyarticular and controls (Fig. 2A, Table 2).
Expression patterns of angiogenic (ELR+) and angiostatic (ELR−) CXCL genes in PBMC and SFMC suggest mechanistic differences between disease subtypes
In addition to overall subtype differences, the ability of this data set to identify mechanistic differences among the disease subtypes in both PBMC and SFMC was tested. An interesting group of CXCL chemokines was identified as part of the group of genes with chemotactic activity that also had high levels of expression in polyarticular and JSpA patients compared with controls (Fig. 2). In addition to their chemotactic activity, the CXCL chemokines can generally be categorized as angiogenic or angiostatic, based on the presence or absence (respectively) of an ELR (glutamic acid, leucine and arginine) motif. Based on our interest in angiogenesis, we chose to examine this family of genes for expression differences in PBMC and SFMC [17]. Vascular endothelial growth factor (VEGF), a prototypical angiogenic factor, was also included in this analysis. Of the 14 genes (CXCL1–13 and VEGF) related to angiogenesis available on the U95Av2 microarray, 12 were detected in at least one sample (six angiogenic chemokines, five angiostatic chemokines and VEGF). As there were only 12 genes in this analysis, we used the more liberal cut-off of P<0.05 to distinguish genes that were differentially expressed between disease subtypes. This allowed us to use all the disease subtypes, even those with limited sample size.
Considering the angiogenic (six ELR+ and VEGF) genes [18], significantly higher expression levels were found for four (CXCL1, CXCL2, CXCL3 and CXCL8) in PBMC from polyarticular compared with pauciarticular patients and healthy controls (Fig. 3A). In contrast, while expression in SFMC was increased compared with control PBMC, expression was generally equivalent between all subtypes (Fig. 3C). Conversely, expression of three angiostatic (ELR−) genes [18] (CXCL9, CXCL10 and CXCL11), while increased compared with control PBMC, was lower in polyarticular SFMC compared with pauciarticular SFMC (Fig. 3D). CXCL10 was also expressed at lower levels in JSpA SFMC compared with pauciarticular SFMC. Expression of the ELR− genes was essentially the same among subtypes in PBMC (Fig. 3B).
Discussion
In identifying genes of potential biological interest, the choice of statistical tests and acceptable significance levels must take into account the large number of comparisons inherent in microarray data [18]. A P value cut-off of 0.0001 was used initially in this study and was expected to return 0.77 probe sets by chance. With our result of 378 regulated probe sets, this corresponds to a 0.2% false-discovery rate. On the other hand, a P value of 0.05 is generally considered a cut-off for statistical significance. Using P<0.05, a t test of our microarray data returned 3395 probe sets with differential expression between polyarticular and control samples with 384 expected by chance, providing an unacceptably high 11.3% false-discovery rate. In this exploratory study, all of the genes identified by our statistical cut-off were retained as they may represent genes with large expression changes in minor cell populations or small expression changes in major cell populations. Either possibility may be important for disease and merits further investigation.
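The arithmetic behind these false-discovery estimates can be checked in a few lines. The sketch assumes the simple estimate used in the text (probe sets tested × P value cut-off, divided by the number of probe sets called), with 7670 probe sets tested as stated in the Results; the function names are hypothetical.

```python
def expected_by_chance(n_tested, p_cutoff):
    """Number of probe sets expected to pass the cut-off by chance alone."""
    return n_tested * p_cutoff


def false_discovery_rate(n_tested, p_cutoff, n_called):
    """Expected chance findings as a fraction of the probe sets called."""
    return expected_by_chance(n_tested, p_cutoff) / n_called


N_TESTED = 7670  # probe sets entering the t test

for p_cutoff, n_called in [(0.0001, 378), (0.05, 3395)]:
    expected = expected_by_chance(N_TESTED, p_cutoff)
    fdr = false_discovery_rate(N_TESTED, p_cutoff, n_called)
    print(f"P<{p_cutoff}: {expected:.2f} expected, FDR {fdr:.1%}")
```

The stricter cut-off reduces the number of probe sets called about nine-fold but reduces the expected chance findings 500-fold, which is why the estimated false-discovery rate falls so sharply.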
The enrichment for JAK-STAT signalling cascade genes among those differentially expressed between JRA patients and controls is consistent with the biological role of these molecules. For instance, relatively lower expression of STAT6, a molecule that is critical for Th2 T-cell differentiation, was one of the distinguishing features of polyarticular JRA samples. This is consistent with previous reports suggesting that, while the Th1 immune response is disease-promoting in JRA in general, the presence of Th2 cytokines in some patients may be associated with a better outcome [8].
The prominence of genes encoding chemokines is also probably relevant pathogenically. One of the most important properties of chemokines is the ability to direct trafficking of inflammatory cells and retain the cells at a site of inflammation. Therefore, the differences in the levels of chemokine gene expression may explain differences in the degree of severity of inflammation between patients with poly- and pauciarticular disease. In addition, many chemokines, including those of the CXCL family, influence angiogenesis.
The CXCL chemokines can be subdivided based on the presence or absence of an ELR (Glu-Leu-Arg) motif. This structural difference is functionally important; in addition to influencing the type of cell that a chemokine attracts, the ELR motif also affects the ability to influence vascular remodelling seen in inflammation [18–21]. Thus, ELR+ chemokines generally promote neovascularization, which occurs in inflamed tissues, while ELR− chemokines block it (with the exception of CXCL12 [22], which we did not observe in our study). Extensive neovascularization of the inflamed synovium is an important pathogenic mechanism in arthritis. Newly formed blood vessels provide not only the source of nutrients for the growing pannus, but also increased access for inflammatory cells to infiltrate the synovium and thus promote joint damage (reviewed in [23, 24]). The overall outcome of vascular remodelling depends on the balance between the pro- and anti-angiogenic factors present in the tissues, and levels of angiogenic factors have been correlated with arthritic inflammation [17, 25, 26]. Therefore, we hypothesize that the increased expression of angiogenic factors in the PBMC and SFMC of all patients, especially polyarticular, may increase the angiogenic potential in the system. At the same time, expression of angiostatic factors is increased to a greater extent in SFMC samples from pauciarticular patients than in those from either polyarticular or JSpA patients, thereby potentially limiting angiogenesis in the pauciarticular joint. Taken together, these expression patterns may help shift the overall balance towards relatively higher angiogenic activity in polyarticular patients, consistent with the more aggressive synovial expansion and joint destruction seen in polyarticular patients compared with those with pauciarticular JRA and JSpA.
While a number of microarray studies have focused on sorted cell populations, others have used mixed populations, as was done in this study [12, 13, 27]. Each sample type is useful for answering different questions. For instance, sorted cells can be used to ask how gene expression in activated macrophages is affected by disease state. On the other hand, using the same sorted population would be unlikely to provide information about T cells. Because we cannot define the exact population of cells which is important in arthritis (multiple types are probably important) and because we are primarily interested in identifying genes to use as clinical markers of disease, we chose to use the mixed PBMC and SFMC populations. Our present study has indicated the likelihood of identifying genes useful for the classification of clinical samples in these sample types, a result that was not entirely expected, given the joint localization of the major pathology.
Two of the polyarticular patients have expression patterns that are consistently different from those of the other polyarticular patients. These expression differences could not be explained by clinical findings such as the number of active joints, duration of disease, medication, age or gender. These expression findings are consistent with distinct mechanistic processes for these patients. Using traditional methods of RNA analysis, which only look at one or a few genes at a time, it would not be possible to appreciate the coordinated expression of this many genes and it is likely that these samples would simply be classified as outliers for the specific gene in question. Limitations in sample quantity would also probably prohibit the analysis of sufficient numbers of genes from the same sample to identify this coordinated regulation. A larger patient population will be necessary to distinguish any previously unrecognized subpopulations and to allow identification of genes that are regulated specifically in these subpopulations.
It is hoped that, separately or in conjunction with genomic screening, we will eventually be able to use gene expression profiling to identify patients at greatest risk of destructive disease and to predict their response to therapy at the earliest time-point possible. Future studies with more samples and purified cell populations will allow us to investigate mechanisms of the disease process, to more rigorously define genes with the power to categorize patients, and to validate any classification schemes with independent cohorts.
Acknowledgments
The authors have declared no conflicts of interest.
Supplementary data
Supplementary data are available at Rheumatology Online.