Research Article | Computational Biology

A Molecular Signature Predictive of Indolent Prostate Cancer


Science Translational Medicine  11 Sep 2013:
Vol. 5, Issue 202, pp. 202ra122
DOI: 10.1126/scitranslmed.3006408

This article has a correction. Please see:

Abstract

Many newly diagnosed prostate cancers present as low Gleason score tumors that require no treatment intervention. Distinguishing the many indolent tumors from the minority of lethal ones remains a major clinical challenge. We now show that low Gleason score prostate tumors can be distinguished as indolent and aggressive subgroups on the basis of their expression of genes associated with aging and senescence. Using gene set enrichment analysis, we identified a 19-gene signature enriched in indolent prostate tumors. We then further classified this signature with a decision tree learning model to identify three genes—FGFR1, PMP22, and CDKN1A—that together accurately predicted outcome of low Gleason score tumors. Validation of this three-gene panel on independent cohorts confirmed its independent prognostic value as well as its ability to improve prognosis with currently used clinical nomograms. Furthermore, protein expression of this three-gene panel in biopsy samples distinguished Gleason 6 patients who failed surveillance over a 10-year period. We propose that this signature may be incorporated into prognostic assays for monitoring patients on active surveillance to facilitate appropriate courses of treatment.

INTRODUCTION

With more than 200,000 new diagnoses per year (1), prostate cancer is one of the most prevalent forms of cancer in men over the age of 50. Several factors, including an increase in the aging population and widespread screening for the biomarker prostate-specific antigen (PSA), have contributed to a substantial rise in diagnoses of early-stage prostate tumors, many of which require no immediate therapeutic intervention (2–4). Indeed, the primary means of determining the appropriate treatment course for men diagnosed with prostate cancer still relies on Gleason grading, a histopathological evaluation that lacks a precise molecular correlate (5). Immediate treatment is recommended for patients whose tumor biopsies are assigned a high Gleason score (Gleason ≥8). However, the appropriate treatment for patients whose tumor biopsies are assigned a low (Gleason 6) or intermediate (Gleason 7) Gleason score remains ambiguous.

The current lack of reliable and reproducible assays to identify tumors destined to remain indolent has resulted in substantial overtreatment of patients who would not have died from prostate cancer if the disease had been left untreated (4, 6–8). Consequently, a practice called “watchful waiting” (9) or, more recently, “active surveillance” (10–12) has emerged as an alternative for monitoring men with potentially low-risk prostate cancer, with the intention of avoiding treatment unless there is evidence of disease progression. The advantage is to minimize overtreatment; however, the concern is that active surveillance may miss the opportunity for early intervention in tumors that are seemingly low risk but actually are aggressive. Thus, there is a critical need to identify biomarker panels that distinguish most of the low Gleason score tumors that will remain indolent from the few that are truly aggressive. Unfortunately, identification of such biomarkers has been hampered by the fact that, unlike other cancer types, prostate cancer has proven remarkably resilient to classification into molecular subtypes associated with distinct disease outcomes (13, 14). In addition, an inherent lack of understanding of the biological processes that distinguish indolence from aggressiveness is a considerable limitation for identifying relevant biomarkers.

One of the most significant risk factors associated with prostate cancer is aging (13), which represents a balance of antitumorigenic and protumorigenic signals. One of the principal antitumorigenic signals is senescence (cellular aging) (15–18). Indeed, it is now widely appreciated that senescence plays a critical role in tumor suppression in general and has been associated with benign prostate lesions in humans (19, 20) and in mouse models (21). Thus, in the current study, we asked whether prostate tumors destined to remain indolent can be distinguished from those destined to become aggressive on the basis of differences in cellular processes associated with aging and senescence, and if so, whether we can identify related biomarkers of indolence versus aggressiveness.

Using gene set enrichment analysis (GSEA), we now show that a gene signature representing biological processes of aging and senescence can distinguish indolent from aggressive prostate tumors. Analyses of enriched genes led to the identification of a 19-gene “indolence signature,” which was then interrogated using a decision tree algorithm to identify a three-gene panel that accurately predicts outcome of low Gleason score prostate tumors. We demonstrated the prognostic accuracy of this three-gene panel on biopsies from patients monitored by active surveillance and, therefore, the potential clinical utility of this biomarker panel.

RESULTS

A gene signature of aging and senescence distinguishes indolent versus aggressive prostate cancer

We designed a bioinformatics approach to test the hypothesis that indolent and aggressive prostate tumors can be distinguished on the basis of expression of genes associated with cellular processes of aging and senescence; specifically, such genes should be up-regulated in indolent tumors and down-regulated in aggressive ones (Fig. 1). We first generated a literature-, pathway-, and manually curated 377-gene signature associated with aging and senescence (Fig. 1, step 1, and table S1). This gene signature was assembled primarily from meta-analyses of aging-related genes (22) and, accordingly, was enriched for biological pathways associated with various aging-associated diseases, but not for protumorigenic pathways such as those associated with cell proliferation. The 377-gene signature had virtually no overlap with previously identified signatures associated with cell proliferation (23, 24).

Fig. 1 Study design.

Step 1: Assembly of a 377-gene signature enriched for cellular processes associated with aging and senescence (table S1). Step 2: GSEA using the 377-gene signature to query (i) aggressive human prostate tumors from Yu et al. (25), (ii) aggressive cancers from lung and breast followed by meta-analyses with the human prostate data set, and (iii) cross-species analysis on indolent mouse prostate lesions from Ouyang et al. (31). The intersection of the leading edge from mouse prostate lesions and the lagging edge from the meta-analyses of human aggressive cancers led to identification of a 19-gene indolence signature (table S5). The indolence signature was validated on human prostate tumors from Taylor et al. (14). Step 3: Decision tree learning to classify the 19-gene indolence signature to identify a three-gene prognostic panel of indolent prostate cancer using Sboner et al. (33). Step 4: Validation of the three-gene panel at the mRNA and protein levels. Step 5: Validation of the three-gene panel on biopsies from Gleason grade 6 patients.

We then performed GSEA to evaluate whether the aging and senescence signature was enriched for genes down-regulated in aggressive human prostate cancer and up-regulated in indolent prostate cancer (Fig. 1, step 2). We extended these analyses to infer that the intersection of the genes enriched among those down-regulated in aggressive human prostate cancer (that is, the lagging edge) and up-regulated in indolent prostate cancer (that is, the leading edge) would identify those genes that are most closely associated with indolence (that is, an indolence signature; Fig. 1, step 2). For these and subsequent analyses, we used published expression profiling data sets either to discover or refine gene sets for classification purposes (training sets) or to validate their statistical power and performance (test/validation sets), but never for both purposes (Fig. 1 and Table 1).
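As a rough illustration of the enrichment statistic behind these analyses, the following sketch computes an unweighted, Kolmogorov-Smirnov-like running-sum enrichment score for a gene set against a ranked gene list. It returns the leading-edge genes when the score is positive and the lagging-edge genes when it is negative. It is a simplified stand-in and not the GSEA software used in the study: it omits the correlation weighting, the permutation-based P values, and the normalization that yields an NES. The function name and any gene labels passed to it are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (ES) for a ranked gene list.

    ranked_genes: genes ordered from most up-regulated to most down-regulated.
    gene_set: the signature being tested.
    Returns (ES, edge_genes), where edge_genes is the leading edge if ES > 0
    and the lagging edge if ES < 0.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover part, but not all, of the ranked list")

    running, best, best_idx = 0.0, 0.0, -1
    for i, gene in enumerate(ranked_genes):
        # Step up on a signature gene, step down on any other gene.
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best, best_idx = running, i

    if best > 0:
        # Leading edge: signature genes at or before the positive peak.
        edge = [g for g in ranked_genes[: best_idx + 1] if g in members]
    else:
        # Lagging edge: signature genes after the negative trough.
        edge = [g for g in ranked_genes[best_idx + 1 :] if g in members]
    return best, edge
```

Under these assumptions, a signature concentrated at the top of the ranking gives a positive score with a leading edge, and one concentrated at the bottom gives a negative score with a lagging edge, which mirrors the sign convention of the NES values reported here.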

Table 1 Clinical and pathological features of the human prostate cancer data sets.

The table describes the characteristics of the samples within the given data sets that were actually used in the study (see table S2). HICCC, Herbert Irving Comprehensive Cancer Center; MAD, median absolute deviation; NA, information was either not available or not applicable; SVI, seminal vesicle involvement; TURP, transurethral resection of prostate.


To evaluate the expression of the 377-gene signature of aging and senescence in aggressive prostate cancer, we performed GSEA analyses using the Yu et al. data set, which includes a subset of aggressive, locally invasive prostate tumors (n = 29) with adjacent normal prostate tissue (n = 58) as controls (25) (Table 1 and table S2A). Consistent with our hypothesis, the 377-gene signature was enriched for genes down-regulated in aggressive prostate tumors compared with the normal controls [normalized enrichment score (NES) = −1.87; P < 0.001] (Fig. 2A and table S3A). Other aggressive epithelial cancers, lung and breast (26, 27), also showed significant enrichment of genes in the 377-gene signature that were down-regulated in aggressive tumors (NES = −1.90 and −1.52, respectively; P < 0.001 in both cases) (fig. S1A and table S3, B and C). Meta-analysis of the down-regulated (that is, lagging-edge) genes from the prostate, lung, and breast tumors led to a refinement of the original 377-gene signature to a subset of 68 genes that were most significantly enriched in aggressive tumors (table S4A). These findings support the hypothesis that genes associated with aging and senescence are enriched among down-regulated genes in aggressive prostate cancer as well as in other epithelial cancers.

Fig. 2 A gene signature of aging and senescence stratifies human prostate cancer.

(A to C) Identification of an indolence signature. GSEA analyses using the 377-gene signature to query expression profiles from aggressive prostate tumors [(A), from Yu et al.] and mouse indolent prostate cancer [(B), from Ouyang et al.]. (C) Intersection from the lagging edge in the meta-analyses of aggressive tumors and the leading edge in the mouse indolent lesions to identify the 19-gene indolence signature. (D to F) Validation of an indolence signature. (D) GSEA analyses on aggressive prostate tumors from the Taylor et al. cohort (Gleason scores 8 and 9; BCR <22 months; n = 15). (F) Low Gleason score [Gleason scores 6 and 7 (3 + 4)] prostate tumors from the Taylor et al. cohort separated into a short time to BCR group (BCR <35 months; n = 5) and a group that displayed no evidence of recurrence over a long time period (BCR >100 months; n = 5). (E) Summary of the enrichment scores from GSEA analyses done on all Gleason 6 prostate tumors (n = 41) partitioned by time interval free of BCR. Leading- and lagging-edge genes from each GSEA plot are provided in table S3; genes in the indolence signature are provided in table S5.

Cross-species analysis identifies a 19-gene indolence signature

Because the 377-gene set is enriched for genes that are down-regulated in aggressive prostate cancers (Fig. 2A), we expected that the most informative genes in this signature should be up-regulated in indolent prostate tumors. However, independent human data sets that contained purely indolent prostate tumors were not available to evaluate this hypothesis. Therefore, as a source of purely indolent prostate lesions, we performed cross-species analyses using a well-characterized mouse model of preinvasive prostate cancer, which is based on germline loss of function of the Nkx3.1 homeobox gene (28, 29). This cross-species approach, which uses enrichment analyses of a relatively homogeneous mouse model to “filter” the characteristically heterogeneous human prostate tumors, also enabled identification of the most conserved and relevant genes in the 377-gene signature.

Human NKX3.1 is localized to a chromosomal cancer hotspot, 8p21, which is frequently lost in prostate intraepithelial neoplasia (PIN), and down-regulation of NKX3.1 expression is associated with cancer initiation, although this event is not sufficient for overt carcinoma development (30). Targeted inactivation of Nkx3.1 in mice leads to PIN, which does not progress to adenocarcinoma even in aged mice (28, 29) (fig. S2, A to D). Further, this age-associated arrest in cancer progression in the Nkx3.1 mutant mice is coincident with elevated cellular senescence and abrogation of cellular proliferation (fig. S2, E to I). Because the Nkx3.1 mutant mice develop preinvasive prostate lesions with an aging-associated halt in tumor progression that is coincident with cellular senescence, we reasoned that they would provide a relevant model of indolent prostate cancer.

We performed GSEA using expression profiles on prostate tissue from aged Nkx3.1 homozygous mutant mice and age-matched wild-type control mice (n = 9 per group) (31). Whereas the 377-gene signature was enriched for genes down-regulated in aggressive human prostate tumors (that is, in the lagging edge) (Fig. 2A), the indolent prostate lesions from the Nkx3.1 mutant mice were enriched for genes that were up-regulated in the 377-gene signature (that is, in the leading edge) relative to control mice (NES = 1.81; P < 0.001) (Fig. 2B and table S3D). We reasoned that the intersection of genes down-regulated in aggressive human tumors (that is, the 68 genes from the meta-analysis of human cancers) and those up-regulated in the indolent prostate lesions from the Nkx3.1 mice (that is, the 73 genes from the leading edge) would identify the most consistently regulated genes for an effective indolence classifier (Fig. 2C). Indeed, these analyses identified 19 genes that were significantly up-regulated in indolent human prostate cancers and down-regulated in aggressive human prostate tumors from the meta-analyses of prostate, breast, and lung (referred to as the 19-gene indolence signature) (Fig. 2C and table S5). This intersection was highly statistically significant compared to the random selection model (P < 0.001, by Fisher’s exact test), which suggested that these genes are under coordinated regulation in aggressive and indolent tumors and thus are well suited for classification of these states. Together, these findings support the hypothesis that genes associated with aging and senescence can distinguish among prostate cancers according to aggressive versus indolent behavior.
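The significance of an overlap between two gene lists can be sketched as a hypergeometric tail probability, which is equivalent to a one-sided Fisher's exact test on the 2 x 2 membership table. The text does not state which background gene universe was used for the random selection model, so the sketch below leaves it as an explicit parameter; the function name is hypothetical and the code is illustrative rather than the authors' procedure.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """One-sided P(X >= overlap) when set B is drawn at random from the universe.

    universe: number of genes in the background (an assumption the caller must supply).
    size_a, size_b: sizes of the two gene lists being intersected.
    overlap: number of genes observed in both lists.
    """
    if not (0 <= overlap <= min(size_a, size_b) <= max(size_a, size_b) <= universe):
        raise ValueError("inconsistent set sizes")
    total = comb(universe, size_b)
    # Sum the hypergeometric probabilities of every overlap at least as large.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    return tail / total
```

For a toy case with a universe of 10 genes, lists of 4 and 3 genes, and an observed overlap of 2, `overlap_p_value(10, 4, 3, 2)` evaluates to 40/120, or one third. The result depends strongly on the chosen universe, so the background should be reported alongside any P value.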

Aging and senescence gene signature distinguishes disease outcome of low Gleason score prostate cancer

To independently validate these observations, we used the Taylor et al. data set, which is one of the few publicly available human prostate cancer data sets with extensive clinical outcome data (14) (Table 1). The Taylor et al. data set contains a substantial number of prostatectomy samples (n = 131) with adjacent normal control tissue samples (n = 29) from patients that encompass a wide range of Gleason scores and times to biochemical recurrence (BCR), as measured by increased levels of PSA (14) (Table 1 and table S2B). This data set includes a significant number (n = 13) of samples from aggressive prostate tumors (Gleason score of 8 or 9) that displayed a short time to BCR (<22 months) (Table 1 and table S2B). GSEA of these high Gleason grade tumors relative to controls demonstrated the tumors’ similarity with aggressive tumors from the Yu et al. data set; indeed, genes that were down-regulated in the aggressive Taylor et al. tumors also were significantly enriched in the 377-gene signature developed from the Yu et al. data set (NES = −2.60 and P < 0.001) and included 18 of the 19 genes in the 19-gene indolence signature (Fig. 2D and table S5). Therefore, the specific enrichment of the 377-gene signature was conserved in an independent data set from aggressive human prostate cancers.

The Taylor et al. data set also contains a substantial number of low Gleason score tumors [Gleason score 6, n = 41; Gleason score 7 (3 + 4), n = 54] with varying times of progression to BCR ranging from >100 months (indolent) to <35 months (aggressive) (Table 1 and table S2B). Thus, we asked whether we could recapitulate the differential enrichment of the 377-gene signature in the indolent versus aggressive tumors, limiting our analyses to only low Gleason score prostate tumors (Fig. 2, E and F, and fig. S2B). For these and most subsequent analyses, we focused primarily on Gleason score 6 tumors, but for increased statistical power, we also included the subset of Gleason score 7 tumors that were scored as having a combined Gleason score of 3 + 4 [that is, those having tumors with more Gleason 3 than Gleason 4; herein, we refer to these combined Gleason 6 and Gleason 7 (3 + 4) tumors as low Gleason score tumors]. We consistently found in our molecular analyses that Gleason 7 tumors scored as 3 + 4 behaved more like Gleason score 6 tumors, whereas those scored as having a combined Gleason score of 4 + 3 (that is, those having more Gleason 4 than Gleason 3) behaved more like the advanced Gleason score (8 or 9) tumors; these findings agree with a recent study by Sowalsky and colleagues showing that Gleason 3 + 4 lesions have different molecular features and progressive potential relative to 4 + 3 lesions (32).

First, we performed GSEA on the low Gleason score prostate tumors from the Taylor et al. data set to evaluate enrichment of the aging and senescence 377-gene signature in the two extreme patient groups (that is, the most lethal versus the most indolent). In particular, the first group included patients with a short time to BCR [the aggressive group, Gleason score 6 and 7 (3 + 4) tumors with times to BCR of <35 months; n = 5], and the second included patients whose tumors did not recur within the considerable follow-up period of >100 months [the indolent group, Gleason score 6 and 7 (3 + 4) tumors with times to BCR of >100 months; n = 5] (Fig. 2F and table S2B). GSEA analyses demonstrated that the 377-gene signature was enriched in genes up-regulated in the indolent group (BCR >100 months), with a positive NES score (NES = 1.52, P < 0.001), whereas the 377-gene signature was enriched in genes down-regulated in the aggressive group (BCR <35 months), with a negative NES score (NES = −1.85, P < 0.001; Fig. 2F and table S3, E and F).

We further assessed enrichment of the 377-gene signature only in indolent versus aggressive tumors with Gleason scores of 6 from the Taylor et al. data set. We partitioned the Gleason score 6 patients into subgroups that represented varying intervals to BCR: >1 month (n = 41), >35 months (n = 32), >50 months (n = 20), >65 months (n = 8), >80 months (n = 5), and >100 months (n = 3), and then performed GSEA on each of these subgroups. Although all of the subgroups displayed enrichment of expression differences in genes in the 377-gene signature relative to the controls, the direction of the expression change for the enriched genes was dependent on the interval to BCR (Fig. 2E). In particular, Gleason grade 6 tumors with a longer interval to BCR (>65, >80, and >100 months) were enriched in up-regulated aging and senescence genes (that is, the leading edge of the 377-gene signature) and had positive NES scores, whereas those with a shorter interval to BCR (>1, >35, or >50 months) were enriched in down-regulated aging and senescence genes (that is, the lagging edge of the 377-gene signature) and had negative NES scores (Fig. 2E and fig. S1B).

Together, these GSEAs suggest that differential enrichment of genes in an aging and senescence gene signature can distinguish low Gleason score tumors that are destined to remain indolent from those destined to become aggressive. Furthermore, meta-analyses of the leading- and lagging-edge genes in these indolent versus aggressive subgroups of Gleason 6 tumors included most of the genes in the 19-gene indolence signature (14 of 19 genes; table S5). Together, these findings demonstrate that low Gleason score prostate tumors can be distinguished as indolent or aggressive on the basis of enrichment for an aging and senescence gene signature and constitute an independent validation of the indolence signature.

A three-gene prognostic biomarker panel can classify low Gleason score prostate tumors

Although the 19-gene indolence signature is differentially enriched in indolent versus aggressive subtypes, it is not sufficient to stratify patients using Kaplan-Meier analyses (fig. S5A). Thus, we sought to identify a minimal subset(s) of genes among those in the 19-gene indolence signature that most effectively predicts clinical outcome for low Gleason score prostate tumors. In particular, we used a decision tree learning model to evaluate gene combinations among the 19-gene signature that best distinguish indolent versus lethal prostate tumors (Figs. 1, step 3, and 3A). The decision tree model iteratively partitions patients according to the expression state of the gene with the highest predictive value, considering both synergistic and antagonistic effects between genes, and terminating once further partitioning has no additional statistical predictive value. Each leaf node in the resulting predictive tree corresponds to a set of patients with predicted prognostic outcome; each branch corresponds to the expression state of a predictive gene, and a walk from the root of the tree to a leaf node reveals the expression state of the gene panel used to predict outcome at the leaf node.
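The partitioning logic described above can be sketched as a small greedy tree builder. This is a generic CART-style illustration under stated assumptions, not the authors' model: expression is assumed to be already binarized into low (0) and high (1) states, Gini impurity is swapped in as the splitting criterion, and the synergistic and antagonistic gene effects mentioned in the text are not modeled. All function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(samples, labels, genes, max_depth=3):
    """Greedy tree over binary expression states (0 = low, 1 = high).

    samples: list of dicts mapping gene name -> 0/1.
    labels: outcome strings such as "indolent" or "lethal".
    Returns a label (leaf) or a tuple (gene, low_subtree, high_subtree).
    """
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or gini(labels) == 0.0:
        return majority

    best_gene, best_score = None, gini(labels)
    for gene in genes:
        low = [y for s, y in zip(samples, labels) if s[gene] == 0]
        high = [y for s, y in zip(samples, labels) if s[gene] == 1]
        if not low or not high:
            continue  # this gene does not partition the patients
        # Weighted impurity of the two child nodes.
        score = (len(low) * gini(low) + len(high) * gini(high)) / len(labels)
        if score < best_score:
            best_gene, best_score = gene, score
    if best_gene is None:
        return majority  # no split improves purity, so stop here

    rest = [g for g in genes if g != best_gene]
    branches = []
    for state in (0, 1):
        idx = [i for i, s in enumerate(samples) if s[best_gene] == state]
        branches.append(build_tree([samples[i] for i in idx],
                                   [labels[i] for i in idx], rest, max_depth - 1))
    return (best_gene, branches[0], branches[1])


def predict(tree, sample):
    """Walk from the root to a leaf following the sample's expression states."""
    while isinstance(tree, tuple):
        gene, low, high = tree
        tree = high if sample[gene] == 1 else low
    return tree
```

With `max_depth=3`, any root-to-leaf walk uses at most three genes, which loosely corresponds to reading a three-gene panel off the tree.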

Fig. 3 A decision tree learning model identifies a three-gene prognostic panel.

(A) Schematic representation of the decision tree learning model. The decision tree algorithm systematically samples the expression states of all combinations of the 19-gene indolence signature to identify combinations most effective in segregating patients into indolent and lethal groups. The decision tree learning model was performed with the Sboner et al. data set (Table 1 and table S2). (B) Summary of the top three-gene combinations from the decision tree learning model. The first column shows combinations ranked by cross-validation error (table S6). The next two columns show independent validation using (i) the odds ratio for each of the three-gene combinations to accurately predict patient outcome (indolence or lethality) using confusion matrices (fig. S4) and (ii) Kaplan-Meier analyses of low Gleason score patients using the Taylor et al. data set. Log-rank P values are summarized here, and Kaplan-Meier plots are shown in (C) and fig. S5. (C) Kaplan-Meier analysis of patients with low Gleason scores [Gleason 6 and 7 (3 + 4); n = 95] from Taylor et al. showing stratification of FGFR1, PMP22, and CDKN1A for fast-recurring versus slow-recurring patients. The log-rank P value is indicated. (D and E) C-statistical analysis and Cox proportional hazard model on Gleason 6 and 7 (3 + 4) patients comparing the performance of FGFR1, PMP22, and CDKN1A expression levels with the D’Amico classification or with Gleason score alone. DF, degree of freedom.

We performed decision tree analyses using an independent data set, namely, the Swedish watchful waiting cohort of Sboner et al., which includes expression profiles from transurethral resection of prostate (TURP) specimens from 281 patients with localized prostate cancer who were followed for up to 30 years (33). Notably, this data set differs from the Taylor et al. data set in several important respects: (i) sample collection in the Sboner et al. cohort predates the PSA screening era (tissues collected before 1996); (ii) expression profiles were obtained from TURP rather than from prostatectomies; and (iii) the primary endpoint in the Sboner et al. cohort is death due to prostate cancer rather than time to BCR, as in the Taylor et al. cohort (Table 1). Considering these important distinctions between the Taylor et al. and the Sboner et al. cohorts, biomarkers that show consistent stratification power in both are expected to be robust.

To focus on genes that most effectively inform outcome, we limited our analysis to the extreme outcome cases in the Sboner et al. data set. Specifically, we identified an “indolent group” with long-term survival following initial diagnosis (t ≥ 10 years; n = 25), and a “lethal group” in which patients died early from prostate cancer (t < 4 years; n = 29) (Table 1 and table S2). Thus, the decision tree was constructed using these extreme patient groups in the Sboner et al. training set.

Among thousands of possible trees evaluated in the decision tree model, only 14 three-gene combinations had cross-validation power greater than 0.25 (fig. S3A and table S6). Trees with significant predictive power repeatedly included CDKN1A, FGFR1, PMP22, Clusterin, and CLIC4 (Fig. 3B and table S6A). We tested the top-ranked combinations for predictive accuracy using confusion matrices to “score” predicted versus actual indolent and lethal cases (Fig. 3B and fig. S4). First, we assembled a test set from cases in Sboner et al. that had not been used for decision tree learning (n = 28 indolent and 8 lethal; Table 1 and table S2). Then, we used each gene panel to classify patients on the basis of survival. The best gene panel (odds ratio = 1.94) identified from confusion matrix analysis was also the top-ranked panel from the decision tree model. This panel included FGFR1, PMP22, and CDKN1A (Fig. 3B and fig. S4) and was selected as our candidate biomarker panel to further evaluate for stratifying low Gleason score prostate tumors.
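Scoring predicted against actual outcomes with a confusion matrix, and summarizing it as an odds ratio, can be sketched as follows. The text does not spell out how the odds ratio was derived, so this uses the common diagnostic odds ratio, (TP x TN) / (FP x FN), as an assumption. The function names and the optional zero-cell correction are illustrative choices.

```python
def confusion_matrix(predicted, actual, positive="indolent"):
    """Return (tp, fp, fn, tn) counts, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for p, a in zip(predicted, actual):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def odds_ratio(tp, fp, fn, tn, correction=0.5):
    """Diagnostic odds ratio (tp * tn) / (fp * fn).

    If any cell is zero, `correction` is added to every cell (the
    Haldane-Anscombe adjustment) so that the ratio stays finite.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    return (tp * tn) / (fp * fn)
```

An odds ratio above 1 indicates that a predicted "indolent" call is more likely for truly indolent cases than for lethal ones; a value of 1 corresponds to no discrimination.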

The three-gene panel was validated at the mRNA and protein levels

We first validated the prognostic accuracy of the three-gene panel (that is, FGFR1, PMP22, and CDKN1A) at the mRNA expression level (Fig. 1, step 4), using the low Gleason score [that is, Gleason score 6 and Gleason score 7 (3 + 4)] tumors from Taylor et al. (n = 95; Table 1 and table S2). The ability of the three-gene panel to segregate these low Gleason score tumors into low- and high-risk groups was evident in k-means clustering (fig. S3B), an unsupervised clustering approach that relies only on similarity of gene expression in different samples without using any clinical information about the patients. Furthermore, as evident by Kaplan-Meier analysis, the three-gene panel (FGFR1, PMP22, and CDKN1A) robustly segregated the low Gleason score prostate tumors into high- and low-risk groups on the basis of time to BCR (n = 95 cases; P = 0.005) (Fig. 3C).
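For readers unfamiliar with Kaplan-Meier analysis, the estimator behind such curves can be sketched in a few lines. This minimal version assumes right-censored follow-up data, with an event flag of 1 for observed BCR and 0 for censoring, and returns the step function of event-free probability. It does not implement the log-rank test used to compare risk groups, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the event-free probability.

    times: follow-up time for each patient (e.g. months to BCR or last visit).
    events: 1 if the event (e.g. BCR) was observed, 0 if censored.
    Returns a list of (time, survival) steps at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_total = 0
        # Group every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_total += 1
            i += 1
        if n_events:
            # Multiply by the fraction of at-risk patients who stay event-free.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= n_total
    return curve
```

Computing one such curve per predicted risk group and comparing them (for example with a log-rank test from a statistics package) corresponds to the stratification analysis described here.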

In these and subsequent analyses, we consistently observed that stratification of low Gleason score tumors by the three-gene panel was more effective than for the entire patient population, including higher Gleason score tumors (n = 131; P = 0.047) (fig. S5B). Furthermore, the three-gene panel was significantly more effective in segregating high- and low-risk patients than was the 19-gene indolence signature (compare Fig. 3C with fig. S5, A and B), which further demonstrates the efficacy of the decision tree learning model for selecting the most clinically relevant biomarkers among the 19-gene signature. Notably, only one of the other top six gene combinations from the decision tree model (FGFR1, B2M, and CDKN1A) was significant (P = 0.02) in stratifying low Gleason score prostate tumors into high- and low-risk groups (Fig. 3B and fig. S5C), and it is noteworthy that this combination shares two genes in common with the three-gene panel. Finally, although certain individual genes (FGFR1, PMP22, and CDKN1A) had prognostic power in some assays, only the three-gene panel was consistently observed to have prognostic potential in all of the models and cohorts evaluated (see fig. S6).

The prognostic value of the three-gene panel was further evident using C-statistics in comparison with pathological Gleason score or the D’Amico classification nomogram, which takes into account Gleason score, clinical T stage, and PSA levels (34) (Fig. 3D). In particular, the three-gene panel performed better [C-index, 0.86; confidence interval (CI), 0.65 to 1.0; P = 3.3 × 10⁻⁴] than either Gleason score alone (C-index, 0.82; CI, 0.54 to 1.0; P = 0.010) or the D’Amico classification alone (C-index, 0.72; CI, 0.52 to 0.90; P = 0.012), whereas the three-gene panel significantly improved prognostic capability when combined with either Gleason or D’Amico (C-index, 0.89; CI, 0.74 to 1.0; P = 4.7 × 10⁻⁸ and C-index, 0.83; CI, 0.73 to 0.95; P = 1.8 × 10⁻⁹, respectively) (Fig. 3D). Furthermore, multivariate Cox proportional hazard analysis showed that the three-gene panel together with Gleason had statistically significant improved prognostic ability over using Gleason alone (P = 0.04). For D’Amico classification, the improved prognostic ability was mostly a result of additive effects of the three-gene panel, which was significant (P = 0.017). This improvement was diluted by the high degrees of freedom of the full interaction model between D’Amico covariates and the three-gene panel prediction (P = 0.11) (Fig. 3E). Together, these findings demonstrate the independent prognostic value of the three-gene panel at the mRNA level.
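A C-index of the kind compared above can be illustrated with Harrell's concordance index for right-censored data. The text does not specify which C-statistic estimator was used, so Harrell's definition is an assumption here, and the sketch does not compute the confidence intervals or P values reported. The function name is hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when the patient with the shorter time had an
    observed event. It is concordant when that patient also has the higher
    risk score; ties in risk score count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Patient i must fail first, and the failure must be observed.
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to a predictor no better than chance and 1.0 to perfect ranking of patients by time to event, which is the scale on which the C-index values in this section should be read.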

We extended these findings to evaluate whether the three-gene panel was also prognostic at the protein level (Fig. 1, step 4). We performed immunohistochemical staining on a tissue microarray (TMA) composed of primary prostate tumors that corresponded to a wide range of Gleason scores, although we focused on the low Gleason score tumors [that is, Gleason 6 and Gleason 7 (3 + 4)] (Fig. 4, A and B, Table 1, and fig. S7). The predictive accuracy of the three-gene panel was supported by unsupervised k-means clustering analyses, in which there was two- to fourfold higher staining intensity for tumors classified in the indolent versus the aggressive clusters (fig. S3C). Moreover, Kaplan-Meier analyses revealed that the protein expression levels of FGFR1, PMP22, and CDKN1A effectively stratified the low Gleason score tumors into high- and low-risk groups (P = 0.015) (Fig. 4B).

Fig. 4 The three-gene predictive panel shows predictive accuracy at the protein expression level.

(A) Analyses of TMAs immunostained for FGFR1, PMP22, and CDKN1A showing representative cases of Gleason grade 6 tumors that were indolent or lethal. (B) Kaplan-Meier analysis for patients with Gleason 6 and 7 (3 + 4) included in the Herbert Irving Comprehensive Cancer Center (HICCC) TMAs (n = 44) separated into high-risk versus low-risk cancers by expression of FGFR1, PMP22, and CDKN1A proteins. The log-rank P value is indicated. (C) C-statistical analysis and Cox proportional hazard models for Gleason 6 and 7 (3 + 4) patients from the TMAs comparing the performance of FGFR1, PMP22, and CDKN1A protein expression levels with Gleason score. (D) Representative immunohistochemical results from the nonfailed and failed biopsy groups of Gleason 6 patients monitored by surveillance (see Table 1). Shown are expression levels of FGFR1, PMP22, and CDKN1A proteins. (E) Summary of analyses of initial biopsy samples using all the failed cases (n = 14) in the active surveillance cohort (Table 1) compared to nonfailed cases (n = 19) and validated with a second group of nonfailed cases (n = 10) from the same cohort.

Furthermore, C-statistic analyses of this cohort revealed that the three-gene panel performed significantly better (C-index, 0.95; CI, 0.90 to 1.0; P = 2.0 × 10⁻⁵⁴) than Gleason score alone, which in this cohort displayed a relatively low C-index (C-index, 0.62; CI, 0.34 to 0.89; P = 0.198), whereas the three-gene panel significantly improved the prognostic accuracy of the Gleason score (C-index, 0.82; CI, 0.70 to 0.94; P = 1.0 × 10⁻⁷) (Fig. 4C). In addition, multivariate Cox proportional hazard analyses showed that the three-gene panel together with Gleason had improved prognostic ability (P = 0.034) over using Gleason alone (Fig. 4C). Together, these findings demonstrate that the three-gene panel can accurately stratify low Gleason score primary prostate tumors at both the mRNA and protein levels, and provides independent prognostic information that improves the predictions of widely used clinical nomograms.

The three-gene panel shows prognostic capability on biopsy samples from surveillance patients

Given these findings, we asked whether analyses of protein expression of the three-gene panel could be effectively incorporated into clinical diagnosis of patients with low Gleason score prostate cancer (Fig. 1, step 5). Toward this end, we performed retrospective analyses of biopsy specimens from patients who had been monitored by surveillance in the Department of Urology at Columbia University Medical Center from 1992 to 2012 (35). In particular, we assembled a cohort of patients who had presented with clinically low-risk prostate cancer as defined by the following: normal digital rectal exam (DRE), serum PSA <10 ng/ml, biopsy Gleason score ≤6 in no more than 2 cores, and cancer involving no more than 50% of any core on at least a 12-core biopsy (35). The protocol to monitor these patients included DRE and serum PSA testing every 3 months, and repeat biopsy every 12 months for the first 3 years and every 18 months for the next 3 years, or a “for-cause” biopsy if there was any sign of progression (such as abnormal DRE and increasing PSA). As long as all parameters and biopsy findings remained stable, patients were advised to remain on the surveillance protocol (and are referred to here as “nonfailed”). Patients were considered to have “failed” surveillance if they showed increasing cancer grade or volume on biopsy. Notably, all patients included in the “failed” group herein had failed on the basis of defined clinical parameters; the group does not include, for example, patients who opted to undergo treatment for other reasons such as anxiety about having an untreated cancer.

From a consecutive series of 213 patients that strictly adhered to the above criteria, we identified all patients that failed surveillance for which the initial biopsy tissue was available (n = 14) (Table 1). For comparison, we analyzed an equivalent group of patients that did not fail surveillance for at least 10 years for which initial biopsy tissue was available (n = 29) (Table 1). Note that in both cases, we evaluated the initial biopsies used to enroll the patients to surveillance monitoring.

Immunohistochemical analyses of these failed and nonfailed groups of biopsy samples showed a marked correlation between the expression of FGFR1, PMP22, and CDKN1A and outcome (Fig. 4, D and E, and fig. S7). In particular, all of the biopsies from the Gleason 6 patients that did not fail surveillance had robust and fairly uniform levels of expression of FGFR1, PMP22, and CDKN1A (average composite staining score of 4.11 ± 1.0). In marked contrast, the biopsies from the Gleason 6 patients that had failed active surveillance had reduced staining overall, as well as much more variable levels of FGFR1, PMP22, and CDKN1A (average composite staining score of 1.71 ± 1.2). Notably, the difference in the protein expression levels of the biomarker panel (FGFR1, PMP22, and CDKN1A) in these Gleason 6 biopsy samples from patients that had failed or had not failed surveillance was significant (P = 1.5 × 10⁻⁵, t test), suggesting that expression levels of this biomarker panel can be used to develop a prognostic indicator for these low Gleason score prostate tumors. Thus, these findings support the idea that detection of FGFR1, PMP22, and CDKN1A on biopsy samples can be evaluated for use, in conjunction with other clinical parameters, to identify the subset of patients with low Gleason score prostate tumors that are likely to progress to aggressive disease.

DISCUSSION

Many newly diagnosed cases of prostate cancer now present with low Gleason score tumors that are considered to be clinically low risk and destined to remain asymptomatic throughout patients’ natural lives; however, a minority of these will progress to aggressive, lethal tumors. Current clinical practices are ineffective in distinguishing, at the earliest disease stages, which of these low Gleason score tumors will remain indolent versus those that will progress to aggressive disease, which has contributed to a serious dilemma of overtreatment. As a consequence, it has been recommended that screening for prostate cancer should be more limited to reduce the adverse effects of overtreatment; however, more limited screening may ultimately mean missed opportunities for detecting aggressive tumors at early disease stages when they might have been most effectively treated. Thus, to realize the benefits of early detection, while minimizing the adverse consequences of overtreatment, there is a clear need to develop more effective approaches to predict the risk of low Gleason score prostate tumors. We have now identified a three-gene panel—FGFR1, CDKN1A, and PMP22—that provides accurate prognostic information regarding the outcome of low Gleason score tumors, including on biopsies from patients diagnosed with Gleason score 6 tumors that had been monitored on an active surveillance protocol. We propose that this three-gene panel be evaluated for its ability to provide prognostic assessments of men with low Gleason score tumors to monitor their progress on active surveillance protocols.

Identification of biomarker panels that can be used in clinical practice to provide accurate prognostic information for distinguishing outcome of low Gleason score prostate tumors has proven to be difficult. We attribute the apparent success of our current study to several critical and unique parameters inherent in our approach. The first is the underlying hypothesis, which presumes that prostate tumors destined to remain indolent should be enriched for cellular processes associated with aging and cellular senescence. Indeed, whereas the relationship of aging to cancer is long known to be complex (36), the association of aging and cellular senescence is well appreciated for its essential tumor-suppressive role in many cancers (15–18), including prostate cancer (19–21). Our current findings extend these biological observations by demonstrating the potential clinical value of using molecular models based on aging and senescence-related signatures to distinguish outcomes for prostate cancer, and suggest that such an approach may be more broadly applicable to other aging-associated epithelial cancers.

Second, rather than focusing on a single bioinformatic or statistical model or data set, our biomarker panel identification represents the culmination of an integrative analysis of several complementary bioinformatic and statistical models, which were interrogated using several independent data sets and including cross-species analyses. Notably, we used an unbiased approach based on a decision tree learning algorithm for refinement of an indolence signature to identify a three-gene panel, namely, FGFR1, PMP22, and CDKN1A, which accurately predicts indolent status for low Gleason score prostate tumors at both the mRNA and protein levels. The importance of using an “unbiased” approach is reflected by the identity of the genes themselves, which were (i) expected to be associated with indolent prostate cancer (CDKN1A); (ii) known to be associated with prostate cancer, but not necessarily associated with indolence (FGFR1); or (iii) not previously associated with prostate cancer at all (PMP22).

In particular, CDKN1A (p21) is a cell cycle regulatory gene whose expression is closely linked to senescence and whose down-regulation has been associated with promoting cancer progression in general, including prostate cancer (37, 38). Therefore, our current findings showing that CDKN1A (p21) expression is associated with indolence are consistent with previous studies. In contrast, our finding that expression of FGFR1 is associated with indolence was rather unexpected. FGFR1 encodes the major receptor for fibroblast growth factor (FGF) signaling in the prostate and is known to play a critical role in prostate development as well as in prostate tumorigenesis (39, 40). On the basis of previous analyses of its functional role in cancer, including a recent study that evaluated the functional consequences of FGFR1 expression in a mutant mouse model of lethal prostate cancer (41), we might have predicted that elevated expression of FGFR1 should be associated with cancer progression rather than indolence. However, the complexity of FGFR1 status in prostate cancer is highlighted by the fact that whereas a subset of aggressive, castration-resistant prostate tumors has been shown to display amplification of the gene locus that includes FGFR1 (42), in the Taylor et al. data set, the specific genomic region that includes FGFR1 is frequently deleted, which is correlated with down-regulation of FGFR1 gene expression (14).

However, among the three genes in the biomarker panel, PMP22 was the most surprising and unexpected, because it encodes a glycoprotein that comprises ~5% of total myelin protein in the nervous system and has not previously been associated with prostate cancer (43, 44). Although genetic defects involving PMP22 are associated with peripheral neuropathy and PMP22 is abundantly expressed in neurons, the gene is also expressed in other tissues and has been linked to regulating cellular proliferation and growth arrest in fibroblasts (45). Therefore, our current findings suggest a new and unexpected role for PMP22 in the prostate, which could reside either within the epithelium or in neural cells that innervate prostate tissue.

The third relatively unique aspect of our study is that it was specifically designed to focus on stratifying low Gleason score prostate tumors and, therefore, to fill a critical niche in terms of biomarker discovery (4). In particular, our approach to identify biomarker panels that can distinguish indolence among low Gleason score prostate tumors is distinct from previous studies that have largely focused on aggressiveness in advanced tumors (46). For example, a previously identified four-gene signature of aggressive tumors, including Pten, Smad4, Cyclin D1, and SPP1, does not overlap with our three-gene panel of indolence. Notably, this four-gene biomarker panel, which was identified on the basis of its ability to stratify advanced prostate tumors, was not effective for stratifying low Gleason score prostate tumors (fig. S8). This suggests the potential for developing distinct biomarker panels suitable for evaluating low or advanced Gleason score tumors.

Furthermore, our current analyses of a gene signature associated with aging and senescence differ from previously identified signatures associated with other relevant biological processes, such as cell cycle regulation (24) and “stemness” (47). Because these other signatures are distinct and have virtually no overlap with the 19-gene indolence signature identified in the current study, it may be feasible to query these signatures using decision tree analyses to identify biomarker panels that complement the three-gene panel described herein. Finally, our three-gene panel based on detection of protein expression on biopsy samples complements approaches aimed at detection of biomarkers in urine or other body fluids, as exemplified in a previous study showing stratification of cancer risk by detection of TMPRSS2:ERG and PCA3 in urine (48) and one for stratifying patients with castration-resistant prostate cancer by analyses of mRNA levels in blood (49). Thus, we envision that accurate staging in the clinical setting may ultimately encompass multiple independent biomarker approaches, which ideally will be developed for evaluation of specific clinical stages (that is, early stage or advanced) or for analyses of specific clinical materials (such as prostatectomy, biopsy, urine, blood, or circulating cells).

The new three-gene panel should be further evaluated to assess whether analyses of their expression on biopsies from patients with low Gleason score prostate tumors can be incorporated into clinical assays to distinguish indolent tumors and identify those most likely to progress to aggressive disease. When combined with other clinical parameters, such a prognostic test performed on biopsies may contribute to more effective monitoring of men on active surveillance protocols. Stratification of low Gleason score prostate cancer into indolent and aggressive subtypes may improve the landscape for effective prognosis by distinguishing those patients in need of expedited treatment interventions from those likely to remain clinically asymptomatic.

MATERIALS AND METHODS

Detailed descriptions of patient cohorts (table S2), experimental procedures, statistical analyses, and computational methods, including code (table S7), are given in the Supplementary Materials.

Study design

The study design is shown in Fig. 1. The present study was designed to test the hypothesis that molecular processes of aging and senescence distinguish indolent versus aggressive prostate cancer (Fig. 1). This hypothesis was tested by first assembling a 377-gene signature of aging and cellular senescence, which was used to query human cancer profiles (Table 1), and by performing GSEA with a mouse model of indolent prostate cancer. These experiments resulted in the identification of a 19-gene indolence signature, which was then used to perform decision tree learning with the use of an independent human cohort to identify a three-gene panel. The three-gene panel was validated at the mRNA and protein levels using independent patient cohorts, and then validated on biopsies from patients on active surveillance.

Computational methods

The 377-gene signature of aging and cellular senescence was assembled from the following sources: (i) meta-profile analyses (22), (ii) Ingenuity pathway analysis (http://www.ingenuity.com), and (iii) manual curation (50–52). A complete description of the 377-gene set is provided in table S1. GSEA was performed as described (53). Integrative P values were calculated with Fisher’s combined probability test. The decision tree learning algorithm was run by selecting the “classification” method from the “classregtree” function (MATLAB, Statistics Toolbox).
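As an illustration of how Fisher’s combined probability test integrates independent P values, the sketch below computes the statistic X = −2 Σ ln p_i and its upper-tail probability under a chi-square distribution with 2k degrees of freedom, using the closed-form tail that exists for even degrees of freedom. This is a plain-Python sketch under the assumption of independent tests; the function name `fisher_combined_p` and the example inputs are hypothetical and are not taken from the study’s own code (table S7).

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P values with Fisher's method.

    X = -2 * sum(ln p_i) follows a chi-square distribution with 2k degrees
    of freedom under the null. For even degrees of freedom the upper tail is
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!, so no special functions are
    needed.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half_stat) * total


# Hypothetical example: two independent tests, each with P = 0.05.
combined = fisher_combined_p([0.05, 0.05])
```

With a single input the function returns that P value unchanged, and with two inputs it reduces to P(1 − ln P), where P is the product of the two values, which gives a quick check on the arithmetic.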

Statistical methods

K-means clustering was done with the “kmeans” function from the Statistics Toolbox in MATLAB. For confusion matrices, accurate predictions were calculated for indolent or lethal clusters and combined to calculate an odds ratio. Kaplan-Meier analyses were conducted with a MATLAB script; P values were computed with a log-rank test. The overall C-index (54), CIs, and corresponding P values were calculated with the survcomp package of R (55). The predicted probability of survival for computing C-index was obtained through the multivariate Cox proportional hazard models. All Cox proportional hazard models fitted were examined and found to be valid for the data under study, as suggested in (56).
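The Kaplan-Meier estimate referred to above can be sketched in a few lines. The example below is written in Python rather than MATLAB and is not the script used in this study; the function name `kaplan_meier` and the toy follow-up data are assumptions for illustration. At each distinct event time the running survival probability is multiplied by (1 − d/n), where d is the number of events at that time and n is the number of patients still at risk.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier curve as a list of (time, survival) pairs,
    one entry per distinct time at which at least one event occurred.
    `events` holds 1 (or True) for an observed event and 0 for censoring."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Hypothetical toy cohort: the patient at time 2 is censored.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In the toy cohort the censored patient does not lower the curve but does shrink the risk set, so the event at time 3 halves the remaining survival probability. A log-rank test would then compare the observed and expected event counts between two such curves.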

Immunohistochemical analyses

All studies involving human subjects were approved by the Institutional Review Board of Columbia University Medical Center. TMAs were composed of primary prostate tumors obtained from the Herbert Irving Comprehensive Cancer Center Tissue Bank (Table 1). Biopsy samples were obtained from patients seen in the Department of Urology at Columbia University Medical Center from 1992 to 2012. Immunohistochemical analyses were performed with anti-FGFR1 (Abcam, catalog no. ab10646), anti-PMP22 (Sigma, catalog no. P0078), and anti-CDKN1A (BD Pharmingen, catalog no. 556431). The percentage of positive tumor cells (0 to 100%) and staining intensities (0 to 2) were assessed for each core or biopsy, and composite scores were generated.

SUPPLEMENTARY MATERIALS

www.sciencetranslationalmedicine.org/cgi/content/full/5/202/202ra122/DC1

Materials and Methods

Fig. S1. Supplementary GSEA data for human cancer.

Fig. S2. Phenotypic analysis of a mouse model of indolent prostate cancer.

Fig. S3. Supplementary data for the decision tree learning model and k-means clustering.

Fig. S4. Confusion matrices for top-ranked three-gene combinations from the decision tree learning model.

Fig. S5. Supplementary Kaplan-Meier analyses comparing the 19-gene indolence signature and the top three-gene combinations from the decision tree learning model.

Fig. S6. Supplementary Kaplan-Meier analyses for the single genes in the three-gene panel.

Fig. S7. Immunostaining of three-gene panel comparing biopsies and primary tumors.

Fig. S8. Kaplan-Meier analyses comparing the three-gene panel with biomarkers from Ding et al. (46) and Cuzick et al. (24).

Decision tree analysis to identify the best gene combinations

Unsupervised clustering analysis using k-means clustering and Kaplan-Meier survival analysis

Table S1. Description of the 377-gene set of aging and senescence.

Table S2. Description of patient samples used in this study.

Table S3. Leading/lagging-edge genes from the GSEA analyses.

Table S4. Integrative analyses of the 377-gene set.

Table S5. Description of the 19-gene indolence signature.

Table S6. Three-gene combinations from the decision tree learning model.

Table S7. SWEAVE documents.

Table S8. REporting of tumor MARKing studies (REMARK summary).

References (57–61)

REFERENCES AND NOTES

  1. Acknowledgments: We thank E. Gelmann, S. Emerson, and A. Neugut for thoughtful comments on the manuscript. We acknowledge the support of the Herbert Irving Comprehensive Cancer Center Shared Resource in Molecular Pathology for generation of the prostate cancer TMA and for providing the biopsy samples. Funding: Supported by grants CA154293 (to M.M.S. and C.A.-S.), CA084294 (to C.A.-S., M.M.S., and A.C.), CA121852 (to A.C.), Silico Research Centre of Excellence NCI-caBIG, SAIC 29XS192 (to A.C.), and an award from the T. J. Martell Foundation for Leukemia, Cancer and AIDS Research (M.C.B.). A.A. is a recipient of a Marie Curie International Outgoing Fellowship (PIOF-GA-2009-253290), co-sponsored with the Catalan Institute of Oncology–Bellvitge Institute for Biomedical Research, Barcelona, Spain. C.L.M. is supported by a fellowship from the Swiss National Science Foundation (PBBSP3-146959). C.A.-S. is an American Cancer Society Research Professor supported in part by a gift from the F. M. Kirby Foundation. Author contributions: S.I. designed and performed most of the experiments and wrote the paper; M.B. designed and performed all computational analyses and wrote the paper; M.C.-M. performed pathological analyses of the TMA and biopsy studies; T.Z. performed all statistical analyses; A.A. designed and performed the experiments; S.W. assembled the active surveillance patient cohort; C.L.M. designed and performed the experiments; P.G. performed computational analyses; P.S. performed computational analyses; M.C.B. designed the analyses and monitored active surveillance patients; M.M.S. designed the analyses and wrote the paper; A.C. designed the analyses and wrote the paper; C.A.-S. designed the analyses and wrote the paper. Competing interests: S.I., C.A.-S., M.M.S., M.B., and A.C. have a patent pending for use of their biomarker panel for cancer screening. A.C. is on the scientific advisory board of Cancer Genetics.