Research Article: Kidney Disease

Tissue transcriptome-driven identification of epidermal growth factor as a chronic kidney disease biomarker

Science Translational Medicine  02 Dec 2015:
Vol. 7, Issue 316, pp. 316ra193
DOI: 10.1126/scitranslmed.aac7071

Urine marker to the rescue

Chronic kidney disease is a common medical problem worldwide, but it is difficult to predict which patients are more likely to progress to end-stage disease and need aggressive management. Ju et al. have now drawn on four independent cohorts totaling hundreds of patients from around the world to identify the expression of epidermal growth factor (EGF) in the kidneys as a marker of kidney disease progression. Moreover, the authors demonstrated that the amount of EGF in the urine is just as useful, providing a biomarker that can be easily tracked over time without requiring invasive biopsies.


Chronic kidney disease (CKD) affects 8 to 16% of people worldwide, with an increasing incidence and prevalence of end-stage kidney disease (ESKD). The effective management of CKD is confounded by the inability to identify patients at high risk of progression while in early stages of CKD. To address this challenge, a renal biopsy transcriptome-driven approach was applied to develop noninvasive prognostic biomarkers for CKD progression. Expression of intrarenal transcripts was correlated with the baseline estimated glomerular filtration rate (eGFR) in 261 patients. Proteins encoded by eGFR-associated transcripts were tested in urine for association with renal tissue injury and baseline eGFR. The ability to predict CKD progression, defined as the composite of ESKD or 40% reduction of baseline eGFR, was then determined in three independent CKD cohorts. A panel of intrarenal transcripts, including epidermal growth factor (EGF), a tubule-specific protein critical for cell differentiation and regeneration, predicted eGFR. The amount of EGF protein in urine (uEGF) showed significant correlation (P < 0.001) with intrarenal EGF mRNA, interstitial fibrosis/tubular atrophy, eGFR, and rate of eGFR loss. Prediction of the composite renal end point by age, gender, eGFR, and albuminuria was significantly (P < 0.001) improved by addition of uEGF, with an increase of the C-statistic from 0.75 to 0.87. Outcome predictions were replicated in two independent CKD cohorts. Our approach identified uEGF as an independent risk predictor of CKD progression. Addition of uEGF to standard clinical parameters improved the prediction of disease events in diverse CKD populations with a wide spectrum of causes and stages.


Chronic kidney disease (CKD) presents a major public health challenge and is an important cause of global mortality (1). Eight to 16% of the population worldwide is affected by CKD (2–4) with increased risk for end-stage kidney disease (ESKD), cardiovascular disease (CVD), and death (5). A limited set of therapeutic approaches have been shown to slow progression of CKD and reduce its associated CVD risk (2). The main challenges to improve outcomes in patients with CKD are the current inability to identify patients at high risk of loss of renal function in early CKD and then to provide the patients targeted, effective treatments. Identifying pathobiologically important biomarkers that can address both of these challenges would have a major impact on outcomes of patients with progressive CKD.

Multiple lines of evidence have emerged over the last four decades, supporting the concept of a common pathway activated in CKD progression (6–8). This concept suggests that once kidney damage reaches a threshold, the subsequent progression is largely independent of the initial kidney injury and uses a shared molecular mechanism for disease progression (8). Combining information on genetic risk and intrarenal gene regulation allows identification of the functional underpinnings of CKD progression, not only in model systems but also in human disease, and strongly supports the notion of a common progression pathway (9).

Here, we define the shared molecular mechanism of CKD, using four cohorts from across three continents (Fig. 1). Our approach is rooted in the hypothesis that both clinical phenotypes, such as baseline estimated glomerular filtration rate (eGFR) and proteinuria, and renal tissue alterations, such as tubular atrophy (TA), interstitial fibrosis (IF), and glomerulosclerosis, seen in CKD are associated with, and a consequence of, the dynamic molecular mechanisms reflected in the transcriptional programs of the diseased kidney tissues. By implementing this strategy, we identified epidermal growth factor (EGF) as a biomarker, with intrarenal mRNA and urinary EGF protein concentrations tightly correlated with eGFR at the time of biopsy and with the change in renal function over time, independent of eGFR and proteinuria.

Fig. 1. Schematic overview of the tissue transcriptome-driven strategy to identify urinary biomarkers for CKD progression.


Identification and validation of intrarenal candidate markers for prediction of kidney function

As the first step in the sequential strategy for CKD predictor identification (Fig. 1), we quantified intrarenal transcripts and evaluated their association with clinical phenotypes. Affymetrix GeneChip–based steady state transcript expression was derived from the tubulointerstitial compartment of kidney biopsies of 164 patients from the European Renal cDNA Bank (ERCB) (Table 1) and correlated with eGFR at the time of biopsy. A total of 72 candidate genes passed the selection criteria of significant correlation [false discovery rate (FDR) < 0.01] with eGFR and differential expression compared to living donors. These transcripts are enriched in the key pathogenic pathways that have been implicated in driving CKD progression (table S1). The expression of these transcripts was measured in 55 CKD patients from the ERCB (Table 1) by quantitative real-time polymerase chain reaction (qRT-PCR) (table S2) and assessed for their ability to predict patients’ eGFR at time of biopsy (Fig. 2A). A panel of six genes [nicotinamide N-methyltransferase (NNMT), EGF, thymosin β 10 (TMSB10), tissue inhibitor of metalloproteinases (TIMP1), tubulin α 1a (TUBA1A), and annexin A1 (ANXA1)] showed the best predictive performance (correlation between predicted and observed eGFR values, r = 0.77, P < 0.0001; Fig. 2B). Candidate transcripts were then assessed in a second, independent North American cohort, the Clinical Phenotyping Resource and Biobank Core (C-PROBE), consisting of 42 patients with CKD (Table 1), with the EGF, NNMT, and TMSB10 panel showing the best eGFR prediction (r = 0.62, P < 0.0001; fig. S1).

Table 1 Demographic characteristics of CKD patients in the gene expression discovery and validation cohorts.

Age and eGFR are presented as means ± SD. SLE, lupus nephropathy; RPGN, rapidly progressive glomerulonephritis; HTN, hypertensive nephropathy; MGN, membranous glomerulonephritis; DN, diabetic nephropathy; FSGS, focal segmental glomerulosclerosis; MCD, minimal change disease; TMD, thin basement membrane disease.

Fig. 2. Transcript-eGFR association and EGF expression in the kidney in patients with CKD.

(A) Correlation between observed and predicted eGFR (continuous line) on the basis of intrarenal transcripts integrated by ridge regression analysis (top 1 to 30 transcripts shown), providing eGFR prediction in the validation cohort (n = 55). Gray area represents 95% confidence interval (CI). Black arrow indicates the highest correlation provided by a six-marker set, which is depicted in (B). (B) Maximal correlation (r = 0.77, P < 0.0001) was demonstrated between observed eGFR and that predicted by gene expression values of six transcripts (n = 55). Dotted line represents 95% CI. (C) Intrarenal EGF mRNA showed a significant correlation with patients’ eGFR in discovery (n = 164) and validation cohorts (n = 55 and 42). TLDA, TaqMan Low Density Array. (D) EGF mRNA expression in major human organs/tissues (a selection out of a panel of 84 human organs/tissues and cell lines), derived from BioGPS, indicating a highly kidney-specific expression pattern. For full data set, see fig. S2. (E) EGF mRNA expression pattern in adult kidney (glomeruli, inner and outer cortex, inner and outer medulla, papillary tips, and renal pelvis); data were extracted from Higgins et al. (10). (F) In situ hybridization demonstrates tubule-specific EGF mRNA in both cortex (I and III) and medulla (II and IV), with reduced expression in CKD (III and IV) compared to healthy controls (I and II). Black arrows indicate positive staining in pink. Scale bars, 50 μm.

Intrarenal transcript-based selection of urinary biomarkers

Next, noninvasive biomarker candidates were selected from the intrarenal transcript predictor panel. Markers for noninvasive evaluation were prioritized on the basis of (i) correlation with eGFR in multiple cohorts (table S3), (ii) kidney-specific transcript and protein expression, and (iii) compelling biology for a mechanistic role in CKD. These selection criteria converged on EGF as the top candidate. First, intrarenal EGF mRNA expression consistently showed significant correlation with eGFR (FDR ≤ 0.05; r = 0.66, 0.81, and 0.42 in the discovery and the two validation cohorts, respectively) and contributed to the best performing gene expression predictor panel in both ERCB and C-PROBE (Fig. 2C). Second, the EGF transcript showed highly kidney-specific expression compared to other candidates in the human tissue panel (Fig. 2D and fig. S2). EGF mRNA was enriched in renal cortex and medulla compared to glomeruli, papillary tips, and renal pelvis (Fig. 2E) (10). In situ hybridization showed EGF mRNA expression in tubular cross sections in medulla and cortex (Fig. 2F, I and II), consistent with published immunohistochemical data demonstrating that EGF expression is restricted to the thick ascending limb of Henle and distal tubules (11). Third, in situ hybridization showed diminished EGF transcript expression in tubular cells of the cortex and medulla in CKD (Fig. 2F, III and IV versus I and II). Finally, EGF is a known mediator of differentiated epithelial cell function and regeneration. Both in vivo and in vitro studies have shown that EGF enhances renal tubule cell regeneration and repair and accelerates the recovery of renal function after injury (12–14) through its pro-proliferative and antiapoptotic action.

To test whether urinary EGF protein reflected intrarenal EGF transcript expression, we measured EGF protein concentrations and normalized them to urine creatinine concentration (uEGF/Cr) in patients with paired urine and kidney biopsy samples. uEGF/Cr correlated significantly with the intrarenal EGF RNA expression in 34 C-PROBE (r = 0.70, P < 0.0001; Fig. 3A) and 85 Nephrotic Syndrome Study Network (NEPTUNE) patients (r = 0.69, P < 0.0001; Fig. 3B; for cohort details, see table S4).

Fig. 3. Correlation of uEGF/Cr with intrarenal EGF mRNA and eGFR.

(A and B) uEGF/Cr is correlated with intrarenal EGF mRNA in patients with matching urine enzyme-linked immunosorbent assay (ELISA) data and tissue mRNA expression data in C-PROBE (A) (n = 34) and NEPTUNE (B) (n = 85) cohorts. uEGF/Cr is correlated significantly (P < 0.0001) with eGFR at the time of biopsy in patients from C-PROBE [(C) n = 349], NEPTUNE [(D) n = 141], and PKU-IgAN [(E) n = 452].

Next, we used urine samples of 349 C-PROBE patients (Table 2) to evaluate the association between uEGF/Cr and eGFR, independent of the 34 C-PROBE patients described above. uEGF/Cr correlated significantly with eGFR (r = 0.77, P < 0.0001) at the time of sample procurement (Fig. 3C). The uEGF/Cr-eGFR correlation was significant in C-PROBE subgroups of patients with DN (n = 70) or with diabetes (n = 135) (fig. S3). The correlation was further replicated in 141 NEPTUNE patients with biopsy-proven FSGS, MCD, and MGN (Table 2) (r = 0.82, P < 0.0001, Fig. 3D) and in 452 patients from a prospective cohort study enrolling patients with biopsy-proven primary immunoglobulin A (IgA) nephropathy at Peking University First Hospital, China (PKU-IgAN) (Table 2) (r = 0.78, P < 0.0001; Fig. 3E). In all three cohorts, uEGF/Cr showed only a weak negative correlation (r = −0.14, −0.17, and −0.31 in C-PROBE, NEPTUNE, and PKU-IgAN, respectively) with urinary albumin-to-creatinine ratio (ACR, fig. S4), suggesting that uEGF/Cr could represent a different pathophysiologic mechanism than that detected by albuminuria.

Table 2 Baseline characteristics of patients whose urine samples were analyzed in urine studies.

Association of EGF with renal biopsy outcome predictors

The correlation of uEGF/Cr with histological scores of IF and TA was measured to assess the association of this biomarker with tubulointerstitial lesions (15). IF/TA scores have been reported to be associated with cross-sectional eGFR (16) and can predict long-term kidney function (17). IF and TA scores were obtained from 102 NEPTUNE patients, and a significant inverse correlation was observed between uEGF/Cr and IF and TA [ρ = −0.75 and −0.74, respectively (P < 0.001); Fig. 4]. The correlations remained significant after adjusting for eGFR and ACR (ρ = −0.33 for both, P < 0.001).

Fig. 4. Association of uEGF/Cr with tubulointerstitial damage.

(A and B) Tubulointerstitial damage is reflected by the percentage of cortex affected by IF/TA. Scoring of IF/TA was based on evaluation of silver, periodic acid–Schiff (PAS), and trichrome-stained kidney sections (n = 102) by six readers who were blinded to the uEGF/Cr value. For each section, the average scores of the six readers for IF and TA were calculated as indicators of tubulointerstitial damage. uEGF/Cr is significantly correlated with IF (A) and TA (B) scores (Spearman correlation, P < 0.001).

EGF and CKD progression

To determine whether EGF can predict CKD progression, we assessed the association between EGF and loss of renal function using eGFR slope as a continuous variable. Both intrarenal gene expression data and three or more eGFR values (over 4.0 ± 1.8 years) were available for 29 C-PROBE patients, allowing correlation of renal transcript amounts with eGFR slope. Four hundred thirty-one transcripts were found to be correlated with eGFR slope (FDR < 0.05). A functional network analysis predicted EGF to be the top upstream regulator of 37 eGFR slope-correlated transcripts (P = 2.09 × 10⁻¹²; table S5 and fig. S5).

Both intrarenal EGF mRNA and urinary EGF protein/Cr showed a significant correlation with eGFR slope in the C-PROBE cohort [r = 0.58, P < 0.001, for EGF tissue mRNA (Fig. 5A) and r = 0.64, P < 0.001, for uEGF/Cr (fig. S6)]. We then used uEGF/Cr to predict eGFR slope via a multivariable regression model. Adjusted for age and gender, the eGFR slope predicted by uEGF/Cr showed a higher correlation with observed eGFR slope (r = 0.65, P < 0.001; Fig. 5B), compared to a model using ACR instead of uEGF/Cr (r = 0.29, P < 0.001; Fig. 5C).

Fig. 5. Association of EGF with eGFR slope.

(A) Intrarenal EGF RNA expression correlated significantly (P < 0.001) with eGFR slope in C-PROBE patients (n = 29). (B and C) Correlation of the observed eGFR slope of CKD patients in C-PROBE (n = 344) with slope predicted by uEGF/Cr (B) or ACR (C) using a regression model (adjusted for age and gender). eGFR slope predicted by uEGF/Cr (B) exhibited a higher correlation with the observed value than slope predicted by ACR (C).

The association of baseline uEGF with renal outcome (composite end point of ESKD or 40% reduction from baseline eGFR) was evaluated by classification power [comparing the area under receiver operating characteristic (ROC) curves of the corresponding models] and survival analysis (time-to-event analysis). ROC analysis showed an area under the curve (AUC) for the base model (including eGFR and ACR, adjusted for age and gender) of 0.84 (95% CI, 0.76 to 0.91) in C-PROBE. Addition of uEGF to the base model increased AUC to 0.90 (95% CI, 0.84 to 0.95) and resulted in a statistically improved model as evaluated by the likelihood ratio test (P = 0.001). The classification value of uEGF added to the base model was replicated in the NEPTUNE and PKU-IgAN cohorts (fig. S7).
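To make the AUC comparison concrete, the following Python sketch computes the area under the ROC curve as the probability that a randomly chosen patient who reached the end point receives a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. It is an illustration only and not the authors' analysis code: the function name `auc` is hypothetical, and the risks and outcomes are invented values, not study data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise concordance.

    scores: predicted risks (higher means more likely to progress)
    labels: 1 if the patient reached the end point, else 0
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # event ranked above non-event
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))


# Invented predicted risks from two hypothetical nested models (six patients).
outcome = [1, 1, 1, 0, 0, 0]
base_model = [0.8, 0.4, 0.6, 0.5, 0.3, 0.2]   # e.g. eGFR, ACR, age, gender
with_uegf = [0.9, 0.6, 0.7, 0.5, 0.3, 0.2]    # base model plus uEGF

auc_base = auc(base_model, outcome)
auc_full = auc(with_uegf, outcome)
```

In this toy example the second model ranks every progressor above every non-progressor, so its AUC is higher than that of the first model; whether such a gain is statistically meaningful is a separate question that the study addresses with a test on the nested models.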

In survival analysis, addition of uEGF/Cr [model 2 (M2)] to the base prediction model M1 (including eGFR and ACR, adjusted for age and gender) demonstrated a significant (P < 0.0001) covariate effect on the likelihood ratio test, as well as an improvement in model fit, as R² increased from 0.15 (M1) to 0.22 (M2), and the C-statistic increased from 0.75 (M1) to 0.87 (M2) (Table 3). Thus, addition of uEGF/Cr substantially improved the ability to predict renal outcomes when compared to ACR and baseline eGFR in combination. The improved prediction ability of the composite model M2 (M1 plus uEGF/Cr) was replicated in both NEPTUNE and PKU-IgAN cohorts (Table 3). Multivariable-adjusted associations of uEGF/Cr with the hazard of progression to composite end point are shown in Fig. 6. Higher uEGF/Cr was associated with decreased risk of ESKD or 40% reduction of baseline eGFR in three independent cohorts: the estimated hazard ratios (HRs, 95% CI) were 0.27 (0.13 to 0.54), 0.29 (0.15 to 0.58), and 0.53 (0.37 to 0.69) for C-PROBE, NEPTUNE, and PKU-IgAN, respectively, adjusted for age, gender, eGFR, and ACR. Thus, with one unit increase of uEGF/Cr (in log2 scale) and the other covariates held constant, the hazard of disease progression would decrease by 73, 71, and 47%, respectively, in these three cohorts. This is equivalent to a one-unit decrease of uEGF/Cr (log2) being associated with an increased risk of CKD progression of 3.73 (1.85 to 7.69)–fold, 3.43 (1.72 to 6.67)–fold, and 1.96 (1.45 to 2.70)–fold in these three cohorts, respectively.
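The conversions in the preceding paragraph are simple arithmetic on the hazard ratio. The Python sketch below, which is illustrative and not the authors' code, turns an HR per one-unit increase of log2 uEGF/Cr into a percent decrease in hazard and into the fold increase per one-unit decrease. The function name `describe_hr` is hypothetical. The inputs are the rounded values reported in the text, so the reciprocals of the point estimates (about 3.70, 3.45, and 1.89) differ slightly from the published fold values, which were presumably computed from unrounded estimates; the inverted confidence limits match the published ones.

```python
def describe_hr(hr, ci_low, ci_high):
    """Re-express a hazard ratio given per one-unit increase of a covariate.

    Returns (percent decrease in hazard per unit increase,
             fold increase in hazard per unit decrease,
             95% CI of that fold increase).
    """
    pct_decrease = (1.0 - hr) * 100.0
    fold_per_unit_decrease = 1.0 / hr
    # Taking reciprocals reverses the order of the confidence limits.
    fold_ci = (1.0 / ci_high, 1.0 / ci_low)
    return pct_decrease, fold_per_unit_decrease, fold_ci


# Rounded adjusted HRs (95% CI) as reported in the text.
reported = {
    "C-PROBE": (0.27, 0.13, 0.54),
    "NEPTUNE": (0.29, 0.15, 0.58),
    "PKU-IgAN": (0.53, 0.37, 0.69),
}

summary = {cohort: describe_hr(*values) for cohort, values in reported.items()}
```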

Table 3 The association of uEGF with time to composite end point.

The association was evaluated by a Cox proportional hazards regression analysis in C-PROBE, NEPTUNE, and PKU-IgAN cohorts. Renal event is defined as the presence of composite end point of ESKD or 40% reduction in baseline eGFR. The follow-up lengths (in years) for this analysis for the three cohorts were 1.8 ± 0.8, 2.0 ± 0.8, and 3.6 ± 2.2, respectively. AIC, Akaike information criterion; LR, likelihood ratio test.

Fig. 6. Multivariable-adjusted HRs for predicting the composite end point on the basis of uEGF/Cr.

HRs were adjusted by age, gender, eGFR, and ACR. Adjusted HRs and 95% CIs were obtained by separate Cox regression models in each study cohort. A one-unit decrease of uEGF/Cr (in log scale) was associated with an increased risk of CKD progression of 3.73 (1.85 to 7.69)–fold, 3.43 (1.72 to 6.67)–fold, and 1.96 (1.45 to 2.70)–fold in these three cohorts. The unadjusted HRs for EGF were 0.33 (0.21 to 0.51) (C-PROBE), 0.33 (0.21 to 0.52) (NEPTUNE), and 0.57 (0.46 to 0.70) (PKU-IgAN).


By applying a renal biopsy transcriptome-driven sequential marker discovery approach, we demonstrated that (i) a set of intrarenal transcripts correlates with kidney function across a broad range of renal diseases; (ii) uEGF concentration showed a strong correlation with tissue EGF transcript expression and maintained the ability to predict eGFR; (iii) uEGF concentration was associated with eGFR slope; and (iv) uEGF added predictive power to traditional clinical prognostic markers of CKD progression end points across cohorts.

The GFR-correlated transcripts are enriched in the key pathogenic pathways that have been implicated in driving CKD progression, ranging from chronic inflammation and extracellular matrix modulation to tubular cell differentiation (table S1). Because these pathways are involved in CKD progression independent of etiology, prognostic biomarkers that best represent them should be applicable to a wide spectrum of causes of CKD.

Estimates of GFR are the best overall indices of kidney function in health and disease, according to the National Kidney Foundation Kidney Disease Outcomes Quality Initiative (NKF KDOQI) Clinical Practice Guidelines (18). Baseline eGFR and proteinuria are established predictors of CKD progression and capture primary elements of glomerular function, with eGFR reflecting glomerular hemodynamics and ultrafiltration capacity and proteinuria reflecting changes in glomerular permselectivity and its impact on CKD progression. Multiple potential biomarkers for renal inflammation and fibrosis, a dominant signature in the CKD pathway analysis presented here, have been shown to be associated with kidney impairment. These include tumor necrosis factor receptors 1 and 2 (TNFR1 and TNFR2) (19, 20), monocyte chemotactic protein 1 (MCP1) (21), and matrix metalloproteinases (22). Because these molecules are found in plasma as well, their urinary concentration is a function of both glomerular filtration and intrarenal production in proteinuric disease, potentially affecting their utility as urinary biomarkers of tubulointerstitial disease. Biomarkers of tubular function and reserve are not available for clinical use in CKD. As described in this study, uEGF has the potential to serve this role, given its highly restricted intrarenal expression and strong correlation between renal transcripts and urinary protein concentration.

EGF function provides an independent rationale for its evaluation as a CKD progression biomarker. EGF has been reported to be responsible for modulation of tissue response to injury in the kidney after tubulointerstitial damage (23, 24). Exogenous EGF enhanced the tubular cell repair process and accelerated tubular regeneration and recovery of renal function in an animal model of acute renal injury (13), suggesting that uEGF concentrations are not only associated with concurrent renal function but also regulate signaling pathways of tubular recovery after injury. This offers the intriguing possibility that uEGF might serve as a surrogate marker for regenerative functional reserve of the renal tubules, reflecting their ability to respond to future acute or chronic injury, which are key factors in CKD progression. This concept is further supported by the unbiased functional analysis of gene expression data sets, which predicted that EGF was an upstream regulator of eGFR slope-associated genes. uEGF inversely correlated with IF and TA, established independent morphometric predictors of renal outcomes (25–27). These associations not only provide strong support for our hypothesis but also shed light on the pathogenic role of EGF in CKD progression.

Reduced concentrations of EGF protein in the urine have been previously observed in DN, IgAN, adult polycystic kidney disease, and children with chronic renal failure (28–31). For example, Torres et al. showed that in a single center cohort of 132 IgAN patients, the ratio of EGF to MCP1 in the urine was negatively associated with the composite outcome of ESKD and/or doubling of the baseline serum creatinine (23), but it was not compared to conventional predictors of ESKD in predicting renal survival.

A critical factor for the uEGF-eGFR correlation is the high degree of tissue specificity, making uEGF robust to extrarenal events that may affect the accuracy of other nonspecific biomarkers. EGF is absent or only minimally detectable in plasma (32). Urinary EGF is reported to be derived from the ascending portion of Henle’s loop and the distal convoluted tubule (33, 34), and in situ hybridization confirmed local synthesis of EGF in tubular compartments. The tissue specificity might be responsible for the robustness of the correlation of uEGF with renal function and renal functional decline across a wide spectrum of systemic and primary renal diseases in diverse environments.

Our study has several limitations. The discovery approach used in this study critically depends on renal biopsy tissue, raising concerns about the ability to extrapolate the renal tissue findings to CKD patients not undergoing biopsy. However, uEGF correlated as tightly with eGFR across all C-PROBE patients as it did in the biopsy patients from the same cohort (r = 0.77 for both, P < 0.001), supporting the utility of uEGF across CKD. A second potential limitation is that the biopsy cohorts had a relatively small proportion of patients with DN (19 of 261 across the three cohorts). To determine whether the results of our study were generalizable to CKD patients with diabetes, who form the majority of CKD in the United States, we specifically investigated patients with DN. In the ERCB discovery cohort, we saw a correlation between intrarenal EGF mRNA expression and eGFR in the 17 DN patients. In addition, of the 349 C-PROBE patients analyzed for noninvasive marker studies, about 39% had diabetes and 20% of patients were diagnosed with DN. Among these patients, we found correlations of uEGF with eGFR and eGFR slope. These data suggest that the results of our study may be generalizable to CKD patients with diabetes, but further validation in cohorts of patients with DN will be required.

To assess association with and prediction of CKD progression by uEGF, we used two principal approaches. First, eGFR slope was derived by a mixed-effect model to capture the trajectory of the disease over time. eGFR slope estimates assume linearity of progression and are vulnerable to acute kidney injury and treatment effects, particularly in patients with glomerular disease at the time of presentation (35, 36). Given these constraints, we focused the assessment of EGF and its association with eGFR slope in the chronic renal disease cohort (C-PROBE). The second approach used progression to ESKD or 40% eGFR reduction as composite end points, reflecting current concepts of clinical trial end points in CKD (37). The three cohorts used in this study had relatively short prospective follow-up time periods (mean, 1.8 to 3.6 years), enriching this analysis for patients with rapid progression of CKD. The ability of uEGF to identify CKD patients with rapid progression might be particularly useful to enrich clinical trials with patients who are more likely to reach end points during the limited trial observation period. To assess the utility of uEGF for informing CKD management in routine clinical settings, further comprehensive studies with structured long-term follow-up will be needed.

In summary, our study used an unbiased screening strategy to identify uEGF as a predictor of CKD progression in patients with glomerular disease, in three independent cohorts with diverse ethnic and geographic backgrounds. uEGF was linked to tubular differentiation and regeneration, a mechanism essential to retain renal function in CKD, but not well reflected by the conventional predictors (proteinuria or baseline GFR). Addition of uEGF into a CKD biomarker panel will likely improve risk stratification of CKD patients and thereby enhance the ability to target clinical care and limited health care resources to those in most need, as well as to optimize clinical trial design. In addition, the importance of EGF-dependent mechanisms in CKD progression suggests potential therapeutic targets for patients with impaired regenerative tubular functional reserve.


Study design

The objective of this study was to identify and validate noninvasive markers for CKD progression. We applied a transcriptome-driven sequential strategy. For tissue transcriptome analysis, gene expression profiles from 164 consecutive biopsies from the ERCB (38) were used as a discovery cohort. Subsequent profiles of 55 patients from ERCB served as the first validation cohort, and 42 patients from C-PROBE whose samples were available for tissue gene expression analysis served as the second validation cohort (Table 1). Thirty-two living donor transplant biopsies obtained at the time of transplantation were used as controls. Urinary biomarker analysis was performed on all samples available at the time of the study from baseline visits from three independent cohort studies (C-PROBE, NEPTUNE, and PKU-IgAN), representing 349, 141, and 452 CKD patients, respectively (Table 2). In the above cohorts, only patients aged 18 years or older were included in this study. All biospecimens were procured after informed consent and with approval of the local ethics committees. The details of cohort designs and sample procurement are provided in Supplementary Materials and Methods.

Gene expression analysis

Transcriptome analysis was performed on microdissected tubulointerstitial components of human renal biopsies prospectively procured for molecular analysis, using Affymetrix GeneChip and TaqMan Low Density Arrays as previously published (39). Data processing details are provided in Supplementary Materials and Methods. Normalized expression data were log2-transformed and batch-corrected. FDR was applied to account for multiple testing. The CEL files are available at Gene Expression Omnibus under reference nos. GSE32591, GSE37455 (5), GSE35488 (6), GSE47185 (7), and GSE69438.

In situ hybridization

In situ hybridization was performed by branched-DNA signal amplification with QuantiGene ViewRNA ISH Tissue Assay Kit (Affymetrix/Panomics Solutions) on formalin-fixed paraffin-embedded human kidney tissue using the specific probe set for human EGF mRNA. Further details are provided in Supplementary Materials and Methods.

Measurements of urinary EGF

uEGF concentration was measured in spot urine samples with Human EGF Immunoassay Quantikine ELISA (R&D Systems). Samples (1:150 dilution) and standards were run in duplicate, absorbance was measured with a VersaMax ELISA plate reader, and results were calculated with SoftMax Pro (Molecular Devices). Detailed assay validation information is provided in Supplementary Materials and Methods. uEGF was normalized to urine creatinine concentration and is referred to as uEGF/Cr.

GFR estimates

GFR was estimated by the four-variable Modification of Diet in Renal Disease (MDRD) study equation (40) in ERCB, C-PROBE, and NEPTUNE. Both MDRD- and CKD-EPI GFR estimates were determined for C-PROBE subjects, and their log2-transformed values showed a high correlation (r = 0.993), indicating limited impact of the GFR estimation on the data presented. In PKU-IgAN cohort, GFR was estimated by a modified MDRD equation (c-aGFR4) specifically adapted for Chinese CKD patients (41, 42). Log2-transformed c-aGFR4 and MDRD-eGFR were highly correlated (r = 0.996).
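For readers unfamiliar with the four-variable MDRD study equation, a minimal Python sketch follows. It is an assumption-laden illustration rather than the authors' implementation: the text cites the equation (40) without listing coefficients, so the sketch uses the widely published form for standardized (IDMS-traceable) serum creatinine with a leading constant of 175; the original, non-standardized form uses 186 instead. The function name `mdrd_egfr` and the example patient are hypothetical. The Chinese-adapted equation (c-aGFR4) used for PKU-IgAN is not reproduced here.

```python
import math


def mdrd_egfr(scr_mg_dl, age_years, female, black):
    """Four-variable MDRD eGFR in mL/min/1.73 m^2.

    Assumes serum creatinine in mg/dL measured with an IDMS-traceable
    assay (constant 175); this choice is an assumption, not taken from
    the article.
    """
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


# Hypothetical patient: 50-year-old non-Black man, creatinine 1.0 mg/dL.
example_egfr = mdrd_egfr(1.0, 50, female=False, black=False)
# The analyses in the article work with log2-transformed eGFR values.
example_log2_egfr = math.log2(example_egfr)
```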

Outcome measure

Two outcomes were used to evaluate CKD progression: (i) eGFR slope and (ii) a composite end point of ESKD or 40% reduction of baseline eGFR (37). eGFR slope was calculated for patients with at least three eGFR records over a minimum follow-up of 1.5 years, including eGFRs recorded before the baseline visit, if available. To account for irregularly spaced creatinine measurements, a linear mixed-effects model (43) was applied to calculate subject-specific eGFR slope. More details are provided in Supplementary Materials and Methods.
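The eligibility rule and the idea of a subject-specific slope can be sketched in Python as follows. Note the deliberate simplification: the study fitted a linear mixed-effects model, whereas this sketch fits an ordinary least-squares line to each patient separately, which ignores the pooling of information across patients that a mixed model provides. The function names and the example records are hypothetical and are not study data.

```python
def ols_slope(times, values):
    """Least-squares slope of values regressed on times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def egfr_slope(records, min_records=3, min_followup_years=1.5):
    """Per-patient eGFR slope (mL/min/1.73 m^2 per year), or None if ineligible.

    records: list of (years_relative_to_baseline, egfr) pairs; negative
    times represent eGFRs recorded before the baseline visit.
    """
    if len(records) < min_records:
        return None
    times = [t for t, _ in records]
    if max(times) - min(times) < min_followup_years:
        return None
    return ols_slope(times, [v for _, v in records])


# Hypothetical patient with irregularly spaced measurements.
patient = [(-0.5, 62.0), (0.0, 60.0), (1.0, 56.0), (2.0, 52.0)]
slope = egfr_slope(patient)                                   # eligible
too_short = egfr_slope([(0.0, 60.0), (0.5, 58.0), (1.0, 56.0)])  # span < 1.5 y
```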

Evaluation of tubulointerstitial damage using IF/TA

Histopathology was assessed in NEPTUNE biopsy cohort with digital images obtained from paraffin-embedded kidney tissue stained with hematoxylin and eosin, PAS, silver-based, and trichrome stains (44). Whole slide images of glass slides from 102 cases stored in the NEPTUNE digital pathology repository were assessed for percentage of cortex involved by IF/TA by six pathologists. The percentage of cortex involved by IF/TA was determined in each individual stain and averaged for an overall % value. More details are provided in Supplementary Materials and Methods.

Statistical analysis

The correlations of log2-transformed transcript expression values with both baseline eGFR and eGFR slope were calculated using Pearson correlation. FDR was applied to account for multiple testing. The selection criteria for candidate transcript markers included (i) significant correlation of transcript levels with baseline eGFR (|r| > 0.4, FDR < 0.01) and (ii) significant differential expression of transcripts compared to living donor kidney biopsy controls (FDR ≤ 0.05, fold change ≥1.8 or ≤0.56) in at least three of eight kidney diseases studied in the discovery cohort.
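The two selection criteria can be expressed as a small filter. The Python sketch below is a hedged illustration with a hypothetical function name and invented inputs; it takes correlation coefficients, FDR values, and fold changes as already computed and does not reproduce the upstream FDR procedure, which the text does not specify in detail.

```python
def passes_selection(r, fdr_corr, disease_results,
                     min_abs_r=0.4, max_fdr_corr=0.01,
                     max_fdr_de=0.05, fc_up=1.8, fc_down=0.56,
                     min_diseases=3):
    """Apply the two candidate-marker criteria described in the text.

    r, fdr_corr: Pearson r with baseline eGFR and its FDR.
    disease_results: list of (fold_change, fdr) versus living donors,
    one entry per kidney disease in the discovery cohort.
    """
    # Criterion (i): correlation with baseline eGFR.
    if not (abs(r) > min_abs_r and fdr_corr < max_fdr_corr):
        return False
    # Criterion (ii): differential expression in enough diseases.
    hits = sum(
        1 for fc, fdr in disease_results
        if fdr <= max_fdr_de and (fc >= fc_up or fc <= fc_down)
    )
    return hits >= min_diseases


# Invented example: down-regulated in four of eight diseases.
example_de = [(0.40, 0.001), (0.50, 0.01), (0.55, 0.04), (0.30, 0.002),
              (0.90, 0.50), (1.10, 0.80), (0.70, 0.03), (0.52, 0.20)]
selected = passes_selection(r=0.66, fdr_corr=0.001, disease_results=example_de)
rejected = passes_selection(r=0.35, fdr_corr=0.001, disease_results=example_de)
```

The lower fold-change bound of 0.56 is approximately the reciprocal of 1.8, so the filter treats up- and down-regulation symmetrically.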

Ridge regression was performed on the discovery cohort to combine transcripts showing associations with baseline eGFR into a multimarker signature for better prediction. In the analysis, the transcripts were ranked by their strength of marginal association with log2 eGFR. The top-ranked k genes were first standardized, then combined as linear predictors with weights of correlation coefficients, and then carried over into two validation cohorts.
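
The weighting step described above (standardize the top-ranked k genes, then combine them with their correlation coefficients as weights) can be sketched as follows. This covers only that linear combination, not a ridge regression fit, and it standardizes within the supplied cohort, which may differ from how the validation cohorts were handled. The data layout and names are hypothetical.

```python
import statistics


def standardize(values):
    """Z-score a list using the sample mean and sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant vector")
    return [(v - mean) / sd for v in values]


def composite_score(expr_by_gene, r_by_gene, k):
    """Weighted sum of the k genes most strongly correlated with log2 eGFR.

    expr_by_gene: gene -> list of log2 expression values (one per patient)
    r_by_gene:    gene -> correlation coefficient with log2 eGFR
    Returns the selected genes and one score per patient.
    """
    top = sorted(r_by_gene, key=lambda g: abs(r_by_gene[g]), reverse=True)[:k]
    n_patients = len(expr_by_gene[top[0]])
    scores = [0.0] * n_patients
    for gene in top:
        z = standardize(expr_by_gene[gene])
        for j in range(n_patients):
            scores[j] += r_by_gene[gene] * z[j]
    return top, scores


if __name__ == "__main__":
    expr = {"A": [1.0, 2.0, 3.0], "B": [3.0, 2.0, 1.0], "C": [1.0, 1.0, 2.0]}
    r = {"A": 0.5, "B": -0.6, "C": 0.1}
    print(composite_score(expr, r, k=2))
```

Using signed correlation coefficients as weights means that positively and negatively associated transcripts both push the score in the direction of higher predicted eGFR.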

Linear regression was applied to investigate the association between eGFR slope and the noninvasive protein marker uEGF, adjusted for age and gender. The overall quality of fit was evaluated by an F test at the 0.05 level of significance.

To evaluate the classification power of the candidate marker, we used logistic regression and AUC values from the ROC curves to compare nested models. To assess the effect of the candidate marker on the hazard of reaching composite end points, Cox proportional hazards models were fit to the available prospective data. We examined nested models, starting with the base model (M1) including baseline eGFR, ACR, age, and gender. Samples with missing data were excluded from the analysis. In the subsequent model (M2), uEGF/Cr was added, and the goodness of fit and improved prediction ability of the additional parameters were assessed by likelihood ratio tests, C-statistics, and AIC. All statistical analysis was performed using R software and SAS software (version 9.3).
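
The nested-model comparison can be illustrated with the sketch below, which assumes the (partial) log-likelihoods of M1 and M2 have already been obtained from a fitted model; it does not fit a logistic or Cox model itself. Because M2 adds a single parameter (uEGF/Cr), the likelihood ratio statistic is compared against a chi-square distribution with 1 degree of freedom. The function names and the example values are hypothetical.

```python
import math


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik


def likelihood_ratio_test_1df(loglik_base, loglik_full):
    """LRT for nested models differing by one parameter.

    Returns the test statistic and the chi-square (1 df) P value, using
    P(X > x) = erfc(sqrt(x / 2)) for a chi-square variable with 1 df.
    """
    stat = max(2.0 * (loglik_full - loglik_base), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    # Hypothetical log-likelihoods for M1 (4 covariates) and M2 (5 covariates)
    print(likelihood_ratio_test_1df(-100.0, -98.0))
    print(aic(-100.0, 4), aic(-98.0, 5))
    print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The three measures answer different questions: the likelihood ratio test asks whether the added marker improves fit beyond chance, AIC penalizes the extra parameter, and the AUC or C-statistic summarizes discrimination.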

Our approach to minimize overfitting included rigorous correction for multiple testing of the initial gene expression analysis, sequential validation of intrarenal markers in two test cohorts with two different mRNA quantification technologies (microarray and qRT-PCR), and replication of the uEGF analysis (association with eGFR and prediction of the composite renal end point) in two independent cohorts. Furthermore, we limited the number of covariates in Cox models with respect to observations. All patients who passed inclusion criteria and had samples available at the time of the study were included in the analysis.


Materials and Methods

Fig. S1. Baseline eGFR prediction by a three-marker panel in C-PROBE.

Fig. S2. EGF and two other candidate mRNAs’ expression patterns in a panel of 84 human organs, tissues, and cell lines.

Fig. S3. Correlation of EGF mRNA and uEGF/Cr with eGFR and eGFR slope in DN patients and CKD patients with diabetes.

Fig. S4. Correlation of uEGF/Cr with ACR.

Fig. S5. EGF as the top upstream regulator of genes correlated with eGFR slope.

Fig. S6. Correlation of uEGF/Cr with eGFR slope.

Fig. S7. ROC curve and corresponding AUC statistics for models with and without uEGF.

Fig. S8. mRNA localization by in situ hybridization: Negative and positive control images.

Table S1. Significantly enriched canonical pathways in intrarenal marker set.

Table S2. qRT-PCR assays used to validate expression of the intrarenal transcripts.

Table S3. Correlations of identified intrarenal transcripts with log2 eGFR of patients from the discovery and two validation cohorts.

Table S4. Demographic characteristics of NEPTUNE patients with intrarenal EGF expression data available.

Table S5. Top 10 upstream regulators of transcripts correlated with eGFR change over time (eGFR slope).

References (45–53)


Acknowledgments: We acknowledge all participating centers of the ERCB–Kröner-Fresenius Biopsy Bank (ERCB-KFB), the C-PROBE, the NEPTUNE, the PKU-IgAN study cohort, and their participants for their cooperation. We thank the support of George M. O’Brien Michigan Kidney Translational Core Center and the Michigan Diabetes Research Center at the University of Michigan. Funding: This study was supported by the Else Kröner-Fresenius Foundation (for ERCB); by the European Consortium for High-Throughput Research in Rare Kidney Diseases (EURenOmics; European Union FP 7:305608); by NIH (R01DK079912, P30DK081943, DK083912, P30DK020572, and UL1RR000433); by Office of Rare Diseases Research, National Center for Advancing Translational Sciences, National Institute of Diabetes and Digestive and Kidney Diseases, University of Michigan and NephCure Kidney International (U54DK083912); and by the University of Michigan Health System and Peking University Health Sciences Center Joint Institute for Translational and Clinical Research. Analysis of urine samples of C-PROBE patients was supported by Hoffman–La Roche. Author contributions: M.K., W.J., and V.N. participated in the study design. W.J., S.S., L.Z., A.R., M.T., C.S., and B.S. (for C-PROBE), W.J., S.S., S.M.B., and L.B. (for NEPTUNE), L.Z. (for PKU-IgAN), and C.D.C. (for ERCB) participated in data generation. K.S., V.N., L.Z., W.J., S.S., F.H.E., C.C.B., J.Y.-C.L., Y.Z., P.X.K.S., L.H.M., L.E., I.F., G.C.D.-P., G.D.-N., B.S., M.C.M., M. Bobadilla, H.-Y.W., H.Z., J.L., L.Z., and M.K. participated in data analysis. W.J., V.N., L.Z., H.Z., L.H.M., K.S., M. Bitzer, S.M.B., L.B., M.G.S., F.C.B., and M.K. participated in data interpretation; J.J.H., C.A.G., H.-Y.W., H.Z., and C.D.C. provided study materials; W.J. and C.C.B. participated in figure preparation; and W.J., V.N., L.H.M., P.X.K.S., F.C.B., K.S., and M.K. participated in writing the paper. The contribution of V.N. in this study is, in part, to fulfill the thesis requirement at the Ludwig Maximilian University of Munich. W.J. and V.N. share first authorship. W.J. and M.K. share correspondence. Competing interests: M. Bobadilla, G.C.D.-P., G.D.-N., L.E., M.C.M., M.T., I.F., C.S., M.K., V.N., and W.J. hold a patent PCT/EP2014/073413 “Biomarkers and methods for progression prediction for chronic kidney disease” related to this work. M.K. reports grants from Hoffman–La Roche during the conduct of the study; research support from AbbVie, AstraZeneca, Boehringer Ingelheim, and Eli Lilly outside the submitted work. M.K. is on the Board Advisory Committee of AbbVie, Eli Lilly, and Pfizer (honoraria paid to institution). C.D.C. received speaker honoraria from Hoffman–La Roche. Data and materials availability: The gene expression Cel files are available at Gene Expression Omnibus under reference nos. GSE32591, GSE37455, GSE35488, GSE47185, and GSE69438.