Research Article | Cancer

A Genomics-Based Classification of Human Lung Tumors

Science Translational Medicine  30 Oct 2013:
Vol. 5, Issue 209, pp. 209ra153
DOI: 10.1126/scitranslmed.3006802


We characterized genome alterations in 1255 clinically annotated lung tumors of all histological subgroups to identify genetically defined and clinically relevant subtypes. More than 55% of all cases had at least one oncogenic genome alteration potentially amenable to specific therapeutic intervention, including several personalized treatment approaches that are already in clinical evaluation. Marked differences in the pattern of genomic alterations existed between and within histological subtypes, thus challenging the original histomorphological diagnosis. Immunohistochemical studies confirmed many of these reassigned subtypes. The reassignment eliminated almost all cases of large cell carcinomas, some of which had therapeutically relevant alterations. Prospective testing of our genomics-based diagnostic algorithm in 5145 lung cancer patients enabled a genome-based diagnosis in 3863 (75%) patients, confirmed the feasibility of rational reassignments of large cell lung cancer, and led to improvement in overall survival in patients with EGFR-mutant or ALK-rearranged cancers. Thus, our findings provide support for broad implementation of genome-based diagnosis of lung cancer.


Lung cancer is traditionally classified into non–small cell lung cancer (NSCLC), small cell lung cancer (SCLC), and carcinoid (CA). NSCLC is further divided into adenocarcinoma (AD), squamous cell carcinoma (SQ), and large cell carcinoma (LC), which also includes tumors with neuroendocrine differentiation [large cell neuroendocrine carcinoma (LCNEC)] (1). These categories have been enriched with detailed histomorphological and immunohistochemical characteristics leading to the 2004 World Health Organization (WHO) classification. The most detailed subcategories exist for AD, mostly defined by growth patterns (2). This rather descriptive taxonomy has been complicated in the past decade because of the recognition of somatic genetic alterations occurring in some of these subtypes: EGFR mutations, KRAS mutations, and EML4-ALK fusions occur mainly in lung AD (3–7), whereas mutations in DDR2, FGFR2, and NFE2L2 or amplifications of FGFR1 and SOX2 mainly affect SQ (8–11). The fact that some of these are associated with clinical response to molecularly targeted therapeutics (3–5, 12, 13) emphasizes the importance of adding genetic annotation to the current taxonomy. Systematic efforts to comprehensively characterize the cancer genome (14) constantly add genome alterations to the compendium of such potentially actionable alterations (9, 15–21). Similarly, immunohistochemical analyses have challenged some of the original histomorphological diagnoses, in particular in the case of LC (22, 23). We therefore sought to assess cancer genome alterations linked to histomorphological and immunohistochemical features of the disease as well as to patient outcome to identify genetically defined subtypes of lung tumors and optimize them for biologically informed patient stratification for personalized therapeutic approaches. We then tested the clinical relevance of these molecularly defined patient subgroups prospectively in a diagnostic outreach setting.


Cancer genome alterations in human lung tumors

Recent studies have provided analyses of genome alterations in lung cancer (20, 21, 24, 25). To establish a systematic relationship of such alterations across the different cancer subtypes, we collected a total of 1882 surgically resected, fresh-frozen human lung tumor specimens with clinical annotation, yielding 1255 specimens suitable for genetic analysis (Table 1, fig. S1A, and table S1). Visual inspection of somatic copy number alterations (SCNAs) in those tumors for which both single-nucleotide polymorphism (SNP) array and histology data were available (992 of 1032) revealed distinctive patterns in cases sorted according to their initial histological subtype (Fig. 1A): some SCNAs were present in all subtypes (for example, gains affecting 5p), whereas other SCNAs were subtype-specific (for example, amplifications of 3q containing SOX2 in SQs) (8). In addition, the overall pattern of SCNAs differed across histological entities; for example, SCLCs and some LCs exhibited a predominance of chromosome arm–level events, in contrast to focal events in AD and SQ. Stromal admixture could mask the detection and amplitude of SCNAs (26), but only partly accounted for the diversity of SCNA patterns between subtypes (Fig. 1A). LC, in contrast to other subtypes, did not exhibit a specific SCNA pattern. To identify significant copy number alterations across lung tumor subtypes, we applied a rank sum–based method that is insensitive to tumor purity and identified 8 regions of amplification and 12 regions of deletion (fig. S2 and tables S2 and S3) (11). In cases with only one deletion, CDKN2A (located at 9p21) was affected in 38% (fig. S3) (11, 27).

Table 1. Clinical characteristics of lung cancer patients in the retrospective [Clinical Lung Cancer Genome Project (CLCGP)] and prospective (NGM) data sets.

UICC, Union Internationale Contre le Cancer.

Fig. 1. A global view of the lung cancer genome.

(A) Copy number profiles of lung cancer specimens of the major histological subtypes (n = 992) (red, increases; blue, decreases) are plotted along the genome (horizontal axis: chromosomes as indicated, centromeres in red). Vertical colored bars on the left indicate lung cancer subtypes. Bottom: The frequencies (y axis) of copy number gains (red; cutoff, 2.7) and losses (blue; cutoff, 1.3) across all samples, calculated for adjoining 1-Mb fragments using segmented copy number data, are represented along the genome. Purity of tumor samples determined through SNP array–derived copy number data (26) is shown on the right, with the median purity calculated for each histological subgroup indicated in red. (B) Mutations and ALK rearrangements (ALK*) are depicted per sample per gene as colored ellipses. Sample order was conserved from (A), and colors were chosen consistent with lung cancer subtypes. Total mutation frequencies per gene expressed as a percentage of all cases are shown as a bar graph at the bottom. Frequencies below 1% are marked with an asterisk. (C) Kaplan-Meier curves for overall survival are shown for the overall population per histological subtype (LC includes LCNEC), per genotype, for EGFR-mutant cases according to their TP53 mutation status, and for TP53-mutant cases according to their RB1 alteration status (from left to right) (P values for survival were calculated using the log-rank test). Numbers of cases with wild-type (wt) and mutant (mut) TP53 in early stages (I and II) and late stages (III and IV) are given for EGFR-mutant cases (inset; P value was calculated using the Pearson χ2 test). Color code for histology: orange, AD; black, CA; green, LC (including LCNEC); red, SCLC; blue, SQ.
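The frequency track described in (A) can be illustrated with a minimal sketch. It assumes that segmented copy number has already been averaged into one value per 1-Mb bin per sample, and that a bin counts as a gain when its copy number is strictly above 2.7 and as a loss when strictly below 1.3; the strict inequalities, the function name, and the toy values are illustrative assumptions, not the authors' pipeline.

```python
def scna_frequencies(copy_numbers, gain_cutoff=2.7, loss_cutoff=1.3):
    """Fraction of samples with a gain or a loss in each genomic bin.

    copy_numbers: list of per-sample lists, one copy number value per bin
    (hypothetical input layout). Cutoffs follow the figure legend.
    """
    n_samples = len(copy_numbers)
    n_bins = len(copy_numbers[0])
    gain_freq, loss_freq = [], []
    for b in range(n_bins):
        column = [sample[b] for sample in copy_numbers]
        gain_freq.append(sum(cn > gain_cutoff for cn in column) / n_samples)
        loss_freq.append(sum(cn < loss_cutoff for cn in column) / n_samples)
    return gain_freq, loss_freq


# Toy data: four samples, three bins (values are made up for illustration).
toy = [
    [2.0, 3.1, 1.0],
    [2.0, 2.9, 2.0],
    [2.1, 2.0, 1.2],
    [1.9, 3.5, 2.0],
]
gains, losses = scna_frequencies(toy)
print(gains, losses)
```

In this toy input, the second bin is gained in three of four samples and the third bin is lost in two of four.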

Similarly, most mutations showed histological subtype specificity (Fig. 1B). The most frequently mutated genes were TP53 (53.6%), KRAS (16.1%), STK11 (9.8%), EGFR (7.2%), KEAP1 (6.6%), and NFE2L2 (4.5%) (20, 21, 28). Seventeen genes were altered in at least two samples (figs. S4 and S5 and tables S1, S4, and S5). The NFE2L2/KEAP1 axis was one of the most frequently mutated oncogenic pathways in lung cancer. Furthermore, gene set enrichment analysis revealed significant changes in the expression of NFE2L2 and KEAP1 target genes in cases harboring such mutations (false discovery rate, q = 0.0008) (29). Either NFE2L2 or KEAP1 was mutated in 10.4% of AD (21) and 16.9% of SQ (20) in mutually exclusive fashion. NFE2L2 mutations were mainly found in SQ (30–32) (Fig. 1B and fig. S4). In addition to known frequent driver mutations, we also found rare mutations that are possibly relevant for therapy. For example, we found an R248C mutation in the fibroblast growth factor (FGF)–binding domain of FGFR3 (33, 34) in 3 of 365 SQ (0.8%), all of which were negative for FGFR2 mutations (35). FGFR3-R248C was oncogenic in vitro and associated with sensitivity to FGFR inhibition (fig. S6). Overall, more than 55% of all malignant lung tumors harbored at least one genetic alteration with features of a possibly tractable target (fig. S7).

We observed clinically relevant associations related to histology and stage (1). The most frequent genome alterations had no significant impact on survival in AD, LC, and SQ (Fig. 1C). TP53 mutations were associated with inferior survival in EGFR-mutant patients (P = 0.028), which was partly driven by the higher stage of these patients. Because most of these patients were diagnosed before EGFR inhibitors were broadly available, this observation is likely to reflect a generally aggressive behavior of EGFR/TP53 double-mutant tumors (36). Similarly, patients with concurrent TP53 mutation and RB1 loss had a particularly poor prognosis (Fig. 1C), independent of stage and histology (Cox regression analysis, P = 0.023).

Subtype-specific genome alterations

As a next step, we sought to determine which of the genomic alterations were significant in each subtype. Across all lung tumor cases, we identified amplified and deleted regions (fig. S2) that included known proto-oncogenes or tumor suppressor genes, which could be assigned to specific histological subtypes. Significantly amplified chromosomal regions in AD were 5p, 7p (EGFR), 8q (MYC), 11q (CCND1), 12q (MDM2), 14q (NKX2-1), and 17q (ERBB2) (21, 37); in SCLC were 1p (MYCL1), 2p (MYCN), 5p, 8p (FGFR1), and 19q (CCNE1) (24); and in SQ were 1p (MYCL1), 3q (SOX2), 7p (EGFR), 8p (FGFR1), and 11q (CCND1) (8, 11, 20). CA harbored no significant SCNA. As before, in LC, we observed amplifications typical of other histologies (for example, amplification of NKX2-1, which is specific for AD, or of SOX2, which is typical of SQ). Deletions in 9p (CDKN2A) were present in all subtypes except SCLC, which had deletions of 3p (FHIT) and 13q (RB1) as its hallmarks (Fig. 2A).

Fig. 2. Genomic alterations in histological subgroups of lung cancer.

(A) Significantly amplified (red) and deleted (blue) regions calculated using a rank sum–based algorithm (24) are plotted along the genome (y axis) for the five major lung cancer subtypes [AD (n = 421), CA (n = 69), LC (n = 101), SCLC (n = 63), and SQ (n = 338)]. Statistical significance, expressed by q values (x axes: amplification, upper scale; deletion, lower scale), was computed for each genomic location. Known or potential oncogenes (red) or tumor suppressor genes (blue) are given at respective locations. Vertical lines indicate level of significance of q = 0.01. (B) Frequencies of significant genomic alterations are given per gene per histological subtype. Colors of gene names are encoded as follows: red, amplified; blue, deleted; and black, mutated. Frequencies of alterations correspond to circle size [frequencies of deletions of FHIT and RB1 and mutations in TP53 were adapted by dividing values by three (asterisks); frequencies of mutations in EGFR, KRAS, and STK11, of deletions in CDKN2A, and of amplifications in FGFR1 and SOX2 were adapted by dividing values by two (circles)]. Significant mutations were determined using a binomial test with a background mutation rate of 0.5%. P values were adjusted for multiple hypothesis testing using the Benjamini and Hochberg method across each histological subtype. q values of significant results (q < 0.05) are indicated by the color code of the symbols (color key provided below the chart). (C) Associations of copy number alterations and mutations calculated using Fisher’s exact test followed by Benjamini and Hochberg adjustment are represented with a Circos plot. Involved genes are named at corresponding genomic locations (copy number gains in red, copy number deletions in blue, and mutations in black) outside the ring representing the genome. 
Internal lines show significant co-occurring (red) and mutually exclusive (blue) events (q < 0.05) between two copy number alterations or two frequently mutated genes (solid lines) or between a copy number alteration and a mutation (dashed lines) found in lung cancer.

The specific co-occurrence of several of the alterations in individual lung cancer subtypes suggested that it might be possible to determine patterns of alterations that could be used to identify subtypes based on genomics alone. We integrated data on mutations and SCNAs that were significant above the assumed background mutation rate of 0.005 (binomial test) to identify significant signature alterations for each histological subtype (Fig. 2B). Some of these alterations mainly occurred in a certain subtype, such as alterations in ALK (3.4%), BRAF (2.7%), EGFR (15.3%), ERBB2 (1.7%), KRAS (32.6%), and STK11 (17.4%) in AD; MYCN amplifications (6.5%) in SCLC; and mutations in DDR2 (1.1%), FGFR3 (0.8%), and NFE2L2 (10.6%) in SQ (Figs. 1B and 2B and fig. S4). Others were not only enriched in a given subtype (for example, NKX2-1 amplification in AD, MYCL1 amplification and RB1 deletion in SCLC, or SOX2 and FGFR1 amplification in SQ) but also present in other histologies (Fig. 2B). By contrast, LC harbored alterations typical of all other subtypes (amplification of ERBB2 and NKX2-1 and mutations in KRAS and STK11 as in AD; amplification of MYCL1 and deletion of RB1 as in SCLC; amplification of CCND1, FGFR1, and SOX2 as in SQ) (Fig. 2B) but had no significant signature alterations.
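As an illustration of the significance test named above, the following sketch computes a one-sided binomial P value against the stated background rate of 0.005 and adjusts the P values with the Benjamini and Hochberg procedure within one subtype. The one-sided form of the test, the function names, and the gene labels and counts are assumptions made for illustration; they are not taken from the data set.

```python
from math import exp, lgamma, log


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space for stability."""
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_pmf)
    return min(1.0, total)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q[idx] = running_min
    return q


BACKGROUND_RATE = 0.005  # background mutation rate stated in the text

# Hypothetical (mutated cases, tested cases) per gene within one subtype.
counts = {"GENE_A": (30, 400), "GENE_B": (3, 400), "GENE_C": (2, 400)}
genes = list(counts)
p_vals = [binomial_upper_tail(k, n, BACKGROUND_RATE) for k, n in counts.values()]
q_vals = dict(zip(genes, benjamini_hochberg(p_vals)))
significant = [g for g in genes if q_vals[g] < 0.05]
print(significant)
```

With 400 cases, a background rate of 0.005 corresponds to about two expected mutated cases per gene, so only the hypothetical gene with 30 mutated cases passes q < 0.05 in this toy example.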

The availability of large genomic data sets enabled us to conduct a systematic analysis of co-occurrence and exclusivity of genome alterations (Fig. 2C and fig. S8). In 5 of 21 lung tumors with BRAF mutations affecting residues other than V600E, either NRAS or KRAS was mutated (38); such cases might be particularly sensitive to mitogen-activated protein kinase kinase (MEK) inhibition (39). Furthermore, ERBB2 mutations never co-occurred with mutations in BRAF, HRAS, KRAS, NRAS, or STK11 (table S1). FGFR2 mutations did not co-occur with FGFR1 amplifications, in support of both alterations being oncogenic drivers (fig. S6). In our data set, EGFR amplifications correlated with EGFR mutations in AD (P = 0.0009), but not in SQ (fig. S9). EGFR amplification predicted treatment response and outcome of patients receiving EGFR inhibitors in some studies (40, 41), although this could not be confirmed in a study focusing exclusively on ADs (42). Thus, whether patients with EGFR-amplified SQ may benefit from EGFR inhibition is currently unclear. Similarly, loss of PTEN may influence the dependency of tumors on mutant receptor tyrosine kinases (43, 44). PTEN was homozygously or hemizygously deleted in 5.5 and 11.1% of EGFR-mutant ADs, respectively (fig. S10). In summary, genome alterations that define specific lung tumor subgroups were determined in the major histological subgroups, except for LC, which has genomic features of all other subtypes.
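The pairwise association analysis (Fisher's exact test, per the legend of Fig. 2C) can be sketched for a single pair of alterations as follows. The two-sided P value is computed here by summing the probabilities of all 2x2 tables that are no more likely than the observed one, which is one common convention and an assumption about the exact variant used; the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: both alterations present, b: only the first, c: only the second,
    d: neither. Sums hypergeometric probabilities of all tables with the
    same margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {
        x: comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)
        for x in range(lo, hi + 1)
    }
    p_obs = probs[a]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)))


# Hypothetical counts: the two alterations never co-occur in 100 tumors.
p_exclusive = fisher_exact_two_sided(0, 15, 20, 65)
print(p_exclusive)
```

Running this over all gene pairs and then adjusting the resulting P values for multiple testing would give the kind of co-occurrence and mutual exclusivity calls summarized in the text.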

The heterogeneity of large cell lung cancer

Immunohistochemistry has become an indispensable method for lung cancer diagnosis. We therefore performed an independent immunohistochemistry-based pathology review of 583 cases confirming all subclasses of lung cancer except LC (fig. S11). In 42% of all LC cases, pathology review led to reclassification to the other subtypes sharing similar immunohistochemical and genetic features.

To gain further insight into the heterogeneity of LC, we applied gene expression–based unsupervised hierarchical clustering to 261 lung tumors, including 31 initially diagnosed LCs. Whereas 87% of ADs, 92% of CA, 84% of SCLCs, and 76% of SQs formed distinct clusters, LCs were dispersed across all other clusters (Fig. 3A and tables S1 and S6). Except for one case, all LCs clustered with those tumors of the other subtypes that shared the same histology-defining signature alterations (Fig. 3, A and B, and table S7). Similar results were obtained when applying consensus clustering, where almost 98% of AD cases formed a distinct cluster, as well as 84% of SCLCs and 77% of SQs. Sample overlap between clusters obtained by the two methods was 95%, 93%, and 75% for the clusters, which mainly included tumors with neuroendocrine differentiation (NEC), AD, or SQ cases, respectively (table S1). Applying recently described transcriptional classifiers of subtypes of AD (45) and SQ (46) revealed comparable results (figs. S12 and S13 and table S1). An integrative analysis of 209 cases including copy number and expression data using iCluster defined two subgroups that were mainly driven by amplifications on chromosome 3 (3q13.31-3q29) and chromosome 12 (12p13.33-12q15) and thus did not reveal distinct clinically relevant subtypes (47).

Fig. 3. Genetic features typical of other lung cancer subtypes in LC.

(A) Unsupervised hierarchical clustering using 294 highly variable (SD/mean >2.1) expressed genes identified four gene expression subgroups containing mainly CA (I), SCLC (II), AD (III), and SQ (IV). LC samples are indicated as triangles at corresponding positions below the cluster dendrogram. They are colored orange if they have AD-specific alterations, blue if they have SQ-specific alterations, gray if the case was initially diagnosed as an LCNEC, and green if they have no known alteration. Genetic alterations (label: red, amplified; blue, deleted; black, mutated; ERBB includes mutation in EGFR or ERBB2) are given for selected genes per sample as vertical lines (LC cases in green; others in black). (B) Typical immunohistochemistry is shown for LC specimens with immunohistochemical and genetic characteristics of AD (AD-like), SQ (SQ-like), and NEC, as well as LC lacking features of other lung cancer subtypes (NOS, not otherwise specified). The corresponding genetic alterations are indicated on the right. H&E, hematoxylin and eosin. (C) Distribution of mutations (in red, symbols according to type of mutation: diamond for missense, square for nonsense, and circle for indel) and copy number loss (in blue) of TP53, RB1, and EP300 across all whole exome–sequenced LCNECs. (D) Overall survival corresponding to each histological lung cancer subtype, with LC separated into LCs with neuroendocrine (gray) and without neuroendocrine (green) features.
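The gene filter mentioned in (A) (SD/mean > 2.1) amounts to selecting genes by their coefficient of variation before clustering. A minimal sketch follows, assuming nonnegative expression values and the sample standard deviation; the gene labels and values are made up, and the clustering step itself is not shown.

```python
from statistics import mean, stdev


def highly_variable_genes(expression, threshold=2.1):
    """Return genes whose SD/mean across samples exceeds the threshold.

    expression: dict mapping a gene label to its per-sample values
    (hypothetical input layout). Genes with a nonpositive mean are skipped.
    """
    selected = []
    for gene, values in expression.items():
        mu = mean(values)
        if mu > 0 and stdev(values) / mu > threshold:
            selected.append(gene)
    return selected


# Toy data: GENE_X is expressed in a single sample, GENE_Y is nearly flat.
toy_expression = {
    "GENE_X": [0, 0, 0, 0, 0, 0, 0, 0, 0, 100],
    "GENE_Y": [10, 11, 9, 10, 10, 12, 8, 10, 11, 9],
}
variable = highly_variable_genes(toy_expression)
print(variable)
```

A cutoff this high retains only genes with very uneven expression across samples, such as genes expressed strongly in one subtype and barely at all in the others.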

LCNEC exhibited substantial transcriptional similarity to SCLC when analyzed with unsupervised hierarchical clustering and classification (Fig. 3A, gray triangles, and figs. S12 and S13). Furthermore, similar amplified and deleted regions were observed in LCNEC and SCLC when compared with other histological subtypes (fig. S14).

Finally, LCNEC shared the significantly mutated genes TP53, RB1, and EP300 with SCLC, as determined by whole-exome sequencing of 15 and transcriptome sequencing of 10 pathologically reviewed LCNECs (Fig. 3C and tables S8 and S9). We also found additional mutated genes in LCNEC that typically occurred in AD or SQ but did not reach significance in LCNEC (tables S8 and S9). Thus, LCNEC is most similar to SCLC, with individual cases bearing mutations of other subtypes. The genetic similarity between LCNEC and SCLC is also reflected by a similar overall survival (Fig. 3D). In summary, LC exhibits a general diagnostic plasticity when considering data on chromosomal copy number, gene mutations, gene expression, and immunohistochemistry. Combined immunohistochemical and genomic analysis is therefore ideal to classify this heterogeneous group as AD, SQ, or NEC.

Automated genomics-based lung tumor classification

Given the strong correlation of specific genome alterations with certain histological subtypes, we devised a statistical model to test if subtypes could be predicted robustly based on such alterations alone. The diagnosis predicted by our model and the original diagnosis (Fig. 4A, left) or the diagnosis obtained from pathology review were highly similar for AD and SQ (Fig. 4A, right, fig. S15, and table S10). Only a few of the AD or SQ cases were reclassified according to the predominant genome alterations in these cases (fig. S15). Feature selection and automated reclassification using a similar model was highly stable when applied to validation data sets of 382 AD (21), SQ (20), and SCLC (25) cases (Fig. 4B, fig. S16, and table S11). Review of individual discrepant cases revealed subtype-specific alterations, such as in one case that was predicted to be AD but classified by central pathology review as SQ, which harbored the oncogenic S310F ERBB2 mutation (48); ERBB2 mutations were typical of AD. Furthermore, one of three cases predicted to be SQ but classified as AD by pathology review exhibited amplification of FGFR1 and CCND1, both predictive of SQ in our data set (fig. S16A). Nevertheless, most initial diagnoses of AD, SCLC, and SQ were confirmed by both our model and pathology review (Fig. 4, A and B, and fig. S11), but most of the LCs with at least one genomic alteration were reassigned to either AD, SQ, or aggressive neuroendocrine lung cancer/SCLC in accordance with the pathological review (Fig. 4C, fig. S15, and table S10). Even in those cases where immunohistochemistry did not yield an unequivocal diagnosis, most remaining LC cases could be reassigned genetically to the other lung cancer subtypes (Fig. 4C).

Fig. 4. Genomics-based classification of lung cancer.

(A) Semisupervised reclassification of lung tumor samples. The relative proportion of cases per histological subtype (left; the LC group includes LCNEC cases) that were reclassified on the basis of 18 genetic alterations (table S11) to a certain subgroup (labels in the middle) is illustrated as lines. The weight of the lines is proportional to the fraction of cases classified to the respective subgroup. All cases that were predicted to be LC were histological LCNEC. Bars in the right graph give the concordance of each predicted class with the central pathological review (CPR). Subtypes for which no CPR was available are denoted with asterisks. (B) Supervised in silico classification of lung cancer specimens based on genetic features for 637 tumor samples with at least one genetic alteration and validation of the classifier in independent data sets of all three subgroups (20, 21, 25). Original histological subtypes defined groups for supervised learning. Bars indicate classification frequencies relative to the original histology. Classification results for the CLCGP data set are shown on the left, and results of the three validation data sets are on the right. (C) Semisupervised genetics-based reclassification of LC specimens without neuroendocrine features. For each sample (rows), prediction to a certain subtype (color per row in accordance to the color code used for histological subtypes, see below) is given (lower graph). Degree of supervision (x axis, upper graph) decreases continuously from left to right, the farthest right representing a genetics-based prediction. Agreement of the prediction with the CPR is plotted for each stage of supervision (upper part). Detailed information is given in Supplementary Materials and Methods. Genome alterations (black lines) and immunohistochemistry results (black, positive; brown, negative; thin gray line, not available) are indicated for each sample (middle and right panels). 
Genes are sorted according to their predictive value for histological subtypes. For the cases in the lower part of the figure, immunohistochemistry was not performed. Color code for predicted classes and CPR: orange, AD; black, CA; green, LC; gray, LCNEC; red, SCLC; blue, SQ; combination of colors, mixed subtype; white, no CPR.

Clinical evaluation of genomics-based lung cancer diagnoses

To evaluate our combined genomic and immunohistochemical diagnostic approach, we enrolled 5145 lung cancer patients in a molecular screening outreach program run by Network Genomic Medicine (NGM) in the region of our cancer center between January 2010 and April 2013 (Table 1, fig. S1B, and table S12). The addition of immunohistochemistry to the diagnostic workup of tumors with LC features reduced the prevalence of this subgroup from 5.9% in the retrospective analysis (table S1) to 1.3% in the diagnostic NGM data set (Fig. 5A). The expression of TTF-1 and CK7 as immunohistochemical markers of AD, as well as p63 and CK5 as markers of SQ, allowed such cases to be assigned to either AD or SQ. Expression of neuroendocrine markers CD56, chromogranin A, and synaptophysin was tested to identify tumors with neuroendocrine differentiation.

Fig. 5. Clinically relevant genome alterations in lung cancer subtypes.

(A) Genetic alterations per histological subtype in retrospective and prospective sample sets. Each chart represents the overall population with proportions of histological subtypes color-coded in the outer ring. Frequencies of alterations (wild type: no alteration in ALK, EGFR, FGFR1, KRAS, or PIK3CA) are given per gene relative to all cases within each histological subtype. Distribution of alterations for the LC population is shown separately. (B) Genotyping results of 3590 patients enrolled in the prospective screening effort are sorted according to the histological subtype [AD (2250), CA (3), SCLC (265), SQ (1018), LC (47), and LCNEC (7)]. Colored lines indicate alterations, and gray lines indicate wild type. Frequencies of alterations for AD (orange) and SQ (blue) are plotted below the respective genes, comparing the mutation frequency in the prospective data set (dark colors) to the retrospective data set (light colors). No significant difference between the data sets was seen (q values are given on the graph; P values were adjusted for multiple hypothesis testing using the Benjamini and Hochberg method). Color code: orange, AD; black, CA; green, LC; gray, LCNEC; red, SCLC; blue, SQ. (C) Kaplan-Meier curves for overall survival are shown for stage IIIB/IV patients per histological subtype for the retrospective (left) and prospective (right) sample sets. No significant difference was seen between subtypes within each data set. (D) Prospective sample set: Kaplan-Meier curves for overall survival are shown for all patients who were genetically tested versus those without available genetic information (top left). Overall survival is shown for stage IIIB/IV patients with alterations in given genes versus patients with wild type in the given genes (top right). Overall survival was statistically significantly longer in EGFR-mutant cases compared to all other (log-rank test, P < 0.05) except ALK-rearranged cases (P = 0.065). 
Overall survival is shown for patients with EGFR mutation treated with an EGFR inhibitor or standard chemotherapy (bottom left) and patients with ALK translocations treated with crizotinib or standard chemotherapy (bottom right). Gain of overall survival in the patient group treated with kinase inhibitors versus standard chemotherapy is given by the median overall survival (mOS). P values were corrected using the Bonferroni adjustment. HR, hazard ratio.

We performed central genotyping for key alterations identified within our retrospective genomic study (ALK, BRAF, DDR2, EGFR, ERBB2, FGFR1, KRAS, and PIK3CA) (fig. S17). Genomic testing was feasible in 3863 (75%) paraffin-embedded tumor samples obtained by routine diagnostic procedures, yielding 1481 genomic alterations (table S12). Treatment recommendations were provided to the network partners to enable genetically tailored cancer therapy either with approved drugs or within clinical trials. Sixty-four of 84 advanced-stage (IIIB or IV) patients with an EGFR mutation (76%) received erlotinib or gefitinib, and 15 of 30 advanced-stage patients with ALK translocation (50%) received crizotinib. Furthermore, 34 patients with alterations in BRAF (n = 4), KRAS (n = 10), or FGFR1 (n = 20) were enrolled in clinical trials. The frequencies of genomic alterations in the diagnostic NGM data set were similar to those of the retrospective discovery data set (Fig. 5, A and B). BRAF mutations were predominantly activating (49, 50) (60%, possibly associated with sensitivity to BRAF or MEK inhibition). In 24% of the cases, BRAF mutations were predicted to be inactivating (50, 51), which might predict sensitivity to dasatinib (52) (fig. S18). In 34% of the remaining LC cases, which were not otherwise classifiable, we found signature alterations of AD or SQ (Fig. 5, A and B) that provided both a genetic diagnosis and a rationale for genetically tailored therapy. Thus, combined immunohistochemical and genetic diagnosis reduced the subgroup of LC to 1.1%.

We next compared the survival of patients in our historical and prospective data sets. Survival of stage IIIB/IV patients in the retrospective data set was better than that in the prospective data set (P < 0.02, fig. S19). The most likely reason for this difference is a higher proportion of patients undergoing surgery with curative intention in the retrospective data set (which consisted of surgically resected cases only), whereas very few patients underwent surgery in the prospective data set. In contrast, the improved survival of patients with earlier-stage disease in the prospective data set compared to the retrospective data set most likely reflects treatment improvements. Histology had no major impact on survival of advanced-stage patients in both data sets (Fig. 5C).

We next performed analyses within the prospective NGM cohort alone. When comparing the survival of patients in our prospective data set, whose tumors had been genotyped (n = 975) to that of patients in the same cohort, in whom genetic diagnosis was not feasible (for example, because of insufficient tissue; n = 277), genotyping alone had stage- and histology-independent impact on overall survival in multivariate analyses (P = 0.002) (Fig. 5D, upper left). Although this observation most likely results from the favorable outcome in patients treated with kinase inhibitors, it demonstrates that genotyping is mandatory for patients to benefit from targeted therapeutic intervention. Accordingly, among the different genotypes determined within the prospective NGM cohort, EGFR mutations were associated with improved survival (Fig. 5D, upper right; hazard ratio, 0.617; 95% confidence interval, 0.442 to 0.859; P = 0.004). We note, however, that by the time ALK fusion testing was introduced, ALK inhibitors were not yet broadly available, which may explain why ALK fusions were not generally associated with improved survival in our cohort. Patients with EGFR-mutant lung cancer treated with EGFR inhibitors survived longer than those not receiving EGFR inhibitors (median overall survival, 31.5 versus 9.6 months; P < 0.001) (Fig. 5D, lower left). Similarly, the overall survival of patients with ALK-rearranged lung cancer treated with crizotinib was significantly better compared to ALK-positive patients not receiving crizotinib (median overall survival, 23 versus 11 months; P = 0.024) (Fig. 5D, lower right). No difference in the number of therapeutic regimens and the number of platinum-based treatment regimens existed between patients with EGFR-mutant lung cancer treated with EGFR inhibitors and patients not treated with EGFR inhibitors (two-sided t test, P = 0.43). 
Patients with lung cancer harboring ALK rearrangements treated with crizotinib did not differ in the number of platinum-based chemotherapy regimens, but differed in the total number of treatment regimens received: patients treated with crizotinib received an average of 2.17 regimens, compared to 1.3 regimens in patients who did not receive crizotinib (two-sided t test, P = 0.034) (table S12). However, in a Cox regression analysis including treatment with kinase inhibitors and number of treatment regimens as variables, only treatment with kinase inhibitors had a significant impact on survival (P = 0.043).
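The median overall survival values quoted above are read from Kaplan-Meier curves. The following sketch shows how such a median can be obtained with the product-limit estimator; the follow-up times and event indicators are invented for illustration and do not correspond to any patient group in this study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times: follow-up time per patient; events: 1 if death was observed,
    0 if the patient was censored. Returns (time, survival) pairs at each
    distinct time with at least one death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical follow-up in months; 0 marks a censored patient.
months = [6, 9, 12, 20, 31, 40]
died = [1, 1, 0, 1, 1, 0]
km_curve = kaplan_meier(months, died)
print(median_survival(km_curve))
```

Comparing two such curves, for example treated versus untreated patients, would additionally require a log-rank test or a Cox model, which this sketch does not implement.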

In summary, we validated the frequencies of signature alterations across lung cancer subtypes, demonstrated the feasibility of thorough and broad genome diagnostics in an academic-to-nonacademic outreach setting, confirmed the almost universal reassignment of LC to the other biologically relevant diagnoses, and showed that the introduction of a molecular diagnosis coupled to specific therapeutic intervention improves the overall survival of patients with alterations in EGFR and ALK compared to standard chemotherapy (53, 54).


Here, we defined a minimal set of genome alterations for genomic lung cancer diagnosis using comprehensive copy number analyses in combination with focused sequencing and RNA expression analysis. This approach yielded robust frequencies of genome alterations and afforded reassignment of LC to therapeutically relevant groups such as AD or SQ. Finally, introduction of genome diagnostics in an outreach setting not only confirmed these observations but also resulted in substantial improvement in overall survival of patients receiving genetically informed therapeutic intervention compared to standard chemotherapeutic treatment.

A major advantage of this combined histopathological and genetic analysis is the classification of the group of LCs, a subtype that is poorly defined, mainly because of a lack of specific morphologic features of AD, SCLC, or SQ. Applying immunohistochemistry (22, 23) (for example, TTF-1 for AD and p63 for SQ) helped assign some of the LC cases to other categories. However, the addition of genome annotation not only confirmed several of the immunohistochemical assignments but also added information on possibly therapeutically relevant alterations and afforded classification in cases where definite pathological diagnosis was not possible. We have also observed a marked similarity between several LCNEC tumors and SCLC, which shared the same pattern of SCNAs, the particularly poor survival, and the most significant gene mutations in this data set of limited size.

We also found (exceptionally rare) SQ tumors bearing EGFR mutations or ALK rearrangements, which might be treatable with targeted therapies as well. Thus, genomic diagnosis should include all subtypes and all genome alterations to provide a therapeutic rationale for all possible patients. We propose to capture both genomic and immunohistochemical data to link these taxa to treatment benefit in trials (fig. S20).

In our outreach study, genotyping alone—as a prerequisite for personalized treatment—was associated with improved patient survival. These epidemiological results emphasize the need for broad availability of systematic and comprehensive genomic lung cancer diagnoses.

We note that the prospective part of our study was not a randomized clinical trial but an observational diagnostic intervention study. Unfortunately, obtaining overall survival data in a randomized fashion requires prohibiting patients from crossing to the other treatment arm—an irresponsible measure in this setting. Thus, registry data from observational studies like ours may be an approach to demonstrate differences in survival between two different therapeutic strategies.

Furthermore, the two cohorts analyzed in this study differ in important aspects: Whereas patient registration and genotyping occurred in a central study office in the prospective cohort, tumors of the retrospective cohort were obtained from multiple centers before targeted therapies became broadly available, thus giving us limited control over the quality of the clinical data. Additional differences existed in the time of treatment and distribution of stages and rate of surgical treatment. Finally, clinical annotation was not complete, and data gaps exist for performance status, smoking status, and treatment. We note, however, that all major class-related genotypic and immunohistochemical findings were consistent across both cohorts, thus underscoring the validity of our approach.

In summary, we provide a blueprint for genomic diagnosis of lung tumors. Determination of immunohistochemical, genomic, and clinical features may thus be combined to yield classes of tumors that are biologically relevant, afford genomically tailored stratification of patients into clinical trials, and improve overall survival of patients with lung cancer.


Study design

Detailed information on materials and methods, including study design and statistics, is given in the Supplementary Materials. In brief, we collected frozen tissue or genomic DNA from 1882 resected primary lung tumors (table S1) after obtaining informed consent. Mutations in 28 genes (ABL1, AKT2, ALK, BRAF, CDK4, DDR2, EGFR, EPHA3, EPHA5, ERBB2, FGFR1, FGFR2, FGFR3, FLT3, HRAS, JAK2, KEAP1, KIT, KRAS, NFE2L2, NRAS, NTRK1, NTRK3, PDGFRA, PIK3CA, STK11, TP53, and RET) were analyzed in 1127 tumor specimens using multiple technologies (table S13, A and B) (9, 28, 38). Rearrangements of ALK (n = 602), RET (n = 362), and ROS1 (n = 211) were detected by fluorescence in situ hybridization. Gene copy number data of 1032 tumors and whole-exome sequencing data of 15 LCNEC tumors were analyzed as described previously (11, 24). Gene expression analyses were performed in 261 samples. Five hundred eighty-three cases were independently reviewed by lung pathologists (E.B. and W.D.T.) (1, 2) (fig. S1). For the prospective diagnostic cohort, 5145 lung cancer patients were included for central genotyping for alterations in ALK, BRAF, DDR2, EGFR, ERBB2, FGFR1, KRAS, or PIK3CA, and written reports were provided to the treating oncologists, containing the detected mutation and a treatment recommendation (fig. S17 and table S12).
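
For readers who want to work with the two gene panels programmatically, the lists above can be written down as sets. This is a minimal sketch: the gene symbols and sample counts are copied from the paragraph above, whereas the variable names are hypothetical and no such data structure is described in the study.

```python
# 28-gene mutation panel analyzed in the retrospective tumor collection.
RETROSPECTIVE_PANEL = frozenset({
    "ABL1", "AKT2", "ALK", "BRAF", "CDK4", "DDR2", "EGFR", "EPHA3",
    "EPHA5", "ERBB2", "FGFR1", "FGFR2", "FGFR3", "FLT3", "HRAS", "JAK2",
    "KEAP1", "KIT", "KRAS", "NFE2L2", "NRAS", "NTRK1", "NTRK3", "PDGFRA",
    "PIK3CA", "STK11", "TP53", "RET",
})

# Genes genotyped centrally in the prospective diagnostic cohort.
PROSPECTIVE_PANEL = frozenset({
    "ALK", "BRAF", "DDR2", "EGFR", "ERBB2", "FGFR1", "KRAS", "PIK3CA",
})

# Rearrangements assessed by FISH, with the number of tumors tested.
FISH_TESTED = {"ALK": 602, "RET": 362, "ROS1": 211}

# Genes in the prospective panel that were not in the 28-gene panel.
not_in_retrospective = sorted(PROSPECTIVE_PANEL - RETROSPECTIVE_PANEL)
# Genes assessed by FISH only, not by mutation analysis.
fish_only = sorted(set(FISH_TESTED) - RETROSPECTIVE_PANEL)

print(len(RETROSPECTIVE_PANEL), len(PROSPECTIVE_PANEL))
print(not_in_retrospective, fish_only)
```

Written this way, it is easy to see that every gene genotyped in the prospective cohort was already part of the 28-gene panel, and that ROS1 was assessed only for rearrangements.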


Materials and Methods

Fig. S1. Overview of sample processing.

Fig. S2. Significantly amplified and deleted regions in lung cancer.

Fig. S3. 9p21 is the most frequently deleted region in lung cancer.

Fig. S4. Mutation frequencies by histological subtype.

Fig. S5. Distribution of mutations within genes.

Fig. S6. FGFR alterations in lung tumors.

Fig. S7. Frequencies of alterations in genes with known or potential clinical relevance in lung cancer.

Fig. S8. Genetic associations in lung cancer.

Fig. S9. EGFR mutations and amplifications frequently co-occur in lung AD but not in SQ.

Fig. S10. Homozygous and hemizygous deletions of PTEN in EGFR-mutant ADs.

Fig. S11. Central pathological review.

Fig. S12. Gene expression subtypes of AD according to Wilkerson.

Fig. S13. Gene expression subtypes of SQ according to Wilkerson.

Fig. S14. Genome-wide comparison of LCNEC and SCLC.

Fig. S15. Semisupervised reclassification of lung cancer specimens.

Fig. S16. Automated reclassification of AD, SCLC, and SQ.

Fig. S17. Prospective testing of genomics-based diagnosis of lung cancer.

Fig. S18. BRAF mutations in lung cancer.

Fig. S19. Overall survival by stage.

Fig. S20. Diagnostic algorithm for lung tumors.

Table S1. Clinical and genetic features of all CLCGP cases.

Table S2. Significant copy number alterations in lung cancer for each histological subtype.

Table S3. Median copy number per patient for chromosome regions with significantly altered copy number.

Table S4. Annotation of genetic alterations in lung cancer patients.

Table S5. Statistical results comparing alteration frequencies between histological subtypes.

Table S6. Gene expression data used for hierarchical clustering.

Table S7. Reclassification of LCs.

Table S8. Mutations detected in 15 whole exome–sequenced LCNECs of the lung.

Table S9. RNAseq-derived gene expression data for 10 LCNECs of the lung.

Table S10. Results of the unsupervised genetics-based classification of lung tumors.

Table S11. Automated supervised prediction of lung cancer subtypes based on genetic alterations.

Table S12. Clinical and genetic features of all NGM cases.

Table S13. Primer sequences.

References (55–62)


  • Data analysis team: Danila Seidel,1,2* Thomas Zander,1,3,52* Lukas C. Heukamp,8,52* Martin Peifer,1* Marc Bos,3,52* Lynnette Fernández-Cuesta,1 Frauke Leenders,1,2 Xin Lu,1 Sascha Ansén,3 Masyar Gardizi,3,52 Chau Nguyen,5,11 Johannes Berg,5 Prudence Russell,46 Zoe Wainer,45 Hans-Ulrich Schildhaus,8,52,76 Toni-Maree Rogers,13 Benjamin Solomon,44 William Pao,38 Scott L. Carter,42 Gad Getz,42 D. Neil Hayes,51 Matthew D. Wilkerson,51,77 Erik Thunnissen,24 William D. Travis,26 Sven Perner,14 Gavin Wright,45,78 Elisabeth Brambilla,36,37 Reinhard Büttner,2,8,52 Jürgen Wolf,2,3,52 and Roman K. Thomas1,2,8,52

    Biospecimens core resources: Franziska Gabler,2,4 Ines Wilkening,4 Christian Müller,1 Ilona Dahmen,1 Roopika Menon,14 Katharina König,8,52 Kerstin Albus,8,52 Sabine Merkelbach-Bruse,8,52 Jana Fassunke,8,52 Katja Schmitz,8,52 Helen Kuenstlinger,8,52 Michaela A. Kleine,8,52 Elke Binot,8,52 Silvia Querings,2,4 Janine Altmüller,6,10 Ingelore Bäßmann,6 Peter Nürnberg,6,9,10 Peter M. Schneider,12 and Magdalena Bogus12

    Pathology committee: Reinhard Büttner,2,8 Sven Perner,14 Prudence Russell,46 Erik Thunnissen,24 William D. Travis,26 and Elisabeth Brambilla36,37

    Biospecimens and data source sites: Alex Soltermann,33 Holger Moch,33 Odd Terje Brustugun,31,32 Steinar Solberg,30 Marius Lund-Iversen,73 Åslaug Helland,31,32 Thomas Muley,18,48 Hans Hoffmann,18 Philipp A. Schnabel,17,48 Yuan Chen,15 Harry Groen,22 Wim Timens,23 Hannie Sietsma,23 Joachim H. Clement,71 Walter Weder,33 Jörg Sänger,19 Erich Stoelben,7 Corinna Ludwig,7 Walburga Engel-Riedel,7 Egbert Smit,24 Daniëlle A. M. Heideman,24 Peter J. F. Snijders,24 Lucia Nogova,3,52 Martin L. Sos,3,52,53 Christian Mattonet,3,52 Karin Töpelt,3,52 Matthias Scheffler,3,52 Eray Goekkurt,52,55 Rainer Kappes,52,56,62 Stefan Krüger,52,56 Kato Kambartel,52,57 Dirk Behringer,52,58 Wolfgang Schulte,52,59 Wolfgang Galetke,52,60 Winfried Randerath,52,61 Matthias Heldwein,52,63 Andreas Schlesinger,52,64 Monika Serke,52,65 Khosro Hekmat,52,63 Konrad F. Frank,52,66 Roland Schnell,52,67 Marcel Reiser,52,68 Ali-Nuri Hünerlitürkoglu,52,69 Stephan Schmitz,52,70 Lisa Meffert,52,40 Yon-Dschun Ko,52,40 Markus Litt-Lampe,52,72 Ulrich Gerigk,52,41 Rainer Fricke,54 Benjamin Besse,36 Christian Brambilla,37 Sylvie Lantuejoul,36,75 Philippe Lorimier,36 Denis Moro-Sibilot,37 Federico Cappuzzo,27 Claudia Ligorio,28 Stefania Damiani,28 John K. Field,29 Russell Hyde,29 Pierre Validire,35 Philippe Girard,35 Lucia A. Muscarella,47 Vito M. Fazio,47,74 Michael Hallek,2,3 Jean-Charles Soria,34 Scott L. Carter,42 Gad Getz,42 D. Neil Hayes,51 Matthew D. Wilkerson,51,77 Viktor Achter,50 and Ulrich Lang49,50

    Writing committee: Danila Seidel,1,2* Thomas Zander,1,3,52* Lukas C. Heukamp,8,52* Martin Peifer,1* Marc Bos,3,52* William Pao,38 William D. Travis,26 Elisabeth Brambilla,36,37 Reinhard Büttner,2,8,52 Jürgen Wolf,2,3,52 and Roman K Thomas1,2,8,52

    *These authors contributed equally to this work.

    Study leaders: Reinhard Büttner,2,8,52 Jürgen Wolf,2,3,52 and Roman K. Thomas1,2,8,52

    1Department of Translational Genomics, Center of Integrated Oncology Köln-Bonn, University of Cologne, 50931 Cologne, Germany.

    2Laboratory of Translational Cancer Genomics, Center of Integrated Oncology Köln-Bonn, University of Cologne, 50937 Cologne, Germany.

    3Department I of Internal Medicine, Center of Integrated Oncology Köln-Bonn, University of Cologne, 50937 Cologne, Germany.

    4Max Planck Institute for Neurological Research with Klaus-Joachim-Zülch Laboratories of the Max Planck Society and the Medical Faculty of the University of Cologne, 50931 Cologne, Germany.

    5Institute for Theoretical Physics, University of Cologne, 50937 Cologne, Germany.

    6Cologne Center for Genomics, University of Cologne, 50931 Cologne, Germany.

    7Thoracic Surgery, Lungenklinik Merheim, Kliniken der Stadt Köln gGmbH, 51109 Cologne, Germany.

    8Institute of Pathology, Center of Integrated Oncology Köln-Bonn, University of Cologne, 50937 Cologne, Germany.

    9Center for Molecular Medicine Cologne, University of Cologne, Cologne, Germany.

    10Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases, University of Cologne, Cologne, Germany.

    11Bonn-Cologne Graduate School of Physics and Astronomy, 53115 Bonn, Germany.

    12Institute of Legal Medicine, University of Cologne, 50823 Cologne, Germany.

    13Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, 3002 Victoria, Australia.

    14Department of Prostate Cancer Research at the Institute of Pathology, University Hospital of Bonn, 53127 Bonn, Germany.

    15Institute of Pathology, Jena University Hospital, Friedrich-Schiller-University, 07743 Jena, Germany.

    16Technical University Dortmund, Department of Chemical Biology, 44227 Dortmund, Germany.

    17Institute of Pathology, University of Heidelberg, 69120 Heidelberg, Germany.

    18Department of Thoracic Surgery, Thoraxklinik am Universitätsklinikum Heidelberg, 69126 Heidelberg, Germany.

    19Institute for Pathology Bad Berka, 99438 Bad Berka, Germany.

    20Department for Internal Medicine II, Jena University Hospital, Friedrich-Schiller University, 07740 Jena, Germany.

    21Center for Medical Genetics, Ghent University, 9000 Ghent, Belgium.

    22Department of Pulmonary Diseases, University of Groningen, University Medical Centre Groningen, 9713 GZ Groningen, the Netherlands.

    23Department Pathology and Medical Biology, University of Groningen, University Medical Center Groningen, 9713 GZ Groningen, the Netherlands.

    24Department of Pathology, VU University Medical Center Amsterdam, 1007 MB Amsterdam, the Netherlands.

    25Department of Pulmonary Diseases, VU University Medical Center Amsterdam, 1007 MB Amsterdam, the Netherlands.

    26Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA.

    27Department of Medical Oncology, Ospedale Civile, 57100 Livorno, Italy.

    28Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, 40138 Bologna, Italy.

    29Roy Castle Lung Cancer Research Programme, The University of Liverpool Cancer Research Centre, University of Liverpool Cancer Research Centre, Department of Molecular and Clinical Cancer Medicine, Institute of Translational Medicine, The University of Liverpool, Liverpool L3 9TA, UK.

    30Department of Thoracic Surgery, Rikshospitalet, Oslo University Hospital, N-0027 Oslo, Norway.

    31Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, N-0424 Oslo, Norway.

    32Department of Oncology, Norwegian Radium Hospital, Oslo University Hospital, N-0310 Oslo, Norway.

    33Institute for Surgical Pathology, University Hospital Zurich, 8091 Zurich, Switzerland.

    34Phase I Unit–Department of Medicine, Institute Gustave Roussy, 94800 Villejuif, France.

    35France Service d’Anatomie-Pathologie, Institut Mutualiste Montsouris, 75014 Paris, France.

    36Department Cancer Medicine, Institute Gustave Roussy, 94800 Villejuif, France.

    37Pole de Cancérologie et Médecine Aigue Communautaire, CHU Grenoble, CS 10217, 38043 Grenoble, France and INSERM U823, Grenoble, France.

    38Vanderbilt-Ingram Cancer Center, Nashville, TN, USA.

    39Department of Genome Sciences, University of Washington, Seattle, WA 98195–5065, USA.

    40Klinik für Internistische Onkologie Evangelische Kliniken Bonn, 53113 Bonn, Germany.

    41Klinik für Thoraxchirurgie, Malteser Krankenhaus, 53123 Bonn, Germany.

    42The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA.

    43Department of Genetics, Stanford University, Stanford, CA, USA.

    44Department of Haematology and Medical Oncology, Peter MacCallum Cancer Centre, Melbourne, 3002 Victoria, Australia.

    45The University of Melbourne Department of Surgery, St Vincent’s Hospital, Melbourne, 3065 Victoria, Australia.

    46Department of Pathology, St. Vincent’s Hospital, Melbourne, 3065 Victoria, Australia.

    47Laboratory of Oncology, Istituto di Ricovero e Cura a Carattere Scientifico Casa Sollievo della Sofferenza, San Giovanni Rotondo, Italy.

    48Translational Lung Research Center Heidelberg, Member of the German Center for Lung Research, Heidelberg, Germany.

    49Department of Informatics, University of Cologne, 50931 Cologne, Germany.

    50Computing Center, University of Cologne, 50931 Cologne, Germany.

    51Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.

    52Network Genomic Medicine, University Hospital Cologne, Center of Integrated Oncology Köln Bonn, 50937 Cologne, Germany.

    53Howard Hughes Medical Institute, Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA.

    54Epidemiologisches Krebsregister NRW gGmbH, 48149 Münster, Germany.

    55Hämatologisch-Onkologische Praxis Eppendorf, 20249 Hamburg, Germany.

    56Klinik für Pneumologie, Florence Nightingale Krankenhaus, 40489 Düsseldorf, Germany.

    57Lungenklinik, Krankenhaus Bethanien Moers, 47441 Moers, Germany.

    58Klinik für Hämatologie, Onkologie und Palliativmedizin, Augusta Kliniken Bochum, 44791 Bochum, Lungenkrebszentrum Herne-Bochum, Germany.

    59Abteilung für Pneumologie, Malteser Krankenhaus, 53123 Bonn, Germany.

    60Klinik für Innere Medizin, Krankenhaus der Augustinerinnen, 50678 Köln, Germany.

    61Klinik für Pneumologie, Krankenhaus Bethanien Solingen, 42669 Solingen, Germany.

    62Schwerpunktpraxis für Lungen- und Bronchialheilkunde, Allergologie, Friedrichstr. 33-35, 40217 Düsseldorf, Germany.

    63Department of Cardiothoracic Surgery, University Hospital Cologne, Center of Integrated Oncology Köln Bonn, 50924 Cologne, Germany.

    64Abteilung für Pneumologie, Evangelisches Krankenhaus Kalk, 51103 Köln, Germany.

    65Abteilung Pneumologie III/Thorakale Onkologie, Lungenklinik Hemer, 58675 Hemer, Germany.

    66Department III of Internal Medicine, University Hospital Cologne, Center of Integrated Oncology Köln Bonn, 50924 Cologne, Germany.

    67PIOH Frechen, Kölner Straße 9, 50226 Frechen, Germany.

    68PIOH Köln, Richard Wagner Straße 13-17, 50674 Köln, Germany.

    69Lukaskrankenhaus Neuss, Innere Medizin II, 41464 Neuss, Germany.

    70Gemeinschaftspraxis für Hämatologie und Onkologie, Sachsenring 69, 50677 Köln, Germany.

    71Department Hematology and Oncology, Jena University Hospital, Friedrich-Schiller University, 07747 Jena, Germany.

    72Innere Medizin II Gastroenterologie/Pneumologie/Onkologie, Marienhospital Brühl, 50321 Brühl, Germany.

    73Department of Pathology, Norwegian Radium Hospital, Oslo University Hospital, N-0310 Oslo, Norway.

    74University Campus Bio-Medico of Roma, Center for Integrated Research, Laboratory for Molecular Medicine and Biotechnology, 00128 Rome, Italy.

    75Université Joseph Fourier-INSERM U 823 Institut Albert Bonniot, 38700 La Tronche, France.

    76Institute of Pathology, University Hospital Goettingen, Goettingen, Germany.

    77Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.

    78Division of Cancer Surgery, Peter MacCallum Cancer Centre, East Melbourne, 3002 Victoria, Australia.


  1. Acknowledgments: We thank W. Vogel, C. Beschorner, C. Becker, B. Pinther, and P. Löhrer for their technical assistance and M. Knowles for providing the FGFR3b pBabe plasmids. We also thank the regional computing center of the University of Cologne (Regional Computing Centre) for providing the CPU time on the German Research Foundation (DFG)–funded supercomputer “CHEOPS” as well as for support. The Victorian Cancer Biobank stored and processed the Australian biospecimens used in this study.

    Funding: This work was supported by the EU-Framework Programme CURELUNG (HEALTH-F2-2010-258677 to R.K.T., R.B., and J.W.), by the Deutsche Forschungsgemeinschaft through TH1386/3-1 (to R.K.T. and M.L.S.) and through SFB832 (TP6 to R.K.T. and J.W.; TP5 to L.C.H.) and SB680 (to J.B.), by the German Ministry of Science and Education as part of the NGFNplus program (grant 01GS08100 to R.K.T. and J.W.; grant 01GS08101 to P.N.), by the Deutsche Krebshilfe as part of the Oncology Centers of Excellence funding program (to R.K.T., R.B., and J.W.) and SyBaCol (to J.B.), by the Max Planck Society, by the Behrens-Weise Foundation (M.I.F.A.NEUR8061 to R.K.T.), by Stand Up To Cancer–American Association for Cancer Research Innovative Research Grant (SU2C-AACR-IR60109 to W.P. and R.K.T.), by the Roy Castle Lung Cancer Foundation UK (to J.K.F.), and by an anonymous foundation (to R.K.T.). T.M. was supported by 82DZL00402 (DZL biobank platform) from the German Centre for Lung Research. Funding of supercomputer CHEOPS was provided by the DFG and the Ministry of Research of the State of North Rhine–Westphalia.

    Competing interests: R.K.T. is a founder and shareholder of Blackfield AG. R.K.T. received consulting and lecture fees (Sanofi-Aventis, Merck, Roche, Lilly, Boehringer Ingelheim, AstraZeneca, Atlas-Biolabs, Daiichi-Sankyo, and Blackfield AG) as well as research support (Merck, EOS, and AstraZeneca). R.B. is a cofounder and owner of Targos Molecular Diagnostics and received honoraria for consulting and lecturing from AstraZeneca, Boehringer Ingelheim, Merck, Roche, Novartis, Lilly, Qiagen, and Pfizer. J.W. received consulting and lecture fees from Roche, Novartis, Boehringer Ingelheim, AstraZeneca, Bayer, Lilly, Merck, and Amgen and research support from Roche, Bayer, Novartis, and Boehringer Ingelheim. M.L.S. is a fellow of the International Association for the Study of Lung Cancer. M.S. received funds from Lilly, Roche, Boehringer Ingelheim, AstraZeneca, and Pfizer. P.N. is a founder, CEO, and shareholder of ATLAS Biolabs GmbH. W.P. received research funding from Enzon Xcovery, AstraZeneca, Symphogen, Clovis Oncology, and Bristol-Myers Squibb and had paid consulting relationships with MolecularMD, AstraZeneca, Bristol-Myers Squibb, Symphony Evolution, Clovis Oncology, Exelixis, and Clarient. Rights to EGFR T790M testing were licensed on behalf of W.P. and others by Memorial Sloan-Kettering Cancer Center to MolecularMD. M.P. is a founder and shareholder of Blackfield AG. J.H.C. is a member of the Advisory Board of Boehringer Ingelheim Pharma GmbH & Co KG. H.G. is a member of the Advisory Boards of Roche, Eli Lilly, and Pfizer. H.-U.S. has an advisory relationship with Roche, Abbott Molecular, and Pfizer and has received honoraria from Novartis, Roche, Abbott Molecular, and Pfizer. B.S. is a member of the Advisory Boards for Pfizer, Novartis, Roche, Boehringer Ingelheim, and AstraZeneca. J.W. is a member of the Advisory Boards for Novartis, Roche, AstraZeneca, Boehringer Ingelheim, Lilly, Bristol-Myers Squibb, and Bayer and has received speaking fees from Novartis, Roche, AstraZeneca, Boehringer Ingelheim, and Lilly and research support from Roche, Boehringer Ingelheim, Novartis, and Bayer. T.Z. has performed advisory work for Merck, Lilly, Amgen, Roche, and Boehringer Ingelheim. D.B. is a member of regional advisory boards for Roche, Lilly, and Boehringer Ingelheim. F.L. received consulting fees from Blackfield AG. S.M.-B. received honoraria and grants from Roche and Novartis. L.N. received speaking fees from Roche and Novartis and honoraria from Pfizer. M.D.W. had paid consulting relationships with GeneCentric Cancer Therapeutics Innovation Group.

    Data and materials availability: Segmented copy number data, gene expression data from expression arrays and RNAseq, and binary sequence alignment data of 300–base pair regions around all somatic mutations that were identified from whole-exome sequencing data are available at The data are also available through ArrayExpress at accession number E-MTAB-1999.

    Author contributions: The CLCGP and NGM Initiatives contributed collectively to this study. The Biospecimens and Data Source Sites provided biospecimens or data for further analysis. Data were generated and analyzed by the biospecimen core resources and data analysis team led by D.S., T.Z., L.C.H., M.P., M.B., R.B., J.W., and R.K.T.