Research Article | Cancer

APOBEC mutation drives early-onset squamous cell carcinomas in recessive dystrophic epidermolysis bullosa


Science Translational Medicine  22 Aug 2018:
Vol. 10, Issue 455, eaas9668
DOI: 10.1126/scitranslmed.aas9668

Mutational signature sleuthing

Individuals with the inherited skin disease recessive dystrophic epidermolysis bullosa (RDEB) are predisposed to developing aggressive squamous cell carcinomas (SCCs), although why this patient group is prone to these cancers at such early ages is unknown. Cho et al. sequenced multiple RDEB SCC tumors and found that the mutation profile in these carcinomas was most consistent with APOBEC-associated mutagenesis, unlike other types of SCC that may be driven by ultraviolet light or tobacco smoke exposure. This finding could open up new lines of thinking on how to successfully prevent or target SCCs in RDEB patients.


Recessive dystrophic epidermolysis bullosa (RDEB) is a rare inherited skin and mucous membrane fragility disorder complicated by early-onset, highly malignant cutaneous squamous cell carcinomas (SCCs). The molecular etiology of RDEB SCC, which arises at sites of sustained tissue damage, is unknown. We performed detailed molecular analysis using whole-exome, whole-genome, and RNA sequencing of 27 RDEB SCC tumors, including multiple tumors from the same patient and multiple regions from five individual tumors. We report that driver mutations were shared with spontaneous, ultraviolet (UV) light–induced cutaneous SCC (UV SCC) and head and neck SCC (HNSCC) and did not explain the early presentation or aggressive nature of RDEB SCC. Instead, endogenous mutation processes associated with apolipoprotein B mRNA-editing enzyme catalytic polypeptide–like (APOBEC) deaminases dominated RDEB SCC. APOBEC mutation signatures were enhanced throughout RDEB SCC tumor evolution, relative to spontaneous UV SCC and HNSCC mutation profiles. Sixty-seven percent of RDEB SCC driver mutations were found to emerge as a result of APOBEC and other endogenous mutational processes previously associated with age, potentially explaining a >1000-fold increased incidence and the early onset of these SCCs. Human papillomavirus–negative basal and mesenchymal subtypes of HNSCC harbored enhanced APOBEC mutational signatures and transcriptomes similar to those of RDEB SCC, suggesting that APOBEC deaminases drive other subtypes of SCC. Collectively, these data establish specific mutagenic mechanisms associated with chronic tissue damage. Our findings reveal a cause for cancers arising at sites of persistent inflammation and identify potential therapeutic avenues to treat RDEB SCC.


The notion of tumors as wounds that do not heal and the link between tissue damage, inflammation, and cancer have a long history (1). This relationship is exemplified by the disease recessive dystrophic epidermolysis bullosa (RDEB) in which tissue damage, inflammation, and aberrant wound healing lead to early-onset, highly malignant squamous cell carcinoma (SCC) of the skin (2). Five-year survival in RDEB patients diagnosed with SCC is close to 0%. With the exception of the rare nuclear protein in testis (NUT) midline carcinoma (3), RDEB SCCs carry the worst prognosis of any of the diverse anatomic subtypes of SCC.

RDEB is an autosomal recessive disorder caused by mutations in COL7A1, which lead to skin fragility and trauma-induced blisters that form beneath the lamina densa of the basement membrane of the epidermis (4). RDEB wounds frequently exhibit poor healing and characteristically resolve with excessive scarring and fibrosis. Patients also often develop nail dystrophy, pseudosyndactyly, dysphagia/esophageal strictures, and anemia, in addition to an increased risk (up to 90% cumulative risk by age 55) of highly malignant cutaneous SCC. Recent work has shown that RDEB SCC is human papillomavirus (HPV)–negative (5) and that the tumor microenvironment likely contributes to progression (6, 7), but little is known about the somatic mutation or genetic landscape in these tumors. In particular, it is not clear what genetic mechanisms cause SCC at such a young age.

Identifying mutational signatures in cancer can provide insight into the underlying cause (8). In sporadic skin cancer, including SCC, ultraviolet (UV) damage is the principal driver of genetic mutation (specifically, C > T transition in the context of dipyrimidines), leading to some of the highest numbers of mutations identified in any cancer (9). In lung and head and neck SCC (HNSCC), tobacco smoke yields a distinct mutation signature dominated by C > A transversion (8). More recently, a correlation has been made between smoking and mutation signatures associated with endogenous deaminases of the active polynucleotide cytosine deaminase family, collectively termed APOBEC (apolipoprotein B mRNA-editing enzyme catalytic polypeptide–like) and characterized by C > T and C > G in the TpCpW context (10). APOBEC signatures are also prevalent in HPV-induced SCC of the cervix and head and neck (8) as well as bladder cancer (11) and are emerging as a major mutation mechanism in human cancers more generally (12, 13). We hypothesized that the mutational signatures dominant in RDEB SCC would provide fresh insight into the mechanisms driving early onset of this devastating cancer and that comprehensive genetic analysis might identify much-needed therapeutic options, as current clinical guidelines offer limited options consisting of wide local excision, radiotherapy, and, in late stages, limb amputation (14).
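The sequence contexts named above can be illustrated with a minimal sketch that labels a single-base substitution by its trinucleotide context. This is not the signature-extraction framework used in the study; the function names and the rule order are assumptions made for illustration. A C > T change in a TpCpW context also sits at a dipyrimidine, so the sketch checks the TpCpW rule first, whereas real analyses resolve that overlap statistically across many mutations.

```python
from typing import Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def to_pyrimidine_context(trinuc: str, alt: str) -> Tuple[str, str]:
    """Express a substitution so that the reference (middle) base is C or T."""
    trinuc, alt = trinuc.upper(), alt.upper()
    if trinuc[1] in "AG":
        # Purine reference: report the change on the opposite strand.
        return reverse_complement(trinuc), COMPLEMENT[alt]
    return trinuc, alt


def classify(trinuc: str, alt: str) -> str:
    """Label a substitution given its 5'-ref-3' trinucleotide and the new base.

    Hypothetical helper: the labels are illustrative, not COSMIC assignments.
    """
    context, alt = to_pyrimidine_context(trinuc, alt)
    five, ref, three = context
    # APOBEC-associated: C>T or C>G at TpCpW (W = A or T).
    if ref == "C" and five == "T" and three in "AT" and alt in "TG":
        return "APOBEC-like (TpCpW)"
    # UV-associated: C>T where the C has a pyrimidine neighbor.
    if ref == "C" and alt == "T" and (five in "CT" or three in "CT"):
        return "UV-like (dipyrimidine C>T)"
    # Tobacco-associated signatures are dominated by C>A transversions.
    if ref == "C" and alt == "A":
        return "C>A transversion"
    return "other"
```

A single mutation can never be assigned to a process with certainty; the labels only show which context each described signature favors.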


RDEB SCC acquires classic recurrently mutated genes and structural alterations

We performed exome sequencing of 27 independent SCC tumors isolated from 26 RDEB patients and compared our results with published data sets from 38 UV-induced, immune-competent cutaneous SCC (UV SCC) patient tumors (15) and 279 HNSCC patient tumors (16). Table S1 details all samples used for this study. The average number of unique reads per RDEB tumor and normal samples for exome sequencing was 195,525,950 and 101,766,518, respectively, generating 217.3-fold mean and 169-fold median target base coverage for all samples. A total of 16,136 somatic coding mutations were identified from exome sequencing of 27 RDEB SCC (table S2), and the mean overall mutation rate was 9.6 mutations/million base pairs (Mbp) with 3.5 nonsynonymous mutations/Mbp. The mean age of the RDEB SCC cohort was 32.4 years, significantly less than the UV SCC (66.7 years; Mann-Whitney U test, P = 2.45 × 10^−10) and HNSCC (61.3 years; Mann-Whitney U test, P = 2.6 × 10^−14) cohorts. We identified an average of 147 nonsynonymous mutations per tumor (table S2), significantly fewer than in UV SCC (1753 nonsynonymous mutations per tumor; Mann-Whitney U test, P = 4.3 × 10^−7) and comparable to visceral SCC such as HNSCC (133 nonsynonymous mutations per tumor). Genes with common mutations were identified using the MutSigCV and MuSiC algorithms (Fig. 1A) (17, 18). MuSiC identified eight genes (CASP8, NOTCH1, TP53, FAT1, CDKN2A, HRAS, ARID2, and KMT2B, also known as MLL4) as significantly mutated with a false discovery rate (FDR) q < 0.001 (table S3). MutSigCV identified only three genes (CDKN2A, CASP8, and TP53) as significantly mutated, although the other cancer-related genes HRAS and ARID2 returned a P < 0.01 (table S4). CASP8, NOTCH1, TP53, FAT1, CDKN2A, HRAS, ARID2, and KMT2B were previously identified as potential drivers in multiple skin and visceral SCC sequencing studies (Fig. 1B) (9, 15, 16, 19–22).
All tumors demonstrated copy number variants, with loss of chromosomal regions 3p and 8p and gains of 3q, 5, 7, 8q, and 20 frequently observed (Fig. 1C), thus resembling UV SCC, HNSCC, and lung SCC (15, 16, 19). RDEB SCC exomes showed high instability with a mean of 66 copy number variants (amplifications or deletions) per tumor (table S5).

Fig. 1 RDEB SCC somatic mutation and DNA copy number alterations resemble those identified in HNSCC and UV SCC.

(A) Significantly mutated genes in RDEB SCC. Bar graph indicates number of mutations affecting protein-coding sequence. Matrix rows represent significantly mutated genes (identified using the MuSiC algorithm; FDR, <0.001) ordered by P value. Each column represents a different sample (n = 27; number indicating samples RDEBSCC_01 through RDEBSCC_31) with box color-coded by mutation type; if more than one mutation is seen in a gene in a sample, then the most impactful mutation (activating Ras > nonsense > frameshift > splice site > missense) is coded. Dark blue bars on left show percentage of samples containing mutations for that gene. (B) Comparison of significantly mutated genes in RDEB SCC, UV SCC, and HNSCC. Color scale corresponds to MutSigCV algorithm q value. (C) Copy number alterations for 27 RDEB SCC. Red indicates regions of copy number gain, whereas blue indicates regions of copy number loss, with chromosome indicated at the top.

APOBEC mutation signatures are sharply enhanced in RDEB SCC

We investigated what mutational processes could lead to the acquisition of classic SCC drivers at such an early age. RDEB SCC arises at sites of chronic wounding and healing, which are usually bandaged and shielded from sunlight. Therefore, it seems unlikely that these SCCs are caused primarily by UV damage (23). To determine the primary mutation processes active in RDEB SCC, we analyzed the mutational spectra in each sample using a previously described bioinformatics framework (24). For each somatic variant, this approach used the specific base change and surrounding sequence context to deduce a likely cause of mutation (8). This analysis was performed on exome sequencing–derived nucleotide variants from 27 RDEB SCC samples, as well as 38 UV SCC and 279 HNSCC exomes. Mutational signature analysis was also performed on variants identified by whole-genome sequencing from 3 of the 27 RDEB SCC samples. Three de novo mutation signatures were extracted from exome-sequenced RDEB SCC samples, and two signatures were identified from whole-genome sequenced samples. These signatures represented either known Catalogue of Somatic Mutations in Cancer (COSMIC) signatures or mixtures of known COSMIC signatures (fig. S1A). No new mutation signatures were identified in RDEB SCC tumors. The RDEB SCC mutations we identified were associated with signatures 1 and 5 (of unknown mechanism and previously associated with age in a number of different cancer types) (25), UV damage (signature 7), APOBEC editing (signatures 2 and 13), and signature 18 (of unknown mechanism, associated with C > A transversion mutations and identified in a number of different cancers; Fig. 2A). Signatures 1, 2, 5, 7, 13, and 18 were identified in all SCC subtypes analyzed, and comparison of whole-exome with whole-genome sequencing results showed excellent concordance (fig. S1B). APOBEC-associated mutation signatures were identified in all RDEB SCC samples. 
Additional mutation signatures identified in UV SCC and HNSCC were associated with defective mismatch repair (signature 6 identified in 4 of 36 UV SCC samples, contributing to 2.4% overall mutations) or tobacco smoke (signature 4 identified in 64 of 191 HNSCC samples, contributing to 19% overall mutations; Fig. 2B). We detected transcriptional strand bias in the RDEB SCC–extracted signatures associated with COSMIC signature 7 (UV radiation damage; fig. S1A) along with high numbers of CC > TT dipyrimidine mutations consistent with UV damage. Transcriptional strand bias at T > C mutations at ApTpN contexts consistent with COSMIC signature 5 was identified in both exome and genome data (fig. S1A). Although exome data are not able to confirm APOBEC-associated mutation more frequently on the lagging compared with the leading strand during DNA replication (R asymmetry), identification of C > G transversion mutations in the TpCpW context, which are unique to APOBEC-associated mutation, confirms that this mutational process is present in all RDEB SCC samples analyzed in this study.

Fig. 2 Mutation signature analysis identifies prevalent APOBEC signatures in RDEB SCC.

(A) The y axis displays number of mutations, and x axis organizes 24 RDEB SCC samples in descending order of total number of single-nucleotide variants, with colors representing mutation types. The pie chart shows the collective proportion of all mutational signatures across all 15,776 mutations in all 24 samples. (B) Pie charts show proportions of single-nucleotide variants corresponding to specific signatures for 36 UV SCC (15) and 191 HNSCC (16), separated by HPV status or presence of tobacco signature mutations. Only high-confidence samples, those showing cosine similarity of >0.85 between observed nucleotide context and reconstructed nucleotide context using the identified signatures and their activities, are shown.
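The high-confidence filter described in this caption can be sketched as follows, assuming each sample is represented as a vector of mutation counts per nucleotide context and each signature as a vector of per-context probabilities. The function names and the toy two-context vectors in the tests are hypothetical; only the 0.85 threshold comes from the caption.

```python
import math


def cosine_similarity(observed, reconstructed):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(o * r for o, r in zip(observed, reconstructed))
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_r = math.sqrt(sum(r * r for r in reconstructed))
    if norm_o == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_o * norm_r)


def reconstruct(signatures, activities):
    """Rebuild a context profile as the activity-weighted sum of signatures.

    signatures: list of per-context probability vectors (one per signature).
    activities: number of mutations attributed to each signature.
    """
    n_contexts = len(signatures[0])
    return [
        sum(activity * signature[i] for signature, activity in zip(signatures, activities))
        for i in range(n_contexts)
    ]


def is_high_confidence(observed, signatures, activities, threshold=0.85):
    """True when the reconstruction matches the observed profile closely enough."""
    return cosine_similarity(observed, reconstruct(signatures, activities)) > threshold
```

Cosine similarity compares the shape of two profiles rather than their magnitude, so a sample with few mutations can still pass if its context distribution is well explained.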

Overall, only 38% of RDEB SCC mutations were UV-induced compared to 78% of UV SCC mutations (Fig. 2). Therefore, although a median of 1543 exome mutations harbored UV signatures in UV SCC, a median of only 75 mutations appeared sun-related in RDEB SCC. In contrast, APOBEC mutation processes accounted for 42% of RDEB mutations compared to <2% in UV SCCs. APOBEC activity (signatures 2 and 13) was detected in all RDEB SCCs (contributing a median of 179 mutations per exome) and in certain subsets of HNSCC but was virtually absent in spontaneous cutaneous SCCs arising from UV damage (2%; median, 0 exome mutations; Fig. 2). HPV-positive HNSCC and RDEB SCC mutations shared similar proportions of signatures 2 and 13, although RDEB SCC acquired more APOBEC mutations that were present in all RDEB tumors, in contrast to HPV-positive HNSCC (Fig. 3). We have previously shown that RDEB SCCs are HPV-negative (5) and their CDKN2A-mutant, wild-type PIK3CA status also resembles HPV-negative HNSCC (Fig. 1) (16).

Fig. 3 Increased signatures 2 and 13 mutations in RDEB SCC.

(A to C) Each box plot represents the distribution of mutation proportions identified by exome sequencing (as a percentage) corresponding to specific mutational signatures for RDEB SCC (n = 24), UV SCC (n = 36), HNSCC (tobacco- and HPV-negative; n = 104), tobacco-positive HNSCC (Tob+; n = 62), and HPV-positive HNSCC (HPV+; n = 25). Solid box outlines show the second and third quartiles, and the median is shown by the red line. The first and fourth quartiles are denoted by black broken lines, with outliers (more or less than two-thirds of the maximum/minimum) represented by +.

DNA repair is active in RDEB SCC

We sought to distinguish the potential roles of enhanced mutagenesis and impaired DNA repair in RDEB SCC. We applied an established method to assess DNA repair competence in a given tumor (26). When DNA repair is active, as in wild-type SCC, higher transcription levels allow the nucleotide excision repair machinery to access heterochromatic DNA, reducing the mutation rate. In xeroderma pigmentosum SCC, this relationship is minimal because of the lack of DNA repair (Fig. 4). Three RDEB SCCs (RDEBSCC_02, RDEBSCC_03, and RDEBSCC_05) were analyzed for these patterns, as they harbored sufficient single-nucleotide variants to establish trends. All three RDEB SCCs analyzed showed active DNA repair, closely resembling the DNA repair wild-type SCCs (Fig. 4). These data thus establish that an accelerated mutagenesis process drives the greater mutation accumulation in RDEB SCC.

Fig. 4 DNA nucleotide excision repair persists in RDEB SCCs.

The x axis shows RDEBSCC_02 (EB2), RDEBSCC_03, and RDEBSCC_05, which have acquired sufficient numbers of UV damage–induced mutations to support the described analysis (n > 1000); DNA repair wild-type UV damage–induced SCCs (WTSCC); and SCCs obtained from individuals with defective nucleotide excision repair (xeroderma pigmentosum patients; XPC and XPD). The y axis shows, for each sample, the fold decrease in the untranscribed strand mutation rate in genes with greater than 38.1 reads per kilobase of transcript per million mapped reads (RPKM), compared to genes with an RPKM of 0, for regions of high H3K9me3 density [>10 median chromatin immunoprecipitation sequencing intensity (42)]. Note that RDEBSCC_05 has fewer mutations (73 nonsynonymous mutations) and a lower contribution of UV mutation (15%) compared with RDEBSCC_02 and RDEBSCC_03 (334 and 153 nonsynonymous mutations, >50% UV contribution).

APOBEC mutation rate is accelerated in RDEB SCC

Given the inevitability and very early onset of RDEB SCC, we next asked how rapidly mutations were acquired over time. Mutations present in signature 1 were previously found to correlate with age in HNSCC (25), but the number of signature 1 mutations and patient age at time of tumor excision in RDEB SCC were not related (fig. S2). However, signature 5 mutations increased as a function of age at tumor excision (Pearson correlation, r = 0.49; P = 0.01), establishing that mutations associated with this process increase with time in RDEB SCC (fig. S2). APOBEC signatures did not correlate with age. However, more APOBEC mutations were acquired per year in an average RDEB patient SCC compared to both UV-induced patient SCC (7.4-fold; Mann-Whitney U test, P = 1.3 × 10^−9) and HNSCC (8.3-fold; Mann-Whitney U test, P = 1.8 × 10^−12), demonstrating that this mutational process is increased in RDEB patients.
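A per-year comparison of this kind can be sketched by dividing each tumor's APOBEC-attributed mutation count by the patient's age at excision and comparing cohort averages. Whether the study averaged per-tumor rates in exactly this way is an assumption, and the helper names are hypothetical.

```python
def mutations_per_year(apobec_mutations, age_years):
    """APOBEC-attributed mutations divided by age at tumor excision."""
    if age_years <= 0:
        raise ValueError("age must be positive")
    return apobec_mutations / age_years


def mean_rate(samples):
    """Mean per-year rate over (mutation_count, age_years) pairs."""
    rates = [mutations_per_year(count, age) for count, age in samples]
    return sum(rates) / len(rates)


def fold_difference(cohort_a, cohort_b):
    """Ratio of mean per-year rates between two cohorts (assumed summary statistic)."""
    return mean_rate(cohort_a) / mean_rate(cohort_b)
```

Normalizing by age matters here because the RDEB cohort is roughly half the age of the comparison cohorts, so equal raw counts would still imply a faster rate of acquisition.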

We speculated that the chronic damage environment giving rise to these tumors may enhance APOBEC activity. RNA sequencing analysis of RDEB SCC and unaffected RDEB skin samples (fig. S3) showed that all but two APOBEC3 genes were up-regulated in RDEB SCC (Fig. 5A), particularly APOBEC3A (log2 fold change, +4.8), APOBEC3B (log2 fold change, +2.56), and APOBEC3H (log2 fold change, +4.46; Fig. 5B and fig. S4). We next asked whether APOBEC3A, APOBEC3B, and APOBEC3H expression was increased in areas of chronic tissue damage. We measured mRNA transcript levels in biopsies taken from tumor excisions and their peripheries or, in two cases, from limbs amputated as a result of recurrent SCC (Fig. 5C). Quantitative polymerase chain reaction (qPCR) showed that transcript levels of these three APOBEC subunits increased from distal to peritumoral tissue, with a sharp increase in the tumor (Fig. 5C). Analyzing the ratio of APOBEC-associated mutations identified by exome sequencing in the RpTpC motif (where R at −2/+2 delineates a purine, either A or G) relative to APOBEC-associated mutations in the YpTpC nucleotide motif (where Y at −2/+2 delineates a pyrimidine, either T or C) suggested that APOBEC3A is a dominant contributor to APOBEC mutagenesis in RDEB SCC (fig. S5).
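The motif ratio described above can be sketched by sorting APOBEC-context mutations according to whether the base two positions upstream of the mutated cytosine is a purine (R) or a pyrimidine (Y). The sketch assumes each mutation is supplied on the strand where the reference base is C, together with its two upstream bases; the function names are hypothetical, and the sketch makes no claim about which enzyme each class indicates.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def motif_class(upstream2, ref):
    """Return 'RTC', 'YTC', or None for a mutated base and its two 5' neighbors.

    upstream2: bases at positions -2 and -1 (in that order) relative to the
    mutated base, on the strand where the reference base is C (assumed input).
    """
    upstream2 = upstream2.upper()
    if ref.upper() != "C" or len(upstream2) != 2 or upstream2[1] != "T":
        return None  # not a TpC context
    if upstream2[0] in PURINES:
        return "RTC"
    if upstream2[0] in PYRIMIDINES:
        return "YTC"
    return None  # ambiguous base such as N


def rtc_to_ytc_ratio(mutations):
    """Ratio of RTC to YTC counts; None when no YTC mutation is present."""
    counts = {"RTC": 0, "YTC": 0}
    for upstream2, ref in mutations:
        label = motif_class(upstream2, ref)
        if label is not None:
            counts[label] += 1
    if counts["YTC"] == 0:
        return None
    return counts["RTC"] / counts["YTC"]
```

In practice the observed ratio would be compared against the ratio expected from how often each motif occurs in the sequenced regions.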

Fig. 5 RNA sequencing identifies increased APOBEC3 gene expression in RDEB SCC.

(A) The y axis displays the log2 fold change compared to control housekeeping mRNA expression for 9 of 11 APOBEC-related genes. Norm, normal; Tum, tumor; Ctrl. avg, control average. (B) The y axis displays the log2 fold change in SCC over normal, and the x axis shows each of 11 APOBEC-related genes. RDEB SCC has the greatest increase in APOBEC3 gene expression compared with UV SCC and all other SCC cancers profiled in the Cancer Genome Atlas (TCGA). (C) qPCR measurement of APOBEC3A, APOBEC3B, and APOBEC3H mRNA expression relative to ACTB from three separate RDEB SCC and surrounding tissue.

Endogenous mutation processes dominate RDEB SCC driver mutations

It is possible that APOBEC mutation rates are increased in RDEB SCC but acquired late, resulting in passenger mutations, rather than driver mutations. To identify the mechanism of tumor initiation, we analyzed the spectrum of established driver mutations found in RDEB SCC (Fig. 1). The probability of each process causing each single-nucleotide driver variant was calculated using the described approach (27). Driver point mutations associated with signatures 1 and 5 accounted for 30% of all driver mutations compared with 18% overall point mutations in RDEB SCC. APOBEC signatures (signatures 2 and 13) were also strongly represented in driver point mutations (contributing 37% versus 42% overall), whereas UV mutations (signature 7) were reduced in driver genes, contributing only 30% of driver mutations. Therefore, endogenous mutation processes dominated by APOBEC cause more than two-thirds of driver mutations acquired by RDEB SCC, as opposed to one-fifth in UV SCC. Less than 1/50 of UV SCC driver mutations showed an APOBEC signature compared with more than one-third in RDEB SCC.
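One common way to estimate which process most likely caused a given variant is Bayes' rule: weight each signature's probability of producing the variant's sequence context by that signature's share of the tumor's mutations, then normalize. The sketch below illustrates that idea only; it is not confirmed to be the cited approach (27), and the function name and example numbers are hypothetical.

```python
def signature_posteriors(context_probs, exposures):
    """Probability that each signature produced one observed variant.

    context_probs: {signature: P(variant context | signature)}
    exposures:     {signature: fraction of the tumor's mutations attributed to it}
    """
    weights = {
        signature: exposures[signature] * context_probs.get(signature, 0.0)
        for signature in exposures
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no active signature can produce this context")
    return {signature: weight / total for signature, weight in weights.items()}


# Made-up numbers: two equally active signatures, one of which favors the
# variant's context three times more strongly than the other.
example = signature_posteriors(
    context_probs={"APOBEC": 0.3, "UV": 0.1},
    exposures={"APOBEC": 0.5, "UV": 0.5},
)
```

Summing these per-variant probabilities over all driver mutations gives the expected fraction of drivers attributable to each process, which is the kind of percentage reported in this section.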

APOBEC mutagenesis persists throughout RDEB SCC tumor evolution

To identify whether mutation processes identified in RDEB SCC are active throughout tumor evolution, we used multiregion whole-exome sequencing (M-WES) to deduce temporal clonal evolution in RDEB SCC. M-WES was performed on five tumors, four from distant portions of the same tumor (separated by >5 mm), and one from separate islands within the same tumor slice (separated by <1 mm; fig. S6). APOBEC mutation signatures were present in both the tumor “trunk” (shared between each portion of the tumor and considered to have arisen early in tumor development) and tumor “branches” (mutations confined to a given tumor region and considered to have arisen late in tumor development; Fig. 6, A and B), showing that this endogenous mutation process is active throughout the development of RDEB SCC.
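The trunk and branch definitions above map directly onto set operations, as in this minimal sketch. It assumes each tumor region is represented as a set of mutation identifiers; the function name and the identifiers used in the tests are hypothetical.

```python
def trunk_and_private(regions):
    """Split multiregion mutation calls into trunk and private (branch) sets.

    regions: {region_name: iterable of mutation identifiers}
    Returns (trunk, private), where trunk holds mutations present in every
    region and private[name] holds mutations seen only in that region.
    """
    names = list(regions)
    if not names:
        return set(), {}
    region_sets = {name: set(regions[name]) for name in names}
    trunk = set.intersection(*region_sets.values())
    private = {}
    for name in names:
        # Union of every other region; empty when only one region exists.
        others = set().union(*(region_sets[m] for m in names if m != name))
        private[name] = region_sets[name] - others
    return trunk, private
```

Mutations shared by some but not all regions fall into neither set here; with more than two regions they would need a separate category.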

Fig. 6 M-WES in RDEB SCC.

Separate regions of five different tumors (A and B) were compared with respect to mutation type and mutation signature. Heat maps show the presence (color indicated by key) or absence (white) of a somatic mutation in each tumor region (T). Each gene is arranged in a row, and putative driver mutations are indicated. The pie chart indicates the proportion of somatic mutations attributed to a given mutation signature, as described. Trunk refers to those somatic mutations shared by different tumor regions, and branch indicates private mutations, which were analyzed together for mutation signatures. (C) Primary (T1, RDEBSCC_20) and rapid recurrence (T2, RDEBSCC_25) were compared with respect to mutation type and mutation signature.

RDEB SCC tumors are homogenous with respect to driver mutation

M-WES analysis of driver mutations demonstrated that with the exception of a single NOTCH1 mutation, all identified driver genes harbored mutations in the tumor trunk (Fig. 6, A and B). In one case, additional CASP8 mutations were found in tumor branches. Shared driver mutations (TP53 and CASP8) were also observed when comparing a primary tumor with a rapid recurrence at the same location, 10 weeks after primary excision, whereas only two new nonsynonymous mutations were identified in the recurrence (Fig. 6C).

Our series includes multiple tumors from the same RDEB patient, enabling us to ask whether these mutation signatures describe inherent processes driving cancers in a given individual. Two tumors analyzed by M-WES were isolated from the same patient and showed remarkable similarity in both mutation profile and mutation signatures yet did not share a single somatic variant. Although the two tumors were from distant sites, each contained separate mutations in CASP8 (p.Q406* and p.Q482*), NOTCH1 (p.A729Gfs*14 and p.D412Y), and FAT1 (p.T2732Dfs*16/p.Q1114* and p.S4031*) and harbored different, oncogenic HRAS mutations (p.G12D and p.Q61L; Fig. 6A).

RDEB SCC transcriptomes share close homology with HNSCC subtypes

A gene set enrichment analysis comparison of RNA sequencing–based RDEB SCC with UV SCC (28), HNSCC [stratified by previous expression profiling analysis (16)], and visceral SCC showed that RDEB SCC is more similar to the basal and mesenchymal subtypes of HNSCC, whereas UV SCC is more closely related to the atypical subtype of HNSCC and esophageal SCC (Fig. 7A). Pathway analysis identified enrichment of inflammatory processes in RDEB SCC, including those associated with microbial infection (Fig. 7B). No viral transcripts were detected in the RDEB SCC samples.

Fig. 7 RDEB SCC transcriptomes resemble HNSCC.

(A) Normalized enrichment scores for each SCC signature compared to normal tissue were determined, and cancer types ranked by similarity clockwise on a CIRCOS plot. The outer ring shows enrichment of up-regulated transcripts, and the inner ring shows enrichment of down-regulated transcripts. Using this analysis, RDEB SCC was most closely related to the basal and mesenchymal subtypes of HNSCC, whereas UV SCC was more closely related to the atypical subtype of HNSCC and esophageal (ESCA) SCC. LUSCC, lung SCC; CSCC, cervical SCC. (B) KEGG (Kyoto Encyclopedia of Genes and Genomes) analysis identified inflammatory-associated pathways (red) increased in RDEB SCC. Pathways shown have P < 0.001, increasing from left (P = 2.8 × 10^−19) to right (P = 0.00064). ECM, extracellular matrix; PI3K, phosphatidylinositol 3-kinase; Jak, Janus kinase; STAT, signal transducers and activators of transcription.


Here, we comprehensively profiled the somatic alterations and transcriptomes of RDEB SCC, an early-onset, devastating cancer that develops in most RDEB patients. Given the rarity of RDEB, this is a large study of its kind, sufficiently powered to reveal driver mutations found in other SCC subtypes. We identified that endogenous mutation processes in RDEB patients are dominated by APOBEC and cause RDEB tumors to rapidly acquire a burden of mutation in childhood, comparable to that of visceral SCC occurring later in adulthood. We also showed that DNA repair remains highly active in these tumors, indicating that increased APOBEC activity is the primary cause of increased mutagenesis in RDEB SCCs.

Strikingly, the APOBEC subunits likely contributing to cancer mutagenesis, APOBEC3A (29, 30), APOBEC3B (31), and APOBEC3H (32), demonstrated the highest transcript expression in RDEB SCC compared with normal skin and for all SCC subtypes analyzed. Furthermore, the proportion of APOBEC mutations associated with APOBEC3A was greater than the context associated with APOBEC3B. Therefore, inhibition of APOBEC3A may represent a plausible tumor prevention strategy in RDEB and perhaps other tissue damage–driven cancers.

The contribution of endogenous mutagenic processes in RDEB SCC is profound. The UV-induced driver mutation burden in these patients was 30%, whereas endogenous processes increased this burden by 222%. These mechanisms generated more than twice as many mutations in an RDEB SCC than in an average HPV-negative, tobacco signature–negative HNSCC (371 versus 178, SD ± 278 exome mutations) but at less than half the average age (32.4, SD ± 8.5 versus 61.3, SD ± 12.5). Under a reductionist model in which any five independent, equally probable driver mutations are sufficient to form a cancer, the >4-fold increase in mutagenesis created by endogenous processes would therefore explain a 4^5-fold, or roughly 1000-fold, greater rate of tumor formation relative to HNSCC.
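The reductionist estimate can be checked with a short calculation. If each of five independent driver mutations becomes four times more likely, the probability of acquiring all five scales by four raised to the fifth power. The function name below is hypothetical, and the model is the simplified one stated in the text, not a fitted incidence model.

```python
def relative_tumor_formation(fold_mutation_rate, drivers_required):
    """Fold change in the chance of acquiring all required independent drivers.

    Each driver's probability scales with the mutation rate, and independent
    probabilities multiply, so the fold change compounds once per driver.
    """
    return fold_mutation_rate ** drivers_required


# A 4-fold higher mutation rate with five required drivers gives 4**5 = 1024,
# which is the "roughly 1000-fold" figure quoted above.
fold_increase = relative_tumor_formation(4, 5)
```

Because the relationship is exponential in the number of drivers, even a modest rise in mutation rate produces a very large rise in expected incidence under this model.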

The tumor microenvironment has been implicated previously in RDEB SCC progression (6, 7) and may accelerate endogenous mutation rate both indirectly, as a result of increased cellular turnover, and directly, as a result of APOBEC activation. Cellular turnover has been associated with signatures 1 and 5 based on correlations among age in certain cancers (25), somatic mutations arising in early human embryos (33), and over time in certain human tissues (34). However, signatures 1 and 5 mutation frequencies do not always correlate, suggesting that additional biological or environmental factors influence these processes (25). The significant correlation between signature 5 and age we report in RDEB SCC may reflect an increase in cellular turnover in wounded skin. We may not detect a relationship between signature 1 or APOBEC mutations and age in RDEB SCC because of insufficient sample size. However, APOBEC activation in other cancers does not correlate with age. Instead, APOBEC mutation has been associated with DNA replication stress (35), and particulate matter in tobacco smoke has been hypothesized to induce inflammation and APOBEC activity (10). In RDEB, cellular stress and inflammation resulting from continual tissue damage and microbial insult represent obvious sources of APOBEC activation. The identification of an enrichment of microbiome- and inflammatory-associated pathways in RDEB SCC supports this model. We also predict that SCC arising in the margins of chronic ulcers or burn scars (Marjolin’s ulcer) is driven by APOBEC mutagenesis. Heat induction of APOBEC (36) may also enhance mutation after burn injury.

RDEB SCC harbored a mutation prevalence similar to that of aged sun-exposed Caucasian skin (3.5 average coding mutations/Mbp versus between 2 and 6 mutations/Mbp) (28, 37) but with a far lower proportion of UV signature mutations. Therefore, RDEB SCC not only likely arises from mutagenic processes distinct from those dominant in UV-damaged tumors but also develops at much lower overall mutation burdens, even while acquiring similar driver mutations. This observation suggests that microenvironmental conditions may favor selection of SCCs in RDEB patients, distinct from enhanced mutation rate.

Neither enhanced mutation rate nor development of SCCs with lower mutation burdens directly explains why RDEB SCCs are so clinically aggressive. We do not detect somatic mutations that explain increased metastasis. Other UV-induced and visceral SCC sequencing studies have not identified cancer-associated COL7A1 mutations (9, 15, 20, 28), making it unlikely that this RDEB germline defect cell autonomously enhances tumor dissemination in non-RDEB individuals. We thus propose that the highly permissive, inflammatory microenvironment in RDEB SCC likely enables tumor cell invasion and metastasis.

RDEB SCC often arises as multiple primary tumors, and current treatment still relies on wide local excision, radiotherapy, and, in late stages, limb amputation (14). The dismal cure rates highlight an urgent need for new treatment options. Both our transcriptome and mutation signature analyses revealed greater similarity between RDEB SCC and subtypes of HNSCC than with UV SCC. These observations suggest that in addition to a therapeutic role for specific APOBEC inhibitors, treatments effective for HNSCC should be evaluated for response in RDEB SCC.

The high tumor and driver clonality we report in RDEB SCC indicates that these cancers may escape the selective pressure to develop private oncogenic mutations observed in other SCC (38, 39). Separate tumors arising at distant locations in a single patient showed remarkable similarity in driver gene profile, suggesting that tumor genetics are recurrently shaped by host genetics, a highly inflammatory microenvironment, or microbial colonization. Collectively, these data suggest that use of targeted therapies in RDEB SCC may have greater efficacy than with polyclonal tumors that have acquired the ability to adapt to selective treatment pressures.

Our study is not without limitations. As stated above, although we have identified a clear mechanism for mutation acquisition in RDEB SCC, increased sample number will strengthen any correlation between endogenous mutation signatures and age in this patient group. Because this is a genetic study, it does not reveal whether APOBEC induction in keratinocytes is caused by cell-autonomous inflammatory processes or by inflammatory input from the microenvironment. Finally, our study is not designed to directly identify the mechanism that makes RDEB SCCs exceptionally aggressive. However, the fact that RDEB SCC recurrent mutations appear quite similar to those in more indolent, spontaneous cutaneous SCCs suggests a role for the inflammatory microenvironment in permitting progression.


Study design

We started this study with the aim of comprehensively profiling the somatic mutation spectrum of RDEB SCC with the goal of determining whether a particular mutation or combination of mutations could explain the aggressive nature and early onset of this cancer. To this end, we set out to profile a minimum of 20 RDEB SCC patients using exome sequencing based on previous work identifying somatic mutation profiles in UV SCC in similar numbers of tumors (9, 20). Ethical approval for this investigation was obtained from all local ethics committees, and the study was conducted according to the Declaration of Helsinki Principles. All patients participating in this study, or in the case of minors, their representatives, provided written, informed consent. With one exception, all patients were treatment-naive at the time of tumor excision. Patient details are given in table S1. In total, we sequenced 31 unique tumors, each with corresponding normal paired samples, and excluded 4 tumors based on absence of somatic mutation. All tumors with evidence of somatic mutation were included in this study. During the course of the study, we identified eight large tumors with >80% tumor cellularity, which were suitable for RNA isolation and sequencing in the absence of laser-capture microdissection. Three tumors from the initial exome sequencing batch of 13 tumors contained a high number of exome mutations and were further sequenced at the whole-genome level for comparison and DNA repair estimates. Five large tumors were identified for M-WES.

Patient samples and DNA isolation

Punch biopsies of SCC tissue were taken after surgical excision of tumors and either processed for cell culture or immediately snap-frozen in liquid nitrogen, with the remainder of the tumor sent for formalin fixation and histopathologic diagnosis. To enrich for tumor cell populations, fresh-frozen SCC biopsies were laser capture–microdissected using the ArcturusXT Laser Capture Microdissection System (Applied Biosystems). Depending on sample size and tumor purity as estimated from a reference hematoxylin and eosin slide, between 30 and 60 sections of 8-μm thickness were cut onto 1.0-mm polyethylene naphthalate membrane slides (Zeiss), stained in 0.05% acid fuchsin (Acros Organics) in distilled water and 0.05% toluidine blue O (Acros Organics) in 70% ethanol, and microdissected, with tumor cells collected into 180 μl of ATL buffer (QIAGEN). DNA extraction was performed using the QIAamp DNA Micro Kit (QIAGEN), according to the manufacturer’s instructions. To provide a source of germline DNA, paired venous blood samples or normal skin was obtained concomitantly with cutaneous SCC tissue and stored at −80°C before DNA extraction using the protocol above in the case of normal skin or the DNeasy Blood and Tissue Kit (QIAGEN). In some instances, normal dermal fibroblasts or fibroblasts isolated from tumor tissue were used for germline DNA isolation using the DNeasy Blood and Tissue Kit (QIAGEN).

Calling of copy number variants

All identified somatic variants are detailed in table S2. For copy number variants, average tumor versus matched normal relative coverage and SD were calculated for each captured exon by dividing read depth measured in the tumor by the read depth in the matched normal for each position within the exon. Exons that were insufficiently covered (average read depth, <5 reads) in both tumor and matched normal were removed from the remaining analysis. Average relative coverage was corrected for GC bias using locally weighted scatterplot smoothing regression. Initial segmentation of exonic relative coverage estimates was performed using circular binary segmentation (40), requiring the mean relative coverages of adjacent segments to be within one SD of each other before they can be merged. The average majority allele fraction (AFm) in both tumor and matched normal was computed for each segment with at least one heterozygous single-nucleotide polymorphism (SNP) in the normal tissue, from the read depths of the germline allele with the greatest read support and the total read depths at the position in the tumor and matched normal, respectively [equation not recovered]. Relative coverage was determined simply by dividing the coverage observed in tumor by the matched normal coverage. The SD of AFm estimates was computed for segments featuring at least three heterozygous SNPs. Only single nucleotide polymorphism database (dbSNP) sites with sufficient read support in the tumor and matched normal [threshold not recovered] and deemed heterozygous [threshold not recovered] were considered when computing majority allele fraction estimates.
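The exon-level statistics described in this paragraph can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy read depths are hypothetical, averaging AFm per SNP is an assumption (the published equation is not reproduced here), and the GC-bias correction and circular binary segmentation steps are omitted.

```python
from statistics import mean, stdev

MIN_MEAN_DEPTH = 5  # from the text: exons averaging <5 reads in both samples are dropped


def exon_relative_coverage(tumor_depths, normal_depths):
    """Mean and SD of per-position tumor/normal depth ratios for one exon.

    Returns None when the exon is insufficiently covered in both samples.
    """
    if mean(tumor_depths) < MIN_MEAN_DEPTH and mean(normal_depths) < MIN_MEAN_DEPTH:
        return None
    # Assumption: positions with zero normal depth are skipped to avoid
    # division by zero; the text does not say how these were handled.
    ratios = [t / n for t, n in zip(tumor_depths, normal_depths) if n > 0]
    if not ratios:
        return None
    sd = stdev(ratios) if len(ratios) > 1 else 0.0
    return mean(ratios), sd


def majority_allele_fraction(het_snps):
    """Average majority allele fraction (AFm) over heterozygous SNPs.

    het_snps: list of (major_allele_depth, total_depth) pairs.
    The SD is reported only for three or more SNPs, as in the text.
    """
    fractions = [major / total for major, total in het_snps if total > 0]
    if not fractions:
        return None, None
    sd = stdev(fractions) if len(fractions) >= 3 else None
    return mean(fractions), sd


# Toy data (hypothetical): an exon with doubled tumor coverage and a
# segment whose three heterozygous SNPs each show a 3:1 allelic imbalance.
rc = exon_relative_coverage([40, 60, 50], [20, 30, 25])
afm = majority_allele_fraction([(30, 40), (15, 20), (45, 60)])
```

In this toy case the relative coverage is 2 and AFm is 0.75, the pattern one would expect from a gain of one allele; real data would first need the coverage and heterozygosity filters described above.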

To determine copy number variants, the exon-level statistics computed above were iteratively aggregated into larger segments in an agglomerative process similar in spirit to hierarchical clustering. In the first round, every pair of neighboring exons was analyzed. Neighboring exons that did not have significantly different relative coverage and AFm (only for exons with heterozygous SNPs) estimates (P > 0.95, two-sample Student’s t test) were merged into a single segment. The average relative coverage and its SD for the new segment were calculated as the base-pair count–adjusted average and SD of the two individual exon measurements, whereas AFm and its SD were recomputed for the new segment using the same procedure described above. This procedure was continued with neighboring segments using the above method for three rounds. The relative coverage estimates of all segments were centered to the median of the entire genome, which was assigned the value of 1.0, signifying the normal copy number state. The skew in AFm estimates caused when the majority allele’s read support was due to sampling bias instead of an underlying imbalance in allele copy number was corrected by subtracting out the AFm estimated in the matched normal sample. However, because this bias was only present in regions where both alleles had equal copy numbers, a correction was made using the following equation: [equation not recovered]. All copy number variants are detailed in table S5.
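The agglomerative merging and median centering can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the `Segment` class and function names are hypothetical, only relative coverage is compared (AFm is left out), the exact t test P value is replaced by a cutoff on the pooled t statistic, base-pair counts serve as sample sizes, and the genome-wide median is approximated by the median of segment means.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import median

# Assumption: |t| below this value approximates two-sided P > 0.95 for large
# degrees of freedom (normal approximation), standing in for an exact test.
T_MAX = 0.063
ROUNDS = 3  # from the text


@dataclass
class Segment:
    start: int
    end: int
    n_bp: int     # captured base pairs, used as the weight and sample size
    mean: float   # mean relative coverage
    sd: float     # SD of relative coverage


def similar(a, b):
    """True when the pooled two-sample t statistic is below T_MAX."""
    dof = a.n_bp + b.n_bp - 2
    pooled_var = ((a.n_bp - 1) * a.sd ** 2 + (b.n_bp - 1) * b.sd ** 2) / dof
    if pooled_var == 0:
        return a.mean == b.mean
    t = abs(a.mean - b.mean) / sqrt(pooled_var * (1 / a.n_bp + 1 / b.n_bp))
    return t < T_MAX


def merge(a, b):
    """Base-pair-weighted mean and SD of two neighboring segments."""
    n = a.n_bp + b.n_bp
    m = (a.n_bp * a.mean + b.n_bp * b.mean) / n
    var = (a.n_bp * (a.sd ** 2 + (a.mean - m) ** 2)
           + b.n_bp * (b.sd ** 2 + (b.mean - m) ** 2)) / n
    return Segment(a.start, b.end, n, m, sqrt(var))


def merge_round(segments):
    """Merge non-overlapping neighboring pairs that are not different."""
    out, i = [], 0
    while i < len(segments):
        if i + 1 < len(segments) and similar(segments[i], segments[i + 1]):
            out.append(merge(segments[i], segments[i + 1]))
            i += 2
        else:
            out.append(segments[i])
            i += 1
    return out


def segment_and_center(exons):
    """Run the merge rounds, then scale so the median coverage is 1.0."""
    segments = list(exons)
    for _ in range(ROUNDS):
        segments = merge_round(segments)
    center = median(s.mean for s in segments)
    return [Segment(s.start, s.end, s.n_bp, s.mean / center, s.sd / center)
            for s in segments]


# Toy exons (hypothetical): two with matching coverage, one with a gain.
exons = [Segment(0, 100, 100, 1.0, 0.1),
         Segment(200, 300, 100, 1.0, 0.1),
         Segment(400, 500, 100, 1.5, 0.1)]
result = segment_and_center(exons)
```

Here the first two exons merge into one segment and the third stays separate. The strict P > 0.95 criterion means only near-identical neighbors merge, which is why the cutoff on the t statistic is so small.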

Mutational signature analysis

We performed mutational signature analysis using a two-step approach (24, 25). First, we analyzed all tumors using the freely available Wellcome Trust Sanger Institute computational framework for deciphering mutational signatures. Mutational signatures were extracted separately for genome- and exome-sequenced data. Mutational signatures extracted from exome sequences were normalized to the trinucleotide frequency of the human genome. Mutational signatures were extracted using 96 nonnegative components (single-base somatic substitutions and their immediate sequence context) and 192 nonnegative components (single-base somatic substitutions with transcriptional annotation and their immediate sequence context). The 96 component-based mutational signatures were compared to the validated consensus mutational signatures in COSMIC (41) to identify the set of COSMIC mutational signatures in the SCC data sets. Next, we evaluated the activity of the COSMIC mutational signatures in each sample using the approach previously outlined (24). Briefly, this method identifies the optimal number of somatic mutations assigned to each mutational signature in each sample without violating previous biological knowledge about mutational signatures. The method also reports the cosine similarity between the mutational pattern of the original sample and the mutational pattern of the sample reconstructed using the consensus mutational signatures. Samples with cosine similarity below 0.85 were not considered in subsequent analyses. Last, using the derived activities of mutational signatures, we assigned to each somatic mutation a probability of having been generated by each mutational signature, as performed in (27). Further details are available in Supplementary Materials and Methods.
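The quality check and per-mutation assignment described above can be sketched in Python. This is a hedged illustration, not the Sanger framework itself: the signature names `sig_A` and `sig_B`, their values, and the three-channel toy profile are hypothetical (real profiles have 96 channels), and the probability rule shown, activity times signature probability normalized across signatures, is an assumed common formulation rather than a confirmed detail of reference 27.

```python
from math import sqrt

SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
BASES = "ACGT"
# 6 substitution classes x 16 flanking contexts = 96 channels, e.g. "A[C>T]G".
CHANNELS = [f"{five}[{sub}]{three}"
            for sub in SUBSTITUTIONS for five in BASES for three in BASES]
MIN_COSINE = 0.85  # from the text: samples below this are excluded


def cosine_similarity(u, v):
    """Cosine of the angle between two mutation-count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def reconstruct(signatures, activities):
    """Expected counts per channel: sum of activity x signature probability."""
    n = len(next(iter(signatures.values())))
    return [sum(activities[name] * sig[k] for name, sig in signatures.items())
            for k in range(n)]


def signature_probabilities(signatures, activities, channel_index):
    """Probability that a mutation in one channel came from each signature."""
    weights = {name: activities[name] * sig[channel_index]
               for name, sig in signatures.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()} if total else {}


# Toy three-channel example (hypothetical values).
signatures = {"sig_A": [0.8, 0.1, 0.1], "sig_B": [0.1, 0.1, 0.8]}
activities = {"sig_A": 100, "sig_B": 50}   # mutations attributed per signature
observed = [85, 15, 50]

similarity = cosine_similarity(observed, reconstruct(signatures, activities))
keep_sample = similarity >= MIN_COSINE
probs = signature_probabilities(signatures, activities, 0)
```

In this toy sample the reconstruction matches the observed counts, so the sample passes the 0.85 cutoff, and a mutation in the first channel is attributed to `sig_A` with probability 80/85.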


Materials and Methods

Fig. S1. Exome and genome mutation signature profiles and evidence of strand bias support identification of signatures 7 and 5.

Fig. S2. Signature 5 correlates with age in RDEB SCC.

Fig. S3. RNA sequencing analysis of RDEB SCC tumors.

Fig. S4. APOBEC gene expression in HNSCC subtypes and UV SCC.

Fig. S5. APOBEC3A mutation motifs are enriched in RDEB SCC.

Fig. S6. Histology of tumors used for M-WES.

Table S1. RDEB SCC tumor sample details.

Table S2. Somatic mutations identified across 27 RDEB SCC tumors.

Table S3. MuSiC output analysis of 27 RDEB SCC tumors.

Table S4. MutSigCV output analysis of 27 RDEB SCC tumors.

Table S5. Copy number variations identified across 27 RDEB SCC tumors.

References (43–55)


Funding: This work was funded by grants from the Sohana Research Fund (to A.P.S. and R.J.C.), the Epidermolysis Bullosa Research partnership (to R.J.C. and A.P.S.), and National Cancer Institute (5R01CA194617 to K.Y.T., C.C., K.R., X.S., and H.Y.). Sample collection was supported in part by a grant from DEBRA International (to L.B.-T.). Quantitative real-time PCR used the Sidney Kimmel Cancer Center Genomics Shared Resource Cancer Center supported by grant 5P30CA056036-17. Author contributions: R.J.C., L.B.A., K.Y.T., and A.P.S. designed the study. R.J.C., L.B.A., N.Y.d.B., V.S.A., M.F., M.P., I.F., R.H., E.P., T.N.N., E.J.G., K.Y.T., and A.P.S. performed the research. C.G.-G., J.P.H., I.F., D.F.M., J.C.S.-A., F.P., A.L.B., W.R., C.H., L.B.-T., M.T., M.F.J., E.R., S.M.L., J.E.M., J.A.M., J.W.B., and A.H. contributed reagents and analytical tools. R.J.C., L.B.A., N.Y.d.B., E.P., C.C., K.R., J.S., P.T., E.A.C., W.W., H.Y., X.S., S.C.B., J.G., E.A.E., C.M.D., G.E.D., K.R.C., K.Y.T., and A.P.S. analyzed data. R.J.C. and A.P.S. wrote the paper. Competing interests: E.A.C. is consultant for Bayer, Celgene, and Guardant Research and receives grants from Ignyta and Genentech. K.R.C. is an employee of Castle Biosciences Inc. and holds stock options in the company. D.F.M. is an investigator and advisor for clinical trials being conducted in the field of epidermolysis bullosa by Amicus, Amryt Pharma, Castle Creek Pharma, and with Stanford University. D.F.M. is also an advisor for Sienna and DEBRA Australia. M.F.J. receives grants from Castle Creek Pharma, Amicus, and ProQR and owns shares in Philae Pharmaceuticals. J.E.M. is a paid consultant for Castle Creek Pharma and ProQR and has been paid as an educator by Thornton & Ross. A.P.S. is a paid consultant for Krystal Biotech Inc. and Amryt Pharma and owns stock options in Krystal Biotech Inc. Data and materials availability: All data associated with this study are present in the paper or the Supplementary Materials. 
Binary sequence alignment/map files are available at the NIH Sequence Read Archive submission: SRP136498. RNA sequencing data can be accessed through the following link:
