Research Article | Cancer

Personalized circulating tumor DNA analysis to detect residual disease after neoadjuvant therapy in breast cancer


Science Translational Medicine  07 Aug 2019:
Vol. 11, Issue 504, eaax7392
DOI: 10.1126/scitranslmed.aax7392

Early detection, no time travel needed

Analysis of tumor DNA shed into a patient’s circulation can provide a noninvasive means of detecting the presence of a tumor and analyzing its DNA for targetable mutations. Unfortunately, it can be difficult to detect small amounts of tumor DNA in the blood, especially in patients who have already undergone initial chemotherapy treatment. To address this problem, McDonald et al. developed a method of targeted digital sequencing (TARDIS), which is customized for each patient and can then be used to monitor that patient over time, allowing early detection of tumor recurrence.

Abstract

Longitudinal analysis of circulating tumor DNA (ctDNA) has shown promise for monitoring treatment response. However, most current methods lack adequate sensitivity for residual disease detection during or after completion of treatment in patients with nonmetastatic cancer. To address this gap and to improve sensitivity for minute quantities of residual tumor DNA in plasma, we have developed targeted digital sequencing (TARDIS) for multiplexed analysis of patient-specific cancer mutations. In reference samples, by simultaneously analyzing 8 to 16 known mutations, TARDIS achieved 91 and 53% sensitivity at mutant allele fractions (AFs) of 3 in 10^4 and 3 in 10^5, respectively, with 96% specificity, using input DNA equivalent to a single tube of blood. We successfully analyzed up to 115 mutations per patient in 80 plasma samples from 33 women with stage I to III breast cancer. Before treatment, TARDIS detected ctDNA in all patients with 0.11% median AF. After completion of neoadjuvant therapy, ctDNA concentrations were lower in patients who achieved pathological complete response (pathCR) compared to patients with residual disease (median AFs, 0.003 and 0.017%, respectively, P = 0.0057, AUC = 0.83). In addition, patients with pathCR showed a larger decrease in ctDNA concentrations during neoadjuvant therapy. These results demonstrate high accuracy for assessment of molecular response and residual disease during neoadjuvant therapy using ctDNA analysis. TARDIS has achieved up to 100-fold improvement beyond the current limit of ctDNA detection using clinically relevant blood volumes, demonstrating that personalized ctDNA tracking could enable individualized clinical management of patients with cancer treated with curative intent.

INTRODUCTION

To maximize the rate of cure, patients with nonmetastatic cancer are often treated with multiple modalities including preoperative systemic therapy, surgery and radiation therapy, and postoperative therapy. However, for some patients, this results in overtreatment and adverse effects when they could have been cured with less intensive treatment, and the benefit of each consecutive modality of therapy is not certain (1). A treatment monitoring biomarker that can accurately distinguish residual disease from disease eradication could enable individualized management of localized cancers, but this has remained elusive because current diagnostics have inadequate sensitivity. In breast cancer, ~30% of patients treated with neoadjuvant therapy (NAT) achieve pathological complete response (pathCR) with no histological evidence of invasive tumor in the resected breast tissue and lymph nodes (2). pathCR during NAT is associated with excellent long-term clinical outcomes. Ten-year relapse-free survival rates are 95, 86, and 83% in patients with human epidermal growth factor receptor 2–positive (HER2+), triple-negative, and estrogen receptor–positive, human epidermal growth factor receptor 2–negative (ER+HER2−) breast cancer who achieve pathCR, respectively (3). In these patients, the added value of surgery, beyond removing preneoplastic lesions and confirming pathCR, might be questioned. An alternative diagnostic test to accurately detect residual disease could guide choice and planning of local treatment options such as the extent of surgical resection or the use of radiation therapy (4, 5).

Recent advances in circulating tumor DNA (ctDNA) analysis have shown promise in monitoring patients with nonmetastatic cancer, but these have primarily focused on recurrence monitoring and have limited accuracy for residual disease detection during treatment (6–9). In particular, detection of ctDNA after completion of NAT has been challenging in patients with breast and rectal cancer even when residual disease is observed at the time of surgery. Recent studies have found that ctDNA becomes undetectable in more than 90% of patients during NAT (10). As a result, no association has been observed between ctDNA detection and pathCR (11, 12). Detection of low amounts of ctDNA in patients with nonmetastatic cancer is impeded by limited blood volumes accessible in a clinical environment and low concentrations of total cell-free DNA (cfDNA). Unlike in patients with metastatic cancer where cfDNA concentrations are much higher, a 10-ml blood tube (4 ml of plasma) from patients with early stage cancer typically yields only 20 ng of cfDNA (~6000 haploid genome copies). In addition, ctDNA concentrations in patients with early and locally advanced cancer are lower compared to those in patients with metastatic cancer. For example, median ctDNA concentration before treatment in triple-negative breast cancer (TNBC) was 12.5% in patients with metastatic cancer and 0.68% in patients with nonmetastatic cancer (almost 20-fold lower) (12, 13). During and after completion of treatment, ctDNA signal from residual disease is expected to be even lower. As a result, sensitivity and analytical precision of ctDNA tests for residual disease are often limited due to stochastic sampling variation (Fig. 1A).

Fig. 1 Development of a multiplexed assay for personalized ctDNA detection and monitoring.

(A) Results of binomial sampling at varying input DNA amounts (bottom x axis) and corresponding plasma volumes (top x axis). Maximum theoretical sensitivity for 1 in 10^5 tumor fraction (y axis) was calculated as the probability of detecting at least one mutated DNA fragment for at least one targeted mutation. Each line shows the number of mutations tested (25 to 100, increments of 5). A plasma DNA concentration of 5 ng/ml of plasma (or 1500 haploid genome copies) and no molecular loss during library preparation are assumed. Sensitivity for detection of ctDNA at 0.001% tumor fraction is limited if only two to four mutations are assayed but can be improved with higher input of plasma DNA and increasing number of patient-specific mutations. (B) For TARDIS, sequencing library preparation includes linear pre-amplification to improve molecular conversion, single-stranded DNA ligation using hairpin oligonucleotides to allow error suppression using template fragment sizes and unique molecular identifiers (UMIs), and multiplexed PCR to enrich targeted genomic loci. (C) Schematic representation of read structure and error suppression. TARDIS uses UMIs (indicated by different read colors) and fragment sizes to group sequencing reads into RFs. We exclude PCR errors (red circle) by requiring consensus of all RF members and polymerase errors (yellow circles) introduced during linear pre-amplification by requiring support by at least two RFs. Additional description of error suppression strategies is provided in Materials and Methods.
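The sampling calculation described for panel (A) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name and parameters are hypothetical, the figure of 300 haploid genome copies per ng is derived from the caption's "5 ng/ml (or 1500 haploid genome copies)", and every targeted mutation is assumed to be present at the same tumor fraction and sampled independently.

```python
def detection_probability(tumor_fraction, input_ng, n_mutations,
                          copies_per_ng=300.0, conversion=1.0):
    """Probability of sampling at least one mutant fragment across all targets.

    Assumptions (not from the source): independent sampling at every locus,
    identical mutant fraction for every targeted mutation, and
    `conversion` = fraction of input molecules that are actually analyzed.
    """
    copies = input_ng * copies_per_ng * conversion
    # Each targeted locus offers `copies` independent chances to draw a mutant fragment.
    trials = copies * n_mutations
    return 1.0 - (1.0 - tumor_fraction) ** trials


# One 10-ml blood tube (~20 ng, ~6000 copies) at a tumor fraction of 1 in 10^5
p_four = detection_probability(1e-5, 20.0, 4)
p_hundred = detection_probability(1e-5, 20.0, 100)
```

Under these assumptions, four mutations give a detection probability of about 0.21, whereas 100 mutations push it above 0.99, which is the qualitative point of the panel.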

Sampling variation can be overcome by increasing the volume of blood obtained at each time point to increase the amount of plasma DNA analyzed, by improving the rate of conversion of DNA into sequencing-ready molecules, and by simultaneously analyzing multiple patient-specific somatic founder mutations. Founder mutations are present in all cancer cells, and therefore, each is equally informative of tumor-derived DNA in blood (14). To leverage these principles and enable residual disease detection, we have developed a personalized approach for tumor-guided ctDNA detection and quantification called targeted digital sequencing (TARDIS). Here, we describe the development and analytical performance of TARDIS using dozens of replicates of reference material with tumor fractions as low as 3 in 10^5, and we demonstrate the clinical performance of ctDNA detection and quantification in patients with early and locally advanced breast cancer, before and after completion of neoadjuvant systemic therapy.

RESULTS

Tumor-guided ctDNA analysis using TARDIS

We developed TARDIS to improve analytical sensitivity and quantitative precision for ctDNA analysis by maximizing interrogation of tumor-derived DNA fragments in limited amounts of plasma DNA. To achieve this, we leveraged simultaneous deep sequencing of patient-specific somatic mutations while minimizing template DNA losses during library preparation and suppressing background errors. For each patient, we identified putative founder somatic mutations using exome sequencing of tumor biopsies and analyzed dozens to hundreds of mutations simultaneously in serial plasma DNA samples obtained during treatment (fig. S1). To maximize capture and analysis of input DNA while preserving specificity, we perform targeted linear pre-amplification, followed by single-stranded DNA ligation with unique molecular identifiers (UMIs), targeted exponential polymerase chain reaction (PCR), and sequencing (Fig. 1B). The resulting sequencing reads at each targeted locus have a fixed amplification end and a variable ligation end, preserving fragment size information unlike conventional PCR amplicons (15, 16). We used fragment sizes and UMIs to group sequencing reads into read families (RFs) and required consensus of all members to distinguish true low abundance mutations from polymerase or sequencing background errors (Fig. 1C).
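The read-family grouping described above can be illustrated with a short sketch. It assumes each read has already been reduced to a record with hypothetical fields `locus`, `umi`, `fragment_size`, and `base` (the base observed at the targeted position); dropping a family entirely when its members disagree is one simple reading of "consensus of all members", not a confirmed detail of the published pipeline.

```python
from collections import defaultdict


def build_read_families(reads):
    """Group reads by (locus, UMI, fragment size) and keep unanimous families.

    Returns a dict mapping each family key to its consensus base.
    Families whose members disagree are discarded (an assumption).
    """
    families = defaultdict(list)
    for read in reads:
        key = (read["locus"], read["umi"], read["fragment_size"])
        families[key].append(read["base"])

    consensus = {}
    for key, bases in families.items():
        if len(set(bases)) == 1:  # every member reports the same base
            consensus[key] = bases[0]
    return consensus


def count_supporting_families(consensus, locus, alt_base):
    """Number of read families at `locus` whose consensus is the mutant base."""
    return sum(1 for (loc, _umi, _size), base in consensus.items()
               if loc == locus and base == alt_base)


example_reads = [
    {"locus": "chr1:100", "umi": "AAC", "fragment_size": 150, "base": "T"},
    {"locus": "chr1:100", "umi": "AAC", "fragment_size": 150, "base": "T"},
    {"locus": "chr1:100", "umi": "GGT", "fragment_size": 162, "base": "T"},
    {"locus": "chr1:100", "umi": "CCA", "fragment_size": 171, "base": "T"},
    {"locus": "chr1:100", "umi": "CCA", "fragment_size": 171, "base": "C"},
]
example_consensus = build_read_families(example_reads)
```

In this toy input, the third family (UMI `CCA`) contains one discordant read and is therefore excluded, leaving two families that support the mutant base.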

Evaluation of assay performance in reference samples

To evaluate the analytical performance of TARDIS at low ctDNA concentrations, we designed a multiplexed panel targeting eight mutations in commercially available reference samples for cfDNA analysis (Seraseq ctDNA Mutation Mix v2, SeraCare; table S1). We analyzed a total of 93 replicates, 7 to 16 each at 1, 0.5, 0.25, 0.125, 0.063, and 0.031% allele fractions (AFs) and 16 wild-type samples. AFs for individual mutations were verified by droplet digital PCR (ddPCR) by the vendor (except for 0.063 and 0.031% that were dilutions of 0.125% in wild type, table S2). Input DNA in each replicate was 5.6 to 7.9 ng (1682 to 2394 haploid genomic equivalents). Mean number of mutated molecules expected for each targeted mutation in a sample was 0.90 to 19.6 across 0.031 to 1% AFs.

To exclude polymerase errors introduced during linear or exponential amplification, we required at least two independent DNA fragments (≥2 RFs) and measured AF consistent with ≥0.5 mutant molecules to support each variant call. In raw sequencing results, we observed a mean error rate per base of 6.4 × 10^−4 and a median error rate per base of 2.2 × 10^−4, with background errors observed at 77% of tested positions. By requiring consensus of all members of an RF and a minimum of two RFs, we found a significantly reduced mean error rate of 1.1 × 10^−4 and a median of 0 (Wilcoxon rank-sum P < 1 × 10^−99; fig. S2). Ninety-three percent of tested positions were error free. In reference samples, we achieved a mutation-level sensitivity of 94.6, 90.6, 65.6, 50.8, 25.8, and 19.6% at 1, 0.5, 0.25, 0.125, 0.063, and 0.031% AFs, respectively, consistent with the decreasing number of mutant molecules at lower AFs (Fig. 2A and data file S1). Using the same criteria, none of the 128 candidate mutations were detected in wild-type samples (100% specificity). Analogous to the detection of tumor-derived DNA in plasma, we leveraged multiple mutations to evaluate sample-level sensitivity. To ascertain mutant DNA in a sample, we required ≥2 RFs contributed by one or more mutations, each with measured AF consistent with ≥0.5 mutant molecules in input DNA. In samples where a single mutation was detected, we required supporting RFs with ≥2 fragment sizes. We achieved a sample-level sensitivity of 100% for 0.125 to 1% AFs, 87.5% for 0.063% AF, and 78.6% for 0.031% AF (Fig. 2B). Using the same criteria, we detected mutant DNA in 1 of 16 wild-type samples (93.8% specificity). These results demonstrate the principle underlying TARDIS: leveraging multiple patient-specific mutations to overcome limits of sampling and to improve limit of detection. We successfully detected mutant DNA in 11 of 14 replicates with 0.031% AF with 7.8 ng of input DNA per reaction, when we expected a total of 7.2 mutant molecules per reaction across eight mutations (<1 mutant molecule per mutation).
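The mutation-level and sample-level calling rules stated in this paragraph can be written down as a small sketch. The function names and the record layout (`rf_sizes`, `af`) are hypothetical, and "AF consistent with ≥0.5 mutant molecules" is interpreted here as measured AF multiplied by input haploid genome copies being at least 0.5, which is an assumption about how that criterion is evaluated.

```python
def mutation_called(n_mutant_rfs, measured_af, input_copies,
                    min_rfs=2, min_molecules=0.5):
    """Mutation-level call: >=2 mutant RFs and AF consistent with >=0.5 molecules."""
    return (n_mutant_rfs >= min_rfs
            and measured_af * input_copies >= min_molecules)


def sample_detected(mutations, input_copies, min_rfs=2, min_molecules=0.5):
    """Sample-level call.

    `mutations` is a list of dicts with:
      'rf_sizes' - fragment size of each mutant read family at that mutation
      'af'       - measured allele fraction at that mutation
    """
    supported = [m for m in mutations
                 if m["rf_sizes"] and m["af"] * input_copies >= min_molecules]
    total_rfs = sum(len(m["rf_sizes"]) for m in supported)
    if total_rfs < min_rfs:
        return False
    if len(supported) == 1:
        # A lone mutation must be backed by at least two distinct fragment sizes.
        return len(set(supported[0]["rf_sizes"])) >= 2
    return True


copies = 2000  # hypothetical input, roughly 6.7 ng at 300 copies per ng
single_same_size = [{"rf_sizes": [150, 150], "af": 0.001}]
single_two_sizes = [{"rf_sizes": [150, 167], "af": 0.001}]
two_mutations = [{"rf_sizes": [150], "af": 0.0005},
                 {"rf_sizes": [162], "af": 0.0005}]
```

The first toy sample is rejected because its two read families share one fragment size, mirroring the wild-type case described in the Fig. 3B legend; the other two pass.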

Fig. 2 Analytical performance of TARDIS in reference samples.

(A) Mutation-level sensitivity and specificity across 93 reference samples and 8 mutations, requiring each mutation to be supported by ≥2 RFs and an AF consistent with ≥0.5 mutant molecules. Each row corresponds to a targeted mutation, and each column corresponds to a single sample analyzed at the identified AF. (B) Sample-level sensitivity and specificity, requiring ≥2 RFs contributed by 1 mutation with multiple fragment sizes or >1 mutation, each with an AF consistent with ≥0.5 mutant molecules. (C) Comparison of variant AFs observed using TARDIS (y axis) with expected variant AFs measured using ddPCR (x axis, 48 data points). For each variant, mean observed AF across all replicates (at the same expected AF) is presented. Gray line is linear fit. (D) Comparison of sample AFs observed using TARDIS (mean for all eight mutations assayed in each replicate sample, 77 data points) with known sample AFs (mean of known variant AFs). Gray line is linear fit to the mean at each expected AF. (E) CVs of variant AFs decreased with increasing number of mutant molecules per mutation. CVs calculated across 7 to 16 replicates at each mutation fraction for each of eight mutations (48 data points). (F) CVs of sample-level AFs were lower than those for individual mutations, demonstrating the advantage of leveraging multiple mutations for ctDNA quantification. CVs calculated across 7 to 16 replicates for sample-level AFs across six mutation fractions.

To determine quantitative accuracy, we compared known AFs for variants measured by ddPCR in reference samples to mean AFs measured using TARDIS and found a strong correlation (Pearson r = 0.921, P < 2.2 × 10^−16, Fig. 2C). To evaluate agreement between observed and expected mutant fraction in each sample (equivalent to ctDNA fraction in plasma samples), we calculated sample-level mean AFs (mean of eight mutations in each replicate) and found an excellent correlation between observed and expected AFs (Pearson r = 0.937, P < 2.2 × 10^−16, Fig. 2D). To evaluate quantitative precision, we calculated the coefficient of variation (CV) for observed AFs for each set of replicates (Fig. 2E). Variant-level CVs were strongly correlated with expected number of mutated molecules (Pearson r = −0.854, P = 1.2 × 10^−14), ranging from 0.28 (17.8 average mutant molecules per mutation) to 3.74 (0.08 mutant molecules per mutation). A similar pattern was observed in CVs for sample-level mean AFs (Pearson r = −0.936, P = 0.006), although precision improved when multiple mutations were aggregated together. Sample-level CVs ranged from 0.16 for 1% expected AF (137.9 average mutant molecules per reaction) to 0.87 for 0.031% expected AF (5.4 mutant molecules per reaction, Fig. 2F).
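To make the precision metrics concrete, the following sketch computes a variant-level CV and a sample-level CV from replicate measurements. The AF values are invented for illustration, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = standard deviation / mean, using the sample standard deviation."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


# Hypothetical AFs (%) for two mutations measured in three replicates.
replicates = [
    [0.0, 0.2],   # replicate 1: mutation A, mutation B
    [0.2, 0.0],   # replicate 2
    [0.1, 0.1],   # replicate 3
]

# Variant-level CV: one mutation followed across replicates.
mutation_a_cv = coefficient_of_variation([rep[0] for rep in replicates])

# Sample-level CV: first average all mutations within each replicate.
sample_level_afs = [mean(rep) for rep in replicates]
sample_cv = coefficient_of_variation(sample_level_afs)
```

In this deliberately extreme toy data set, sampling noise in the individual mutations cancels out once they are averaged, so the sample-level CV is far lower than the variant-level CV. Real data would show a smaller but directionally similar gain.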

To evaluate whether we can improve the limit of detection further using clinically accessible amounts of plasma DNA, we performed an additional experiment targeting 16 mutations in 56 replicates from reference samples, 8 replicates each at 1 and 0.03% AFs and wild type, and 32 replicates at 0.003% AF. DNA input per reaction was 5.0 to 13.6 ng for 1%, 0.03%, and wild-type samples, and 20.0 to 27.2 ng for 0.003% AF. Using these input amounts, we expected an average of 38.1, 1.6, and 0.28 mutant molecules per mutation and detected 89.1, 14.1, and 5.9% in 1, 0.03, and 0.003% AFs, respectively (Fig. 3A and data file S2). Only 1 of 128 candidate mutations was erroneously called as false positive in wild-type samples (99.2% specificity). Aggregating 16 mutations together, we expected 553, 19.8, and 2.9 to 4.0 total mutant molecules per sample. At the sample level, we detected tumor DNA in 8 of 8 (100%), 7 of 8 (87.5%), and 17 of 32 (53.1%) replicates in 1, 0.03, and 0.003% AFs, respectively (Fig. 3B). Tumor DNA was not confidently detected in any of eight wild-type samples using the same criteria (see Materials and Methods). To mimic analysis of cfDNA equivalent to two 10-ml blood tubes (a clinically accessible and relevant volume), we tested all combinations of any two replicates at 0.003% and achieved a sample-level sensitivity of 85.9% (237 of 276 combinations). Using a similar approach with wild-type samples, we detected tumor DNA in 1 of 28 combinations (96.4% specificity). As in the previous eight mutation experiments, we observed excellent agreement between sample-level AFs measured using TARDIS and digital PCR (Fig. 3C). Overall, these results confirm quantitative accuracy of TARDIS and demonstrate that simultaneously assaying multiple patient-specific mutations improves sensitivity as well as quantitative precision.
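The in silico pooling of replicate pairs described above can be sketched with `itertools.combinations`. This is a simplified illustration: each replicate is represented only by hypothetical counts of mutant read families per mutation, and detection uses just the ≥2 RF threshold, omitting the AF and fragment-size criteria of the full calling rule.

```python
from itertools import combinations


def pooled_pair_detection(replicate_rf_counts, min_rfs=2):
    """Pool every pair of replicates and count pairs reaching the RF threshold.

    `replicate_rf_counts` holds, for each replicate, a list with the number
    of mutant read families observed at each targeted mutation.
    Returns (number of detected pairs, total number of pairs).
    """
    pairs = list(combinations(replicate_rf_counts, 2))
    detected = sum(1 for a, b in pairs if sum(a) + sum(b) >= min_rfs)
    return detected, len(pairs)


# Four hypothetical replicates, two targeted mutations each.
toy_replicates = [[1, 0], [0, 0], [0, 1], [0, 0]]
toy_result = pooled_pair_detection(toy_replicates)

# Eight replicates with no mutant read families at all.
wild_type_result = pooled_pair_detection([[0, 0] for _ in range(8)])
```

Eight replicates yield 28 distinct pairs, which matches the number of wild-type combinations reported in the text.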

Fig. 3 Evaluation of analytical performance in reference samples at 3 in 10^5 tumor fraction.

(A) Variant-level sensitivity and specificity across 56 reference samples and 16 mutations, requiring each mutation to be supported by ≥2 RFs and an AF consistent with ≥0.5 mutant molecules. Twenty-two mutations were analyzed in this experiment. However, six mutations were inferred to contribute biological background because these were recurrently observed in a wild-type DNA sample sourced from immortalized cell lines. These included known hotspot variants in TP53 (n = 4 of 4 targets), APC (n = 1 of 2 targets), and GNAS (n = 1 of 1 targets). These mutations were dropped from further analysis. Each row corresponds to a targeted mutation, and each column corresponds to a single sample analyzed at the identified AF. (B) Sample-level sensitivity and specificity, requiring ≥2 RFs contributed by one mutation with multiple sizes or >1 mutations, each with an AF consistent with ≥0.5 mutant molecules. Although a mutation with two RFs was observed in one wild-type sample, this mutation was supported by a single size, and at the sample level, ctDNA was determined to be undetectable. (C) Accuracy evaluated by comparison of sample AFs observed using TARDIS (mean for all 16 mutations assayed in each replicate sample) with known sample AFs (mean of known variant AFs measured using digital PCR). Blue line is linear fit to the mean at each expected AF. (D) Precision evaluated using CVs of sample-level AFs, calculated across 8 to 32 replicates.

Because limited blood volumes can be obtained clinically, a key performance metric for ctDNA assays is conversion efficiency, or the fraction of input DNA molecules that are successfully analyzed. TARDIS uses several cycles of linear pre-amplification before ligation with UMIs, and therefore, we expect the number of RFs to be several-fold higher than input haploid genome copies. To estimate effective molecular conversion for TARDIS, we leveraged multiple replicates from reference samples and inferred effective conversion by comparing observed performance (sensitivity and precision) and expected performance (based on the Poisson distribution), given expected mutation AFs, amounts of input, and sequencing coverage. Measuring 16 candidate mutations in aggregate, we found that precision improved as the number of total mutant molecules increased in the reaction (Fig. 3D). Theoretically, a linear fit to log2(CV) as a function of log2(expected number of mutant molecules) should have a slope of −0.5, because CVs decrease in proportion to the inverse square root of the number of rare mutant molecules (1/√n). In our data, we observed a slope of −0.496, consistent with this expectation. Using the intercept for the linear fit and input amounts used for each replicate, we inferred an effective molecular conversion rate of 26%. In contrast, the sample-level sensitivity observed across 32 replicates of 0.003% suggests an effective molecular conversion rate of ~39%, based on expected AFs.
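One way to read the intercept-based estimate is through a simple Poisson model: if only a fraction c of the n expected mutant molecules is analyzed, then CV = 1/√(c·n), so log2(CV) = −0.5·log2(n) − 0.5·log2(c). The sketch below fits that line and recovers c from the intercept. It is a hedged reconstruction under this stated model, not the authors' documented procedure; the function names are hypothetical and the input CVs are synthetic.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def infer_conversion(expected_molecules, observed_cvs):
    """Fit log2(CV) against log2(n) and back out the conversion fraction c.

    Poisson model (assumption): CV = 1 / sqrt(c * n), hence
    intercept = -0.5 * log2(c) and c = 2 ** (-2 * intercept).
    """
    xs = [math.log2(n) for n in expected_molecules]
    ys = [math.log2(cv) for cv in observed_cvs]
    slope, intercept = fit_line(xs, ys)
    return slope, 2.0 ** (-2.0 * intercept)


# Synthetic CVs generated with a known conversion of 26%.
molecules = [3.5, 19.8, 553.0]
synthetic_cvs = [1.0 / math.sqrt(0.26 * n) for n in molecules]
fitted_slope, fitted_conversion = infer_conversion(molecules, synthetic_cvs)
```

With noise-free synthetic input, the fit returns a slope of −0.5 and a conversion of 0.26, confirming that the intercept carries the conversion estimate under this model.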

Detection of residual disease in patients with early and locally advanced breast cancer

To evaluate whether TARDIS enables residual disease detection in patients with early and locally advanced cancer, we analyzed blood samples obtained from 33 patients with stage I to stage III breast cancer, of whom 22 patients were treated with NAT. Distributions of key clinical characteristics of the cohort are presented in Fig. 4A. Most patients presented with stage II disease (24 of 33 patients) and invasive ductal carcinoma (30 of 33 patients). Of the 33 patients, 17 had ER+HER2− cancer, 7 had HER2+ cancer, and 9 had TNBC. We performed whole-exome sequencing of DNA from diagnostic tumor biopsies and matched germline samples, achieving 177× and 143× mean coverage, respectively (table S3). Mean tumor cellularity inferred from exome sequencing analysis was 41% (range, 14 to 85%). A mean of 700 somatic mutations (range, 147 to 4904) were called across tumor samples, and we identified a mean of 65.8 putative founder mutations per patient (range, 9 to 286). Using an aggressive filtering strategy during primer design and after excluding primers amplifying erroneously in validation experiments with control samples, we analyzed 80 serial plasma samples obtained from 33 patients (1 to 4 samples per patient, data file S3) for 6 to 115 mutations per patient (mean, 30 mutations; median, 18 mutations). Across this cohort, validated TARDIS assays covered a mean of 55% putative founder mutations found in the corresponding tumor samples (range, 5 to 90%; median, 67%). Plasma samples were collected before starting therapy, during NAT, and after completion of NAT before surgery. Input plasma DNA amounts were 4.8 to 34.5 ng per sample (mean, 14.1; median, 12.7), obtained from 1.2 to 4.7 ml of plasma (mean, 3.8; median, 4.0). Median total plasma DNA concentration was 5.3 ng/ml of plasma (range, 0.83 to 165.1; mean, 8.5). The distribution of total plasma DNA concentration was indistinguishable from that in healthy volunteers and significantly lower than that in patients with metastatic cancer (Wilcoxon rank-sum P = 0.0789 and P = 2.2 × 10^−16, respectively, fig. S3).

Fig. 4 ctDNA analysis in patients with early and locally advanced breast cancer before treatment and after completion of NAT.

(A) Clinical characteristics of the cohort. (B) Summary of results, tumor stage, grade, mitotic rate, subtype, ctDNA detection before treatment and after NAT, and residual disease assessment. Pathological staging was performed after surgery and completion of NAT. NA, not available or not applicable; IDC, invasive ductal carcinoma; ILC, invasive lobular carcinoma; ypTis, in situ disease; ypT1-3 and ypN1-3, tumor and nodal stage upon pathological staging. (C) ctDNA fraction at baseline. (D) ctDNA fraction after completion of NAT, grouped by clinical response to treatment (residual disease versus pathological complete response). (E) Changes in pre- and posttreatment ctDNA fraction in patients with residual disease and pathCR.

Before treatment, we detected ctDNA in 32 of 32 patients (Fig. 4B) at tumor fractions of 0.002 to 1.06% (mean, 0.23%; median, 0.11%; Fig. 4C), supported by 2 to 53 distinct mutation events (mean, 10.2; median, 7.0) and 3 to 1638 mutant RFs (mean, 217.5; median, 54.5; data file S4). To ensure that analysis of multiple mutations did not increase false positives, we performed multiple-testing correction (Bonferroni correction) and required a corrected P < 0.05 for each sample. Baseline plasma sequencing failed in one patient (E009). Plasma samples after completion of NAT were analyzed in 22 patients. ctDNA was detected in 17 of 22 patients, including 12 of 13 patients with invasive or in situ residual disease and 5 of 9 patients with pathCR (no evidence of tumor cells in the resected tissue). In one patient with residual disease (pathological stage T3N1 after NAT, patient T065), ctDNA was undetectable in the last blood sample after completion of NAT, likely due to a combination of limited plasma DNA available for analysis (8.7 ng compared to a mean of 16.8 ng for samples obtained at similar time points) and limited number of targets analyzed (11 compared to a mean of 30 across the entire cohort). We calculated the theoretical maximum number of molecules that could be analyzed for each sample (the product of input haploid genome copies and number of mutations targeted). For patient T065, the maximum number of analyzed molecules in the plasma DNA sample after completion of therapy was 26,273, the lowest among the post-NAT samples from all patients where the mean was 92,484 molecules. We excluded T065 from further analysis of samples after completion of NAT. In patients with detectable ctDNA after NAT, tumor fraction was 0.003 to 0.045% (mean, 0.018%; median, 0.016%), supported by one to seven distinct mutation events (mean, 3.6; median, 4.0) and 2 to 82 mutant RFs (mean, 18.8; median, 13). Median ctDNA concentrations after completion of NAT were 5.7-fold lower in patients who achieved pathCR compared to patients with residual disease (median AFs, 0.003% versus 0.018%, respectively, Wilcoxon rank-sum one-sided P = 0.0057, Fig. 4D). We observed a decrease in ctDNA after NAT compared to pretreatment concentrations in all but two patients (median decrease of 84%, Wilcoxon signed-rank paired one-sided P = 9.5 × 10^−6, Fig. 4E). In patients who achieved pathCR, median decrease in ctDNA was 96%, whereas in patients with residual disease observed at surgery, median decrease was 77% (Wilcoxon rank-sum one-sided P = 0.055). Temporal changes in variant AFs for multiple mutations within each patient agreed with each other, unless affected by sampling variation as ctDNA decreased during treatment (fig. S4). Using ctDNA concentrations after NAT to differentiate the two groups, we achieved an area under the curve (AUC) of 0.83 (n = 21 patients, Fig. 5). The AUC improved to 1 and 0.89 when patients were analyzed as TNBC and ER+ breast cancer subgroups, respectively (n = 9 and 11 patients, fig. S5).
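For readers less familiar with the AUC statistic, it equals the probability that a randomly chosen patient with residual disease has a higher post-NAT ctDNA fraction than a randomly chosen patient with pathCR, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The ctDNA values are invented for illustration and are not the study data, and the function name is hypothetical.

```python
def roc_auc(positives, negatives):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic / (n1 * n2)).

    `positives`: scores for patients with residual disease.
    `negatives`: scores for patients with pathCR.
    """
    if not positives or not negatives:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical post-NAT ctDNA fractions (%), NOT taken from the study.
residual_disease = [0.02, 0.01, 0.03]
path_cr = [0.0, 0.01, 0.005]
toy_auc = roc_auc(residual_disease, path_cr)
```

An AUC of 1 means every patient with residual disease has a higher ctDNA fraction than every patient with pathCR, whereas 0.5 corresponds to no discrimination between the groups.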

Fig. 5 Receiver operating characteristic curve for predicting residual disease using ctDNA fraction after completion of NAT.

DISCUSSION

Patients with early and locally advanced cancers are increasingly treated with neoadjuvant systemic therapy to downstage their tumors and to improve outcomes of subsequent localized treatment. Across some cancer subtypes such as breast, rectal, and esophageal cancers, 20 to 30% of the patients achieve pathCR after NAT, which means that no evidence of tumor cells is found in surgically resected tissue (2, 17, 18). Achieving pathCR is associated with good prognosis, but histopathological evaluation of surgically resected tissue remains the only reliable method to establish pathCR. Imaging and clinical assessment of response have been unable to predict pathCR with high accuracy, and no circulating biomarkers have been informative in this setting (4, 5). Our results reveal that ctDNA concentrations after completion of NAT for breast cancer are higher in patients with residual disease at the time of surgery compared to patients with pathCR.

Several earlier studies have evaluated whether ctDNA analysis can be informative of response to NAT in breast cancer. However, these studies were limited in sensitivity and precision because ctDNA concentrations in patients with nonmetastatic cancer are extremely low. In our study, ctDNA was detected in 100% of patients with early and locally advanced breast cancer before treatment (95% confidence interval, 89 to 100%), improving on earlier reports of 50 to 75% ctDNA detection at baseline (7, 19). High sensitivity for ctDNA detection before treatment is a prerequisite for any approach used for residual disease testing because tumor burden is generally higher at presentation. Median pretreatment ctDNA concentration in our study was 0.11%, about 25 to 100 times lower than ctDNA concentrations reported in patients with metastatic breast cancer (13, 20). After completion of NAT, earlier reports suggest that ctDNA concentrations fall below the limit of detection in >90% of patients, regardless of residual disease status (10–12, 21, 22). In our study, median ctDNA concentrations after NAT were 0.017 and 0.003% in patients with residual disease and pathCR, respectively.

To achieve sensitivity and quantitative precision required for ctDNA analysis in patients with nonmetastatic cancer, we have developed a method for tumor-guided ctDNA analysis that leverages multiple mutations together with improvements in sequencing library preparation and informatics analysis. Earlier studies investigating ctDNA quantification for longitudinal treatment monitoring have targeted single recurrent or patient-specific mutations using digital PCR or digital sequencing (8, 20). Although these approaches are informative of large changes in ctDNA concentration during treatment, ctDNA typically becomes undetectable in mid-treatment samples even in patients with metastatic cancer who have measurable disease on imaging (20). To improve sensitivity and overcome limited input, plasma DNA can be sampled at multiple genomic loci simultaneously. One approach is to analyze a panel of recurrent cancer genes with high coverage targeted sequencing and to integrate results from multiple mutations in each patient (6, 23, 24). However, this approach typically does not yield more than two to four mutations per patient, limiting the maximum sensitivity achieved regardless of depth of sequencing (Fig. 1A). More recently, analysis of multiple patient-specific mutations pre-identified in the tumor tissue has emerged as an alternative approach, including amplicon sequencing of dozens of mutations, hybrid capture enrichment of sequencing libraries for dozens to thousands of mutations, and whole-genome sequencing (WGS). At various stages of development, these approaches generally improve on the current limit of detection for ctDNA analysis (~0.1% mutation AF), but each approach has some limitations. Conventional multiplexed PCR-based approaches have been limited by the high background error rates observed (9, 22). 
An alternative is to incorporate UMIs during the first few cycles of PCR to overcome background errors, but this limits molecular conversion because template DNA molecules not incorporated within the first two to three cycles are excluded from further analysis (16, 25, 26). In addition, it limits multiplexing capacity and requires optimization for patient-specific assays. In contrast, ligation-based sequencing library preparation enables a wider analysis of the genome but has limited molecular conversion and loses up to 90% of template DNA molecules due to inefficient ligation (27, 28). Personalized hybrid capture enrichment can overcome this limitation by incorporating thousands of mutations, but such a high number of mutations are only found in a few tumor types, and identifying them requires WGS of high-cellularity tumor samples and corresponding normal tissue. Together with synthesis of customized hybrid capture biotinylated baits for each patient, this approach is currently very expensive. An even wider analysis can be performed by WGS of plasma DNA, either by direct counting of mutated DNA molecules across the genome or by integration of genome-wide patient-specific mutational signatures. However, both approaches require WGS of high-cellularity tumor tissue upfront, and WGS of plasma DNA at the required depth of coverage remains prohibitively expensive for a ctDNA test that may be repeated multiple times during clinical follow-up.

In contrast to efforts highlighted above, TARDIS combines the strengths of PCR-based methods (minimizing losses of template DNA molecules) and ligation-based methods (incorporation of UMIs, preservation of fragment sizes, and hundred-fold multiplexing). This combination achieves a balance between depth and breadth of tumor genome analyzed, investigating dozens to hundreds of patient-specific mutations with deep coverage. TARDIS assays require design, synthesis, and empirical validation of patient-specific primer panels. We have streamlined and automated the design process to successfully target 55% of putative founder mutations per patient on average. We rely on routine primer synthesis with standard purification and need a limited sequencing footprint, making our approach cost-effective and enabling frequent and longitudinal analysis of plasma samples. To identify target mutations, TARDIS requires exome sequencing of tumor DNA from diagnostic tumor biopsies. Compared to WGS, exome sequencing is clinically more feasible in the foreseeable future, generates greater depth of coverage, and enables confident identification of putative founder mutations even in lower-cellularity tumor samples. At our institution, exome sequencing is routinely performed within 2 weeks of receiving a tumor specimen. Using automated informatics pipelines, a TARDIS assay can be designed, synthesized, and empirically validated for each patient within 1 to 2 weeks thereafter. Hence, the total turnaround time for development of a patient-specific assay is 3 to 4 weeks after a diagnostic biopsy, well within the time frame required for clinical decision-making for patients with cancer treated with NAT.

To aggregate multiple patient-specific mutations and improve detection sensitivity and quantitative precision for ctDNA analysis, we target founder mutations that are shared by all tumor cells and are equally informative. Subclonal mutations are more likely to be lost due to population bottlenecks during treatment and become uninformative for residual disease detection (9, 14). Using a combination of founder and subclonal mutations may lower the real-world sensitivity of the assay, although tumor specificity will remain unaffected. Similarly, an aggregate ctDNA fraction calculated using a mix of founder and subclonal mutations may not reflect true tumor burden. Varying contributions of founder and subclonal mutations can complicate both assessment of longitudinal changes in ctDNA within a patient’s clinical course and comparison of ctDNA concentrations across a cohort of patients. Definitive identification of founder mutations requires multisite sequencing, but obtaining multiple biopsies remains clinically challenging. In the current study, we have combined two informatics approaches to maximize the fraction of targeted mutations likely to be founder.

We also report extensive evaluation of analytical performance using commercially available reference samples. Sequencing library preparation typically loses the large majority of input DNA material during early steps such as ligation of adapters. This is particularly challenging for ctDNA analysis because limited blood volumes can be accessed clinically and plasma DNA concentrations are low. We tried to overcome this challenge, while keeping any polymerase-induced errors in check, by using linear pre-amplification of input DNA. To measure our efficiency of molecular conversion, we used an approach based on sensitivity and reproducibility across dozens of replicates of known reference samples. We compared observed sensitivity and precision with expected values based on Poisson distribution and inferred conversion efficiency of 39 and 26%, respectively. This approach measures effective conversion using real-world performance metrics, instead of relying on molecular coverage in the targeted region, a metric that is susceptible to molecular and informatics artifacts due to sequencing and polymerase-induced errors within UMIs and tag switching (polymerase-induced recombination of UMIs). We propose benchmarking of current and future methods for ctDNA analysis using a similar approach, which can also be applied to non–UMI-based methods such as conventional amplicon sequencing. The discrepancy between the two conversion estimates for our data likely results from higher estimates of CVs due to variable conversion efficiencies between targeted mutations. These could result from differences in efficiency of linear and exponential amplification, fragment sizes, local secondary structures, and depth of sequencing coverage.
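The comparison of observed performance with Poisson expectations can be illustrated with a minimal sketch. It assumes that a mutation is detected when at least `min_molecules` converted mutant molecules are present, and that counting noise alone sets the CV. The function names and the bisection search are illustrative assumptions, not the procedure used in the study.

```python
import math

def poisson_detection_prob(mean_molecules, min_molecules=1):
    """P(X >= min_molecules) for X ~ Poisson(mean_molecules)."""
    below = sum(
        math.exp(-mean_molecules) * mean_molecules ** i / math.factorial(i)
        for i in range(min_molecules)
    )
    return 1.0 - below

def efficiency_from_sensitivity(observed_sensitivity, input_mutant_molecules,
                                min_molecules=1):
    """Find the conversion efficiency in [0, 1] whose Poisson-expected
    sensitivity matches the observed sensitivity (bisection search)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        expected = poisson_detection_prob(mid * input_mutant_molecules, min_molecules)
        if expected < observed_sensitivity:
            lo = mid  # too few converted molecules to explain the sensitivity
        else:
            hi = mid
    return (lo + hi) / 2

def efficiency_from_cv(observed_cv, input_mutant_molecules):
    """Under pure Poisson noise, CV = 1 / sqrt(efficiency * N),
    so efficiency = 1 / (CV^2 * N)."""
    return 1.0 / (observed_cv ** 2 * input_mutant_molecules)
```

Any variability beyond Poisson counting noise inflates the observed CV, so the precision-based estimate would be expected to fall below the sensitivity-based one, as it does in the two estimates reported above.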

Our results demonstrate potential applications of ctDNA analysis for monitoring response in neoadjuvantly treated patients with cancer. We have shown that ctDNA concentrations after completion of NAT are associated with pathCR in early and locally advanced breast cancer. Together with imaging and clinical assessment, ctDNA concentrations may guide treatment strategy in individual patients, such as the choice and extent of local treatment (surgical resection or radiation). The threshold for ctDNA concentrations predictive of residual disease will likely vary between clinical subtypes of breast cancer and between cancer types. Larger clinical studies will be needed to validate our findings and to refine clinically relevant diagnostic thresholds. We have also observed a decrease in ctDNA during NAT, which was greater in magnitude when patients achieve pathCR. This highlights the utility of improved quantitative precision achieved using a multimutation assay. Future studies could evaluate whether the magnitude of early decrease in ctDNA concentration during neoadjuvant treatment is informative of therapeutic benefit, enabling adaptive treatment designs to rapidly identify systemic treatment options that work for individual patients. Overall, ctDNA analysis using sensitive and accurate approaches such as TARDIS can enable development of clinical strategies for individualized management of patients treated with curative intent.

A limitation of this study is that we were unable to detect ctDNA in one patient with residual disease after completion of NAT despite high-volume residual disease. This was most likely due to a combination of low plasma DNA concentration and a limited number of mutations assayed for this patient. ctDNA was detected in this patient in three other plasma samples collected before and at 6 and 12 weeks on treatment. Potential approaches to overcome this limitation in future clinical studies include targeting a greater number of putative founder mutations and analyzing larger blood volumes. Although, in the current study, we analyzed up to 4 ml of plasma obtained from 10 ml of blood samples, it is conceivable to collect up to 30 ml of blood at a single time point. It is also feasible in future studies to collect and analyze plasma samples over multiple days after completion of therapy to increase sensitivity for residual disease.

Overtreatment of patients with early stage cancer remains a challenge in cancer medicine, likely to become more relevant as newer blood- and imaging-based early detection approaches gain credence (29). Most efforts to optimize treatments have focused on tissue-based predictive biomarkers to assess risk of tumor recurrence (30). Our results suggest that blood-based residual disease testing during treatment can further help individualize the choice and extent of each treatment modality. Establishing clinical validity and utility for ctDNA monitoring and residual disease detection will require larger and prospective studies with long-term clinical follow-up. Once validated, using residual disease detection to individualize cancer management could substantially reduce treatment-related morbidity while preserving clinical outcomes.

MATERIALS AND METHODS

Study design

The aim of this study was to develop molecular methods to improve limit of detection and quantitative precision for tumor-guided ctDNA analysis. TARDIS was developed and optimized using commercially available reference samples for cfDNA analysis. The analytical performance of TARDIS was demonstrated in replicate reactions with known mutation AFs. This was followed by a retrospective proof-of-principle clinical study with prospectively enrolled patients with stage I to stage III breast cancer. Prior power analysis, randomization, or blinding was not performed for the clinical study.

Patients and samples

This study includes patients prospectively enrolled at Mayo Clinic, Phoenix, AZ, USA, under an approved institutional review board (IRB) protocol number 14-006021 (Mayo cohort), at Addenbrooke’s Hospital, Cambridge, UK, under an approved Research Ethics Committee protocol number 12/EE/0484 (Cambridge cohort), and at City of Hope, Duarte, CA, USA, under an approved IRB protocol number 96144 (COH cohort). Informed consent was obtained from all patients. Tumor samples obtained at the time of diagnosis were exome-sequenced. Blood samples were collected before starting treatment and in a subset of patients, after completion of NAT before surgical resection. In the Cambridge cohort, additional blood samples were collected at 6 and 12 weeks during neoadjuvant treatment.

DNA extraction from tumor and germline samples

For the Mayo cohort, tumor DNA was extracted from four 10-μm sections obtained from archived formalin-fixed paraffin-embedded tissue using the MagMAX FFPE DNA/RNA Ultra Kit (Thermo Fisher Scientific), after macrodissection to enrich for tumor cells guided by a hematoxylin and eosin–stained tumor section. For the Cambridge cohort, tumor DNA was extracted from ten 30-μm sections obtained from fresh frozen tumor tissue using the DNeasy Blood and Tissue Kit (Qiagen). Germline DNA was extracted from peripheral blood cells using the DNeasy Blood and Tissue Kit (Qiagen). For the COH cohort, tumor DNA was extracted from five 10-μm sections obtained from archived formalin-fixed paraffin-embedded tissue using the GeneRead DNA FFPE Kit (Qiagen). Germline DNA was extracted from peripheral blood cells using the FlexiGene DNA Kit (Qiagen).

Plasma processing, DNA extraction, and quality assessment

For the Mayo and Cambridge cohorts, blood was collected in 10-ml K2 EDTA tubes and centrifuged at 820g for 10 min within 3 hours of venipuncture to separate plasma. One-milliliter aliquots of plasma were centrifuged a second time at 16,000g for 10 min to pellet any remaining leukocytes, and the supernatant plasma was stored at −80°C. For the COH cohort, blood was collected in Streck Cell-Free BCT Tubes (Streck) and centrifuged twice to separate plasma. The first spin was at 1600g for 15 min at 25°C. The plasma was then aliquoted and centrifuged again for 10 min at 2500g at 25°C. cfDNA was extracted using either the QIAsymphony DSP Circulating DNA Kit (Qiagen) or the MagMAX Cell-Free DNA Isolation Kit (Thermo Fisher Scientific). All cfDNA samples were evaluated for yield and quality using ddPCR, as described previously (31).

Tumor/germline exome sequencing

For the Mayo and COH cohorts, tumor/germline exome sequencing libraries were prepared using the KAPA Hyper Prep Kit following the manufacturer’s instructions. Exome enrichment through hybridization was performed using a customized version of Agilent SureSelect V6 exome. For the Cambridge cohort, tumor and germline exome libraries were generated using the Illumina Nextera Rapid Capture Exome Library Preparation Kit. We pooled exome libraries and sequenced on Illumina HiSeq or NovaSeq.

Variant calling in tumor exomes and identification of target mutations

Reads were aligned to human genome version hg19 using BWA-MEM (32), followed by base recalibration using Genome Analysis Toolkit (GATK) (33), duplicate identification using Picard Tools MarkDuplicates, and indel realignment using GATK. Germline mutations were inferred using GATK HaplotypeCaller and Freebayes (34). Somatic tumor mutations were called using MuTect (35), Seurat (36), and Strelka (37). Somatic mutations with an allele frequency <5% were removed.

Identification of putative founder mutations

Potential target mutations found on autosomes were assessed for copy number, purity, and variant allele frequency (VAF). We used Sequenza to infer both the proportion of tumor cells in the sequenced tumor DNA sample and copy number alterations in the tumor (38). For each mutation, the mean VAF from the variant callers, sample purity, and local copy number were used to infer its cancer cell fraction (CCF) via two different methods: an implementation of the algorithm from McGranahan et al. (39) and PyClone (40). For each sample, the VAF, minor and major copy number, and purity were used as input for PyClone analysis with 25,000 iterations, including 10,000 iterations of burn in.
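The relationship between VAF, purity, local copy number, and CCF can be sketched as follows. This is a simplified illustration with a flat prior over a grid of CCF values and an assumed mutation multiplicity of one copy. It is not the implementation of the McGranahan et al. algorithm or of PyClone used in the study, and the function names are hypothetical.

```python
import math

def expected_vaf(ccf, purity, tumor_cn, multiplicity=1, normal_cn=2):
    """Expected VAF of a mutation present in a fraction `ccf` of cancer cells."""
    tumor_alleles = purity * tumor_cn
    normal_alleles = (1.0 - purity) * normal_cn
    return purity * multiplicity * ccf / (tumor_alleles + normal_alleles)

def ccf_interval(alt_reads, total_reads, purity, tumor_cn,
                 multiplicity=1, level=0.95):
    """Return (best, lower, upper) CCF from a binomial likelihood evaluated
    on a grid of CCF values 0.01..1.00 (flat prior; an assumption)."""
    grid = [i / 100 for i in range(1, 101)]
    log_lik = []
    for ccf in grid:
        p = min(expected_vaf(ccf, purity, tumor_cn, multiplicity), 1.0 - 1e-9)
        log_lik.append(alt_reads * math.log(p)
                       + (total_reads - alt_reads) * math.log(1.0 - p))
    peak = max(log_lik)
    weights = [math.exp(v - peak) for v in log_lik]
    total = sum(weights)
    posterior = [w / total for w in weights]

    tail = (1.0 - level) / 2
    cumulative, lower, upper = 0.0, None, grid[-1]
    upper_found = False
    for ccf, prob in zip(grid, posterior):
        cumulative += prob
        if lower is None and cumulative >= tail:
            lower = ccf
        if not upper_found and cumulative >= 1.0 - tail:
            upper, upper_found = ccf, True
    best = grid[posterior.index(max(posterior))]
    return best, lower, upper
```

For example, 50 variant reads out of 100 in a pure diploid sample give an interval whose upper bound reaches 1.0, consistent with a founder mutation, whereas 10 variant reads out of 100 give an upper bound well below 1.0.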

Founder mutations were identified using a set of criteria for mutation confidence and maximum CCF. To qualify as a target for ctDNA analysis, a mutation must have been identified by at least two somatic mutation callers, have a mean germline coverage of >20 reads and tumor coverage of >50 reads passing each mutation caller’s filters, and have a germline VAF <0.01%. In addition, the upper range of the CCF distribution calculated using the McGranahan et al. approach must be equal to 1.0, and the mutation must be found in the highest CCF PyClone mutation cluster.
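The selection criteria above can be expressed as a single filter. The record layout and field names below are hypothetical; only the thresholds come from the text. VAFs are stored as fractions, so 0.01% corresponds to 0.0001.

```python
from collections import namedtuple

# Hypothetical per-mutation record; field names are illustrative only.
Candidate = namedtuple("Candidate", [
    "callers",                 # set of somatic callers reporting the mutation
    "germline_depth",          # mean germline coverage (reads)
    "tumor_depth",             # tumor coverage passing caller filters (reads)
    "germline_vaf",            # fraction, not percent
    "ccf_upper",               # upper bound of the CCF distribution
    "in_top_pyclone_cluster",  # True if in the highest-CCF PyClone cluster
])

def is_founder_target(m):
    """Apply the confidence and CCF criteria for a putative founder mutation."""
    return (
        len(m.callers) >= 2
        and m.germline_depth > 20
        and m.tumor_depth > 50
        and m.germline_vaf < 0.0001
        and m.ccf_upper >= 1.0 - 1e-9   # "equal to 1.0", with float tolerance
        and m.in_top_pyclone_cluster
    )
```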

Primer design for TARDIS

Mutations that passed the filtering steps above were used as targets for TARDIS primer design. The primer design process was focused on maximizing TARDIS performance and minimizing spurious amplification, particularly in the linear pre-amplification stage. We first generated primers on the forward or reverse strands up to 350 base pair (bp) from the target mutation position for both linear and exponential amplification reactions (primers 1 and 2 for each targeted locus) (41). Primer 1 melting temperature (Tm) range was set to 68° to 74°C, and primer 2 Tm range was 56° to 60°C, with primer 1 upstream and a maximum of 3-bp overlap allowed between primers 1 and 2. During primer selection, we minimized the distance between the 3′ end of primer 2 and the target mutation position to ensure that short mutant molecules are captured efficiently. To avoid erroneous variants caused by primer synthesis overhangs, we also required a minimum 3-bp distance to the target mutation. To avoid unintended amplification in multiplexed PCR reactions, we used a combination of in silico PCR, sequence comparison to the genome using LAST (42), and 3′ primer kmer matching to identify problematic primers for multiplexing. Primer 1s with more than two LAST matches outside the target region were excluded, along with primer 2s with any LAST off-target matches. All combinations of potential primer 1s were analyzed using in silico PCR. Next, we built a graph in which nodes represented primers and edges linked pairs of primers with predicted PCR products. The nodes were sorted by number of edges, and we iteratively removed the node with the most edges if it was not the last primer 1 for a given target. This process continued until there were no remaining edges or until all targets only had a single primer 1 remaining. If there were multiple remaining primer 1s for a given target, the one with the fewest kmer matches to other target regions was selected. 
This process was repeated for primer 2s, except the best primer after graph analysis was selected on the basis of minimizing distance to the target mutation rather than kmer matches. A test run of TARDIS using each primer panel was conducted with eight replicates of sheared genomic DNA before analyzing plasma samples to identify any remaining problematic primers. A target was removed from the panel before analysis of plasma samples if median proportion >0.5 or maximum proportion >0.75 of the reads for that target were masked or if the target captured a median proportion >0.5 or maximum proportion >0.75 of all reads in any control run. For finalized TARDIS assays, mean distance between the 3′ end of primers 1 and 2 and the target locus was 56.0 and 31.0 bp, respectively (median, 55.0 and 27.0 bp; SD, 17.5 and 13.2 bp, respectively).
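The graph-based pruning step can be sketched in a few lines. The example assumes that predicted PCR products are supplied as pairs of primer names and ignores self-interactions. The function name, the input format, and the tie-breaking rule are assumptions made for illustration. The subsequent choice among surviving primers (by kmer matches or distance to the mutation) is not shown.

```python
def prune_primers(primer_target, interactions):
    """Iteratively drop the most-connected primer while its target still has
    another candidate primer, until no removable interacting primer remains.

    primer_target: dict mapping primer name -> target name
    interactions:  iterable of (primer_a, primer_b) with a predicted product
    Returns the sorted list of primers kept.
    """
    neighbors = {p: set() for p in primer_target}
    for a, b in interactions:
        if a != b:  # self-interactions are ignored in this sketch
            neighbors[a].add(b)
            neighbors[b].add(a)

    remaining = {}
    for target in primer_target.values():
        remaining[target] = remaining.get(target, 0) + 1

    while True:
        removable = [
            p for p in neighbors
            if neighbors[p] and remaining[primer_target[p]] > 1
        ]
        if not removable:
            break
        # Most edges first; the name breaks ties so the result is deterministic.
        worst = max(removable, key=lambda p: (len(neighbors[p]), p))
        for other in neighbors.pop(worst):
            neighbors[other].discard(worst)
        remaining[primer_target[worst]] -= 1
    return sorted(neighbors)
```

A primer that is the only remaining candidate for its target is never removed, so some interactions can persist. This mirrors the stopping rule described above, in which pruning ends once every target has a single primer left.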

Preparation of TARDIS sequencing libraries

TARDIS sequencing libraries were prepared using target-specific linear pre-amplification, ligation, one to two rounds of target-specific exponential amplification, and barcoding PCR. TARDIS reactions were set up using up to 20 ng of template plasma DNA in 10-μl volume for linear pre-amplification. For some plasma samples, input DNA was split into two TARDIS reactions, and the results were combined informatically. For each TARDIS run, patient-specific primers were pooled equimolarly. For pre-amplification, each primer 1 pool was used at a final concentration of 0.5 to 1.0 μM. Linear pre-amplification was performed using Kapa HiFi HotStart ReadyMix (Kapa Biosystems) at the following thermocycling conditions: 95°C for 5 min followed by 50 cycles at 98°C for 20 s, 70°C for 15 s, 72°C for 15 s, and 72°C for 1 min. This reaction was followed by a magnetic bead cleanup (SPRIselect, Beckman Coulter) at 1.8× ratio after addition of 10% ethanol. Pre-amplified DNA was eluted in 10 μl of water. After dephosphorylation using FastAP (Thermo Fisher Scientific), 0.8 μl of 100 μM ligation adapter was added to each sample. The sequence of the hairpin oligonucleotide used for single-stranded DNA ligation is provided in table S4 and was adapted from Kwok et al. (43). Samples were denatured at 95°C for 5 min and immediately transferred to an ice bath for at least 2 min. We set up ligation reactions using 2.5 μl of 10× T4 DNA Ligase buffer (New England Biolabs), 2.5 μl of 5 M betaine, 2000 U of T4 DNA ligase (New England Biolabs), and 5.8 μl of 40 to 60% polyethylene glycol (PEG) 8000. Ligation was performed at 16°C for 16 to 24 hours. A magnetic bead cleanup (SPRIselect) was performed at 1× buffer ratio after initially diluting the sample by adding 20 to 40 μl of water (to reduce effective PEG concentration during cleanup). An additional dephosphorylation was performed using FastAP.

Exponential PCR was performed in two rounds. In both rounds, a universal reverse primer was used, complementary to the ligated adapter and upstream of the UMI (see table S4 for primer sequences). On the target-specific end, primer 1 pools were used for the first round, and primer 2 pools were used for the second round. When the total number of targeted mutations exceeded 30, 2 μl of amplified DNA from round 1 was split across multiple round 2 reactions of ~30 targets each. In a subset of samples, only the second round of exponential amplification was performed using total ligated DNA. Primers were pooled equimolarly and used at a final pool concentration of 0.5 μM. Round 1 amplification was performed using Kapa HiFi HotStart ReadyMix with the following thermocycling conditions: 95°C for 5 min followed by 5 cycles at 98°C for 20 s and 65°C for 2 min, and 15 cycles at 98°C for 20 s, 65°C for 15 s, and 72°C for 15 s, followed by a 1-min incubation at 72°C. Round 2 amplification was performed using NEBNext Q5 Hot Start HiFi PCR Mastermix (New England Biolabs) with the following thermocycling conditions: 98°C for 1 min followed by 5 cycles at 98°C for 10 s and 61.5°C for 4 min, and 15 cycles at 98°C for 10 s, 61.5°C for 30 s, and 72°C for 20 s, followed by a 2-min incubation at 72°C. Intervening and final magnetic bead cleanups were performed at 1.7× volume ratio (SPRIselect), and products were eluted in 20 to 40 μl of water.

Barcoding PCR was performed using universal primers to introduce sample-specific barcodes and complete sequencing adaptors, as described previously (14). We used 1 U per reaction of Platinum Taq DNA Polymerase High Fidelity (Invitrogen) in the following buffer: 1.3× Platinum buffer, 0.4 M betaine, 2.5 μl per reaction of dimethyl sulfoxide, 0.45 mM deoxynucleotide triphosphates, 1.75 mM MgSO4, and primers at 0.5 μM. Ten microliters of the product from exponential amplification was used as template, at the following thermocycling conditions: 94°C for 2 min followed by 15 cycles at 94°C for 30 s, 56°C for 30 s, and 68°C for 1 min, and a final incubation at 68°C for 10 min. A final magnetic bead cleanup (SPRIselect) was performed at 1.2× volume ratio. TARDIS libraries were eluted in 20 μl of DNA suspension buffer, quantified using fluorometric and electrophoretic assays, and pooled for sequencing. Sequencing was performed on Illumina HiSeq or Illumina NextSeq.

Analysis of TARDIS sequencing data

Paired-end sequencing reads were aligned to human genome hg19 using BWA-MEM. Read pairs whose R1 read mapped to the start position of a target primer were considered on-target reads, and the position of the R2 read was used to determine the length of the template molecule. The UMI sequence and molecule size were used to identify all of the reads that came from the same template molecule. To minimize incorrect assignment of reads to RFs, we implemented a directed adjacency graph approach inspired by Smith et al. (44). Briefly, a graph was constructed in which each UMI was a node. An edge from node A to node B was created if their UMIs differ by one base, their DNA molecule size was the same, and node A had at least twice as many reads as node B. All of the reads from UMIs in each component of the resulting graph constituted an RF and were considered to have come from the same original molecule. UMI variation within an RF was assumed to arise due to PCR or sequencing error. We found that a small number of UMI nodes with very few reads had incoming edges from multiple otherwise separate components. These nodes could not be assigned to a single component unambiguously and reduced the number of independent components in the graph. To resolve this issue, any UMI that had two or more incoming edges and no outgoing edges was removed. We then inferred the allele at the target position by consensus of all R1 reads in a given component, requiring that at least 90% of the R1 reads carried a particular allele at the position of interest. In practice, the vast majority of RFs contained fewer than 10 reads, and therefore required perfect agreement at the target position. Inferred molecules with less than 90% read support for any allele were considered inconclusive (mixed RFs).
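The read family construction described above can be sketched as follows. The input format (a mapping from UMI and fragment size to read count) and the function names are assumptions made for illustration. The pairwise comparison is quadratic in the number of UMIs and is meant for clarity rather than speed.

```python
from collections import Counter, defaultdict

def differ_by_one(a, b):
    """True if two equal-length UMIs differ at exactly one base."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def build_read_families(umi_counts):
    """umi_counts: dict mapping (umi, fragment_size) -> number of reads.
    Returns a list of sets; each set of nodes is one read family (RF)."""
    nodes = list(umi_counts)
    out_edges, in_edges = defaultdict(set), defaultdict(set)
    for a in nodes:
        for b in nodes:
            if (a[1] == b[1] and differ_by_one(a[0], b[0])
                    and umi_counts[a] >= 2 * umi_counts[b]):
                out_edges[a].add(b)   # directed edge A -> B
                in_edges[b].add(a)

    # Drop UMIs with >= 2 incoming edges and no outgoing edges.
    ambiguous = {n for n in nodes if len(in_edges[n]) >= 2 and not out_edges[n]}
    kept = [n for n in nodes if n not in ambiguous]

    neighbors = {n: set() for n in kept}
    for a in kept:
        for b in out_edges[a]:
            if b in neighbors:
                neighbors[a].add(b)
                neighbors[b].add(a)

    families, seen = [], set()
    for start in kept:
        if start in seen:
            continue
        stack, family = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            family.add(node)
            stack.extend(neighbors[node] - seen)
        families.append(family)
    return families

def consensus_allele(alleles, min_fraction=0.9):
    """Allele carried by at least `min_fraction` of R1 reads, else None (mixed RF)."""
    allele, count = Counter(alleles).most_common(1)[0]
    return allele if count / len(alleles) >= min_fraction else None
```

With fewer than 10 reads in a family, a single discordant read brings support below 90%, which is why most RFs effectively require perfect agreement.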

To ascertain ctDNA detection in a sample, we required support of at least two RFs across all mutations covered by at least 100 total RFs. For any mutation supporting ctDNA detection, we required that its AF (mutant RFs/total RFs) represent at least 0.5 mutant molecules in the reaction. In addition, the ratio between the number of RFs supporting a mutation and the number of mixed RFs observed at that locus had to be <10. If only one mutation supported ctDNA detection, we required at least two independent RF sizes (to ensure at least two distinct ligated molecules). This requirement was waived if >1 mutation supported ctDNA detection. For each mutation observed, the probability of encountering the number and fragment sizes of mutant RFs was calculated using a distribution of background errors (see below). For each sample, the combined probability of mutations detected was calculated and corrected for multiple testing using the Bonferroni approach to account for the number of mutations analyzed in each TARDIS panel. Sample-level ctDNA detection was confirmed if the Bonferroni-corrected P value was <0.05. To quantify ctDNA in a sample, we calculated mean AFs over all targeted mutations. Because not all sequenced molecules may receive enough reads to form RFs, AF for each mutation was calculated as the proportion of all reads that contained the target variant. For mutations not confidently detected (<1 mutant RF, a ratio of mutant RFs to mixed RFs of ≥10, or <0.5 mutant molecules), AFs were set to zero before calculating the sample-level mean ctDNA concentration.
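The RF support requirement and the sample-level quantification can be sketched as follows. The record layout, the function names, and the `input_copies` parameter (number of genome copies in the reaction) are hypothetical. The mixed-RF ratio filter is passed in as a precomputed flag because its handling of loci with no mixed RFs is not specified here. The fragment-size requirement and the probability test are not shown.

```python
from collections import namedtuple

# Hypothetical per-mutation summary; field names are illustrative only.
MutationCounts = namedtuple("MutationCounts", [
    "mutant_rfs", "total_rfs", "mutant_reads", "total_reads",
    "passes_mixed_rf_filter",
])

def has_minimum_support(mutations, min_total_rfs=100, min_mutant_rfs=2):
    """At least two mutant RFs across mutations covered by >= 100 total RFs."""
    supporting = sum(m.mutant_rfs for m in mutations
                     if m.total_rfs >= min_total_rfs)
    return supporting >= min_mutant_rfs

def is_confident(m, input_copies):
    """Per-mutation criteria used before averaging AFs."""
    if m.mutant_rfs < 1 or m.total_rfs == 0 or not m.passes_mixed_rf_filter:
        return False
    # RF-based AF must correspond to at least 0.5 mutant molecules.
    return (m.mutant_rfs / m.total_rfs) * input_copies >= 0.5

def sample_mean_af(mutations, input_copies):
    """Mean read-level AF over all targeted mutations, with AFs of
    mutations not confidently detected set to zero."""
    if not mutations:
        return 0.0
    afs = [
        m.mutant_reads / m.total_reads
        if m.total_reads and is_confident(m, input_copies) else 0.0
        for m in mutations
    ]
    return sum(afs) / len(afs)
```

Averaging over all targeted mutations, including those set to zero, keeps the sample-level estimate comparable between samples in which different subsets of mutations are detected.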

TARDIS analysis pipelines

Target selection and primer design pipelines were developed in Python 3 using NumPy, SciPy, NetworkX, pandas, and Matplotlib, and in Julia 0.6.2 using BioJulia, DataFrames, Gadfly, and LightGraphs. Data analysis and plotting were conducted in Python 3, Julia 1.1, and R v3 using ggplot2.

Calculation of background error rates

To measure overall background error rates, we evaluated the first 10 bp from a set of amplicons across multiple representative plasma samples for highest non-reference alleles (starting 3 bp downstream of target-specific primers), excluding the targeted locus. The full dataset included 200 loci from each of 39 samples, for a total of 7800 independent positions. To calculate TARDIS-corrected error rates, we required consensus of all members of an RF, a minimum of two RFs with a ratio between variant RFs and mixed RFs <10, similar to criteria for detection of individual variants described above. Unlike variant calling for genotyping, TARDIS relies on detection and quantification of pre-identified non-reference alleles at preselected genomic loci. We did not call mutations at every sequenced locus, and this limited the number of false positives detected. Moreover, given limited template DNA input, individual mutations were not expected at AFs below or close to the background error rate using RF consensus. However, analysis of dozens or hundreds of mutations could result in false-positive ctDNA detection due to multiple testing. To ensure confidence in sample-level ctDNA detection, we built a background distribution for each sample of the number of non-reference RFs and the number of fragment sizes supporting randomly chosen non-reference alleles at nontarget loci (3 to 30 bp from the end of the sequenced primer). We used this distribution to calculate the probability of observing each mutation in a sample. We calculated a combined probability for each sample as the product of probabilities for observed mutations and applied multiple testing correction using the Bonferroni approach, requiring the corrected P value to be <0.05.
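The sample-level test can be illustrated with a short sketch. It assumes that the per-mutation probability is an empirical tail probability, that is, the fraction of background observations with at least as many non-reference RFs and fragment sizes as observed, with a pseudocount to avoid zero probabilities. The tail definition, the pseudocount, and the function names are assumptions. Only the product of probabilities, the Bonferroni correction, and the 0.05 threshold come from the text.

```python
def empirical_tail_prob(background, n_rfs, n_sizes):
    """background: list of (non_reference_rfs, fragment_sizes) observed at
    randomly chosen nontarget alleles in the same sample."""
    hits = sum(1 for b_rfs, b_sizes in background
               if b_rfs >= n_rfs and b_sizes >= n_sizes)
    return (hits + 1) / (len(background) + 1)  # pseudocount (assumption)

def sample_p_value(background, observed, panel_size):
    """Product of per-mutation probabilities, Bonferroni-corrected for the
    number of mutations in the panel and capped at 1."""
    combined = 1.0
    for n_rfs, n_sizes in observed:
        combined *= empirical_tail_prob(background, n_rfs, n_sizes)
    return min(1.0, combined * panel_size)

def ctdna_detected(background, observed, panel_size, alpha=0.05):
    """Sample-level call; requires at least one observed mutation."""
    return bool(observed) and sample_p_value(background, observed, panel_size) < alpha
```

With 99 background observations of which one shows a single non-reference RF, two mutations each supported by two RFs of two sizes give a corrected P value of 0.002 for a 20-mutation panel, whereas one mutation supported by a single RF gives 0.4 and would not be called.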

Statistical analysis

Differences in background error rates, AFs, and total cfDNA concentration between groups of patients were tested using Wilcoxon rank-sum test. Changes in tumor AFs before and after NAT were tested using paired Wilcoxon signed-rank test. Correlation between observed and expected AFs in reference samples and that between CVs and mutant DNA molecules were evaluated using Pearson correlation coefficient. Two-sided testing was used unless otherwise specified.

SUPPLEMENTARY MATERIALS

stm.sciencemag.org/cgi/content/full/11/504/eaax7392/DC1

Fig. S1. Schematic overview of TARDIS.

Fig. S2. Comparison of raw and TARDIS-corrected background errors.

Fig. S3. Comparison of total cfDNA concentration between plasma samples from patients and healthy volunteers.

Fig. S4. Variant and tumor fractions in individual patients.

Fig. S5. Receiver operating characteristic curve for predicting residual disease using ctDNA fraction after completion of NAT in subgroups.

Table S1. Mutations targeted in reference samples.

Table S2. Expected mutation fractions in reference samples analyzed.

Table S3. Tumor and germline sequencing statistics.

Table S4. Oligonucleotide sequences used for sequencing library preparation.

Data file S1. Mutations detected in reference samples in Fig. 2.

Data file S2. Mutations detected in reference samples in Fig. 3.

Data file S3. Details of patient plasma samples and ctDNA tumor fraction.

Data file S4. Mutations detected in patient plasma samples.

REFERENCES AND NOTES

Acknowledgments: We thank M. Pacheco and L. Dixon at Mayo Clinic for support in collection and processing of patient samples. We thank J. Herzog at City of Hope for support in collection and processing of patient samples and clinical data. We thank all the patients for participating in this study. Funding: This work was supported by funding from the Ben and Catherine Ivy Foundation, V2015-017 from the V Foundation for Cancer Research, BSP-0542-13 from Science Foundation Arizona, charitable donations from SmartPractice, and support from the National Cancer Institute (NCI) of the National Institutes of Health (NIH) under award number 1R01CA223481-01 to M.M. The research reported in this publication was also supported by NCI P30CA33572 (Molecular Pathology Core) to T.P.S., TGen-City of Hope Precision Medicine Pilot Award to M.M. and T.P.S., City of Hope Cancer Control and Populations Sciences Pilot award to T.P.S. and J.N.W., K08CA234394 to T.P.S., Mayo Development Funds to B.A.P., Mayo Clinic Center for Individualized Medicine to M.M. and B.A.P., and Cancer Research UK to C.C. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. Author contributions: B.A.P. and M.M. conceptualized and designed the study. B.R.M., T.C.-C., N.P., and M.M. developed methods. S.-J.S., B.E., P.A.C., K.S.A., H.E.K., D.W.N., A.E.M., B.K.P., J.N.W., T.P.S., C.C., and B.A.P. designed and conducted the prospective clinical studies. T.C.-C., S.-J.S., S.-F.C., A.O.-B., M.F., and R.M. generated data. B.R.M. and M.M. analyzed sequencing data. B.R.M., S.-J.S., T.P.S., C.C., B.A.P., and M.M. interpreted data. B.R.M. and M.M. wrote the paper with assistance from S.-J.S., T.P.S., A.O.-B., C.C., B.A.P., and other authors. All authors approved the final manuscript. Competing interests: M.M., T.C.-C., B.R.M., A.O.-B., and N.P. are inventors or coinventors on patent applications covering technologies described here including patent application numbers WO2017205540A1 and US201662343802P, both titled “Molecular tagging methods and sequencing libraries,” and 62/866,543, titled “Detection and treatment of residual disease using circulating tumor DNA analysis.” M.M. serves as an expert witness in intellectual property litigation related to cfDNA analysis methods. C.C. is a member of AstraZeneca’s External Science Panel and is a recipient of research grants (administered by the University of Cambridge) from AstraZeneca, Genentech, Roche, and Servier. All other authors declare that they have no competing interests. Data and materials availability: Analytical performance data from reference samples have been deposited with unrestricted access in Sequence Read Archive with accession number PRJNA551456 and will become available upon publication. Targeted sequencing data and tumor/germline exome sequencing data from patient samples will be made available upon reasonable request. All other data associated with this study are present in the paper or the Supplementary Materials.