Single-cell genotyping demonstrates complex clonal diversity in acute myeloid leukemia


Science Translational Medicine  01 Apr 2015:
Vol. 7, Issue 281, pp. 281re2
DOI: 10.1126/scitranslmed.aaa0763

Cancer evolution: No simple answers

Traditionally, the evolution of cancer has been explained in simple terms: A cell acquires mutations and becomes malignant and then gives rise to progeny that become more malignant as they acquire additional mutations. A key assumption of this model is that the entire cancer is derived from descendants of a single original cell. However, a new study by Paguirigan et al. challenges this paradigm by providing evidence of convergent evolution in acute myeloid leukemia. The authors analyzed individual cells from multiple patients with leukemia and demonstrated that the mutation patterns seen in each patient could not have arisen from a single ancestral cell, suggesting a need for more sophisticated models of cancer evolution to inform the development of new treatment strategies.


Clonal evolution in cancer—the selection for and emergence of increasingly malignant clones during progression and therapy, resulting in cancer metastasis and relapse—has been highlighted as an important phenomenon in the biology of leukemia and other cancers. Tracking mutant alleles to determine clonality from diagnosis to relapse or from primary site to metastases in a sensitive and quantitative manner is most often performed using next-generation sequencing. Such methods determine clonal frequencies by extrapolation of allele frequencies in sequencing data of DNA from the metagenome of bulk tumor samples using a set of assumptions. The computational framework that is usually used assumes specific patterns in the order of acquisition of unique mutational events and heterozygosity of mutations in single cells. However, these assumptions are not accurate for all mutant loci in acute myeloid leukemia (AML) samples. To assess whether current models of clonal diversity within individual AML samples are appropriate for common mutations, we developed protocols to directly genotype AML single cells. Single-cell analysis demonstrates that mutations of FLT3 and NPM1 occur in both homozygous and heterozygous states, distributed among at least nine distinct clonal populations in all samples analyzed. There appears to be convergent evolution and differential evolutionary trajectories for cells containing mutations at different loci. This work suggests an underlying tumor heterogeneity beyond what is currently understood in AML, which may be important in the development of therapeutic approaches to eliminate leukemic cell burden and control clonal evolution-induced relapse.


Relapse is the most frequent cause of therapeutic failure in cancer, and recent work has demonstrated that it can be driven by selection for resistant subclones among the clonal diversity of a neoplasm (1, 2). Clonal genetic diversity has been shown to predict progression to malignancy in Barrett’s esophagus (3), and it has been demonstrated in breast cancer (4) and acute myeloid leukemia (AML) (5–7). Changes in mutant allele frequencies can be observed over the course of therapy (8–10), in xenograft models (11), and between primary tumor sites and metastases (12–14). These studies have demonstrated relatively “linear” models of clonal diversity, with new clones clearly arising from a previous clone, as well as complicated “branching” trees, leading to convergent evolution (14–16).

Current strategies for estimating and tracking clonal diversity generally use next-generation sequencing (NGS) of the bulk tumor sample to determine the frequencies of mutant alleles in the resulting metagenome. Despite the power of NGS in identifying mutations across the genome, it ultimately requires bulk DNA samples as starting material to obtain sufficient amounts of DNA. A potential limitation of NGS in describing clonal composition is that it requires a model of the cancer, specifically regarding the heterozygosity of mutations in single cells, the order of acquisition of mutations, and the unique mutational events. After clustering the mutations based on their allele frequencies in the bulk NGS data, mutations with similar allele frequencies in the bulk sample are assumed to occur simultaneously in the same clone of cells. The sample is often assumed to constitute 100% malignant cells, allowing for calculation of the population frequency of each clone as twice the allele frequency, assuming that all mutations occur heterozygously in single cells (Fig. 1A). This allows for tracking of “clones” from diagnosis to relapse through changes in the mutant allele frequencies. Additionally, mutations that occur at a lower frequency are assumed to have occurred later in the evolution of the disease and to constitute a subclone in which all previous mutations (those of higher frequency) are carried along with the lower-frequency mutations. This aspect of the model requires that all mutational events are unique in an individual sample and represent one-time events that occur solely in a clone with preexisting mutation(s). The basic framework into which these measurements and assumptions fit is the idea of a sequence of clonal expansions in which each mutation occurs only once, driving a new clonal expansion in which all further mutational events occur.
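
The conventional bulk model described above can be written out in a few lines. This is a minimal sketch, assuming a fully malignant sample, strictly heterozygous mutations, and strictly nested subclones; the function name and the example allele frequencies are hypothetical and are not taken from the study.

```python
def clone_fractions_from_bulk(allele_freqs):
    """Infer nested clone sizes from bulk allele frequencies.

    Assumes (as the conventional model does) that every mutation is
    heterozygous, the sample is 100% malignant, and each lower-frequency
    mutation arose inside the clone carrying all higher-frequency ones.
    """
    clones = []
    carried = []
    # Highest allele frequency first: that mutation marks the presumed founder.
    for gene, af in sorted(allele_freqs.items(), key=lambda kv: -kv[1]):
        if not 0.0 <= af <= 0.5:
            raise ValueError(f"{gene}: AF {af} is incompatible with heterozygosity")
        carried = carried + [gene]
        clones.append((tuple(carried), 2.0 * af))  # cell fraction = 2 x AF
    return clones


# Hypothetical bulk allele frequencies for three genes.
example = {"A": 0.45, "C": 0.20, "E": 0.05}
for genes, fraction in clone_fractions_from_bulk(example):
    print("+".join(genes), round(fraction, 2))
```

Note that the sketch has to reject any allele frequency above 0.5, which is one place where the heterozygosity assumption visibly breaks down for loci such as FLT3-ITD.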

Fig. 1. Underlying diversity obscured by describing clonal evolution in bulk AML samples.

(A) Large NGS efforts identify the allele frequency (AF) in bulk samples for multiple genes simultaneously (genes A to H). These allele frequencies are then used to calculate the population frequency of the clones that contain each mutation by assuming all mutations are heterozygous and are added sequentially (a clone containing the cluster of mutations including A/B would be the “founder” clone, and the subclones must contain A/B/C/D and would then be assumed to be subsequent to the founder clone). Mutant alleles below the threshold of detection, which depends on the type of NGS used, the sample quality, and the coverage/filtering, cannot be attributed to a cancer clone mathematically. (B) Examples of clonal distributions for gene A that would produce the same allele frequency when analyzed in the bulk sample. If the mutation is not constrained to be heterozygous, then the same allele frequency observed in a bulk sample can be obtained in many different distributions of zygosity. When combining multiple mutations, this allows for an exponentially increasing number of possible genotype combinations that would each produce the observed overall allele frequencies for each mutant allele. (C) Many combinations of genotypes exist even when multiple mutations occur in the same sample. In the case of two mutations, there are nine possible mutation/zygosity combinations that could occur in individual cells, which, depending on their relative proportions, could reconstitute the bulk allele frequencies.

However, it is known from bulk data that some mutations [such as FLT3 internal tandem duplication (FLT3-ITD) mutations in AML] are often not heterozygous. Given a known bulk allelic frequency, a wide range of clonal distributions for each mutation are possible in a sample if we consider that cells may be wild type, heterozygous, or homozygous for the mutation (Fig. 1B). When combined with multiple mutations in the same sample (Fig. 1C), this leads to exponentially increasing possible genotype distributions within the sample that would be indistinguishable by bulk analyses, particularly if mutational events are not unique. Addressing this particular limitation to bulk NGS strategies is of prime importance to better understand true clonal structures and the extent of clonal diversity present in tumors and to apply all of the information gained with these techniques accurately.
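
The ambiguity described above can be illustrated numerically. The following is a rough sketch, assuming a diploid locus with no copy number change; the grid step and the example allele frequency are arbitrary illustrative choices.

```python
from itertools import product

ZYGOSITIES = ("WT", "Het", "Hom")


def zygosity_mixes(allele_freq, step=0.1):
    """List (wt, het, hom) cell fractions whose bulk AF equals allele_freq.

    For a diploid locus, AF = het/2 + hom, with wt + het + hom = 1.
    Homozygous fractions are enumerated on a grid of width `step`.
    """
    n = round(1 / step)
    mixes = []
    for hom_i in range(n + 1):
        hom = hom_i / n
        het = 2 * (allele_freq - hom)
        wt = 1 - het - hom
        if het >= -1e-9 and wt >= -1e-9:  # keep only valid fractions
            mixes.append(
                (round(max(wt, 0.0), 3), round(max(het, 0.0), 3), round(hom, 3))
            )
    return mixes


# Every row below yields the same bulk allele frequency of 0.3.
for wt, het, hom in zygosity_mixes(0.3):
    print(f"WT={wt}  Het={het}  Hom={hom}")

# Two loci give 3 x 3 = 9 possible single-cell genotype combinations.
two_locus_genotypes = list(product(ZYGOSITIES, repeat=2))
print(len(two_locus_genotypes))
```

A finer grid only lengthens the list: the set of compatible mixtures is a continuum, which is why bulk allele frequency alone cannot pin down the zygosity distribution.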

To assess clonal diversity within an AML sample without those assumptions or biases, we developed protocols to directly genotype AML single cells to test whether the current model’s assumptions are valid. A single-cell approach via whole-genome amplification demonstrated the challenge of separating technical artifact from biological variability and limited statistical power associated with these types of methods (15). Some of these limitations can be addressed by increasing the input material via single-cell colony assays; however, this approach introduces bias due to selection for cells that can be grown in culture (typically a very low efficiency in AML), and the possibility that genetic alterations can arise during colony growth. We applied a targeted approach focused on the common FLT3 (16, 17) and NPM1 mutations (18, 19) to generate large cell sample sizes per patient and allow us to address both technical artifact and statistical analyses. Both of these mutations are typically insertions, but FLT3 insertions are widely variable in length [from 3 to 400 base pair (bp)], allele frequency, and exact insertion sites, whereas the sequence and length of NPM1 insertions are identical in 80% of patients (the remaining 20% of NPM1-positive patients have insertions of the same length but of different sequence). Thus, we reasoned that FLT3 and NPM1 mutations would represent different genetic patterns of mutation, allowing us to best test the parameters of the model. We found that targeted genotyping of single cells is technically feasible and reproducible, and demonstrates a surprising amount of clonal heterogeneity, which could be underrepresented by bulk sequencing.


Approach for identifying individual genotypes of single cells

To accurately study clonal heterogeneity focused on specific genes, we optimized and validated a straightforward single-cell, multiplexed polymerase chain reaction (PCR) method adapted from assays used clinically on bulk material to identify concurrent mutations in single cells. The assay reliably identifies FLT3-ITD and NPM1 insertion mutants in single cells with minimal sample handling. The assay was optimized using cell lines and extracted DNA material. Next, a patient sample positive for two different FLT3-ITDs of different lengths was analyzed by multiplex fragment analysis to determine whether mutant allele frequencies identified by bulk analyses were comparable to those calculated from single-cell data (Fig. 2, A to C). For this sample, 593 single sorted cells were analyzed for each insertion, and the results were compared to analysis of bulk leukemia cells. Calculated allele frequencies for both FLT3-ITDs were remarkably similar to those obtained by bulk analyses, suggesting that the quality control methods used during the sample handling, cell sorting, and subsequent PCR and analysis did not significantly impact the resulting genotyping data. Strikingly, the analysis of the two variants of mutations in this single gene (Fig. 2B) demonstrates five subpopulations of cells: wild type for FLT3, two populations of cells heterozygous for either insertion, and those homozygous for either insertion. No single cells contained both mutations. This strongly suggests that clonal complexity in AML is greater than that implied by bulk DNA analysis.

Fig. 2. Validation of single-cell genotyping via a comparison to the bulk data.

(A to D) A reference patient sample containing two different FLT3-ITDs of different lengths was analyzed by single-cell genotyping for FLT3 during technique validation. The approach used for bulk analyses clinically is to determine the allele ratio of mutant (ITD-1 or ITD-2) alleles to that of the wild type (WT) as demonstrated in (A). Adaptation of this technique to single cells was applied to the reference sample, with a total sample size of 593 cells. Error bars in (B) are the 99% confidence intervals for the population frequencies of each FLT3-ITD zygosity. Het, heterozygous; Mut, homozygous mutant. Bulk allelic ratio was calculated via bulk fragment analysis with the same primers used for single-cell genotyping, converted to an allele frequency [AF = AR/(1 + AR)], and compared to the calculated numbers obtained by single-cell genotyping to ensure accuracy in the overall data set (C). These values were then tracked for the seven patient samples analyzed for FLT3 and NPM1 (D).

Bulk data recapitulated by allelic frequencies derived from single-cell genotyping

We next applied the technique to seven additional cryopreserved AML samples known via bulk analyses to be positive for at least one FLT3-ITD and the NPM1 mutation. To obtain a snapshot of the clonal distribution in these samples with respect to FLT3 and NPM1, we thawed samples, flow-sorted live myeloid cells into PCR plates, and then performed high-throughput multiplex fragment analysis. The presence and zygosity of both mutations simultaneously in each individual cell were tallied for multiple independent plates of cells after all quality control measures were taken. As a reference, extracted DNA from bulk patient samples was genotyped to determine the length of the FLT3-ITDs present and the bulk allele frequencies of mutations in FLT3 and NPM1. All samples showed excellent correlation between bulk allele frequencies and allele frequencies calculated using single-cell data. For example, Fig. 2 (A to C) shows a patient with two different FLT3-ITD mutations identified by their different base pair insertion. The allelic frequency of both mutations based on summing the single-cell data is very similar to that obtained in the bulk sample (Fig. 2C). In addition, these experiments were performed in patients harboring the frequent combination of FLT3 and NPM1 mutations (Fig. 2D). Again, single-cell summations compared favorably to bulk analysis, giving overall confidence of the approach. Because only cells for which both FLT3 and NPM1 successfully amplified were used in the final data set, testing for bias due to uneven rates of PCR failure, among other quality control parameters, was tracked and compared between runs for consistency as well as for the overall data set (table S1).
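
As an illustration of how single-cell tallies can be summed into an allele frequency for comparison with bulk data, consider the sketch below. It assumes a diploid locus, and the Wilson score interval is one possible choice of confidence interval (the text does not state which method was used). The function names and the example counts are hypothetical.

```python
from statistics import NormalDist


def allele_frequency_from_cells(n_wt, n_het, n_hom):
    """Mutant allele frequency implied by single-cell genotype counts.

    Each cell contributes two alleles: a heterozygous cell carries one
    mutant allele and a homozygous mutant cell carries two.
    """
    total_cells = n_wt + n_het + n_hom
    return (n_het + 2 * n_hom) / (2 * total_cells)


def wilson_interval(successes, total, confidence=0.99):
    """Wilson score interval for a population frequency."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * ((p * (1 - p) + z**2 / (4 * total)) / total) ** 0.5 / denom
    return centre - half, centre + half


# Hypothetical tallies for one locus in 593 scored cells.
counts = {"WT": 300, "Het": 200, "Hom": 93}
print(allele_frequency_from_cells(counts["WT"], counts["Het"], counts["Hom"]))
print(wilson_interval(counts["Hom"], sum(counts.values())))
```

A clone whose interval has a lower bound above zero is detected with the stated confidence, although that alone does not rule out technical artifact.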

Many different clonal populations shown by single-cell genotyping

As a proof of principle, we concentrated on patients known to have both FLT3 and NPM1 mutations. The frequency of cells containing each combination of FLT3 and NPM1 zygosity for seven cytogenetically normal AML samples (six diagnosis and one paired relapse) is shown in Fig. 3. In each sample, all possible FLT3 and NPM1 genotype combinations were detected. Moreover, the complexity could not be estimated by looking at only the bulk allelic frequency ratios (Fig. 2D). Two samples (Pt5 and Pt6) contained two different FLT3-ITDs each, identified by their unique lengths, and no cells in either patient were positive for two different ITD alleles. These data suggest that clonal diversity is likely higher than anticipated in AML samples and that locus-specific models of allele burden and mutational events that allow for multiple, independent occurrences of the same mutation (convergent evolution) may be an important refinement to current models of clonal diversity. Moreover, the data suggest a wide variation in clonal diversity from patient to patient (such as Pt1 versus Pt5). Further studies will be needed to test if the amount of clonal diversity present at diagnosis correlates with treatment response.

Fig. 3. Genetically distinct clonal frequencies in AML with respect to FLT3-ITD and NPM1.

Bubble plot of the clonal distribution of six AML patient samples at diagnosis (and one analyzed at relapse, Pt4-R) showed the presence of all possible combined genotypes for the mutant genes present in each sample, although at a variable frequency. Pt1 to Pt4, and Pt4-R contained one each of FLT3-ITD and NPM1 mutations, and Pt5 and Pt6 had two FLT3-ITDs and an NPM1 mutation. Note that none of the samples with two ITD lengths had clones containing both ITDs simultaneously.

Changes from diagnosis to relapse and new mutations

To address the possibility of changes in clonal diversity from diagnosis to relapse, two pairs of diagnosis and relapse samples were available that corresponded to Pt3 and Pt4. The bulk genotyping data for Pt3 at relapse maintained a mutant allele frequency of 1.0 for NPM1, and the FLT3-ITD allele frequency increased from 25 to 98%. Such a high allele frequency would require a near-complete shift to homozygous FLT3-ITD cells in the sample, and further analysis was not performed on the relapsed sample from Pt3. For Pt4, we analyzed diagnostic and relapse samples because the FLT3-ITD allelic ratio did increase (although less markedly) from diagnosis to relapse, but the NPM1 allele frequency did not (Fig. 2D). Notably, additional mutations detected in the bulk sample had arisen since diagnosis (three unique WT1 exon 7 insertions/deletions, CEBPa, and DNMT3a R882), so our targeted assessment is an underestimate of the actual clonal diversity present. There is a substantial shift in the distribution of clonal frequencies from diagnosis to relapse as expected, although the distribution of NPM1 zygosities remained unchanged (Fig. 3). In particular, there is an increase in clones expressing homozygous FLT3 mutations; however, all nine clones remain present in the relapse sample at some level. Thus, the single-cell analysis showed both examples of decreasing and increasing clonal complexity with relapse, but selection for FLT3-ITD in both cases.
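
The statement that a 98% allele frequency requires a near-complete shift to homozygous cells follows from simple bounds. The sketch below assumes a diploid locus without copy number change and that all sampled cells are counted; the function name is hypothetical.

```python
def homozygous_fraction_bounds(allele_freq):
    """Range of homozygous-mutant cell fractions compatible with a bulk AF.

    With AF = het/2 + hom and wt + het + hom = 1, the homozygous fraction
    is smallest when no wild-type cells remain (hom = 2*AF - 1) and
    largest when no heterozygous cells remain (hom = AF).
    """
    min_hom = max(0.0, 2 * allele_freq - 1)
    max_hom = allele_freq
    return min_hom, max_hom


print(homozygous_fraction_bounds(0.25))  # FLT3-ITD AF reported at diagnosis
print(homozygous_fraction_bounds(0.98))  # FLT3-ITD AF reported at relapse
```

Under these assumptions, an allele frequency of 0.98 leaves at least about 96% of cells homozygous, whereas 0.25 is compatible with anything from no homozygous cells to a quarter of the sample.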

Description of potential sources of bias or artifact

Addressing the potential for technical error and/or bias in single-cell PCR analysis is a challenging issue because of the destructive nature of the assays (the same cell cannot be measured multiple times). Thus, we performed a variety of validation experiments to determine if our description of clonal diversity could be attributed solely to technical artifact. In particular, the consistent distribution of NPM1 allelic ratios and genotype distributions in single patient cells could signify potential artifact, and thus, we revalidated this assay in an NPM1-positive cell line and additionally with plasmids that contained a single copy of both wild-type and mutated genes. The OCI-AML3 cell line, at the bulk level, is heterozygous for NPM1 with an allele frequency of nearly 0.5 and when analyzed via single-cell analysis proved to have a similar distribution of zygosities as seen previously in our patient populations. We presume that this cell line can be considered a single clone because of its continual passage and our flow cytometric analysis of the cells for common myeloid cell surface markers that did not identify any visible discrete populations that would suggest heterogeneity within the cultures. However, despite its single clone origin, we still see an approximately even distribution of NPM1 zygosities in single cells similar to that seen in patient samples.

In order for these NPM1 data to be due to technical artifact alone, we hypothesized that there was frequent allele dropout within the reactions, which would obscure the true clonal structure. If allele dropout of either the wild-type or the mutant alleles was occurring at the exact same, relatively large rate (~33% of the time), this would have no effect on the overall allelic ratios but give the apparent distributions we saw in patient samples instead of purely heterozygous cells. In each patient sample, the FLT3-ITD length(s) and allele frequency are distinct, and thus, the single-cell data regarding the FLT3-ITD are much less likely to be due to technical artifact alone. However, we aimed to estimate the apparent allele dropout rates for both genes in our assay. To do this, we used plasmid constructs containing both a wild-type and a mutant allele copy of NPM1 or FLT3. We performed multiple independent limiting dilutions of each plasmid and used a Poisson distribution to identify the maximum percentage of single-copy reactions that were known to have a heterozygous plasmid input but were scored as either wild type or homozygous mutant. With this validation for both NPM1 and FLT3-ITD, we discovered that ~5% of the reactions containing a heterozygous plasmid were scored as wild type, and another ~5% were scored as homozygous mutant. We used this estimate of maximum technical uncertainty to develop additional statistical tests for our patient data sets as described below.
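
Both parts of this reasoning can be sketched numerically. The first function shows how symmetric allele dropout in truly heterozygous cells changes the scored zygosity distribution without changing the apparent allele frequency. The second gives the Poisson share of non-empty limiting-dilution wells that received exactly one template. This is an illustrative model with hypothetical names, treating loss of each allele as mutually exclusive events; it is not the authors' analysis code.

```python
import math


def scored_zygosity_of_het_cells(drop_wt, drop_mut):
    """Apparent genotype frequencies for truly heterozygous cells.

    drop_wt / drop_mut: probability that only the wild-type / only the
    mutant allele fails to amplify in a given reaction.
    """
    scored = {
        "WT": drop_mut,  # mutant allele lost, cell scored wild type
        "Het": 1 - drop_wt - drop_mut,
        "Hom": drop_wt,  # wild-type allele lost, cell scored homozygous
    }
    apparent_af = scored["Het"] / 2 + scored["Hom"]
    return scored, apparent_af


def single_copy_share(mean_copies):
    """Among non-empty wells, the Poisson share holding exactly one copy."""
    p_one = mean_copies * math.exp(-mean_copies)
    p_nonempty = 1 - math.exp(-mean_copies)
    return p_one / p_nonempty


print(scored_zygosity_of_het_cells(1 / 3, 1 / 3))  # artifact-only scenario
print(scored_zygosity_of_het_cells(0.05, 0.05))    # rates measured with plasmids
print(single_copy_share(0.1))                      # hypothetical dilution level
```

With dropout of about one-third per allele, a purely heterozygous population would be scored as an even three-way split at an unchanged allele frequency of 0.5. At the ~5% rates estimated from the plasmid controls, about 90% of heterozygous inputs are still scored correctly.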

Comparisons of single-cell data to null distributions

Quantitatively, we see evidence of all possible clones in each patient sample, even when considering only whether the lower bound of the 99% confidence interval is above zero, because of our large sample sizes per patient. However, it is unclear what impact specific clonal frequencies have on clonal evolution, and thus, we took a more conservative approach to interpreting our data in the context of current approaches. To identify which aspects of clonal diversity revealed by single-cell analysis could not be explained by bulk-based models combined with technical artifact or chance alone, we developed a null hypothesis distribution for each individual sample from the bulk data we generated on overall allele frequencies and from current assumptions regarding zygosity (Fig. 1). Specific assumptions included the following: (i) mutations occur heterozygously if the allele frequency is less than 0.5; (ii) if mutant allele frequencies are greater than 0.5, all cells in the population must carry the mutation, and a subset has undergone loss of heterozygosity, becoming homozygous mutant; (iii) all lower-frequency but differently sized FLT3-ITDs occur in separate clones distinct from that of the larger-frequency FLT3-ITD. We did not assume, as is typically done for newly discovered mutations, that mutations occurring at the same frequency occurred in the same clones, as for FLT3 and NPM1. In fact, rarely does the FLT3 allele ratio (wide-ranging from below ~0.05 up to ~1) match that of NPM1 (nearly always 0.5), which is represented in our sample set.

We combined this hypothetical clonal distribution with our maximum estimate of technical variability identified via plasmid validations due to stochastic allele dropout to generate a distribution that would represent the expected clonal diversity, assuming that all non-expected clones are detected due solely to technical error and chance. We then compared the expected cell numbers of each clone in the single-cell data we obtained to the null distributions via a χ² test to determine if the two distributions were significantly different for each patient sample (Fig. 4). These comparisons suggest that aside from the quantification of specific clonal frequencies, the wide range of clonal identities observed via single-cell analysis reflects actual populations and is likely not a product of artifact.
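
One way such a null distribution and comparison could be assembled is sketched below. It encodes the zygosity assumptions listed above and the ~5% misclassification of heterozygous inputs to each of wild type and homozygous mutant. Treating the two loci as independent when combining them is an illustrative simplification made here, not a description of the authors' procedure, and all function names and example numbers are hypothetical.

```python
from itertools import product

STATES = ("WT", "Het", "Hom")


def null_marginal(allele_freq):
    """Zygosity fractions under the conventional assumptions:
    AF <= 0.5 -> heterozygous cells only; AF > 0.5 -> every cell is
    mutant and a subset has lost heterozygosity."""
    if allele_freq <= 0.5:
        return {"WT": 1 - 2 * allele_freq, "Het": 2 * allele_freq, "Hom": 0.0}
    return {"WT": 0.0, "Het": 2 - 2 * allele_freq, "Hom": 2 * allele_freq - 1}


def add_dropout(marginal, error=0.05):
    """Move `error` of the heterozygous cells to each of WT and Hom."""
    het = marginal["Het"]
    return {
        "WT": marginal["WT"] + error * het,
        "Het": het * (1 - 2 * error),
        "Hom": marginal["Hom"] + error * het,
    }


def expected_counts(af_flt3, af_npm1, n_cells, error=0.05):
    """Expected cells per (FLT3, NPM1) genotype, loci combined independently."""
    flt3 = add_dropout(null_marginal(af_flt3), error)
    npm1 = add_dropout(null_marginal(af_npm1), error)
    return {(f, n): n_cells * flt3[f] * npm1[n] for f, n in product(STATES, STATES)}


def chi_square(observed, expected):
    """Pearson statistic; categories with zero expectation are skipped."""
    return sum(
        (observed.get(key, 0) - exp) ** 2 / exp
        for key, exp in expected.items()
        if exp > 0
    )


# Hypothetical sample: bulk AFs of 0.25 (FLT3-ITD) and 0.5 (NPM1), 900 cells,
# with observed cells spread evenly over all nine genotypes.
expected = expected_counts(0.25, 0.5, 900)
observed = {key: 100 for key in expected}
print(chi_square(observed, expected))
```

With nine genotype categories there are eight degrees of freedom, for which the 1% critical value of the χ² distribution is about 20.09; a statistic far above that would indicate that dropout and chance alone do not account for the observed genotypes under this simplified null.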

Fig. 4. Comparison of observed clonal distributions to those calculated with bulk data.

Observed cell numbers scored with each genetic identity in the single cells of each patient sample are shown in shaded bars. Null distributions generated by predicting clonal distributions based on technical uncertainty and existing assumptions of heterozygosity and concurrence of mutations to link mutant allele frequency and clonal frequency are shown as wireframe. This distribution is based on the allelic frequencies in the bulk samples (Fig. 2D).


We have established a protocol allowing for the assessment of targeted single-cell genotyping using conventional equipment and without relying on potential biases inherent in colony formation or genome amplification strategies. Our data suggest that clonal structure based on concurrent mutations can be more complex than is perceived by bulk allelic frequencies and that the assumption that mutations occur only in the heterozygous state may lead to underestimation of the clonal diversity. Experiments to define reproducibility and allelic dropout rate suggest that the clonal distributions we found are unlikely to be mere manifestations of technical effects.

A possible interpretation of the observed clonal distributions with respect to FLT3-ITD and NPM1 is that these mutations may not solely be established in a sequential manner during leukemia evolution, but may arise through independent clonal events, leading to convergent evolution. Notably, the data highlight a difficult problem with any of the current approaches of determining clonal structure. Given previous data (20) showing that multiple FLT3-ITDs are relatively common in AML, the occurrence of multiple FLT3 mutations likely represents independent events, because the same insertion location/length is unlikely to occur more than once. We found that no clone ever had both FLT3-ITDs. In contrast, independent mutation events are challenging to demonstrate for NPM1, where 80% of mutants in AML are the same 4-bp insertion (TCTG) (21), so we cannot evaluate whether independent mutations arise within a patient in a form of convergent evolution or are derived from a common ancestral clone even if sequencing of individual cells was performed. Thus, even our single-cell data may be actually underestimating the clonal complexity of the sample. When one considers adding more genotyping of other relevant genes in AML (such as DNMT3a, IDH1/2, RAS, TET2, etc.), the potential genetic complexity is staggering.

Although an admittedly small number of patients were studied, the results offer a few insights that need further investigation. First, there is a broad range of clonal complexity, and further studies will determine if the amount of clonal diversity at diagnosis correlates with treatment response. Further, in both cases of paired diagnostic/relapse samples, there was a clonal shift toward homozygous mutant FLT3-ITD mutations (in Pt3, it was a large shift, and in Pt4, it was more subtle), which may be in line with clinical and laboratory data, suggesting that high FLT3-ITD allele burden provides a selective advantage to the cancer during therapy, contributing to resistance (19, 22, 23) and a worse prognosis than when the mutation occurs at a lower allele burden (22). Last, although the usual assumption is that clonal selection and relapse may favor a reduction in clonal composition, the Pt4 relapse sample is more genetically diverse than at diagnosis. This suggests that either (i) therapy caused an increase in the rate of generation or fixation of new mutations (recently suggested in NGS data comparing diagnostic and relapse material) (5), (ii) one of the existing resistant clones had a mutator phenotype, which contributed to relapse, or (iii) these clones were actually present at diagnosis, but at a frequency undetected by the single-cell methodology. Future work should track the clonal dynamics in patients from diagnosis through minimal residual disease and relapse, especially in patients receiving FLT3 inhibitors, to illuminate the nature of resistance to those drugs.

Although bulk sample analysis of mutations and allele frequencies can give valuable information about mutation burden in cancer, understanding how these mutations interact intracellularly and how genetically different clones and their progeny compete requires more detailed analyses, specifically at the single-cell level. There is considerable debate as to which mutations function as “drivers” (increase clonal evolutionary fitness) versus “passengers” (evolutionarily neutral); reconstructing cell lineages from single-cell genotypes can help reveal dependencies and epistasis between mutations by determining the order and concurrence of mutations along those lineages. Assessing true clonal distribution (the frequencies of cells containing specific zygosities of mutant alleles of multiple genes) is particularly informative as to (i) whether specific combinations of mutations confer a selective advantage at a single-cell level (allowing cells to overproliferate or resist therapeutics), (ii) whether the mutant proteins have the potential to interact within a cell (providing a therapeutically targetable tumor-specific mechanism), and (iii) whether minor clones with a specific genotype are responsible for relapse (allowing the presence of high levels of heterogeneity or identification of the clones at diagnosis to inform therapeutic decisions).

Relapse is the major obstacle to cure, and with our current therapeutic approaches, natural selection favors cancer. In the context of a complicated clonal structure, where each clone may have varying susceptibilities to chemotherapy, the current method of applying multiple courses of the same therapy would naturally select for the resistant clones. Understanding the change in clonal structure due to the chemotherapy’s selective pressure might allow us to develop diagnostics to quickly quantify clones at diagnosis and during therapy, and adapt therapy as the clonal population evolves to target the surviving clones. Moreover, a better understanding of clonal composition and competition may alter our way of thinking about “cure.” Can we shift therapy early, based on changing clonal compositions assessed in residual disease? Or going further, the natural selection battle is often formulated as the cancer versus the normal cell; however, different cancer clones are also at competition. Can we develop strategies that will give the advantage to slower neoplastic clones, which may never dominate the normal? The combination of bulk NGS data on larger numbers of AML samples and the insight gained from assessing these mutations at the single-cell level may provide a valuable basis for better understanding how changes in multiple mutation allele burdens at the bulk level correlate with the underlying evolution occurring in the leukemia during therapy. The study of tumor clonal structure and evolution, with a focus on the discrete nature of resistant and responsive individual cells, will elucidate the specific genetics that determine therapeutic response from the background of both normal and nonmalignant genetic alterations. 
This will provide physicians the opportunity to adapt therapy to the ever-changing population of the tumor and to successfully use selective pressures to manage and eliminate cancer cells rather than allowing cancer to use natural selection to foster resistance.


Study design

This study focused on a small cohort of cytogenetically normal AML samples for which previous genetic information was known via clinical molecular testing. Samples were selected for the presence of the FLT3-ITD and NPM1 mutations because of their importance in determining prognosis and relatively high frequency in the adult AML population. Also, high blast count samples were chosen to minimize the impact of residual normal cells present in the population. Data were obtained from the single cells sorted from AML patient samples from a minimum of two 384-well plates, and amplified independently (on different days with different PCR mixes). Additional plates were analyzed for samples with low-frequency mutations until sufficient data points were obtained for statistical significance of all of the clonal population frequencies to be reached (via t test).

Cell sorting and selection

AML samples were collected after consent for Fred Hutchinson and University of Pennsylvania Institutional Review Board–approved protocols. Samples 1 to 3 and 5 were peripheral blood (blast percentage >70%) and were stained with direct conjugate antibodies for CD123, CD117, CD45, and HLA-DR (all antibodies from Becton Dickinson) and calcein violet (Life Technologies) to enrich for live leukemia blasts identified via multicolor flow cytometry performed upon sample banking. Sample 4 was bone marrow with 57% blasts, and because of the lower blast count, it was stained with direct conjugate antibodies/DAPI (4′,6-diamidino-2-phenylindole) and selected for live myeloid cells that were CD33+/CD45+/CD19−/CD3− (CD19+/3+ population was analyzed to verify that all mutations occurred only in the myeloid cells), and only myeloid cells were incorporated into the final genotyping data set. Our validation sample along with samples 6 and 4-R were leukapheresis samples, with >90% blast counts, and thus were stained using calcein violet and propidium iodide (Life Technologies) and sorted for live cells. Single cells were sorted into 384-well plates on a BD Aria III, using stringent forward and side scatter gates along with phase and yield sorting masks to ensure that droplets contained single cells (or were empty) and not doublets. The cells were lysed by drying and subsequent hypotonic stress/heating during endpoint multiplexed PCR.

PCR and fragment analysis

The primer pairs span the most common insertion/deletion locations for each gene and thus amplify both mutant and wild-type alleles, allowing the genotype for each single cell to be analyzed via the presence of either the expected wild-type peak, the peak that results from the mutant allele, or both (primer sequences in table S2). Primers and protocol were tested on bulk extracted DNA from each sample to confirm the presence of the mutations in the sample and the ability of the primers to amplify both alleles, and to determine the bulk allelic ratios and FLT3-ITD lengths.

Single-cell genotyping was performed either with a multiplexed one-round PCR or with the addition of a nested PCR using fluorescently labeled forward primers in the second round. After endpoint PCR, fragment analysis was performed via capillary electrophoresis (ABI 3730xl DNA Analyzer) and analyzed with the GeneMapper software (Applied Biosystems). First-round primers were designed for multiplexing using the Web version of MPrimer (24), and FAM (carboxyfluorescein) or VIC labels were used on the forward primers only when one-round multiplexed PCR was performed. For a portion of the cells from sample 4-R, FAM- or VIC-labeled inner nested primers were used, and the fragment analysis outputs were analyzed manually for insertions and deletions in all target genes (outputs from one-step and nested PCRs were compared to identify potential roles of contamination and PCR failure rates for single-cell analysis, as discussed below).

Multiple 384-well plates were assayed separately for each patient sample, and the genotype/zygosity for each of the genes for each well was assembled. To eliminate bias due to experimental failure, only cells for which both genes were represented successfully were scored, and those with failures for one gene were analyzed to confirm that they were randomly occurring, and not skewed for or against one particular gene. Data for each well of the 384-well PCR plates that contained a single cell were analyzed for each gene, and then combined and analyzed for the frequencies of PCR failure and genotype distribution in the population, either separately for each gene or when combined with the genotype of the other gene.
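
The scoring rule described above (keep a well only if both genes amplified, then tally the combined genotype) can be illustrated with a minimal Python sketch. The well records, the `WT`/`HET`/`MUT` labels, and the function name `score_wells` are hypothetical and are not taken from the authors' analysis pipeline.

```python
from collections import Counter

# Hypothetical per-well calls; None marks a gene that failed to amplify.
# "WT" = wild-type peak only, "HET" = both peaks, "MUT" = mutant peak only.
wells = [
    {"FLT3": "HET", "NPM1": "HET"},
    {"FLT3": "WT", "NPM1": "HET"},
    {"FLT3": "MUT", "NPM1": None},   # NPM1 failed: excluded from scoring
    {"FLT3": None, "NPM1": None},    # nothing amplified: excluded
    {"FLT3": "HET", "NPM1": "HET"},
]

def score_wells(wells, genes=("FLT3", "NPM1")):
    """Tally joint genotypes, keeping only wells where every gene amplified."""
    scored = [w for w in wells if all(w[g] is not None for g in genes)]
    counts = Counter(tuple(w[g] for g in genes) for w in scored)
    total = len(scored)
    freqs = {combo: n / total for combo, n in counts.items()} if total else {}
    return counts, freqs

counts, freqs = score_wells(wells)
```

With three zygosity states per gene, two genes give at most nine combined genotypes, each of which is treated here as one candidate clonal population.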

Single-cell genotyping efficiency, false-positive rates, and comparison with bulk analysis

To assess the false-positive rate in our assay, cells were sorted into every other well of a 384-well plate, and all wells of the plate were analyzed via fragment analysis for both genes. The frequency of false-positive readings from wells that were known not to contain cells was 0.68% (sample size of 310 wells, confirmed with additional independent sort of 84 wells). Notably, all false-positive wells were positive for only one of the genes of interest, and thus, because we only included wells for which there was amplification for both NPM1 and FLT3, they would have been excluded from further analysis. Additionally, we performed both one-step PCR and nested PCR on Pt4-R to identify whether the failure rates of individual genes or the overall PCR failure rate changed if a different primer set was used or if the fragment analysis was performed in singleplex rather than in multiplex. There were no significant differences between the clonal distributions for this patient sample when analyzed via one- or two-step PCR (F test, P = 0.45) nor between PCR failure rates for individual genes, suggesting that even in the case of a nested step or individual second-round PCRs in which additional sources of PCR contamination/crossover could exist, no significant influence on clonal distribution was observed.

The possibility of false negatives or more specifically the failure of amplification of one allele versus the other resulting in an erroneous zygosity being assigned to a cell was addressed by comparing the allelic ratios of bulk samples to those calculated by single-cell analysis. If frequent loss of one allele during PCR amplification was occurring, then the calculated allelic ratio from single-cell genotyping data would be significantly skewed from the overall bulk allelic ratios. One sample containing two FLT3-ITDs was analyzed for FLT3-ITD allelic ratios, and the calculated allelic ratios from single-cell data were compared to those obtained by fragment analysis of the bulk extracted DNA. The similarity of these two measurements suggests that any allele dropout in the single-cell data is occurring at a rate that does not impact overall allele ratio in a significant way. Patient 4 relapse sample, the largest sample used for validation because of the large size of the leukapheresis sample, was also analyzed in this way for both genes to confirm with a large single-cell sample size that the expected allele frequencies for each of the genes were comparable to those obtained via single-cell analysis.

The failure rates of the cell sorting and the PCR for all the genes and each gene individually were assessed for all patients, for each plate sorted and amplified independently. The manufacturer estimates that, under the stringent selection criteria used to ensure that only single cells rather than doublets are sorted (erring on the side of empty droplets), the flow cytometer partitions single cells into droplets with ~95% efficiency. The percentage of wells that were indicated by the sort layout to contain a cell, but for which none of the genes amplified, was taken as the observed sort failure rate (average 8.1%, corresponding to ~92% sort efficiency), which is on par with the theoretical maximum efficiency cited by the manufacturer. The general failure rate of the PCR was calculated for wells where one of the targets did not amplify but the other target did (suggesting that a cell was present, but PCR amplification did not occur for both genes). The frequency of failure of each individual gene was assessed in the entire data set as well and determined by t test to not be significantly different within each patient sample data set between all genes (suggesting that no bias was imparted in the overall genotype distribution because of one target failing significantly more often than the other). When the frequency of failure of each individual gene was analyzed in a large data set (Pt4-R, 1494 sorted wells total), the frequency of failure of any of the genes was not significantly different via t test between the two-step and one-step PCR protocols, nor between different plates of the same patient’s sample (four plates tested for this patient).
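
As a rough illustration of how the two kinds of failure described above can be separated, the following Python sketch classifies wells by which targets amplified. The data, the function name `well_failure_rates`, and the choice of denominators (all sorted wells for sort failure, wells with at least one amplified target for PCR failure) are assumptions made for this example, not details reported in the study.

```python
def well_failure_rates(wells, genes=("FLT3", "NPM1")):
    """Classify sorted wells and return sort, PCR, and per-gene failure rates.

    Each well maps gene name -> True if that gene amplified, else False.
    Denominators are illustrative assumptions (see lead-in).
    """
    if not wells:
        raise ValueError("at least one sorted well is required")
    # No target amplified: counted as a sort failure (no cell in the well).
    empty = [w for w in wells if not any(w[g] for g in genes)]
    # At least one target amplified: a cell was presumably present.
    with_cell = [w for w in wells if any(w[g] for g in genes)]
    # A cell was present but not every target amplified: PCR failure.
    partial = [w for w in with_cell if not all(w[g] for g in genes)]
    per_gene = {
        g: sum(1 for w in with_cell if not w[g]) / len(with_cell)
        for g in genes
    } if with_cell else {}
    return {
        "sort_failure": len(empty) / len(wells),
        "pcr_failure": len(partial) / len(with_cell) if with_cell else 0.0,
        "per_gene_failure": per_gene,
    }

# Hypothetical amplification results for five sorted wells.
wells = [
    {"FLT3": True, "NPM1": True},
    {"FLT3": True, "NPM1": False},
    {"FLT3": False, "NPM1": False},
    {"FLT3": True, "NPM1": True},
    {"FLT3": False, "NPM1": True},
]
rates = well_failure_rates(wells)
```

Comparing the entries of `per_gene_failure` is the kind of check that would reveal whether one target fails more often than the other.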

Allelic dropout assessment for NPM1 in cell line cells

The OCI-AML3 cell line is heterozygous for the NPM1 insertion mutation, with a bulk allele frequency of 0.5 assessed by fragment analysis. When analyzed by single-cell analysis for NPM1 to identify if allele dropout of either allele at the same rate was influencing the apparent single-cell zygosities, 33.2% of cells were homozygous wild type, 36.3% of cells were heterozygous, and 30.6% of cells were homozygous mutant. For this sample size of 340 cells, the allele frequency calculated from the single-cell data is 0.49. The data were confirmed with an additional 72 cells sorted and analyzed independently, demonstrating that rampant allele dropout in the NPM1 assay was not resulting in all heterozygous cells being scored as distributions of wild type, heterozygous, and homozygous mutant due to PCR failure of one allele or the other.
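
The allele frequency quoted above follows from the genotype fractions if each heterozygous cell is counted as contributing half a mutant allele and each homozygous mutant cell a full one. A minimal Python sketch of that arithmetic (the function name is hypothetical; a diploid locus is assumed):

```python
def mutant_allele_frequency(hom_wt, het, hom_mut):
    """Mutant allele frequency from genotype counts or percentages.

    Assumes a diploid locus: heterozygous cells carry one mutant allele
    out of two, homozygous mutant cells carry two out of two.
    """
    total = hom_wt + het + hom_mut
    return (0.5 * het + hom_mut) / total

# Genotype percentages reported for the OCI-AML3 single cells.
vaf = mutant_allele_frequency(hom_wt=33.2, het=36.3, hom_mut=30.6)
print(round(vaf, 2))  # 0.49
```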

To further confirm the presence of homozygous wild-type and mutant cells in this cell line, single-cell originating colony assays were performed using OCI-AML3 cells from the same sort as those used for single-cell genotyping. The single cells were sorted into a 96-well plate with alpha minimum essential medium (α-MEM) (supplemented with 20% fetal bovine serum, penicillin, and streptomycin), and cultured for 10 days to produce colonies in excess of 500 cells each. Plates were centrifuged, medium was removed via individual aspiration, and remaining cells were verified by microscopy before lysis. Molecular biology grade water was added to the wells, and the plate sealed and frozen at −80°C for 1 hour, then defrosted, vortexed, and centrifuged. The lysate produced was transferred directly to the NPM1 PCR using fluorescently labeled primers and analyzed by fragment analysis with the same protocol used for single cells. The resulting fragment analysis data were scored to identify the overall genotype of a random selection of 15 colonies, 10 of which were heterozygous (although allelic frequencies for NPM1 in heterozygous clones were far more variable than in bulk assays and were not all ~0.5), 2 were homozygous mutant, and 3 were homozygous wild type.

This assay confirmed that when multiple alleles are present in the PCR, in quantities such that complete allele dropout of either the wild-type or mutant alleles is unrealistic, homozygous wild-type, heterozygous, and homozygous mutant colonies are all detected. Given that in these colony assays, more than 500 cells were incorporated into the reactions, providing presumably ~1000 total NPM1 alleles per reaction, the likelihood of a 100% heterozygous sample having all 500 wild-type or mutant alleles fail to amplify is negligible.
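
To make "negligible" concrete, suppose (as a simplifying assumption not stated in the study) that each template copy fails to amplify independently with probability d. The chance that every one of the n copies of one allele fails is then d to the power n. Even with a deliberately pessimistic, hypothetical d of 0.5 and n = 500:

```latex
P(\text{all } n \text{ copies of one allele fail}) = d^{\,n},
\qquad
d = 0.5,\; n = 500 \;\Rightarrow\; 0.5^{500} \approx 3 \times 10^{-151}
```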

Plasmid validation of allele dropout in FLT3 and NPM1 PCR assays

Plasmids containing amplicons for both a wild-type and a mutant copy of either NPM1 or FLT3-ITD were fabricated by Life Technologies (sequences of inserts in table S3). The plasmid inserts were validated by the Life Technologies quality assurance protocol, and sequencing data were provided with the intact plasmid to confirm heterozygosity and purity. The plasmids were serially diluted to a concentration that, when added to the PCRs, would provide an average of 0.2 copies per reaction. We then applied this dilution to the NPM1 and FLT3 PCR assays and determined the number of wells that failed, those that were heterozygous, and any wells that appeared to show amplification but with only one allele present. A Poisson distribution was used to determine how many of those wells may have had two or more plasmid copies, and these wells were presumed to be scored as heterozygous because they would have multiple copies of both alleles in the reactions, making it unlikely that allele dropout would occur. After removing the maximum possible multicopy wells from the data sets, the resulting one-copy data were scored for genotype and allele dropout. The average percentage of wells in which one heterozygous NPM1 plasmid was present after Poisson distribution adjustment but the genotype appeared to be wild type was 4%, and the percentage appearing to be mutant was 3% (sample size was 121 single-copy wells after 35 heterozygous wells were removed because of the potential for being multicopy). The estimated rate of allele dropout for FLT3 was 6% for mutant and 4% for wild type, with a sample size of 139 single-copy wells after Poisson adjustment.
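
The Poisson reasoning can be sketched as follows, assuming the number of plasmid copies per well is Poisson distributed with a mean of 0.2. This is a generic calculation with hypothetical variable names. It does not attempt to reproduce the authors' more conservative "maximum possible" multicopy adjustment, whose details are not given here.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k copies in a well when the mean is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 0.2                      # mean plasmid copies per reaction (from the text)
p_zero = poisson_pmf(0, lam)   # wells with no template
p_one = poisson_pmf(1, lam)    # wells with exactly one plasmid
p_multi = 1.0 - p_zero - p_one # wells with two or more plasmids

# Among wells that received any template, the expected multicopy fraction.
frac_multi_given_template = p_multi / (1.0 - p_zero)

print(f"empty: {p_zero:.3f}, single: {p_one:.3f}, multi: {p_multi:.4f}")
print(f"multicopy share of template-containing wells: {frac_multi_given_template:.3f}")
```

Under these assumptions, roughly four of every five wells receive no template, and about one in ten template-containing wells is expected to hold more than one plasmid.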

Thus, when heterozygous plasmid DNA with one copy of each allele is applied to the NPM1 or FLT3 PCR, ~90% of the wells will be genotyped as heterozygous, with ~5% of the wells miscategorized as homozygous mutant and ~5% as homozygous wild type. The rate of plasmid allele dropout for FLT3 was slightly higher than for NPM1, but when patient samples were analyzed, the calculated FLT3-ITD allele ratio from single patient cells was within 10% of the bulk values for a range of bulk allele ratios. These estimates of the maximum technical allele dropout rate in our assay were used to generate null hypothesis clonal distributions for comparison to single-cell data.
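
One way a dropout-only null distribution could be constructed is sketched below in Python. It assumes every cell is truly heterozygous at both loci and that dropout at the two loci is independent. Both assumptions, and all function names, are illustrative, and the text does not specify how the authors built their null distributions. The NPM1 rates are the plasmid estimates given above, and the FLT3 rates use the rounded ~5% figure.

```python
def observed_from_het(p_appears_wt, p_appears_mut):
    """Apparent genotype distribution for a truly heterozygous cell."""
    return {
        "WT": p_appears_wt,                        # mutant allele dropped out
        "HET": 1.0 - p_appears_wt - p_appears_mut, # both alleles detected
        "MUT": p_appears_mut,                      # wild-type allele dropped out
    }

def joint_null(gene_a, gene_b):
    """Joint apparent-genotype distribution, assuming independent loci."""
    return {(a, b): pa * pb
            for a, pa in gene_a.items()
            for b, pb in gene_b.items()}

flt3 = observed_from_het(p_appears_wt=0.05, p_appears_mut=0.05)
npm1 = observed_from_het(p_appears_wt=0.04, p_appears_mut=0.03)
null = joint_null(flt3, npm1)   # keys are (FLT3 call, NPM1 call)

for combo, p in sorted(null.items(), key=lambda kv: -kv[1]):
    print(combo, round(p, 4))
```

Under this null, nearly all cells would appear doubly heterozygous, so sizeable populations in the other eight combined genotypes would be hard to attribute to technical dropout alone.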


Table S1. Efficiency data calculated for multigene single-cell genotyping.

Table S2. Primer sequences used for single-cell genotyping.

Table S3. Plasmid insert sequences for FLT3-ITD and NPM1.


Funding: This work was supported in part by Research Scholar Grant #117209-RSG-09-163-01-CNE and Postdoctoral Fellowship #117749-PF-09-288-01-CCE from the American Cancer Society and NIH grants P01 CA91955, R01 CA149566, R01 CA114563, and R01 CA140657. Author contributions: A.L.P. was the primary investigator involved in the design, performance, and analysis of the experiments and was the primary writer of the paper; S.M. did initial experiments and contributed to the writing of the paper; J.S. performed the experiments and contributed to the writing of the paper; M.C. and C.M. helped design the experiments and contributed to the writing and analysis; and J.P.R. was the leader of this team and contributed to and oversaw the design, experiments, and writing. Competing interests: M.C. served as a paid member of the scientific advisory board on myeloid diseases for Janssen Pharmaceuticals. All other authors declare that they have no competing interests.
