Research Article: Pregnancy

Rare autosomal trisomies, revealed by maternal plasma DNA sequencing, suggest increased risk of feto-placental disease


Science Translational Medicine  30 Aug 2017:
Vol. 9, Issue 405, eaan1240
DOI: 10.1126/scitranslmed.aan1240

A complete look at fetal chromosomes

Genetic analysis of fetal DNA in maternal blood is becoming increasingly common, but the standard clinical tests typically consider only the chromosomes that are most frequently found to be aneuploid: 13, 18, 21, X, and Y. Pertile et al. analyzed patient data from two clinical laboratories and discovered that this approach may be ignoring valuable information. In both cohorts, the authors found a number of rare autosomal trisomies that are not reported on routine testing and showed that these are associated with an increased risk of pregnancy complications, indicating their potential relevance for clinical care.

Abstract

Whole-genome sequencing (WGS) of maternal plasma cell-free DNA (cfDNA) can potentially evaluate all 24 chromosomes to identify abnormalities of the placenta, fetus, or pregnant woman. Current bioinformatics algorithms typically only report on chromosomes 21, 18, 13, X, and Y; sequencing results from other chromosomes may be masked. We hypothesized that by systematically analyzing WGS data from all chromosomes, we could identify rare autosomal trisomies (RATs) to improve understanding of feto-placental biology. We analyzed two independent cohorts from clinical laboratories, both of which used a similar quality control parameter, normalized chromosome denominator quality. The entire data set included 89,817 samples. Samples flagged for analysis and classified as abnormal were 328 of 72,932 (0.45%) and 71 of 16,885 (0.42%) in cohorts 1 and 2, respectively. Clinical outcome data were available for 57 of 71 (80%) of abnormal cases in cohort 2. Visual analysis of WGS data demonstrated RATs, copy number variants, and extensive genome-wide imbalances. Trisomies 7, 15, 16, and 22 were the most frequently observed RATs in both cohorts. Cytogenetic or pregnancy outcome data were available in 52 of 60 (87%) of cases with RATs in cohort 2. Cases with RATs detected were associated with miscarriage, true fetal mosaicism, and confirmed or suspected uniparental disomy. Comparing the trisomic fraction with the fetal fraction allowed estimation of possible mosaicism. Analysis and reporting of aneuploidies in all chromosomes can clarify cases in which cfDNA findings on selected “target” chromosomes (21, 18, and 13) are discordant with the fetal karyotype and may identify pregnancies at risk of miscarriage and other complications.

INTRODUCTION

Analysis of circulating cell-free DNA (cfDNA) in maternal plasma has profoundly transformed prenatal screening for trisomies 21, 18, and 13. In a recent meta-analysis of 41 independent studies, cfDNA analysis had positive predictive values for trisomy 21 of 91 and 82% for women at high and low risk, respectively, of having an affected fetus (1). Furthermore, in two studies of general obstetric populations that compared detection of trisomies 21 and 18 by cfDNA analysis versus serum biochemical testing and nuchal translucency sonographic measurement, cfDNA analysis performed better (2, 3). Despite the improved screening accuracy compared to nongenetic assays, a small percentage of cfDNA tests are complicated by atypical findings (4), results that are discordant with the fetal or neonatal karyotype (5–10), or test failures (11).

Several different laboratory techniques are currently used to analyze the circulating cfDNA in maternal plasma. These include whole-genome sequencing (WGS), targeted sequencing, single-nucleotide polymorphism (SNP) comparison, and microarrays (12–14). In the WGS technique, the bioinformatics algorithms are designed and validated to detect aneuploidies in the test chromosomes 13, 18, 21, X, and Y by comparing and normalizing sequence counts in the test chromosomes to nontest reference chromosomes. A potential disadvantage of this approach is that a true aneuploidy of a reference chromosome may be falsely called as an aneuploidy of a test chromosome. Some laboratories have developed and incorporated quality control parameters to help detect abnormalities in reference and other nontest chromosomes so that cases with likely anomalous results can be either flagged for review or cancelled.

Here, we systematically analyzed WGS data from all chromosomes in two independent clinical laboratories that use a common quality control parameter to flag anomalies in nontest chromosomes. We hypothesized that by visualizing and quantifying the frequency of various rare chromosomal abnormalities, we could clarify cases in which cfDNA test results are discordant with the fetal karyotype, thus identifying pregnancies that are at risk of miscarriage and other feto-placental and maternal complications.

RESULTS

Combined cohort demographics

WGS data were generated from 89,817 unique pregnancies: 72,932 from cohort 1 (Illumina, Redwood City, CA) and 16,885 from cohort 2 [Victorian Clinical Genetics Services (VCGS), Melbourne, Australia]. Samples from these sets were flagged as cases of interest if the normalized chromosome denominator quality (NCDQ) value was less than 50. The overall study flow is shown in Fig. 1. A summary of the participant demographics for both cohorts is given in Table 1.

Fig. 1. Overall study flow showing numbers and distribution of participants.

n, number; BI, bioinformatics; TN, technical noise; WNL, within normal limits.

Table 1. Participant demographics.

MA, maternal age; GA, gestational age; MAD, median absolute deviation; na, not available.


The median maternal ages in the flagged and nonflagged samples within each cohort were significantly different (Table 1). In both cohorts, median maternal ages were higher in the flagged samples. There were no differences between the cohorts in the median maternal ages of the flagged samples.

Within each cohort, there was no difference in the median gestational ages between the flagged and nonflagged subsets (P = 0.830 and 0.847 for cohorts 1 and 2, respectively). When comparing the median gestational ages between cohorts, significant differences were found for both the flagged and nonflagged samples (P < 0.0005 for each comparison). Cohort 1 samples were collected at a later mean gestational age (13.8 weeks) than samples from cohort 2 (11.0 weeks). This finding is also reflected in the lower proportion of samples collected in the first trimester of pregnancy for cohort 1 (65.6%) than for cohort 2 (92.8%). In both cohorts, fewer than 2% of tests were performed in the third trimester.

A summary of the indications for cfDNA screening could not be ascertained in cohort 1 because the data set was de-identified and scrubbed of all clinical data except for maternal and gestational ages. For cohort 2, the indications for testing were as follows: primary screen (~80%), advanced maternal age (15%), and follow-up of combined first trimester screening results (4%). Only 1% of women in cohort 2 had fetal sonographic abnormalities; they were counseled to have a prenatal diagnostic procedure.

NCDQ values and chromosomal findings

For analysis, the cases were binned into five NCDQ groups, as described in Materials and Methods. Although the NCDQ results were independently obtained in each clinical laboratory, the distribution of cases across these five groups was essentially identical between cohorts 1 and 2. No significant differences were found (Table 2).

Table 2. Stratification of clinical samples by NCDQ values.

In cohort 1, 518 samples were flagged for having borderline, warning, or abnormal NCDQ values (0.71%) (Table 2). Visual review of the sequencing data showed that 328 of these cases (63.4%) had chromosomal abnormalities (Fig. 1). Single rare autosomal trisomies (RATs) were observed in 246 of 518 flagged samples (47.5%; Fig. 2). Copy number variants (CNVs), defined as single, double, or triple segmental duplications or deletions, were found in 35 samples (6.8%). In 31 samples (6.0%), combinations of two or three abnormalities (trisomies, CNVs, and, more rarely, monosomies) were observed. A dysploid pattern, defined as more than three distinct chromosomal regions with major areas of genomic imbalance, was observed in 16 samples (3.1%). There were 53 samples (10.2%) in which no chromosomal abnormalities could be visualized on any of the nontarget chromosomes; 96% of these were found in group A. There were 137 cases (26.4%) in which the interpretation was uncertain. The latter were frequently associated with technical noise and high guanine-cytosine (GC) bias.

Fig. 2. Results of interpretation of visual sequence data for both cohorts, indicating absolute numbers of each abnormality.

The “Uncertain” category reflected cases in which the technical noise precluded interpretation of the data. The “Other” category indicated two or three different abnormalities in the same sample (for example, trisomy plus sex chromosome aneuploidy).

In cohort 2, 109 samples (0.65%) had borderline, warning, or abnormal NCDQ values (Table 2). The largest subgroup (71 cases, 65.1%) showed clear evidence of whole-chromosome trisomy or segmental duplication or deletion (Fig. 2). Single RATs were observed in 60 samples (55.0%). Single and double segmental duplications or deletions were found in seven samples (6.4%). Three cases (2.8%) were categorized as “Other” because the CNVs were observed in combination with trisomy. In one subject (0.9%), there was evidence of dysploidy. Six cases (5.5%) were classified as within normal limits because no abnormal sequencing patterns could be visualized on any of the nontarget chromosomes. There were 32 samples (29.4%) with technical noise that were classified as uncertain, as described in Materials and Methods.

Single RATs

For both cohorts, trisomy of a whole nontarget chromosome (RAT) was the single most common finding (Fig. 2), observed in 306 of 627 flagged samples (48.8%). In order of decreasing frequencies, the most commonly detected single RATs in both cohorts were 7, 15, 16, and 22 (Fig. 3). The RATs were distributed across all NCDQ groups: 122 in group A, 85 in group B, 80 in group C, and 19 in group D.

Fig. 3. Bar graph indicating the absolute numbers of single RATs observed in both cohorts.

The chromosome number is shown on the x axis. No trisomies were observed for chromosomes 1, 17, and 19, so they are not included in the figure.

The data were further analyzed by gestational age at time of blood draw (Table 3). When stratified by trimester, differences in the frequency of RATs were observed within cohort 2 (P = 0.023) but not within cohort 1 (P = 0.253). These differences disappeared when the cohorts were combined (P = 0.126). The aggregate frequencies of RATs in cohort 1 were 0.34% (n = 163 of 47,829), 0.32% (n = 78 of 24,133), and 0.52% (n = 5 of 970) for the first, second, and third trimesters, respectively (Table 3). For cohort 2, the frequencies were 0.34% (n = 53 of 15,643), 0.49% (n = 6 of 1230), and 8.3% (n = 1 of 12) for the three trimesters, respectively. When comparing the frequency of RATs, there were no statistical differences between cohorts 1 and 2 when stratified by trimester or in aggregate (P = 0.122).
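
The trimester-specific frequencies quoted above follow directly from the counts given in the text. The sketch below recomputes them in Python. The names `COHORTS`, `rat_frequency_percent`, and `summarize` are illustrative and are not part of the study's analysis pipeline, and no significance testing is attempted.

```python
# (RATs, total samples) for the first, second, and third trimesters,
# copied from the counts quoted in the text.
COHORTS = {
    "cohort 1": [(163, 47_829), (78, 24_133), (5, 970)],
    "cohort 2": [(53, 15_643), (6, 1_230), (1, 12)],
}


def rat_frequency_percent(rats: int, total: int) -> float:
    """Return the RAT frequency as a percentage of all samples."""
    return 100.0 * rats / total


def summarize(cohorts: dict) -> dict:
    """Compute per-trimester and aggregate RAT frequencies for each cohort."""
    summary = {}
    for name, rows in cohorts.items():
        rats = sum(r for r, _ in rows)
        total = sum(n for _, n in rows)
        summary[name] = {
            "per_trimester": [round(rat_frequency_percent(r, n), 2) for r, n in rows],
            "rats": rats,
            "total": total,
            "aggregate": round(rat_frequency_percent(rats, total), 2),
        }
    return summary


if __name__ == "__main__":
    for name, stats in summarize(COHORTS).items():
        print(name, stats)
```

The third-trimester figure for cohort 2 rests on a single RAT among 12 samples, so that percentage should be read with the small denominator in mind.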

Table 3. Distribution of total samples across trimesters compared to the number and frequency of RATs.

Effects of RATs on clinical test results

Of the 246 observed trisomies of nontarget chromosomes in cohort 1, 172 involved reference chromosomes. In 21 of these cases, one or more putative (false-positive) monosomy calls were reported. In all cases in which the clinical test result was affected, the amplitude of the trisomic fraction was greater than or equal to 6% (average, 15%), with 20 of 21 cases being in the NCDQ group C or D. The algorithm used the higher counts of the trisomic reference (denominator) chromosome(s) to normalize results for one or more of the test chromosomes, which resulted in lower values for the normalized test chromosome. These lower values were algorithmically interpreted and reported as monosomies of the test chromosome (13, 18, or 21).
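
A minimal numerical sketch of the mechanism described above, under simplifying assumptions that are not taken from the study: a single reference chromosome serves as the denominator, read counts scale linearly with copy number, and all function names are hypothetical. It shows how a trisomic denominator lowers the normalized value of a euploid test chromosome, and it expresses that drop as the fetal fraction at which a true monosomy would produce the same shift.

```python
def copy_number_scale(affected_fraction: float, copies: int) -> float:
    """Relative read count of a chromosome when `affected_fraction` of the
    cfDNA comes from cells carrying `copies` copies instead of the usual two."""
    return 1.0 + affected_fraction * (copies - 2) / 2.0


def normalized_shift_from_trisomic_reference(trisomic_fraction: float) -> float:
    """Relative change in (test count / reference count) for a euploid test
    chromosome when the single reference chromosome is trisomic."""
    return 1.0 / copy_number_scale(trisomic_fraction, copies=3) - 1.0


def equivalent_monosomy_fraction(trisomic_fraction: float) -> float:
    """Fetal fraction at which a true monosomy of the test chromosome would
    give the same downward shift in the normalized value."""
    shift = normalized_shift_from_trisomic_reference(trisomic_fraction)
    # A monosomy at fraction f scales the test count by (1 - f / 2),
    # so a relative shift s corresponds to f = -2 * s.
    return -2.0 * shift


if __name__ == "__main__":
    # 6% is the lower bound and 15% the average trisomic fraction quoted in the text.
    for tf in (0.06, 0.15):
        shift = normalized_shift_from_trisomic_reference(tf)
        eq = equivalent_monosomy_fraction(tf)
        print(f"trisomic fraction {tf:.0%}: normalized value shifts by {shift:.2%}, "
              f"comparable to a monosomy at fetal fraction {eq:.2%}")
```

Clinical algorithms may normalize against several reference chromosomes at once, which would dilute the effect; the single-denominator case is used here only because it makes the direction of the bias easy to see.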

In cohort 2, the algorithm was designed to automatically cancel (fail) any result with NCDQ values less than or equal to −100, so false-positive monosomies were not generated. These cases were analyzed, and results were reported as described in Materials and Methods. There were 26 cases in cohort 2 in which a low NCDQ value resulted in test failure; 24 of these involved a single RAT (table S1). The remaining two cases of NCDQ failure involved a segmental deletion on chromosome 10q and a single case of dysploidy.

Clinical outcome data

Cytogenetic or pregnancy outcome data were available for 57 of 71 samples (80%) classified as abnormal in cohort 2, and of these, single RATs comprised the largest group with recorded outcomes (52 of 60; 87%). RATs were associated with early miscarriage, in utero fetal demise (IUFD), intrauterine growth restriction (IUGR), true fetal mosaicism (TFM), confirmed or suspected uniparental disomy (UPD), and normal pregnancy outcomes (Table 4 and table S1). Samples with outcome data not involving single RATs were four pregnancies with single CNVs and one case of dysploidy (multiple CNVs and aneuploidy). The single CNV was not confirmed in any of these cases after chorionic villus sampling (CVS) or amniocentesis using SNP microarrays. In one pregnancy with a single CNV, a normal live birth was recorded without pre- or postnatal cytogenetic investigations. The single case of dysploidy, observed in maternal blood at both 10 and 13 weeks, had normal prenatal (by CVS) and postnatal (by umbilical cord blood) SNP chromosome microarray (CMA) results, with a normal live birth outcome. Repeat cfDNA testing of maternal blood at 22 weeks of gestation failed to show any abnormalities. Additional details are given in Discussion.

Table 4. Clinical outcome data for cohort 2 cases with RATs.

Misc., miscarriage; MCA, multiple congenital abnormalities; NLB, normal live birth; NA, follow-up not available; Mat, maternal; Av., average; TF, trisomic fraction; FF, fetal fraction.


Of 52 single RATs with outcome data, 22 samples (42%) were associated with an early or missed miscarriage (<11 to 12 weeks of gestation). Miscarriage was reported in 13 of 14 samples (93%) with trisomy 15 and in 3 of 5 samples (60%) with trisomy 22. Single cases of trisomies 9, 10, 14, and 20 and two cases of trisomy 16 were also recorded as miscarriages. Another case of trisomy 9 was associated with a co-twin demise at 9 weeks of gestation. Cytogenetic investigation on products of conception (POC) was carried out in five miscarriage samples. In each case, the RAT was confirmed by using SNP microarrays: three cases of trisomy 15 (placental villi), one case of TFM for trisomy 22 (fetal skin), and one case of nonmosaic trisomy 9 (fetal skin) in a pregnancy that was terminated after multiple fetal anomalies were observed on ultrasound examination.

There were 17 pregnancies involving single RATs that proceeded to amniocentesis. TFM was recorded in five of these samples: three cases of trisomy 2 mosaicism and single cases of trisomy 7 and 16 mosaicism. UPD was suspected or confirmed in seven pregnancies (13%): two cases with putative UPD2 (both seen in association with TFM, as described above), two cases with putative UPD4, and two cases with putative UPD16 (fig. S1). In the seventh case, maternal uniparental heterodisomy for chromosome 15 (matUPD15), causing Prader-Willi syndrome, was confirmed by a comparative analysis of fetal and parental SNP microarray genotyping data.

Three pregnancies were associated with second-trimester IUFD (6%): one case of trisomy 2 and two cases of trisomy 16 with putative UPD16 (as described above). Placental biopsies were available for analysis in two of these cases, and confined placental mosaicism (CPM) was confirmed in both (100% trisomy 2; 100% trisomy 16). Both of these pregnancies were complicated by severe IUGR. The trisomy 16 CPM was determined to have a meiotic origin, whereas the trisomy 2 case resulted from an apparent mitotic (postzygotic) gain of chromosome 2. One small-for-dates pregnancy with trisomy 4 was associated with IUFD in the third trimester (2%). Postmortem examination at 40 weeks of gestation demonstrated asymmetrical growth restriction and fetal death due to placental insufficiency. Widespread fetal vascular malperfusion was evident throughout the placenta. No follow-up cytogenetic data are available.

Normal amniocentesis results were obtained in seven pregnancies (13%) for single samples associated with trisomies 2, 7, 9, 16, and 22 and for two cases of trisomy 10. These pregnancies proceeded to phenotypically normal live births, except for the case with trisomy 9, which was associated with IUGR and cleft palate at birth. Placental biopsies were available after delivery in three of these cases (trisomies 2, 7, and 10), and CPM was confirmed in each. The trisomies for chromosomes 2 and 7 had a likely mitotic (postzygotic gain) origin based on SNP microarray genotyping data, whereas the trisomy 10 case had a meiotic origin with trisomy correction. Another four cases of trisomy 7 and single cases of trisomies 3, 8, and 16 were described as having normal birth outcomes (13%), without pre- or postnatal cytogenetic investigations. One case of trisomy 20 was delivered at 35 weeks and 2 days of gestation with IUGR. No further information was available. Finally, one case of trisomy 8 was the result of a low-grade (10%) maternal mosaicism for trisomy 8, as identified by SNP microarray analysis of maternal blood.

In summary, relevant abnormal feto-maternal findings were recorded in 39 of 52 (75%) single RAT samples with outcome data available, defined as early miscarriage, IUFD, IUGR, TFM, UPD, and maternal constitutional mosaicism (Table 4). Thirteen cases were associated with a normal live birth. Full details of cytogenetic and clinical outcomes for the cohort 2 RAT cases are recorded in table S1.

Analysis of trisomic and fetal fractions to determine potential biologic mechanisms

In both cohorts, differences between the fetal fraction, as calculated by chromosomal coverage counts, and the trisomic fraction, as calculated by visual inspection of the increases in counts above the expected euploid average, suggested the presence of mosaicism (fetal, placental, or maternal) when the trisomic fraction was much lower than the fetal fraction or a maternal contribution alone when the trisomic fraction was much higher than the fetal fraction (Fig. 4).

Fig. 4. Dot plots of trisomic versus fetal fractions for cohorts 1 and 2.

The x axis indicates the trisomic fraction, and the y axis shows the fetal fraction. Each dot represents a distinct pregnancy, and the colors represent different chromosomes. Results showed that, in many cases, the trisomic fraction was much less than the estimated fetal fraction (dots within yellow triangles). This disparity suggests that, in cases falling above the identity line, placental mosaicism is likely. (A) The circled data point in cohort 1 shows a trisomic fraction that was much greater than the fetal fraction (red circle). Maternal mosaicism for trisomy 8 was suspected. (B) The circled data point in cohort 2 is a confirmed case of trisomy 8 maternal mosaicism.

In cohort 1, a preponderance of cases in which the trisomic fraction was lower than the fetal fraction suggested that many cases could be due to CPM (Fig. 4A). In cohort 2, these differences were less pronounced such that the trisomic fraction approximated the fetal fraction in a higher proportion of samples (Fig. 4B). In cohort 2, when the trisomic fraction and fetal fraction were similar, an increased risk for miscarriage, IUFD, IUGR, TFM, and UPD was observed, as evidenced by the cytogenetic and pregnancy outcome data (Table 4 and table S1). Furthermore, when the samples were grouped according to these pregnancy outcomes, the ratio of the average trisomic to the average fetal fractions was about 1.0, which is consistent with very high frequencies of trisomic cells in the placenta, as might be expected for a trisomic conception (Table 4). In cohort 2, pregnancies without these complications had much lower average ratios of trisomic to fetal fractions (0.62), which is more suggestive of CPM (Table 4). Results for trisomies 3 and 7 suggested CPM in most samples (trisomic fraction lower than fetal fraction), whereas results for trisomies 2, 4, 9, 15, 16, and 22 suggested a trisomic conception in most samples (trisomic fraction approximating fetal fraction). These trisomies also had worse clinical outcomes.

Rare maternal autosomal mosaicism may also be detected in apparently phenotypically normal individuals. In both cohorts, there was one case each of maternal trisomy 8 mosaicism. In cohort 1, maternal trisomy 8 mosaicism was suspected because the trisomic fraction (64%) was much greater than the fetal fraction (4%) (Fig. 4A). In the cohort 2 case (Fig. 4B), the estimated trisomic fraction was only marginally greater than the fetal fraction (14.1% versus 11.1%), but trisomy 8 mosaicism was confirmed in a maternal blood sample by using a SNP microarray and was not observed in placental villi examined later by SNP microarray after an elective termination of pregnancy.
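
The comparison of trisomic and fetal fractions described in this section can be sketched as a simple ratio. The Python example below is an illustration only: the cutoffs `low` and `high` are arbitrary placeholders chosen for the example, not thresholds used or validated in the study, and the function names are hypothetical.

```python
def trisomic_to_fetal_ratio(trisomic_fraction: float, fetal_fraction: float) -> float:
    """Ratio of trisomic fraction to fetal fraction (both in the same units)."""
    if fetal_fraction <= 0:
        raise ValueError("fetal fraction must be positive")
    return trisomic_fraction / fetal_fraction


def interpret(trisomic_fraction: float, fetal_fraction: float,
              low: float = 0.7, high: float = 2.0) -> str:
    """Map the ratio to the qualitative readings discussed in the text.

    `low` and `high` are illustrative placeholders, not study-derived cutoffs.
    """
    ratio = trisomic_to_fetal_ratio(trisomic_fraction, fetal_fraction)
    if ratio > high:
        return "trisomic fraction far above fetal fraction: consider maternal contribution"
    if ratio < low:
        return "trisomic fraction well below fetal fraction: mosaicism (for example CPM) more likely"
    return "trisomic fraction approximates fetal fraction: trisomic conception possible"


if __name__ == "__main__":
    # Percentages quoted in the text for the two maternal trisomy 8 mosaicism cases.
    print(interpret(64.0, 4.0))    # cohort 1 case
    print(interpret(14.1, 11.1))   # cohort 2 case
```

With these placeholder cutoffs the cohort 2 case (14.1% versus 11.1%) falls in the middle category even though maternal mosaicism was confirmed by SNP microarray, which illustrates that the ratio alone cannot exclude a maternal origin.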

Comparison of the frequencies of RATs found by cfDNA sequencing versus by CVS

We next determined the frequency and type of single RATs identified by cfDNA analysis compared with the frequency of RATs reported previously in the CVS literature (fig. S2 and table S2) (15). The frequency of 0.34% for RATs in the cfDNA study samples (306 of 89,817) was similar to the 0.30% frequency of RATs observed after short-term culture (STC) karyotype analysis of cytotrophoblast cells from CVS (P = 0.249) (15). In addition, the frequency of trisomy 7, which is the most commonly reported RAT in both the noninvasive prenatal testing (NIPT) and CVS data sets, was comparable at 0.0746 and 0.0795%, respectively (P = 0.776) (table S3 and fig. S2). The observed frequency of other RATs was also in broad agreement with the CVS literature (15–17), although some RATs (such as trisomies 9, 15, 16, and 22) were more frequently represented in the sequencing data set, whereas others were more frequent in the CVS data set (most notably trisomy 3) (table S3 and fig. S2).

DISCUSSION

Here, we systematically analyzed maternal plasma WGS data from all 24 chromosomes using a quality control parameter, NCDQ, to identify potentially abnormal samples from two independent cohorts representing nearly 90,000 distinct pregnancies undergoing NIPT by cfDNA sequence analysis. The major strengths of this study are its large size, the fact that its key findings were replicated in two completely independent cohorts, and its outcome data, which provide information on both the clinical relevance of the findings and the potential underlying biological explanations for the abnormal test results. Furthermore, our data suggest that a comparison of the trisomic and fetal fractions may be helpful in estimating the tissue distribution of aneuploidy, which may aid in determining the clinical prognosis.

Single RATs were the most frequently observed chromosome abnormalities in both study cohorts, accounting for 0.34% of total samples and 48.8% (306 of 627) of cases identified by low NCDQ values. Although the data are limited, several smaller NIPT series incorporating WGS analysis have reported RAT frequencies of 0.35, 0.28, and 0.78% (table S2) (18–20). The frequencies of RATs identified within cohorts 1 and 2 (0.34 and 0.36%, respectively) and the types of RATs independently observed in our cohorts and the published literature were also very similar. The samples in which RATs were identified were associated with an increased risk for feto-placental disease, including miscarriage, IUFD, IUGR, TFM, and UPD. The cytogenetic and clinical data for RAT outcomes in cohort 2 are sufficiently complete (52 of 60; 87%) to demonstrate that a biological cause is responsible for most of the observed abnormalities. With the exception of rare cases of maternal mosaicism, all trisomies identified by WGS were predicted to be present in the cytotrophoblast cell lineage of the chorionic villi, the primary source of fetal cfDNA. The trisomic cells may or may not extend into the extraembryonic mesoderm. In some instances, fetal tissues harbored the trisomy, in either mosaic or nonmosaic form. The outcome data from cohort 2 support a high frequency of nonmosaic RATs, with 42.0% of pregnancies with known outcomes associated with an early or missed miscarriage. This was most common for trisomy 15, in which 13 of 14 pregnancies miscarried. All trisomy 15 pregnancies, except one, miscarried between the time a viable fetus was detected by sonography and the time blood was collected at 10 to 11 weeks of gestation. In each case of early miscarriage, regardless of the chromosome involved, the trisomic fraction approximated the fetal fraction. Thus, this finding indicates an increased risk for fetal involvement, at least in the first trimester of pregnancy. It should also be noted that 13 of 60 pregnancies with RATs in cohort 2 had normal birth outcomes, despite the presence of very high frequencies of trisomic cells in several cases. These trisomies involved multiple different chromosomes (Table 4 and table S1).

By comparison, single RATs were reported in 0.60% of about 60,000 CVS karyotypes analyzed by both STC of cytotrophoblast and long-term culture (LTC) of chorionic mesenchyme (15). When only STC analysis was considered, the frequency of single RATs was 0.30% (1 of 333 STC). This frequency is similar to the 0.34% (1 of 294) observed in our combined cohorts (P = 1.0). Although the CVS and cfDNA study populations are not entirely uniform (fetal viability is always confirmed before CVS, there are differences in the timing of sample collection, and the sensitivity of mosaicism detection with cfDNA analysis is not well defined), the overall frequency of RATs in our cfDNA cohorts and those of others appears to be in broad agreement with the CVS literature (15–17).

Trisomy 7 was observed most frequently in both study cohorts and is the most commonly observed RAT in CVS samples that incorporated STC of cytotrophoblast in their analysis. Other RATs commonly observed in our series, in descending order, were trisomies 15, 16, and 22; trisomies 3, 8, 9, 20, 10, and 2 were observed at intermediate frequencies. Trisomies for chromosomes 14, 4, 11, 6, 5, and 12 were relatively uncommon, and trisomies 1, 17, and 19 were not observed at all.

A meiotic origin of trisomy associated with placental mosaicism is correlated with an increased risk for adverse pregnancy outcomes (16, 21, 22). A meiotic origin for the trisomy increases the risk for TFM through persistence of the trisomic cell line in fetal tissues. Large numbers of trisomic cells in the placenta may adversely affect fetal growth. This complication is well documented for trisomy 16, which primarily has a maternal meiotic origin (23–25). In addition, trisomic cell lines can revert to disomy by unequal division at metaphase. Each of the three trisomic chromosomes has an equal chance of being lost. If the two remaining chromosomes are from the same parent, UPD results. This can have clinical consequences if the chromosome contains imprinted regions. Thus, pregnancy outcomes associated with RATs may include TFM, UPD, IUFD, and severe IUGR associated with placental insufficiency. These outcomes were well documented in cohort 2. Because chromosomes 2, 4, and 16 are not associated with known imprinting disorders (26), specific testing to confirm cases of suspected UPD for these chromosomes was not attempted. However, large regions of homozygosity present on these chromosomes, together with evidence for trisomy by cfDNA analysis, provided strong circumstantial evidence for UPD after trisomy rescue (27).
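
The trisomy rescue argument above can be made concrete by enumerating the three equally likely chromosome losses. The Python sketch below assumes, for illustration, a trisomy of maternal meiotic origin (two maternal copies and one paternal copy); the function names are hypothetical, and the model ignores any selection for or against particular cell lines.

```python
from fractions import Fraction


def rescue_outcomes(parental_origins: tuple) -> list:
    """List the disomic pairs left after losing each chromosome in turn.

    Each loss is treated as equally likely, as stated in the text.
    """
    return [
        tuple(origin for i, origin in enumerate(parental_origins) if i != lost)
        for lost in range(len(parental_origins))
    ]


def upd_probability(parental_origins: tuple) -> Fraction:
    """Probability that the two remaining chromosomes share a parent of origin."""
    outcomes = rescue_outcomes(parental_origins)
    upd_count = sum(1 for pair in outcomes if len(set(pair)) == 1)
    return Fraction(upd_count, len(outcomes))


if __name__ == "__main__":
    # Assumed example: maternal meiotic trisomy (two maternal copies, one paternal).
    trisomy = ("maternal", "maternal", "paternal")
    print(rescue_outcomes(trisomy))
    print(upd_probability(trisomy))
```

Under this equal-loss model, one of the three possible losses (the paternal copy) leaves two maternal chromosomes, so the expected share of rescued cell lines with UPD is one in three.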

From the cohort 2 outcome data, the frequency of UPD is estimated at 1 in 2412 pregnancies (7 of 16,885; 0.04%). This frequency is a minimum estimate, because uniparental heterodisomy cannot be differentiated from biparental inheritance without further analysis of the SNP microarray data (27). This was evidenced by the sole case of maternal UPD15 reported here, which demonstrated a normal SNP CMA genotyping profile on analysis. The trisomic fraction was increased in each case associated with UPD, indicating an increased risk for a trisomic conception. Thus, when the RAT involves an imprinted chromosome and the trisomic fraction is high, the risk for a pathogenic UPD outcome is increased in pregnancies that do not spontaneously miscarry.
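
As a quick arithmetic check, the "1 in N" and percentage forms of the frequencies quoted in the text can be recomputed from the raw counts. The helper names in this Python sketch are illustrative only.

```python
def one_in_n(cases: int, total: int) -> int:
    """Express a frequency as '1 in N', with N rounded to the nearest integer."""
    return round(total / cases)


def percent(cases: int, total: int, digits: int = 2) -> float:
    """Express a frequency as a percentage rounded to `digits` decimals."""
    return round(100.0 * cases / total, digits)


if __name__ == "__main__":
    # UPD in cohort 2: 7 suspected or confirmed cases among 16,885 pregnancies.
    print(one_in_n(7, 16_885), percent(7, 16_885))
    # Single RATs in the combined cohorts: 306 among 89,817 samples.
    print(one_in_n(306, 89_817), percent(306, 89_817))
```

Because uniparental heterodisomy can go unrecognized without further analysis, the UPD count used here is a lower bound, and the true frequency may be higher than the computed value.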

In addition, our data suggest that when the calculated trisomic fraction is much lower than the fetal fraction, CPM is a more likely outcome. Similar results have been recently suggested by another group (28), although there was no clinical confirmation, as was done here in cohort 2. These pregnancies most likely represent normal conceptions that have gained the trisomic chromosome after fertilization. Two pregnancies in cohort 2 with normal birth outcomes had CPM of mitotic origin confirmed. In both cases, the trisomic fraction was decreased relative to the fetal fraction, and in one case, considerably so. Postfertilization errors resulting in trisomy are most frequently seen for chromosomes 3 and 7 when restricted to the cytotrophoblast cells of the placenta (CPM type I) or chromosome 2 when restricted to the extraembryonic mesenchyme (CPM type II) (15, 16). In this setting, CPM for trisomies 3 and 7 might be identified by cfDNA screening, possibly in association with lower trisomic fractions, whereas CPM for trisomy 2 would not be identified because it is absent from the cytotrophoblast. Mosaicism for these trisomies is usually clinically benign when it is compartmentalized within these placental cell lineages (CPM types I and II) (16, 17, 22).

Although most of our maternal plasma cfDNA data agree with previous CVS literature, some of the pregnancy outcome data from cfDNA screening presented here appear to be in conflict. For example, Malvestiti et al. (15) recorded 0 of 65 cases of TFM or UPD involving trisomy 2 among mosaic cases subsequently investigated by amniocentesis. Similarly, in the CVS study, 0 of 74 cases of trisomy 7 mosaicism were associated with TFM or UPD, and only 1 of 26 cases of trisomy 15 mosaicism was confirmed with UPD. This contrasts with three of five cases of TFM for trisomy 2 and two of five cases with putative UPD2 within cohort 2. For trisomy 7, one of two cases at amniocentesis was confirmed with TFM, and the sole ongoing pregnancy with trisomy 15 was associated with UPD15. These findings suggest that cfDNA screening may identify those pregnancies at highest risk for TFM and UPD, particularly when the trisomic fraction is high and both the cytotrophoblast cell lineage and the extraembryonic mesenchyme are involved (CPM type III and TFM type VI), although mesenchyme involvement will not be reflected in the trisomic fraction. The CVS literature indicates that these pregnancies are at greatest risk for adverse outcomes (16, 21, 22).

More rarely, some WGS evaluations implied maternal abnormalities, such as apparent maternal mosaic trisomy 8 (seen in both cohorts). The maternal trisomy 8 mosaicism in cohort 2 had a mitotic origin, consistent with previous reports of constitutional trisomy 8 mosaicism (29). There were also 17 cases of dysploidy, which suggested possible maternal neoplasms. Five of the 16 cases in cohort 1 were known to have maternal malignancies (30). The remaining 11 had unknown clinical outcomes. The single case of dysploidy in cohort 2 was thoroughly investigated with maternal whole-body imaging and tumor markers, and all test results were normal. The abnormal cfDNA findings seen at 10 and 13 weeks may have been associated with a uterine leiomyoma seen on ultrasound examination. The dysploidy resolved on a repeat cfDNA sample taken at about 22 weeks of gestation. The presence of a uterine leiomyoma is a previously documented reason for abnormal cfDNA results (31), although the cause of the resolution in this case is not known. Hemorrhagic infarction of the leiomyoma without clinical symptoms is a presumptive explanation for the change in the cfDNA results.

The NCDQ value was a useful parameter for prompting further visual review of the cfDNA sequencing data. Currently, each of the clinical laboratories uses the NCDQ value differently based on internal bioethics policies. The NCDQ value was used in cohort 1 as an alert for the presence of technical noise and/or possible abnormalities in nontest chromosomes. In some cases with low NCDQ values, a review of all chromosome sequencing data was performed upon patient or health care provider request, with findings reported in a letter to the health care provider. In cohort 2, it resulted in a systematic review of all sequencing data in lieu of an automatic assay failure due to presumptive poor quality. The bioinformatics algorithms used in these studies, if limited to reporting only on chromosomes 13, 18, 21, X, and Y, have the potential disadvantage of generating results that are not expected in a living fetus, such as monosomy 13. When such results are obtained, a full chromosomal analysis may reveal the underlying biologic basis for the unusual result. The clinical utility of a whole-chromosome analysis is still not entirely clear, but this study strongly suggests that a full analysis of all chromosomes may have clinical relevance, particularly with respect to TFM and pathogenic UPD, and in monitoring for an increased risk of IUGR in ongoing pregnancies.

At the present time, most laboratories do not clinically report RATs. These clinical outcome data demonstrate that the presence of a RAT, particularly at a proportion similar to the fetal fraction, was frequently associated with serious pregnancy complications. For this reason, we recommend that patients should be given the option of receiving test results from all chromosomes. Analysis of maternal cfDNA is essentially a liquid biopsy of the placenta. Further research is needed to determine the full clinical utility of reporting autosomal aneuploidy for any chromosome.

MATERIALS AND METHODS

Study design

The objective of this study was to retrospectively and systematically evaluate all chromosome changes in the plasma cfDNA of pregnant women undergoing NIPT within specified time periods in two different clinical laboratories using similar WGS protocols. Sequencing results were selected for visual inspection if an associated quality control parameter fell below a specified threshold. While visually reviewing sequencing results, some, but not all, of the observers were blinded to either the clinical test report findings, the clinical outcomes, or both. Upon visual review of the WGS data, the type and frequency of deviation from the expected two copies of each chromosome were noted and compared to demographic data and maternal and pregnancy health outcomes, if known. The frequencies of observed abnormalities were also compared to published data on CPM in chorionic villus tissue.

Sample selection

Two independent cohorts from clinical laboratories were analyzed. Both laboratories offer WGS of maternal plasma cfDNA and use similar (but not identical) workflows and bioinformatics pipelines to evaluate sequencing tags mapped to chromosomes 21, 18, 13, X, and Y (Fig. 1) (32). Both laboratories also use a similar quality control parameter, NCDQ, to flag deviations from the expected number of counts over nontest chromosomes. The entire data set included 89,817 samples.

Cohort 1 consisted of test data from 72,932 subjects derived from 84,945 continuous WGS tests performed between October 2013 and September 2014 in the Illumina Northern California Clinical Services laboratory (Redwood City, CA). Institutional Review Board (IRB) approval (Copernicus Group, IRB #ILL1-14-526, version 9-OCT-14) was obtained to waive patient informed consent for the reanalysis of WGS files that were de-identified. Study inclusion criteria were as follows: (i) gestational age at time of sampling was greater than or equal to 10 weeks; (ii) a value for the NCDQ parameter was available; (iii) blood samples had been drawn into nonexpired Streck DNA Blood Collection Tubes (BCT) and had arrived at the laboratory within the time frame required for analysis and with sufficient volume for testing; and (iv) if multiple test samples at different gestational ages were received from the same pregnancy, only one blood sample was selected for study. The second (later) sample was selected, unless the data for that sample were incomplete. Study exclusion criteria were as follows: (i) a gestational age of less than 10 weeks, (ii) inadequate blood volume, and (iii) blood collected into tubes other than Streck DNA BCT. In this cohort, clinical outcome information was only available for five subjects in whom a diagnosis of maternal malignancy was known. These women had previously consented to a WGS review of their NIPT data to participate in a different research study on discordant results due to maternal malignancy (30).

Cohort 2 was independently derived from 16,885 continuous WGS tests performed between April 2015 and August 2016 in the VCGS laboratory. Data reported were all generated as part of standard clinical follow-up for abnormal test results. Study inclusion and exclusion criteria were the same as for cohort 1.

In the 89,817 total study samples, any tests with an NCDQ value of less than 50 were flagged as cases of interest. This resulted in 627 samples [518 (0.71%) from cohort 1 and 109 (0.65%) from cohort 2] that were independently selected for detailed bioinformatics review and visual inspection of the WGS data.

The NCDQ parameter was part of the bioinformatics analysis pipeline used by the Illumina laboratory and licensed to VCGS. NCDQ was developed to identify cases with technical background noise in the sequencing data by measuring deviations in count densities (either higher or lower than the expected two–copy number values) over all nontest chromosomes. The degree of deviation is represented by a continuous variable, calculated using the following formula: [equation image]

In this equation, I was defined as all autosomes excluding target chromosomes 13, 18, and 21. A training set of 3000 euploid samples was used to calculate the mean (μ_I) and covariance matrix (Σ) of the normalized coverage for this set of I chromosomes. For any processed sample, x_I was defined as the chromosomal normalized coverage. The NCDQ was then calculated as the likelihood that the observed coverage of the set of I chromosomes for that sample represents the coverage of a euploid sample. For numerical convenience, the likelihood scores were shifted by 100 such that 99.9995% of NCDQ scores of normal euploid samples were greater than zero.

Most cases (99.3%) had values between 50 and 100, with the highest values indicating the least deviation from the theoretical two–copy number state. NCDQ values between 0 and 50 suggest an acceptable level of technical “noise,” whereas values below zero are suspected to indicate higher amounts of technical noise or other abnormalities. There is no lower limit to this parameter. Cases with an NCDQ value below −100 are considered likely to have abnormal findings. High deviations in counts can result in high negative NCDQ values (the lowest observed value was −185,000).

In the Illumina laboratory, the NCDQ values were initially used only as an informational tool. In the VCGS laboratory, the bioinformatics software program was adapted to automatically trigger a test failure result if the NCDQ value was below −100. Although the NCDQ parameter had originally been developed to detect technical noise, it subsequently became clear that it could also identify samples likely to have abnormalities in nontest chromosomes.

For the purpose of finding as many cases as possible with potential genomic abnormalities, an NCDQ value above zero was used as the study cutoff. The upper NCDQ value of 50 was chosen on the basis of empirical observations of findings in cases that had gone through visual review in the cohort 1 clinical laboratory. Cases with NCDQ scores between 0 and 50 had not been frequently found to have obvious genomic abnormalities and, because they represented less than 0.4% of all cases, were not overly burdensome to visually curate. A randomly selected set of 30 cases with NCDQ scores above 50 was assessed by visual inspection. No chromosomal abnormalities were noted in this set.

For both cohorts, samples with NCDQ values less than 50 (about 0.7% of all cases) were selected for further analysis (Table 2). Because NCDQ values were continuous variables unique to each test sample, to ensure true de-identification, the cases were further stratified and binned into one of four groups as follows: group A, values between 0 and 50; group B, between −100 and <0; group C, between −1000 and <−100; group D, all <−1000.
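The flagging and binning rule described above can be sketched in a few lines of Python. This is an illustration only, not the laboratories' pipeline: the function name `ncdq_group` is hypothetical, and the treatment of the exact boundary values (for example, assigning −100 to group B and −1000 to group C) is one reading of the stated ranges.

```python
from typing import Optional


def ncdq_group(ncdq: float) -> Optional[str]:
    """Assign a flagged NCDQ value to study group A-D.

    Returns None for values of 50 or more, which were not flagged.
    Boundary handling is an assumption based on the stated ranges.
    """
    if ncdq >= 50:
        return None   # not flagged for visual review
    if ncdq >= 0:
        return "A"    # values between 0 and 50
    if ncdq >= -100:
        return "B"    # between -100 and <0
    if ncdq >= -1000:
        return "C"    # between -1000 and <-100
    return "D"        # all values below -1000


if __name__ == "__main__":
    for value in (75, 25, -50, -500, -185000):
        print(value, ncdq_group(value))
```

Binning in this way replaces a continuous, sample-specific value with one of four labels, which is the de-identification purpose stated in the text.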

Bioinformatics review and data visualization

Each laboratory independently developed a process for evaluating cases in which the NCDQ parameter indicated a need to cancel or review the test. In the VCGS laboratory, quality control analysis included visual inspection of the normalized chromosome coverage data obtained from all chromosomes for each patient. These data were viewed in conjunction with NCDQ values of less than 50 to identify nontarget chromosomes with an obvious deviation in sequence counts. The sequencing data were subsequently analyzed using the WISECONDOR algorithm (33) to determine the count distribution across nontarget chromosomes and to bypass the restrictions of the automatic test failure mode imposed by the bioinformatics pipeline. In these cases, visual review of the WISECONDOR data allowed trained personnel to infer possible chromosomal abnormalities in the cfDNA sequencing results. In the Illumina laboratory, a bioethics policy prohibited further analysis of an individual patient’s sequencing data, even when low NCDQ values were present. IRB approval was therefore sought and granted to allow for visualization of sequencing data over all chromosomes from subjects provided that they were de-identified.

Although the methods for visual evaluation of the sequencing traces were independently developed at each laboratory, they were essentially identical. In both approaches, technical artifacts of sequencing were identified by noting the density of counts (in 100-kb bins) above and below the value of 1.0, the number assigned to the presumed two–copy number state. At the typical sequencing depth used, visualization of normal chromosomes shows a range of count values, with 90% falling in a zone that is within 10% of the assigned two–copy number value of 1 (fig. S3A). That is, count densities between 0.9 and 1.1 times the two–copy number value are considered to be within normal limits. Here, technical background noise was graded on a scale of 0 to 4 based on the distribution of counts that fell substantially above or below the two–copy number boundary values of 0.9 and 1.1 (table S4). Noise levels were evaluated independently of chromosomal abnormalities.

Grade 0 represented the typical range of count deviations (90% between 0.9 and 1.1) from the two–copy number averages that were seen over most euploid chromosomes. Grades 1 to 3 represented increasing count scatter above the theoretical euploid average. In grade 1, about 80% of counts were between the 0.9 and 1.1 bounds (fig. S3B); in grades 2 and 3, 60 and 40% of counts were within these bounds, respectively. Grade 4 represented nonlinear (often sinusoidal) scatter patterns that were found to be related to technical issues resulting from GC bias or suboptimal sample preparation.
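As a rough illustration of grades 0 to 3, the fraction of 100-kb bin values that fall within the 0.9 to 1.1 bounds can be computed and matched to the nearest percentage quoted above. This is a sketch under stated assumptions: the reviewers graded noise visually, the nearest-percentage mapping is not taken from the study, the function names are hypothetical, and grade 4 (nonlinear scatter) is not modeled.

```python
# Approximate fraction of counts within 0.9-1.1 quoted for each grade.
GRADE_TARGETS = {0: 0.90, 1: 0.80, 2: 0.60, 3: 0.40}


def fraction_in_bounds(bin_values, low=0.9, high=1.1):
    """Fraction of normalized bin values lying within [low, high]."""
    values = list(bin_values)
    if not values:
        raise ValueError("no bin values supplied")
    inside = sum(1 for v in values if low <= v <= high)
    return inside / len(values)


def nearest_noise_grade(bin_values):
    """Grade (0-3) whose quoted fraction is closest to the observed one.

    Assumption: nearest-value matching stands in for visual grading.
    """
    frac = fraction_in_bounds(bin_values)
    return min(GRADE_TARGETS, key=lambda g: abs(GRADE_TARGETS[g] - frac))


if __name__ == "__main__":
    quiet = [1.0] * 9 + [1.3]      # 90% of bins within bounds
    noisy = [1.0] * 4 + [1.3] * 6  # 40% of bins within bounds
    print(nearest_noise_grade(quiet), nearest_noise_grade(noisy))
```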

Cases were considered to be within normal limits if no chromosomal abnormalities were observed by visual inspection and technical noise levels were less than grade 3. Cases were considered to be uncertain when several abnormal chromosomal regions were identified visually but the counts over all chromosomes appeared erratic and the case was associated with a high degree of GC bias or noise (grades 3 and 4).

Whole-chromosome trisomies were identified when average counts over an entire chromosome were consistently increased above 1.01-fold (implying a trisomic fraction of >2%) (fig. S3C). Monosomies were identified if average counts over an entire chromosome were less than 0.99-fold. CNVs were identified if large segments of a chromosome (>20 Mb) were increased above 1.02-fold or if smaller segments were increased by a larger fraction, such that visual inspection showed an obvious abnormality. Segments less than 3 Mb in length could not be accurately assessed regardless of fraction size. Data from CNVs are not reported here because of the lack of sufficient outcome data on these cases.
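The whole-chromosome thresholds can be expressed as a simple rule on the mean normalized count of a chromosome. The sketch below is a simplification: the study relied on trained observers judging whether counts were consistently shifted, whereas this hypothetical function `call_whole_chromosome` uses only the mean and ignores noise grade, GC bias, and segmental changes.

```python
from statistics import mean


def call_whole_chromosome(bin_values, gain=1.01, loss=0.99):
    """Label a chromosome from its mean normalized bin count.

    Thresholds follow the text (above 1.01-fold, below 0.99-fold);
    using the mean alone is an assumption made for illustration.
    """
    avg = mean(bin_values)
    if avg > gain:
        return "trisomy"
    if avg < loss:
        return "monosomy"
    return "no whole-chromosome call"


if __name__ == "__main__":
    print(call_whole_chromosome([1.05] * 100))
    print(call_whole_chromosome([0.95] * 100))
    print(call_whole_chromosome([1.00] * 100))
```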

Expected results categories were RAT, CNV, monosomy, or any combination of these as well as within normal limits, uncertain, and dysploidy (defined as more than three distinct chromosomal regions with major areas of genomic imbalance).

Data analysis

Cohort 1. For cohort 1, each flagged sample had WGS data for all 24 chromosomes displayed graphically and visually assessed by at least one of three trained observers (M.H.-M., D.W.B., and W.K.S.). The 518 samples that were evaluated visually were assigned to the three reviewers by random selection from among each of the four NCDQ groups, A to D. After the individual reviewers independently assessed the samples, the data sets were combined. WebEx meetings or in-person conferences were used to review cases needing further evaluation. If a consensus interpretation could not be achieved, one reviewer (M.H.-M.) had final adjudication authority. The consensus conclusions became the semifinal results. One reviewer (M.H.-M.) collaborated with a bioinformatician (S.L.K.) to further refine the assigned categories. Visual evaluation of the entire flagged data set showed that many samples exhibited a consistent pattern of abnormalities that was also associated with technical noise. Bioinformatics review showed that these cases were all associated with either high or low GC bias in the sample data, making the validity of the count deviations suspect. Thus, these 68 cases were all moved to the uncertain category. It was also noted that many cases with high GC bias showed putative trisomies or monosomies of chromosome 19 and putative monosomies of chromosome 16. On the basis of their abnormal GC bias values, all trisomy 19 and monosomy 16 or 19 calls were eliminated. The data set was then locked down and made available for statistical analysis.

Cohort 2. For each study sample in cohort 2, the WGS data were used to assess the normalized chromosome coverage by a trained analyst (M.D.P. or N.F.). Within this cohort, if samples were flagged as having technical noise, there were three subcategories: (i) confirmed laboratory/technical error, which was defined as the same noise pattern not being observed when the test was repeated using plasma isolated from the same sample; (ii) sample-dependent, defined as the noise not being seen in a new blood sample from the same pregnant woman; or (iii) patient-dependent, in which a repeat sample showed the same chromosome coverage pattern as in the initial sample, but there were no obvious whole/subchromosomal changes called by the WISECONDOR algorithm (33).

Estimation of trisomic fraction and fetal fraction

The trisomic fraction (TF) was estimated from the amplitude (A) of the counts above the two–copy number value using the following equation: TF = 2 × (A − 1)

Amplitudes of less than 1 generated negative numbers, implying a monosomy or partial deletion. In cohort 1, fetal fraction estimation was performed from sequence read counts according to the elastic net approach described by Kim et al. (34). In cohort 2, the VCGS laboratory measured total fetal fraction using the SeqFF algorithm described in the same publication (34).
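A short sketch of this estimate, assuming the relation TF = 2 × (A − 1). That relation is consistent with the statement that a 1.01-fold increase implies a trisomic fraction above 2%, because a fraction f of trisomic cfDNA raises the normalized count to 1 + f/2. The function names are hypothetical, and the TF-to-fetal-fraction ratio is offered only as one way to express the comparison between the two quantities; no interpretive cutoff is implied.

```python
def trisomic_fraction(amplitude):
    """Estimate TF from amplitude A, where A = 1.0 is the two-copy state.

    Assumes TF = 2 * (A - 1); negative values imply a loss
    (monosomy or partial deletion).
    """
    return 2.0 * (amplitude - 1.0)


def tf_to_ff_ratio(amplitude, fetal_fraction):
    """Ratio of the estimated trisomic fraction to the fetal fraction."""
    if fetal_fraction <= 0:
        raise ValueError("fetal fraction must be positive")
    return trisomic_fraction(amplitude) / fetal_fraction


if __name__ == "__main__":
    # Amplitude 1.05 with a fetal fraction of 0.10: TF is close to FF.
    print(round(trisomic_fraction(1.05), 4), round(tf_to_ff_ratio(1.05, 0.10), 2))
    # Amplitude 1.02 with the same fetal fraction: TF is well below FF.
    print(round(trisomic_fraction(1.02), 4), round(tf_to_ff_ratio(1.02, 0.10), 2))
```

A ratio near 1 corresponds to a trisomic fraction similar to the fetal fraction, whereas a much smaller ratio would be compatible with mosaicism.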

Statistical analysis

Summary statistics are presented throughout as means ± SDs. Comparisons of groups used Wilcoxon tests for continuous variables, the χ² test for categorical variables, Fisher’s exact test for categorical variables when assumptions for the χ² test do not hold (count within a category is less than or equal to five), and the Mantel-Haenszel test for matched categorical variables, all with a two-sided α of 0.05 (35, 36). Statistical analyses were performed with R version 3.3.1 (21 June 2016) (37).

Correlation of NCDQ values with clinical outcomes

Clinical outcome data were available as part of routine clinical care in cohort 2. WGS data from cases of NCDQ test failure were evaluated. The presumptive cause of the test failure was reported to the referring clinician based on the shift in the normalized chromosome coverage and analysis of the sequence data using WISECONDOR. Flagged cases were evaluated by follow-up diagnostic testing using one or more of the following biologic samples: chorionic villi, amniocytes, POC, placental biopsies, or maternal peripheral blood. Analysis was performed using standard karyotyping, interphase fluorescence in situ hybridization, CMAs using SNPs, or a combination of these techniques. Not all flagged samples underwent further testing. Permission to collect pregnancy outcome data was obtained as part of the routine informed written consent obtained at the time of maternal blood collection for cfDNA testing.

SUPPLEMENTARY MATERIALS

www.sciencetranslationalmedicine.org/cgi/content/full/9/405/eaan1240/DC1

Fig. S1. Examples of amniocentesis SNP CMA results associated with UPD.

Fig. S2. Comparison of the frequency of single RATs in current study versus CVS data from a published study.

Fig. S3. Examples of DNA sequence displays used in cohort 1 to visually analyze results.

Table S1. Summary of cytogenetic and clinical outcome data from the 60 cases of single RATs in cohort 2.

Table S2. Frequencies of single RATs in the published literature using cfDNA and cytogenetic analysis.

Table S3. Observed frequencies of single RATs in the combined current study cohorts compared with CVS data from a published study.

Table S4. Definition of parameters used to grade technical noise in visual review of data.

Reference (38)

REFERENCES AND NOTES

  1. Acknowledgments: The VCGS laboratory thanks G. Shi and O. Giouzeppos for cfDNA analytical support and C. Love for bioinformatics support. Expert technical assistance was provided by R. Manser, I. Burns, S. Baeffel, A. Tsegay, T. Harrington, and L. Carver. We thank F. Norris and R. Oertel for cytogenomics expertise. The Illumina laboratory thanks K. Jinnet and K. Curnow for assistance with preparation of graphs and tables, and C. Deciu for bioinformatics support. Funding: The cohort 1 study was supported by Illumina Inc. The cohort 2 study was unfunded. Author contributions: M.D.P., M.H.-M., and D.W.B. supervised work, analyzed sequencing data, and wrote the manuscript. N.F. performed experiments and analyzed data. W.K.S. analyzed data. C.B. and S.L.K. provided bioinformatics support. D.V. and S.L.K. provided statistical support. Competing interests: M.D.P. has received honoraria from Illumina Inc. for seminar presentations. D.W.B. received a sponsored research grant from Illumina that was administered by Tufts Medical Center; this ended on 31 October 2016. M.H.-M., C.B., S.L.K., and D.V. are all current or former employees of Illumina. W.K.S. is a consultant to Illumina. M.H.-M. and C.B. are now employees of GRAIL Inc. N.F. declares that she has no competing interests. Data and materials availability: The bioinformatic pipelines used for processing of cohort 1 and 2 raw sequencing data are proprietary to Illumina. However, an interested party may send blood samples to the Illumina Northern California Clinical Services laboratory (Redwood City, CA) from subjects in an IRB-approved study and have those samples processed (for a fee) by the Verifi Plus test. 
Investigators may then receive either the FASTQ files for input to the WISECONDOR algorithm (33) or the Illumina-derived normalized bin count files, which may be used to create visual traces, or they may request that Illumina create visual traces (released as PDF files) of each chromosome using its proprietary software. If a researcher wishes to run the method in his or her own laboratory, Illumina can be contacted for a discussion of a potential licensing agreement and the associated terms and conditions. The formula for the NCDQ “selection” parameter is given in the text.