Research Article | CORONAVIRUS

Using viral load and epidemic dynamics to optimize pooled testing in resource-constrained settings

Science Translational Medicine  14 Apr 2021:
Vol. 13, Issue 589, eabf1568
DOI: 10.1126/scitranslmed.abf1568
  • Fig. 1 Group testing designs for sample identification or prevalence estimation.

    In group testing, multiple samples are pooled, and tests are run on one or more pools. The results of these tests can be used for identification of positive samples (A and B) or to estimate prevalence (C). (A) In the simplest design for sample identification, samples are partitioned into nonoverlapping pools. In stage 1 of testing, a negative result (pool 2) indicates that each sample in that pool was negative, whereas a positive result (pool 1) indicates that at least one sample in the pool was positive. These putatively positive samples are subsequently individually tested in stage 2 to identify positive results. (B) In a combinatorial design, samples are included in multiple pools as shown in stage 1. All samples that were included in negative pools are identified as negative, and the remaining putatively positive samples that were not included in any negative test are tested individually in stage 2. (C) In prevalence estimation, samples are partitioned into pools. The pool measurement depends on the number and viral load of positive samples as well as the dilution factor. The (quantitative) results from each pool can be used to estimate the fraction of samples that would have tested positive, had they been tested individually.
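The simple design in (A) can be sketched as follows. This is a minimal illustration that assumes an idealized test (a pool is positive exactly when it contains a positive sample, with no dilution effect); the function name `two_stage_pooling` and the example batch are hypothetical and do not come from the article.

```python
from typing import List, Set, Tuple


def two_stage_pooling(is_positive: List[bool], pool_size: int) -> Tuple[Set[int], int]:
    """Identify positive samples using nonoverlapping pools (simple design).

    Assumption: an idealized test, so a pool is positive if and only if
    it contains at least one positive sample.
    Returns (indices of identified positives, number of tests used).
    """
    positives: Set[int] = set()
    tests_used = 0
    for start in range(0, len(is_positive), pool_size):
        pool = range(start, min(start + pool_size, len(is_positive)))
        tests_used += 1  # stage 1: one test per pool
        if any(is_positive[i] for i in pool):
            # stage 2: retest every sample of a positive pool individually
            for i in pool:
                tests_used += 1
                if is_positive[i]:
                    positives.add(i)
    return positives, tests_used


# Hypothetical batch: 12 samples, one positive (index 5), pools of 4.
samples = [False] * 12
samples[5] = True
found, n_tests = two_stage_pooling(samples, pool_size=4)
print(found, n_tests)
```

In this hypothetical batch, three stage 1 tests plus four stage 2 tests (7 in total) resolve all 12 samples.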

  • Fig. 2 Viral kinetics model fits, simulated infection dynamics, and population-wide viral load kinetics.

    (A) Schematic of the viral kinetics and infection model. Individuals begin susceptible with no viral load (susceptible), acquire the virus from another infectious individual to become exposed but not yet infectious (exposed), experience an increase in viral load to become infectious and possibly develop symptoms (infected), and, last, either recover after viral waning or die (are removed). This process is simulated for many individuals. Symbols shown are tg as the time from infection to viral load crossing the limit of detection, tp as the time from first detectable viral load to peak, tinc as the time from infection to symptom onset, and tw as the time from symptom onset to loss of detectable viral load. (B) Model fits to time-varying viral loads in swab samples. The black dots show observed log10 RNA copies per swab, black bars show positive but unquantified swab samples, solid lines show posterior median estimates, dark shaded regions show 95% credible intervals (CIs) on model-predicted latent viral loads, and light shaded regions show 95% CI on simulated viral loads with added observation noise. The blue region shows viral loads before symptom onset, and the red region shows time after symptom onset. The horizontal dashed line shows the limit of detection. (C and E) Twenty-five and 500 simulated viral loads over time, respectively. The heatmap shows the viral load in each individual over time. The distribution of viral loads reflects the increase and subsequent decline of prevalence. We simulated from inferred distributions for the viral load parameters, thereby propagating substantial individual-level variability through the simulations. Marginal distributions of observed viral loads during the different epidemic phases are shown in fig. S4. (D) Simulated infection incidence and prevalence of virologically positive individuals from the SEIR model. Incidence was defined as the number of new infections per day divided by the population size. 
    Prevalence was defined as the number of individuals with a viral load of >100 (log10 viral load >2) in the population divided by the population size on a given day.
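The SEIR component in (A) and the incidence definition in (D) can be illustrated with a small deterministic sketch. It assumes a one-day Euler step and placeholder rates (`beta`, `sigma`, and `gamma` are illustrative values, not the article's fitted parameters), and it omits the individual-level viral kinetics that the article layers on top, so it yields incidence only, not viral-load-based prevalence.

```python
from typing import List


def simulate_seir(population: float, beta: float, sigma: float, gamma: float,
                  initial_infected: float, days: int) -> List[float]:
    """Deterministic SEIR model advanced with a one-day Euler step.

    Returns daily incidence, defined as new infections per day divided
    by the population size. All rates are illustrative assumptions.
    """
    s = population - initial_infected
    e, i, r = 0.0, initial_infected, 0.0
    incidence = []
    for _ in range(days):
        new_exposed = beta * s * i / population  # susceptible -> exposed
        new_infectious = sigma * e               # exposed -> infected
        new_removed = gamma * i                  # infected -> removed
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        incidence.append(new_exposed / population)
    return incidence


# Placeholder values: 3-day latent period, 7-day infectious period.
incidence = simulate_seir(population=100_000, beta=0.4, sigma=1 / 3,
                          gamma=1 / 7, initial_infected=10, days=300)
peak_day = max(range(len(incidence)), key=incidence.__getitem__)
print(f"incidence peaks on day {peak_day}")
```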

  • Fig. 3 Estimating prevalence from a small number of pooled tests.

    In prevalence estimation, a total of N individuals are sampled and partitioned into b pools (with n = N/b samples per pool). The true prevalence in the entire population varies over time with epidemic spread. Population prevalences shown here are during the epidemic growth phase. (A) Estimated prevalence (y axis) and true population prevalence (x axis) using 100 independent trials sampling N individuals at each day of the epidemic. Each facet shows a different pooling design (additional pooling designs shown in fig. S1). Dashed gray lines, one divided by the sample size, N. (B) For a given true prevalence (top label, red points and horizontal dashed red line), estimation error is introduced both through binomial sampling of positive samples (prevalence in sample) and inference on the sampled viral loads (estimated prevalence, blue boxes). Each set of three connected dots shows one simulation, with points slightly jittered on the x axis for visibility. Horizontal lines indicate accurate inference. The orange line shows the median across 100 simulations. Each panel shows the results from a single pooling design at the specified true prevalence. Sampling variation is a bigger contributor to error at low prevalence and low sample sizes. When prevalence is less than one divided by N (gray-shaded panels), inference is less accurate due to the high probability of sampling only negative individuals or inclusion of false positives.
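As a simplified point of comparison, prevalence can also be estimated from binary pool results alone. The sketch below uses the classical closed-form estimator for equal-sized pools under an idealized test, not the article's likelihood on quantitative viral loads. It also computes the chance that a sample of N individuals contains no positives, which is the failure mode noted above for prevalence below 1/N. The function names and example numbers are hypothetical.

```python
def pooled_prevalence_mle(positive_pools: int, total_pools: int, pool_size: int) -> float:
    """Binary-outcome MLE: solve 1 - (1 - p)^n = k / b for p.

    Assumption: an idealized test and equal-sized pools; this is a
    stand-in for, not a copy of, the article's viral-load-based method.
    """
    if not 0 <= positive_pools <= total_pools:
        raise ValueError("positive_pools must lie between 0 and total_pools")
    fraction_negative = 1 - positive_pools / total_pools
    return 1 - fraction_negative ** (1 / pool_size)


def prob_all_negative(prevalence: float, n_sampled: int) -> float:
    """Probability that none of n_sampled independent draws is positive."""
    return (1 - prevalence) ** n_sampled


# Hypothetical result: 4 of 20 pools positive, 24 samples per pool.
p_hat = pooled_prevalence_mle(positive_pools=4, total_pools=20, pool_size=24)
print(f"estimated prevalence: {p_hat:.4f}")

# At prevalence 1/N the sample is entirely negative about 37% of the time.
N = 480
print(f"P(no positives in sample): {prob_all_negative(1 / N, N):.2f}")
```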

  • Fig. 4 Group testing for sample identification.

    We evaluated a variety of group testing designs for sample identification (table S1) based on sensitivity (A), efficiency (B), total number of positive samples identified (C), and the fold increase in positive samples identified relative to individual testing (D). (A and B) The average sensitivity (A) (y axis, individual points and spline) and average number of tests needed to identify individual positive samples (B) (y axis) using different pooling designs (individual lines) were measured over days 20 to 110 in our simulated population, with results plotted against prevalence (x axis, log scale). Results show the average of 200,000 trials, with individuals selected at random on each day in each trial. Pooling designs are separated by the number of samples tested on a daily basis (individual panels), the number of pools (color), and the number of pools into which each sample is split (dashed versus solid line). Solid teal line indicates results for individual testing. Note that the average number of test kits consumed increases with prevalence due to a greater number of positive pools requiring confirmatory testing. (C) Every design was evaluated under constraints on the maximum number of samples collected (columns) and average number of reactions that can be run on a daily basis (rows) over days 40 to 90. Text in each box indicates the optimal design for a given set of constraints [number of samples per batch (N), number of pools (b), number of pools into which each sample is split (q), and average number of total samples screened per day]. Color indicates the average number of samples screened on a daily basis using the optimal design. Arrows indicate that the same pooling design is optimal at higher sample collection capacities due to testing constraints. (D) Fold increase in the number of positive samples identified relative to individual testing with the same resource constraints. Error bar shows range among optimal designs.
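The note that test consumption rises with prevalence follows from a short calculation. For the simple two-stage design with b pools of n = N/b samples and an idealized test, the expected number of tests at prevalence p is b + N[1 - (1 - p)^n]. The sketch below evaluates this under those assumptions; it ignores the viral-load-dependent sensitivity that the article simulates, and the function names are hypothetical.

```python
def expected_tests_simple(n_samples: int, n_pools: int, prevalence: float) -> float:
    """Expected tests for two-stage pooling with equal, nonoverlapping pools."""
    pool_size = n_samples / n_pools
    p_pool_positive = 1 - (1 - prevalence) ** pool_size
    # Stage 1 uses one test per pool; stage 2 retests every sample
    # that belongs to a positive pool.
    return n_pools + n_samples * p_pool_positive


def samples_per_test(n_samples: int, n_pools: int, prevalence: float) -> float:
    """Samples screened per test; individual testing scores exactly 1."""
    return n_samples / expected_tests_simple(n_samples, n_pools, prevalence)


# Hypothetical design: N = 96 samples split into b = 6 pools of 16.
for prevalence in (0.001, 0.01, 0.1):
    tests = expected_tests_simple(96, 6, prevalence)
    ratio = samples_per_test(96, 6, prevalence)
    print(f"prevalence={prevalence}: {tests:.1f} tests, {ratio:.2f} samples per test")
```

Under these idealized assumptions the advantage of pooling shrinks as prevalence grows, because more pools turn positive and trigger stage 2 retesting.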

  • Fig. 5 Experimental validation of simple and combinatorial pooling.

    (A) Five pools (columns of matrix), each consisting of 24 nasopharyngeal swab samples (rows of matrix: 23 negative samples per pool and 1 positive, with viral load indicated on right) were tested by viral extraction and RT-qPCR. Pooled results indicated as negative (blue), inconclusive (yellow), or positive (red). (B) Six combinatorial pools (columns) of 48 samples (rows: 47 negative and 1 positive with viral load of 12,300) were tested as above. Pools 1, 2, and 4 tested positive. Arrows indicate two samples that were in pools 1, 2, and 4: sample 32 (negative) and sample 48 (positive). (C) Previously tested de-identified samples were pooled using a combinatorial design with 96 samples, 6 pools, and 2 pools per sample. Thirty positive samples were randomly distributed across 10 batches of the design. Viral RNA extraction and RT-qPCR were performed on each pool, with the results used to identify potentially positive samples. (D) Samples were pooled according to a simple design (48 pools with 48 samples per pool). Twenty-four positive samples were randomly distributed among the pools (establishing 1% prevalence). The pooled test results were used with an MLE procedure to estimate prevalence (0.87%), and bootstrapping was used to estimate a 95% confidence interval (0.52 to 1.37%).
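The decoding rule behind (B) and (C) can be sketched as follows: a sample remains a candidate only if every pool containing it tested positive. The pool assignment built here is hypothetical (each sample cycles through the 3-pool subsets of 6 pools) and is not the design in table S4; the test is assumed to be error free.

```python
from itertools import combinations
from typing import List, Set


def build_design(n_samples: int, n_pools: int, pools_per_sample: int) -> List[Set[int]]:
    """Hypothetical assignment: sample i gets the (i mod C)-th pool subset,
    where C is the number of subsets of the requested size."""
    subsets = list(combinations(range(n_pools), pools_per_sample))
    return [set(subsets[i % len(subsets)]) for i in range(n_samples)]


def decode(design: List[Set[int]], positive_pools: Set[int]) -> List[int]:
    """Return samples that appear in no negative pool (putative positives)."""
    return [i for i, pools in enumerate(design) if pools <= positive_pools]


# 48 samples, 6 pools, 3 pools per sample; sample 47 is the only positive.
design = build_design(n_samples=48, n_pools=6, pools_per_sample=3)
true_positive = 47
positive_pools = design[true_positive]  # idealized: exactly its pools light up
candidates = decode(design, positive_pools)
print(sorted(positive_pools), candidates)
```

With 48 samples and only 20 distinct 3-pool subsets, several samples in this hypothetical design share a pool signature, so a single positive leaves more than one candidate for stage 2 testing, much as samples 32 and 48 both fell in pools 1, 2, and 4 in (B).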

Supplementary Materials

  • stm.sciencemag.org/cgi/content/full/scitranslmed.abf1568/DC1

    Fig. S1. True prevalence against maximum likelihood estimates.

    Fig. S2. Prevalence estimation can depend on training and application period.

    Fig. S3. Sensitivity of sample identification relative to dilution factor and time since peak viral load.

    Fig. S4. Simulated viral loads.

    Fig. S5. Group testing for sample identification during epidemic decline.

    Fig. S6. Effectiveness of optimal testing design under resource constraints at high prevalence.

    Fig. S7. Effectiveness of optimal testing design under resource constraints using sputum data.

    Fig. S8. Evaluation of pooled testing in a sustained, multiwave epidemic.

    Fig. S9. Evaluation of pooled testing for sample identification in the multiwave epidemic shown in fig. S8A.

    Fig. S10. Model fits to swab viral loads.

    Fig. S11. Posterior distributions of estimated parameters fitted to swab and sputum data.

    Fig. S12. Markov chain Monte Carlo trace plots from fitting to swab and sputum data.

    Fig. S13. qPCR calibration curve using standard viral RNA copies.

    Table S1. List of all group test designs for sample identification.

    Table S2. Ct values from qPCR on pooled samples with variable viral load.

    Table S3. Positive sample distribution within validation pools.

    Table S4. Pool design for combinatorial test with 96 samples.

    Table S5. Description of all parameters used in the viral kinetics and transmission models.

    Table S6. RT-qPCR results for pooling validations.

    Data file S1. Ninety-six–sample pooling template (Excel).

  • The PDF file includes:

    • Materials and methods
    • Fig. S1. Population prevalence (left) or prevalence in sample (right) against maximum likelihood prevalence estimates.
    • Fig. S2. Prevalence estimation can depend on training and application period.
    • Fig. S3. Sensitivity of sample identification relative to dilution factor and time since peak viral load.
    • Fig. S4. Simulated viral loads.
    • Fig. S5. Group testing for sample identification during epidemic decline.
    • Fig. S6. Effectiveness of optimal testing design under resource constraints at high prevalence.
    • Fig. S7. Effectiveness of optimal testing design under resource constraints using sputum data.
    • Fig. S8. Evaluation of pooled testing in a sustained, multi-wave epidemic.
    • Fig. S9. Evaluation of pooled testing for sample identification in the multi-wave epidemic shown in fig. S8A.
    • Fig. S10. Model fits to swab viral loads.
    • Fig. S11. Posterior distributions of estimated parameters fitted to swab and sputum data.
    • Fig. S12. Markov chain Monte Carlo trace plots from fitting to swab and sputum data.
    • Fig. S13. qPCR calibration curve using standard viral RNA copies.
    • Table S1. List of all group test designs for sample identification.
    • Table S2. Cycle threshold values from qPCR on pooled samples with variable viral load.
    • Table S3. Positive sample distribution within validation pools.
    • Table S4. Pool design for combinatorial test with 96 samples.
    • Table S5. Description of all parameters used in the viral kinetics and transmission models.
    • Table S6. RT-qPCR results for pooling validations.
    • References (43–45)

    Other Supplementary Material for this manuscript includes the following:

    • Data file S1. Ninety-six–sample pooling template (Excel).
