Research Article CORONAVIRUS

Genomic epidemiology of superspreading events in Austria reveals mutational dynamics and transmission properties of SARS-CoV-2

Science Translational Medicine  09 Dec 2020:
Vol. 12, Issue 573, eabe2555
DOI: 10.1126/scitranslmed.abe2555
  • Fig. 1 Phylogenetic-epidemiological reconstruction of SARS-CoV-2 infection clusters in Austria.

    (A) Number of acquired samples per district in Austria (top) and sampling dates of samples that underwent viral genome sequencing in this study (bottom), plotted in the context of all confirmed cases (red line) in Austria. (B) Connection of Austrian strains to global clades of SARS-CoV-2. Points indicate the regional origin of a strain in the time-resolved phylogenetic tree from 7666 randomly subsampled sequences obtained from GISAID including 345 Austrian strains sequenced in this study (left). Lines from global phylogenetic tree (left) to phylogenetic tree of all Austrian strains obtained in this study (right) indicate the phylogenetic relation and Nextstrain clade assignment of Austrian strains. Color schemes of branches represent Nextstrain clade assignment (left) or phylogenetic clusters of Austrian strains (right). (C) Phylogenetic tree of SARS-CoV-2 strains from Austrian patients with COVID-19 sequenced in this study. Phylogenetic clusters were identified on the basis of characteristic mutation profiles in viral genome sequences of SARS-CoV-2–positive cases in Austria. Cluster names indicate the most abundant location of patients based on epidemiological data. The circular color code indicates the epidemiological cluster assigned to patients based on contact tracing. (D) Mutation profiles of phylogenetic clusters identified in this study. Positions with characteristic mutations compared to reference sequence “Wuhan-Hu-1” (GenBank: MN908947.3) are highlighted in red. Details regarding the affected genes or genomic regions and the respective codon and amino acid change are given below the table. (E) Timeline of the emergence of strains matching the mutation profile of the Tyrol-1 cluster in the global phylogenetic analysis by geographical distribution with additional information from European phylogenetic reconstruction.

  • Fig. 2 Mutational analysis of fixed mutations in SARS-CoV-2 sequences.

    (A) Ratio of nonsynonymous to synonymous mutations in unique mutations identified in Austrian SARS-CoV-2 sequences. (B) Frequencies of synonymous and nonsynonymous mutations per gene or genomic region normalized to length of the respective gene, genomic region, or gene product (nsp1-16). (C) Mutational spectra panel. Mutational profile of interhost mutations. Relative probability of each trinucleotide change for mutations across SARS-CoV-2 sequences in 7666 global sequences obtained from GISAID samples plus 345 Austrian samples (top) or 345 SARS-CoV-2 sequences from Austrian patients with COVID-19 (bottom). (D) Mutation rate distribution along the SARS-CoV-2 genome. Top: A 1-kb window comparison of the observed number of synonymous mutations across the global subsample of 8011 SARS-CoV-2 sequences from GISAID compared with the expected distribution (based on 10⁶ randomizations) according to their trinucleotide context. The gray line indicates the mean number of simulated mutations in the window, the colored background represents the distribution of expected mutations (mean ± SD), and red dots indicate a significant difference (G-test goodness of fit P < 0.01). Odds ratio in log2 scale of the observed compared with the expected number of synonymous mutations across the thirty 1-kb windows of the SARS-CoV-2 genome. Bottom: A zoom-in into the mutation rate across the first (left) and last (right) 1-kb windows. The comparisons were performed using ten 100–base pair windows. Gene annotations for SARS-CoV-2 genome are given below the top panel.

  • Fig. 3 Analysis of low-frequency mutations.

    (A) Number of variants detected across different sample types. (B) Number of variants per variant class. (C) Mutational profile (relative probability of each trinucleotide) of 7050 intrahost mutations across Austrian samples (allele frequencies between 0.02 and 0.05) (top). Mutational profile (relative probability of each trinucleotide) of 1,554,566 intrahost mutations across Austrian samples (allele frequencies <0.01) (bottom). (D) Analysis of the mutation rate (analogous to the interhost mutation rate panel) across the SARS-CoV-2 genome using 2527 intrahost nonprotein affecting mutations with allele frequencies between 0.02 and 0.5. (E) RNA secondary structure prediction of the upstream 300 nucleotides of the SARS-CoV-2 reference genome (NC_045512.2), comprising the complete 5′ untranslated region (UTR) and parts of the nsp1 protein nucleotide sequence. The canonical AUG start codon is located in a stacked region of SL5 (highlighted in gray). Mutational hotspots observed in the Austrian SARS-CoV-2 samples are highlighted: Two fixed mutations at positions 187 and 241, respectively, are marked in red, and low-frequency variants with an abundance between 0.02 and 0.5 in individual samples are shown in orange. Insertion and deletion variants are not shown.

  • Fig. 4 Dynamics of low-frequency and fixed mutations in superspreading clusters.

    (A) Percentage of samples sharing detected (≥0.02) mutations across genomic positions. For each of the 9391 positions harboring an alternative allele, the percentages of samples with high (≥0.50) or low [0.02, 0.50] frequency are reported in dark blue and orange, respectively. (B) Allele frequency of nonsynonymous mutation G > U at position 15,380 across samples in the phylogenetic cluster Tyrol-1. This variant has been observed both as a low-frequency variant and as a fixed mutation, the latter defining a phylogenetic subcluster (dark green). (C) Proportion of European samples with a reference (yellow) or alternative (blue) allele at position 15,380. (D) Allele frequency of synonymous mutation C > U at position 20,457 across samples of the Vienna-1 phylogenetic cluster. This variant is fixed and defines a phylogenetic subcluster (dark orange) as part of the broader Vienna-1 cluster. (E) Schematic representation of the transmission lines between epidemiological cluster A and cluster AL, reconstructed on the basis of results from deep viral sequencing and case interviews. The transmission scheme is overlaid with epidemiological clusters and family-related information.

  • Fig. 5 Impact of transmission bottlenecks and intrahost evolution on SARS-CoV-2 mutational dynamics.

    (A) Schematics of time-related patient interactions across epidemiological clusters A and AL. Each node represents a case, and links between the nodes are epidemiologically confirmed direct transmissions. Samples sequenced from the same individual are reported under the corresponding node. Cases corresponding to the same family are color coded accordingly. Additional families, unrelated to clusters A/AL, and their epidemiological transmission details are also reported. (B) Bottleneck size (number of virions that initiate the infection in an infectee) estimation across infector-infectee pairs based on the transmission network depicted in (A), ordered according to the timeline of cluster A for the respective pairs, and with a cutoff of [0.01, 0.95] for alternative allele frequency. For patients with multiple samples, the earliest sample was considered for bottleneck size inference. Centered dots are maximum likelihood estimates, with 95% confidence intervals. A star (*) for family 4 indicates that the transmission line was inferred as detailed in Materials and Methods. The histogram (yellow bars) of all the bottleneck values is provided on the right side of the graph. (C) Alternative allele frequency (y axis) of mutations across available time points (x axis) for patient 5. Only variants with frequencies ≥0.02 and shared between at least two time points are shown. Two mutations increasing in frequency are color coded. (D) Genetic distance values of mutation frequencies between infector-infectee pairs (A and B) (transmission chains) and intrapatient consecutive time points [(C) and fig. S5D]. Only variants detected in two same-patient samples were considered.
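
The window-level comparison described for Fig. 2D (a G-test goodness of fit and a log2 odds ratio of observed versus expected synonymous mutations) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the two-category setup (inside versus outside a window, giving one degree of freedom), and the example counts are all assumptions made for this sketch.

```python
import math


def g_test_window(observed_in, expected_in, total):
    """G-test goodness of fit for one window against the rest of the genome.

    Assumption of this sketch: two categories are compared (inside vs.
    outside the window), so the statistic has one degree of freedom.
    expected_in must lie strictly between 0 and total.
    """
    observed = [observed_in, total - observed_in]
    expected = [expected_in, total - expected_in]
    # G = 2 * sum(O * ln(O / E)); a zero observed count contributes nothing.
    g = 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    g = max(g, 0.0)  # guard against tiny negative rounding errors
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


def log2_odds_ratio(observed_in, expected_in, total):
    """Log2 odds ratio of observed vs. expected mutations in a window."""
    observed_odds = observed_in / (total - observed_in)
    expected_odds = expected_in / (total - expected_in)
    return math.log2(observed_odds / expected_odds)


# Hypothetical counts: 40 observed vs. 20 expected mutations out of 1000.
g_stat, p_val = g_test_window(40, 20.0, 1000)
enrichment = log2_odds_ratio(40, 20.0, 1000)
print(f"G = {g_stat:.2f}, significant at 0.01: {p_val < 0.01}, log2 OR = {enrichment:.2f}")
```

Here the expected count would stand in for the mean of the randomized simulations in a window; a positive log2 odds ratio indicates more mutations than expected, a negative one fewer.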

Supplementary Materials

  • stm.sciencemag.org/cgi/content/full/scitranslmed.abe2555/DC1

    Materials and Methods

    Fig. S1. Data overview.

    Fig. S2. Technical pipeline and controls.

    Fig. S3. Phylogenetic analysis of SARS-CoV-2 sequences from Austrian patients with COVID-19 in global context.

    Fig. S4. Bottleneck size estimations.

    Fig. S5. Viral intrahost diversity in individual patients.

    Data file S1. Sample and sequencing information of the 572 samples and controls.

    Data file S2. Acknowledgments for SARS-CoV-2 genome sequences derived from GISAID.

    Data file S3. Epidemiological clusters referred to in this study.

    Data file S4. Transmission chain and sample information for cluster A/cluster AL and family-related cases.

    Data file S5. Clinical information of patients with COVID-19 relating to Fig. 5 and fig. S5.

    Reference (57)

  • The PDF file includes:

    • Fig. S1. Data overview.
    • Fig. S2. Technical pipeline and controls.
    • Fig. S3. Phylogenetic analysis of SARS-CoV-2 sequences from Austrian COVID-19 patients in global context.
    • Fig. S4. Bottleneck size estimations.
    • Fig. S5. Viral intra-host diversity in individual patients.

    Other Supplementary Material for this manuscript includes the following:

    • Data File S1. Sample and sequencing information of the 572 samples and the controls.
    • Data File S2. Acknowledgements for SARS-CoV-2 genome sequences derived from GISAID.
    • Data File S3. Epidemiological clusters referred to in this study.
    • Data File S4. Transmission chain and sample information for cluster A/cluster AL and family-related cases.
    • Data File S5. Clinical information of patients with COVID-19 relating to Fig. 5 and fig. S5.
