Supplementary Materials

The PDF file includes:

  • Materials and Methods
  • Fig. S1. Venn diagram showing overlap of observed and expected patient phenotypic features in 95 children diagnosed with 97 genetic diseases.
  • Fig. S2. Precision, recall, and F1 score of phenotypic features identified manually and by CNLP and OMIM.
  • Fig. S3. Flow diagram of the software components of the autonomous system for provisional diagnosis of genetic diseases by rWGS.
  • Table S1. Comparison of the analytic performance of standard and new library preparation and genome sequencing in retrospective samples.
  • Table S2. Comparison of the analytic performance of standard and rapid library preparation and genome sequencing methods in seven matched prospective samples.
  • Table S3. Characteristics of 16 children with genetic diseases used to train CNLP.
  • Table S4. Precision and recall of phenotypic features extracted by CNLP from EHRs in 10 children with genetic diseases.
  • Legends for tables S5 to S17
  • Table S18. Number of SVs shortlisted by MOON and rank of the causal variant in MOON in 11 children with genetic diseases.
  • Table S19. Summary statistics of provisional diagnoses reported for clinical rWGS.
  • Legends for data files S1 to S3

Other Supplementary Material for this manuscript includes the following:

  • Table S5 (Microsoft Excel format). Precision and recall of 26 phenotypic features extracted and proportion of OMIM clinical features detected by CNLP from the EHR of patient 201.
  • Table S6 (Microsoft Excel format). Precision and recall of 96 phenotypic features extracted and proportion of OMIM clinical features detected by CNLP from the EHR of patient 205.
  • Table S7 (Microsoft Excel format). Precision and recall of 95 phenotypic features extracted and proportion of OMIM clinical features detected by CNLP from the EHR of patient 213.
  • Table S8 (Microsoft Excel format). Precision and recall of 158 phenotypic features extracted and proportion of OMIM clinical features detected by CNLP from the EHR of patient 233.
  • Table S9 (Microsoft Excel format). Precision of 85 phenotypic features extracted and proportion of OMIM clinical features detected by CNLP from the EHR of patient 243.
  • Table S10 (Microsoft Excel format). Precision and recall of 90 phenotypic features extracted and proportion of OMIM clinical features detected by CNLP from the EHR of patient 6094.
  • Table S11 (Microsoft Excel format). Precision and recall of 96 phenotypic features extracted and proportion of OMIM clinical features detected by CNLP from the EHR of patient 6098.
  • Table S12 (Microsoft Excel format). Precision and recall of 83 phenotypic features extracted and proportion of OMIM clinical features detected by CNLP from the EHR of patient 6108.
  • Table S13 (Microsoft Excel format). Precision and recall of 44 phenotypic features extracted and proportion of OMIM clinical features detected by CNLP from the EHR of patient 7003.
  • Table S14 (Microsoft Excel format). Precision and recall of 71 phenotypic features extracted and proportion of OMIM clinical features detected by CNLP from the EHR of patient 7004.
  • Table S15 (Microsoft Excel format). The test cohort diagnosed manually by rWGS or rWES and interpreted retrospectively with an autonomous system.
  • Table S16 (Microsoft Excel format). Variant characteristics in rWGS or rWES of the 101 children with 105 genetic diseases.
  • Table S17 (Microsoft Excel format). Number of nucleotide variants shortlisted by MOON and rank of the causal variant in MOON in 84 children with 86 genetic diseases.
  • Data file S1 (Microsoft Excel format). Mapping of HPO terms to SNOMED CT expressions.
  • Data file S2 (Microsoft Excel format). Phenotypic features of 101 children with genetic diseases that were manually extracted by experts from the EHR.
  • Data file S3 (Microsoft Excel format). Phenotypic features of 101 children with genetic diseases that were automatically extracted from the EHR by CNLP.