Supplementary Materials

The PDF file includes:

  • Fig. S1. Flowchart of the simulation study.
  • Fig. S2. Test characteristics of different ICD9 cutoffs for identification of RA cases using reviewed medical record data as the gold standard.
  • Fig. S3. Flowchart of patient selection in setting I.
  • Fig. S4. Flowchart of patient selection in setting II.
  • Fig. S5. Flowchart of patient selection in setting III.
  • Fig. S6. Flowchart of the medical record review procedure.
  • Fig. S7. Density plots of G-probabilities per disease.
  • Fig. S8. Precision-recall curves.
  • Fig. S9. Sensitivity analysis of the performance of G-PROB per disease.
  • Fig. S10. Sensitivity analysis of the influence of individual diseases on G-PROB’s performance.
  • Fig. S11. Sensitivity analysis comparing different shrinkage factors.
  • Fig. S12. Test characteristics for the probabilities at different cutoffs.
  • Table S1. ICD9 and ICD10 codes used to identify patients in setting I (eMERGE).
  • Table S2. Patient characteristics in setting I.
  • Table S3. Patient characteristics in setting II.
  • Table S4. Patient characteristics in setting III.
  • Table S5. Area under the receiver operating characteristic curve per disease.
  • Table S6. McFadden’s R² from multinomial logistic regression testing how much of the variance in the final disease diagnosis was explained by clinical, genetic, or serologic information.
  • Legends for data files S1 and S2.

Other Supplementary Material for this manuscript includes the following:

  • Data file S1 (Microsoft Excel format). ORs of curated risk variants for RA, RAneg, SLE, PsA, SpA, and gout.
  • Data file S2 (Microsoft Excel format). Disease prevalence used in G-PROB per setting.