Fig. 1. Experimental flow. A cohort of patients encompassing bacterial ARI, viral ARI, or noninfectious illness was used to develop classifiers of each condition. This combined ARI classifier was validated using leave-one-out cross-validation and compared to three published classifiers of bacterial versus viral infection. The combined ARI classifier was also externally validated in six publicly available data sets. In one experiment, healthy volunteers were included in the training set to determine their suitability as “no-infection” controls. All subsequent experiments were performed without the use of this healthy subject cohort.
Fig. 2. Evaluation of healthy adults as a no-infection control. Classifiers of bacterial ARI, viral ARI, and no infection as represented by healthy controls were generated and applied using leave-one-out cross-validation. Each patient, represented along the horizontal axis, is assigned three distinct probabilities: bacterial ARI (black triangle), viral ARI (blue circle), and no infection (green square). The group on the right represents subjects with noninfectious illness.
Fig. 3. Leave-one-out cross-validation. Classifiers of bacterial ARI, viral ARI, and no infection as represented by noninfectious illness were generated and applied using leave-one-out cross-validation. Each patient, represented along the horizontal axis, is assigned probabilities of having bacterial ARI (black triangle), viral ARI (blue circle), and noninfectious illness (red square). Patients clinically adjudicated as having bacterial ARI, viral ARI, or noninfectious illness are presented in the top, center, and bottom panels, respectively.
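The leave-one-out procedure behind Figs. 2 and 3 can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the nearest-centroid scorer, the softmax used to turn distances into probabilities, the function names, and the toy two-feature data are all assumptions made for the example. In particular, the legends do not say that the three probabilities sum to 1; the softmax here imposes that as a simplification.

```python
import math

def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def class_probabilities(train, x):
    """Stand-in scorer (assumption): softmax over the negative squared
    distance from x to each class centroid."""
    by_label = {}
    for features, label in train:
        by_label.setdefault(label, []).append(features)
    scores = {}
    for label, rows in by_label.items():
        c = centroid(rows)
        scores[label] = -sum((a - b) ** 2 for a, b in zip(x, c))
    top = max(scores.values())  # subtract the max for numerical stability
    exp = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(exp.values())
    return {k: v / total for k, v in exp.items()}

def leave_one_out(samples):
    """For each sample, train on all the others and score the held-out one."""
    results = []
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        results.append((label, class_probabilities(train, features)))
    return results

# Toy data (hypothetical): two expression-like features per subject.
samples = [
    ([5.0, 1.0], "bacterial"), ([5.2, 0.8], "bacterial"), ([4.8, 1.1], "bacterial"),
    ([1.0, 5.0], "viral"), ([0.9, 5.3], "viral"), ([1.2, 4.9], "viral"),
    ([1.0, 1.0], "noninfectious"), ([0.8, 1.2], "noninfectious"),
    ([1.1, 0.9], "noninfectious"),
]

loo = leave_one_out(samples)
for truth, probs in loo:
    predicted = max(probs, key=probs.get)
    print(truth, predicted, {k: round(v, 3) for k, v in probs.items()})
```

The key property is that the held-out subject never contributes to the model that scores it, so each plotted probability is an out-of-sample estimate.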
Fig. 4. Classifier performance in patients with coinfection defined by the identification of bacterial and viral pathogens. Bacterial and viral ARI classifiers were trained on subjects with bacterial (n = 22) or viral (n = 71) infection (GSE60244). This same data set also included 25 subjects with bacterial/viral coinfection. Bacterial and viral classifier predictions were normalized to the same scale. Each subject receives two probabilities: that of a bacterial ARI host response and a viral ARI host response. A probability score of 0.5 or greater was considered positive. Subjects 1 to 6 have a bacterial host response. Subjects 7 to 9 have both bacterial and viral host responses. Subjects 10 to 23 have a viral host response. Subjects 24 to 25 do not have bacterial or viral host responses.
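The decision rule in the Fig. 4 legend can be written as a short function. Only the 0.5 cutoff comes from the legend; the function name, the label strings, and the example scores are hypothetical.

```python
THRESHOLD = 0.5  # from the Fig. 4 legend: a score of 0.5 or greater is positive

def host_response(p_bacterial, p_viral, threshold=THRESHOLD):
    """Label a subject from two independently thresholded probabilities."""
    bacterial = p_bacterial >= threshold
    viral = p_viral >= threshold
    if bacterial and viral:
        return "bacterial and viral"
    if bacterial:
        return "bacterial"
    if viral:
        return "viral"
    return "neither"

# Hypothetical scores, one (bacterial, viral) pair per subject.
examples = [(0.91, 0.12), (0.64, 0.58), (0.08, 0.77), (0.21, 0.33)]
labels = [host_response(b, v) for b, v in examples]
print(labels)
```

Because the two scores are thresholded separately, a subject can be positive for both responses or for neither, which is how the four groups in the legend arise.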
- Table 1. Demographic information for the enrolled cohort as well as independent data sets used for external validation.
M, male; F, female; B, black; W, white; O, other/unknown. GSE numbers refer to National Center for Biotechnology Information (NCBI) Gene Expression Omnibus data sets. N/A, not available on the basis of published data.
| Cohort | No. of subjects* | Gender (M/F) | Mean age, years (range)† | Ethnicity (B/W/O) | Admitted | No. of samples (viral/bacterial/noninfectious illness) |
| --- | --- | --- | --- | --- | --- | --- |
| Enrolled derivation cohort | 317 | 122/151 | 45 (6–88) | 135/116/22 | 61% | 115/70/88 |
| Viral | 115 | 44/71 | 45 (6–88) | 40/59/16 | 21% | |
| Bacterial | 70 | 35/35 | 49 (14–88) | 46/22/2 | 94% | |
| Noninfectious illness‡ | 88 | 43/45 | 49 (14–88) | 49/35/4 | 88% | |
| Healthy | 44 | 23/21 | 30 (20–59) | 8/27/6§ | 0% | |
| Validation cohorts | | | | | | |
| Ramilo et al. (GSE6269) | 113¶ | 62/51 | 4 (0.04–36) | 22/37/54 | 100% | 28/85/0 |
| Hu et al. (GSE40396) | 43 | 25/18 | 14 (2–32) | 16/25/2 | N/A | 35/8/0 |
| Herberg et al. (GSE42026) | 59 | N/A | Pediatric | N/A | 100% | 18/41/0 |
| Parnell et al. (GSE20346)‖ | 10 | 4/6 | Adult | N/A | 100% | 19/26/0 |
| Bloom et al. (GSE42834) | 103 | N/A | Adult | N/A | N/A | 0/19/84 |

*Only subjects with viral, bacterial, or noninfectious illness were included (when available) from each validation cohort.
†When mean age was unavailable or could not be calculated, data are presented as either adult or pediatric.
‡Noninfectious illness was defined by the presence of systemic inflammatory response syndrome (SIRS) criteria, which include at least two of the following four features: temperature <36° or >38°C; heart rate >90 beats per minute; respiratory rate >20 breaths per minute or arterial partial pressure of CO₂ <32 mmHg; and white blood cell count <4000 or >12,000 cells/mm³ or >10% band form neutrophils.
§Three subjects did not report ethnicity.
¶In the case of GSE6269, subjects with Illumina Sentrix Hu6 expression data were excluded because array results could not be readily compared. Eight viral and 15 bacterial infections comprised the 24 excluded cases.
‖Subjects in the GSE20346 data set include serial sampling. The number of samples exceeds the number of subjects because serial samples were treated as independent tests in the validation experiments.
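The SIRS definition in the Table 1 footnote (at least two of four features) translates directly into a counting rule. The sketch below takes its thresholds from that footnote; the function and parameter names and the example patients are hypothetical, and it assumes every measurement is available.

```python
def sirs_criteria_met(temp_c, heart_rate, resp_rate, paco2_mmhg,
                      wbc_per_mm3, band_pct):
    """Count how many of the four SIRS features are present."""
    checks = [
        temp_c < 36.0 or temp_c > 38.0,                  # temperature
        heart_rate > 90,                                 # beats per minute
        resp_rate > 20 or paco2_mmhg < 32,               # breaths/min or PaCO2
        wbc_per_mm3 < 4000 or wbc_per_mm3 > 12000 or band_pct > 10,
    ]
    return sum(checks)

def meets_sirs(**measurements):
    """SIRS is present when at least two of the four features are met."""
    return sirs_criteria_met(**measurements) >= 2

# Hypothetical patients.
febrile_tachycardic = dict(temp_c=38.6, heart_rate=104, resp_rate=18,
                           paco2_mmhg=40, wbc_per_mm3=9000, band_pct=2)
fever_only = dict(temp_c=38.6, heart_rate=80, resp_rate=16,
                  paco2_mmhg=40, wbc_per_mm3=9000, band_pct=2)

print(meets_sirs(**febrile_tachycardic))
print(meets_sirs(**fever_only))
```

Each feature counts once even when both of its alternatives are satisfied, for example a high respiratory rate together with a low PaCO₂.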
- Table 2. Performance characteristics of the derived ARI classifier.
A combination of the bacterial ARI, viral ARI, and noninfectious illness classifiers was validated using leave-one-out cross-validation in a population of bacterial ARI (n = 70), viral ARI (n = 115), or noninfectious illness (n = 88). Three published bacterial versus viral classifiers were identified and applied to this same population as comparators. Data are presented as number (%).
| Classifier | Classifier-predicted assignment | Clinical assignment: Bacterial | Clinical assignment: Viral | Clinical assignment: Noninfectious illness |
| --- | --- | --- | --- | --- |
| Ramilo et al. | Bacterial | 54 (77.1) | 4 (3.5) | 12 (13.6) |
| | Viral | 6 (8.6) | 101 (87.8) | 12 (13.6) |
| | Noninfectious illness | 12 (14.3) | 12 (8.7) | 64 (72.7) |
| Hu et al. | Bacterial | 53 (75.7) | 4 (3.5) | 9 (10.2) |
| | Viral | 9 (12.9) | 104 (90.4) | 9 (10.2) |
| | Noninfectious illness | 8 (11.4) | 7 (6.1) | 70 (79.5) |
| Parnell et al. | Bacterial | 51 (72.8) | 8 (7.0) | 11 (12.5) |
| | Viral | 13 (18.6) | 94 (81.7) | 10 (11.4) |
| | Noninfectious illness | 6 (8.6) | 13 (11.3) | 67 (76.1) |
| Derived ARI classifier | Bacterial | 58 (82.8) | 4 (3.4) | 8 (9.0) |
| | Viral | 9 (12.8) | 104 (90.4) | 4 (4.5) |
| | Noninfectious illness | 3 (4.2) | 7 (6.0) | 76 (86.3) |

- Table 3. External validation of the ARI classifier (combined bacterial ARI, viral ARI, and noninfectious classifiers).
Five Gene Expression Omnibus data sets were identified on the basis of the inclusion of at least two of the relevant clinical groups: viral ARI, bacterial ARI, and noninfectious illness (SIRS).
| Data set | Classifier-predicted assignment | Clinical assignment: Bacterial | Clinical assignment: Viral | Clinical assignment: SIRS | AUC |
| --- | --- | --- | --- | --- | --- |
| GSE6269: Hospitalized children with influenza A or bacterial infection | Bacterial | 84 | 1 | | 0.95 |
| | Viral | 2 | 26 | | |
| GSE42026: Hospitalized children with influenza H1N1/09, RSV (respiratory syncytial virus), or bacterial infection | Bacterial | 15 | 3 | | 0.90 |
| | Viral | 6 | 35 | | |
| GSE40396: Children with adenovirus, HHV-6 (human herpesvirus 6), enterovirus, or bacterial infection | Bacterial | 7 | 1 | | 0.93 |
| | Viral | 3 | 32 | | |
| GSE20346: Hospitalized adults with bacterial pneumonia or influenza A | Bacterial | 26 | 0 | | 0.99 |
| | Viral | 1 | 18 | | |
| GSE42834: Adults with bacterial pneumonia, lung cancer, or sarcoidosis | Bacterial | 18 | | 3 | 0.99 |
| | SIRS | 1 | | 81 | |
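The confusion matrices in Tables 2 and 3 can be summarized by overall agreement and by per-class agreement. The sketch below uses the counts reported for the derived ARI classifier (Table 2) and for GSE6269 (Table 3). It assumes rows are classifier-predicted assignments and columns are clinical assignments, as the Table 2 header suggests; the function names are hypothetical.

```python
def overall_agreement(matrix):
    """Fraction of subjects on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_agreement(matrix):
    """Diagonal count divided by its column total, one value per clinical
    class (assumes columns hold the clinical assignment)."""
    size = len(matrix)
    column_totals = [sum(matrix[r][c] for r in range(size)) for c in range(size)]
    return [matrix[c][c] / column_totals[c] for c in range(size)]

# Derived ARI classifier, Table 2 (order: bacterial, viral, noninfectious).
derived = [
    [58, 4, 8],
    [9, 104, 4],
    [3, 7, 76],
]

# GSE6269, Table 3 (order: bacterial, viral).
gse6269 = [
    [84, 1],
    [2, 26],
]

print(round(overall_agreement(derived), 3))
print([round(x, 3) for x in per_class_agreement(derived)])
print(round(overall_agreement(gse6269), 3))
```

Overall agreement does not depend on which axis holds the clinical assignment, whereas the per-class values do, so the orientation assumption matters only for the second function.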
Supplementary Materials
www.sciencetranslationalmedicine.org/cgi/content/full/8/322/322ra11/DC1
Fig. S1. Positive and negative predictive values for (A) bacterial and (B) viral ARI classification as a function of prevalence.
Fig. S2. Validation of bacterial and viral ARI classifiers in GSE6269.
Fig. S3. Validation of bacterial and viral ARI classifiers in GSE42026.
Fig. S4. Validation of bacterial and viral ARI classifiers in GSE40396.
Fig. S5. Validation of bacterial and viral ARI classifiers in GSE20346.
Fig. S6. Validation of bacterial ARI and noninfectious illness classifiers in GSE42834.
Fig. S7. Treatment effect on bacterial ARI classification.
Fig. S8. Venn diagram representing overlap in the bacterial ARI, viral ARI, and noninfectious illness classifiers.
Table S1. Etiological causes of illness for subjects with viral ARI, bacterial ARI, and noninfectious illness.
Table S2. Summary of clinical features for the derivation cohort.
Table S3. Probes selected for the bacterial ARI, viral ARI, and noninfectious illness classifiers.
Table S4. Subjects with discordant predictions compared to clinical assignments.
Table S5. Genes in the bacterial ARI, viral ARI, and noninfectious illness classifiers grouped by biologic process.
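Fig. S1 relates predictive values to prevalence. That relationship follows from Bayes' rule and can be sketched as below. The sensitivity and specificity used here are computed from the derived ARI classifier counts in Table 2 for the bacterial class (58 of 70 bacterial subjects called bacterial; 12 of 203 nonbacterial subjects called bacterial). The function names and the prevalence grid are assumptions, and the output is not a reproduction of the supplementary figure.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value at a given prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def npv(sensitivity, specificity, prevalence):
    """Negative predictive value at a given prevalence (Bayes' rule)."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Bacterial class of the derived ARI classifier, from the Table 2 counts.
sens = 58 / 70
spec = (203 - 12) / 203

# Hypothetical prevalence grid: 10% to 90% in steps of 10%.
for pct in range(10, 100, 10):
    prev = pct / 100
    print(pct, round(ppv(sens, spec, prev), 3), round(npv(sens, spec, prev), 3))
```

At the cohort's own bacterial prevalence (70 of 273), the PPV reduces to 58/(58 + 12), which is a convenient check that the formula and the table counts agree.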
Additional Files
- Supplementary Material for:
Host gene expression classifiers diagnose acute respiratory illness etiology
Ephraim L. Tsalik, Ricardo Henao, Marshall Nichols, Thomas Burke, Emily R. Ko, Micah T. McClain, Lori L. Hudson, Anna Mazur, Debra H. Freeman, Tim Veldman, Raymond J. Langley, Eugenia B. Quackenbush, Seth W. Glickman, Charles B. Cairns, Anja K. Jaehne, Emanuel P. Rivers, Ronny M. Otero, Aimee K. Zaas, Stephen F. Kingsmore, Joseph Lucas, Vance G. Fowler Jr., Lawrence Carin, Geoffrey S. Ginsburg,* Christopher W. Woods*
*Corresponding author. E-mail: geoffrey.ginsburg@duke.edu (G.S.G.); chris.woods@duke.edu (C.W.W.)
Published 20 January 2016, Sci. Transl. Med. 8, 322ra11 (2016)
DOI: 10.1126/scitranslmed.aad6873