Research Article | Infectious Disease

Host gene expression classifiers diagnose acute respiratory illness etiology


Science Translational Medicine  20 Jan 2016:
Vol. 8, Issue 322, pp. 322ra11
DOI: 10.1126/scitranslmed.aad6873
  • Fig. 1. Experimental flow.

    A cohort of patients encompassing bacterial ARI, viral ARI, or noninfectious illness was used to develop classifiers of each condition. This combined ARI classifier was validated using leave-one-out cross-validation and compared to three published classifiers of bacterial versus viral infection. The combined ARI classifier was also externally validated in six publicly available data sets. In one experiment, healthy volunteers were included in the training set to determine their suitability as “no-infection” controls. All subsequent experiments were performed without the use of this healthy subject cohort.

  • Fig. 2. Evaluation of healthy adults as a no-infection control.

    Classifiers of bacterial ARI, viral ARI, and no infection as represented by healthy controls were generated and applied using leave-one-out cross-validation. Each patient, represented along the horizontal axis, is assigned three distinct probabilities: bacterial ARI (black triangle), viral ARI (blue circle), and no infection (green square). The group on the right represents subjects with noninfectious illness.

  • Fig. 3. Leave-one-out cross-validation.

    Classifiers of bacterial ARI, viral ARI, and no infection as represented by noninfectious illness were generated and applied using leave-one-out cross-validation. Each patient, represented along the horizontal axis, is assigned probabilities of having bacterial ARI (black triangle), viral ARI (blue circle), and noninfectious illness (red square). Patients clinically adjudicated as having bacterial ARI, viral ARI, or noninfectious illness are presented in the top, center, and bottom panels, respectively.
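The leave-one-out procedure described in this caption can be sketched as follows. This is a minimal illustration, not the authors' method: the caption does not say what model underlies each classifier, so a toy centroid-distance scorer stands in for it, and the two-feature data set is synthetic and invented purely for the example. The names `one_vs_rest_probability` and `leave_one_out` are hypothetical.

```python
import math

CLASSES = ("bacterial", "viral", "noninfectious")


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_vs_rest_probability(train_x, train_y, target, sample):
    """Toy stand-in for one binary classifier (assumption, not the paper's model).

    The score is a sigmoid of (distance to the "rest" centroid minus distance
    to the target-class centroid), so it lies strictly between 0 and 1.
    """
    pos = [x for x, y in zip(train_x, train_y) if y == target]
    neg = [x for x, y in zip(train_x, train_y) if y != target]
    margin = distance(sample, centroid(neg)) - distance(sample, centroid(pos))
    return 1.0 / (1.0 + math.exp(-margin))


def leave_one_out(xs, ys, classes=CLASSES):
    """Hold out each subject in turn, fit on the rest, and score the held-out subject."""
    results = []
    for i, sample in enumerate(xs):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        results.append(
            {c: one_vs_rest_probability(train_x, train_y, c, sample) for c in classes}
        )
    return results


# Synthetic two-feature "expression" values, three subjects per group (made up).
X = [[5.0, 1.0], [4.5, 1.2], [5.2, 0.8],
     [1.0, 5.0], [1.2, 4.6], [0.8, 5.3],
     [1.0, 1.0], [1.3, 0.9], [0.9, 1.2]]
Y = ["bacterial"] * 3 + ["viral"] * 3 + ["noninfectious"] * 3

loo_probabilities = leave_one_out(X, Y)
predicted = [max(p, key=p.get) for p in loo_probabilities]
```

Each held-out subject ends up with three separate scores, one per condition, which mirrors the three markers plotted per patient in the figure. Because the three scorers are independent, the scores need not sum to 1.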

  • Fig. 4. Classifier performance in patients with coinfection defined by the identification of bacterial and viral pathogens.

    Bacterial and viral ARI classifiers were trained on subjects with bacterial (n = 22) or viral (n = 71) infection (GSE60244). This same data set also included 25 subjects with bacterial/viral coinfection. Bacterial and viral classifier predictions were normalized to the same scale. Each subject receives two probabilities: that of a bacterial ARI host response and a viral ARI host response. A probability score of 0.5 or greater was considered positive. Subjects 1 to 6 have a bacterial host response. Subjects 7 to 9 have both bacterial and viral host responses. Subjects 10 to 23 have a viral host response. Subjects 24 to 25 do not have bacterial or viral host responses.
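The 0.5 cutoff in this caption yields four possible labels per subject. A minimal sketch of that rule, assuming the two probabilities are already on the same scale; the function name `host_response` and the example values are hypothetical and are not taken from the study data:

```python
def host_response(p_bacterial, p_viral, threshold=0.5):
    """Label a subject from two classifier probabilities.

    A probability at or above the threshold counts as a positive host response.
    """
    bacterial = p_bacterial >= threshold
    viral = p_viral >= threshold
    if bacterial and viral:
        return "bacterial and viral"
    if bacterial:
        return "bacterial"
    if viral:
        return "viral"
    return "neither"


# Illustrative inputs only (not values from GSE60244).
examples = [(0.91, 0.12), (0.64, 0.71), (0.08, 0.88), (0.30, 0.41)]
labels = [host_response(pb, pv) for pb, pv in examples]
```

The four labels correspond to the four subject groups listed in the caption: bacterial only, both, viral only, and neither.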

  • Table 1. Demographic information for the enrolled cohort as well as independent data sets used for external validation.

    M, male; F, female; B, black; W, white; O, other/unknown. GSE numbers refer to National Center for Biotechnology Information (NCBI) Gene Expression Omnibus data sets. N/A, not available on the basis of published data.

    Cohort                          No. of      Gender    Mean age, years   Ethnicity    Admitted   No. of samples
                                    subjects*   (M/F)     (range)†          (B/W/O)                 (viral/bacterial/noninfectious illness)
    Enrolled derivation cohort      317         122/151   45 (6–88)         135/116/22   61%        115/70/88
      Viral                         115         44/71     45 (6–88)         40/59/16     21%
      Bacterial                     70          35/35     49 (14–88)        46/22/2      94%
      Noninfectious illness‡        88          43/45     49 (14–88)        49/35/4      88%
      Healthy                       44          23/21     30 (20–59)        8/27/6§      0%
    Validation cohorts
      Ramilo et al. (GSE6269)¶      113         62/51     4 (0.04–36)       22/37/54     100%       28/85/0
      Hu et al. (GSE40396)          43          25/18     14 (2–32)         16/25/2      N/A        35/8/0
      Herberg et al. (GSE42026)     59          N/A       Pediatric         N/A          100%       18/41/0
      Parnell et al. (GSE20346)||   10          4/6       Adult             N/A          100%       19/26/0
      Bloom et al. (GSE42834)       103         N/A       Adult             N/A          N/A        0/19/84

    *Only subjects with viral, bacterial, or noninfectious illness were included (when available) from each validation cohort.

    †When mean age was unavailable or could not be calculated, data are presented as either adult or pediatric.

    ‡Noninfectious illness was defined by the presence of systemic inflammatory response syndrome (SIRS) criteria, which include at least two of the following four features: temperature <36° or >38°C; heart rate >90 beats per minute; respiratory rate >20 breaths per minute or arterial partial pressure of CO2 <32 mmHg; and white blood cell count <4000 or >12,000 cells/mm3 or >10% band form neutrophils.

    §Three subjects did not report ethnicity.

    ¶In the case of GSE6269, subjects with Illumina Sentrix Hu6 expression data were excluded because array results could not be readily compared. Eight viral and 15 bacterial infections comprised the 24 excluded cases.

    ||Subjects in the GSE20346 data set include serial sampling. The number of samples exceeds the number of subjects because serial samples were treated as independent tests in the validation experiments.
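The SIRS definition in the ‡ footnote above (at least two of four features) can be written as a small check. This is a sketch under stated assumptions: the function and parameter names are hypothetical, arterial CO2 and band percentage are treated as optional measurements, and the example values are invented.

```python
def sirs_feature_count(temp_c, heart_rate, resp_rate, wbc_per_mm3,
                       paco2_mmhg=None, band_percent=None):
    """Count how many of the four SIRS features from the footnote are present."""
    features = [
        # 1. Temperature <36 or >38 degrees C
        temp_c < 36.0 or temp_c > 38.0,
        # 2. Heart rate >90 beats per minute
        heart_rate > 90,
        # 3. Respiratory rate >20 breaths per minute or arterial pCO2 <32 mmHg
        resp_rate > 20 or (paco2_mmhg is not None and paco2_mmhg < 32),
        # 4. White blood cell count <4000 or >12,000 cells/mm3, or >10% band forms
        wbc_per_mm3 < 4000 or wbc_per_mm3 > 12000
        or (band_percent is not None and band_percent > 10),
    ]
    return sum(features)


def meets_sirs(**measurements):
    """True when at least two of the four features are present."""
    return sirs_feature_count(**measurements) >= 2


# Invented example: fever plus tachycardia gives two features.
febrile_tachycardic = meets_sirs(temp_c=38.6, heart_rate=104,
                                 resp_rate=18, wbc_per_mm3=9000)
```

The respiratory and white-cell features each combine alternatives with "or", so a subject with both a raised respiratory rate and a low pCO2 still contributes only one feature to the count.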

    • Table 2. Performance characteristics of the derived ARI classifier.

      A combination of the bacterial ARI, viral ARI, and noninfectious illness classifiers was validated using leave-one-out cross-validation in a population of bacterial ARI (n = 70), viral ARI (n = 115), or noninfectious illness (n = 88). Three published bacterial versus viral classifiers were identified and applied to this same population as comparators. Data are presented as number (%).

                                                         Clinical assignment
      Classifier               Classifier-predicted      Bacterial    Viral         Noninfectious
                               assignment                                           illness
      Ramilo et al.            Bacterial                 54 (77.1)    4 (3.5)       12 (13.6)
                               Viral                     6 (8.6)      101 (87.8)    12 (13.6)
                               Noninfectious illness     10 (14.3)    10 (8.7)      64 (72.7)
      Hu et al.                Bacterial                 53 (75.7)    4 (3.5)       9 (10.2)
                               Viral                     9 (12.9)     104 (90.4)    9 (10.2)
                               Noninfectious illness     8 (11.4)     7 (6.1)       70 (79.5)
      Parnell et al.           Bacterial                 51 (72.8)    8 (7.0)       11 (12.5)
                               Viral                     13 (18.6)    94 (81.7)     10 (11.4)
                               Noninfectious illness     6 (8.6)      13 (11.3)     67 (76.1)
      Derived ARI classifier   Bacterial                 58 (82.8)    4 (3.4)       8 (9.0)
                               Viral                     9 (12.8)     104 (90.4)    4 (4.5)
                               Noninfectious illness     3 (4.2)      7 (6.0)       76 (86.3)
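The percentages in Table 2 are column-wise: each count is divided by the number of subjects with that clinical assignment. The sketch below reproduces that arithmetic for the derived ARI classifier using the counts from the table. The helper names are hypothetical, and overall accuracy and specificity are derived quantities that the table itself does not report.

```python
CLASSES = ("bacterial", "viral", "noninfectious")

# Derived ARI classifier counts from Table 2.
# Outer key: classifier-predicted assignment; inner key: clinical assignment.
DERIVED = {
    "bacterial":     {"bacterial": 58, "viral": 4,   "noninfectious": 8},
    "viral":         {"bacterial": 9,  "viral": 104, "noninfectious": 4},
    "noninfectious": {"bacterial": 3,  "viral": 7,   "noninfectious": 76},
}


def clinical_totals(matrix):
    """Number of subjects per clinical assignment (column sums)."""
    return {c: sum(matrix[p][c] for p in CLASSES) for c in CLASSES}


def agreement_by_class(matrix):
    """Fraction of each clinical group that the classifier labeled correctly."""
    totals = clinical_totals(matrix)
    return {c: matrix[c][c] / totals[c] for c in CLASSES}


def overall_accuracy(matrix):
    """Fraction of all subjects whose predicted and clinical assignments agree."""
    correct = sum(matrix[c][c] for c in CLASSES)
    return correct / sum(clinical_totals(matrix).values())


def specificity(matrix, target):
    """Fraction of subjects without the target condition not predicted as the target."""
    totals = clinical_totals(matrix)
    negatives = sum(totals[c] for c in CLASSES if c != target)
    false_positives = sum(matrix[target][c] for c in CLASSES if c != target)
    return (negatives - false_positives) / negatives


totals = clinical_totals(DERIVED)
agreement = agreement_by_class(DERIVED)
accuracy = overall_accuracy(DERIVED)
bacterial_specificity = specificity(DERIVED, "bacterial")
```

The column sums recover the group sizes stated in the caption (70, 115, and 88), which is a quick consistency check on any row of the table.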
    • Table 3. External validation of the ARI classifier (combined bacterial ARI, viral ARI, and noninfectious classifiers).

      Five Gene Expression Omnibus data sets were identified on the basis of the inclusion of at least two of the relevant clinical groups: viral ARI, bacterial ARI, and noninfectious illness (SIRS).

      Classifier-predicted      Clinical assignment
      assignment                Bacterial   Viral   SIRS   AUC

      GSE6269: Hospitalized children with influenza A or bacterial infection
        Bacterial               84          1              0.95
        Viral                   2           26

      GSE42026: Hospitalized children with influenza H1N1/09, RSV (respiratory syncytial virus), or bacterial infection
        Bacterial               15          3              0.90
        Viral                   6           35

      GSE40396: Children with adenovirus, HHV-6 (human herpesvirus 6), enterovirus, or bacterial infection
        Bacterial               7           1              0.93
        Viral                   3           32

      GSE20346: Hospitalized adults with bacterial pneumonia or influenza A
        Bacterial               26          0              0.99
        Viral                   1           18

      GSE42834: Adults with bacterial pneumonia, lung cancer, or sarcoidosis
        Bacterial               18                  3      0.99
        SIRS                    1                   81

    Supplementary Materials

    • www.sciencetranslationalmedicine.org/cgi/content/full/8/322/322ra11/DC1

      Fig. S1. Positive and negative predictive values for (A) bacterial and (B) viral ARI classification as a function of prevalence.

      Fig. S2. Validation of bacterial and viral ARI classifiers in GSE6269.

      Fig. S3. Validation of bacterial and viral ARI classifiers in GSE42026.

      Fig. S4. Validation of bacterial and viral ARI classifiers in GSE40396.

      Fig. S5. Validation of bacterial and viral ARI classifiers in GSE20346.

      Fig. S6. Validation of bacterial ARI and noninfectious illness classifiers in GSE42834.

      Fig. S7. Treatment effect on bacterial ARI classification.

      Fig. S8. Venn diagram representing overlap in the bacterial ARI, viral ARI, and noninfectious illness classifiers.

      Table S1. Etiological causes of illness for subjects with viral ARI, bacterial ARI, and noninfectious illness.

      Table S2. Summary of clinical features for the derivation cohort.

      Table S3. Probes selected for the bacterial ARI, viral ARI, and noninfectious illness classifiers.

      Table S4. Subjects with discordant predictions compared to clinical assignments.

      Table S5. Genes in the bacterial ARI, viral ARI, and noninfectious illness classifiers grouped by biologic process.

    • Supplementary Material for:

      Host gene expression classifiers diagnose acute respiratory illness etiology

      Ephraim L. Tsalik, Ricardo Henao, Marshall Nichols, Thomas Burke, Emily R. Ko, Micah T. McClain, Lori L. Hudson, Anna Mazur, Debra H. Freeman, Tim Veldman, Raymond J. Langley, Eugenia B. Quackenbush, Seth W. Glickman, Charles B. Cairns, Anja K. Jaehne, Emanuel P. Rivers, Ronny M. Otero, Aimee K. Zaas, Stephen F. Kingsmore, Joseph Lucas, Vance G. Fowler Jr., Lawrence Carin, Geoffrey S. Ginsburg,* Christopher W. Woods*

      *Corresponding author. E-mail: geoffrey.ginsburg{at}duke.edu (G.S.G.); chris.woods{at}duke.edu (C.W.W.)

      Published 20 January 2016, Sci. Transl. Med. 8, 322ra11 (2016)
      DOI: 10.1126/scitranslmed.aad6873

