Research ArticleMalaria

Integrated pathogen load and dual transcriptome analysis of systemic host-pathogen interactions in severe malaria

See allHide authors and affiliations

Science Translational Medicine  27 Jun 2018:
Vol. 10, Issue 447, eaar3619
DOI: 10.1126/scitranslmed.aar3619

Uncomplicating malaria

Plasmodium falciparum infections can progress to severe malaria, resulting in organ failure and life-threatening hematological or metabolic abnormalities. This process is sometimes thought to result from an aberrant host immune response. Lee et al. sequenced patient and parasite transcriptomes in 46 P. falciparum–infected Gambian children to better understand malarial host-pathogen interactions. They found evidence that the immune response in severe compared to uncomplicated malaria was not necessarily dysregulated but instead an expected response to high pathogen load and also implicated neutrophils in this response. This transcriptomic resource may prove useful in future studies of severe malaria and the interaction between malaria host and parasite.


The pathogenesis of infectious diseases depends on the interaction of host and pathogen. In Plasmodium falciparum malaria, host and parasite processes can be assessed by dual RNA sequencing of blood from infected patients. We performed dual transcriptome analyses on samples from 46 malaria-infected Gambian children to reveal mechanisms driving the systemic pathophysiology of severe malaria. Integrating these transcriptomic data with estimates of parasite load and detailed clinical information allowed consideration of potentially confounding effects due to differing leukocyte proportions in blood, parasite developmental stage, and whole-body pathogen load. We report hundreds of human and parasite genes differentially expressed between severe and uncomplicated malaria, with distinct profiles associated with coma, hyperlactatemia, and thrombocytopenia. High expression of neutrophil granule–related genes was consistently associated with all severe malaria phenotypes. We observed severity-associated variation in the expression of parasite genes, which determine cytoadhesion to vascular endothelium, rigidity of infected erythrocytes, and parasite growth rate. Up to 99% of human differential gene expression in severe malaria was driven by differences in parasite load, whereas parasite gene expression showed little association with parasite load. Coexpression analyses revealed interactions between human and P. falciparum, with prominent co-regulation of translation genes in severe malaria between host and parasite. Multivariate analyses suggested that increased expression of granulopoiesis and interferon-γ–related genes, together with inadequate suppression of type 1 interferon signaling, best explained severity of infection. These findings provide a framework for understanding the contributions of host and parasite to the pathogenesis of severe malaria and identifying new treatments.


Most studies of infectious disease pathogenesis focus on either host or pathogen, despite the fact that outcome is determined by their interaction. Dual RNA sequencing has been developed as a method for transcriptomic assessment of such interactions (1, 2), although it has not been widely applied to study systemic infection in humans. In malaria, the pathogenic stage of the parasite is restricted to blood where important interactions between parasites and host leukocytes can be assessed by dual RNA sequencing (3). The blood is also the conduit for systemic responses to infection, and gene expression in blood will reflect the inflammatory and metabolic milieu to which leukocytes and parasites are exposed.

Plasmodium falciparum malaria is one of the most important infectious diseases affecting humans (3). Most malaria deaths occur in children, in whom three major syndromes are associated with increased risk of death and distinguish severe malaria from uncomplicated malaria: (i) cerebral malaria (manifesting as coma), (ii) hyperlactatemia/acidosis (often manifesting as deep breathing), and (iii) severe anemia (36). These are usually accompanied by the laboratory finding of low platelet count (thrombocytopenia) (7). Severe malaria syndromes can occur in isolation or in overlapping combinations (4), and mortality is highest when cerebral malaria and hyperlactatemia/acidosis coexist (4). Severe malaria is most likely when there is a high parasite load (811) and numerous accompanying pathophysiological derangements have been described (4, 5, 12), broadly arising from inflammation, vascular endothelial dysfunction, and parasite sequestration (accumulation of parasitized erythrocytes in small blood vessels that obstruct blood flow) (4, 13). These mechanisms interact with one another, so defining their individual contributions to specific features of severe malaria is challenging (13, 14).

The host immune response is a major determinant of outcome in rodent models of severe malaria (15, 16), and it has long been supposed that an excessive host response may also contribute to some forms of human severe malaria (17, 18). A similar concept exists to explain severity in other infections such as bacterial sepsis, Ebola, and respiratory syncytial virus (1921); however, direct evidence is often lacking, and the confounding effect of pathogen load on the magnitude of the host response is rarely quantified. Controlled human infection models have provided insights into the relationship between pathogen load and early immune responses to infection (22) but cannot be extended to investigate severe disease. To better understand the pathogenesis of severe infection, a systemic, integrated view of host-pathogen interaction accounting for variation in pathogen load is required.

The feasibility of host and parasite dual RNA sequencing in malaria has been demonstrated in individuals with uncomplicated malaria (23). Here, we extend the application of dual RNA sequencing to infectious disease pathogenesis by integrating gene expression analysis with detailed clinical and laboratory data that characterize the systemic pathophysiology of severe malaria. We further refine the analysis by accounting for three major confounders, which may vary within and between severity groups. This allows us to characterize human and parasite differential gene expression between severe and uncomplicated malaria and the role of parasite load in determining the host response. These data provide a unique insight into host-pathogen interactions associated with severity of infection in humans and reveal new perspectives on the likely pathogenic mechanisms of human severe malaria.


Dual RNA sequencing and adjustment for cellular heterogeneity

We performed dual RNA sequencing on whole blood, collected before treatment, from 46 Gambian children with P. falciparum uncomplicated malaria (n = 21) and severe malaria (n = 25; Table 1). These children were recruited from a region with relatively low malaria transmission and consistent with this epidemiology (4); the severe malaria group contained children with cerebral malaria, hyperlactatemia, or a combination thereof but no cases of severe anemia. After exclusion of parasite var, rifin, stevor (14), and other highly polymorphic regions for which reference genome–based mapping is not possible, we obtained medians of 26.6 million human uniquely mapped reads from each subject (26.6 million severe malaria, 26.7 million uncomplicated malaria; Mann-Whitney test, P = 0.913) and 9.61 million parasite uniquely mapped reads (10.3 million severe malaria, 5.03 million uncomplicated malaria; Mann-Whitney test, P = 0.346) (Fig. 1A). We detected expression of 12,253 human and 3880 parasite genes. Commensurate with the high parasitemias seen in these children (Table 1), parasite read depth was considerably greater than in a previous study of Indonesians with uncomplicated malaria (23).

Table 1 Characteristics of study subjects (n = 46).

CM, cerebral malaria; CH, cerebral malaria plus hyperlactatemia; HL, hyperlactatemia (CM, CH, and HL are all subgroups of severe malaria); UM, uncomplicated malaria; PfHRP2, P. falciparum histidine-rich protein 2; Hb, hemoglobin concentration; WBC, white blood cell count. Data are median [interquartile range (IQR)], and superscripts indicate the number of subjects with data for each variable if less than the total. P for Kruskal-Wallis test comparing all groups (df = 3) except for sex where P is for Fisher’s exact test. Leukocyte population numbers and proportions were measured by a clinical hematology analyzer.

View this table:
Fig. 1 Whole-blood dual RNA sequencing and deconvolution.

(A) Uniquely mapped reads from human (red) and P. falciparum (blue) from subjects with severe malaria (SM; n = 25) and uncomplicated malaria (UM; n = 21). (B and C) Heatmaps showing signature gene expression in counts per million (CPM) for different leukocyte (B) and parasite developmental stage (C) populations and their relative intensity in individual subjects with SM, including different SM phenotypes (CH, cerebral malaria plus hyperlactatemia; CM, cerebral malaria; HL, hyperlactatemia), and UM. (D) Surrogate proportion variables (SPVs) for parasite developmental stages compared between severe malaria and uncomplicated malaria using the Mann-Whitney test (bold line, box, and whiskers indicate median, IQR, and up to 1.5 times the IQR from the lower and upper ends of the box, respectively). (E and F) Principal component (PC) plots showing the effect of deconvolution on the segregation of subjects with UM and SM, adjusting human (E) and parasite (F) gene expression for differences in proportions of leukocytes or parasite developmental stages, respectively. Analyses of human gene expression (B and E): SM, n = 25; UM, n = 21. Analyses of parasite gene expression (C, D, and F): SM, n = 23; UM, n = 20.

Systemic infection provokes changes in blood leukocyte subpopulations, which can dominate changes in gene expression and confound their interpretation (24). Among our study subjects, there were significant differences between clinical groups in the proportions of neutrophils (Kruskal-Wallis test, P = 0.01) and neutrophil counts (Kruskal-Wallis test, P = 0.05) in blood (Table 1). We performed gene signature–based deconvolution (25) to estimate heterogeneity in the contribution of the major leukocyte subpopulations to the RNA in our samples (Fig. 1B and fig. S1). Parasite gene expression in vivo is also dominated by the mixture of parasite developmental stages at the time of sampling because there is phasic variation in gene expression (26), and total RNA content increases during the intraerythrocytic developmental cycle (27). Therefore, we also applied the deconvolution approach with reference gene signatures derived from highly synchronous parasite cultures (26, 28) to identify the contribution of parasites at different developmental stages (Fig. 1C). Because the method was developed and validated for distinct human cell types, we confirmed the effectiveness of estimation of parasite developmental stage mixture by comparison with previously proposed stage-specific marker genes (29) and by assessment of performance in synthetic data sets of known composition (fig. S2). We compared relative contributions of parasite developmental stages between severe and uncomplicated malaria samples and observed a trend toward greater contributions from late-stage asexual parasites and gametocytes in children with severe malaria (Fig. 1D), consistent with previous reports (30).

To remove the confounding effect of heterogeneity in leukocyte and parasite mixtures, we adjusted gene expression values for the proportions of detected cell types, essentially allowing us to compare gene expression as if all subjects had the same leukocyte and parasite population compositions. Adjustment for heterogeneity in the mixture of leukocytes and parasite developmental stages improved segregation of severe and uncomplicated malaria cases [Fig. 1, E and F; multivariate analysis of variance (MANOVA), P = 0.0013 and P = 0.00012 (for human gene expression) and P = 0.0049 and P = 0.00019 (for parasite gene expression), before and after adjustment, respectively]. Therefore, we used adjusted gene expression values for all subsequent analyses.

Gene expression associated with severe malaria and parasite load

To identify differentially expressed genes between clinical groups and to identify gene expression associated with continuous variable severity features, we used a generalized linear model approach incorporating leukocyte populations or parasite stage as covariates, and we considered a false discovery rate (FDR)–adjusted P < 0.05 as significant. Considering all subjects, there were 770 human significantly differentially expressed genes between severe and uncomplicated malaria (Fig. 2A and table S1). Genes more highly expressed in severe malaria versus uncomplicated malaria most notably included MMP8 (matrix metallopeptidase 8), OLFM4 (olfactomedin 4), DEFA3 (defensin A3), and ELANE (neutrophil elastase), all encoding neutrophil granule proteins (31). Given our adjustment for cellular heterogeneity, these results likely reflect a true increase in transcription of these genes rather than a greater proportion of neutrophils in the blood, and differences remained when absolute neutrophil counts were multiplied by expression (Fig. 2B). We performed gene ontology (GO) analyses to better understand the biological functions of the differentially expressed genes (Fig. 2C) and identified enrichment of cotranslational protein targeting, cell motility, and immune response functions (table S2). We used Ingenuity Pathway Analysis (QIAGEN Bioinformatics) to predict upstream regulators of the differentially expressed genes, and colony-stimulating factor 3 [CSF3; also known as granulocyte colony-stimulating factor (GCSF)], Fas cell surface death receptor, and prostaglandin E receptor 2 signaling were among the most overrepresented (table S3). We repeated similar analyses to identify and interpret genes differentially expressed between uncomplicated malaria and different clinical phenotypes of severe malaria (fig. S3 and tables S1 and S2), which revealed many consistent associations but also some notable differences. For example, the number of differentially expressed genes substantially increased when comparing the subgroup of subjects with cerebral malaria plus hyperlactatemia (the most severe phenotype; n = 12) versus uncomplicated malaria (n = 21), possibly reflecting their greater severity and homogeneity of disease (table S1 and fig. S3). The most highly expressed genes in these patients relative to uncomplicated malaria included genes for neutrophil granules and heat shock proteins.

Fig. 2 Association of gene expression with features of severe malaria and parasite load.

(A) Volcano plot showing extent and significance of up- or down-regulation of human gene expression in severe malaria (SM) compared with uncomplicated malaria (UM) (red and blue, P < 0.05 after Benjamini-Hochberg adjustment for FDR; orange and blue, absolute log2 fold change (log2FC) in expression > 1; SM, n = 25; UM, n = 21). (B) Comparison of selected neutrophil-related gene expression multiplied by absolute neutrophil count in blood between SM (n = 21) and UM (n = 20) [bold line, box, and whiskers indicate median, IQR, and up to 1.5 times IQR from the lower and upper ends of the box, respectively; units are fragments per kilobase of transcript per million mapped reads/liter (FPKM/L); P for Mann-Whitney test]. (C) Heatmap comparing enrichment of GO terms for human genes significantly differentially expressed between severe malaria and uncomplicated malaria or significantly associated with blood lactate, platelet count, or BCS. TAP, transporter associated with antigen processing. (D) P. falciparum differential gene expression in severe malaria compared to uncomplicated malaria [color coding as in (A); SM, n = 23; UM, n = 20]. (E) Heatmap comparing enrichment of GO terms for parasite genes significantly differentially expressed between severe malaria and uncomplicated malaria and or significantly associated with blood lactate or BCS. (F) Heatmap comparing GO terms for human genes significantly associated with log parasite density and log PfHRP2.

To gain a greater insight into specific pathophysiological processes, we examined the quantitative association of gene expression with clinical and laboratory parameters (Table 1) that characterize specific aspects of severe malaria pathophysiology: consciousness level [using the Blantyre coma scale (BCS)], blood lactate concentration, platelet count, and hemoglobin concentration (table S4). Seven hundred thirty-eight genes were significantly associated with BCS level (using ordinal regression with FDR P < 0.05; table S4), and decreasing consciousness level (lower BCS) was associated with both higher expression of genes involved in the cell cycle and lower expression of genes involved in major histocompatibility complex (MHC) class I antigen presentation and interferon-γ (IFN-γ) signaling (Fig. 2C and table S2). Predicted upstream regulators included estrogen receptor 1 and transglutaminase 2 (table S3). One thousand twelve human genes were associated with lactate concentration (table S4), among which immune response pathways were again prominent, but a negative association between lactate and type 1 IFN signaling was particularly notable (Fig. 2C and table S2), and the most strongly predicted upstream regulators were IFN-γ, IFN-α, and tumor necrosis factor (TNF; table S3). One hundred seventy-eight genes were associated with platelet count (table S4), and the most enriched pathways differed considerably from those in the preceding analyses (Fig. 2C and table S2), with nucleosome assembly (predominantly histone genes), coagulation, and response to wounding genes all negatively correlated, and the most strongly predicted upstream regulators being interleukin 13, retinoblastoma transcriptional corepressor 1, and interleukin 1 receptor antagonist (table S3). No human genes correlated with hemoglobin concentration. Together, these results identify common transcriptional features of severe malaria but also implicate distinct mechanisms underlying the different pathophysiological processes that can occur in severe malaria.

There were 236 parasite genes differentially expressed between severe and uncomplicated malaria (Fig. 2D and table S5). The parasite gene with highest expression in severe relative to uncomplicated malaria was PF3D7_1016300, a gene that encodes a glycophorin-binding protein (GBP) expressed in the cytoplasm of infected erythrocytes and influences adhesion and rigidity of the red cell (32, 33). The most down-regulated parasite gene (with known function) in severe relative to uncomplicated malaria was PF3D7_1222600, which encodes the AP2 (Apetala2) domain transcription factor AP2-G. This protein controls the balance between gametocytogenesis and asexual replication, and knockout of the ortholog in Plasmodium berghei ANKA enhances the in vivo asexual parasite growth rate (34, 35). As with human gene expression, there were more differentially expressed genes in the comparison of uncomplicated malaria versus the cerebral malaria plus hyperlactatemia group (fig. S3 and table S5). Here, the most differentially expressed parasite genes included PF3D7_0202000 (knob-associated histidine-rich protein), PF3D7_1016300 (GBP), PF3D7_0201900 (erythrocyte membrane protein 3), and PF3D7_0424600 (Plasmodium helical interspersed subtelomeric-b, protein), all of which encode proteins that interact with the erythrocyte cytoskeleton to influence cytoadhesion and deformability of the infected erythrocyte, making them plausible determinants of severity (33). Parasite genes differentially expressed in severe compared to uncomplicated malaria were enriched in specific biological functions including RNA processing, protein transport, and hemoglobin catabolism (Fig. 2E and table S6).

Four hundred forty-five parasite genes were significantly associated with BCS level (using ordinal regression with FDR P < 0.05), and those most significantly associated with lower BCS were PF3D7_0919800 (TLD domain–containing protein), PF3D7_1133700 (forkhead-associated domain–containing protein), and PF3D7_1408200 (AP2 domain transcription factor AP2-G2), the latter two being important determinants of asexual parasite growth rate (table S7) (35, 36). The most enriched functions associated with BCS included transport, hemoglobin catabolism, and prenylation (Fig. 2D). One hundred parasite genes associated with lactate (table S7). The most significant (FDR P = 2.4 × 10−6) was PF3D7_0201900 (encoding erythrocyte membrane protein 3), consistent with infected erythrocyte rigidity and cytoadhesion (32, 37, 38) being important determinants of microvascular obstruction and hyperlactatemia (39). Pathway enrichments among these genes differed from those most associated with BCS and included membrane docking and ribosomal RNA (rRNA) processing (Fig. 2D and table S6). Few parasite genes were associated with hemoglobin concentration or platelet count (table S7). Together, these findings indicate that different patterns of parasite gene expression are associated with, and may therefore contribute to, specific aspects of host pathophysiology.

To establish whether changes in parasite gene expression might be the cause or consequence of severe malaria, we tested the effect of hyperlactatemia on parasite gene expression in vitro. Sixty-one genes were differentially expressed between lactate-supplemented (n = 4) and control (n = 5) early ring-stage parasite cultures, particularly enriched in genes associated with transcription and RNA processing (tables S5 and S6). Two of the genes most highly induced by lactate supplementation were PF3D7_1016300 (GBP) and PF3D7_0202000 (knob-associated histidine-rich protein), genes that were also highly expressed in the cerebral malaria plus hyperlactatemia phenotype. This suggests that lactate may influence the virulence phenotype of parasites, consistent with a recent report that Plasmodium can sense and respond to the host metabolic environment (40).

Previous studies have shown a correlation between the expression levels of host genes and circulating parasitemia (41, 42). Parasite load differed between our subjects with severe and uncomplicated malaria (Table 1), and we were interested to determine the extent to which this explained the differences in whole-blood gene expression. Peripheral blood parasite quantification (parasite density) underestimates the total number of parasites in the body because of sequestration of parasites in the microvasculature (13, 14). The soluble parasite protein PfHRP2 has been used as a plasma biomarker of total parasite load (circulating plus sequestered parasites) and is more strongly associated with severity (8, 9, 11) and death (8, 9). We examined the association of host and parasite gene expression with both circulating parasite density and PfHRP2 (restricting comparisons to subjects with data for both). We found 1886 human genes significantly (FDR P < 0.05) correlated with log parasite density and 616 significantly correlated with log PfHRP2 (102 common to both), whereas only 2 and 10 parasite genes were significant in the corresponding analyses (none common to both; tables S4 and S7). Human genes correlated with log parasite density were particularly enriched in pathways related to translation (especially exported proteins), oxidative phosphorylation, and antigen presentation (Fig. 2F and table S2), with predicted upstream regulation by RICTOR [Regulatory-associated protein of mammalian target of rapamycin (MTOR)]–independent companion of MTOR complex 2), hepatocyte nuclear factor 4 α (HNF4A), and X-box binding protein 1 (table S3). Genes correlated with log PfHRP2 were enriched in innate immune response functions (Fig. 2F and table S2), and the most strongly predicted upstream regulators were IFN-γ, transglutaminase 2, and IFN-α2 (table S3). These findings suggest that the nature of the systemic host response is associated with the localization of parasites.

We next asked to what extent the differences in gene expression between severe and uncomplicated malaria phenotypes were dependent on parasite load. Restricting analyses to subjects with both parasite density and PfHRP2 measurements, the number of human genes differentially expressed in severe versus uncomplicated malaria remained almost unchanged after adjustment for parasite density but was reduced by 98.6% after adjustment for PfHRP2, whereas parasite genes differentially expressed changed much less after either of the same adjustments (Table 2 and tables S1 and S5). Findings were similar when adjusting for parasite load in comparisons of each of the severe malaria subtypes versus uncomplicated malaria (tables S1 and S5). Repeating this analysis on an independent data set of human microarray gene expression in Malawian children with cerebral malaria (43) revealed 994 differentially expressed genes (FDR P < 0.05) between children with (n = 55) and without (n = 17) malaria-associated retinopathy (table S1), with 608 (61%) genes found to be differentially expressed after adjustment for parasite density, and none differentially expressed after adjustment for PfHRP2.

Table 2 Numbers of differentially expressed genes before and after adjustment for parasite load.

Number of genes associated with severity category or laboratory marker of severity before and after adjustment for parasite load (% of number in unadjusted analysis where applicable). SM, severe malaria; UM, uncomplicated malaria.

View this table:

Together, these findings suggest that total body parasite load, as represented by PfHRP2, is a dominant determinant of host gene expression in malaria, particularly of inflammatory and immune response genes, and differences in total body parasite load drive most of the human gene expression differences between severe and uncomplicated malaria. However, if genes remain associated with severity after adjustment for parasite load, then this may indicate intrinsic variation in the host response, which determines susceptibility to severe disease. In our data set, only 13 genes remained significant (FDR P < 0.05) after adjustment for PfHRP2 (Table 2 and table S1). Of particular interest among these, MMP8 encodes the metallopeptidase MMP8 (also known as collagenase 1), which causes endothelial barrier damage in several infection models (44, 45), and AZI2 encodes 5-azacytidine induced 2 (also known as nuclear factor κB–activating kinase-associated protein 1) (46), a regulator of the type 1 IFN response, a pathway that is known to control severity of disease in rodent malaria models (47), whereas CX3CR1 encodes the receptor for fractalkine [a biomarker of cerebral malaria in humans (48)], expressed on a subset of monocytes that are particularly efficient at killing malaria parasites (49), and controls the trafficking of monocytes during inflammation (50).

Parasite load was also a major driver of the associations between human gene expression and BCS level, lactate, and platelet count, although platelet count–associated genes were relatively less dependent on parasite load (Table 2). The few remaining genes significantly associated with lactate after adjustment for PfHRP2 included PKM (encoding the glycolytic enzyme pyruvate kinase M) and GYS1 (encoding the glycogenic enzyme glycogen synthase 1; table S4), suggesting that hyperlactatemia is partly associated with parasite load–independent variation in control of host glucose metabolism.

Coexpression networks of host and parasite genes

The expression of genes with common functional roles is often correlated and can be identified through coexpression network analysis (51). We applied this methodology to paired host and parasite gene expression data from each individual (without previous adjustment for parasite load) to identify coexpressed groups of genes from either or both species. We term these groups of genes “modules,” and we named each module according to the “hub gene,” which has the greatest connectivity to other genes within the module. First, we analyzed all subjects together, generating a network with 26 modules (Fig. 3 and table S8): 10 modules contained exclusively human genes, 5 contained exclusively parasite genes, and 11 contained both human and parasite genes (most of these highly skewed to a single species). Only the HSPH1 [heat shock protein family H (Hsp110) member 1] module contained more than 10 genes from both human and parasite and was strongly enriched in human heat shock response and parasite RNA metabolism genes. All modules showed significant (P < 0.05) GO enrichments, regardless of host or parasite origin. The composite expression of genes within a module can be described by a module eigengene value (51, 52), and there were associations between module eigengene values and malaria severity, parasite load, consciousness level, and other laboratory parameters (Fig. 3). Some host-dominated and parasite-dominated modules were also highly correlated with each other, most notably the RPL24 (ribosomal protein L24) module (highly enriched in translation pathways) strongly correlated with the functionally similar PF3D7_0721600 (putative 40S ribosomal protein S5) parasite module. We excluded multimapping reads as an explanation for this and suggest that this indicates co-regulation of conserved host and parasite translation machinery. Furthermore, most of these genes were also differentially expressed between severe and uncomplicated malaria.

Fig. 3 Interspecies gene expression modules and their associations with severity.

Circos plot showing gene expression modules obtained from whole-genome correlation network analysis using expression of all human and parasite genes from each subject (severe malaria, n = 22; uncomplicated malaria, n = 19) as the input. From outside to inside: labels, hub gene, and most enriched GO term (with P value) for each module; track 1, module eigengene value for each subject; track 2, clinical phenotype (red, CH; orange, CM; green, HL; yellow, uncomplicated malaria); track 3, hub gene expression (log CPM) for each subject; track 4, heatmap for correlation with laboratory measurements (clockwise, blocks: log parasite density, log PfHRP2, lactate, platelets, hemoglobin, BCS; color intensity represents correlation coefficient as shown in color key); track 5, module size and composition (length proportional to number of genes in module; red, human genes; blue, parasite genes); polygons connect modules with significant (FDR P < 0.01) Pearson correlation between eigengene values (width proportional to −log10 FDR P value; red, positive correlation; blue, negative correlation). ER, endoplasmic reticulum.

Association of coexpression modules with severity

Coexpression network modules can be used as units of analysis, affording considerable dimension reduction for whole-genome expression data. We used module eigengene values and parasite load (with which many modules were correlated; Fig. 3) in linear regression models to determine the best within-sample predictors of severity, starting with all significant (P < 0.01) univariate associations and proceeding by backward selection (table S9). The resulting multivariate model combined MMP8, OAS1 (2′-5′-oligoadenylate synthetase 1), and LYSMD3 (LysM, putative peptidoglycan-binding, domain containing 3) module eigengenes but not parasite load. These modules represent distinct aspects of the immune response (table S8): The MMP8 module, highly enriched in defense response genes with predicted upstream regulators CEBPA (CCAAT/enhancer binding protein α; a myeloid transcription factor) and CSF3, likely reflects granulopoiesis (31); the OAS1 module is highly enriched for type 1 IFN response genes; and the small LYSMD3 module, with limited GO enrichment, contains a functional network around IFN-γ (fig. S4). The direction of association of the OAS1 module with severity changed from negative in univariate analysis to positive in the multivariate analysis, suggesting that inadequate suppression of the type 1 IFN response in conjunction with up-regulation of granulopoiesis and IFN-γ signaling may contribute to pathogenesis.

Differential coexpression in severe malaria

Considering all subjects together in the generation of coexpression networks maximizes power to detect consistently co-regulated genes but may not identify sets of genes where co-regulation is altered by severity. For this reason, we also created separate coexpression networks for uncomplicated and severe malaria and compared modules to identify differential coexpression (Fig. 4 and table S10). Eight modules showed substantial preservation between networks, seven were partially preserved, and two were unique to severe malaria (Fig. 4A). Partial preservation was common among modules comprised predominantly from human or parasite genes (Fig. 4, A and B), and module preservation was not dependent on the proportion of module genes differentially expressed between severe and uncomplicated malaria (Fig. 4, A and C). An MMP8 module (exclusively human genes, many encoding neutrophil granule and phagosome components) was uniquely present in severe malaria subjects, and 38% of its member genes were differentially expressed in the comparison between severe malaria and uncomplicated malaria (table S10). The module was enriched in host defense functions and predicted to be regulated by CEBPA, CSF3, and TNF (table S10). These findings strongly suggest that the MMP8 module represents emergency granulopoiesis (31) and mark this as a specific feature of severe malaria. The TIPRL (target of rapamycin signaling pathway regulator) module (99.2% human genes) was also unique to severe malaria but contained very few (1.3%) differentially expressed genes and had limited GO enrichment, and the most strongly predicted upstream regulator was the transcription factor HNF4A (table S10). Both TIPRL and HNF4A have regulatory roles in metabolic, inflammatory, and apoptosis signal pathways, so the minimal change in expression of this module may represent an aberrant response in severe malaria (53, 54). Among the partially preserved modules, we observed that host and parasite translation pathways were more tightly co-regulated in severe than uncomplicated malaria, with genes being distributed across fewer modules in severe malaria (Fig. 4A and table S10). This once again suggests that there is an interaction between these processes that is associated with severity.

Fig. 4 Severity-associated differential coexpression within the interspecies gene expression network.

(A to C) Cytoscape visualization of merged coexpression networks derived separately from severe malaria (n = 22) and uncomplicated malaria (n = 19). Networks were merged such that genes found in both subnetworks (represented as arrow-shaped, larger-sized nodes) are connected to genes found in only one subnetwork (represented as circular-shaped and smaller-sized nodes). (A) Genes and gene clusters are colored and annotated by module, species, most enriched GO terms, and conservation between subnetworks. Preserved, module pairs from severe malaria and uncomplicated malaria subnetworks overlap with each other but not with other modules; partially preserved, module clusters in one subnetwork overlap with two or more modules in the other subnetwork; unique, gene clustering only found in one subnetwork. Genes in black do not belong to any characterized module. TCA, tricarboxylic acid. (B) Identical network layout with genes colored by species (red, human; blue, P. falciparum). (C) Identical network layout with genes colored by whether they are significantly differentially expressed in severe malaria versus uncomplicated malaria (red, human; blue, P. falciparum; black, not differentially expressed).

Interaction of parasite and host gene expression accounting for parasite load

We sought to integrate pathogen load into analysis of interaction between host and parasite gene expression. To reduce dimensionality, we generated human-only gene expression modules from all subjects (table S11), identified those modules significantly (P < 0.05) associated with severity, and then identified parasite genes with significant (FDR P < 0.05) pairwise associations with these modules in a linear model accounting for log PfHRP2. The human modules associated with severity were similar to those identified in preceding analyses (Table 3). The most significantly associated parasite genes and the most enriched parasite GO terms were those involved in RNA processing and translation (Table 3 and table S11), suggesting that these processes in the parasite drive multiple aspects of the host transcriptional response independent of their effect on parasite load.

Table 3 Parasite genes correlated with human gene coexpression modules after adjustment for parasite load.

“+/−” indicates the number of parasite genes positively/negatively correlated with each human module eigengene value. All “top genes” in the table are positively correlated with the module eigengene value. RTP4, receptor transporter protein 4; TNRC6B, trinucleotide repeat containing 6B; ATPase, adenosine triphosphatase; PPIB, peptidylprolyl isomerase B.

View this table:

Neutrophil-related proteins in plasma

The most differentially expressed genes in comparisons between severe and uncomplicated malaria encode neutrophil granule proteins (Fig. 2A). Relationships between transcription, translation, storage, and release of granule proteins are expected to be complex, but we sought evidence of correlation between gene expression and circulating protein concentrations. In subjects with residual stored plasma, we found significant correlations between gene expression and plasma concentrations of defensin A3 (P = 0.0049, rho = 0.47, n = 34) and elastase (P = 0.045, rho = 0.35, n = 34; fig. S5). MMP8 expression was significantly correlated with plasma concentrations in subjects with severe malaria (P = 0.02, rho = 0.59, n = 15) but not in uncomplicated malaria (P = 0.37, rho = −0.21, n = 20) or all subjects combined (P = 0.88, rho = −0.026, n = 35). Upstream regulator analyses described earlier suggested that GCSF (CSF3) was a major regulator of genes in the MMP8 module. We found that plasma GCSF concentrations significantly correlated with the eigengene values (table S8) for this module (P = 0.0030, rho = 0.64, n = 19). We tested whether neutrophil degranulation occurred in response to parasite material by stimulating healthy donor blood cells with P. falciparum schizont lysate and detected increases in MMP8 release, reaching similar concentrations to those observed in plasma during malaria (fig. S5).


We used dual RNA sequencing to identify simultaneous host and parasite gene expression and their systemic interactions associated with severity of P. falciparum malaria in humans. Although gene expression is only one of the many biological processes involved, our findings add to the argument for an integrated understanding of infectious diseases and make a strong case that neither host nor pathogen should be studied in isolation when possible.

We have identified many associations between gene expression and features of severity, providing plausible insight into the pathogenesis of severe malaria (fig. S6). One of our most marked findings was the overriding effect of parasite load on differences in human gene expression between severe and uncomplicated malaria. Previous studies (23, 41, 42) have examined the association between human gene expression and circulating parasitemia, but we found that estimation of total body (both circulating and sequestered) parasite load was necessary to appreciate the full effect on host response. Our findings imply that the host response in severe malaria is not excessive per se but rather that it is an appropriate host response to an excessive pathogen load. This has important implications for malaria research and likely for other infectious disease, immunology, and pathogenesis research in humans. Without accounting for pathogen load, associations between host factors (such as genetic variants or comorbidities) and severity of infection may be misinterpreted. Unfortunately, total body pathogen load is much harder to measure in other infections in humans where pathogens are not restricted to the blood (55). We found that specific sets of host and parasite genes were associated with different pathophysiological consequences of malaria, although our power to detect associations was limited in the smallest subgroup analyses. Distinct sets of host genes were associated with BCS, lactate concentration, and platelet count. Hyperlactatemia in malaria is often ascribed to anaerobic metabolism arising from microvascular obstruction by adherent and rigid parasitized erythrocytes (4, 39). These properties are partly determined by the expression of particular members of the var, rif, and stevor gene families (14), which we did not include in our analysis because their extreme degree of polymorphism prevents reference genome–based quantification. Despite this limitation, we still found associations between lactate concentration and severity and in vivo variation in the expression of other parasite genes known to modify the biophysical properties of the infected erythrocyte. Some variation in parasite gene expression may have a genetic basis (56), but our in vitro data suggest that it may also occur as a response to the host environment.

The human genes most correlated with lactate were immune response–related, suggesting that inflammation, and perhaps its effect on glycolysis, may be involved (57, 58). If lactate production is associated with the strength of the host response, then the changes in expression of parasite genes in response to lactate might favor sequestration and evasion of innate immune cells as a parasite survival strategy. Cerebral malaria in humans is usually ascribed to parasite sequestration in the cerebral microvasculature, but the association between BCS and antigen presentation via MHC class I, IFN-γ, and type 1 IFN signaling would be consistent with the localization of these immunopathological mechanisms away from the blood and into the brain microvasculature as seen in rodent experimental cerebral malaria (15, 16, 59). Thromobocytopenia is almost invariable in malaria, and its mechanism is poorly understood (7). Our findings not only implicate the well-recognized activation of endothelial surfaces and coagulation pathways in malaria (7) as a cause but also lead us to suspect a role for histone-induced thrombocytopenia (60, 61).

Many of our results converge on a putative role for neutrophils in severe malaria. Different analytical approaches repeatedly identified the association of genes encoding neutrophil granules (such as MMP8, OLFM4, DEFA3, and ELANE) and upstream regulators of granulopoiesis (CSF3 and CEBPA) with severe outcomes. Granule proteins are enriched in immature neutrophils (31, 62), which are mobilized from the bone marrow to the circulation in malaria (63, 64), and there is plentiful evidence that neutrophil degranulation occurs in severe malaria (43, 65). Release of neutrophil granule proteins can be highly damaging to host tissues (62), and increased production and release of these proteins could contribute to many of the pathological features of severe malaria. Both neutrophil granule proteins and histones are released into the circulation during production of neutrophil extracellular traps (NETs) (62, 66), a phenomenon that has been described in malaria (64). Note that similar neutrophil-related signatures are not reported in the whole-blood transcriptomes of rodent models that have been examined to date (42, 67), creating a challenge for experimental validation. However, neutrophil depletion has been shown to prevent experimental cerebral malaria (68), and although this is not a viable therapeutic option in humans, pharmacological inhibitors of specific neutrophil functions such as NETosis are being evaluated in other diseases (66).

We observed a relationship between type 1 IFN responses and severity of malaria, which may help to tie together data from previous observations in humans and animal models. In a small study, expression of type 1 IFN response genes in blood from uncomplicated malaria was higher than in severe malaria, leading to the suggestion that this may be protective against developing severe malaria (69). However, we found that type 1 IFN response genes were negatively correlated with parasite load, suggesting that down-regulation with increasing parasite load (and severity) is a more likely explanation. Our multivariate analyses using gene expression modules to explain severity suggested that insufficient suppression of type 1 IFN signaling was associated with severity. This would be more consistent with results in mouse malaria models where genetic or antibody-mediated ablation of type 1 IFN signaling improves outcome (7073).

Very few parasite genes correlated with parasite load at the time of clinical presentation. This may be a consequence of the dynamic nature of parasite load, which is determined by parasite growth rate, the number of replication cycles in the host (duration of infection), and a reciprocal interaction with the constraining host response. Thus, lower expression of the gene encoding ApiAP2-G in severe malaria may increase the asexual parasite growth rate (34) and make severe malaria more likely, without this gene exhibiting any correlation with parasite load. It may also seem paradoxical that this gene is down-regulated in severe malaria given that gametocytes are usually more common in severe malaria (30). Development of mature gametocytes takes 10 to 12 days, for much of which they are not in the systemic circulation (74). We speculate that the mature gametocytes detected at the time of clinical presentation may reflect preferential gametocytogenesis in early infection and perhaps a subsequent reduction in ApiAP2-G that promotes enhanced asexual replication and severe disease.

We noted that both human and parasite translation pathways were associated with severe malaria, and these pathways showed the strongest evidence of interaction between species, with co-regulation appearing tighter in more severe disease. Increased translation is important for production of host defense effector proteins (75) and parasite proteins that enable survival (76), and it is feasible that these processes drive each other. This raises the intriguing question of whether addition of a translation-inhibiting antimalarial such as mefloquine (77) to standard artesunate treatment may have added benefit in severe malaria.

The differences in human and parasite gene expression between severe and uncomplicated malaria were much clearer after adjusting for heterogeneity of leukocyte population and parasite developmental stage. Although the importance of accounting for such variation is well recognized (24), it is rarely done in infectious disease transcriptomic studies. Several studies have used alternative methods to account for parasite developmental stage distribution and have shown that this has a major impact on observed associations between parasite gene expression and clinical phenotype (78, 79).

Although we cannot establish causation from an observational study such as this, our findings should launch points for future work assessing the implicated mechanisms and their potential as targets for adjunctive therapies. The identification of multiple and sometimes distinct host and parasite mechanisms associated with differing aspects of pathophysiology potentially bodes ill for adjunctive therapies, which might need to have multiple targets and perhaps be personalized to differing severe malaria manifestations. However, the common association of neutrophil granule protein genes with all severe malaria manifestations suggests that targeting neutrophil function may be a therapeutic strategy in the future.


Detailed Supplementary Materials and Methods are available online.

Study design

The primary aim of the study was to analyze differential human and parasite gene expression between children with severe malaria and uncomplicated malaria and to determine association of gene expression with parasite load. Secondary aims were (i) to analyze differences in gene expression associated with different severe malaria syndromes and with continuous variable markers of pathophysiology, (ii) to evaluate coexpression of host and parasite genes, and (iii) to evaluate differential coexpression associated with severe malaria. Sample size was determined pragmatically on the basis of the availability of suitable samples with necessary clinical and laboratory data. We aimed to achieve close to 30 million mapped human reads and 5 million mapped parasite reads per sample and calculated the likely number of reads required to achieve this based on the percentage parasitemia in each subject and the likely amount of RNA per ring-stage parasite (27). From available RNA samples, we aimed to have roughly equal numbers of severe and uncomplicated malaria cases, and within the severe malaria cases, we aimed to have eight subjects with each phenotype. After assessment of RNA quantity and quality, some samples were unsuitable for RNA sequencing, which resulted in the final composition of groups being slightly unbalanced.

Subjects and samples

Gambian children (under 16 years old) with P. falciparum malaria were recruited as previously described (11, 80, 81). Informed consent was obtained, and the study was approved by the Gambian Government/Medical Research Council (MRC) Laboratories Joint Ethics Committee (scientific coordinating committee references 670, 1077, 1143, 1178, 1179, 1180, 1207, and L2013.07V2). Malaria was defined by fever and >5000 asexual parasites per microliter of blood. Cerebral malaria was defined as a BCS of 1 or 2, or a BCS of 3 if the motor response was 1, not due to other causes (11, 80). Hyperlactatemia/acidosis was defined as blood lactate concentration >5 mM (11). Subjects meeting both criteria were described as having cerebral malaria plus hyperlactatemia/acidosis. Blood was collected at the time of presentation to the clinic, before any treatment (80).

For RNA sequencing, severe and uncomplicated malaria groups were matched as closely as possible by age and gender (data set S1). For uncomplicated malaria samples, we aimed to include an equal number with parasitemia above and below 5%.

RNA sequencing

RNA was collected and extracted as described previously (80). One microgram of total RNA was used with a ScriptSeq v2 RNA-sequencing library preparation kit (Illumina), and rRNA and globin mRNA were depleted using a Globin-Zero Gold kit (Epicentre). Strand-specific libraries were sequenced using the 2 × 100– base pair protocol with an Illumina HiSeq 2500.

Genomes and RNA annotations

Reference genomes were obtained for human (hg38) (82) and P. falciparum (release 24) (83), and gene annotations were obtained from GENCODE (release 22) (84) and PlasmoDB (release 24) (83).

Statistical analysis

Characteristics were compared between subject groups using the Kruskal-Wallis or Mann-Whitney tests for continuous data and Fisher’s exact test for categorical data. The Wilcoxon matched pairs test was used to compare paired data. Correlations were performed using Spearman correlation or Pearson correlation (when data were normally distributed). To determine genome-wide gene expression associated with variables of interest, we calculated P values for individual genes as described in Supplementary Materials and Methods, and a FDR-adjusted P < 0.05 was considered significant. Fisher’s exact test was used to identify gene set enrichments. Logistic regression was used to determine associations between severity and module eigengene values.


Materials and Methods

Fig. S1. Estimates of the relative proportions of leukocyte subpopulations in subjects with severe and uncomplicated malaria.

Fig. S2. Validation of the gene-signature approach to estimate parasite developmental stage proportions.

Fig. S3. Differential gene expression between severe malaria and uncomplicated malaria phenotypes.

Fig. S4. Top functional network for the small LYSMD3 module.

Fig. S5. Association between gene expression and plasma protein concentrations.

Fig. S6. Host-pathogen interactions in severe malaria revealed through dual RNA sequencing.

Table S1. Human genes differentially expressed between malaria disease phenotypes in unadjusted and parasite load–adjusted analyses.

Table S2. GO terms associated with human differentially expressed or significantly correlated genes in unadjusted and parasite load–adjusted analyses.

Table S3. Predicted upstream regulators associated with human differentially expressed or significantly correlated genes in unadjusted and parasite load–adjusted analyses.

Table S4. Human genes significantly correlated with parasite load and pathophysiological variables in unadjusted and parasite load–adjusted analyses.

Table S5. P. falciparum genes differentially expressed in unadjusted and parasite load–adjusted analyses.

Table S6. GO terms associated with parasite differentially expressed or significantly correlated genes in unadjusted and parasite load–adjusted analyses.

Table S7. P. falciparum genes significantly correlated with parasite load and pathophysiological variables in unadjusted and parasite load–adjusted analyses.

Table S8. Summary of modules obtained from combined whole-genome correlation network.

Table S9. Univariate and multivariate associations of module eigengene values and parasite load with severity.

Table S10. Summary and overlap of whole-genome correlation subnetworks for severe and uncomplicated malaria.

Table S11. Summary of modules obtained from human-only whole-genome correlation network.

Data set S1. Subject-level clinical and laboratory data.

References (8593)


Acknowledgments: We are grateful to the study subjects, staff at the Medical Research Council Unit The Gambia at London School of Hygiene and Tropical Medicine, the Jammeh Foundation for Peace Hospital, and the Brikama Health Centre; K. Paszkiewicz and staff at the Exeter Sequencing Service at the University of Exeter; and J. Daily for providing subject-level data for GSE72058. Funding: This work was funded by the UK MRC and the UK Department for International Development (DFID) under the MRC/DFID Concordat agreement and is also part of the EDCTP2 program supported by the European Union (MR/L006529/1 to A.J.C.), the European Research Council (AdG-2011-294428 to D.J.C. and L.B.S.), MRC core funding of the MRCG, and the Wellcome Trust (098051 to A.J.C.). Exeter Sequencing Service is supported by an MRC Clinical Infrastructure award (MR/M008924/1), the Wellcome Trust Institutional Strategic Support Fund (WT097835MF), a Wellcome Trust Multi-User Equipment Award (WT101650MA), and a Biotechnology and Biological Sciences Research Council Longer and Larger award (BB/K003240/1). Author contributions: A.J.C., M.L., and D.J.C. conceived the study. H.J.L., T.D.O., and A.G. performed formal analysis. A.G., M.W., and A.J.C. performed laboratory analyses. M.W., D.J.C., D.N., and L.B.S. provided the samples. D.J.C., T.D.O., and L.B.S. provided the methodology. A.J.C. and H.J.L. wrote the original draft. All authors contributed to the review and editing of the manuscript. L.J.C., D.J.C., M.L., T.D.O., and A.J.C. provided supervision. A.J.C. obtained funding. Competing interests: The authors declare that they have no competing financial interests. Data and materials availability: RNA-sequencing data from human samples have been deposited in the ArrayExpress database at European Molecular Biology Laboratory European Bioinformatics Institute ( under accession number E-MTAB-6413. The accession number for in vitro parasite RNA sequencing is E-MTAB-6573. Source data for Table 1 are provided with the paper as data set S1.

Stay Connected to Science Translational Medicine

Navigate This Article