Research Article | HIV

Single-cell transcriptional landscapes reveal HIV-1–driven aberrant host gene transcription as a potential therapeutic target


Science Translational Medicine  13 May 2020:
Vol. 12, Issue 543, eaaz0802
DOI: 10.1126/scitranslmed.aaz0802

Sorting out HIV

The latent reservoir of HIV-1–infected cells that persist in otherwise virally suppressed individuals constitutes the major barrier to cure. Liu et al. developed a method called HIV-1 SortSeq to identify rare HIV-infected CD4+ T cells from individuals on antiretroviral therapy upon latency reversal ex vivo. Analysis of the isolated single cells showed that the 5′ long terminal repeat of HIV-1 was capable of driving the transcription of host genes downstream of the integration site, which may contribute to HIV persistence.

Abstract

Understanding HIV-1–host interactions can identify the cellular environment supporting HIV-1 reactivation and mechanisms of clonal expansion. We developed HIV-1 SortSeq to isolate rare HIV-1–infected cells from virally suppressed, HIV-1–infected individuals upon early latency reversal. Single-cell transcriptome analysis of HIV-1 SortSeq+ cells revealed enrichment of nonsense-mediated RNA decay and viral transcription pathways. HIV-1 SortSeq+ cells up-regulated cellular factors that can support HIV-1 transcription (IMPDH1 and JAK1) or promote cellular survival (IL2 and IKBKB). HIV-1–host RNA landscape analysis at the integration site revealed that HIV-1 drives high aberrant host gene transcription downstream, but not upstream, of the integration site through HIV-1–to–host aberrant splicing, in which HIV-1 RNA splices into the host RNA and aberrantly drives host RNA transcription. HIV-1–induced aberrant transcription was driven by the HIV-1 promoter as shown by CRISPR-dCas9–mediated HIV-1–specific activation and could be suppressed by CRISPR-dCas9–mediated inhibition of HIV-1 5′ long terminal repeat. Overall, we identified cellular factors supporting HIV-1 reactivation and HIV-1–driven aberrant host gene transcription as potential therapeutic targets to disrupt HIV-1 persistence.

INTRODUCTION

Despite effective antiretroviral therapy (ART), HIV-1 persists in latently infected CD4+ T cells as a major barrier to cure (1–3). During treatment interruptions, CD4+ T cells from the latent reservoir contribute to viral rebound (4). To cure HIV-1 infection, all cells harboring infectious HIV-1 proviruses need to be recognized and eliminated by the immune system. However, latent HIV-1 proviruses are transcriptionally inactive and do not present antigens to immune effectors. Transcriptional interference by the host gene may induce transcriptional read-through or viral promoter occlusion and prevent HIV-1 transcription (5, 6). Transcriptional blocks (7), such as the lack of active forms of cellular transcription factors (8–13) and the HIV-1 Tat feedback loop (14–17), prevent HIV-1 transcriptional initiation and elongation. Therefore, despite maximum T cell activation, HIV-1 reactivation is stochastic, as some HIV-1 proviruses remain transcriptionally inactive (18). It remains unclear which cellular factors contribute to maximum HIV-1 expression upon latency reversal, particularly in CD4+ T cells from HIV-1–infected individuals. Understanding the cellular environment supporting HIV-1 expression upon latency reversal is required to effectively target HIV-1–infected cells.

The rarity of HIV-1–infected cells in vivo and the lack of cellular surface markers that can distinguish HIV-1–infected cells harboring intact proviruses make it challenging to study HIV-1–host interactions in vivo. Only 1 to 100 per million CD4+ T cells harbor intact HIV-1 proviruses in ART-treated, virally suppressed, HIV-1–infected individuals (18–20). Using HIV-1 Env expression as a surrogate, broadly neutralizing antibodies can capture HIV-1–infected cells for transcriptome analysis (21). However, induction of readily detectable amounts of HIV-1 Env expression requires >30 hours of mitogen stimulation (21). In contrast, using HIV-1 RNA as a surrogate, fluorescence in situ hybridization (FISH)–based methods can identify HIV-1–infected cells within 24 hours of stimulation (22). However, the branched DNA amplification methods used to amplify the low amount of HIV-1 RNA in FISH-based methods degrade RNA and prevent downstream transcriptome analysis. Therefore, the cellular environment supporting HIV-1 transcription during early latency reversal remains unknown.

HIV-1–infected cells undergo clonal expansion, and these clonally expanded cells increase over time (23–25). More than 50% of HIV-1–infected cells harboring replication-competent HIV-1 proviruses are maintained through clonal expansion (26–28), making this proliferating latent reservoir a major concern. Clonal expansion is presumably driven by antigen stimulation (29, 30) and homeostatic proliferation (31–33). Recent evidence suggests that proliferation of some HIV-1–infected cells in vivo is associated with HIV-1 integration sites, particularly in individuals treated with years of suppressive ART (23–25): First, although HIV-1 integration sites during in vitro infection span the transcription unit of the host genome, HIV-1 integration sites found in vivo are enriched in a small region of certain cancer-related genes (such as within 90 kb of the 370-kb BACH2 transcription unit) as expanded clones (23, 24). Second, in vivo enrichment of HIV-1 integration into these cancer-related genes is exclusively in the same orientation as the host transcription unit, whereas HIV-1 integration into these cancer-related genes in vitro can be either the same or opposite orientation (23, 24). This indicates that both the location and orientation of HIV-1 at the integration site may be associated with preferential proliferation. Third, HIV-1 integration in the same orientation as the host transcription unit, such as BACH2 and STAT5B, induces HIV-1 RNA splicing into the host RNA, creating HIV-1–host chimeric RNA that can contribute to HIV-1 persistence (34). These lines of evidence suggest that an unknown mechanism, which cannot be explained by antigen-driven proliferation and homeostatic proliferation, favors the proliferation of HIV-1–infected cells in a manner dependent on integration location and orientation.

We propose that HIV-1 reactivation changes the host cellular transcriptional landscape both globally in infected CD4+ T cells and locally at the integration site. Here, we developed HIV-1 SortSeq to identify rare HIV-1–infected cells from ART-treated, virally suppressed, HIV-1–infected individuals and examined the transcriptional landscape of HIV-1–infected cells within 24 hours of latency reversal. We further developed a CRISPR-dCas9–based HIV-1–specific activation and inhibition system to examine the impact of HIV-1 promoter activity on the host gene at the integration site. Our approach examines the cellular environment supporting full HIV-1 reactivation and mechanisms of HIV-1 integration site-related proliferation.

RESULTS

HIV-1 SortSeq identifies HIV-1–infected cells from individuals

To examine HIV-1–host genetic interactions upon early latency reversal, we developed HIV-1 SortSeq to identify rare (1 to 100 per 10^6) (18, 20, 35, 36) CD4+ T cells containing inducible HIV-1 from ART-treated, virally suppressed, HIV-1–infected individuals (Fig. 1; fig. S1, A and B; and table S1). Briefly, we treated CD4+ T cells with phorbol 12-myristate 13-acetate (PMA) and ionomycin ex vivo for 16 hours in the presence of enfuvirtide to induce HIV-1 latency reversal without causing cellular proliferation or new rounds of infection (Fig. 1A). Although this approach cannot answer why latent HIV-1 remains latent, we proposed to examine why inducible HIV-1 can be induced through examination of the cellular environment supporting HIV-1 transcription during early (<24 hours) latency reversal. Cells were fixed, permeabilized, and hybridized with HIV-1 RNA-specific probes targeting 5′ and 3′ HIV-1 (data files S2 and S3) under RNA-preserving conditions (37) and sorted by flow cytometry. The use of one set of 96 fluorescently labeled probes targeting 5′ HIV-1 RNA and another set of 96 fluorescently labeled probes targeting 3′ HIV-1 overcomes the HIV-1 sequence diversity in clinical samples and maximizes HIV-1 RNA signal without RNA degradation caused by branched DNA amplification (22). HIV-1 SortSeq detected primary CD4+ T cells infected with NL4-3 reference strain or clinical isolates (Fig. 1B) and CD4+ T cells from HIV-1–infected individuals (Fig. 1C) with high specificity (Fig. 1D). HIV-1 SortSeq captured one HIV-1–infected cell per million uninfected cells (Fig. 1E). The frequency of HIV-1 SortSeq+ cells was correlated with the size of the latent reservoir measured by the quantitative viral outgrowth assay (Fig. 1F). The RNA in HIV-1 SortSeq cells was sufficient for single-cell complementary DNA library construction (fig. S1C).

Fig. 1 HIV-1 SortSeq identifies CD4+ T cells harboring inducible HIV-1 from ART-treated, virally suppressed, HIV-1–infected individuals.

(A) Experimental scheme of HIV-1 SortSeq. CD4+ T cells from ART-treated, virally suppressed, HIV-1–infected individuals were sorted two-way into HIV-1 SortSeq+ and SortSeq− single cells. A stringent gating strategy away from the HIV-1 SortSeq cells was used. (B) HIV-1 SortSeq gating strategy for primary cells infected with clinical isolates. The gating strategy was used to demonstrate the four quadrants of HIV-1 5′ and 3′ positive staining, not for sorting. (C) HIV-1 SortSeq gating strategy for CD4+ T cells isolated from HIV-1–infected, ART-treated, and virally suppressed individuals upon latency reversal. The gating strategy was used to demonstrate the four quadrants of HIV-1 5′ and 3′ staining, not for sorting. (D) Fluorescent microscopic imaging of primary CD4+ T cells infected with the NL4-3 reference strain and reconstructed clinical isolates at ~10% of infectivity. DAPI, 4′,6-diamidino-2-phenylindole nuclear staining. (E) Regression analysis of the frequency of HIV-1–infected cells detected by HIV-1 SortSeq versus the frequency of input HIV-1–infected cells. (F) Correlation between the frequency of HIV-1 SortSeq+ cells and the size of the latent reservoir as measured by viral outgrowth assays.

We captured 86 HIV-1 SortSeq+ cells in bulk from seven participants and 48 HIV-1 SortSeq+ single cells from 14 ART-treated, virally suppressed, HIV-1–infected individuals (data file S4). Instead of choosing participants with a known, comparatively large latent reservoir (21), we did not preselect participants to have a better representation of the diversity of the latent reservoir. The low number of cells we identified reflects the rarity of cells containing inducible HIV-1. We identified spliced HIV-1 RNA, no large internal deletions, and no APOBEC3G-mediated hypermutations in HIV-1 SortSeq+ cells, indicating a likely intact HIV-1 genome (fig. S2). Of note, HIV-1 harboring small internal deletions or missense mutations can still produce readily detectable amounts of HIV-1 RNA (38). Using identical HIV-1 env V3-V4 sequences as an indicator of clonally expanded HIV-1–infected cells (39), we identified the same expanded clone by HIV-1 SortSeq and from viral outgrowth culture positive wells, indicating detection of clonally expanded replication-competent HIV-1 (fig. S3). We then identified HIV-1 SortSeq+ and SortSeq− cells from the same HIV-1–infected individuals through two-way flow cytometric single-cell sorting for single-cell RNA sequencing (RNA-seq). HIV-1 SortSeq+ and SortSeq− cells from HIV-1–infected individuals were sorted directly into tubes containing RNA-preserving buffer to maximize RNA capture. Although this method captures HIV-1 SortSeq+ and SortSeq− single cells, it does not allow flow cytometry confirmation of sorting purity. Therefore, we used the presence of HIV-1 RNA reads, as shown on Integrative Genomic Browser and HIV BLAST (Los Alamos National Laboratory), to ensure that HIV-1 SortSeq+ cells were authentic HIV-1–infected cells.

HIV-1 SortSeq+ cells are polarized in TH1 cells

We first examined the T cell activation status in HIV-1 SortSeq+ and SortSeq− cells (Fig. 2). To avoid batch effects in transcriptome analysis, all HIV-1 SortSeq cells processed at Yale University, but not at Johns Hopkins University, were included for transcriptome analysis. From 28 HIV-1 SortSeq+ cells and 43 HIV-1 SortSeq− cells from 10 ART-treated, virally suppressed, HIV-1–infected individuals, we found that both HIV-1 SortSeq+ and SortSeq− cells expressed RNA encoding early activation markers (CD69 and CD25), but not late activation markers [CD38 and human lymphocyte antigen DR (HLA-DR)], suggesting that HIV-1 SortSeq captured early activation events. We found that the extent of T cell activation (as measured by RNA expression levels of T cell activation markers CD69, CD25, CD38, and HLA-DR) was comparable between HIV-1 SortSeq+ and SortSeq− cells (Fig. 2, A to D; P = not significant).

Fig. 2 HIV-1 SortSeq+ cells are polarized to TH1 phenotype.

(A to D) RNA expression of T cell activation markers CD69 (A), IL2RA (CD25) (B), CD38 (C), and HLA-DRA (D) from HIV-1 SortSeq+ and SortSeq− cells from ART-treated, virally suppressed, HIV-1–infected individuals. (E to N) RNA expression of representative T cell polarization signatures of TH1 (E to G), TH2 (H to J), TH17 (K to M), and Treg (N) from HIV-1 SortSeq+ and SortSeq− cells. Each dot represents a single cell from 28 SortSeq+ and 43 SortSeq− cells. Red lines denote median expression. Dashed red lines denote 75th expression percentile.

We next examined the T cell polarization phenotypes of HIV-1 SortSeq cells (Fig. 2). Using signature cytokine profiles, we found that only HIV-1 SortSeq+ cells were enriched in T helper 1 (TH1) effector cytokines IFNG (P = 0.018; Fig. 2F) and IL2 (P = 0.029; Fig. 2G), suggesting an enrichment of HIV-1–infected cells in TH1 (40), but not in TH2, TH17, or regulatory T cells (Treg). This is consistent with the finding that HIV-1–infected cells are mainly memory CD4+ T cells and TH1 cells (31, 40), whereas HIV-1–uninfected cells can be either naïve or memory CD4+ T cells.

Single-cell transcriptional landscape identifies up-regulation of cellular factors involving HIV-1 transcription, cellular survival, and immune response

We compared the single-cell transcriptional landscape between HIV-1 SortSeq+ and SortSeq− cells. We found that both groups of cells demonstrated a heterogeneous transcriptional profile (Fig. 3, A and B), reflecting the unique and diverse cellular environment in each cell during early latency reversal. This is consistent with the diverse and heterogeneous nature of CD4+ T cells (41, 42). In comparison, the distribution of transcriptional profiles of housekeeping genes such as B2M and UBC was relatively homogeneous across samples (fig. S4, A and B), suggesting that the heterogeneity in transcription of these genes was not because of different amounts of cellular transcription in individual cells. We identified 447 differentially expressed genes in HIV-1 SortSeq+ compared to SortSeq− cells (data file S5). Gene ontology analysis from the 395 genes up-regulated in HIV-1 SortSeq+ cells showed molecular function enrichment for RNA binding proteins (fig. S5A) and biological function enrichment in nonsense-mediated RNA decay (NMD), RNA processing, and viral transcription (fig. S5B), suggesting the importance of cellular and viral RNA transcription and processing during early latency reversal.

Fig. 3 The single-cell transcriptional landscape of HIV-1–infected cells upon ex vivo activation.

(A) The principal component analysis (PCA) demonstrates the distribution of HIV-1 SortSeq+ (closed circles) and SortSeq− cells (open circles) from ART-treated, virally suppressed, HIV-1–infected individuals. (B) Heterogeneous transcriptional profile of HIV-1–infected cells within 24 hours of latency reversal. The heat map shows significantly differentially expressed genes between HIV-1 SortSeq+ and SortSeq− cells. Values indicate expression measured as log2(TPM + 1). The participant ID is color-coded as shown in (A). (C to J) Expression of selected significantly differentially expressed genes in HIV-1 SortSeq single cells. Each dot represents a single cell from 28 SortSeq+ and 43 SortSeq− cells. Red lines denote median expression. PC, principal component.

We identified up-regulation of cellular factors IMPDH1 (Fig. 3C) and JAK1 (Fig. 3D) that may support HIV-1 expression, up-regulation of NMD pathways involving UPF2 (Fig. 3E and fig. S5B), and up-regulation of IKBKB and IL2 in HIV-1 SortSeq+ cells (Fig. 2G and Fig. 3F). We found up-regulation of immune regulatory cytokine LTA (lymphotoxin-α) and chemokines CCL3 [macrophage inflammatory protein–1α (MIP-1α)], CCL4 (MIP-1β), and XCL1 (lymphotactin; Fig. 3, G to J), all of which may inhibit HIV-1 replication (43). We note that Tat can induce LTA, CCL3, CCL4, and XCL1 expression (44, 45) as a host immune response against HIV-1 infection (46). The expression of these genes enriched in HIV-1 SortSeq+ cells {~5 to 10 log2[transcripts per million (TPM) + 1]; Fig. 3, G to J} was comparable to the expression of housekeeping genes B2M and UBC [~4 to 8 log2(TPM + 1); fig. S4, C and D]. Using constitutively expressed gene EF1A as a reference (fig. S6A), we confirmed up-regulation of IMPDH1, JAK1, UPF2, and IKBKB by quantitative polymerase chain reaction (qPCR; fig. S6, B to E) normalized to EF1A expression (fig. S6, F to I).
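For readers unfamiliar with the log2(TPM + 1) scale used for these expression values, the following minimal Python sketch shows how such values can be derived from read counts. It assumes the standard definition of TPM (length-normalized counts rescaled to sum to one million); the source does not describe its quantification pipeline, and the gene names, counts, and transcript lengths below are hypothetical.

```python
import math

def tpm(counts, lengths_kb):
    """Convert read counts to transcripts per million (standard definition, assumed here)."""
    # Length-normalize first so long transcripts do not dominate.
    rates = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads assigned to any gene")
    # Rescale so that all values in one cell sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}

def log2_tpm_plus_one(value):
    """The +1 pseudocount keeps unexpressed genes at 0 instead of -infinity."""
    return math.log2(value + 1.0)

# Hypothetical single-cell counts and transcript lengths (kb), for illustration only.
counts = {"GENE_A": 300, "GENE_B": 100, "GENE_C": 0}
lengths_kb = {"GENE_A": 3.0, "GENE_B": 1.0, "GENE_C": 2.0}

tpm_values = tpm(counts, lengths_kb)
log_values = {gene: log2_tpm_plus_one(v) for gene, v in tpm_values.items()}
```

On this scale, a value of 5 corresponds to a TPM of 2^5 − 1 = 31 and a value of 10 to a TPM of 2^10 − 1 = 1023, which gives a sense of the range quoted above.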

HIV-1 proviruses that are integrated into cancer-related genes can be inducible

We next examined HIV-1–host interactions locally at the integration site. To identify HIV-1 integration sites in HIV-1 SortSeq+ cells, we examined the HIV-1–host chimeric RNA that indicates the integration sites of HIV-1 proviruses. To exclude sequencing artifacts (47), only HIV-1–host chimeric RNA reads that captured the definite HIV-1–host RNA junction of a known HIV-1 splice site and a known host splice site (in HIV-1–host RNA splicing) or the exact end of the 5′ and 3′ HIV-1 long terminal repeat (LTR; in read-through transcription) were considered authentic. The sequencing reads used for analysis are shown in fig. S7 (see also tables S2 and S3). After stringent filtering for sequencing artifacts, manual examination using the University of California, Santa Cruz (UCSC) Genome Browser BLAT and identification of canonical splice sites, we found 19 HIV-1–host chimeric RNA species, three of which share the same integration site (table S3). The low number of HIV-1–host chimeric RNA reflects not only the low frequency of HIV-1–host chimeric RNA but also the stringency of our RNA sequence examination.
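The junction-based filter described above can be sketched in Python as follows. This is only an illustration of the stated criteria, not the authors' pipeline: the `ChimericRead` record, the coordinate values, and the lookup sets are all hypothetical placeholders, and a real analysis would work from alignments and annotated splice sites.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ChimericRead:
    """Hypothetical record for a read spanning an HIV-1/host RNA junction."""
    read_id: str
    hiv_junction: int    # HIV-1 coordinate at the junction (placeholder numbering)
    host_chrom: str
    host_junction: int   # host coordinate at the junction

def is_authentic(read, hiv_splice_sites, ltr_ends, host_splice_sites):
    """Apply the two acceptance criteria stated in the text."""
    # Criterion 1: known HIV-1 splice site joined to a known host splice site.
    spliced = (read.hiv_junction in hiv_splice_sites
               and (read.host_chrom, read.host_junction) in host_splice_sites)
    # Criterion 2: junction at the exact end of an LTR (read-through transcription).
    read_through = read.hiv_junction in ltr_ends
    return spliced or read_through

# Placeholder annotation sets; the values are illustrative, not real coordinates.
HIV_SPLICE_SITES = {700}
LTR_ENDS = {1, 9000}
HOST_SPLICE_SITES = {("chr1", 150000)}

reads = [
    ChimericRead("r1", 700, "chr1", 150000),   # splice site to splice site: kept
    ChimericRead("r2", 9000, "chr2", 42),      # exact LTR end: kept
    ChimericRead("r3", 1234, "chr1", 150000),  # HIV-1 breakpoint not at a known site
    ChimericRead("r4", 700, "chr3", 999),      # host breakpoint not at a known site
]
kept = [r.read_id for r in reads
        if is_authentic(r, HIV_SPLICE_SITES, LTR_ENDS, HOST_SPLICE_SITES)]
```

Requiring an exact match at both sides of the junction trades sensitivity for specificity, which is consistent with the small number of chimeric RNA species reported.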

Whether HIV-1 proviruses that are integrated into cancer-related genes are intact or defective remains unclear. Because most of the HIV-1 proviruses are defective (18, 20), methods that examine HIV-1 integration site and HIV-1 genome integrity at the same time capture mainly defective proviruses (19, 48). On the other hand, methods that can capture intact HIV-1 from viral outgrowth positive cultures require multiple rounds of in vitro infection, making integration site analysis unfeasible (26–28). Using HIV-1 SortSeq, we examined whether HIV-1 copies that have integrated into cancer-related genes are inducible. Using a previously reported list of 2983 cancer-related genes (49) and 3804 housekeeping genes (50), we compared the proportion of cancer-related genes in the human genome and in integration sites identified in HIV-1 DNA and HIV-1–host chimeric RNA analysis during in vitro infection (47) and in cells from HIV-1–infected individuals (23) (Fig. 4). HIV-1 integration sites captured in HIV-1 SortSeq were enriched in both cancer-related genes (29.4%) and housekeeping genes (29.4%; Fig. 4), reflecting HIV-1 preferential integration into active transcription units (18, 51, 52). Overall, we found that HIV-1 proviruses that are integrated into cancer-related genes can be inducible and are therefore putatively intact.

Fig. 4 HIV-1 proviruses integrated into cancer-related genes can be inducible.

Using a previously reported list of 2983 cancer-related genes (49) and 3804 housekeeping genes (50), we compared the proportion of cancer-related genes in the human genome, integration sites identified in HIV-1 DNA during in vitro infection (47), integration sites identified in HIV-1 DNA from HIV-1–infected, virally suppressed individuals (23), integration sites in HIV-1–host chimeric RNA during in vitro infection (47), and integration sites identified in HIV-1–host chimeric RNA in this study. P values, Fisher’s exact test.

HIV-1 integration in the same orientation in cancer-related genes is associated with HIV-1–host RNA aberrant splicing

We next examined whether HIV-1 integration orientation affected HIV-1–host interactions (fig. S8). To examine whether the host promoter or the HIV-1 promoter drives host gene transcription, we performed strand-specific RNA-seq and examined the sense-strand RNA sequences. We found that HIV-1 3′ LTR dominated over the host promoter during read-through transcription upon latency reversal (fig. S7C), compatible with previous findings of 3′ LTR dominance (5, 47). Although HIV-1 proviral integration orientation was roughly equal for the same and convergent orientations (fig. S8), we found that HIV-1 integration into cancer-related genes more frequently led to HIV-1–host splicing: Five of eight (62.5%) HIV-1 integrations into cancer-related genes resulted in HIV-1–host splicing, whereas none of the nine (0%) HIV-1 integrations into noncancer-related genes led to HIV-1–host splicing (fig. S8). HIV-1 integration into cancer-related genes induced chimeric RNA spliced between canonical splice donors and canonical splice acceptors in HIV-1 and cancer-related genes such as SMARCC1, PYHIN1, MIR155HG, BACH2, and NFATC3 (Fig. 5A and fig. S7D), some of which (SMARCC1, BACH2, and NFATC3) were previously reported integration sites in clonally expanded HIV-1–infected cells from HIV-1–infected individuals (23, 24). Our finding suggests that although HIV-1 can integrate into both cancer-related genes and noncancer-related genes, HIV-1 integration in the same orientation as the cancer-related genes is associated with HIV-1–host RNA aberrant splicing.
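As an illustration of how a comparison such as five of eight versus none of nine could be assessed, the sketch below implements a two-sided Fisher's exact test using only the Python standard library. The source does not report a test for this particular comparison, so this is a hedged worked example on the counts given in the text rather than a reproduction of the authors' analysis; the function name is our own.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "successes" in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# Counts from the text: splicing vs. no splicing for integrations into
# cancer-related genes (5 of 8) and noncancer-related genes (0 of 9).
p_value = fisher_exact_two_sided(5, 3, 0, 9)
```

With these counts the function returns 56/6188, about 0.009. This is an illustrative calculation on 17 integration sites, not a value reported in the source.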

Fig. 5 HIV-1 dominates over the host promoter and drives NFATC3 expression downstream of HIV-1 integration site through aberrant splicing.

(A) HIV-1 LTR drives the transcription of NFATC3 through aberrant splicing from HIV-1 major splice donor (MSD) into host canonical splice acceptor site identified in HIV-1 SortSeq cells from an ART-treated, virally suppressed, HIV-1–infected individual. The HIV-1–host chimeric RNA junction is indicated as a dashed red arrow. (B) Transcription in the single cell (154_21) in which HIV-1 integrated into NFATC3. The expected transcription of upstream exons is indicated by magenta arrowheads, and the transcription of the downstream exons is indicated by red asterisks.

The HIV-1 promoter dominates over the host promoter and drives aberrant host gene transcription at the integration site

We identified HIV-1 splicing from the HIV-1 major splice donor into the canonical acceptor of exon 8 of NFATC3 (Fig. 5A). NFATC3 is important for T cell activation (53) and HIV-1 reactivation (54). HIV-1 integration into NFATC3 upstream of this splice junction was previously reported in clonally expanded cells in HIV-1–infected individuals (23). To understand whether HIV-1 integration affects NFATC3 transcription, we compared NFATC3 transcripts in different single cells (Fig. 5B). In cells which did not have HIV-1 integrated into NFATC3, all exons of NFATC3 remained transcribed (Fig. 5B). However, in the single cell that contained HIV-1 integrated into NFATC3 (154_21), all exons downstream of HIV-1 integration site were transcribed; however, we did not detect transcription upstream of the HIV-1 integration site (Fig. 5B). These results indicate that HIV-1 LTR dominates over the host at the integration site and drives aberrant NFATC3 transcription.

The HIV-1 promoter drives aberrant host protein expression at the integration site

Although the host gene may suppress HIV-1 transcription through transcriptional interference, as demonstrated in cell line clones in which an HIV-1 reporter is integrated into noncancer-related genes PP5, UBA2 (5), and HPRT (6), the presence of HIV-1 RNA splicing into BACH2 and STAT5B RNA (34) suggests that HIV-1 can escape host gene transcriptional interference. We examined the impact of HIV-1 on host gene expression at the integration site to understand whether HIV-1 insertional activation of the host gene is driven by HIV-1 LTR. Although HIV-1–infected cell line models may not reflect the quiescent state of resting memory CD4+ T cells, the use of cell line clones allows examination of HIV-1–host interactions at both transcription and translation levels. Using an HIV-1 reporter provirus [NL4-3-d6-dEnv-drGFP; (55)] that contained all splice elements to allow examination of HIV-1–host splicing, we infected Jurkat T cells at a low multiplicity of infection, which we then sorted into single cells and grew into individual clones. Instead of using targeted integration of HIV-1 reporter into specific genes such as BACH2 and STAT5B, our approach recapitulated HIV-1 integration into the host intron in vivo without additional transcriptional interference caused by the transcriptional terminator cassettes in lentiviral vectors used in targeted integration methods. We established three Jurkat T cell clones harboring HIV-1 reporter proviruses integrated into the introns of three cancer-related genes (RAP1B, VAV1, and SPECC1) in the same orientation as the host gene transcription unit, which resembled the integration sites observed in vivo (Fig. 6). Of note, HIV-1 integration into VAV1 was reported in clonally expanded HIV-1–infected cells from HIV-1–infected individuals (23).

Fig. 6 HIV-1 drives aberrant cancer-related gene transcription and induces aberrant protein expression at the integration site.

(A to C) HIV-1 integration effect on the transcription of cancer-related genes VAV1 in HIV-1–Jurkat clone 8B10 (A), RAP1B in HIV-1–Jurkat clone 1G2 (B), and SPECC1 in HIV-1–Jurkat clone 1D7 (C) at the integration sites. Normalized RNA transcription landscapes with enlarged 20-kb windows across integration sites. HIV-1 proviruses are integrated downstream of the translation start site of VAV1 and SPECC1 in HIV-1–Jurkat clones 8B10 and 1D7, respectively. HIV-1 integration is upstream of the translation start site of RAP1B in HIV-1–Jurkat clone 1G2. HIV-1–host chimeric RNA captured in RNA-seq is depicted as HIV-1 RNA (red boxes), splice junction (gray lines), canonically spliced host exons (blue boxes), and cryptic host exons (yellow bars). (D) Western blot of VAV1 and RAP1B expression in HIV-1–Jurkat clones. The blue arrowhead indicates HIV-1–driven RAP1B protein expression, and the yellow arrowhead indicates HIV-1–driven truncated VAV1 protein expression. αGAPDH, anti–glyceraldehyde-3-phosphate dehydrogenase. (E) HIV-1 genomic RNA landscape in HIV-1–Jurkat T cell clones.

We examined the host gene transcriptional landscape at the integration site in these three HIV-1–Jurkat cell clones using RNA-seq of polyadenylated RNA (Fig. 6, A to C). First, we found that HIV-1 integration did not affect host gene transcription upstream of the integration site (Fig. 6), suggesting an orientation-dependent effect of HIV-1 on the host gene expression. Second, we found that host gene expression was highly increased (>5-fold) downstream of the HIV-1 integration site in all three HIV-1–Jurkat clones (Fig. 6). Third, zooming into the intron in which HIV-1 is integrated, we found that HIV-1 induced intron retention at the integration site (Fig. 6). Fourth, we captured HIV-1–host chimeric RNA splicing events from canonical HIV-1 splice donor to canonical host splice acceptor. HIV-1 aberrant splicing activated cryptic host exons (Fig. 6). The aberrant splicing events followed the GT|AG mRNA processing rule and are therefore unlikely to be sequencing artifacts. Last, HIV-1 transcription and intra–HIV-1 splicing remained intact (Fig. 6E), suggesting that the HIV-1 RNA can splice into human RNA without affecting HIV-1 RNA splicing. Together, these findings suggest that HIV-1–induced aberrant host transcription is not simply an HIV-1–host splicing event: HIV-1 drives downstream host gene expression, induces host intron retention, and activates cryptic exons.
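The GT|AG check mentioned above can be expressed as a short Python sketch: an intron removed by the canonical splicing machinery begins with GT (GU in RNA) at the donor side and ends with AG at the acceptor side. The function name and the example sequences are hypothetical and serve only to illustrate the rule; they are not sequences from the study.

```python
def follows_gt_ag_rule(intron_seq):
    """Return True if the removed intron starts with GT and ends with AG."""
    # Accept DNA or RNA input by normalizing case and converting U to T.
    seq = intron_seq.upper().replace("U", "T")
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

# Hypothetical intron sequences, for illustration only.
canonical_intron = "GTAAGTCTTTCCTTTCAG"
noncanonical_intron = "ATAAGTCTTTCCTTTCAC"

canonical_ok = follows_gt_ag_rule(canonical_intron)
noncanonical_ok = follows_gt_ag_rule(noncanonical_intron)
```

A chimeric junction produced by random template switching during library preparation would have no particular reason to satisfy this rule, which is why conformity with it argues against a sequencing artifact.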

To understand whether HIV-1–induced aberrant host gene transcription leads to aberrant host protein translation, we performed a Western blot on HIV-1–Jurkat clones 1G2 (in which HIV-1 is integrated upstream of the translation start site of RAP1B) and 8B10 (in which HIV-1 is integrated downstream of the translation start site of VAV1). We found that HIV-1 integration upstream of the RAP1B translation start site induced increased Ras-related protein Rap-1b (RAP1B) expression (Fig. 6D), whereas HIV-1 integration downstream of the VAV1 translation start site induced truncated Proto-oncogene vav (VAV1) protein expression (Fig. 6D). In the 8B10-VAV1 clone, HIV-1 integrated downstream of the translation start codon and cut into the middle of the protein-coding region, thereby causing truncation of the VAV1 protein (Fig. 6A). In the 1G2-RAP1B clone, HIV-1 integrated upstream of the translation start codon (Fig. 6B), and thus, the protein-coding region remained intact. Therefore, there was no RAP1B truncation in the 1G2-RAP1B clone. Of note, an N-terminal truncation is known to increase the oncogenic potency of VAV1 (56). Overall, we showed that HIV-1–induced aberrant host gene transcription at the integration site leads to aberrant host protein expression.

HIV-1–driven aberrant transcription can be suppressed by CRISPR-mediated inhibition of the HIV-1 LTR

We next examined whether it is the HIV-1 promoter that drives the aberrant host protein expression. We hypothesized that HIV-1–specific activation would increase host gene expression at the integration site, whereas HIV-1–specific inhibition would suppress host gene expression at the integration site. We constructed dCas9-VP64–mediated HIV-1–specific activation (CRISPRa) and dCas9-Krab–mediated HIV-1–specific inhibition (CRISPRi) systems (57) using guide RNAs (gRNAs) targeting HIV-1 LTR (58) (Fig. 7A). The CRISPRa and CRISPRi systems have been previously established to examine how targeted activation and suppression of cellular genes can change host-pathogen interactions (57). We found that activation of the HIV-1 LTR increased aberrant VAV1 protein expression, whereas inhibition of the HIV-1 LTR reduced it (Fig. 7, B and C). Thus, HIV-1–induced aberrant host protein expression is driven by the HIV-1 LTR. Using RNA landscape mapping at the HIV-1 integration site, we found that HIV-1–specific CRISPRi inhibition suppressed HIV-1–driven aberrant host gene transcription at the integration site in all three cell line clones, restoring the expression of the host gene to that of the uninfected Jurkat cells (Fig. 8). Overall, our results suggest that HIV-1–induced aberrant host gene transcription can be targeted by CRISPR-based or potentially small molecule–based disruption of HIV-1 LTR function.

Fig. 7 HIV-1–driven aberrant host gene transcription at the integration site can be suppressed by CRISPR-dCas9–mediated HIV-1 LTR inhibition.

(A) CRISPR-dCas9–based HIV-1 LTR-specific activation and inhibition system. Uninfected Jurkat T cells and HIV-1–Jurkat T cell clone 8B10 (in which HIV-1 is integrated into VAV1) were transduced with dCas9-VP64-mCherry or dCas9-Krab-mCherry and isolated by flow cytometric sorting. These CRISPR-ready cells were then transduced with lentiviruses carrying HIV-1–specific guide RNA (gRNA) targeting HIV-1 LTR or nontargeting (NT) gRNA as the negative control. (B) dCas9-VP64–mediated HIV-1 activation and dCas9-Krab–mediated HIV-1 inhibition as measured by flow cytometry. (C) The effect of dCas9-mediated HIV-1 activation and repression on aberrant VAV1 protein expression. Yellow arrowhead denotes HIV-1–driven aberrant and truncated VAV1 protein.

Fig. 8 CRISPRi-mediated HIV-1–specific inhibition restores HIV-1–driven aberrant host gene transcription to that of uninfected cells.

From CRISPR-ready, gRNA-transduced uninfected and HIV-1–infected Jurkat T cell clones 8B10 (in which HIV-1 is integrated into VAV1), 1G2 (in which HIV-1 is integrated into RAP1B), and 1D7 (in which HIV-1 is integrated into SPECC1), HIV-1–green fluorescent protein–positive (GFP+) cells in CRISPRa/HIV-1 gRNA and CRISPRa/nontargeting gRNA systems were sorted for RNA landscape analysis at the integration site. HIV-1–GFP cells in CRISPRi/HIV-1 gRNA and CRISPRi/nontargeting gRNA systems were sorted similarly. Peaks show normalized RNA transcription in the corresponding genes VAV1 (A), RAP1B (B), and SPECC1 (C).

DISCUSSION

Our comprehensive study, which spans from the global single-cell transcriptional landscape to HIV-1–host interactions at the integration site, from clinical samples to cell line validation, and from mechanisms to potential therapeutic targets, provides insights into HIV-1 persistence and HIV-1 eradication strategies. The stochastic nature of HIV-1 reactivation makes it challenging to expose all HIV-1–infected cells for immune clearance (18). Nonetheless, the identification of cellular factors required for HIV-1 transcription may help to understand transcriptional blocks on HIV-1 reactivation (7).

We found enrichment of HIV-1 SortSeq+ cells in cells with a TH1 phenotype. CD4+ T cells in peripheral blood are polarized toward TH1 about 10-fold more often than other polarizations (59). HIV-1 infects TH1 cells at a higher frequency, followed by TH0, then by TH2 cells (60). This is further evidenced by the fact that HIV-1–infected cells are enriched in the TH1 population (40), memory T cells (31), and HIV-1–specific CD4+ T cells (29). On the basis of the integration sites that we captured, there is no evidence suggesting that the enrichment in TH1 phenotype is driven by the integration site.

Inosine-5’-monophosphate dehydrogenase (IMPDH) is the rate-limiting enzyme required for guanine nucleotide de novo synthesis in CD4+ T cells. Inhibition of IMPDH by the U.S. Food and Drug Administration (FDA)–approved drug mycophenolic acid [the active form of mycophenolate mofetil (MMF)] suppresses HIV-1 replication (61). Janus kinase 1 (JAK1) is required for Tat-dependent HIV-1 gene expression (62) and is involved in T cell activation. JAK1 inhibition by the FDA-approved drug ruxolitinib suppresses HIV-1 replication (63). Although the role of IMPDH1 and JAK1 in HIV-1 latency remains unknown, targeting IMPDH by MMF (NCT03262441) and JAK1 by ruxolitinib (NCT02475655) is being examined in ongoing clinical trials. Although numerous cellular factors and small-molecule compounds have been proposed to alter HIV-1 expression in cell line models, HIV-1 SortSeq identifies cellular factors that are enriched in HIV-1–infected cells from virally suppressed individuals as high-priority therapeutic targets. Our results suggest that transcriptome analysis by HIV-1 SortSeq identified not only a transcriptome signature of HIV-1–infected cells upon latency reversal but also cellular factors that can serve as therapeutic targets in HIV-1 eradication strategies.

NMD is a cellular surveillance mechanism that identifies and degrades aberrant mRNA containing premature stop codons and long aberrant introns. Regulator of nonsense transcripts 2 (UPF2), a key player in NMD pathways, can block HIV-1 RNA nuclear export (64, 65). Although UPF2 (along with other NMD proteins such as UPF1) restricts human T cell leukemia virus–1 (HTLV-1) expression and this restriction can be counteracted by HTLV-1 Tax (66), whether UPF2 can function as a restriction factor inhibiting HIV-1 expression upon latency reversal remains to be examined.

Whether HIV-1 drives clonal expansion of HIV-1–infected cells remains unclear. HTLV-1 Tax associates with inhibitor of nuclear factor κB (IκB) kinase and degrades nuclear factor κB inhibitor IκB. Together with Tax-induced interleukin-2 (IL-2) and IL-2Rα (CD25) transcription (67) and other oncogenic mechanisms, HTLV-1–infected cells undergo clonal expansion (68). HIV-1 Tat is the major driver of stochastic activation of the infected cells (17). Tat is also known to induce IL-2 expression (69) and IKBKB activity (44). Up-regulation of IL2 and IKBKB, potentially in part by Tat, could be a mechanism protecting activated and proliferating CD4+ T cells from apoptosis (70).

HIV-1 insertional activation of cancer-related genes is a potential mechanism for HIV-1 integration site-related proliferation (34). We found that HIV-1–driven aberrant host gene transcription affected processes beyond splicing between HIV-1 and the host gene. First, we found that HIV-1 proviruses integrated into cancer-related genes can be inducible and are putatively intact. Second, location- and integration orientation-dependent HIV-1–driven aberrant host gene transcription drove high (>5-fold increase) downstream host gene expression at the integration site. This suggests that HIV-1–driven aberrant host gene transcription may contribute to integration site-dependent HIV-1 proliferation. Third, HIV-1–driven aberrant host gene transcription induced aberrant splicing, intron retention, and cryptic exon activation at the integration site, which can potentially induce NMD (71). Last, we found that the HIV-1 LTR drives aberrant host gene transcription, and this HIV-1 LTR-driven aberrant transcription can be targeted by CRISPR-based inhibition of the HIV-1 LTR.
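As an illustration of how an orientation-dependent fold increase of this kind could be quantified, the following is a minimal sketch and not the analysis pipeline used in this study. It assumes a per-base list of normalized coverage along the host gene and compares the mean coverage downstream of the integration site with the mean coverage upstream, where "downstream" is defined relative to the provirus orientation. The function name, the window size, and the pseudocount are all hypothetical choices made for this example.

```python
def downstream_fold_change(coverage, site, strand, window=1000, pseudocount=1.0):
    """Ratio of mean coverage downstream vs. upstream of an integration site.

    Hypothetical helper for illustration only.
    coverage    -- normalized read depth per base, in genomic order
    site        -- index of the integration site within `coverage`
    strand      -- '+' if the provirus points toward higher indices, '-' otherwise
    window      -- number of bases examined on each side (assumed value)
    pseudocount -- added to both means to avoid division by zero (assumed value)
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    left = coverage[max(0, site - window):site]
    right = coverage[site:site + window]
    if not left or not right:
        raise ValueError("integration site leaves an empty flanking window")
    # Downstream is defined by the direction of LTR-driven transcription.
    upstream, downstream = (left, right) if strand == "+" else (right, left)
    mean_up = sum(upstream) / len(upstream)
    mean_down = sum(downstream) / len(downstream)
    return (mean_down + pseudocount) / (mean_up + pseudocount)


# Toy coverage track: low signal before the site, high signal after it.
toy_coverage = [1.0] * 10 + [11.0] * 10
toy_ratio = downstream_fold_change(toy_coverage, site=10, strand="+", window=10)
```

Under these assumptions, a ratio above 5 for a provirus in the same orientation as the toy track would correspond to the kind of downstream-specific increase described above, whereas flipping the strand argument inverts the comparison.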

The HIV-1 promoter LTR is relatively conserved in all infected cells, regardless of the integration site. We propose that inhibiting HIV-1 LTR-driven transcription disrupts HIV-1–driven aberrant host gene transcription. HIV-1 LTR-driven transcription can be inhibited ex vivo by IMPDH1 inhibitor MMF (61, 72) or JAK inhibitor ruxolitinib (63). Furthermore, gene therapy approaches that target the HIV-1 LTR and gag may remove elements required for HIV-1–driven host gene transcription and thereby also disrupt HIV-1–driven aberrant host gene transcription (73). Therefore, despite the heterogeneity of HIV-1 integration sites, targeting the relatively conserved HIV-1 LTR was sufficient to disrupt HIV-1–driven aberrant transcription.

Our study is limited to examination of HIV-1–infected CD4+ T cells harboring induced HIV-1 RNA upon latency reversal, not cells in the quiescent latent state. Because latent HIV-1 does not express HIV-1 RNA or HIV-1 proteins for detection, we had to induce maximal HIV-1 expression by stimulating CD4+ T cells with PMA and ionomycin ex vivo. Analysis of HIV-1–infected CD4+ T cells activated by other reactivation reagents, such as anti-CD3/CD28 costimulation and latency-reversing agents, may reveal different transcriptional signatures and HIV-1–host transcriptional interactions. Another limitation is that HIV-1 SortSeq only captures a subset of HIV-1–infected cells in which HIV-1 proviruses are reactivated by a single round of PMA/ionomycin stimulation. The replication-competent, noninduced HIV-1 may not be reactivated despite multiple rounds of T cell activation (18). Thus, HIV-1 SortSeq, using HIV-1 RNA expression as a surrogate, would not capture the latent reservoir that is not reactivated after one round of activation. Last, HIV-1 SortSeq involves sample fixation and permeabilization to deliver HIV-1 RNA probes into cells and multiple washing steps to remove nonspecific binding. The fixation and permeabilization cause partial RNA degradation, and cells may be lost during the washing steps. To overcome this barrier, we optimized an RNA-preserving method to maintain RNA integrity and to reduce cell loss during washing.

Overall, our single-cell transcriptome analysis of HIV-1–infected cells from HIV-1–infected individuals examined the transcriptional landscape both globally in cells and locally at the integration site. Although many cellular factors and small-molecule compounds have been implicated in HIV-1 persistence in cell line models, our study identifies key cellular factors and an HIV-1 persistence mechanism in cells from HIV-1–infected individuals and suggests priority targets in HIV-1 cure strategies.

MATERIALS AND METHODS

Study design

We conducted a prospective, cross-sectional study recruiting adult HIV-1–infected individuals. The primary study objective was to map the single-cell atlas of HIV-1–infected cells upon ex vivo latency reversal. The inclusion criteria were adult HIV-1–infected individuals under suppressive ART for >1 year with undetectable plasma viral load for >6 months. Peripheral blood and leukapheresis samples were obtained from a total of 25 HIV-1 study participants (table S1). This study was approved by the Johns Hopkins University and Yale University Institutional Review Boards. All participants provided written informed consent. Additional methods details are available in the Supplementary Materials.

Statistical analysis

Analyses were performed using MedCalc and Prism software (GraphPad). Statistical tests are indicated in the figure legends. P values of <0.05 in two-tailed testing were considered statistically significant. In the regression analysis of the frequency of HIV-1–infected cells detected by HIV-1 SortSeq versus the frequency of input HIV-1–infected cells, R² and P values were calculated on log-transformed data using MedCalc. In the correlation analysis between the frequency of HIV-1 SortSeq+ cells and the size of the latent reservoir as measured by viral outgrowth assays, log-transformed data were tested for a normal distribution using the D’Agostino-Pearson test. The Pearson correlation coefficient and P value were calculated on log-transformed data using MedCalc. In single-cell analysis, DEsingle was used to compare the transcriptome of SortSeq+ and SortSeq− cells using default parameters (74), and P values of <0.05 after Benjamini-Hochberg correction were considered significant (75). Red lines denote median, and dashed lines denote 75th and 25th percentiles.
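For readers who wish to reproduce the general form of two of these calculations outside MedCalc and DEsingle, the following is a minimal pure-Python sketch, not the software used in this study. It shows a Pearson correlation coefficient computed on log10-transformed values and a Benjamini-Hochberg adjustment of a list of P values. The function names and the example numbers are hypothetical and serve only as illustration.

```python
import math


def pearson_on_log10(x, y):
    """Pearson r between log10(x) and log10(y); all inputs must be positive."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    # Walk from the largest P value to the smallest so that each adjusted
    # value is no larger than the one ranked above it (monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example values, not data from this study.
example_r = pearson_on_log10([1, 10, 100], [2, 20, 200])
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p < 0.05 for p in example_adjusted]
```

The sketch omits the P value for the correlation and the normality test, both of which depend on distribution functions that are more reliably taken from a statistics package.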

SUPPLEMENTARY MATERIALS

stm.sciencemag.org/cgi/content/full/12/543/eaaz0802/DC1

Materials and methods

Fig. S1. Flow cytometry gating strategies and quality control of HIV-1 SortSeq.

Fig. S2. HIV-1 SortSeq+ cells from HIV-1–infected individuals harbor spliced HIV-1 RNA.

Fig. S3. HIV-1 SortSeq can identify clonally expanded HIV-1–infected cells harboring replication competent HIV-1.

Fig. S4. Transcriptional profile of housekeeping genes B2M and UBC in HIV-1 SortSeq+ and SortSeq− single cells.

Fig. S5. Gene ontology analysis of differentially expressed genes in HIV-1 SortSeq+ versus SortSeq− cells from ART-treated, virally suppressed, HIV-1–infected individuals.

Fig. S6. Expression levels of IMPDH1, JAK1, UPF2, and IKBKB in HIV-1 SortSeq+ and SortSeq− single cells measured by qPCR.

Fig. S7. HIV-1–host chimeric RNA landscape.

Fig. S8. Orientation and integration sites of induced HIV-1 proviruses in HIV-1 SortSeq+ cells.

Table S1. Characteristics of study participants.

Table S2. Genes in which HIV-1 is integrated.

Table S3. HIV-1–host RNA junctions.

Data file S1. Primary data.

Data file S2. HIV-1 SortSeq probe sequences.

Data file S3. Location of HIV-1 SortSeq probes.

Data file S4. List of HIV-1 SortSeq samples.

Data file S5. Differentially expressed genes between HIV-1 SortSeq+ and SortSeq− cells.

References (76–87)

REFERENCES AND NOTES

Acknowledgments: We thank all study participants. We thank the NIH AIDS Reagents Program. We thank S. Klemm for technical suggestions, R. F. Siliciano for providing the NL4-3-d6-dsGFP plasmid, and S. Laskey for constructing the NL4-3-dEnv-BFP plasmid. Funding: This work is supported by Yale Top Scholar, Rudolf J. Anderson Fellowship, NIH R01 AI141009 (to Y.-C.H.), R61 DA047037 (to Y.-C.H.), R21AI118402 (to Y.-C.H.), R01 AI147868 (to Y.-C.H.), T32AI055403 (to J.A.C. and K.A.), W. W. Smith AIDS Research Grant (to Y.-C.H.), Johns Hopkins Center for AIDS Research Award P30AI094189 (to Y.-C.H.), Gilead AIDS Research Grant (to Y.-C.H.), Gilead HIV Research Scholar Grant (to Y.-C.H.), NIH BEAT-HIV Delaney Collaboratory UM1AI126620 (to Y.-C.H.), NIH CHEETAH P50 AI150464 (to Y.-C.H.), and NIH R37 AI147868 (to Y.-C.H.) and P30CA006973 (to R.F.A.). The SCOPE cohort was supported by the UCSF/Gladstone Institute of Virology and Immunology CFAR (P30AI027763) and the CFAR Network of Integrated Systems (R24 AI067039). Additional support was provided by the Delaney AIDS Research Enterprise (DARE; AI096109 and A127966). This work was funded in part by the intramural program of the NIH (D.C.D. and J.H.). The sequencing conducted at the Yale Stem Cell Center Genomics Core facility was supported by the Connecticut Regenerative Medicine Research Fund and the Li Ka Shing Foundation. Author contributions: Y.-C.H. conceptualized the study. Y.-C.H. and R.L. performed HIV-1 SortSeq. Y.-H.J.Y. interrogated HIV-1–driven transcription landscape. A.V., M.P., and Y.-H.J.Y. mapped HIV-1–host chimeric RNA. J.A.C. performed single-cell bioinformatic analysis. K.A. designed HIV-1–specific gRNA. R.A.P. optimized the FISH protocol. Y.-H.J.Y., R.L., A.V., J.A.C., M.P., S.S.-M., Y.-C.H., C.C.T., F.D.B., S.M., H.H., J.H., and D.C.D. participated in bioinformatic investigation. S.A.B., R.M.C., C.M.D., R.F.A., R.H., S.G.D., J.C., and S.S. recruited study participants. H.Z. performed flow cytometric sorting. Y.-C.H., R.L., Y.-H.J.Y., and A.V. wrote the manuscript. Competing interests: Y.-C.H. receives research grants from Gilead Sciences. C.M.D. is a consultant for Gilead Sciences and received research grants from Abbvie and GlaxoSmithKline, but the work is unrelated to the present research. Data and materials availability: All data associated with this study are in the main paper or the Supplementary Materials. RNA-seq results were deposited to Gene Expression Omnibus as GSE126230. Methods for identifying and filtering integration sites and accompanying computational tools are available on GitHub (https://github.com/alevar/chimFinder) and Zenodo (https://doi.org/10.5281/zenodo.3740882).
