Research Article: Clinical data analysis

Data-Driven Prediction of Drug Effects and Interactions


Science Translational Medicine  14 Mar 2012:
Vol. 4, Issue 125, pp. 125ra31
DOI: 10.1126/scitranslmed.3003377


Adverse drug events remain a leading cause of morbidity and mortality around the world. Many adverse events are not detected during clinical trials before a drug receives approval for use in the clinic. Fortunately, as part of postmarketing surveillance, regulatory agencies and other institutions maintain large collections of adverse event reports, and these databases present an opportunity to study drug effects from patient population data. However, confounding factors such as concomitant medications, patient demographics, patient medical histories, and reasons for prescribing a drug often are uncharacterized in spontaneous reporting systems, and these omissions can limit the use of quantitative signal detection methods used in the analysis of such data. Here, we present an adaptive data-driven approach for correcting these factors in cases for which the covariates are unknown or unmeasured and combine this approach with existing methods to improve analyses of drug effects using three test data sets. We also present a comprehensive database of drug effects (Offsides) and a database of drug-drug interaction side effects (Twosides). To demonstrate the biological use of these new resources, we used them to identify drug targets, predict drug indications, and discover drug class interactions. We then corroborated 47 (P < 0.0001) of the drug class interactions using an independent analysis of electronic medical records. Our analysis suggests that combined treatment with selective serotonin reuptake inhibitors and thiazides is associated with significantly increased incidence of prolonged QT intervals. We conclude that confounding effects from covariates in observational clinical data can be controlled in data analyses and thus improve the detection and prediction of adverse drug effects and interactions.


Adverse drug events (ADEs) remain a significant source of mortality and morbidity around the world with costs estimated at several billion dollars each year (1, 2). Many ADEs are rare or occur only in a subset of the human population and are not observed in relatively small clinical trials. To address this issue, the U.S. Food and Drug Administration (FDA), World Health Organization, and Health Canada (3) have created large adverse event reporting systems (AERSs) that collect data from clinicians, patients, and pharmaceutical companies. These resources present an opportunity to monitor drug safety in a large and diverse population of patients. Quantitative signal detection algorithms use these data to flag and prioritize drug-event signals for follow-up analysis via formal pharmacoepidemiological studies and to discover complex relationships that are difficult to identify manually [such as drug-drug interactions (DDIs)] (4, 5). Despite their power, these methods suffer from well-recognized limitations that result from sampling variance and reporting biases (4, 6).

Signal detection algorithms quantify the “unexpectedness” of an adverse event being reported for a drug through disproportionality analysis; the goal is to identify drugs that have a greater proportion of a particular event among their reported events compared to the proportion seen for other drugs. Signals are detected by comparing the observed reporting rate for a drug-event pair with an expected reporting rate derived from other drug-event pairs. Under the null hypothesis that the event occurred by chance, the observed and expected rates will be equivalent and their ratio equal to one. When this ratio is much larger than one, the null hypothesis is rejected.
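The ratio described above can be illustrated with a minimal sketch. It assumes the expected rate is taken from all reports that do not mention the drug (a proportional-reporting-style simplification); the function name and the counts are hypothetical and are not taken from the study.

```python
def reporting_ratio(n_drug_event, n_drug, n_event, n_total):
    """Observed/expected reporting ratio for one drug-event pair.

    observed: share of the drug's reports that mention the event.
    expected: share of all other reports that mention the event
              (assumption: "other drugs" = every report without the drug).
    """
    observed = n_drug_event / n_drug
    expected = (n_event - n_drug_event) / (n_total - n_drug)
    return observed / expected


# Hypothetical counts, for illustration only.
ratio = reporting_ratio(n_drug_event=30, n_drug=1_000,
                        n_event=2_020, n_total=200_000)
print(round(ratio, 2))  # 3.0
```

A ratio near one is what the null hypothesis predicts; here the event is reported about three times as often with the drug as without it.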

Unfortunately, there are a number of extraneous causes of differential reporting that fall into two distinct classes: (i) sampling variance and (ii) selection biases. Sampling variance refers to reporting rates that vary widely across drugs and time and depends on many factors. One source of sampling variance comes from the underreporting of events by physicians, who may report only ADEs that they deem to be important or that result from a new or untrusted drug. On the other hand, some ADEs can be oversampled. For example, in 2006, more than 18,000 reports were submitted to the FDA that associated rofecoxib (Vioxx) and myocardial infarction—likely a result of the intense media attention that occurred during that time. Sampling variance has been effectively addressed in modern signal detection algorithms, such as the gamma Poisson shrinker (GPS) or Information Component (IC) (6, 7). These methods estimate confidence intervals (CIs) for the disproportionality statistics and then dampen drug-event signals that have little evidence to support them. However, these methods do not address the issue of reporting biases (6).

Selection biases result from the nonrandom selection of subjects exposed to the drug and experiencing adverse events. This selection may be driven by causative covariates other than the drug under analysis (for example, a patient’s disease state or other medications). This faulty selection may cause the disproportionality analysis to associate the drug and the event when a causative covariate is not accounted for; we refer to this as a synthetic association. Indication bias is one of the most common examples of this and occurs when a drug is synthetically associated with an event that is more appropriately attributed to the underlying disease (4). For example, it is common for diabetes drugs to be reported with hyperglycemia, a symptom of diabetes and usually not an effect of treatment. Similarly, concomitant medications can also confound drug-effect associations. Drugs commonly co-prescribed with rofecoxib (Vioxx) were more likely to be associated with heart attack simply because these drugs were commonly taken together. These issues extend to other covariates as well, both common and uncommon. Patients reported to be taking a cholesterol-lowering agent are more likely to be older, and this may cause these drugs to be synthetically associated with age-related effects, such as hypertension or myocardial infarction (age bias). Patients who have recently had a renal transplant are often prescribed moxifloxacin, which has resulted in synthetic associations of the drug with renal impairment (prescribing bias).

The sources of bias are myriad and have not been directly addressed in modern signal detection algorithms. Indeed, stratification of the data on predefined covariates is the primary technique for removing these biases. However, stratification requires enumeration of all important covariates, which is computationally intractable to use for safety surveillance. Identifying a fixed set of likely covariates for all drugs can help the analysis, but reduces power when dividing the reports across strata that are not correlated with the outcome (4, 8). In addition, stratification is impossible when the reporting systems do not contain reliable measures of the common covariates. These factors limit the benefit of applying stratification routinely (8).

Our new method (i) accomplishes the goals of stratification, dampening or removing the effect of covariates, without the need to divide drug-exposed reports into strata; (ii) is both adaptive (it removes different covariates for different drugs) and appropriate for systematic application and routine analysis; and (iii) is designed to complement modern signal detection approaches and thus extends the applicability and power of existing methods. Our model is inspired by the case-control approach to cohort selection in observational clinical studies. Each drug-exposed patient is matched to one (or more) nonexposed patients (controls). The nonexposed patients are selected on the basis of how well they match an exposed patient on a set of predefined covariates. Propensity score matching (PSM), a statistical method designed to yield an unbiased estimate of treatment effects, has emerged as the preferred method of matching exposed and nonexposed patients in observational cohort studies and has yielded similar estimates of effects when compared to the results of randomized controlled trials (9–11). However, like other confounder controlling methods, PSM requires the covariates to be both known and measured; neither condition is guaranteed to hold in spontaneous reporting systems. Instead, to match patients, we adapted PSM to use only the co-reported drugs and co-reported indications. We hypothesize that many confounders correlate with these key variables and do not need to be modeled.
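The matching idea can be sketched in a few lines. This is an illustrative toy, not the SCRUB implementation: it assumes a plain logistic-regression propensity model fitted by gradient descent and nearest-neighbor matching with replacement, and the report layout, function names, and drug names are all hypothetical.

```python
import math


def features(report, query_drug):
    """Co-reported drugs and indications, excluding the query drug itself."""
    return ({("drug", d) for d in report["drugs"] if d != query_drug}
            | {("ind", i) for i in report["indications"]})


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic log-loss (no regularization)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def match_controls(reports, query_drug):
    """Map each exposed report index to the unexposed report whose
    propensity score is closest (matching with replacement)."""
    feats = [features(r, query_drug) for r in reports]
    vocab = sorted(set().union(*feats))
    X = [[1.0 if f in fs else 0.0 for f in vocab] for fs in feats]
    y = [1.0 if query_drug in r["drugs"] else 0.0 for r in reports]
    w, b = fit_logistic(X, y)
    score = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X]
    exposed = [i for i, yi in enumerate(y) if yi == 1.0]
    unexposed = [i for i, yi in enumerate(y) if yi == 0.0]
    return {i: min(unexposed, key=lambda j: abs(score[j] - score[i]))
            for i in exposed}


# Hypothetical reports: "drugA" is the query drug.
reports = [
    {"drugs": {"drugA", "insulin"}, "indications": {"diabetes"}},
    {"drugs": {"drugA"}, "indications": {"diabetes"}},
    {"drugs": {"insulin"}, "indications": {"diabetes"}},
    {"drugs": {"aspirin"}, "indications": {"headache"}},
    {"drugs": {"ibuprofen"}, "indications": {"headache"}},
]
matches = match_controls(reports, "drugA")
print(matches)
```

In this toy data the diabetic, unexposed report is the natural control for both exposed reports, which is the behavior that removes indication-driven associations such as diabetes drugs with hyperglycemia.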

When we applied this data-driven approach to flag potentially significant drug-event associations in the AERS, we successfully removed many synthetic associations from indications, co-prescriptions, and hidden covariates. We call this new method the statistical correction of uncharacterized bias (SCRUB) and use it to construct comprehensive databases of off-label and DDI side effects (Offsides and Twosides, respectively). These databases contain information that is independent from the Side Effect Resource (SIDER), a database of drug effects mined from the package inserts (12). We demonstrate the biological use of these databases by showing improved performance (compared to SIDER) at predicting drug targets and drug indications (13, 14). Furthermore, we use the new methods to identify adverse drug class interactions and then corroborate 47 of the 395 predicted interactions with electronic medical records (EMRs) from Stanford University Hospital. Finally, we conclude our analysis with an association between adverse cardiovascular events and co-prescription of thiazides and selective serotonin reuptake inhibitors (SSRIs). Patients who take these drugs in combination are significantly more likely to have prolonged QT intervals than those who take thiazides or SSRIs alone.


Sources of synthetic associations: Disease indication, concomitant drug use, and characteristic biases

Disease indication. We manually constructed a set of 543 adverse events strongly associated with indications for which the indication and the adverse event have a known causative relationship. We call a drug-event association synthetic if it has a tight reporting correlation with the indication (ρ ≥ 0.1) and a high relative reporting (RR) association score (RR ≥ 2). Drugs reported frequently with these indications were 80.0 (95% CI, 14.2 to 3132.8; P < 0.0001, Fisher’s exact test) times as likely to have synthetic associations with indication events. We found a strong linear relationship between the correlation of reporting between a drug and indication and the likelihood of a synthetic association (ρ = 0.63, P < 0.0001, Fig. 1A).

Fig. 1

Synthetic associations in adverse event reports. (A) Disease indications are a significant source of synthetic associations. The more disproportionately a drug is reported with an indication (x axis), the more likely that drug will be synthetically associated with the indication’s effects (y axis) (for example, it is common for hypoglycemic agents to be synthetically associated with hyperglycemia). (B) Concomitantly taken drugs are another significant source of synthetic associations. The more disproportionately two drugs are reported together (x axis), the more likely they will be associated with the other drug’s effects (y axis). (C) Drugs that are preferentially reported with males are more likely to be synthetically associated with sex-related effects. (D) Similarly, drugs that are preferentially reported with relatively young or relatively old patients are more likely to be synthetically associated with age-related effects. (E to H) Application of SCRUB removes synthetic associations that result from disproportionate reporting with (E) disease indications, (F) concomitant drug use, (G) sex biases, and (H) age biases.

Concomitant drugs. We identified 1559 adverse events strongly associated with drugs and listed on the drug’s package insert. These drug-event pairs represent a set of known strong positive associations. We call a drug-event association synthetic if it has a tight reporting correlation with the causative drug (ρ ≥ 0.1) and a high association score (RR ≥ 2). Drugs co-reported with these drugs were 55.8 (95% CI, 29.3 to 122.1; P < 0.0001, Fisher’s exact test) times as likely to have synthetic associations with the drug events. We found a strong linear relationship between the correlation of co-reporting between drugs and the likelihood of synthetic associations (ρ = 0.93, P < 0.0001, Fig. 1B).

Characteristic biases (sex and age). We identified 33 adverse events that, for physiological reasons, predominantly occur in males (for example, penile swelling and azoospermia; full list shown in table S1). We found that drugs that are disproportionately reported as causing adverse events in males were more likely to be synthetically associated with these events (ρ = 0.57, P < 0.0001, Fig. 1C). Similarly, we identified 48 adverse events that predominantly occur in either relatively young or relatively old patients (table S2). We found that drugs that are disproportionately reported to cause adverse events in relatively younger or relatively older patients (compared to the database average) were more likely to be synthetically associated with these events (ρ = 0.60, P < 0.0001, Fig. 1D).

Correction of selection bias in adverse event report data

By identifying matched nonexposed reports to the exposed reports for each drug, SCRUB reduced the rate of synthetic associations that resulted from indications and concomitant drugs. We found that applying SCRUB removed 57% of the synthetic associations from indications and 49% of those that resulted from concomitant drugs. Further, we found that SCRUB preferentially dampened the signal of synthetic associations: odds ratio (OR) = 1.8 (95% CI, 1.4 to 2.4; P < 0.0001, Fisher’s exact test) for associations from concomitant drugs and OR = 2.0 (95% CI, 1.8 to 2.3; P < 0.0001, Fisher’s exact test) for those resulting from indications (Fig. 1, E and F). We evaluated the method against three independent silver standards of ADEs: (i) side effects mined from the drug package inserts, (ii) adverse events reported to the FDA after our data extraction, and (iii) adverse event reports from Canada. We found that using SCRUB in combination with the GPS, a commonly used method for correcting sampling variance, significantly increased the predictive power in all three cases. We found that the area under the receiver operating characteristic curve (AUROC), a common method for evaluating the performance of predictive algorithms, increased from 0.53 to 0.79 (χ² = 19,130, P < 0.0001), 0.58 to 0.71 (χ² = 13,963, P < 0.0001), and 0.59 to 0.77 (χ² = 8598, P < 0.0001) for each of the silver standards, respectively (Fig. 2). This finding is corroborated by several case studies reported in the Supplementary Materials (figs. S1 to S7).
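For readers unfamiliar with the AUROC, the following sketch computes it directly from its rank interpretation: the probability that a randomly chosen true association receives a higher score than a randomly chosen non-association. The scores and labels are hypothetical and unrelated to the silver standards above.

```python
def auroc(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive has the
    higher score; ties count as half a win."""
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical association scores and silver-standard labels.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auroc(scores, labels))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the improvements above are reported.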

Fig. 2

Systematic evaluation against three independent silver standards of drug-effect associations. (A) Side effects mined from the package inserts. (B) Drug-effect pairs reported to the AERS after the original download date. (C) Drug-effect pairs reported to the Canadian system MedEffect. ROC curves for the empirical Bayes geometric mean (EBGM) (black) and a model combining EBGM and the correction factor derived from the SCRUB algorithm (aqua). In each case, including the correction term substantially improves the predictive power of the algorithm.

Correction of uncharacterized bias in adverse event report data

We hypothesized that the SCRUB algorithm would correct for synthetic associations caused by hidden, or unmeasured, covariates as well as those from indication and concomitant drug use. To test this notion, we hid age and sex data from the model and tested the ability of SCRUB to remove synthetic associations that resulted from these covariates. We found that the SCRUB-identified cohort matched more closely in age for 478 of 629 drugs (76%). We highlight the 20 most age-biased drugs and their cohort age differences in Fig. 3A. Similarly, we tested the ability of the algorithm to implicitly correct for sex differences. SCRUB-identified cohorts matched more closely in sex for 467 of 629 drugs (74%). Figure 3B highlights the top 20 drugs most biased in terms of sex of the patients and our corrections.
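The balance check behind this comparison can be sketched as follows. The sketch assumes balance is summarized as the absolute difference in mean age between exposed reports and a control cohort; the function name and the ages are hypothetical.

```python
from statistics import mean


def mean_age_gap(exposed_ages, control_ages):
    """Absolute difference in mean age between two cohorts of reports."""
    return abs(mean(exposed_ages) - mean(control_ages))


# Hypothetical ages, for illustration only.
exposed = [70, 72, 68]
all_unexposed = [30, 45, 60, 71, 69]   # uncorrected controls
matched_unexposed = [71, 69, 60]       # controls chosen by matching

gap_before = mean_age_gap(exposed, all_unexposed)
gap_after = mean_age_gap(exposed, matched_unexposed)
print(gap_before, round(gap_after, 2))
```

A drug counts as "matched more closely" when the gap for the matched cohort is smaller than the gap for the uncorrected cohort, even though age was never given to the matching model.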

Fig. 3

Implicit matching of covariates. (A) Biases from age. Average age differences between cohorts of reports for those patients exposed to the drug and those who were not exposed (controls). The average difference for the uncorrected (solid squares) and corrected (open circles) nonexposed control reports is shown. Ideally, the difference between the two cohorts of reports is zero. (B) Biases from sex. Difference in the proportion of males reported to be exposed to the query drug versus those who were not exposed (controls). The difference for the uncorrected (solid squares) and corrected (open circles) nonexposed reports is shown. Ideally, this difference is zero.

Off-label and polypharmacy side-effect databases

We use the term “off-label” to refer to any drug effect not already listed on the drug’s package insert. We constructed a database of 438,801 off-label side effects for 1332 drugs and 10,097 adverse events. The average drug label lists 69 “on-label” adverse events. We list an average of 329 high-confidence off-label adverse events for each drug. For comparison, the SIDER database, extracted from drug package inserts, lists 48,577 drug-event associations for 620 drugs and 1092 adverse events that are also covered by the data mining. Offsides recovers 38.8% (18,842 drug-event associations) of SIDER associations from the adverse event reports. Thus, Offsides finds different associations from those reported during clinical trials before drug approval. We found that the drug-event associations reported in Offsides were predictive of known class-wide drug effects, such as the adverse events of the nervous systems associated with antiparasitics and insecticides (fig. S8).

In addition, we constructed a database of polypharmacy side effects for pairs of drugs (Twosides). This database contains 868,221 significant associations between 59,220 pairs of drugs and 1301 adverse events. These associations are limited to only those that cannot be clearly attributed to either drug alone (that is, associations already covered in Offsides are excluded). The database contains an additional 3,782,910 significant associations for which the drug pair has a higher side-effect association score, determined using the proportional reporting ratio (PRR), than those of the individual drugs alone. We found that the Twosides database is enriched for pairs of drugs with known interactions (t = 6.6, P = 4.9 × 10⁻¹¹).

Use of drug-effect associations to predict protein targets and drug indications

Using a simplified version of the analysis by Campillos et al. (13), we computed pairwise similarity metrics between all drugs in the Offsides and SIDER databases. We found that side-effect similarities derived from Offsides were predictive of the proportion of shared targets between drugs (Fig. 4A, ρ = 0.92, P < 0.0001). For example, diazepam (Valium) and zolpidem (Ambien) share seven of the same protein targets, and although they have different chemical structures and are used for different indications, the two drugs have similar side-effect profiles. This similarity in side-effect profiles can be used to identify pairs of drugs likely to share targets and, when those associations are not yet known, to find new targets for existing drugs. Furthermore, similarities derived from Offsides provided information that was independent of similarities derived from SIDER (Fig. 4B) with respect to predicting shared targets (χ² = 177.2, P < 0.0001), as determined by an analysis of variance (ANOVA). The model that was most predictive of shared targets was one that included information from both databases (AUROC = 0.75), followed by SIDER alone (AUROC = 0.71), and then Offsides alone (AUROC = 0.70) (Fig. 4C).
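As an illustration of a pairwise side-effect similarity, the sketch below uses the Jaccard index on sets of reported effects. This is an assumed stand-in, since the exact metric is not specified in this section; the drug names and effect lists are hypothetical.

```python
def jaccard(effects_a, effects_b):
    """Shared effects divided by all effects reported for either drug."""
    a, b = set(effects_a), set(effects_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical side-effect profiles.
drug_x = {"drowsiness", "dizziness", "amnesia"}
drug_y = {"drowsiness", "dizziness", "headache"}
similarity = jaccard(drug_x, drug_y)
print(similarity)  # 0.5
```

Pairs with high similarity would then be the candidates examined for shared protein targets.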

Fig. 4

Predicting shared protein targets using drug-effect similarities. (A) The side-effect similarity score between two drugs is linearly related to the number of targets that those drugs share. (B) A scatter plot showing the relationship between the side-effect similarity score and the number of shared targets for side effects derived from Offsides (blue), SIDER (red), and both combined (black). (C) ROC curve representing the ability of the side-effect similarity scores to predict which pairs of drugs share targets. The best performance is reached by combining both data sets.

Similarly, we found a linear relationship between the proportion of shared indications between a pair of drugs and the similarity of their side-effect profiles in Offsides (Fig. 5A). This opens up the possibility of using side-effect profiles to suggest new uses for old drugs. Again, we found that Offsides provided information independent of that provided by SIDER (Fig. 5B) with respect to predicting shared indications (χ² = 874.5, P < 0.0001), as determined by an ANOVA. We found that a combination of the two databases performed best in terms of predicting existing therapeutic indications of known drugs (AUROC = 0.83), followed by SIDER alone (AUROC = 0.78), and then Offsides (AUROC = 0.75) (Fig. 5C).

Fig. 5

Drug repurposing using drug-effect similarities in Offsides. (A) The side-effect similarity score between two drugs is linearly related to the number of indications those drugs share. (B) Scatter plot showing the relationship between the side-effect similarity score and the number of shared targets for side effects derived from Offsides (blue), SIDER (red), and both combined (black). (C) ROC curve representing the ability of the side-effect similarity scores to predict which pairs of drugs share indications. The best performance is reached by combining both data sets.

Corroboration of class-wide interaction effects with EMRs

We used the Twosides database to identify DDIs shared by an entire drug class (see Materials and Methods). Our class-class interaction analysis produced 1732 putative drug class interactions. We then identified laboratory reports commonly recorded in EMRs—for patients who visited Stanford University Hospital—that may be used as markers of these class-specific DDIs. We tested 596 of these interactions and found significant changes in the long-term (≥1 year) laboratory markers for 395 (66%; P < 0.0001, Fisher’s exact test) of the interactions using a Cox proportional hazards model. Further, we found additional evidence of drug effects for 47 of 395 interactions when looking for short-term (≤36 days) changes in laboratory markers after the start of treatment (Table 1).

Table 1

Class-class interactions identified in Twosides and corroborated by EMRs. Forty-seven class-class interactions were corroborated by two retrospective analyses of EMRs: (i) short-term combination effects observed in the 36 days after combination treatment and (ii) long-term incidence of abnormal laboratory values associated with combination treatment. The posttreatment lab changes and long-term risk values shown are all statistically significant (P < 0.05) as determined by a paired t test and Cox proportional hazards model, respectively (see the Supplementary Methods). CA, calcium; MCH, mean corpuscular hemoglobin; MCHC, MCH concentration; PLT, platelets; RDW, red cell distribution width; K, potassium; Mg, magnesium; QTC, corrected QT interval; Na, sodium; PCO2A, arterial CO2; PTT, partial thromboplastin time; ALB, albumin; CR, creatine; GLOB, globulin; RBC, red blood cells; TP, total protein; AST, aspartate aminotransferase; TBIL, total bilirubin; BUN, blood urea nitrogen; ALKP, alkaline phosphatase.

To test whether these discoveries were spurious fluctuations of the laboratory markers, we constructed a random set of interaction predictions for comparison. We found that our class-class interaction predictions were significantly enriched for interactions for which there was evidence in the EMRs (OR, 36.8; 95% CI, 6.2 to 1481.9; P < 0.0001, Fisher’s exact test). Figure 6 summarizes our drug effect and interaction findings for cardiovascular adverse events.

Fig. 6

Interaction diagram depicting single-drug effects, drug-class effects, DDIs, and class-class interactions for cardiovascular adverse events. Drugs are sorted clockwise around the ring by the physiological system they treat. Drugs labeled by name are members of data-mined DDIs. Within each physiological system, drugs are grouped into lower-order drug classes according to structural similarity or treatment indication. These lower-order classes are colored by their class-wide association with adverse cardiovascular effects (red for most severe to blue for least severe). Each arc across the center represents one DDI according to the data mining. The arc is colored red if the drug interaction is corroborated with evidence from the EMRs and brown if the drugs are members of class-class interactions. The heat map around the interior of the ring indicates the individual drug effects with the top 10 cardiovascular adverse events (arteriosclerosis, decreased arteriole pressure, chest pain, difficulty breathing, heart attack, apoplexy, high blood pressure, coronary heart disease, edema in extremities, cardiac decompression) (dark red for strong associations to white for weak or no association).

Association of co-prescription of thiazides and SSRIs with a prolonged QT interval

The DDI with the largest effect size was an association between co-prescriptions of thiazides and selective serotonin reuptake inhibitors (SSRIs) with prolonged QT intervals (QTc > 440 ms) (Supplementary Materials). Prolonged QT intervals on an electrocardiogram have been associated with increased risk of spontaneous arrhythmias and sudden death. For EMR analysis, we removed patients who had a previous history of prolonged QT. Of 932 patients who were co-prescribed a thiazide and an SSRI, 87 (9.3%) displayed prolonged QT, whereas 588 of 9008 (6.5%) patients who were prescribed thiazides alone had prolonged QT, and 684 of 14,218 (4.8%) patients who were prescribed SSRIs alone had prolonged QT. Using Cox proportional hazards regression with covariates, we performed a time-to-event analysis on the association between co-prescription of thiazides and SSRIs and the incidence of prolonged QT. This analysis showed that patients who were co-prescribed a thiazide and an SSRI were 1.5 (95% CI, 1.2 to 1.9) times as likely (P = 4.46 × 10⁻⁴) to record a prolonged QT interval when compared to patients prescribed a thiazide alone and 1.4 (95% CI, 1.2 to 1.8) times as likely (P = 0.0013) as those prescribed an SSRI alone (Fig. 7). We ruled out the possibility that other co-prescribed medications may have been associated with QT prolongation by testing 38 commonly co-prescribed drugs in independent regression models and, in each case, found a significant effect from the combined use of thiazides and SSRIs (table S3). We corrected for the effects of co-prescriptions, age, race, and sex (table S4). Together, these results suggest that further study of the potential interaction between thiazides and SSRIs may be warranted.
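The raw proportions quoted above can be checked with simple arithmetic. The crude ratios computed here are unadjusted and ignore follow-up time, so they are not expected to equal the covariate-adjusted Cox estimates (1.5 and 1.4) reported in the text; the variable names are illustrative.

```python
# Counts taken from the text: (patients with prolonged QT, patients in group).
combo_events, combo_n = 87, 932          # thiazide + SSRI
thiazide_events, thiazide_n = 588, 9008  # thiazide alone
ssri_events, ssri_n = 684, 14218         # SSRI alone

p_combo = combo_events / combo_n
p_thiazide = thiazide_events / thiazide_n
p_ssri = ssri_events / ssri_n
print(f"{p_combo:.1%} {p_thiazide:.1%} {p_ssri:.1%}")  # 9.3% 6.5% 4.8%

# Crude (unadjusted) ratios of proportions.
crude_vs_thiazide = p_combo / p_thiazide
crude_vs_ssri = p_combo / p_ssri
print(round(crude_vs_thiazide, 2), round(crude_vs_ssri, 2))
```

The gap between the crude and adjusted ratios for the SSRI comparison shows why the time-to-event model with covariates matters for interpreting these counts.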

Fig. 7

Kaplan-Meier curves showing the proportion of patients that had prolonged QT corrected values after the start of drug therapy. The solid line represents patients who received both thiazides and SSRIs, the dashed line represents patients who received only thiazides, and the dotted line represents patients who received only SSRIs.


The methods we present here build upon the foundation of signal detection algorithms developed for drug safety surveillance. The use of spontaneous reporting systems for identifying ADEs faces challenges as a result of sampling variance and reporting biases (4, 6). Modern signal detection algorithms address the issue of sampling variance by using shrinkage to down-weight drug-event associations with little evidence to support them (6, 7). Stratification is designed to address reporting biases by dividing the data across covariate-defined strata. However, systematic application of stratification using a fixed set of covariates reduces power by dividing up the available data across unimportant strata (4, 15). Our approach does not divide data across strata and can correct for the effects of confounders even if those variables are unknown or unmeasured. The key insight is that, at least for drugs, the indications of use and the other drugs used capture much of the information in many important covariates. Although our approach is inspired by those used in observational cohort analysis, it does not enable causative inference. As with other signal detection techniques, the goal is to generate quality hypotheses for follow-up analysis. Our method has a comparable running time to current techniques, making it suitable for systematic drug surveillance.

The successful prediction of side effects before a drug enters clinical trials remains a tantalizing goal. Chemical informatics techniques can predict drug side effects by comparing the structural similarity of drugs (16, 17). In an analogous manner, protein structural similarity can explain and predict drug side effects (18). More recently, network and chemical properties have been combined into predictive models of drug effects (19); these approaches all rely on a comprehensive database of known drug effects. Package inserts list drug side effects and could serve as a primary source of known side effects, but these data are limited. First, because clinical trials are conducted on relatively small patient populations, only common effects can be detected with sufficient confidence to be listed on a drug’s package insert. Second, effects observed during the clinical trials may be incidental and not actually caused by the drug. Nonetheless, recent work in chemical biology has used SIDER (a text-mined database of drug package inserts) to good effect (12, 13, 20). Our Offsides database contains information complementary to that found in SIDER and improves the prediction of protein targets and drug indications. As a complement to Offsides, our Twosides database of mined putative DDIs also lists predicted adverse events. These databases will serve as valuable resources for chemical biology, drug discovery, and pharmacoepidemiology studies. These databases are made available in the Supplementary Materials and at the Web site.

Identification and prediction of DDIs is a critical activity for improved patient care (21–26). Clinical trials do not routinely investigate DDIs because they are focused on establishing safety and efficacy of single-agent therapeutics. A wide range of methods, from text mining (27, 28) to network modeling (29, 30), can detect, explain, and predict DDIs. Recently, a systems pharmacology approach was presented to identify genes associated with adverse cardiovascular drug effects (31). Integration of these methods with Twosides may lead to further understanding of the molecular etiology of these effects (figs. S9 and S10). We highlight one potentially clinically significant association between co-prescription of thiazides and SSRIs and QT interval prolongation. Prolonged QT is not a known interaction effect of thiazides and SSRIs. However, each drug class is individually implicated in causing hyponatremia (32–34), and the mechanisms that cause this side effect may interact synergistically. The EMR analyses we report are not full epidemiological studies. EMRs are incomplete and may be missing data on medical history and prescription orders. In addition, patients who take multiple drugs may have a higher rate of adverse events than less-medicated patients. Further analysis is needed to evaluate these potentially important drug interactions.

Evaluation of signal detection algorithms and side-effect prediction algorithms, in general, is not straightforward; no gold standard of known ADEs exists. In lieu of a standard, we evaluated our proposed methodology against three “silver” standards: (i) effects listed on the drug’s package inserts, (ii) ADEs reported after the original download date of September 2009, and (iii) ADEs reported to the Canadian spontaneous reporting system. We found that when used in combination with modern signal detection algorithms, our method significantly improved performance. These standards, however, are biased toward more common effects, and so the performance of our method with respect to detecting rare events may be less reliable. A publicly available resource of drug effects would enhance the evaluation of this and other predictive algorithms.

In summary, we present a new methodology for correcting for the effects of confounding variables in large clinical observational databases when those variables are unknown, unmeasured, or sparsely collected. The goals of this work parallel those of patient stratification; however, our presented methodology adapts to specific drug-event pairs, does not require data to be split across strata, and can implicitly correct for unmeasured covariates. The key assumption of the method is that many patient covariates will be represented by the concomitant drugs the patient is taking and indications for which the patient is being treated. The method improves the performance of modern signal detection techniques and is suitable for systematic and routine drug safety surveillance. Finally, we present two new resources of adverse drug effects and drug interactions for use in drug discovery, repositioning, chemical biology, and pharmacoepidemiology studies.

Materials and Methods

Data source

We downloaded the following: (i) 1,851,171 adverse event reports in the AERS from the FDA’s Web site from the first quarter of 2004 to the first quarter of 2009; (ii) the SIDER, a database of the drugs, adverse events, and indications mined from the FDA drug labels (12) and Canada’s MedEffect resource, the sister database to the AERS containing about 300,000 adverse event reports (downloaded September 2009); (iii) the drug target information from the DrugBank, Matador, and Psychoactive Drug Screening Program (PDSP) chemical databases referenced by Campillos and colleagues for use in correlating side-effect similarity to shared drug targets (13); and (iv) an independent database of the adverse event reports in AERS for the third quarter of 2009 to the fourth quarter of 2010 for validation purposes.

Statistical model and assumptions

Not all ADEs that occur are captured by spontaneous reporting systems. The drug effects need to be observed, recognized, attributed to a drug, and then reported. Therefore, differential reporting and covariate biases prohibit a straightforward interpretation of the reports. Disproportionality analysis addresses some of these issues by looking for drugs that are disproportionately reported with a particular event compared to that same proportion for other drugs. Such methods compare the observed number of reports to an expected number estimated from the proportions of other drugs. In our model, under the null hypothesis, we view the observed ($O_{xy}$) and expected ($E_{xy}$) values as biased estimators of the incidence ($I_{xy}$), where the incidence is the rate at which the event would be reported absent any confounding variables. We can then write the observed-to-expected ratio as follows:

$$\frac{O_{xy}}{E_{xy}} = \frac{I_{xy} + \varepsilon}{I_{xy} + \beta}$$

where $\beta$ is the bias of $E_{xy}$ and $\varepsilon$ is the bias of $O_{xy}$. Synthetic associations occur when the bias in the observed value is greater than the bias in the expected value ($\varepsilon > \beta$), and in general, errors may occur whenever the two biases are not equal. Because we do not have complete knowledge of the patients who are prescribed any given drug and the events that occur, we cannot compute the bias terms directly. Instead, we adapted the tools of cohort selection in observational studies to match each exposed case report to a nonexposed control report. PSM models the probability that a patient (also known as a report in our case) is selected into the exposed group versus the nonexposed group as a function of the available covariates using logistic regression (10). Each exposed patient (that is, report) is matched to a nonexposed patient with a similar probability according to the PSM model, thereby mitigating the effects of confounders. In general, adverse event reporting systems do not collect data on all of the covariates necessary to implement a PSM model. However, our hypothesis is that many of the important covariates for a patient will be captured by the concomitant medications that patient is taking and the indications they are being treated for. These data are collected by spontaneous reporting systems, and these are the covariates we use in our PSM model.
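
To make the observed-to-expected comparison concrete, the following minimal Python sketch computes a generic disproportionality ratio from report counts and shows how unequal biases distort the ratio under the null hypothesis. The function names, the count-based form of the expected value, and all numbers are illustrative assumptions; they are not the exact statistics used in this study.

```python
def observed_to_expected(n_drug_event, n_drug, n_other_event, n_other):
    """Ratio of the event's reporting proportion for drug x to that for other drugs.

    Assumed illustrative form: observed = share of drug-x reports that list the
    event; expected = the same share among reports for all other drugs.
    """
    observed = n_drug_event / n_drug
    expected = n_other_event / n_other
    return observed / expected


def biased_ratio(incidence, eps, beta):
    """(I + eps) / (I + beta): the ratio when O and E are biased estimates of I."""
    return (incidence + eps) / (incidence + beta)


# Hypothetical counts: 20 of 100 reports for drug x list the event,
# versus 50 of 1,000 reports for all other drugs.
ratio = observed_to_expected(20, 100, 50, 1000)

# Under the null, equal biases cancel; unequal biases create a spurious signal.
null_equal = biased_ratio(0.05, eps=0.01, beta=0.01)
null_unequal = biased_ratio(0.05, eps=0.03, beta=0.01)
```

In this toy setting, the ratio exceeds 1 in the last case even though no true association exists, which is the kind of synthetic association that matching exposed and nonexposed reports is meant to reduce.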

Identification of nonexposed control reports to estimate expected values

For each drug, x, we used PSM to model the probability that a given report lists x as a concomitant medication. That is, the dependent variable for the logistic regression model was an indicator variable for the listing of drug x on the adverse event report. The independent variables were other drugs and indications. However, rather than using all drugs and indications (which would lead to a model with thousands of features), we used only drugs and indications found to be preferentially reported with the given drug, x. In addition, we limited the total number of possible features to the top 200 covariates (sorted by their Spearman correlation coefficient, ρ). Finally, we removed any reports that listed none of the chosen features (that is, their feature vector would be all zeroes). In this way, we built a PSM for each of the 632 drugs and used the model to score each report for each drug to compute its PSM score. Now, for each drug, we have generated PSM scores for each of the exposed reports and a subset of the nonexposed reports. We divide the exposed reports into 20 equally spaced bins on the basis of their PSM scores, and for each bin, we sample, with replacement, nonexposed control reports with PSM scores within the bin range until we have 10 times as many nonexposed reports as exposed reports. We discard any bins that have no matching controls available. The result is two sets of reports for each drug (exposed and nonexposed) from which we compute the observed and expected ratios and the resulting disproportionality statistics. To identify DDIs, we used a slightly modified approach that was computationally more efficient (table S5).
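
The binning and sampling step described above can be sketched as follows. This is a minimal illustration that assumes the PSM scores have already been computed (for example, by a logistic regression fit elsewhere); the function name `match_controls`, the equal-width interpretation of "equally spaced bins," and the toy scores are assumptions rather than the implementation used in this study.

```python
import random


def match_controls(exposed_scores, control_scores, n_bins=20, ratio=10, seed=0):
    """Sample nonexposed reports, with replacement, bin by bin.

    Returns (kept_exposed, sampled_controls) as lists of indices into the
    two input lists. Bins without any candidate control are discarded.
    """
    rng = random.Random(seed)
    lo, hi = min(exposed_scores), max(exposed_scores)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all exposed scores identical: everything falls in bin 0

    def bin_index(score):
        # Clamp so that score == hi lands in the last bin.
        return min(int((score - lo) / width), n_bins - 1)

    exposed_bins = {}
    for i, score in enumerate(exposed_scores):
        exposed_bins.setdefault(bin_index(score), []).append(i)

    control_bins = {}
    for j, score in enumerate(control_scores):
        if lo <= score <= hi:  # only controls inside the exposed score range
            control_bins.setdefault(bin_index(score), []).append(j)

    kept_exposed, sampled_controls = [], []
    for b in sorted(exposed_bins):
        pool = control_bins.get(b)
        if not pool:
            continue  # no matching controls available: discard this bin
        members = exposed_bins[b]
        kept_exposed.extend(members)
        # Sample with replacement until there are `ratio` controls per exposed report.
        sampled_controls.extend(rng.choices(pool, k=ratio * len(members)))
    return kept_exposed, sampled_controls


# Toy scores: two exposed reports share a bin with one control; the third has none.
kept, controls = match_controls([0.10, 0.12, 0.90], [0.11, 0.50, 0.05])
```

Because sampling is done with replacement, a single control report can be drawn many times when a bin contains few nonexposed reports, as in the toy example.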

Construction of the silver standard sets of drug-event associations

To evaluate the presented method, we tested its predictive performance against three standards of drug effects and compared this performance to the GPS. No unbiased gold standard for ADEs exists. The drug-event associations from the FDA drug labels are the most obvious option for comparison. However, it is important to note that the labels are biased toward the more common adverse events that are observed and reported in premarketing clinical trials. This bias will limit the applicability of the drug labels because the goal of signal detection is to identify rare and unexpected side effects of drugs. An independent adverse event database, such as Canada’s MedEffect database, can also be used for evaluation. However, because such a database will suffer from the same types of errors, it is necessary to take only a subset of high-confidence associations. We extracted only those associations where there was only one drug listed on the report (according to the publicly available structured data) under the assumption that if only one drug is listed, then it is the causative agent. Similarly, to assess reproducibility, we used a subset of AERS (quarter 3 of 2009 to quarter 4 of 2010) that was not used in the original analysis as a third silver standard. Again, to mitigate confounding effects, we used only those reports that list exactly one drug and flag that drug as the primary suspect.
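
The single-drug filter used for the high-confidence subsets can be sketched in a few lines. The report structure below (dictionaries with `drugs` and `events` lists) and the function name are hypothetical; the actual structured data formats of AERS and MedEffect differ.

```python
def single_drug_associations(reports):
    """Return the set of (drug, event) pairs from reports listing exactly one drug."""
    pairs = set()
    for report in reports:
        drugs = report["drugs"]
        if len(drugs) != 1:
            continue  # several drugs listed: the causative agent is ambiguous
        for event in report["events"]:
            pairs.add((drugs[0], event))
    return pairs


# Hypothetical reports for illustration only.
reports = [
    {"drugs": ["drug_a"], "events": ["rash", "nausea"]},
    {"drugs": ["drug_a", "drug_b"], "events": ["headache"]},
    {"drugs": ["drug_c"], "events": ["rash"]},
]
silver_standard = single_drug_associations(reports)
```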

Using drug side-effect similarities to predict drug targets and indications

Previous work has shown that a drug’s side effects can be used to predict protein targets. Specifically, Campillos and colleagues have shown that if two drugs are similar in the side effects they elicit, then they are more likely to share a common drug target (13). As validation of the biological relevance of the methods we present, we replicated this result in our mined associations. We calculated the similarity between two drugs by computing the Tanimoto coefficient between the drugs’ adverse event bit vectors (in these adverse event bit vectors, each bit represents one adverse event and is set if the drug has a significant association with the adverse event). Some drugs have higher similarity scores on average using this metric, so we performed a z-score normalization by drug. We calculated these “z similarities” for both the SIDER data set (the side effects extracted from the drug’s package inserts) and the Offsides data set. We tested the similarity score’s ability to predict the number of targets two drugs share using a multivariate linear regression (modeling proportion of shared targets) and logistic regression (modeling shared targets), and tested for independence between SIDER and Offsides using an ANOVA.
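
As a minimal sketch of the similarity computation, the code below represents each drug's adverse event bit vector as a Python set of event names, computes pairwise Tanimoto coefficients, and then z-score normalizes each drug's similarities to all other drugs. The per-drug normalization shown here (subtracting the drug's mean similarity and dividing by its population standard deviation) is one assumed reading of "z-score normalization by drug," and the drug and event names are invented.

```python
from statistics import mean, pstdev


def tanimoto(events_a, events_b):
    """Tanimoto coefficient of two bit vectors given as sets of set bits."""
    a, b = set(events_a), set(events_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def z_similarities(drug_events):
    """Map each drug to z-scored Tanimoto similarities against every other drug."""
    drugs = sorted(drug_events)
    z = {}
    for d in drugs:
        sims = {o: tanimoto(drug_events[d], drug_events[o]) for o in drugs if o != d}
        values = list(sims.values())
        mu, sd = mean(values), pstdev(values)
        # If a drug is equally similar to all others, the z-score is undefined; use 0.
        z[d] = {o: (s - mu) / sd if sd > 0 else 0.0 for o, s in sims.items()}
    return z


# Invented example data.
drug_events = {
    "A": {"rash", "nausea"},
    "B": {"nausea", "headache"},
    "C": {"rash", "nausea", "fever"},
}
z = z_similarities(drug_events)
```

Note that under this reading the normalized score is not symmetric: the z similarity of A to C is computed relative to A's other similarities and need not equal that of C to A.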

Recent studies have also shown that drug indications can be predicted using side-effect similarities (14). We used the same side-effect similarity scores computed for target identification to predict drug indications. We tested the similarity score’s ability to predict the number of targets two drugs share using a multivariate linear regression (modeling proportion of shared targets) and logistic regression (modeling shared targets), and tested for independence between SIDER and Offsides using an analysis of covariance (ANCOVA).

Construction of the off-label and polypharmacy side-effect databases

The side-effect resources that we present and make publicly available are a subset of the associations analyzed in this study. For the Offsides database, we removed any associations that were not nominally significant (uncorrected P > 0.05); the remaining associations were included. For the Twosides database, we also removed associations that were not significant. In addition, we removed any associations for which there is evidence, according to Offsides, that one drug of the pair is likely responsible for the adverse event. This step improves the chances that the reported drug interaction effects are due to synergistic interactions and are not recapitulations of known effects.
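
The two filtering steps can be sketched as follows. The data layout (dictionaries keyed by drug-event or drug-drug-event tuples with P values as values) and the function names are hypothetical, and reusing the 0.05 threshold for Twosides is an assumption, because the text states only that nonsignificant associations were removed.

```python
def build_offsides(drug_event_pvalues, alpha=0.05):
    """Keep drug-event associations that are nominally significant (P <= alpha)."""
    return {key: p for key, p in drug_event_pvalues.items() if p <= alpha}


def build_twosides(pair_event_pvalues, offsides, alpha=0.05):
    """Keep significant pair-event associations not explained by either drug alone."""
    kept = {}
    for (drug_a, drug_b, event), p in pair_event_pvalues.items():
        if p > alpha:
            continue  # not significant
        if (drug_a, event) in offsides or (drug_b, event) in offsides:
            continue  # one drug of the pair is already associated with the event
        kept[(drug_a, drug_b, event)] = p
    return kept


# Invented P values for illustration only.
single = {("drug_a", "rash"): 0.01, ("drug_b", "rash"): 0.40, ("drug_b", "nausea"): 0.03}
pairs = {
    ("drug_a", "drug_b", "rash"): 0.001,      # dropped: drug_a alone explains rash
    ("drug_a", "drug_b", "headache"): 0.02,   # kept
    ("drug_a", "drug_b", "fever"): 0.30,      # dropped: not significant
}
offsides = build_offsides(single)
twosides = build_twosides(pairs, offsides)
```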

Associating drug classes to adverse event categories from Offsides

We associated Anatomic Therapeutic Chemical (ATC) drug classes (levels 1 and 4) with adverse event categories by linear modeling. To facilitate the statistical modeling, we constructed a table where the rows are all drug-event pairs determined to have significant associations by SCRUB. The model contained two features: (i) indicator variable of the membership of the drug in the ATC drug class and (ii) indicator variable of the membership of the event in the category. The dependent variable is the reporting frequency for the drug-event pair observed in the AERS. For each class-category pair, we then modeled the reporting frequency between all drugs and adverse events as a function of the two indicator variables and an interaction term between the two indicator variables. We filtered for those class-category associations where the interaction term was significant, after multiple hypothesis correction.
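
For a single class-category pair, the linear model with two indicator variables and their interaction is saturated, so its least-squares interaction coefficient equals a difference of differences of the four cell means. The sketch below uses that identity to compute the coefficient without a regression library. It does not compute the significance test or the multiple hypothesis correction, and the function name and data are invented for illustration.

```python
from statistics import mean


def interaction_coefficient(rows):
    """Interaction term of freq ~ in_class + in_category + in_class:in_category.

    rows: iterable of (in_class, in_category, reporting_frequency), with the
    first two entries being 0/1 indicators. All four indicator combinations
    must be present for the coefficient to be identifiable.
    """
    cells = {}
    for in_class, in_category, freq in rows:
        cells.setdefault((int(in_class), int(in_category)), []).append(freq)
    if len(cells) < 4:
        raise ValueError("need at least one drug-event pair in each of the four cells")
    m = {key: mean(values) for key, values in cells.items()}
    # Difference of differences of cell means equals the OLS interaction coefficient.
    return m[1, 1] - m[1, 0] - m[0, 1] + m[0, 0]


# Invented drug-event rows: (drug in ATC class, event in category, reporting frequency).
rows = [
    (0, 0, 0.01),
    (0, 0, 0.03),
    (1, 0, 0.02),
    (0, 1, 0.02),
    (1, 1, 0.10),
]
coef = interaction_coefficient(rows)
```

A positive coefficient indicates that drug-event pairs belonging to both the class and the category are reported more frequently than the two main effects alone would predict.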

Methodological details covering the statistical analysis of the drug and indication case studies, the computation of the drug-effect association statistics, the identification of DDI effects, and the analytical methods for validating acute and long-term effects using EMR data can be found in the Supplementary Materials and tables S4, S6, and S7. Institutional review board approval was obtained for the EMR studies.

Supplementary Materials

Materials and Methods

Fig. S1. Case study: abacavir and rash.

Fig. S2. Case study: isoniazid and hepatic failure.

Fig. S3. Case study: pergolide and heart valve damage.

Fig. S4. Case study: rofecoxib and myocardial infarction.

Fig. S5. Case study: arrhythmias.

Fig. S6. Case study: hypercholesterolemia.

Fig. S7. Case study: hyperglycemia.

Fig. S8. Heat map of the interaction coefficients between adverse event categories and drug classes.

Fig. S9. Performance of disproportionality statistics for drug-drug interactions.

Fig. S10. Trends in disproportionality statistics for drug-drug interactions.

Table S1. Adverse events primarily reported with males.

Table S2. Age-related adverse events.

Table S3. Effect of commonly co-prescribed drugs on association with thiazides and SSRIs.

Table S4. Covariates in Cox regression model of thiazides and SSRIs.

Table S5. Drug class–adverse event category associations in Offsides.

Table S6. Labs used as markers for adverse event categories.

Table S7. Normal lab value ranges.


Tab delimited database file: offsides.tsv

Tab delimited database file: twosides.tsv

References and Notes

  1. Acknowledgments: We thank G. Friedman, J. Terdiman, D. Oliver, and D. Ludwig for useful comments and discussion. Funding: N.P.T. is supported by a training grant from the U.S. National Library of Medicine (NIH LM007033) and by an award from the U.S. Department of Energy Office of Science Graduate Fellowship. R.D. is a Howard Hughes Medical Institute Medical Research Fellow. P.P.Y. is supported by Clinical and Translational Science Awards (CTSA) TLI RR025742 and the Bruce E. and Doris A. Nelson Fellowship Fund. R.B.A. is supported by NIH/National Institute of General Medical Sciences PharmGKB resource, R24GM61374, as well as LM05652. Additional support is from the Stanford NIH/National Center for Research Resources CTSA award number UL1 RR025744. Author contributions: N.P.T. and R.B.A. designed the analysis; N.P.T., P.P.Y., and R.D. implemented and performed the analysis; N.P.T. and R.B.A. wrote the paper. Competing interests: The authors declare that they have no competing interests.