Perspective: Drug Discovery

Developing predictive assays: The phenotypic screening “rule of 3”


Science Translational Medicine  24 Jun 2015:
Vol. 7, Issue 293, pp. 293ps15
DOI: 10.1126/scitranslmed.aab1201


Phenotypic drug discovery approaches can positively affect the translation of preclinical findings to patients. However, not all phenotypic assays are created equal. A critical question then follows: What are the characteristics of the optimal assays? We analyze this question and propose three specific criteria related to the disease relevance of the assay—system, stimulus, and end point—to help design the most predictive phenotypic assays.


Over the past decade, several commentaries have referred to a productivity crisis within the pharmaceutical industry (1, 2). A potential culprit is an overreliance on the industrialization of the drug-discovery process coupled with a reductionist approach to disease biology that focuses on single, insufficiently validated targets (3, 4). Recent analyses have indeed confirmed that a majority of high-profile studies reporting potential therapeutic targets contain irreproducible results and are potentially drawing incorrect conclusions on the link between these targets and specific diseases (5, 6). It was during this time of collective industry soul-searching that Swinney and Anthony published a thought-provoking review on the origins of newly approved medicines, revealing the prominence of phenotypic approaches in the discovery of first-in-class, small-molecule drugs (4). Although an updated survey produced a lower percentage of success (7), the difference in numbers appears to be mostly attributable to semantic distinctions in the definition of phenotypic screening and does not overturn the conclusions of the first study, in our opinion.

Phenotypic assays are compound screening systems that focus on the modulation of a disease-linked phenotype in a target-agnostic manner, in contrast to those centering on a specific protein (as is the case in target-based drug discovery). This success of phenotypic approaches, rather unexpected in an era focused on target-based drug discovery, has been ascribed to the unbiased nature of such screens (4, 8). Generally, phenotypic assays are more physiologically relevant than target-based ones because they are minimally cell-based, if not tissue- or whole-organism–based. Furthermore, they offer the possibility of identifying compounds acting through either unknown targets or unprecedented molecular mechanisms of action (MMOA) for known targets. For example, a phenotypic screen using live bacteria led to linezolid, a member of a novel antibiotic class preventing the initiation of protein translation (9). Similarly, the antiepileptic lacosamide was identified through testing in an in vivo rat model of epilepsy; after its U.S. Food and Drug Administration approval, it was documented to promote slow inactivation of voltage-gated sodium channels rather than directly blocking them (10, 11).

With the pharmaceutical industry embracing phenotypic screening anew, a key question has emerged that has received comparatively scant attention in the scientific literature (12, 13). Given that a research program will likely be entirely based on the results of its initial hit identification screen, the assay itself can be expected to play a critical role in the fate of this endeavor. What, then, are the characteristics of the best phenotypic assays, those most likely to lead to compounds and mechanisms that successfully translate to patients?


A critical interpretation of recent, major discoveries in both basic and disease biology is that they probably reveal the extent of what we have yet to learn in these areas. For example, the existence of mammalian stem cells was an unexpected finding, the ramifications of which are hard to overstate (14). Similarly, the discovery of noncoding RNAs and the explosive growth of the epigenetic field in general highlight how a basic area of biology has only recently become better understood (15). Even well-established dogmas, such as the presumed lack of neurogenesis in the adult human brain, have been overturned recently (16).

By definition, target-based drug discovery postulates a direct link between the modulation of a target through a given MMOA and the resolution or mitigation of a disease state. However, the high rate of failure of target-based projects due to lack of efficacy indicates that this underlying hypothesis is often incorrect (2). The target selection process usually uses a reductionist approach in which a hypothesized link from the disease to the target is built by using progressively simpler systems, with the assumption of translation to the clinic at each step (Fig. 1). These assumptions—some known and others unknown to the researchers—are validated during the drug-discovery process, as compounds are tested in progressively more complex systems up to, ultimately, patients. Unrecognized assumptions made during target selection, or “unknown unknowns” (17), may then play a substantial role in the documented high rate of project failure.

Fig. 1 Target selection and project progression: Making and validating translation assumptions.

Symbolic representation of the path followed during target selection, in which a relationship is hypothesized between a specific disease state and a MMOA modulating the activity of a single target. The assumptions are made at each step on the left side of the pyramid because we “assume” at each step that the ever simpler systems used will still be representative of the human disease. Project progression involves testing this hypothesis in progressively more disease-relevant models.


By design, phenotypic approaches have the potential to mitigate our incomplete understanding of physiology and disease biology. To fully deliver on that promise, it follows that those phenotypic assays most faithfully replicating the disease state will incorporate fewer incorrect assumptions and will consequently be more likely to deliver clinically relevant leads. In other words, the value of a phenotypic assay lies entirely in its ability to predict the successful translation of a given compound and/or MMOA to patients. Practically, this statement can be used as a guiding philosophy to evaluate the potential value of proposed assays in a resource-constrained environment. From a technical standpoint, three criteria stand out as helpful in this evaluation: assay system, stimulus, and readout.

We present the least- to most-relevant assay characteristics using a simple stoplight analysis in Table 1 and discuss each criterion in this Perspective. For the first criterion, we propose, rather obviously, that the physiological relevance of the assay system used in the phenotypic screen is of critical importance.

Table 1

Disease relevance within the three key features of phenotypic assays.
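
The stoplight rating of an assay across the three criteria can be recorded in a small data structure. The following is a minimal sketch, not part of the published analysis: the names `Stoplight`, `AssayRating`, and `limiting_criteria` are hypothetical, the example rating is illustrative only, and no numerical score is implied by the article. The sketch simply reports which criteria carry the lowest rating and would therefore be the first candidates for improvement.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stoplight(IntEnum):
    """Hypothetical three-level rating, ordered by disease relevance."""
    RED = 0     # least disease-relevant
    YELLOW = 1  # intermediate
    GREEN = 2   # most disease-relevant


@dataclass(frozen=True)
class AssayRating:
    """Rating of one assay on the three criteria named in the text."""
    system: Stoplight
    stimulus: Stoplight
    readout: Stoplight

    def limiting_criteria(self) -> list[str]:
        """Return the names of the criteria sharing the lowest rating."""
        ratings = {
            "system": self.system,
            "stimulus": self.stimulus,
            "readout": self.readout,
        }
        lowest = min(ratings.values())
        return [name for name, level in ratings.items() if level == lowest]


# Illustrative rating (an assumption, not data from the article).
example = AssayRating(
    system=Stoplight.RED,
    stimulus=Stoplight.YELLOW,
    readout=Stoplight.RED,
)
print(example.limiting_criteria())
```

Reporting the weakest criteria, rather than a summed score, reflects the argument that each criterion independently introduces translation assumptions; other aggregation choices are equally possible.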


Criterion 1: Disease relevance of the assay system. Although the majority of phenotypic screens were conducted in cell lines (often tumor-derived) as recently as a few years ago, awareness is increasing in the field of the importance of physiological relevance in the assay systems. Intuitively, one would expect native cells, such as primary cells and induced pluripotent stem cell (iPSC)–derived cells, to be more representative of human physiology (18).

Multiple examples support this claim. The karyotype of a cell represents one of its most fundamental and defining characteristics. A large number of tumor-derived cell lines display substantial genetic abnormalities, with some extreme examples bearing in excess of 100 chromosomes as opposed to the expected 46. By that measure, the widely used human monocytic THP-1 cell line would fare well considering its overall diploid character (19). Nonetheless, triploidy is observed for four chromosomes and monoploidy for another, along with the entire deletion of chromosome X and substantial chromosomal rearrangements. A simple question pertains: Is this a monocyte? In other words, can we expect a faithful representation of all of the functions of a primary human monocyte from such a cell? Furthermore, it is increasingly clear that most proteins do not function in isolation inside cells but instead partake in multiprotein complexes for signaling and metabolic purposes. G protein–coupled receptors provide excellent examples of how the pharmacology of a small molecule can be substantially influenced by both the expression level of the target and the presence of relevant protein binding partners (20). For example, the angiotensin II type I (AT1)–receptor/α2C–adrenergic receptor heterodimer activates a specific signaling pathway that is not engaged by either homodimer (21).

Accordingly, phenotypic projects can benefit from using more native cellular systems because there may be substantial differences in the hits and mechanisms identified (Fig. 2). Data from the cystic fibrosis area support this approach. There is poor overlap in the activity of compounds that correct the F508del cystic fibrosis transmembrane regulator (CFTR) trafficking defect between different cell lines overexpressing the same mutated CFTR protein (22); consequently, the field increasingly relies on patient-derived primary bronchial epithelial cells to generate efficacy and potency data with greater clinical relevance than that of cell lines (23, 24). In agreement with these published results, we have documented a poor pharmacology correlation between the cell lines and patient-derived cells (unpublished data). Another report further stresses that because protein expression and active signaling pathways differ between cell types, so will the hits that are identified in a screen. In this example, researchers studying familial dysautonomia observed that compounds identified with patient-sourced, iPSC-derived neural crest cells displayed altered pharmacology when tested against fibroblasts and lymphoblasts bearing the same disease-causing mutation and were inactive in neural crest cells from a healthy donor (25).

Fig. 2 Pitfalls associated with the use of less-relevant cellular systems.

Symbolic representation of a phenotypic assay system comprising a stimulus, a cell, and assay readout (phenotype). Arrows within each cell represent signaling pathways affected by the stimulus that modulate the phenotypic end point. The Venn diagram displays a hypothetical partial overlap of hits and mechanisms identified in parallel phenotypic screens conducted with native cells and a cell line.
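
The partial overlap depicted in the Venn diagram can be quantified when two parallel screens are available. The sketch below is one possible way to do so, under stated assumptions: the function name `hit_overlap` and the compound identifiers are hypothetical, and the Jaccard index is our choice of summary statistic rather than a measure used in the cited studies.

```python
def hit_overlap(native_hits: set[str], cell_line_hits: set[str]) -> dict:
    """Compare hit sets from two parallel screens of the same library."""
    shared = native_hits & cell_line_hits
    union = native_hits | cell_line_hits
    return {
        "shared": shared,
        "native_only": native_hits - cell_line_hits,
        "cell_line_only": cell_line_hits - native_hits,
        # Jaccard index: shared hits as a fraction of all hits found.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical compound identifiers, for illustration only.
native = {"cmpd-01", "cmpd-02", "cmpd-03", "cmpd-04"}
cell_line = {"cmpd-03", "cmpd-04", "cmpd-05"}

result = hit_overlap(native, cell_line)
print(sorted(result["shared"]))
print(sorted(result["native_only"]))
print(round(result["jaccard"], 2))
```

A low index would indicate that the two systems surface largely different hits and mechanisms, which is the situation the figure is meant to illustrate.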


Although the use of primary or patient-derived cells invariably raises concerns of donor variability, the examples above support the notion that using a cell line, often derived from cancerous tissue, may have overridingly negative consequences on the outcome of a phenotypic screen. Cells from several donors can be used to validate hits and mechanisms; thus, we have taken the approach that primary and patient-derived primary cells should be used by default whenever possible (Table 1). In general, use of human cells and tissue is preferable because identical genetic mutations in humans and rodents can lead to inconsistent phenotypes in vivo, indicating imperfect replication of some signaling pathways in nonhuman systems (26).

The cellular microenvironment, including the presence of different cell types and a three-dimensional (3D) setting, can also provide critical inputs required for the proper development of relevant cellular phenotypes. A 3D system with iPSC-derived neurons carrying mutations linked to Alzheimer’s disease was recently published (27). Although the β-amyloid plaques–Tau neurofibrillary tangles hypothesis has long been the backbone of research in this area, the iPSC model provided, for the first time, a demonstration of the entire pathway in native human cells. Similarly, the local tissue environment influences gene expression in resident macrophages and therefore promotes context-dependent functions contributing to tissue homeostasis (28).

Criterion 2: Disease relevance of the stimulus. Most phenotypic assays require a stimulus to achieve the production of the desired phenotype. We suggest that careful consideration should be given to this parameter in terms of its relevance to the indication of interest (Table 1).

The stimulus applied to the assay system will direct the engagement of specific signaling pathways, which in turn will substantially bias the mechanisms and targets that may be identified in the screen. The ideal stimulation would thus be derived from an accurate and complete understanding of the disorder’s root causes. As discussed previously, we likely lack such knowledge in many cases. Our suggestion would be to sidestep this conundrum through the use of highly disease-relevant biological systems that intrinsically contain the appropriate stimulus, such as patient-derived cells or iPSC-derived cells incorporating specific disease-causing genetic alterations (23–25).

In cases where the above strategy cannot be used, a stimulus will need to be chosen, taking into account the fact that the root causes of many disorders involve more than a single causative agent or mutation. The optimal phenotypic assays will attempt to appropriately recapitulate this activation complexity through the use of relevant, synergistic stimuli. Examples of this approach can be found in the suite of primary-cell assays addressing inflammatory and autoimmune disorders used to profile the ToxCast library (29). In contrast, the use of less-relevant stimuli, such as broad cytotoxicants [for example, high levels of hydrogen peroxide (H2O2)] to recreate a cellular injury of interest, runs a major risk of missing relevant mechanisms while capturing some that would not translate to patients (Fig. 3A, left) (30). Similarly, selecting a relevant activation method downstream of the disease-causing inputs would theoretically result in the upstream mechanisms being lost to researchers as potential therapeutic avenues (Fig. 3A, right).

Fig. 3 Pitfalls associated with the use of less–disease-relevant stimuli and assay readouts.

(A) Symbolic representation of the consequences derived from choosing either a non–disease-relevant stimulus or a stimulus engaging only part of the disease-relevant pathways. (B) Symbolic representation of the mechanisms captured by different assay readouts for a hypothetical screen aiming to increase the activity of a given protein. Gene expression assay readouts limit the number and type of mechanisms that can be uncovered in a phenotypic screen. PTM, posttranslational modification.


Criterion 3: Assay readout proximity to the clinical end point. The relationship between the assay readout and the clinical end point constitutes the last evaluation criterion. Specifically, we propose that more mechanisms with the potential to translate to clinical efficacy will be identified as the assay readout moves closer to the clinical end point, from basic gene or protein expression to functional or macrophysical manifestations of the disease (Table 1).

The track record of gene expression readouts such as reporter gene assays is lackluster with respect to phenotypic drug discovery; no recent (>1998), first-in-class, small-molecule drug has originated from such an assay (7). A potential explanation is that mechanisms influencing gene expression represent only a fraction of all mechanisms affecting a given phenotype (Fig. 3B). A study comparing hit rates for compound libraries using broad gene expression and multiplexed cytological readouts found only partial overlap between the two sets of hits (31). An in-house study aimed at discovering previously unknown mechanisms leading to the up-regulation of apolipoprotein E (ApoE) secretion compared confirmed hits obtained in the same cellular system using reporter gene and enzyme-linked immunosorbent assay readouts. Although the reporter gene assay successfully identified compounds that provide large increases in ApoE secretion, it missed half of the overall hit set, namely those compounds providing lower but still substantial effects (unpublished data). Furthermore, transcriptional mechanisms have the potential to affect a broader set of cellular pathways and thus may carry a greater safety attrition risk.

When translational biomarkers, or end points that can be monitored clinically to predict efficacy in patients, can be used as assay readouts, they offer data more proximal to clinical end points (13). Mechanisms discovered to influence a given biomarker in vitro would be expected to display a similar effect in vivo and may provide a therapeutic benefit to patients. However, though helpful in many situations, biomarkers are not a panacea in drug development because they have proven difficult to identify and validate for a range of indications (32). As a result, focusing on functional, potentially macrophysical (such as muscle contraction) readouts that reproduce key in vivo disease phenotypes may be more productive. First, the more downstream the readout is, the more mechanisms modulating this phenotype are captured during the screen (Fig. 3B). Second, with an end point closely related to the desired clinical readout, fewer assumptions are built into the assay, leading to compounds and mechanisms that are more likely to translate to patients. Indeed, phenotypic screening with functional readouts extends back decades to Sir James Black, who exploited ex vivo assays using contraction of heart tissue to drive the development of beta blockers (33).


Admittedly, the above criteria will prove difficult to attain in certain cases. However, such phenotypic assays do exist and have proven their translational value. For example, cystic fibrosis drug discovery has benefited from the use of patient-derived airway epithelial cells, offering a compelling assay system that carries an intrinsic disease stimulus. These cells, coupled with readouts measuring channel activity and the airway surface liquid they release, have provided an outstanding in vitro model of the disease, enabling the development of the first drugs to improve lung function in patients bearing the G551D and F508del mutations (23, 24). Similarly, monitoring Plasmodium cell proliferation in primary human erythrocytes led to the discovery of KAE609, an antimalarial agent that exerts its effects through an unprecedented target, ion channel PfATP4; KAE609 recently demonstrated efficacy in patients (34). Last, an in-house phenotypic screen run by using adenosine 5′-triphosphate–stimulated primary human monocytes with interleukin 1β secretion as readout led to the development of clinical candidate CP-456,773 for inflammatory disorders. Interestingly, this compound was recently characterized as an inhibitor of NLRP3 inflammasome formation, a signaling pathway yet to be identified at the time of the screen (35).

Sourcing of primary human cells and tissues, especially patient-derived, for phenotypic screening is crucial given the benefits derived from their use. Their availability has increased noticeably over the past few years through both commercial and not-for-profit sources such as academia-industry collaborations (36). For the most part, such assays are unlikely to be high throughput. However, although a thorough discussion of compound library size for phenotypic screening is outside the scope of this Perspective, it is noteworthy that lacosamide was discovered in the 1990s by using an in vivo epilepsy model with extremely low throughput (10). This example suggests that disease relevance has the potential to trump compound throughput as the critical assay parameter in phenotypic drug discovery.

To increase the odds of clinical translation of compounds and mechanisms identified by phenotypic screening, assays should strive to replicate the disease of interest in terms of the assay system and stimulus while ideally using a miniaturized version of the clinical end point as the assay readout. This approach would minimize the number of assumptions that are implicitly and explicitly made at project initiation and thus partly mitigate our incomplete understanding of human physiology and disease. Accordingly, phenotypic screening may be considered a more humble way of conducting drug discovery and, with the right assay, one more likely to succeed.


Competing interests: All authors are employed by Pfizer.