Researching Genetic Versus Nongenetic Determinants of Disease: A Comparison and Proposed Unification


Science Translational Medicine  18 Nov 2009:
Vol. 1, Issue 7, pp. 7ps8
DOI: 10.1126/scitranslmed.3000247


Research standards deviate in genetic versus nongenetic epidemiology. Besides some immutable differences, such as the correlation pattern between variables, these divergent research standards can converge considerably. Current research designs that dissociate genetic and nongenetic measurements are reaching their limits. Studies are needed that massively measure genotypes, nongenetic exposures, and outcomes concurrently.


The advent of high-throughput platforms has allowed the massive, accurate measurement of variation across the human genome (1). These approaches, termed “agnostic” in that they do not involve assumptions about which variants influence particular phenotypes, have reshaped our understanding about genetic risk factors for common diseases. Hundreds of new genetic associations with disease risk have been discovered recently (2), and full genome sequencing in population studies is currently under way.

Although massive agnostic testing has become the dominant method for genotype measurements, studies to detect possible links between nongenetic factors (exposures) and phenotypes (outcomes) still follow the paradigm of one-at-a-time or few-at-a-time hypothesis testing. Would a massive testing approach be indicated and feasible for these sorts of studies? Here we discuss whether the hallmarks of genome-wide association studies (GWASs) are desirable for evaluating nongenetic exposures and outcomes (Table 1). This extension has implications for the design of future studies that aim to explain and modify the risk of common diseases.

Table 1. Method comparison.

Successful hallmarks of agnostic genomic measurements, differences in current research on nongenotype measurements, and possibility for amendments to bring nongenotype measurements in line with genotype measurements.



Agnostic genotype measurements cover a large proportion of human genotypic variability, all of which theoretically should be covered eventually by full sequencing of numerous genomes. The correlation pattern of human genome variation is largely known through the HapMap project and is continuously being refined. This correlation pattern is conducive to the identification of relatively specific risk factors. Because the vast majority of genetic variants are noncorrelated (randomly associated in a given genome), most discovered associations are quite specific, whereas linkage disequilibrium (the nonrandom association of variants) allows for some limited redundancy that can aid discovery. Rigorous quality checks are routinely applied to genome-wide data, which yield genotypic measurements with very low error rates. In addition, genotypes are stable during a person’s life span, so that such measurements need be made only once. Furthermore, GWASs routinely account for the multiplicity of comparisons by making appropriate adjustments to avoid false positives. False-discovery–rate control and Bayesian approaches, which probe the credibility of associations, are also used.
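As a minimal sketch of what a multiplicity adjustment looks like in practice, the Bonferroni correction divides the nominal significance level by the number of tests performed. The article does not say which adjustment any given GWAS uses; Bonferroni is chosen here only because it is the simplest, and the function names and the figure of one million tests are assumptions made for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag which P values survive correction for len(p_values) tests."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Hypothetical scenario: one million tested variants at a nominal alpha of 0.05.
genome_wide_threshold = bonferroni_threshold(0.05, 1_000_000)
```

Under these assumed numbers the per-test threshold is 0.05 divided by 1,000,000, which is far stricter than the uncorrected P < 0.05 used in most nongenetic studies.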

Because the genetic effects of common variants are subtle and require large sample sizes to uncover, massive genome-wide testing is most successful if done in large collaborative studies (3, 4). In theory, with massive testing platforms, all data can be reported together and shared in public, thus reducing reporting biases. Finally, genetic variation can be used to test whether there is a causal relation between an exposure and an outcome using observational rather than experimental data, because polymorphisms are randomly transferred to gametes (Mendelian randomization) (5). This brings genetic epidemiology closer to randomized clinical research.
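One common way to turn the Mendelian randomization idea into a number is the Wald ratio estimator, shown here as a hedged illustration rather than as the method of any study cited in this article. The symbols are assumptions introduced here: G denotes a genetic variant, X the exposure, and Y the outcome.

```latex
\hat{\beta}_{X \to Y} \;=\; \frac{\hat{\beta}_{GY}}{\hat{\beta}_{GX}}
```

Here \hat{\beta}_{GY} is the estimated association of the variant with the outcome and \hat{\beta}_{GX} is its estimated association with the exposure. The ratio can be read as a causal effect only if the variant influences the outcome solely through the exposure and is independent of confounders.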

Despite these attractive features, GWASs have their limitations. Even though numerous common genetic risk variants have been identified, they explain only a small portion of the variability in risk between individuals (6, 7). Furthermore, markers identified through GWASs do not immediately lead to an understanding of the underlying biology (8), and markers with strong statistical support often have substantial variability in the size of their effects across different populations (8). Measurement of genomic features other than genotypes might reveal information about environmental exposures. For example, epigenetic markers might add to our understanding of how genetic and environmental factors interact to cause disease (9). However, not every gene/environment interaction results in or produces easily accessible epigenetic modifications, and these changes may be reversible. Given these limitations, massive genome measurements alone will probably not be sufficient to fully explain disease risk.


Current practices for measuring nongenetic exposures and outcomes differ considerably from the successful GWAS hallmarks (Table 1).

Measurement platforms. Most traditional epidemiological case-control studies measure one or a few exposures and outcomes. Cohort studies allow for a wider variety of measurements. For example, 19 cohorts participating in the Public Population Project in Genomics (P3G) consortium (10) have typically yielded information about hundreds (range, 53 to 1266) of variables. In contrast, recently designed biobanks (repositories of biological samples) record only limited details of life style, nutritional, and behavioral exposures. They may partly compensate for this limited information-gathering with large-scale laboratory measurements from stored blood samples, but even these typically pertain to a single baseline visit. Moreover, although there are only a few available platforms for GWASs, and data can be standardized to be comparable across platforms (for example, by imputation), for nongenetic measurements there are often many different instruments, assays, and questionnaires available. Sometimes this poses a challenge in harmonizing already-collected data to make them comparable across different studies. Collaboration among many research teams also requires addressing intrainstitutional and international ethical and legal issues (such as data sharing and intellectual property).

Correlation pattern. Nongenomic variables (including both exposures and outcomes) are densely correlated among themselves (11). The correlation web is so dense, and the direction of effect (what “causes” what) so unclear, that often it is impossible to say whether a variable is (only) an exposure, (only) an outcome, or just a correlate.

We examined the correlation pattern between different types of nongenetic variables in the Singapore Prospective Cohort 2 (SP2) (12) (for details, see the supporting online material). Overall, 78% of the correlations between 10 ordinal (ordered, with discrete values) variables, 63% of the correlations between 24 continuous nonnutrient variables, and 94% of the correlations between 19 continuous nutrient variables were nominally statistically significant (P < 0.05). Furthermore, nominally statistically significant Pearson correlation coefficients were seen in 37, 62, and 62% of the 14,196 evaluable pairs of variables generated by the 169 food-frequency variables in three population groups. Thus, in five of these six data sets, an analysis of two randomly selected variables would be more likely to give statistically significant rather than nonsignificant results. Usually, the significant coefficients were of small magnitude. However, for variables regarding the total daily intake of various nutrients, 77% of the correlations had absolute correlation coefficient values exceeding 0.2, suggesting that most pairs of these nutritional variables inherently have nonnegligible correlations among themselves. In the dense web of correlations shown schematically in Fig. 1A (a “correlation globe”), a few variables (alcohol, monounsaturated fat, and whole-grain consumption) are not strongly correlated with others, but most variables have modest-to-strong correlations among themselves (Fig. 1A). This dense correlation pattern creates extensive redundancy and difficulty in pinpointing any one independent association. Given this pattern, it is suboptimal to model variables in linear models that assume independent effects (done by the vast majority of epidemiological and clinical research studies).
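The kind of tally reported above (the share of variable pairs with a nominally significant Pearson correlation) can be sketched as follows. This is not the authors' analysis code: the function names are hypothetical, the input is assumed to be a dictionary of equal-length numeric columns with no missing values, and the P value uses a Fisher z-transform normal approximation rather than whatever exact test the original analysis applied.

```python
from itertools import combinations
from math import atanh, sqrt
from statistics import NormalDist


def n_pairs(k: int) -> int:
    """Number of distinct unordered pairs among k variables."""
    return k * (k - 1) // 2


def pearson_r(x, y):
    """Pearson correlation coefficient (assumes neither column is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def approx_p_value(r: float, n: int) -> float:
    """Two-sided P value via the Fisher z approximation (requires n > 3)."""
    if abs(r) >= 1.0:
        return 0.0
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def fraction_significant(columns, alpha=0.05):
    """Share of variable pairs whose correlation has P < alpha."""
    pairs = list(combinations(columns, 2))
    hits = 0
    for a, b in pairs:
        r = pearson_r(columns[a], columns[b])
        if approx_p_value(r, len(columns[a])) < alpha:
            hits += 1
    return hits / len(pairs)


# 169 food-frequency variables give 169 * 168 / 2 = 14,196 pairs.
assert n_pairs(169) == 14196
```

The pair count grows quadratically with the number of variables, which is why even a modest questionnaire yields thousands of testable correlations.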

Fig. 1. Correlation globes.

A web shows the absolute magnitude of correlation coefficients between different variables. Each variable is shown by a node, and the thickness of the links is proportional to the absolute correlation coefficient. Coefficients with an absolute value <0.025 have no link at all. (A) Nineteen continuous variables regarding daily nutrient uptake from the SP2 cohort: calories, protein, saturated fat, monounsaturated fat, polyunsaturated fat, total fat, cholesterol, carbohydrates, iron, calcium, vitamin A, vitamin C, fiber, servings of fruit, servings of vegetables, servings of meat and alternatives, servings of rice and alternatives, servings of whole grains, and standard drinks of alcohol (n = 6825 participants). (B) Seven metabolic outcome variables (insulin, fasting glucose, total cholesterol, triglycerides, HDL cholesterol, LDL cholesterol, and creatinine), diabetes mellitus (DM), and three of the daily nutritional intake variables of Fig. 1A (subset of n = 4717 participants with metabolic measurements). (C) Eight randomly selected single-nucleotide polymorphisms from HapMap. Variables that are classically considered as exposures are in red, variables that are surrogate metabolic outcomes are in yellow, and diabetes (a clinical outcome) is in blue.
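The drawing rule described in this caption can be sketched as a simple filter over pairwise coefficients: links below the 0.025 cutoff are dropped, and the remaining links get a width proportional to the absolute coefficient. The function name, the input layout (a dictionary keyed by variable pairs), and the `max_width` scaling constant are assumptions for illustration; the article does not describe its plotting software.

```python
def globe_edges(corr, min_abs_r=0.025, max_width=10.0):
    """Return (var_a, var_b, width) links for a correlation globe.

    corr maps (var_a, var_b) tuples to correlation coefficients.
    Pairs with |r| below min_abs_r get no link; otherwise the width
    is proportional to |r| (max_width corresponds to |r| = 1).
    """
    edges = []
    for (a, b), r in corr.items():
        if abs(r) < min_abs_r:
            continue  # too weak to draw
        edges.append((a, b, max_width * abs(r)))
    return edges


# Hypothetical coefficients, not values from the SP2 cohort.
example = globe_edges({
    ("total fat", "calories"): 0.5,
    ("alcohol", "calcium"): -0.01,
    ("fiber", "cholesterol"): -0.2,
})
```

In this made-up example the alcohol and calcium pair falls below the cutoff and is omitted, while the negative fiber and cholesterol coefficient still produces a link because only the absolute value matters.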

Even when outcomes can be separated from exposures, there are several layers of outcomes. Figure 1B shows the correlation globe for seven metabolic outcome variables, a clinically important outcome (diabetes), and three nutritional variables. Clearly, claims of specificity seem inopportune here. The same dense correlation extends to hard clinical outcomes, where several investigators have already shown the complexity and dense connections within the “phenome” (13, 14) or “diseasome” (15, 16). Comparatively, no dense correlations are seen when eight single-nucleotide polymorphisms are randomly selected from the HapMap (Fig. 1C).

Measurement error. Measurement error can range from practically 0% (for example, for death in a study with links to death certificates) to extremely high. For some variables, fully objective measurements are impossible. For example, three variables from the SP2 cohort study (the amount of togetherness, degree of support and understanding, and the amount that one talks things over) are subjective appraisals. Not surprisingly, research efforts on subjective experiences have poor replication records, even when accurate genome measurements are involved (17). When research requires two steps, one with a 0.00001% error rate and one with a 60% error rate, the latter determines the fate of the research effort. Other variables are subject to recall errors (questions on drinking patterns) or purposeful evasion (questions on household earnings).
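The two-step example above can be made concrete with a small calculation. Assuming, as a simplification not stated in the article, that the two steps err independently and that a result is wrong if either step is wrong, the combined error rate is one minus the product of the two accuracy rates. The function name is hypothetical.

```python
def combined_error_rate(*step_error_rates: float) -> float:
    """Probability that at least one step errs, assuming independent steps."""
    all_correct = 1.0
    for e in step_error_rates:
        if not 0.0 <= e <= 1.0:
            raise ValueError("error rates must lie in [0, 1]")
        all_correct *= 1.0 - e
    return 1.0 - all_correct


# 0.00001% expressed as a proportion is 1e-7; 60% is 0.6.
overall = combined_error_rate(1e-7, 0.6)
```

Under these assumptions the overall error rate is essentially 60%: the near-perfect step contributes almost nothing, so improving it further cannot rescue the analysis.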

Quality checks. Ambiguous “gray” measurements are common in nongenetic studies. Missing information also affects several types of epidemiological research. For survey questionnaires, response rates are almost never in the range considered acceptable for genotype measurements (>90%), and values of <50% are deemed satisfactory for some surveys (18). Of the 243 SP2 variables that could have been measured in theory for all relevant participants, 13 (5.3%) had missing values for >30% of participants, and 31 (12.8%) had missing values for >1% of participants.
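A missingness audit like the one summarized above can be sketched in a few lines. The data layout (a dictionary of columns with `None` marking a missing value) and the function names are assumptions for illustration, not the structure of the SP2 data set.

```python
def missing_fraction(values) -> float:
    """Share of entries in one variable that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def share_of_variables_above(columns, threshold: float):
    """Return (share, names) of variables whose missingness exceeds threshold."""
    flagged = [name for name, values in columns.items()
               if missing_fraction(values) > threshold]
    return len(flagged) / len(columns), flagged


# The percentages in the text follow from the counts: 13 of 243 and 31 of 243.
pct_over_30 = round(100 * 13 / 243, 1)
pct_over_1 = round(100 * 31 / 243, 1)
```

Running such an audit on every variable before analysis makes the extent of missing information visible, in the same way that call-rate filters do for genotype data.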

Stability of information. Unlike genotypes, nongenotypic characteristics change over time, as do gene-related attributes such as RNA, protein, and metabolite concentrations and epigenetic modifications. Figure 2 shows the mean absolute change in the values of selected variables for participants in the SP2 cohort from 2004 to 2007, as compared with the values for the same people when they were examined for the 1998 National Health Survey. Some of the variables changed on average more than 100% in less than a decade; average absolute changes exceeding 30% were ubiquitous, except for body-mass index and blood pressure.

Fig. 2. National Health Survey.

Mean absolute percentage change in variables measured in the SP2 cohort in 2004–2007 as compared with the National Health Survey measurements on the same individuals in 1998 (35). Abbreviations of the variables are explained in table S5 of the supporting online material.
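One plausible way to compute the quantity plotted in Fig. 2 is to take, for each participant, the absolute difference between the two visits as a percentage of the baseline value, and then average across participants. The article does not spell out the exact formula or how zero baselines and missing values were handled, so the choices below (and the function name) are assumptions.

```python
def mean_abs_pct_change(baseline, follow_up) -> float:
    """Mean of |follow_up - baseline| / |baseline| * 100 over paired values.

    Pairs with a missing value (None) or a zero baseline are skipped,
    which is an assumption made here, not a rule taken from the article.
    """
    changes = [
        abs(f - b) / abs(b) * 100
        for b, f in zip(baseline, follow_up)
        if b is not None and f is not None and b != 0
    ]
    if not changes:
        raise ValueError("no usable pairs")
    return sum(changes) / len(changes)


# Hypothetical values for one variable measured at two visits.
change = mean_abs_pct_change([10, 20, None], [15, 10, 7])
```

Because the absolute value is taken before averaging, increases and decreases do not cancel; a variable can show a large mean absolute change even when its population mean is unchanged.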


Although case-control designs have worked well for GWASs, they face limitations for nongenetic exposures for which the risk is related to the pattern of lifetime exposure rather than to a single exposure. Study design should also take into account latency and age of susceptibility. For example, the latency period between the onset of smoking and cancer is up to 30 years, and intense exposure to ionizing radiation in the atomic bombing of Hiroshima and Nagasaki had no effect on breast cancer risk in women over 40 years of age. Many exposure biomarkers have been proposed in various fields, such as nutritional epidemiology (19). Biomarkers of chronic exposures are theoretically very useful. Unfortunately, such biomarkers are rare and often restricted to exposures to toxic chemicals that must be measured with complex equipment. Thus, it is necessary to collect samples repeatedly for exposure biomarkers in a cohort study.

Multiplicity considerations. Multiplicity considerations remain unpopular in nongenomic research, despite long debates about their necessity. There is resistance to abandoning emphasis on uncorrected P values. The number of times an analysis was performed usually is unknown, as there is no standard protocol and log of conducted analyses. Some journals have replaced P values with confidence intervals, but this does not help, because most people mistakenly view a 95% confidence interval that is entirely on one side of the null in the same way that they view a P value <0.05: as proof of association. Empirical data show that most nominally statistically significant associations in traditional epidemiology have modest Bayes factors at best, which do not improve much the credibility of the proposed association (20). Researchers commonly generate serial hypotheses while chipping away at a database. One could use a typical cohort to publish a few thousand nominally significant confounded correlations in several hundreds of publications. Each link in a correlation globe (Fig. 1) might have fueled thousands of papers and hundreds of researchers’ careers.
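To see why a nominally significant P value may carry only modest evidential weight, one can compute a minimum Bayes factor. The sketch below uses Goodman's bound, exp(-z²/2), which is a technique swapped in here for illustration; the empirical work cited above (20) may define its Bayes factors differently. The function names and the prior probabilities are assumptions.

```python
from math import exp
from statistics import NormalDist


def min_bayes_factor(p_two_sided: float) -> float:
    """Goodman's minimum Bayes factor for the null: exp(-z^2 / 2).

    Smaller values mean stronger evidence against the null; this is the
    most favorable bound for the alternative, so real support is weaker.
    """
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return exp(-z * z / 2)


def posterior_prob_true(prior_prob_true: float, bayes_factor_null: float) -> float:
    """Posterior probability that the association is real."""
    prior_odds_null = (1 - prior_prob_true) / prior_prob_true
    posterior_odds_null = prior_odds_null * bayes_factor_null
    return 1 / (1 + posterior_odds_null)


bf = min_bayes_factor(0.05)              # roughly 0.15
posterior = posterior_prob_true(0.10, bf)  # hypothetical 10% prior
```

With an assumed prior probability of 10% that the association is real, even this best-case bound leaves the posterior probability below one half for P = 0.05, which illustrates why uncorrected P values near the threshold are weak support.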

Large-scale and collaborative efforts. There are many large, well-designed, highly productive cohort studies that focus mostly on nongenetic exposures, some of which are also collaborative and involve many partners (21). However, even these collaborative efforts usually function as a single study, and the discovery and replication phases are not systematically linked. Collaboration between large cohort studies is probably less common than collaboration between different teams in genome-wide investigations, for which many consortia have efficiently standardized genotypes and harmonized phenotypes of interest (for example, the Genomewide Investigation of Anthropometric Measures Consortium and the Wellcome Trust Case Control Consortium). There are some efforts to unite all nongenetic cohort studies working on the same topic, typically retrospectively (22–24). Prospective efforts such as the Asian Cohort Consortium and P3G hold more promise than the retrospective ones.

Comprehensive reporting. Usually, traditional cohort studies report on one or a few associations at a time in separate papers. Data sharing is difficult or even discouraged. The extent of measurable information for most cohorts’ databases remains nontransparent, and selective reporting is problematic (25). Even the Strengthening the Reporting of Observational Epidemiology guidelines (26) have not asked for a listing of all measurements performed in an epidemiological study.

Randomization. Although Mendelian randomization for genetic risk factors occurs by nature, for nongenetic risk factors any attempt to randomize harmful exposures is unethical. For more contentious exposures with unknown effects, some epidemiologists have argued that randomized trials are suboptimal and that observational studies are the gold standard, because they capture real-life situations (27). Others have even resisted adopting the findings of large, well-designed randomized trials of life style or nutritional interventions when these contradicted previous observational results (28).


Among the hallmarks discussed above (Table 1), the dense correlation pattern of nongenetic variables is inherent and impossible to change. This is not necessarily a drawback. It may allow easier pinpointing of an association through some correlate of a truly associated variable. Correlates act as mirrors of the true associations. However, claims about causal relations are notoriously precarious. Not surprisingly, debates on what constitutes causality are endless and fruitless. Correlated effects may be the best we can detect. However, instead of examining correlations pairwise, clinical and epidemiological researchers might need to examine entire correlation globes between variables in a given field. Such correlation patterns may help to reveal the specificity of detected associations. Moreover, some of the observed correlations might be present and strong in diverse situations (what we will call “meta-stable associations”), whereas others might be situation-specific or change rapidly over time, setting, or population (“meta-unstable associations”). Instead of trying to reach an immediate absolute truth about an association, we might need to assess it over time. For example, being overweight has modestly meta-unstable associations with disease outcomes. But decreasing body weight in overweight people might have different effects (beneficial, neutral, or even harmful) at different periods (for example, in the 1970s, now, and in the mid-21st century) and settings (29, 30); conversely, the risk of tobacco causing lung cancer is apparently a meta-stable association.

Even genotype effects on disease risk might turn out to be meta-unstable, regardless of how robust their current statistical support appears. For example, if the translation of a genetic effect into clinical risk also requires some kind of environmental exposure, the genetic effect may no longer be seen if the necessary environmental exposure is aborted.


For most of the differences between current analyses of genome-wide and nongenetic variables, there is some room for alignment, but full convergence is difficult. Exposure and outcome data can be enriched with meticulous comprehensive capture of information and repeated collection of information from study participants. Linkage to existing registries can also improve the amount of outcome data. Electronic medical records, personally maintained medical records, and electronic epidemiology [e-epidemiology (31), the collection of information through mobile phones, e-mails, and other electronic means] might increase data mass, although quality should also be safeguarded.

Unfortunately, most biobanks that aim to collect biological material for genomic measurements collect limited nongenetic exposure and outcome information. A new generation of large cohorts and biobanks could be constructed with explicit detailed collection of massive exposure and outcome data. The quantity and quality of the information might be improved with repeated measurements via multiple sources, and efforts should be made to improve quality checks and reduce the amount of missing information and measurement error. Moreover, despite dissent, randomized trials for life style, nutritional, and behavioral exposures have been conducted successfully.


Multiplicity correction, Bayesian, and false-discovery–rate approaches can be adopted in analyses involving nongenomic measurements. These approaches might improve the inferential ability of this research, resulting in fewer refuted “significant” findings. The paradigm of large-scale and collaborative efforts can also be extended beyond genome-wide investigations. Comprehensive reporting of all analyses and results, both positive and negative, is also essential and should be feasible if the will exists and incentives are given for transparency. Online database tools are readily available, and data sharing can be improved while maintaining data integrity, confidentiality, and the attribution of proper credit to primary investigators (3).
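As one example of how false-discovery–rate control could be applied to a batch of exposure-outcome tests, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure. It is offered as an illustration of the general approach, not as the specific method the authors recommend; the function name and the sample P values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test is declared a discovery.

    Step-up rule: sort the m P values, find the largest rank k with
    p_(k) <= (k / m) * q, and reject the k smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical P values from four exposure-outcome tests.
discoveries = benjamini_hochberg([0.041, 0.001, 0.5, 0.008], q=0.05)
```

In this made-up example, the test with P = 0.041 would pass an uncorrected 0.05 threshold but is not declared a discovery, because its rank-adjusted cutoff is 0.0375.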


Table 2 categorizes study designs based on their number of participants, the extent of coverage they afford for exposures (“exposurome”), genome data, and outcomes (“phenome”) and whether they entail randomization. Traditional experimental designs, for both GWASs and nongenetic studies, will continue to generate useful information, but are probably reaching their limits; indeed, the accumulated incremental discoveries to date fail to explain a large proportion of the risk for many diseases.

Table 2. A new taxonomy for current and future clinical and epidemiological research.

To extend these limits, new designs should generate comprehensive and transparently accumulated and reported evidence on all three components (exposurome, genome, and phenome) and also encompass the benefits of randomization, whenever possible. Study designs with more comprehensive data collection than is currently the standard should permit scientists to address more and increasingly sophisticated questions than is possible with studies with a more limited spectrum. The availability of outcome-wide and exposure-wide information does not preclude also conducting focused, specific hypothesis–testing analyses within such agnostic designs.

One must consider the huge bioinformatics task entailed, the computing strategies and hardware needs, and the costs of data storage, analysis, and dissemination in order to ensure feasibility and efficiency in such huge endeavors. A new generation of biobanks may merge the sophistication of genomic measurements with the detailed capture of accurate information on nongenetic exposures and outcomes in the range of 10,000 or more variables. This task is not trivial, either in expense or feasibility. Nesting randomized trials—that is, performing such trials on consenting subsets of the populations enrolled in large cohorts—might further increase the information yield (32). Such trials would benefit from the availability of accurate outcome information. Settings such as Scandinavia or Singapore, where there are preexisting, comprehensive, and accurate registries of outcomes (33, 34), would have an advantage in this regard. The maintenance of high-quality information over time is an extra challenge. Thus, researchers leading large projects should be prepared to change the way exposure data are collected over time and keep up with advances in genome measurement technology. Massive biobanking will need to consider a growing list of emerging types of measurements, including (but not limited to) transcriptomics, metabolomics, proteomics, and diverse imaging assessments. If these challenges are successfully met, the stability of risk associations for modifiable exposures could also be monitored over time. Meeting these challenges should also aid in understanding how the knowledge of disease risk can be translated to health benefits.


J.P.A.I. had the original idea and wrote the first draft of the article, which was commented on by the other authors. E.Y.L. and J.P.A.I. performed statistical analyses and generated graphics.

Supplementary Material

Supplementary Text

Tables S1 to S5


  • Citation: J. P. A. Ioannidis, E. Y. Loy, R. Poulton, K. S. Chia, Researching genetic versus nongenetic determinants of disease: A comparison and proposed unification. Sci. Transl. Med. 1, 7ps8 (2009).

