Editors' Choice: Computational Biology

CoINcIDE: All together now

Science Translational Medicine  13 Apr 2016:
Vol. 8, Issue 334, pp. 334ec61
DOI: 10.1126/scitranslmed.aaf6940

Central to the U.S. precision medicine initiative is the notion that Big Data, properly collected and analyzed, will reveal subtypes of diseases we currently view as single disorders. But what about small data sets? Is there translational value to meta-analyses of existing or newly generated data sets that aren’t part of large studies? Now, Planey and Gevaert bring together small data sets and show that the answer is yes.

The investigators' aim was to identify subtypes of disease, but many of the studies they used were too small to reveal robust subtypes on their own. So they developed a computational meta-analysis procedure called CoINcIDE, which uses clustering to identify disease subtypes within each individual data set and then finds cross–data set clusters of similar subtypes. Performing clustering within studies followed by cross-study comparison sidesteps obstacles such as platform or study effects that could hinder subtype discovery. Cross–data set analysis with CoINcIDE identified robust disease subtypes in both simulated and patient biopsy data from studies of breast and ovarian cancer. The authors performed the analysis of ovarian cancers both with and without the largest data set, TCGA. With 578 ovarian cancers, TCGA was more than twice as large as any other data set. Results were consistent whether or not TCGA was included, highlighting the power of smaller data sets paired with the CoINcIDE method for subtype discovery.
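
As a rough illustration of the cluster-then-compare idea, and not the authors' implementation, the Python sketch below standardizes and clusters each simulated study separately, then links clusters from different studies whose centroids are strongly correlated. The k-means routine, the correlation threshold `min_corr`, the union-find grouping, the simulated data, and all function names are assumptions made for illustration; the published method may differ in each of these choices.

```python
import numpy as np


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(k - 1):
        # Squared distance from every sample to its nearest chosen center.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers


def cross_dataset_subtypes(datasets, k=2, min_corr=0.5):
    """Cluster each study alone, then group similar clusters across studies.

    Hypothetical helper: returns a list of groups, each a list of
    (dataset_index, cluster_index) pairs.
    """
    centroids = []
    for i, X in enumerate(datasets):
        # Per-gene standardization inside each study removes study-specific
        # offsets and scales before any cross-study comparison is made.
        Z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-9)
        _, centers = kmeans(Z, k)
        centroids += [(i, c) for c in centers]

    # Union-find over clusters; an edge joins clusters from different studies
    # whose centroids have Pearson correlation >= min_corr.
    n = len(centroids)
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a in range(n):
        for b in range(a + 1, n):
            (da, ca), (db, cb) = centroids[a], centroids[b]
            if da != db and np.corrcoef(ca, cb)[0, 1] >= min_corr:
                parent[find(a)] = find(b)

    groups = {}
    for a in range(n):
        # Each study contributes k centroids in order, so a % k is the
        # cluster index within its study.
        groups.setdefault(find(a), []).append((centroids[a][0], a % k))
    return sorted(groups.values())


def simulate(rng, n_per_subtype=15, n_genes=20):
    """Toy study: two subtypes with opposite signatures plus a platform effect."""
    half = n_genes // 2
    signature = np.r_[np.ones(half), -np.ones(n_genes - half)]
    signal = np.vstack([np.tile(signature, (n_per_subtype, 1)),
                        np.tile(-signature, (n_per_subtype, 1))])
    noise = rng.normal(0.0, 0.3, signal.shape)
    offset = rng.normal(0.0, 5.0, n_genes)  # per-gene platform shift
    scale = rng.uniform(0.5, 2.0)           # per-study dynamic range
    return scale * (signal + noise) + offset


rng = np.random.default_rng(0)
studies = [simulate(rng) for _ in range(3)]
for group in cross_dataset_subtypes(studies):
    print(group)
```

Because each study is standardized and clustered on its own, the per-study offset and scale added in `simulate` never enter the cross-study comparison, which is the sense in which this kind of design sidesteps platform or study effects.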

The CoINcIDE analyses add to the literature on ovarian-cancer subtypes associated with survival; however, work to confirm the predictive nature of these subtypes or to identify effective subtype-specific treatments remains to be done. As a next step, the biological basis of the gene expression signals that give rise to the CoINcIDE subtypes needs to be identified, and the underlying mechanisms should be determined.

Vincent van Gogh wrote, “Great things are not done by impulse, but by a series of small things brought together.” CoINcIDE might empower researchers to perform subtype discovery for rare diseases, a task that would otherwise be impossible for a single lab or initiative because of the small numbers of samples. By developing a computational method that brings together small data sets, the authors transform the vast array of disease-relevant, publicly available data into opportunities for subtype discovery.

C. R. Planey, O. Gevaert, CoINcIDE: A framework for discovery of patient subtypes across multiple data sets. Genome Med. 10.1186/s13073-016-0281-4 (2016).
