Identification of type 2 diabetes subgroups through topological analysis of patient similarity


Science Translational Medicine  28 Oct 2015:
Vol. 7, Issue 311, pp. 311ra174
DOI: 10.1126/scitranslmed.aaa9364

Networks work for diabetes

Big problems require big solutions, and for complex diseases such as cancer or diabetes, the big solution is big data. One long-term goal of U.S. President Barack Obama’s Precision Medicine Initiative is to assemble medical and genetic data from at least one million volunteers. But how might researchers use all those data? Li et al. provide one answer by using patient electronic medical records (EMRs) and genotype data from Mount Sinai Medical Center in New York to characterize new subtypes of type 2 diabetes (T2D).

The group first clustered EMR data to identify T2D patients within the larger group. Topological analysis of the T2D group identified three new T2D subtypes on the basis of distinct patterns of clinical characteristics and disease comorbidities. Genetic association analysis identified more than 300 single nucleotide polymorphisms (SNPs) specific to each subtype. The authors found that classical T2D features such as obesity, high blood sugar, kidney disease, and eye disease were limited to subtype 1, whereas other comorbidities such as cancer and neurological diseases were specific to subtypes 2 and 3, respectively. These distinctions might call for tailored treatment regimens rather than a one-size-fits-all approach for T2D. Although a larger sample size is needed to determine causal relationships, this study demonstrates the potential of precision medicine.


Type 2 diabetes (T2D) is a heterogeneous, complex disease that affects more than 29 million people in the United States alone, and its prevalence is expected to rise steadily in the coming decades. Thus, there is a pressing clinical need to improve early prevention and clinical management of T2D and its complications. Clinicians have understood that patients who carry the T2D diagnosis have a variety of phenotypes and susceptibilities to diabetes-related complications. We used a precision medicine approach to characterize the complexity of T2D patient populations based on high-dimensional electronic medical records (EMRs) and genotype data from 11,210 individuals. We identified three distinct subgroups of T2D from topology-based patient-patient networks. Subtype 1 was characterized by the T2D complications diabetic nephropathy and diabetic retinopathy; subtype 2 was enriched for cancer malignancy and cardiovascular diseases; and subtype 3 was associated most strongly with cardiovascular diseases, neurological diseases, allergies, and HIV infections. We performed a genetic association analysis of the emergent T2D subtypes to identify subtype-specific genetic markers and identified 1279, 1227, and 1338 single-nucleotide polymorphisms (SNPs) that mapped to 425, 322, and 437 unique genes specific to subtypes 1, 2, and 3, respectively. When we assessed human disease–SNP associations for each subtype, the enriched phenotypes and biological functions at the gene level matched the disease comorbidities and clinical differences that we identified through EMRs. Our approach demonstrates the utility of applying the precision medicine paradigm in T2D and the promise of extending the approach to the study of other complex, multifactorial diseases.
