Research Article: METABOLOMICS

Urinary metabolic signatures of human adiposity


Science Translational Medicine  29 Apr 2015:
Vol. 7, Issue 285, pp. 285ra62
DOI: 10.1126/scitranslmed.aaa5680


Obesity is a major public health problem worldwide. We used 24-hour urinary metabolic profiling by proton (1H) nuclear magnetic resonance (NMR) spectroscopy and ion exchange chromatography to characterize the metabolic signatures of adiposity in the U.S. (n = 1880) and UK (n = 444) cohorts of the INTERMAP (International Study of Macro- and Micronutrients and Blood Pressure) epidemiologic study. Metabolic profiling of urine samples collected over two 24-hour time periods 3 weeks apart showed reproducible patterns of metabolite excretion associated with adiposity. Exploratory analysis of the urinary metabolome using 1H NMR spectroscopy of the U.S. samples identified 29 molecular species, clustered in interconnecting metabolic pathways, that were significantly associated (P = 1.5 × 10−5 to 2.0 × 10−36) with body mass index (BMI); 25 of these species were also found in the UK validation cohort. We found multiple associations between urinary metabolites and BMI including urinary glycoproteins and N-acetyl neuraminate (related to renal function), trimethylamine, dimethylamine, 4-cresyl sulfate, phenylacetylglutamine and 2-hydroxyisobutyrate (gut microbial co-metabolites), succinate and citrate (tricarboxylic acid cycle intermediates), ketoleucine and the ketoleucine/leucine ratio (linked to skeletal muscle mitochondria and branched-chain amino acid metabolism), ethanolamine (skeletal muscle turnover), and 3-methylhistidine (skeletal muscle turnover and meat intake). We mapped the multiple BMI-metabolite relationships as part of an integrated systems network that describes the connectivities between the complex pathway and compartmental signatures of human adiposity.


The prevalence of obesity and being overweight is rising worldwide and now affects a large proportion of the adult population in the United States, UK, and many other countries (1, 2). Adiposity is associated with increased mortality from heart disease, stroke, diabetes, and cancer (3), contributing to an estimated 3.4 million deaths per year worldwide (4). However, the mechanisms by which adiposity may contribute to increased risk of these diseases are not fully understood. Better knowledge about the effects of adiposity on human metabolism and the complex metabolic pathway connections underlying them is needed to provide new insights into perturbations that link adiposity to human disease risks. This in turn may help efforts to control the obesity epidemic and its sequelae through better targeting of preventive efforts to those most at risk.

Exploratory metabolic phenotyping offers a powerful means of capturing systems-level information that reflects both genetic (host, gut bacterial metagenome) and environmental influences, hence helping to elucidate the metabolic disturbances and pathways associated with chronic disease (5, 6). For example, we have demonstrated use of a metabolic phenotyping approach for discovering urinary molecular markers associated with interindividual blood pressure variation and cardiovascular disease risk (7). Here, we apply this approach to elucidate the urinary metabolic signatures associated with body mass index (BMI) and adiposity in the U.S. and UK cohorts of the International Study of Macro- and Micronutrients and Blood Pressure (INTERMAP) epidemiologic study.


We have used a unique collection of urinary aliquots sampled from pooled timed 24-hour urine collections obtained over two separate 24-hour periods from men and women in the population-based INTERMAP epidemiologic study (7–9) in the United States and UK. We used these urine samples to delineate a metabolic signature of adiposity that was reproducible in individuals over time and in different populations (Fig. 1). We applied untargeted proton (1H) nuclear magnetic resonance (NMR) spectroscopy to aliquots from the two 24-hour urinary collection periods and also performed targeted analyses of urinary amino acids and related compounds using ion exchange chromatography (IEC). We analyzed urinary amino acids because of the reported importance of amino acids, especially branched-chain amino acids (BCAAs), in adiposity (10–16). We identified metabolites significantly and reproducibly associated with BMI after correction for multiple testing and located these in an interconnecting systems framework. The few previous metabolic phenotyping studies of human adiposity have analyzed serum or plasma metabolite profiles (11–16) or urine from small numbers of insulin-resistant, morbidly obese, and lean individuals (17, 18). Here, we take a population-wide approach using BMI as a continuous measure across the full range of BMI from lean to obese individuals. Use of 24-hour urine samples as reported here has an advantage over single blood or spot urine samples collected at one point in time, because it captures the end products of metabolism integrated over a 24-hour time period and therefore is not influenced by sampling time and diurnal variation in metabolite concentrations (19).

Fig. 1. Schema of study design.

Individuals were recruited into the study from general and occupational population samples in the United States and UK. Each individual made four clinic visits, the first two on consecutive days and the second two on consecutive days 3 weeks later. At each visit, blood pressure was measured twice, and a complete 24-hour dietary recall was obtained by trained interviewers. Data on height, weight, and extensive questionnaire information were also obtained. A timed 24-hour urine collection was commenced at the first (and third) visit and completed at the second (and fourth) visit the following day. Each 24-hour urine collection was mixed together, and urine aliquots were obtained from the pooled urine sample. Urinary aliquots from individuals with complete data were measured with 1H NMR spectroscopy.

We have included data on 1880 individuals (962 men and 918 women) ages 40 to 59 years from eight American population samples in INTERMAP with a replication data set of 444 individuals (245 men and 199 women) ages 40 to 59 years in INTERMAP UK (8) (Fig. 1). We obtained height and weight measurements for the calculation of BMI, extensive covariate information (for example, demographic variables and medical and life-style factors), as well as data on dietary variables and energy intake from four interviewer-administered multipass 24-hour dietary recalls per person (8, 9) (table S1).

We present our results as partial correlations between BMI and urinary metabolites along with P values to indicate the strength of associations and Q values to indicate the false discovery rate (see below). This approach made no assumptions as to whether the observed cross-sectional associations were causally related or were the consequence of adiposity. Rather, our results captured the metabolic signatures of the multitude of dietary, environmental, and other life-style characteristics associated with the obesity epidemic. We used the eight U.S. population samples as the discovery set and calculated the partial correlation of individual BMI against (i) each of 7100 1H NMR detected spectral variables and (ii) the targeted set of urinary amino acids and related compounds measured by IEC. In multiple regression analyses, we adjusted initially for age, gender, and population sample (defined as model 1, see footnotes of tables and Materials and Methods) because of the known associations of these variables with both BMI and metabolite concentrations (7, 15, 20, 21). For these discovery analyses, we used a Q value (22) threshold of 1% based on the Storey-Tibshirani false discovery rate (23) (ST-FDR) to identify spectral variables significantly associated with BMI after correction for multiple testing (also see Materials and Methods). For a spectral variable to be retained, both the variable itself and the two adjacent spectral variables needed to pass the Q value threshold of 1% (to avoid spurious findings based on a single data point) in addition to the correlation with adiposity of all three variables being in the same direction (that is, the correlation is either less than 0 or greater than 0 for all three variables). 
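The retention rule described above for spectral variables can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `retained_variables`, the toy Q values, and the decision to skip the first and last variables (which have only one neighbor) are assumptions made here.

```python
def retained_variables(q_values, correlations, threshold=0.01):
    """Return indices of spectral variables that pass the three-point rule.

    A variable is kept only if it and both adjacent variables have
    Q <= threshold and all three partial correlations share one sign.
    Edge variables are skipped (an assumption; the text does not say).
    """
    keep = []
    for i in range(1, len(q_values) - 1):
        window = (i - 1, i, i + 1)
        all_pass = all(q_values[j] <= threshold for j in window)
        all_positive = all(correlations[j] > 0 for j in window)
        all_negative = all(correlations[j] < 0 for j in window)
        if all_pass and (all_positive or all_negative):
            keep.append(i)
    return keep


# Toy values for illustration only (not INTERMAP data).
q = [0.001, 0.002, 0.003, 0.5, 0.001, 0.001, 0.001]
r = [0.1, 0.2, 0.1, 0.3, -0.1, 0.2, -0.1]
print(retained_variables(q, r))  # [1]
```

In the toy input, index 5 is rejected even though all three Q values pass, because the signs of the three correlations disagree.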
We then further adjusted for multiple potential confounders including history of heart disease or stroke, physical activity, medication and dietary supplement use, special diet, smoking, education (as an indicator of socioeconomic status), and total energy intake (defined as model 2, see footnotes of tables and Materials and Methods). Finally, we additionally adjusted for urinary creatinine (model 3) as a marker of meat intake (24, 25) and muscle turnover (26), because BMI correlates with muscle mass as well as fat mass (27). BMI-metabolite associations that decreased or became apparent after correction could reflect a link between (skeletal) muscle mass and metabolites.
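One common way to obtain a covariate-adjusted partial correlation of the kind reported here is to regress both variables on the covariates and correlate the residuals. The sketch below assumes that approach; the function names and toy numbers are hypothetical, and the source does not state which software or algorithm was used. Categorical covariates such as gender or population sample would need to be dummy-coded before being passed in.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def residuals(y, covariates):
    """Residuals of y after least-squares regression on covariates plus intercept."""
    design = [[1.0] + list(z) for z in covariates]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, d)) for d, yi in zip(design, y)]


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))


def partial_corr(x, y, covariates):
    """Correlation of x and y after removing the linear effect of the covariates."""
    return pearson(residuals(x, covariates), residuals(y, covariates))


# Toy example (not study data): one covariate per participant.
bmi = [2.0, 1.0, 4.0, 3.0, 6.0]
metabolite = [1.0, 3.0, 2.0, 5.0, 4.0]
age = [[1.0], [2.0], [3.0], [4.0], [5.0]]
r_adjusted = partial_corr(bmi, metabolite, age)
```

Moving from model 1 to models 2 and 3 then amounts to passing longer covariate rows to the same function.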

We used a range of statistical spectroscopic tools (28, 29) and experimental analytical techniques (30) to identify structurally the metabolites significantly associated with BMI (table S2). We assessed the reproducibility of our findings using the NMR spectral data from the second urine collection obtained from each individual in the U.S. population samples and then validated our findings using independent data from the INTERMAP UK cohort. For these confirmatory analyses, a threshold for the ST-FDR of 5% was used to correct for multiple testing. Furthermore, for both the U.S. and UK data sets, we determined the intraclass correlations (ICCs) (table S3) between the 24-hour metabolite excretion measured in the two urine collections to assess the reliability of urinary excretion patterns over the, on average, 3-week period between the two sampling occasions.

We found significant metabolic associations with BMI involving an extensive interconnected set of biochemical pathways and physiological processes, as well as evidence for involvement of symbiotic gut microbial–human co-metabolism. The average 1H NMR spectrum from the discovery analysis of BMI, adjusted for age, gender, and population sample (model 1, see footnotes of Table 1), is shown in Fig. 2A. The significantly associated peaks, representing multiple metabolites, are labeled in the figure with the −log10 Q values and direction of association shown in the accompanying Manhattan plot (Fig. 2B). We structurally identified 29 metabolites significantly associated with BMI, with P values (first 24-hour urine collection) ranging from 1.46 × 10−5 to 2.04 × 10−36 (model 1, Table 1). After adjustment for multiple potential confounders (model 2, Table 1), associations remained significant for all identified metabolites. Table 2 shows results of the targeted analyses of 22 amino acids and related compounds by IEC. The majority of these metabolites were found to be significantly associated with BMI after Bonferroni correction for multiple testing (P ≤ 4.55 × 10−4, models 1 and 2); exceptions are taurine, serine, asparagine, glycine, methionine, and arginine.

Table 1. Structurally identified 1H NMR–derived urinary metabolites associated significantly with BMI in 1880 U.S. INTERMAP participants using first urine collection specimens.

Excluding metabolic outliers based on Hotelling’s T2 test (n = 132) and participants with doctor-diagnosed diabetes mellitus (n = 152). Partial correlation (r) and corresponding P values are listed for each metabolite. Statistical significance is based on the ST-FDR, Q ≤ 0.01 for each spectral variable and the two adjacent spectral variables.

Fig. 2. Associations of BMI with urinary metabolites in the U.S. INTERMAP cohort (n = 1880).

Urine was collected for 24 hours from individuals on two separate occasions 3 weeks apart. Each individual completed four 24-hour dietary recalls, and eight blood pressure measurements and questionnaire information were obtained. The partial correlation between each 1H NMR variable and BMI was adjusted for age, gender, and population sample (model 1). (A) Average 600 MHz 1H NMR spectrum of the first urine collection. (B) Manhattan plot showing −log10(Q) × sign of partial correlation for each of the 7100 spectral variables. Significance of 1H NMR was determined based on a Q value threshold of 1%; in addition to this, both adjacent variables must also pass Q ≤ 0.01 and must have the same sign. Statistically significant peaks are colored red if directly associated with BMI and blue if inversely associated. Metabolites significantly associated with BMI (numbered according to spectral position) were as follows: 1: ketoleucine, 2: leucine, 3: valine, 4: 2-hydroxyisobutyrate, 5: alanine, 6: lysine, 7: N-acetyl signals from urinary glycoproteins, 8: N-acetyl neuraminate, 9: phenylacetylglutamine, 10: glutamine, 11: proline betaine, 12: 4-cresyl sulfate, 13: succinate, 14: citrate, 15: dimethylamine, 16: TMA, 17: dimethylglycine, 18: creatinine, 19: ethanolamine, 20: O-acetyl carnitine, 21: glucose, 22: 3-methylhistidine, 23: glycine, 24: hippurate, 25: pseudouridine, 26: NMNA, 27: 3-hydroxymandelate, 28: tyrosine, 29: 4-hydroxymandelate, 30: formate, U1 to U26 unidentified metabolites. These data are tabulated in Table 1 and table S4. Some signals overlap; however, they were unequivocally identified using various statistical and experimental methods (table S2).

Table 2. Association with BMI of a set of metabolites measured by targeted IEC (log10 values) in 1880 U.S. INTERMAP participants.

Excluding metabolic outliers based on Hotelling's T2 test (n = 132) and participants with doctor-diagnosed diabetes mellitus (n = 152). Partial correlation (r) and corresponding P values are listed for each metabolite. Statistical significance based on a Bonferroni threshold of P ≤ 4.55 × 10−4 (P ≤ 0.01/22). n.s., not significant.
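As a quick arithmetic check, the Bonferroni thresholds quoted in the table captions follow from dividing the significance level by the number of tests. The helper name below is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P value threshold under a Bonferroni correction."""
    return alpha / n_tests


# U.S. discovery analysis: 22 IEC metabolites at the 1% level.
print(f"{bonferroni_threshold(0.01, 22):.2e}")  # 4.55e-04
# UK validation analysis: 20 IEC metabolites at the 5% level.
print(f"{bonferroni_threshold(0.05, 20):.2e}")  # 2.50e-03
```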


Gut microbial co-metabolic signatures

We found associations of nine gut microbial co-metabolites with BMI reflecting five different host-gut transformation microbial pathways. First, we found associations with BMI of metabolites related to the gut microbial metabolism of choline. We found previously unreported associations with BMI (direct) of trimethylamine (TMA) (P = 2.04 × 10−9, Table 1 and Fig. 2, A and B) and dimethylamine (P = 1.74 × 10−8). Also, we found another product of choline degradation and glycine-betaine oxidation, dimethylglycine (P = 1.43 × 10−7) (31), to be directly associated with BMI. These metabolites are closely related to formate (P = 4.50 × 10−12), the major hub metabolite in one-carbon metabolism and a major product of gut microbial origin. The second pathway class relates to distal colonic microbial protein putrefaction and the metabolism of tyrosine and related compounds. We found a relationship (inverse) with BMI of the gut microbial–host co-metabolite 4-cresyl sulfate (P = 8.09 × 10−11), a hepatic phase 2 α conjugation (detoxification) product of 4-cresol, which is a distal colonic bacterial degradation product of tyrosine. We also found direct associations of BMI with urinary tyrosine (P = 2.02 × 10−45, Table 2), confirming previous findings in blood (11–14, 16), and with the 1H NMR signal representing tyrosine and the tyrosine metabolite 4-hydroxymandelic acid (P = 1.35 × 10−19, Table 1). The association of BMI with 4-cresyl sulfate in particular is largely unaffected by the multiple covariate adjustments, including creatinine, indicating that this relationship is independent of confounders and muscle turnover (Table 1) and thus can be considered to be directly linked to gut microbial activity differences. In addition, we show a previously unreported association (inverse) of BMI with urinary 3-hydroxymandelate (P = 2.81 × 10−8, Table 1), a metabolite of the α-adrenergic agonist p-synephrine (found in bitter orange extract) (32), and closely related to tyrosine metabolism.
The third host-gut transformation microbial pathway, involving the distal colonic microbial metabolism of phenylalanine, is closely related to that of tyrosine. We found a relationship (inverse) of BMI with the gut microbial–host co-metabolite phenylacetylglutamine (P = 1.39 × 10−8), a bacterial degradation product of phenylalanine. A direct association with BMI of phenylalanine (P = 8.35 × 10−19, Table 2), based on the IEC amino acid analysis data, confirmed previous findings linking obesity with the blood concentration of phenylalanine (11–13, 16). The fourth pathway concerns the gut microbial production of benzoic acid from dietary polyphenolics, flavonoids, and related compounds in the proximal colon; benzoic acid is phase 2 conjugated with glycine in the mitochondria (mainly in the liver) to form hippurate, an abundant urine excretory product (33). We found a strong inverse association of urinary hippurate across the range of BMI (P = 1.52 × 10−14, Table 1); a weak inverse association with adiposity has been reported previously only in morbidly obese humans (17) as well as in dogs and mice (34, 35). Last, we found associations of BMI with the symbiotic gut microbiota–associated metabolite 2-hydroxyisobutyrate (P = 7.24 × 10−12), which is related to N-butyrate produced by the bacterium Faecalibacterium prausnitzii (36).

BCAA, skeletal muscle, and energy metabolism

We observed direct associations of BMI with urinary excretion of the BCAAs leucine, valine, and isoleucine, consistent with previous reports of the importance of the BCAAs in adiposity based on measurements in plasma (10–16). Here, we report associations (inverse) with BMI (P = 7.19 × 10−22) for ketoleucine, the first step metabolic product of leucine metabolism and ketogenesis in skeletal muscle mitochondria (37), and the ketoleucine/leucine ratio (P = 8.85 × 10−26). These findings reflect the importance of the enzymatic conversion of leucine to ketoleucine as a rate-limiting step in BCAA metabolism and a critical energy process in skeletal muscle (37, 38). We found a strong direct association of urinary 24-hour creatinine excretion with BMI (P = 8.88 × 10−14, Table 1); adjustment for creatinine (model 3, Table 2) greatly attenuated the association of BMI with 3-methylhistidine, the most strongly associated (P = 8.28 × 10−80, Table 2) of the variables measured by IEC. Like creatinine, 3-methylhistidine is a marker of muscle turnover (39) and meat intake (24, 25, 40). The significant direct association between BMI and ethanolamine (P = 2.00 × 10−29, Table 2) is similarly attenuated by creatinine adjustment, as are the direct associations with BMI of carnosine, 1-methylhistidine (both markers of meat intake) (40), threonine, alanine, valine, isoleucine, and histidine (Table 2). Some metabolites remain relatively unaffected by creatinine adjustment, including glycoproteins, tyrosine, ketoleucine, succinate, and 4-cresyl sulfate, indicating that these are largely independent of muscle turnover. Others such as taurine, serine, and asparagine only became significant after creatinine adjustment, indicating their interdependence with creatinine in association with BMI (Table 2). We also show associations (inverse) of BMI with the tricarboxylic acid (TCA) cycle intermediates succinate (P = 3.22 × 10−18) and citrate (P = 8.61 × 10−15) (Table 1) in urine.

Other metabolite markers of adiposity

A composite N-acetyl signal from mixed urinary glycoproteins from the 1H NMR–detected metabolite panel has a strong and significant association with BMI (P = 2.04 × 10–36) (Fig. 2, A and B, and Table 1). The association of BMI with urinary glycoproteins is thought to reflect the known higher glomerular filtration rate associated with obesity and may also provide a useful noninvasive indicator of the glomerular hyperfiltration process that is known to precede diabetes (41). Additionally, we found an association of BMI with N-acetyl neuraminate (P = 2.02 × 10–16), a hydrolytic breakdown product of urinary glycoproteins with sialic acid subunits; the assignment was confirmed by N-acetyl neuraminidase incubation experiments (42) (table S2). This finding may also reflect the fact that α-1 acid glycoprotein and other low molecular weight glycoproteins are part of the acute phase reactive proteins associated with inflammatory conditions including obesity (43). Inverse association of BMI with urinary excretion of N-methyl nicotinate (NMNA) (r = –0.16, P = 9.04 × 10–13) is consistent with previous observations in morbidly obese individuals (17). NMNA is a niacin-related (vitamin B3) metabolite reported to be a biomarker of coffee drinking (44). We found a strong partial correlation between urinary NMNA excretion and dietary caffeine intake (r = 0.54, P = 1.20 × 10–140, adjusted for model 1), although when caffeine intake was added to the model 1 covariates, it did not materially alter the BMI-NMNA association (r = −0.16, P = 5.01 × 10–12), suggesting that the BMI-NMNA association was independent of coffee consumption. The direct association of BMI with urinary pseudouridine excretion (P = 2.10 × 10–12) reflected increased whole-body nucleic acid turnover with higher BMI, consistent with enhanced adipose tissue turnover associated with obesity (45).

We found a direct association of urinary glucose with BMI (r = 0.14, P = 3.27 × 10–9), which is materially unchanged when dietary glucose intake, total sugar, or total carbohydrate intake are added to the regression model, indicating possible latent insulin resistance and undiagnosed diabetes related to adiposity (11, 13, 14, 16). We also found associations between urinary markers of dietary intake and BMI. Thus, we found an inverse association with BMI of proline betaine (P = 7.29 × 10–9), a biomarker of citrus fruit consumption (46, 47) associated with a healthy eating pattern (48), and a direct association of BMI with urinary excretion of O-acetyl carnitine (r = 0.10, P = 1.46 × 10–5), a marker of red meat intake (24, 49), which in turn has been associated with long-term weight gain (50). As expected, the association of O-acetyl carnitine with BMI was reduced when adjusted for total dietary animal protein intake, although it remained significant (r = 0.07, P = 2.93 × 10–3).

We found several differences in metabolic associations with BMI between men and women based on significant gender-interaction terms (tables S5 and S6, see Materials and Methods). For example, there was a greater association of BMI with cystine (dicysteine) in women than men in both the U.S. and UK samples; the important role of cystine (and cysteine) in insulin signaling and glucose and lipid metabolism (51) might be manifested as different patterns of fat deposition between men and women (52). We also found a number of 1H NMR signals from as yet unidentified metabolites that were associated with BMI (tables S4 and S7).

Replication and validation of BMI-metabolite associations in independent samples

We show that the significant associations with BMI of all 29 1H NMR–identified metabolites from analysis of the first set of urine samples could be replicated in the second set of urine samples from the U.S. cohort, obtained on average 3 weeks later (table S8). We then analyzed data from INTERMAP UK to validate our findings from the U.S. samples in an independent data set. The same methods and protocol (8) were used for the U.S. and UK INTERMAP sample data collection, and we have reported that urinary metabolite excretion patterns measured by 1H NMR spectroscopy were similar (7). We show here that the metabolic signatures of adiposity are consistent across the U.S. and UK population samples (Table 3 and table S9). Specifically, the UK population samples replicated 25 of the 29 structurally identified significant BMI-metabolite associations at a 5% ST-FDR; the exceptions were citrate, NMNA, proline betaine, and 3-hydroxymandelate, possibly reflecting reduced statistical power for these analyses because of the smaller sample size in the UK cohort.

Table 3. Association with BMI of (A) metabolites identified by 1H NMR and (B) metabolites measured by targeted IEC (log10 values) in 444 UK INTERMAP participants for first urine collection specimens.

Excluding metabolic outliers based on Hotelling’s T2 test (n = 47) and participants with doctor-diagnosed diabetes mellitus (n = 5). Partial correlation (r) and corresponding P values are listed for each metabolite. Statistical significance is based on the ST-FDR, Q ≤ 0.05 for 1H NMR metabolites identified and analyzed in the U.S. data. Statistical significance for IEC metabolites is based on a Bonferroni threshold of P ≤ 2.50 × 10−3 (P ≤ 0.05/20).


Reliability of urinary metabolite excretion patterns

We previously demonstrated that the analytical (technical) reproducibility of the 1H NMR urinary metabolite data in INTERMAP was >98% by comparing data from identical samples that were split in the field and analyzed blindly using different identification numbers (53). To check the stability of the metabolic network (below), we calculated the reliability of the urinary metabolite excretion data by comparing results from 1H NMR analysis of the first and second urine collections for each individual. Specifically, we calculated the ICCs (table S3) of the BMI-associated urinary metabolites detected by 1H NMR spectroscopy. The median ICC coefficient across all 7100 1H NMR spectral variables was 0.37 for the U.S. cohort [interquartile range (IQR), 0.14 to 0.51] and 0.30 for the UK cohort (IQR, 0.11 to 0.49) (fig. S1). For the 29 identified metabolites significantly associated with BMI from 1H NMR data, the ICC coefficients ranged from 0.34 (pseudouridine) to 0.72 (NMNA) in the U.S. cohort and up to 0.78 (tyrosine + 4-hydroxymandelate) in the UK cohort (table S3) (median ICC coefficient of identified metabolites is 0.53). These ICC coefficients are similar to those described previously for a range of metabolites in serum (54, 55) and urine (56) based on mass spectrometry (MS) and gas chromatography. They indicate a modest degree of attenuation of observed (simple) correlations of metabolites with, for example, BMI in the U.S. samples of between 20 and 70% for 1H NMR measurements from a single 24-hour urine collection per person.
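To make the reliability measure concrete, the sketch below computes an intraclass correlation from paired measurements (first and second urine collection per person). It assumes the one-way random-effects form, ICC(1,1); the text does not state which ICC variant was used, and the function name and toy values are hypothetical.

```python
def icc_oneway(measurements):
    """One-way random-effects ICC(1,1).

    measurements: list of per-person tuples, each holding k repeated
    values (k = 2 for two urine collections).
    """
    n = len(measurements)
    k = len(measurements[0])
    grand_mean = sum(sum(m) for m in measurements) / (n * k)
    person_means = [sum(m) / k for m in measurements]
    # Between-person and within-person mean squares from one-way ANOVA.
    ms_between = k * sum((pm - grand_mean) ** 2 for pm in person_means) / (n - 1)
    ms_within = sum(
        (x - pm) ** 2 for m, pm in zip(measurements, person_means) for x in m
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Toy pairs for illustration only (not INTERMAP data).
pairs = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
icc = icc_oneway(pairs)  # between-person spread dominates, so ICC is high
```

An ICC near 1 means that most of the variance lies between people rather than between collections from the same person; lower values imply stronger attenuation of correlations estimated from a single collection.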

Metabolic network map of adiposity

Metabolic pathways, as generally depicted, describe the sequential chemical reactions and conversions of substrates and products involved in multiple anabolic and catabolic processes. However, these representations do not easily capture the differential compartmental activities of these pathways or indeed the symbiotic co-metabolic interactions between, say, the gut microbiome and the host or their complex stochastic interdependencies (57–59). We present an integrated metabolic reaction network of human adiposity based on the results from the U.S. population samples, using the publicly available MetaboNetworks program (60), to show the shortest metabolic paths linking the individual BMI-associated metabolites. The software matches the metabolite biomarker excretion patterns to a preexisting model of all possible scalar distances between metabolites in the Kyoto Encyclopedia of Genes and Genomes (KEGG) with the possibility to include gut microbial species in the model (60). The metabolic systems map includes the significant metabolites from NMR discovery and IEC after adjustment for confounders in model 2 and those that became significant only after the adjustment for creatinine (model 3). The resultant network effectively summarizes the human systemic urinary metabolic signatures of adiposity (Fig. 3). The network map shows (i) the metabolites associated with BMI and the direction of association and (ii) major metabolic functions classified via a colored overlay showing the class or compartment associations of the metabolites (Fig. 3 and tables S10 and S11). We show here that BCAA energy metabolism is linked to various energy-related pathways including both the TCA cycle and a series of TCA anaplerotic replenishment reactions, that is, chemical reactions that form intermediates that are then used in the TCA cycle for energy production.
Also, the metabolic map indicates links between muscle and BCAA energy metabolism with a large proportion of the network being related to lipid metabolism. The amino acid excretion data shown here suggest a much broader involvement in multiple pathways disturbed in adiposity than for the BCAAs alone. Notably, the gut microbial–related markers of BMI map onto several pathways embedded in the host multicompartmental network. For instance, 4-cresyl sulfate, hippurate, phenylacetylglutamine, and 3- and 4-hydroxymandelate map to the aromatic amino acid pathway cluster, and dimethylamine and TMA map to choline metabolism, as does formate. Also, hippurate (benzoylglycine) production is mediated via coenzyme A adduct formation, which is mitochondrial (as is the case for all coenzyme A–mediated amino acid conjugations), and therefore is linked indirectly to mitochondrial energy metabolism that powers the reaction (61).
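The idea of linking metabolites by their shortest reaction path can be illustrated with a breadth-first search over a small directed graph. This is a sketch only: it does not use the MetaboNetworks program or the KEGG database, and the hand-built toy graph below contains just a few well-known reaction steps, so missing paths in it carry no biological meaning.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest node list or None."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk back from the goal to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical toy network (substrate -> products), for illustration only.
TOY_NETWORK = {
    "tyrosine": ["4-hydroxyphenylacetate"],
    "4-hydroxyphenylacetate": ["4-cresol"],
    "4-cresol": ["4-cresyl sulfate"],
    "leucine": ["ketoleucine"],
    "citrate": ["isocitrate"],
    "isocitrate": ["2-oxoglutarate"],
    "2-oxoglutarate": ["succinyl-CoA"],
    "succinyl-CoA": ["succinate"],
}

print(shortest_path(TOY_NETWORK, "citrate", "succinate"))
```

The number of edges on the returned path plays the role of a distance between two metabolites; a full reaction database would supply many alternative routes, among which the shortest is reported.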

Fig. 3. Multicompartmental metabolic reaction network illustrating human adiposity–related urinary metabolic signatures in the U.S. population (n = 1880).

Metabolites are included that passed the ST-FDR threshold of Q ≤ 0.01 for both models 1 and 2 (1H NMR metabolites, see Table 1) and the Bonferroni threshold of P ≤ 4.55 × 10–4 for both models 1 and 2 (metabolites detected by IEC, see Table 2). The background shading illustrates different types of metabolism based on closest affinity classification. Table S10 lists the full names and abbreviations for the metabolites. Table S11 lists the closest affinity classifications for metabolites shown in this figure. Dotted lines indicate the closest related metabolite in the network for metabolites that are not listed in the KEGG database, based on available literature.

Joint covariate and metabolite models

We investigated which of the urinary metabolites appeared to contribute most to the metabolic signature of adiposity in two ways. First, we constructed a saturated regression model that includes model 1 and 2 covariates and all of the significantly associated (with BMI) metabolites together, and examined each metabolite in turn for its “residual” association with BMI (table S12). Then, we used the elastic net (EN) approach (62) to fit a more parsimonious joint model of the covariate data and the metabolic variables (table S13) to give a sense of the important variables correlated with BMI. The optimal EN parameters (λ and α) were found using 10-fold cross-validation to avoid overfitting the model. Some metabolites found by 1H NMR spectroscopy have signals in heavily overlapping spectral regions containing resonances from compounds such as ethanolamine, 3-methylhistidine, and lysine, which were also measured independently by IEC (Table 2). When there was a choice between a metabolite measured with NMR and IEC, the most significant was retained.
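The elastic net step can be sketched as coordinate descent plus a cross-validated grid search over the two tuning parameters. Everything here is an assumption for illustration: the penalty is written in the glmnet-style form (1/2n)·RSS + λ[α·|b|₁ + (1 − α)/2·|b|₂²], the data are taken to be centered with no intercept, folds are assigned by interleaving rather than at random, and the function names and toy data are hypothetical. The source does not say which implementation was used.

```python
def soft_threshold(z, g):
    """Shrink z toward zero by g (the lasso proximal step)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for the elastic net on centered data (no intercept)."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    resid = list(y)
    col_ms = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ms[j] * coef[j]
            denom = col_ms[j] + lam * (1.0 - alpha)
            new = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            delta = new - coef[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                coef[j] = new
    return coef


def cv_error(X, y, lam, alpha, k=10):
    """Mean squared prediction error over k interleaved folds."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        coef = elastic_net([X[i] for i in train], [y[i] for i in train], lam, alpha)
        for i in test:
            pred = sum(c * v for c, v in zip(coef, X[i]))
            total += (y[i] - pred) ** 2
    return total / n


def select_parameters(X, y, lams, alphas, k=10):
    """Grid search for the (lambda, alpha) pair with the lowest CV error."""
    grid = [(lam, a) for lam in lams for a in alphas]
    return min(grid, key=lambda t: cv_error(X, y, t[0], t[1], k))


# Toy centered data (not study data): y depends exactly on both columns.
X_toy = [[i % 5 - 2.0, i // 5 - 1.5] for i in range(20)]
y_toy = [3.0 * a - b for a, b in X_toy]
best_lam, best_alpha = select_parameters(X_toy, y_toy, [0.0, 10.0], [0.5, 1.0])
```

With α = 1 the penalty reduces to the lasso and with α = 0 to ridge regression; intermediate values let correlated predictors, such as overlapping NMR signals, enter the model together while still shrinking weak ones to zero.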

In the fully saturated model with model 2 covariates, based on a conservative Bonferroni threshold of P ≤ 2.50 × 10–4 for the U.S. samples (table S12A), urinary glycoproteins, cystine, tyrosine, ethanolamine, and 3-methylhistidine remained significantly and directly associated with BMI (P = 4.44 × 10–6 to P = 1.99 × 10–35), whereas glutamine, 4-cresyl sulfate, O-acetyl carnitine, the ketoleucine/leucine ratio, taurine, and asparagine were inversely associated (P = 1.60 × 10–4 to P = 6.17 × 10–9). This confirmed the importance in relation to BMI of muscle metabolism, urinary amino acids, gut microbial co-metabolism, and dietary meat–related markers; urinary glycoproteins, tyrosine, and 3-methylhistidine predominated in the smaller UK sample data (table S12B).

A more comprehensive set of discriminatory variables emerges with use of the EN to select significant BMI-associated metabolites in a joint modeling approach (table S13). In addition to the metabolites significantly associated with BMI in the fully saturated model for the U.S. samples (table S12A), N-acetyl neuraminic acid was directly associated with BMI, whereas succinate, citrate, hippurate, NMNA, proline betaine, 3-hydroxymandelate, and serine were inversely associated, further supporting the importance of gut microbial co-metabolites, urinary amino acids, and diet-related variables, as well as TCA cycle intermediates in determining the metabolic signature of adiposity (tables S13 and S14). Finally, as further validation of significant BMI-metabolite associations, we applied the U.S. EN model (excluding terms for population sample) to the UK sample data and found that the U.S. model provides a good fit to the UK data (table S14 and fig. S2) (F test: F52,391 = 4.04, P = 9.12 × 10–16).


We have found multiple strong associations between adiposity and urinary metabolites. These include metabolites associated with renal function, gut microbial co-metabolites, TCA intermediates, skeletal muscle turnover and mitochondrial metabolism, BCAA metabolism, and dietary intake. We have linked the urinary metabolites together in an integrated biochemical network that delineates the urinary metabolic signature of human adiposity, reproducible both across populations and among individuals over time. The network map shows dependencies and connectivities between metabolic pathways associated with adiposity and provides a systems and compartmental overview of these metabolic disturbances. Many of the metabolites remain independently associated with BMI when accounting for all variables simultaneously, and point to the importance of metabolism, environment, diet, and life-style in the ongoing obesity epidemic.

Notably, in the Results section, we observed the products of five distinct gut microbial–host co-metabolic pathways linked to BMI. The first concerns the action of the gut microbiota on dietary choline to produce TMA via choline TMA-lyases (63), with subsequent conversion of TMA to TMA-N-oxide via the FMO3 enzyme (64). This pathway has been linked to the development of atherosclerosis and cardiovascular disease (65, 66). Increased microbial co-metabolism of choline has also been associated with steatosis, fatty liver, and insulin resistance in animals fed high-fat diets (67). In humans, an increase in energy intake induces rapid changes in the gut microbiota and concomitant changes in energy absorption and energy loss in the stool (68); this may contribute to the associations observed for BMI with total energy intake as well as with energy from animal protein and total and saturated fat (69).

The second and third major pathways of gut microbial activity concern the distal colonic putrefaction of proteins by the microbiota, which leads to the production of 4-cresol from tyrosine and phenylacetylglutamine from phenylalanine; 4-cresol is O-sulfated in the liver and excreted in the urine. 4-Cresyl sulfate and phenylacetylglutamine are two of the three major human gut microbial urinary co-metabolites (the other is hippurate) (21). The production of 4-cresol from 4-hydroxyphenylacetic acid is a purely microbial enzymatic conversion performed by clostridial species (Firmicutes) (70) and reflects variation in microbial genomic activity and composition. Our finding of an association between 4-cresyl sulfate and adiposity is consistent with previous findings regarding differences in the Firmicutes/Bacteroidetes ratio in obese versus lean individuals (71), although there remains debate regarding the mechanistic connection between this microbial ratio and adiposity (72).

Another major gut microbial–host co-metabolic pathway connected to adiposity concerns the production and excretion of hippurate. The proximal colonic metabolism of polyphenolic compounds (including catechins and chlorogenic acid) results in production of benzoic acid, which is absorbed and then phase 2 glycine–conjugated to form hippurate in the host hepatic mitochondria (33). Hippurate excretion is inversely correlated with BMI (Table 1) and is reproducibly sensitive to dietary manipulation in individuals (73). A close connection of hippurate modulation with TCA cycle intermediates has been observed toxicologically where mitochondrial activity has been compromised. These metabolic connectivities may arise because of the common compartmental location (61) as the glycine conjugation of benzoic acid is coenzyme A–mediated and occurs in the hepatic mitochondria (33). This co-compartmentalization of the metabolic processes rather than simple enzymatic activity explains why the close relationship between hippurate and TCA cycle intermediate excretion is not directly apparent from Fig. 3, which maps shortest path connectivities between metabolites based on the KEGG pathways. The final gut microbial pathway associated with BMI concerns the urinary excretion of 2-hydroxyisobutyrate, which has been reported previously only among morbidly obese individuals (17) including those undergoing bariatric surgery (18).

The close connections between metabolites in multiple human body compartments and the products of gut microbial activities have not previously been demonstrated in an embedded symbiotic metabolic network (Fig. 3). Gut microbial disorders and microbe-host metabolic activities and signaling abnormalities have been implicated in the etiopathogenesis of several common chronic diseases (74) as well as premorbid conditions including obesity (71, 75) and insulin resistance (76). Specifically, there is increasing evidence for a key role of the microbiota in the modulation and control of immune responses and of inflammatory pathways (74, 77, 78). The possible connection between microbiome composition and obesity in humans (71) may also reflect the fact that calorific availability can be affected by microbial activity, for example, short-chain fatty acid production from dietary fiber by colonic microorganisms (78). Transplant of fecal material from third-trimester pregnant women or obese individuals into germ-free mice results in an obese phenotype (79, 80). We show here independent evidence of the multiple functional consequences of microbiome activities in human adiposity.

The TCA cycle intermediates citrate and succinate are key energy metabolites in the Krebs cycle and are inversely correlated with BMI (Table 1). Clearly, the greater loss of these metabolites to the urine in individuals with lower BMI represents a small calorific loss to the body. However, the main driver for excretion of TCA cycle intermediates likely relates to renal physiological function, such as renal tubular acid-base balance. In renal tubular acidosis, for example, there is reduced excretion of TCA cycle intermediates because of their increased utilization in proximal tubular mitochondria, a well-known effect of drugs that cause renal tubular acidosis such as acetazolamide and certain heavy metal toxins (81).

Given the high proportion of skeletal muscle mass in the body (~30% in women and 38% in men) (82), and the high level of resting energy consumption and even higher level of exercise energy consumption (90% of whole-body O2 uptake) in skeletal muscle (83), it is unsurprising that markers of muscle turnover and mass correlate with BMI. Our finding that the ketoleucine/leucine ratio correlates strongly with BMI is a new observation related to the utilization of BCAAs in muscle metabolism. The BCAAs have been extensively studied with respect to obesity; typically, blood concentrations of BCAAs are markedly elevated in obese individuals (10–16). Data in humans show increased utilization of leucine and conversion to ketoleucine in muscle with increased exercise; the hyperaminoacidemia associated with lack of exercise may contribute to insulin resistance in obesity (10, 11). The leucine aminotransferase enzyme that controls the key leucine to ketoleucine step in the muscle mitochondria is induced by exercise to respond to increased energy demand by enhancing BCAA metabolism in skeletal muscle (37); the inverse association of the ketoleucine/leucine ratio with BMI observed here may reflect the important role of exercise in controlling adiposity. Most enzymes connected with leucine metabolism and other metabolites detected here have not been found in large-scale genome-wide association studies (GWAS) of obesity (84). This may reflect the relative lack of sensitivity of GWAS to detect effects in inducible enzyme systems under a high degree of environmental control, in this case, physical activity.

We confirmed the strong association of 3-methylhistidine with BMI, reflecting its association with muscle mass and dietary meat intake (24, 25, 39, 40). We also report a strong association of BMI with ethanolamine. However, unlike 3-methylhistidine, we find here that ethanolamine is largely uncorrelated with dietary intake. We thus propose that ethanolamine may be an intrinsic marker of skeletal muscle turnover, reflecting its important structural role in the skeletal muscle sarcolemma (85).

We identified a wide range of urinary markers associated with adiposity. These findings may have clinical utility. For example, it may be possible to identify nonobese individuals with an obesogenic urinary profile, who are at high risk of developing obesity and related metabolic disorders. Such individuals may benefit most from a personalized approach to obesity prevention including tailored individual life-style advice/interventions (for example, improved diet, increased physical activity, and pre- or probiotics) as an early preventive strategy to reduce or reverse the adverse metabolic patterns associated with obesity. In this way, the future disease burden associated with the obesity epidemic may be reduced. Such possibilities should be assessed in future studies.

Our study has a number of strengths. We included eight representative U.S. population samples from the INTERMAP study, and validated our findings using data from the INTERMAP UK population samples, with commonality of data collection methods, urinary specimen storage, and analysis. This design enabled us both to replicate our findings robustly in independent population samples and increase generalizability across geographically distant countries at the forefront of the obesity epidemic (1, 2). Standardized data collection in INTERMAP included collection of two 24-hour urine samples, providing integrated urinary metabolite excretion concentrations over a 24-hour period, and four multipass 24-hour dietary recalls per person (8, 9). The repeated urinary measures allowed us to replicate our findings in the second set of urine collections and to demonstrate reproducibility of urinary metabolite excretion over a period of about 3 weeks (table S3). We included both untargeted and targeted approaches to the analysis of urinary metabolites, thus maximizing discovery of metabolites in relation to BMI while including known associations with the BCAAs and possible relationships with other amino acids and related compounds. By using a network mapping approach, we were able to show pathway connectivities between the urinary metabolites significantly associated with BMI, reflecting different systems and sources of metabolites contributing to the metabolic signatures of adiposity.

Our study also has a number of limitations. Most importantly, it was largely cross-sectional involving a follow-up period of just 3 weeks, and therefore, cause and effect associations of urinary metabolites with BMI cannot be inferred directly. Specifically, we used mainly an untargeted metabolic profiling approach to identify combinations of correlations between urinary metabolites (reflecting whole system activity including that of the gut microbiota) and adiposity in two Western populations. We have identified multiple markers of adiposity and have extended the range of pathways and biochemical processes that can be linked to adiposity. In the absence of a strong causal model and directly supportive experimental data, we can point to the metabolic consequences of adiposity and their interconnections but rely on published network and reaction information (KEGG) to identify the intermediate steps and pathways linking the metabolic signatures of adiposity. Further downstream work, beyond the scope of this study, is needed to identify the specific alterations leading to the metabolic disturbances documented here. We have identified a series of interrelated pathways, linking both human metabolism and gut microbial co-metabolism, which provide a systems-level overview of the metabolic disturbances and consequences of adiposity. This approach is also scalable and translatable to other types of epidemiological study where tabulated data can be expressed in an interconnected metabolic network format.

BMI is a well-established measure of adiposity, highly correlated with the percentage of body fat (27, 86, 87) and strongly associated with cardiometabolic outcomes such as diabetes and cardiovascular disease (3). However, BMI is a relatively crude measure of general adiposity; inclusion of other measures, for example, waist and hip circumference (unavailable in INTERMAP) as a measure of central adiposity, may add more specificity to the phenotype (88) and allow refinement of metabolic phenotypes between different adiposity measures. In addition to percentage of body fat, BMI correlates with total muscle mass (27), so it was important to adjust in our analyses for urinary creatinine (model 3), as a marker of muscle turnover (26). Also, BMI does not take into account gender differences in fat distribution. We therefore adjusted the BMI-metabolite associations for gender (model 1) and looked separately at gender interactions in the urinary excretion of metabolites in relation to BMI (tables S5 and S6).

Our untargeted analyses are based on 1H NMR spectroscopy, which has both advantages and disadvantages compared with MS-based assays (19). Whereas NMR spectroscopy gives superior spectral reproducibility to chromatography linked to MS, which is important for large-scale metabolic profiling studies as undertaken here, it is usually less sensitive than MS for a given analytical problem (5, 6, 19, 89). However, many NMR-detectable metabolites are in major metabolic pathways and represent the key currencies of metabolic exchange around the body and thus are highly informative of relative pathway activities (6). Here, the spectra have been used as semiquantitative or relative quantification readouts where the patterns and combinations of metabolites are of interest rather than their absolute values. Finally, structural identification of metabolites can be challenging particularly where peaks overlap or low-intensity signals are detected close to baseline; we made considerable efforts through both statistical approaches and experimentation to identify the structure of adiposity-associated metabolites so these could be placed into the network model.

In summary, the relationships shown here between BMI and the 24-hour urinary metabotype, summarized in the network map, characterize a reproducible metabolic signature associated with the modern obesity epidemic. These relationships point to a variety of poorly understood avenues of metabolism that link to BMI, including gut microbial– and skeletal muscle–related energy pathways. Our findings reveal multiple connections between many metabolic compartments and pathways, and provide possible starting points for new approaches to prevention and treatment, for example, functional microbiome modulation (90) and stimulation of skeletal muscle mitochondrial metabolism. Tackling the obesity epidemic has now become an urgent priority; otherwise, gains in life expectancy enjoyed over recent decades may be halted or reversed, and healthcare provision to deal with the sequelae of obesity-related diseases may be overtaken by the demand.


Study design

The INTERMAP study is investigating dietary and other factors associated with raised blood pressure (8), the major modifiable risk factor underlying cardiovascular disease (91). The INTERMAP study surveyed 4680 men and women ages 40 to 59 years from 17 population samples in four countries (Japan, People’s Republic of China, UK, and United States). Participants were randomly recruited from general and occupational population samples in 1996 to 1999. Each participant made four clinic visits: the first two on consecutive days and the second two on average 3 weeks later, also on consecutive days. The data obtained include eight blood pressure measurements (two per visit), four 24-hour dietary recalls obtained by a trained interviewer using the multipass recall method, measurements of height and weight, and extensive questionnaire information. Two timed 24-hour urine collections were obtained on the second and fourth visits. The present report relates to data from 1880 of the 2195 U.S. INTERMAP participants used in the discovery and replication phase and 444 of the 501 UK INTERMAP participants used in the validation phase (Fig. 1). Participant exclusion criteria are outlined below and in Fig. 1. This study is reported in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines (92).

Data collection and pretreatment (NMR)

The urine specimens were prepared for, and analyzed with, high-resolution NMR spectroscopy using a Bruker Avance III spectrometer, operating at 600.29 MHz for 1H, equipped with a 5-mm TCI Z-gradient CryoProbe using a standard one-dimensional (1D) pulse sequence with water suppression (7, 93). Free induction decays were Fourier-transformed, referenced to an internal standard (trimethylsilyl propionate), and baseline- and phase-corrected using in-house software. The spectral regions containing water and urea (δ 6.4 to 4.5), the internal standard (δ 0.2 to −0.2), and regions δ 0.5 to 0.2, δ −0.2 to −4.5, and δ 15.5 to 9.5 were removed before normalization via the median fold change method (94). The remaining variables were “binned” to 7100 variables with bin widths of 0.001 ppm. Metabolic outliers were excluded from the data set by use of Hotelling’s T² statistic on the scores of the principal components analysis using Pareto-scaled data (dividing each variable by the square root of the SD). Metabolic outliers were defined as participants whose scores, for either urine collection, mapped outside the Hotelling’s T² ellipse (confidence interval = 0.95) in a cross-validated seven-component model; these participants often had high ethanol or nonsteroidal anti-inflammatory drug (NSAID) excretion (7). Participants with a doctor diagnosis of diabetes mellitus were also excluded (Fig. 1).
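The bin count and the Pareto scaling step can be checked with a short Python sketch. This is an illustration only: the processing reported here used in-house software, and the function names `retained_bins` and `pareto_scale` are hypothetical. The sketch assumes that, after the exclusions listed above, the retained spectrum runs from δ 9.5 to 0.5 with only the water and urea region (δ 6.4 to 4.5) cut out, and that Pareto scaling includes mean-centering (a common convention that the text does not state explicitly).

```python
import math
import statistics

def retained_bins(upper=9.5, lower=0.5, removed=((4.5, 6.4),), width=0.001):
    """Number of bins left after cutting excluded regions from [lower, upper] ppm."""
    span = upper - lower - sum(hi - lo for lo, hi in removed)
    return round(span / width)

def pareto_scale(column):
    """Mean-center a variable (assumption) and divide by the square root of its SD."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(value - mean) / math.sqrt(sd) for value in column]

# (9.5 - 0.5) - (6.4 - 4.5) = 7.1 ppm, i.e. 7100 bins of 0.001 ppm.
print(retained_bins())
```

Under these assumptions the arithmetic reproduces the 7100 variables quoted in the text.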

Data collection and pretreatment (IEC)

Urine specimens (first urine collection) were analyzed with IEC to quantify the concentration of amino acids and related compounds using the Biochrom 20+ and 30+ Amino Acid Analyser with Midas autosampler. The amino acids and related compounds were separated using mobile phases with different pH and by varying temperature. Ninhydrin was used to derivatize the compounds after separation followed by measurement of the absorbance at 570 or 440 nm (depending on the compound) of the colored complexes. EZChrom Elite was used to integrate the peaks automatically and calculate the concentration using an internal standard (2,4-diaminobutyrate) as a reference. The robustness of the method was checked using quality control samples, and compounds were excluded where the variation in concentration was bigger than ±2.5 SD. Zeros in the data matrix were replaced by the lowest nonzero value in each column before log normalization.
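The zero-replacement step before log normalization can be sketched in Python as follows. The function name `replace_zeros_and_log` is hypothetical, and the use of the natural logarithm is an assumption because the base is not stated in the text. A column consisting only of zeros is not handled and would raise an error.

```python
import math

def replace_zeros_and_log(matrix):
    """Replace zeros by the column's lowest nonzero value, then take natural logs."""
    n_cols = len(matrix[0])
    floors = []
    for j in range(n_cols):
        # Lowest nonzero concentration observed for compound j.
        floors.append(min(row[j] for row in matrix if row[j] > 0))
    return [
        [math.log(row[j] if row[j] > 0 else floors[j]) for j in range(n_cols)]
        for row in matrix
    ]

# Rows are specimens, columns are compounds (made-up values for illustration).
logged = replace_zeros_and_log([[0.0, 2.0], [4.0, 0.0], [8.0, 1.0]])
```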

Metabolite identification

Statistical total correlation spectroscopy (STOCSY) (28) and subset optimization by reference matching (STORM) (29) were used to identify structural correlations in the spectral data. A Bruker reference library was used to assign the chemical shifts to metabolites. Confirmation of metabolite identification was achieved using a combination of strategies. Specimens containing the largest amount of each unidentified metabolite were selected using an in-house MATLAB script and/or STORM. A catalog of standard 1D 1H NMR pulse sequence with water peak presaturation and 2D NMR methods such as 2D J-resolved, 2D 1H-13C heteronuclear single quantum coherence (HSQC), 1H-1H correlation spectroscopy (COSY), 1H-1H total correlation spectroscopy (TOCSY), and diffusion-edited NMR experiments were applied to the selected specimens to confirm the signals and their chemical shifts. Pretreatment of urine with solid-phase extraction (SPE) methods was carried out for some metabolites to simplify urine specimens and isolate specific signals according to their polarity and dissociation constant. All the SPE fractions were analyzed by 1D NMR spectroscopy. Metabolites were confirmed by in situ spiking experiments using authentic chemical standards or enzymatic methods. The strategy used for each identified metabolite is shown in table S2. Table S3 shows the (3-week) ICC (Eq. 1) from comparison of excretion levels in the first and second urine collections for each metabolite. Here, $\sigma_B^2$ indicates the between-sample variance and $\sigma_W^2$ indicates the within-sample variance.

$$\mathrm{ICC} = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_W^2} \tag{1}$$

The ICCs were calculated in R using the “irr” package (95). The degree of bias for standard correlation was calculated as in Eq. 2, where a BMI attenuation factor of 0.98 (96) was used.

(Eq. 2: equation image not recoverable from the source.)
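As an illustration of the ICC in Eq. 1, the following Python sketch estimates the between- and within-sample variance components from repeated measurements using a one-way analysis of variance. The choice of a one-way random-effects model is an assumption, because the text does not state which ICC variant was requested from the “irr” package, and the function name `icc_oneway` is hypothetical.

```python
def icc_oneway(pairs):
    """One-way ICC = var_between / (var_between + var_within) for repeated measures."""
    n = len(pairs)       # number of participants
    k = len(pairs[0])    # measurements per participant (2 urine collections here)
    grand = sum(sum(p) for p in pairs) / (n * k)
    means = [sum(p) / k for p in pairs]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for p, m in zip(pairs, means) for x in p
    ) / (n * (k - 1))
    var_within = ms_within
    var_between = (ms_between - ms_within) / k
    return var_between / (var_between + var_within)

# Made-up excretion values for three participants, two collections each.
print(icc_oneway([(1.0, 1.2), (2.0, 2.1), (3.0, 2.8)]))
```

Perfectly repeatable measurements give an ICC of 1, whereas measurements whose within-participant spread is as large as the between-participant spread give values near 0.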

Statistical analysis

Partial correlation models. Partial correlation models were run for the 1H NMR data (7100 variables) and IEC data (22 variables) of the first urine collection with adjustment for progressively increasing numbers of confounding variables. Model 1 adjusts for demographic variables: age, gender, and sample; model 2 adds medical and life-style factors: history of heart disease or stroke, moderate to heavy physical activity (hours/day), medication for hypertension, prescribed lipid-lowering drugs, NSAID use, dietary supplement use, special diet, smoking status, years of education, and total energy intake per day (kcal/day); and model 3 adds 24-hour urinary creatinine (mmol/day). 1H NMR data of the second urine collection are used as replication data set.

The partial (Pearson) correlation is calculated in two steps. First, the data are adjusted for covariates (Eq. 3). Here, $X_j$ indicates the jth variable of X, $C_M$ indicates a matrix of a constant (column of ones) and the covariates (confounding variables) for a model (M) combined, and i indicates the ith sample, so that $C_{M,i}$ is the ith row of $C_M$.

$$\tilde{X}_{i,j} = X_{i,j} - C_{M,i}\left(C_M^{\mathsf{T}} C_M\right)^{-1} C_M^{\mathsf{T}} X_j \tag{3}$$

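Second, the partial correlation is calculated (Eq. 4) using the two columns of R, $\tilde{X}_j$ and $\tilde{Y}$, representing the adjusted variable $X_j$ and outcome variable, respectively. Here, $r_j$ is the partial correlation for variable j and n is the number of samples.

$$r_j = \frac{\sum_{i=1}^{n} \tilde{X}_{i,j}\,\tilde{Y}_i}{\sqrt{\sum_{i=1}^{n} \tilde{X}_{i,j}^{2}\,\sum_{i=1}^{n} \tilde{Y}_i^{2}}} \tag{4}$$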
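The two-step procedure can be sketched in plain Python: each variable is regressed on a constant plus the covariates by ordinary least squares, and the Pearson correlation of the two residual vectors is then taken. This is a minimal illustration rather than the code used for the analyses (which were run in MATLAB and R); the helper names `solve`, `residuals`, and `partial_correlation` are hypothetical.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def residuals(v, covariates):
    """Residuals of v after least-squares regression on a constant plus covariates."""
    c = [[1.0] + list(row) for row in covariates]
    k = len(c[0])
    ctc = [[sum(r[a] * r[b] for r in c) for b in range(k)] for a in range(k)]
    ctv = [sum(r[a] * vi for r, vi in zip(c, v)) for a in range(k)]
    beta = solve(ctc, ctv)
    return [vi - sum(bj * cj for bj, cj in zip(beta, r)) for r, vi in zip(c, v)]

def partial_correlation(x, y, covariates):
    """Pearson correlation of x and y after adjusting both for the covariates."""
    rx, ry = residuals(x, covariates), residuals(y, covariates)
    num = sum(a * b for a, b in zip(rx, ry))
    return num / math.sqrt(sum(a * a for a in rx) * sum(b * b for b in ry))
```

Because the covariate matrix contains a constant column, the residuals have zero mean, so no further centering is needed before the correlation is computed.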

In addition to the adjustment for gender, we investigated the gender-metabolite interaction term, where gender was coded as 0 (men) and 1 (women).

Multiple testing. The ST-FDR (23) is used to calculate the Q values to correct for multiple testing for the 7100 1H NMR variables. The proportion of truly null P values ($\pi_0$) was determined using bootstrap resampling (97) as described in (23). ST-FDR takes a list of P values (P) sorted from lowest ($P_1$) to highest ($P_p$), where p is the total number of variables (Eq. 5). Next, the algorithm loops back from $Q_{p-1}$ to $Q_1$ to investigate whether the list of Q values is still sorted from smallest to largest (Eq. 6).

$$Q_i = \frac{\pi_0\, p\, P_i}{i}, \qquad i = 1, \dots, p \tag{5}$$

$$Q_i = \min\left(Q_i,\, Q_{i+1}\right), \qquad i = p-1, \dots, 1 \tag{6}$$
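The Q value calculation can be sketched in Python as below. The sketch takes the estimate of π0 as an input and does not implement the bootstrap estimation step; it assumes the usual Storey-Tibshirani form in which each sorted P value is scaled by π0·p/rank before monotonicity is enforced from the largest P value downward. The function name `st_q_values` is hypothetical.

```python
def st_q_values(p_values, pi0):
    """Q values from P values, given an estimate pi0 of the null proportion."""
    p = len(p_values)
    order = sorted(range(p), key=lambda i: p_values[i])
    # Scale each sorted P value by pi0 * p / rank.
    q_sorted = [pi0 * p * p_values[idx] / (rank + 1) for rank, idx in enumerate(order)]
    # Loop back from the second-largest P value to enforce monotone Q values.
    for rank in range(p - 2, -1, -1):
        q_sorted[rank] = min(q_sorted[rank], q_sorted[rank + 1])
    q = [0.0] * p
    for rank, idx in enumerate(order):
        q[idx] = q_sorted[rank]
    return q

# Returned in the original input order (made-up P values for illustration).
q = st_q_values([0.01, 0.04, 0.03, 0.5], pi0=1.0)
```

With π0 set to 1 the procedure reduces to the Benjamini-Hochberg adjustment, so an estimate of π0 below 1 makes the Q values proportionally smaller.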

For the 1H NMR discovery analyses (U.S. samples, first urine collection), a threshold of Q ≤ 0.01 was used to indicate a significant association with BMI. For the replication analyses (U.S. samples, second urine collection), the threshold was 5%; the same threshold (5%) was used for the validation data sets (UK 1H NMR data). To avoid spurious associations, spectral variables were designated as significant only if they met the ST-FDR threshold, above, of 1 or 5%, and the two adjacent spectral signals also attained the same threshold in addition to all three variables having the same sign of the correlation.
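The adjacency rule for spectral variables can be illustrated with a small Python sketch. It assumes that "the two adjacent spectral signals" means the immediate neighbors on either side of a variable, and that variables at the ends of the spectrum, which lack one neighbor, are never flagged; both points are assumptions, and the function name is hypothetical.

```python
def significant_with_neighbors(q_values, correlations, threshold=0.01):
    """Flag variable i only if i-1, i, and i+1 all pass the threshold with one sign."""
    n = len(q_values)
    flags = [False] * n
    for i in range(1, n - 1):
        window = range(i - 1, i + 2)
        passes = all(q_values[j] <= threshold for j in window)
        same_sign = (
            all(correlations[j] > 0 for j in window)
            or all(correlations[j] < 0 for j in window)
        )
        flags[i] = passes and same_sign
    return flags
```

Requiring three consecutive bins to agree exploits the fact that a real resonance spans several 0.001-ppm bins, whereas an isolated significant bin is more likely to be noise.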

For the targeted IEC data (22 amino acids and related compounds), a Bonferroni threshold was used to determine significance. For the discovery analyses (U.S.), a threshold of P ≤ 4.55 × 10–4 (P ≤ 0.01/22) was used. For the validation analyses (UK), a threshold of P ≤ 2.50 × 10–3 (P ≤ 0.05/20) was used (methionine and arginine were excluded). Second urine data were not available for the IEC analyses.

The gender-interaction analyses were adjusted for multiple testing using a Bonferroni threshold of P ≤ 3.33 × 10–4 (P ≤ 0.01/30), P ≤ 1.67 × 10–3 (P ≤ 0.05/30), and P ≤ 4.55 × 10–4 (P ≤ 0.01/22) for 1H NMR discovery, 1H NMR replication, and IEC discovery, respectively. For the validation analyses, the thresholds were P ≤ 1.67 × 10–3 (P ≤ 0.05/30) for 1H NMR and P ≤ 2.50 × 10–3 (P ≤ 0.05/20) for IEC.

Last, partial correlation models were run where the association of each metabolite (1H NMR and IEC combined) was adjusted for the association of the remaining metabolites (in addition to the covariates in models 1 and 2). In case of overlap between 1H NMR and IEC, the most significant variable of the two was included. For the discovery analyses, a Bonferroni threshold of P ≤ 2.56 × 10–4 (P ≤ 0.01/39) was used, and for the validation analyses, P ≤ 1.28 × 10–3 (P ≤ 0.05/39) was used.
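The Bonferroni thresholds quoted in this section follow directly from dividing the family-wise significance level by the number of tests, as the following Python sketch shows. The function name and the dictionary labels are illustrative only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests

thresholds = {
    "IEC discovery (0.01/22)": bonferroni_threshold(0.01, 22),
    "IEC validation (0.05/20)": bonferroni_threshold(0.05, 20),
    "Gender interaction, NMR discovery (0.01/30)": bonferroni_threshold(0.01, 30),
    "Gender interaction, NMR replication (0.05/30)": bonferroni_threshold(0.05, 30),
    "All metabolites, discovery (0.01/39)": bonferroni_threshold(0.01, 39),
    "All metabolites, validation (0.05/39)": bonferroni_threshold(0.05, 39),
}

for label, value in thresholds.items():
    print(f"{label}: P <= {value:.2e}")
```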

Joint covariate and metabolite models. An EN (62) model was used to determine the important contributors to the BMI model with covariates and metabolites combined into one model. The EN model allows for the inclusion of correlated groups of variables in the model while avoiding the inclusion of all variables in the model at the same time, which may lead to overfitting. ENs are a trade-off between the lasso (ℓ1) (98) and ridge (ℓ2) (99) penalty; the ℓ2 penalty forces regression coefficients (β) toward 0, whereas the ℓ1 forces certain coefficients to be exactly 0 (Eq. 7). Here, β0 denotes the constant; n, the number of samples; p, the number of variables; Y, the outcome variable; X, the data matrix; λ, the shrinkage factor; and α, the weighting factor between the ℓ1 and ℓ2 penalties.

$$\min_{\beta_0,\,\beta}\left[\frac{1}{2n}\sum_{i=1}^{n}\left(Y_i - \beta_0 - \sum_{j=1}^{p} X_{i,j}\,\beta_j\right)^{2} + \lambda\sum_{j=1}^{p}\left(\frac{1-\alpha}{2}\,\beta_j^{2} + \alpha\,\lvert\beta_j\rvert\right)\right] \tag{7}$$

A grid search was performed to find the optimal parameter settings for λ and α. For all combinations of nonnegative λ’s and α’s between 0 and 1, the mean squared error (MSE) of (10-fold) cross-validation was calculated. To avoid overfitting, the optimal λ and α were chosen as the combination for which the MSE falls within 1 SD of the overall minimal MSE. The coordinate descent algorithm (100) was used to calculate the regularization path for the EN.
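A minimal Python sketch of cyclic coordinate descent for an EN model at one fixed (λ, α) pair is given below. It assumes the widely used parameterization in which the penalty is λ[α‖β‖₁ + (1 − α)/2 ‖β‖₂²] and the squared-error loss is divided by 2n. It runs a fixed number of sweeps rather than testing convergence, omits the cross-validated grid search, and does not guard against all-zero columns. The function names are hypothetical.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator arising from the l1 part of the penalty."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def elastic_net(x, y, lam, alpha, n_iter=200):
    """Cyclic coordinate descent for the elastic net at fixed lam and alpha."""
    n, p = len(x), len(x[0])
    beta0, beta = sum(y) / n, [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of variable j with the partial residual (fit without j).
            rho = sum(
                x[i][j] * (y[i] - beta0 - sum(x[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            denom = sum(x[i][j] ** 2 for i in range(n)) / n + lam * (1 - alpha)
            beta[j] = soft_threshold(rho, lam * alpha) / denom
        # The intercept is unpenalized: set it to the mean residual.
        beta0 = sum(y[i] - sum(x[i][k] * beta[k] for k in range(p)) for i in range(n)) / n
    return beta0, beta
```

The soft-thresholding step is what sets some coefficients exactly to zero when α is close to 1, whereas the λ(1 − α) term in the denominator shrinks the remaining coefficients toward zero.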

Two EN models were calculated; the first used the U.S. data with all covariates and a unique set of 39 metabolites across the 1H NMR and IEC data sets to identify the important contributors to the BMI model. A second model was then calculated that applied the U.S. model coefficients to the UK data; this required that the variables representing the U.S. population samples were left out of the model; thus, the model applied to the UK data included all covariates (except the population sample variables) and the unique set of 39 metabolites (coefficients for some of which were set to zero).

Metabolic reaction network

Using MetaboNetworks (60), reactions that occur spontaneously or by means of an enzyme linked to a Homo sapiens or microbial gene were identified in KEGG. The gut microbiota that are included are those of the most abundant endosymbionts—the phyla Firmicutes, Bacteroidetes, Alphaproteobacteria, Betaproteobacteria, Deltaproteobacteria, Gammaproteobacteria, and Actinobacteria—because these make up 99% of the phylotypes found in the human gut (101). For each of the reactions, the main reaction pairs were identified, and an adjacency matrix was calculated for all compounds based on the main reactant pairs. The shortest paths between all metabolites significantly associated with BMI were calculated from the adjacency matrix, and a network graph was drawn for all compounds needed to connect all identified metabolites with the shortest paths. In the network, the three dashed lines indicate the closest related metabolite in the network based on literature for metabolites that are not found in the KEGG database.
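The shortest-path step can be illustrated with a breadth-first search over an unweighted adjacency matrix, as in the Python sketch below. This is not the MetaboNetworks implementation; the function name is hypothetical and the compound labels A to D are placeholders rather than real KEGG entries.

```python
from collections import deque

def shortest_path(adjacency, names, start, goal):
    """Breadth-first search for a shortest path in an unweighted adjacency matrix."""
    index = {name: i for i, name in enumerate(names)}
    source, target = index[start], index[goal]
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor, linked in enumerate(adjacency[node]):
            if linked and neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    if target not in previous:
        return None  # the two compounds are not connected
    path, node = [], target
    while node is not None:
        path.append(names[node])
        node = previous[node]
    return path[::-1]

names = ["A", "B", "C", "D"]
adjacency = [
    [0, 1, 0, 0],  # A reacts with B
    [1, 0, 1, 0],  # B reacts with A and C
    [0, 1, 0, 0],  # C reacts with B
    [0, 0, 0, 0],  # D has no reaction pair in this toy example
]
print(shortest_path(adjacency, names, "A", "C"))
```

Intermediate compounds on such a path (B in this toy example) correspond to the additional nodes drawn in the network graph to connect the BMI-associated metabolites.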

All calculations were performed in MATLAB (2013a, The MathWorks) and R (3.0.2, R Foundation for Statistical Computing).


Fig. S1. Distributions of the ICCs of the 1H NMR data.

Fig. S2. Scatterplot of observed BMI (UK) versus BMI estimated using UK data by the U.S. EN model (tables S13 and S14) (F test: F52,391 = 4.04, P = 9.12 × 10–16).

Table S1. Descriptive data of the INTERMAP study.

Table S2. Chemical shifts identified and identification strategy for each 1H NMR metabolite.

Table S3. ICC for each 1H NMR metabolite for the first and second urine collection data (obtained on average 3 weeks apart).

Table S4. Structurally unidentified 1H NMR–derived signals associated significantly with BMI in 1880 U.S. INTERMAP participants using first urine collection specimens.

Table S5. P value for a gender-interaction term included in the models for the U.S. INTERMAP population (n = 1880).

Table S6. P value for a gender-interaction term included in the models for the UK INTERMAP population (n = 444).

Table S7. Structurally unidentified 1H NMR–derived signals associated significantly with BMI in 1880 U.S. INTERMAP participants using second urine collection specimens.

Table S8. Structurally identified 1H NMR–derived urinary metabolites associated significantly with BMI in 1880 U.S. INTERMAP participants using second urine collection specimens.

Table S9. Association with BMI of metabolites identified by 1H NMR in 444 UK INTERMAP participants for second urine collection specimens.

Table S10. Full names and abbreviations of metabolites in the metabolic reaction network (Fig. 3).

Table S11. Abbreviations and closest affinity classifications of metabolites in the metabolic reaction network (Fig. 3).

Table S12. Metabolite-BMI associations adjusted for all other metabolites.

Table S13. EN model of all variables combined (λ = 0.15, α = 0.98).

Table S14. EN model of all variables combined (λ = 0.17, α = 0.59), excluding population sample variables.


Acknowledgments: We thank E. Maibaum for carrying out the 1H NMR spectroscopy of the INTERMAP urine samples, I. Yap for data analyses, and the staff at local, national, and international centers for collecting the INTERMAP data and samples. A partial listing of colleagues can be found in (8). Funding: The INTERMAP and INTERMAP metabonomic study are supported by the U.S. National Heart, Lung, and Blood Institute (grants R01-HL50490 and R01-HL84228). INTERMAP data collection was also supported by Chicago Health Research Foundation and national agencies in the People’s Republic of China, Japan [Ministry of Education, Science, Sports, and Culture, Grant-in-Aid for Scientific Research (A), No. 090357003], and UK. P.E. acknowledges support from Medical Research Council–Public Health England (MRC-PHE) Centre for Environment and Health, National Institute for Health Research (NIHR) Biomedical Research Centre at Imperial College Healthcare National Health Service Trust and Imperial College London, and the NIHR Health Protection Research Unit on Health Impact of Environmental Hazards. P.E. is an NIHR senior investigator. J.M.P. was supported by an MRC-PHE Centre for Environment and Health PhD studentship. We also thank the MRC-NIHR National Phenome Centre for facilitating this and related work. The data reported in this manuscript are tabulated in the main paper and in the Supplementary Materials. Author contributions: P.E., J.M.P., E.H., and J.K.N. wrote the manuscript; J.M.P. performed data analysis; Q.C. processed and provided dietary and population data; J.M.P., I.G.-P., A.W., and M.B. identified metabolites; I.G.-P. performed the experiments; T.M.D.E. provided statistical support; P.E. and J.K.N. designed the study; P.E., J.S., H.U., L.Z., L.v.H., and M.D. lead the INTERMAP study; J.K.N., P.E., and E.H. lead the INTERMAP metabonomic study. Competing interests: J.K.N. and E.H. are both nonexecutive directors of Metabometrix Ltd., one of several fee-for-service companies that provide metabolic profiling services. The company does not work directly in the area of obesity research, historically mainly providing toxicological screening capabilities. I.G.P. and T.M.D.E. have been paid consultants for Metabometrix Ltd. The other authors declare that they have no competing interests.