Efrat Miller, Yadid M. Algavi, and Elhanan Borenstein

Background: Microbiome-metabolome studies of the human gut have been gaining popularity in recent years, mostly due to accumulating evidence of the interplay between gut microbes, metabolites, and host health. Statistical and machine learning-based methods have been widely applied to analyze such paired microbiome-metabolome data, in the hope of identifying metabolites that are governed by the composition of the microbiome. Such metabolites can likely be modulated by microbiome-based interventions, offering a route for promoting gut metabolic health. Yet, to date, it remains unclear whether findings of microbially associated metabolites in any single study carry over to other studies or cohorts, and how robust and universal microbiome-metabolite links are.

Results: In this study, we addressed this challenge by performing a comprehensive meta-analysis to identify human gut metabolites that can be predicted based on the composition of the gut microbiome across multiple studies. We term such metabolites “robustly well-predicted”. To this end, we processed data from 1733 samples from 10 independent human gut microbiome-metabolome studies, focusing initially on healthy subjects, and implemented a machine learning pipeline to predict metabolite levels in each dataset based on the composition of the microbiome. Comparing the predictability of each metabolite across datasets, we found 97 robustly well-predicted metabolites. These include metabolites involved in important microbial pathways such as bile acid transformations and polyamine metabolism. Importantly, however, other metabolites exhibited large variation in predictability across datasets, suggesting a cohort- or study-specific relationship between the microbiome and the metabolite. Comparing taxonomic contributors to different models, we found that some robustly well-predicted metabolites were predicted by markedly different sets of taxa across datasets, suggesting that some microbially associated metabolites may be governed by different members of the microbiome in different cohorts. Finally, we examined whether models trained on the control group of a given study successfully predicted the metabolite’s level in the disease group of the same study. We identified several metabolites for which the model was not transferable, indicating a shift in microbial metabolism in disease-associated dysbiosis.

Conclusions: Combined, our findings provide a better understanding of the link between the microbiome and metabolites and allow researchers to place identified microbially associated metabolites in the context of other studies.
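
The cross-dataset comparison described in the Results can be illustrated with a minimal sketch. It assumes each metabolite's predictability in a dataset is summarized as a Spearman correlation between predicted and observed levels, and that a metabolite counts as robustly well-predicted when it passes a score threshold in a sufficient fraction of the datasets that measured it. The abstract does not state these criteria, so the function names, the thresholds (`min_score`, `min_fraction`, `min_datasets`), and the placeholder data are all hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(observed, predicted):
    """Spearman correlation: Pearson correlation computed on ranks."""
    rx, ry = average_ranks(observed), average_ranks(predicted)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # a constant vector carries no rank information
    return cov / (vx * vy) ** 0.5


def robustly_well_predicted(scores, min_score=0.3, min_fraction=0.5,
                            min_datasets=3):
    """Select metabolites that are well predicted in enough datasets.

    scores maps metabolite -> {dataset: predictability score}.
    All three thresholds are illustrative assumptions, not the paper's.
    """
    robust = []
    for metabolite, per_dataset in scores.items():
        n = len(per_dataset)
        if n < min_datasets:
            continue  # measured in too few datasets to assess robustness
        hits = sum(1 for s in per_dataset.values() if s >= min_score)
        if hits / n >= min_fraction:
            robust.append(metabolite)
    return sorted(robust)


# Placeholder scores for three hypothetical metabolites.
example_scores = {
    "metabolite_A": {"study_1": 0.5, "study_2": 0.4, "study_3": 0.1},
    "metabolite_B": {"study_1": 0.6, "study_2": 0.0, "study_3": -0.1},
    "metabolite_C": {"study_1": 0.7, "study_2": 0.8},
}
print(robustly_well_predicted(example_scores))
```

In this sketch, a metabolite such as `metabolite_B`, which scores well in only one study, corresponds to the cohort- or study-specific case noted above, while `metabolite_C` is skipped because it appears in too few datasets to judge.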