Glycan Utilization Profiling in Human Gut Microbiomes of Common Funds Data
Objective
Healthy diets are key to preventing various metabolic diseases (e.g., cardiovascular disease, inflammatory bowel disease, and obesity). Western diets are known to be unhealthy because they lack sufficient dietary fiber, which is critical for nurturing a healthy gut microbiome. Furthermore, not only the amount but also the types of dietary fiber have a significant impact on the gut microbiome. Personalized dietary intervention, in which different dietary fibers are given as prebiotics to different individuals, is an effective strategy for enabling personalized nutrition for disease prevention. However, microbiome-based personalized nutrition demands a better understanding of glycan utilization, and the ability to profile it computationally, in the gut microbiome of any individual across different populations, lifestyles, and diseases.
To fill this research gap, this R03 project aims to develop a bioinformatics workflow to automatically retrieve CAZyme (carbohydrate-active enzyme) gene clusters (CGCs) from publicly available human gut metagenomes. These include microbiome data generated in three NIH Common Fund programs: the Human Microbiome Project (HMP), the Integrated Human Microbiome Project (iHMP), and the Human Heredity and Health in Africa (H3Africa) project. Other microbiome data not funded by NIH will also be included to better represent more diverse human populations. The genomes from HMP, H3Africa, and other microbiome projects will be used to identify fiber-degrading CAZymes and CGCs, forming two reference databases (refCAZymes and refCGCs) to which sequencing reads from any individual's microbiome sample can be mapped to infer personalized fiber utilization. To demonstrate this utility, metagenomic and metatranscriptomic reads from 791 samples of the iHMP Inflammatory Bowel Disease Multiomics database (iHMP-IBDMDB) will be mapped to refCAZymes and refCGCs to compare the abundance and prevalence of glycan utilization between IBD patients and healthy people.
The significance of this project is that it will contribute to a better understanding of how glycan utilization varies across human populations, lifestyles, and disease status. The workflow developed in this project will be implemented as a new software package named GLUP (glycan utilization profiling; code and documentation will be on GitHub) using the popular workflow manager Nextflow, to support the emerging microbiome-based personalized nutrition and health industry. The innovation is that it will be the first global CGC-based glycan profiling across different human populations, especially in the under-represented and only recently available African microbiomes. This project builds on our highly cited CAZyme bioinformatics tool suite, dbCAN, which has been continuously developed since 2012.
Investigators
YIN, YANBIN
Institution
UNIVERSITY OF NEBRASKA LINCOLN
Start date
2025
End date
2026
Funding Source
Project number
1R03OD039979-01
Accession number
39979