Daniel Garza, Meine Boer and I, we organized a 3-days workshop at INRAE/PROSE in Antony, on the basics of metabolic modelling, our new gap-filler, called DNNGIOR and applications of the methodology on microbiomes.
µGrowthDB is a relational database that provides a structure to store different types of microbial growth data. Our main objective is to serve as a comprehensive resource for researchers interested in studying the growth dynamics of microbial species by providing access to growth projects and studies uploaded by other researchers.
Hierarchical Modelling of Species Communities (Hmsc) is a flexible framework for Joint Species Distribution Modelling (JSDMs). relate species occurrences or abundances to environmental covariates, species traits and phylogenetic relationships. JSDMs are a special case of species distribution models (SDMs) that take into account the multivariate nature of communities which allows us to estimate community level responses as well capture biotic interactions and the influence of missing covariates in residual species associations. GitHub
Kinbiont – an ecosystem of numerical methods integrating advanced ordinary differential equation solvers, non-linear optimization, signal processing, and interpretable machine learning algorithms. https://pinheirogroup.github.io/Kinbiont.jl/ https://www.biorxiv.org/content/10.1101/2024.09.09.611847v1.full
In the following table I describe some terms I usually read/write about.
Term
Description
richness
number of different taxa in a community
evenness
the commonness or rarity of a species
mucin
a family of high molecular weight, heavily glycosylated proteins (glycoconjugates) produced by epithelial tissues in most animals
effective number
refers to the number of equally abundant species needed to obtain the same mean proportional species abundance as that observed in the dataset of interest (where all species may not be equally abundant)
copiotrophs
taxa living in environments rich in nutrients
succession
changes in the presence, relative abundance or absolute abundance of one or more organisms within a microbial community. Its processes can be deterministic or stochastic. Factors that drive deterministic succession fall into three categories: abiotic factors (pH, redox potential), environmental factors (cross-feeding, diet or travel) and biological factors (innate and adaptive immunity). Stochastic succession is defined as microbial community changes that are not the consequence of environmentally determined fitness (ecological drift). Whether microbial succession is more deterministic or stochastic is driven by several factors in the formation of the community, including birth mode, travel, diet (for example, human breast milk) and antibiotics.
Cool bioinfo
https://www.dbvis.com/thetable/
where we gather together to learn about and simplify the complexity of working with database technologies.
for example, here is how to combine mysql with docker https://www.dbvis.com/thetable/containerizing-mysql-with-docker-and-dbvisualizer/
creating beautiful command line interfaces in a composable way with as little code as necessary. https://click.palletsprojects.com/en/stable/
part of the Pallets community : https://palletsprojects.com
panGPT and panPrompt: Generative Pre-Trained Transformer for Large Pangenome Models (LPMs) from scratch. https://github.com/mol-evol/panGPT
Microbiome analysis counts several aspects, yet it always comes down at some basic concepts! Here, I will try to cover a thing or two and gather some links that describe them more thoroughly. To save some (of my) time, I will use a lot of copy-paste () always referring the initial resource
Although microbiome data has some of the attributes of compositional data, it is not perfectly compositional. Classic compositional data vectors represent portions of a whole. The total sum of the components is not meaningful, and only the relative difference between components matters [36]. For truly compositional data, the vectors (2, 1) and (2000, 1000) represent the same information: only that the first and second components are present in the ratio 2 : 1. For microbiome data, the size of the counts also contains information about the reliability of the ratio. Larger counts are more likely to closely match the true ratio in the sample [44].
Covariate Adjustment
Regression analysis, potentially with penalization for variable selection, has been used to analyze an outcome of interest modeled as a function of microbiome features.
which microbiome features are impacted by treatments or environmental conditions? identify features whose abundances change across treatments or conditions differential abundance analysis
which microbiome features mediate treatment effects on an outcome? identify taxa affected by treatments and that because of their change the outcome of the treatment is influenced mediation analysis
which microbiome features have an effect on an outcome, adjusting for confounders identify microbiome features with an effect on an outcome with no particular treatments of interest, but with potential complex confounding arising from relationships between microbes, host, and environment Feature Identification Adjusting for Confounding
Differential Abundance Analysis
the null hypothesis of a differential abundance test is that treatments do not affect the mean abundance level.
Differential abundance analysis (solid lines) selects microbiome features whose abundance levels change across treatments or conditions. It only examines the relationship between treatments/conditions and microbiome features, but not the relationships involving other outcomes
Mediation analysis
high dimensionality and sparsity of microbiome data.
In case of a treatment - microbiome - outcome study, one needs to avoid differential abundance analysis as the relationships you are looking for are out of scope for DA!
Mediation analysis examines the indirect effects of treatment on the outcome through the microbiome. To determine whether a feature has a mediation effect, a method must consider both the effect of the treatment on the feature and the effect of the feature on the outcome.
Feature Identification Adjusting for Confounding
asd
Microbial phenomena that can drive you nuts!
A strain dips to very low abundance in a microbial community and then recovers to thrive
This is quite common and reflects the complex interplay of ecological, evolutionary, and environmental factors. In this page you can find some of the ways a strain might manage to survive during these low-abundance periods without going completely extinct.
microbetag implements data integration techniques to annotate both the nodes (taxa) and the edges (predicted associations) of such a network to enhance microbial co-occurrence network analysis for amplicon data.
ELIXIR Marine Metagenomics Community, initiated amongst researchers focusing on marine microbiomes, has concentrated on promoting standards around microbiome-derived sequence analysis, as well as understanding the gaps in methods and reference databases, and solutions to computational overheads of performing such analyses.
Last month, I had the chance to visit ETH and the Institute of Food, Nutrition and Health.
I was lucky to see the awsome bioreactors they have set over the years there in the Laboratory of Food Biotechnology but most importantly, a group of people that struggle to keep doing what they love.
I was able to see how the data are produced but also spread the world about microbetag, our co-occurrence network annotator.
For sure, two weeks were not enough, so I hope we will meet again even somewhere not as beautiful as the snowy Zurich. A great thanks to Dr. Annelies Geirnaert and Professor for this opportunity and to all the lab members for making my stay so pleasant. Of course, a special thanks to Dr. Andi Erega for all the explanations and patience with me knowing but the basics in the lab but mostly for the burek!
metaGOflow is a Common Workflow Language (CWL) based pipeline for the analysis of shotgun metagenomic data at the sample level. It was initially built to address the needs of the EMO BON community but it can be used for any type of shotgun sequencing data.
metaGOflow is based on the tools and subworkflow implemented in the framework of MGnify.