PerMedCoE releases new tool COBREXA.jl

To pave the way for Exascale analysis of metabolic models of organisms and their biochemical interactions, PerMedCoE has implemented a new tool that will help orchestrate comprehensive HPC analyses.

Constraint-based modeling is an established way to represent and analyse biochemical processes and metabolism of living organisms and ecosystems. Individual reactions that power life are represented as single processes, which, at a certain rate, consume chemical reactants and output products. The rate of reactions is subject to constraints, as many reactions may (or may not) be unidirectional, the total reaction speed may be limited by the properties of the environment, and the influx of available reactants may be limited by factors such as transport speed across the cell membranes. Individual reactions are connected into larger groups (“networks of reactions”), forming comprehensive descriptions of organelles, cells, tissues, organs, and whole organisms.

This approach has been successfully used to create models that range in size from small microorganisms, such as bacteria and small eukaryotes, that “run” several thousand different reactions, to large models of humans with hundreds of thousands of individual reactions taking place simultaneously in thousands of compartments. Recently, metagenomic studies have constructed models of ocean microflora that use millions of individual reactions to capture the huge biodiversity of plankton and the migration of metabolites and nutrients in the ocean.

To analyse these models, scientists use numerical methods to find a stable state of each constructed model, together with the expected flow of metabolites, e.g., the influx of nutrients, the output of biomass for building new organisms, and the release of chemical byproducts. The models are regularly updated to match measurements obtained from real organisms and other new research results. The ability to quickly determine how the metabolism of a huge model functions under given conditions, such as nutrient availability, is crucial for predicting and explaining many metabolism-related phenomena.
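For readers who prefer a formula, the search for a stable state is often written as a linear optimisation problem known as flux balance analysis. The text above does not say which numerical methods are used, so the formulation below is given only as a common example. The symbols are assumptions introduced here: S is the stoichiometric matrix of the reaction network, v is the vector of reaction rates, l and u are the lower and upper bounds on those rates, and c selects the objective, typically the biomass-producing reaction.

```latex
\[
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
& l \le v \le u
\end{aligned}
\]
```

The condition S v = 0 states that no metabolite accumulates or is depleted, which is what "stable state" means here. Setting a lower bound to zero makes a reaction unidirectional, and the upper bounds encode limits such as nutrient transport speed.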

In this context, COBREXA.jl allows scientists to easily orchestrate huge analyses on the available HPC resources. This ability is important for many typical analyses that try to identify the cause of a metabolic property of an organism by searching through thousands of variants of a matching model for one that could cause the desired or observed effect. While processing a single model of a human may take mere minutes on current computers, a solid understanding of the model's variability may require analysing millions of individual variants, which would take years on a single computer. When the analysis runs in parallel on thousands of computing nodes at an HPC center, the millions of variants can again be analysed in minutes or hours, delivering the required insight into the inner processes of the organism.
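The idea of screening many model variants in parallel can be illustrated with a deliberately tiny sketch. It does not use COBREXA.jl or its API. The network, the reaction names, the bounds, and all function names below are hypothetical and chosen so that the stable state can be computed by hand. A thread pool stands in for the HPC nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy network (not taken from any real model):
#   uptake brings in metabolite A, two parallel routes convert A to B,
#   and the biomass reaction consumes B. All reactions are unidirectional.
# Values are upper bounds on the reaction rates in arbitrary units.
BASE_BOUNDS = {"uptake": 10.0, "route_1": 6.0, "route_2": 3.0, "biomass": 1000.0}


def max_biomass_flux(bounds):
    """Return the largest steady-state biomass flux of the toy network.

    At steady state neither A nor B accumulates, so the uptake flux, the
    combined flux of both routes, and the biomass flux are all equal. The
    maximum is therefore the tightest of these three capacities.
    """
    return min(
        bounds["uptake"],
        bounds["route_1"] + bounds["route_2"],
        bounds["biomass"],
    )


def knockout(bounds, reaction):
    """Return a copy of the bounds with one reaction disabled."""
    variant = dict(bounds)
    variant[reaction] = 0.0
    return variant


def screen_knockouts(bounds, workers=4):
    """Evaluate every single-reaction knockout variant in parallel."""
    reactions = list(bounds)
    variants = [knockout(bounds, r) for r in reactions]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        fluxes = list(pool.map(max_biomass_flux, variants))
    return dict(zip(reactions, fluxes))


results = screen_knockouts(BASE_BOUNDS)
for reaction, flux in results.items():
    print(f"knockout of {reaction}: maximal biomass flux {flux}")
```

Each variant is independent of the others, which is why this kind of screening distributes well: in a real analysis every variant would be a full optimisation problem, and the pool of threads would be replaced by processes spread across many computing nodes.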

Visualisation of a stable state of the metabolic network of E. coli processed by COBREXA.jl, showing individual metabolites (white dots) that are consumed and produced by reactions at varying rates (colored lines)


With comprehensive information on human metabolic processes available in the latest human models, the ability to simulate and explore those models directly benefits personalised medicine by providing a knowledge base that may help scientists and clinicians infer the origin of metabolic anomalies, rare diseases, and allergies.

A direct clinical application has emerged with the availability of individual genome sequencing and variant screening. Based on curated information that describes the correspondence between genes, gene mutations, and the reactions in metabolic processes, it is possible to reconstruct a complete model of a person's own metabolism, sometimes called a virtual metabolic twin. This model can later be used to predict the presence of problems that may lead to disease, and to avoid the disease with early targeted medication and other preventive measures.

The ability to model whole communities of organisms can be used to simulate even the complex relations within the human gut microbiome. Understanding this microbiome means understanding many complicated aspects of human nutrition, including food intolerances and allergies that stem from imbalances in the microbial ecosystems of our intestines. Developers of COBREXA.jl also cooperate with industrial stakeholders that specialise in using such models to create personalised models of digestion and nutrient decomposition, which may in turn be used for predicting and avoiding food-related health problems.

Applications of modeling metabolic ecosystems may reach far beyond small-scale medicine. We expect that the ability to estimate the metabolic output of bacterial communities may help to design new methods in bioengineering. Such models could suggest viable combinations of individual bacteria and their genetically modified versions that would improve the processing of chemicals in large bioreactors and industry-scale bioengineering, thus improving common “green” technologies such as biofuel production and the safe biodegradation of waste material.

Authors: Miroslav Kratochvíl, Laurent Heirendt, Wei Gu, Christophe Trefois (University of Luxembourg)