Bioinformatics Tools

  • HiPathia

    Mechanistic models of signaling pathways

  • CoV-HiPathia

    Mechanistic models of the COVID-19 disease

  • CSVS

    A crowdsourcing database of the Spanish population genetic variability

  • SPACNACS

    Crowdsourcing initiative to provide information about Copy Number Variations of the Spanish population to the scientific/medical community

  • Metabolizer

    Differential metabolic activity and discovery of therapeutic targets using summarized metabolic pathway models

  • MIGNON

    A versatile workflow to integrate RNA-seq genomic and transcriptomic data into mechanistic models of signaling pathways

  • ngsCAT

    Next Generation Sequencing data Capture Assessment Tool

HiPathia: Mechanistic models of signaling pathways


What is a mechanistic model?

Mechanistic models aim to bridge the gap between easily available genomic data, which account for gene activity (transcriptomics) or gene integrity (genome/exome sequencing), and the complex cell or organism phenotype (cell functional decisions or fate, ultimately responsible for the observed conditions, e.g. disease, drug response, etc.).

Genome-scale mechanistic models rely on the knowledge of cell signaling and metabolism already available (e.g. KEGG, Reactome, etc.), on top of which a mathematical model is built. These models can quantify the intensity of signal transduction from the original measurements of gene expression, and consequently the activity of the different signaling circuits. They are called mechanistic models because they model the molecular mechanisms that dictate cell action and fate. Since they convey the notion of causality, these models can be used not only to understand disease mechanisms in detail but also to simulate the effects of interventions (e.g. drug inhibitions).

Here we present the mechanistic model HiPathia (Hidalgo et al., 2017), a model that simulates the transduction of the signal along signaling circuits in the pathways (see Figure 1), taking the gene expression values as proxies of the corresponding protein activities and considering distinct types of interactions (inhibitions and activations). HiPathia is an improvement of a previous algorithm (Sebastian-Leon et al., 2013, 2014) that overcomes some limitations of its probabilistic approach.

[Figure 1: circuit propagation]
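The propagation idea described above can be illustrated with a minimal Python sketch. This is not the HiPathia implementation. It assumes a rule in which the signal leaving a node equals the node's normalized expression value (between 0 and 1) multiplied by the combined incoming activation, 1 - prod(1 - s_a), and by the combined absence of inhibition, prod(1 - s_i). The toy circuit, the node names and the expression values are hypothetical.

```python
from math import prod


def node_signal(expression, activator_signals, inhibitor_signals):
    """Signal leaving a node under the assumed propagation rule."""
    # Assumption: a node without incoming activation edges (e.g. a receptor)
    # receives a full incoming activation of 1.
    if activator_signals:
        activation = 1.0 - prod(1.0 - s for s in activator_signals)
    else:
        activation = 1.0
    # Each inhibitor signal scales the output down; an empty product is 1.
    inhibition = prod(1.0 - s for s in inhibitor_signals)
    return expression * activation * inhibition


def propagate(circuit, expression):
    """Propagate the signal through a circuit given in topological order.

    circuit: list of (node, activator_nodes, inhibitor_nodes)
    expression: dict mapping node -> normalized expression in [0, 1]
    """
    signal = {}
    for node, activators, inhibitors in circuit:
        signal[node] = node_signal(
            expression[node],
            [signal[a] for a in activators],
            [signal[i] for i in inhibitors],
        )
    return signal


# Hypothetical circuit: receptor -> kinase -> effector, repressor -| effector
toy_circuit = [
    ("receptor", [], []),
    ("kinase", ["receptor"], []),
    ("repressor", [], []),
    ("effector", ["kinase"], ["repressor"]),
]
expression = {"receptor": 0.8, "kinase": 0.5, "repressor": 0.25, "effector": 1.0}

baseline = propagate(toy_circuit, expression)
# Simulated intervention: knock-down of the kinase to a low expression value.
knockdown = propagate(toy_circuit, {**expression, "kinase": 0.05})

print(f"effector signal, baseline:   {baseline['effector']:.3f}")
print(f"effector signal, knock-down: {knockdown['effector']:.3f}")
```

In this toy setting the knock-down lowers the signal that reaches the effector, which is the kind of simulated intervention the text refers to. Real pathways contain branching, protein complexes and feedback, none of which this sketch handles.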

A recent benchmarking study has shown that the HiPathia algorithm outperforms competing algorithms for modeling signaling pathways. The mechanistic model implemented in HiPathia has been successfully used to understand the disease mechanisms behind different cancers (including neuroblastoma), cancer-prone rare genodermatoses and common diseases such as diabetes, as well as the response of cell lines to drugs (Amadoz et al., 2015), drug repositioning (Esteban-Medina et al., 2019; Loucera et al., 2020) and other biologically interesting scenarios, such as the molecular mechanisms that explain how stress-induced activation of brown adipose tissue prevents obesity (Razzoli et al., 2016) or the mechanisms of death and the post-mortem ischemia of a tissue. Moreover, mechanistic models have recently been used to deconvolute the functional landscape of glioblastoma at single-cell level.

HiPathia implementations

Currently three implementations of the HiPathia mechanistic model of signaling pathways are available:

  • R/Bioconductor package, for experienced users interested in programmatic use of the algorithm.

  • Cytoscape plugin, which offers a graphical environment for end users in the Cytoscape community.

  • Web tool, with a dynamic, intuitive graphical interface, useful for inexperienced users. Beyond the classical differential circuit activity analysis for two-class comparisons, the web interface can analyze the impact of simulated interventions (inhibitions, namely knock-outs or knock-downs, over-expressions, etc.) on the activity of the pathways and evaluate the potential consequences of mutations on signaling. It also allows building predictors that use signaling circuit activities as features. The features that the predictor selects as relevant for class discrimination also provide valuable insight into the molecular mechanisms that explain the differences between the conditions being compared, such as diseases or drug action mechanisms.


Hidalgo MR, Cubuk C, Amadoz A, Salavert F, Carbonell-Caballero J, Dopazo J: High throughput estimation of functional cell activities reveals disease mechanisms and predicts relevant clinical outcomes. Oncotarget 2017, 8:5160-5178

CoV-HiPathia: mechanistic models of the COVID-19 disease


CoV-HiPathia (Rian et al., 2021) is a web interface that implements a comprehensive mechanistic model of the SARS-CoV-2 disease map (Ostaszewski et al., 2020). In this framework, the detailed activity of the human signaling circuits related to the viral infection, ranging from the entry and replication mechanisms to downstream consequences such as inflammation and antigenic response, can be inferred from gene expression experiments. Moreover, the effect of potential interventions, such as knock-downs or drug effects (the system currently models the effect of more than 8000 DrugBank drugs), can be studied. This freely available tool not only provides an unprecedentedly detailed view of the mechanisms of viral invasion and its consequences in the cell, but also has the potential to become an invaluable asset in the search for efficient antiviral treatments.


CoV-HiPathia is available here:


Rian K, Esteban-Medina M, Hidalgo MR, Çubuk C, Falco MM, Loucera C, Gunyel D, Ostaszewski M, Peña-Chilet M, Dopazo J. Mechanistic modeling of the SARS-CoV-2 disease map. BioData Min. 2021 Jan 21;14(1):5. doi: 10.1186/s13040-021-00234-1.

CSVS: A crowdsourcing database of the Spanish population genetic variability


Knowledge of the genetic variability of the local population is of utmost importance in personalized medicine and has proved to be a critical factor for the discovery of new disease variants. Here, we present the Collaborative Spanish Variability Server (CSVS) (Peña-Chilet et al., 2020), which currently contains more than 2000 genomes and exomes of unrelated Spanish individuals. This database has been generated in a collaborative crowdsourcing effort collecting sequencing data produced by local genomic projects and for other purposes, such as the MGP (Dopazo et al., 2016). Sequences have been grouped by ICD10 upper categories. A web interface allows querying the database while removing one or more ICD10 categories. In this way, aggregated counts of allele frequencies of the pseudo-control Spanish population can be obtained for diseases belonging to the category removed. In addition to pseudo-control studies, some population studies can be carried out, for example on the prevalence of pharmacogenomic variants. These genomic data have also been used to define the first Spanish Genome Reference Panel (SGRP1.0) for imputation. This is the first local repository of variability entirely produced by a crowdsourcing effort and constitutes an example for future initiatives to characterize local variability worldwide. CSVS is also part of the GA4GH Beacon network.
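The pseudo-control query described above can be sketched in a few lines of Python. This is only an illustration of the idea of aggregating allele counts after removing one or more ICD10 categories. It is not the CSVS API or data model, and the variant identifier, category labels and counts are invented for the example.

```python
from collections import defaultdict

# Hypothetical aggregated counts:
# (variant, ICD10 category) -> (alternate allele count, total alleles genotyped)
allele_counts = {
    ("chr1:1000:A>G", "H00-H59"): (6, 100),
    ("chr1:1000:A>G", "I00-I99"): (2, 100),
    ("chr1:1000:A>G", "C00-D48"): (2, 200),
}


def pseudo_control_frequencies(counts, excluded_categories):
    """Allele frequency per variant, ignoring samples in the excluded categories."""
    totals = defaultdict(lambda: [0, 0])  # variant -> [alt count, total alleles]
    for (variant, category), (alt, total) in counts.items():
        if category in excluded_categories:
            continue  # drop the disease category under study
        totals[variant][0] += alt
        totals[variant][1] += total
    return {
        variant: alt / total
        for variant, (alt, total) in totals.items()
        if total > 0
    }


# Frequencies over everyone, and over the pseudo-controls for an eye disease study.
all_samples = pseudo_control_frequencies(allele_counts, excluded_categories=set())
pseudo_controls = pseudo_control_frequencies(allele_counts, excluded_categories={"H00-H59"})

print(all_samples)
print(pseudo_controls)
```

With these invented counts, the variant looks more common in the whole database than in the pseudo-control subset, because the removed category carries most of the alternate alleles.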


CSVS is available here:


Peña-Chilet M, Roldán G, Perez-Florido J, Ortuño FM, Carmona R, Aquino V, Lopez-Lopez D, Loucera C, Fernandez-Rueda JL, Gallego A, García-Garcia F, González-Neira A, Pita G, Núñez-Torres R, Santoyo-López J, Ayuso C, Minguez P, Avila-Fernandez A, Corton M, Moreno-Pelayo MÁ, Morin M, Gallego-Martinez A, Lopez-Escamez JA, Borrego S, Antiñolo G, Amigo J, Salgado-Garrido J, Pasalodos-Sanchez S, Morte B; Spanish Exome Crowdsourcing Consortium, Carracedo Á, Alonso Á, Dopazo J. CSVS, a crowdsourcing database of the Spanish population genetic variability. Nucleic Acids Res. 2021 Jan 8;49(D1):D1130-D1137. doi: 10.1093/nar/gkaa794.

SPACNACS: A crowdsourcing initiative to provide information about Copy Number Variations of the Spanish population to the scientific/medical community.


SPACNACS is a crowdsourcing initiative to provide information about Copy Number Variations of the Spanish population to the scientific/medical community. We accept submissions from WES or WGS, no matter whether these come from healthy or diseased individuals.

The sequences were contributed by different consortia and projects, including groups from the Spanish Network for Research in Rare Diseases (CIBERER), results from the EnoD, the Project Genome 1000 Navarra, and other research groups and initiatives across Spain.


SPACNACS is an open resource available at

Metabolizer: Differential metabolic activity and discovery of therapeutic targets using summarized metabolic pathway models


Metabolizer is a web-based application with an intuitive, easy-to-use interactive interface for analyzing differences in the activity of metabolic pathway modules. It can also be used for class prediction and for in silico prediction of knock-out (KO) effects (Cubuk et al., 2019). Moreover, Metabolizer can automatically predict the optimal KO intervention for restoring a diseased phenotype. We provide different types of validations of some of the predictions made by Metabolizer. Using gene expression measurements, the tool helps to understand the molecular mechanisms of disease or the mechanism of action (MoA) of drugs in the context of metabolism. In addition, it automatically suggests potential therapeutic targets for individualized therapeutic interventions (Cubuk et al., 2018).


Metabolizer is available here:


Çubuk C, Hidalgo MR, Amadoz A, Rian K, Salavert F, Pujana MA, Mateo F, Herranz C, Carbonell-Caballero J, Dopazo J. Differential metabolic activity and discovery of therapeutic targets using summarized metabolic pathway models. NPJ Syst Biol Appl. 2019 Mar 1;5:7. doi: 10.1038/s41540-019-0087-2. eCollection 2019.

MIGNON: A versatile workflow to integrate RNA-seq genomic and transcriptomic data into mechanistic models of signaling pathways


MIGNON is a workflow for the analysis of RNA-Seq experiments that not only efficiently manages the estimation of gene expression levels from raw sequencing reads, but also calls genomic variants present in the transcripts analyzed. Moreover, it is the first workflow that provides a framework for the integration of transcriptomic and genomic data based on a mechanistic model of signaling pathway activities, which allows a detailed biological interpretation of the results, including a comprehensive functional profiling of cell activity. MIGNON covers the whole process, from reads to signaling circuit activity estimations, using state-of-the-art tools. It is easy to use and deployable in different computational environments, allowing an optimized use of the available resources.


MIGNON is available here:

The documentation can be found at

Instructions to run a bash script to perform a dry run can be found at


Garrido-Rodriguez M, Lopez-Lopez D, Ortuno FM, Peña-Chilet M, Muñoz E, Calzado MA, Dopazo J. A versatile workflow to integrate RNA-seq genomic and transcriptomic data into mechanistic models of signaling pathways. PLoS Comput Biol. 2021 Feb 11;17(2):e1008748.

ngsCAT: Next Generation Sequencing data Capture Assessment Tool

ngsCAT (López-Domingo et al., 2014) is a command-line application written in Python that facilitates a comprehensive evaluation of the performance of the capture step in targeted high-throughput sequencing experiments in terms of:

  • Sensitivity, which assesses the quality of the coverage on target regions. It is also important to provide a means of estimating how this coverage would improve by increasing sequencing depth.

  • Specificity, which measures how much of the sequencing effort is wasted on sequencing off-target bases.

  • Uniformity, which assesses sequencing biases due to specific genomic locations or nucleotide composition.

ngsCAT is an easy-to-use tool that can be run with a single command on a standard computer, generating a detailed HTML report with metrics, summary tables, figures and plots that evaluate the efficiency of targeted enrichment sequencing.
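As a rough illustration of the three criteria listed above, the following Python sketch computes simple stand-ins for them from per-base depths. These are common textbook-style definitions chosen for the example and are an assumption, not the metrics or code that ngsCAT actually uses. The function name, the depth threshold and the input values are hypothetical.

```python
from statistics import mean, pstdev


def capture_metrics(target_depths, on_target_bases, total_bases, min_depth=10):
    """Toy capture-performance summary (assumed definitions, not ngsCAT's).

    target_depths: read depth at each base of the target regions
    on_target_bases: sequenced bases that fall inside the target regions
    total_bases: all sequenced bases
    """
    covered = sum(1 for depth in target_depths if depth >= min_depth)
    return {
        # Sensitivity: fraction of target bases covered at min_depth or more.
        "sensitivity": covered / len(target_depths),
        # Specificity: fraction of the sequencing effort spent on target.
        "specificity": on_target_bases / total_bases,
        # Uniformity: coefficient of variation of depth (lower is more even).
        "depth_cv": pstdev(target_depths) / mean(target_depths),
    }


# Hypothetical five-base target region.
depths = [0, 10, 20, 30, 40]
metrics = capture_metrics(depths, on_target_bases=sum(depths), total_bases=125)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A real assessment would read depths from alignment files and report the metrics per region and per sample, and the estimate of how coverage improves with sequencing depth mentioned above is not covered by this sketch.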


ngsCAT is available at


López-Domingo FJ, Florido JP, Rueda A, Dopazo J, Santoyo-López J. ngsCAT: a tool to assess the efficiency of targeted enrichment sequencing. Bioinformatics. 2014;30(12):1767-1768. doi: 10.1093/bioinformatics/btu108.