%0 Journal Article %J Front Bioinform %D 2023 %T Visualization of automatically combined disease maps and pathway diagrams for rare diseases. %A Gawron, Piotr %A Hoksza, David %A Piñero, Janet %A Peña-Chilet, Maria %A Esteban-Medina, Marina %A Fernandez-Rueda, Jose Luis %A Colonna, Vincenza %A Smula, Ewa %A Heirendt, Laurent %A Ancien, François %A Grouès, Valentin %A Satagopam, Venkata P %A Schneider, Reinhard %A Dopazo, Joaquin %A Furlong, Laura I %A Ostaszewski, Marek %X

Investigation of the molecular mechanisms of human disorders, especially rare diseases, requires exploration of various knowledge repositories to build precise hypotheses and interpret complex data. A growing number of resources offer diagrammatic representations of such mechanisms, including disease-dedicated schematics in pathway databases and disease maps. However, collecting knowledge across them is challenging, especially for research projects with limited manpower. In this article we present an automated workflow for constructing maps of molecular mechanisms for rare diseases. The workflow requires a standardized definition of a disease using Orphanet or HPO identifiers to collect relevant genes and variants, and to assemble a functional, visual repository of related mechanisms, including data overlays. The diagrams composing the final map are unified to a common systems biology format from CellDesigner SBML, GPML and SBML+layout+render. The constructed resource contains disease-relevant genes and variants as data overlays for immediate visual exploration, including an embedded genetic variant browser and protein structure viewer. We demonstrate the functionality of our workflow on two examples of rare diseases: Kawasaki disease and retinitis pigmentosa. Two maps are constructed based on their corresponding identifiers. Moreover, for the retinitis pigmentosa use case, we include a list of differentially expressed genes to demonstrate how to tailor the workflow using omics datasets. In summary, our work allows ad hoc construction of molecular diagrams combined from different sources, preserving their layout and graphical style while integrating them into a single resource. This reduces the time-consuming work of prototyping a molecular disease map and enables visual exploration, hypothesis building, data visualization and further refinement.
The code of the workflow is open and accessible at https://gitlab.lcsb.uni.lu/minerva/automap/.

%B Front Bioinform %V 3 %P 1101505 %8 2023 %G eng %R 10.3389/fbinf.2023.1101505

%0 Journal Article %J PLoS Comput Biol %D 2021 %T A versatile workflow to integrate RNA-seq genomic and transcriptomic data into mechanistic models of signaling pathways. %A Garrido-Rodriguez, Martín %A López-López, Daniel %A Ortuno, Francisco M %A Peña-Chilet, Maria %A Muñoz, Eduardo %A Calzado, Marco A %A Dopazo, Joaquin %K Algorithms %K Cell Line, Tumor %K Computational Biology %K Databases, Factual %K Gene Expression Profiling %K Genomics %K High-Throughput Nucleotide Sequencing %K Humans %K Models, Theoretical %K mutation %K RNA-seq %K Signal Transduction %K Software %K Transcriptome %K whole exome sequencing %K Workflow %X

MIGNON is a workflow for the analysis of RNA-Seq experiments that not only efficiently manages the estimation of gene expression levels from raw sequencing reads but also calls genomic variants present in the transcripts analyzed. Moreover, this is the first workflow that provides a framework for the integration of transcriptomic and genomic data based on a mechanistic model of signaling pathway activities that allows a detailed biological interpretation of the results, including a comprehensive functional profiling of cell activity. MIGNON covers the whole process, from reads to signaling circuit activity estimations, using state-of-the-art tools. It is easy to use and deployable in different computational environments, allowing optimized use of the available resources.

%B PLoS Comput Biol %V 17 %P e1008748 %8 2021 02 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/33571195?dopt=Abstract %R 10.1371/journal.pcbi.1008748

%0 Journal Article %J BMC Bioinformatics %D 2017 %T VISMapper: ultra-fast exhaustive cartography of viral insertion sites for gene therapy. %A Juanes, José M %A Gallego, Asunción %A Tárraga, Joaquín %A Chaves, Felipe J %A Marin-Garcia, Pablo %A Medina, Ignacio %A Arnau, Vicente %A Dopazo, Joaquin %K Base Sequence %K Genetic Therapy %K Genetic Vectors %K High-Throughput Nucleotide Sequencing %K Humans %K Internet %K User-Computer Interface %K Virus Integration %X

BACKGROUND: Integrating viral vectors can become a persistent part of the host genome, which makes them a crucial element of clinical gene therapy. However, viral integration has associated risks, such as the unintentional activation of oncogenes that can result in cancer. Therefore, the analysis of integration sites of retroviral vectors is a crucial step in developing safer vectors for therapeutic use.

RESULTS: Here we present VISMapper, a vector integration site analysis web server for analyzing next-generation sequencing data for retroviral vector integration sites. VISMapper can be found at: http://vismapper.babelomics.org.

CONCLUSIONS: Because it uses novel mapping algorithms, VISMapper is considerably faster than previously available programs. It also provides a useful graphical interface for analyzing the integration sites found in their genomic context.

%B BMC Bioinformatics %V 18 %P 421 %8 2017 Sep 20 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/28931371?dopt=Abstract %R 10.1186/s12859-017-1837-z

%0 Journal Article %J Nucleic Acids Res %D 2012 %T VARIANT: Command Line, Web service and Web interface for fast and accurate functional characterization of variants found by Next-Generation Sequencing. %A Medina, Ignacio %A De Maria, Alejandro %A Bleda, Marta %A Salavert, Francisco %A Alonso, Roberto %A Gonzalez, Cristina Y %A Dopazo, Joaquin %K Databases, Nucleic Acid %K Genetic Variation %K High-Throughput Nucleotide Sequencing %K Internet %K Molecular Sequence Annotation %K mutation %K Polymorphism, Single Nucleotide %K Software %K User-Computer Interface %X

The massive use of Next-Generation Sequencing (NGS) technologies is uncovering an unexpected amount of variability. The functional characterization of such variability, particularly of its most common form, Single Nucleotide Variants (SNVs), has become a priority that needs to be addressed in a systematic way. VARIANT (VARIant ANalysis Tool) reports information on the variants found, including consequence type and annotations taken from different databases and repositories (SNPs and variants from dbSNP and 1000 Genomes, and disease-related variants from the Genome-Wide Association Study (GWAS) catalog, Online Mendelian Inheritance in Man (OMIM), Catalog of Somatic Mutations in Cancer (COSMIC) mutations, etc.). VARIANT also produces a rich variety of annotations that include information on the regulatory (transcription factor or miRNA-binding sites, etc.) or structural roles, or on the selective pressures on the sites affected by the variation. This information extends conventional reports beyond the coding regions and expands the knowledge of the contribution of non-coding or synonymous variants to the phenotype studied. Unlike other tools, VARIANT uses a remote database and operates through efficient RESTful Web Services that optimize search and transaction operations. In this way, local problems of installation, update or disk size limitations are overcome without sacrificing speed (thousands of variants are processed per minute). VARIANT is available at: http://variant.bioinfo.cipf.es.

%B Nucleic Acids Res %V 40 %P W54-8 %8 2012 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/22693211?dopt=Abstract %R 10.1093/nar/gks572

%0 Journal Article %J Protein Eng Des Sel %D 2006 %T Variable gap penalty for protein sequence-structure alignment %A Madhusudhan, M. S. %A Marti-Renom, M. A. %A Sanchez, R. %A Sali, A. %K Algorithms %K Amino Acid Sequence %K Models, Molecular %K Molecular Sequence Data %K Proteins/*chemistry %K Sequence Alignment/*methods %K Sequence Analysis, Protein/*methods %K *Sequence Homology, Amino Acid %K *Software %X The penalty for inserting gaps into an alignment between two protein sequences is a major determinant of the alignment accuracy. Here, we present an algorithm for finding a globally optimal alignment by dynamic programming that can use a variable gap penalty (VGP) function of any form. We also describe a specific function that depends on the structural context of an insertion or deletion. It penalizes gaps that are introduced within regions of regular secondary structure, buried regions, straight segments and also between two spatially distant residues. The parameters of the penalty function were optimized on a set of 240 sequence pairs of known structure, spanning the sequence identity range of 20-40%. We then tested the algorithm on another set of 238 sequence pairs of known structures. The use of the VGP function increases the number of correctly aligned residues from 81.0 to 84.5% in comparison with the optimized affine gap penalty function; this difference is statistically significant according to Student’s t-test. We estimate that the new algorithm allows us to produce comparative models with an additional approximately 7 million accurately modeled residues in the approximately 1.1 million proteins that are detectably related to a known structure.
%B Protein Eng Des Sel %V 19 %P 129-33 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16423846

%0 Journal Article %J FEBS Lett %D 2005 %T Variation and evolution of biomolecular systems: searching for functional relevance %A Huynen, M. A. %A Gabaldón, T. %A Snel, B. %K *Evolution, Molecular %K Genetic Variation %K Multiprotein Complexes/*genetics %K Phylogeny %K Protein Binding/genetics %X The availability of genome sequences and functional genomics data from multiple species enables us to compare the composition of biomolecular systems like biochemical pathways and protein complexes between species. Here, we review small- and large-scale, "genomics-based" approaches to biomolecular systems variation. In general, caution is required when comparing the results of bioinformatics analyses of genomes or of functional genomics data between species. Limitations to the sensitivity of sequence analysis tools and the noisy nature of genomics data tend to lead to systematic overestimates of the amount of variation. Nevertheless, the results from detailed manual analyses, and of large-scale analyses that filter out systematic biases, point to a large amount of variation in the composition of biomolecular systems. Such observations challenge our understanding of the function of the systems and their individual components and can potentially facilitate the identification and functional characterization of sub-systems within a system. Mapping the inter-species variation of complex biomolecular systems on a phylogenetic species tree allows one to reconstruct their evolution. %B FEBS Lett %V 579 %P 1839-45 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15763561