%0 Journal Article %J Front Immunol %D 2024 %T Drug-target identification in COVID-19 disease mechanisms using computational systems biology approaches. %A Niarakis, Anna %A Ostaszewski, Marek %A Mazein, Alexander %A Kuperstein, Inna %A Kutmon, Martina %A Gillespie, Marc E %A Funahashi, Akira %A Acencio, Marcio Luis %A Hemedan, Ahmed %A Aichem, Michael %A Klein, Karsten %A Czauderna, Tobias %A Burtscher, Felicia %A Yamada, Takahiro G %A Hiki, Yusuke %A Hiroi, Noriko F %A Hu, Finterly %A Pham, Nhung %A Ehrhart, Friederike %A Willighagen, Egon L %A Valdeolivas, Alberto %A Dugourd, Aurélien %A Messina, Francesco %A Esteban-Medina, Marina %A Peña-Chilet, Maria %A Rian, Kinza %A Soliman, Sylvain %A Aghamiri, Sara Sadat %A Puniya, Bhanwar Lal %A Naldi, Aurélien %A Helikar, Tomáš %A Singh, Vidisha %A Fernández, Marco Fariñas %A Bermudez, Viviam %A Tsirvouli, Eirini %A Montagud, Arnau %A Noël, Vincent %A Ponce-de-Leon, Miguel %A Maier, Dieter %A Bauch, Angela %A Gyori, Benjamin M %A Bachman, John A %A Luna, Augustin %A Piñero, Janet %A Furlong, Laura I %A Balaur, Irina %A Rougny, Adrien %A Jarosz, Yohan %A Overall, Rupert W %A Phair, Robert %A Perfetto, Livia %A Matthews, Lisa %A Rex, Devasahayam Arokia Balaya %A Orlic-Milacic, Marija %A Gomez, Luis Cristobal Monraz %A De Meulder, Bertrand %A Ravel, Jean Marie %A Jassal, Bijay %A Satagopam, Venkata %A Wu, Guanming %A Golebiewski, Martin %A Gawron, Piotr %A Calzone, Laurence %A Beckmann, Jacques S %A Evelo, Chris T %A D'Eustachio, Peter %A Schreiber, Falk %A Saez-Rodriguez, Julio %A Dopazo, Joaquin %A Kuiper, Martin %A Valencia, Alfonso %A Wolkenhauer, Olaf %A Kitano, Hiroaki %A Barillot, Emmanuel %A Auffray, Charles %A Balling, Rudi %A Schneider, Reinhard %K Computer Simulation %K COVID-19 %K drug repositioning %K Humans %K SARS-CoV-2 %K Systems biology %X

INTRODUCTION: The COVID-19 Disease Map project is a large-scale community effort uniting 277 scientists from 130 institutions around the globe. We use high-quality, mechanistic content describing SARS-CoV-2-host interactions and develop interoperable bioinformatic pipelines for novel target identification and drug repurposing.

METHODS: Extensive community work allowed an impressive step forward in building interfaces between Systems Biology tools and platforms. Our framework can link biomolecules from omics data analysis and computational modelling to dysregulated pathways in a cell-, tissue- or patient-specific manner. Drug repurposing using text mining and AI-assisted analysis identified potential drugs, chemicals and microRNAs that could target the identified key factors.

RESULTS: Results revealed drugs already tested for anti-COVID-19 efficacy, providing a mechanistic context for their mode of action, and drugs already in clinical trials for treating other diseases, never tested against COVID-19.

DISCUSSION: The key advance is that the proposed framework is versatile and expandable, offering a significant upgrade in the arsenal for virus-host interactions and other complex pathologies.

%B Front Immunol %V 14 %P 1282859 %8 2023 %G eng %R 10.3389/fimmu.2023.1282859 %0 Journal Article %J Aging Cell %D 2023 %T microRNAs-mediated regulation of insulin signaling in white adipose tissue during aging: Role of caloric restriction. %A Corrales, Patricia %A Martin-Taboada, Marina %A Vivas-García, Yurena %A Torres, Lucia %A Ramirez-Jimenez, Laura %A Lopez, Yamila %A Horrillo, Daniel %A Vila-Bedmar, Rocio %A Barber-Cano, Eloisa %A Izquierdo-Lahuerta, Adriana %A Peña-Chilet, Maria %A Martínez, Carmen %A Dopazo, Joaquin %A Ros, Manuel %A Medina-Gomez, Gema %X

Caloric restriction is a non-pharmacological intervention known to ameliorate the metabolic defects associated with aging, including insulin resistance. The levels of miRNA expression may represent a predictive tool for aging-related alterations. To investigate the role of miRNAs underlying insulin resistance in adipose tissue during the early stages of aging, 3- and 12-month-old male animals fed ad libitum and 12-month-old male animals fed a 20% calorie-restricted diet were used. In this work we demonstrate that specific miRNAs may contribute to the impaired insulin-stimulated glucose metabolism specifically in the subcutaneous white adipose tissue, through the regulation of target genes implicated in the insulin signaling cascade. Moreover, the expression of these miRNAs is modified by caloric restriction in middle-aged animals, in accordance with the improvement of the metabolic state. Overall, our work demonstrates that alterations in posttranscriptional gene expression caused by miRNA dysregulation might represent an endogenous mechanism by which insulin response in the subcutaneous fat depot is already affected at middle age. Importantly, caloric restriction could prevent this modulation, demonstrating that certain miRNAs could constitute potential biomarkers of age-related metabolic alterations.

%B Aging Cell %P e13919 %8 2023 Jul 04 %G eng %R 10.1111/acel.13919 %0 Journal Article %J Cell Death Discov %D 2023 %T Rapid degeneration of iPSC-derived motor neurons lacking Gdap1 engages a mitochondrial-sustained innate immune response. %A León, Marian %A Prieto, Javier %A Molina-Navarro, María Micaela %A Garcia-Garcia, Francisco %A Barneo-Muñoz, Manuela %A Ponsoda, Xavier %A Sáez, Rosana %A Palau, Francesc %A Dopazo, Joaquin %A Izpisua Belmonte, Juan Carlos %A Torres, Josema %X

Charcot-Marie-Tooth disease is a chronic hereditary motor and sensory polyneuropathy targeting Schwann cells and/or motor neurons. Its multifactorial and polygenic origin portrays a complex clinical phenotype of the disease with a wide range of genetic inheritance patterns. The disease-associated gene GDAP1 encodes a mitochondrial outer membrane protein. Mouse and insect models with mutations in Gdap1 have reproduced several traits of the human disease. However, the precise function in the cell types affected by the disease remains unknown. Here, we use induced-pluripotent stem cells derived from a Gdap1 knockout mouse model to better understand the molecular and cellular phenotypes of the disease caused by the loss-of-function of this gene. Gdap1-null motor neurons display a fragile cell phenotype prone to early degeneration showing (1) altered mitochondrial morphology, with an increase in the fragmentation of these organelles, (2) activation of autophagy and mitophagy, (3) abnormal metabolism, characterized by a downregulation of Hexokinase 2 and ATP5b proteins, (4) increased reactive oxygen species and elevated mitochondrial membrane potential, and (5) increased innate immune response and p38 MAP kinase activation. Our data reveal the existence of an underlying Redox-inflammatory axis fueled by altered mitochondrial metabolism in the absence of Gdap1. As this biochemical axis encompasses a wide variety of druggable targets, our results may have implications for developing therapies using combinatorial pharmacological approaches, thereby improving human welfare.
A Redox-immune axis underlying motor neuron degeneration caused by the absence of Gdap1. Our results show that Gdap1 motor neurons have a fragile cellular phenotype that is prone to degeneration. Gdap1 iPSCs differentiated into motor neurons showed an altered metabolic state: decreased glycolysis and increased OXPHOS. These alterations may lead to hyperpolarization of mitochondria and increased ROS levels. Excessive amounts of ROS might be the cause of increased mitophagy, p38 activation and inflammation as a cellular response to oxidative stress. The p38 MAPK pathway and the immune response may, in turn, have feedback mechanisms, leading to the induction of apoptosis and senescence, respectively. CAC, citric acid cycle; ETC, electron transport chain; Glc, glucose; Lac, lactate; Pyr, pyruvate.

%B Cell Death Discov %V 9 %P 217 %8 2023 Jul 01 %G eng %N 1 %R 10.1038/s41420-023-01531-w %0 Journal Article %J Arch Bronconeumol %D 2022 %T Incidence and Prevalence of Children's Diffuse Lung Disease in Spain. %A Torrent-Vernetta, Alba %A Gaboli, Mirella %A Castillo-Corullón, Silvia %A Mondéjar-López, Pedro %A Sanz Santiago, Verónica %A Costa-Colomer, Jordi %A Osona, Borja %A Torres-Borrego, Javier %A de la Serna-Blázquez, Olga %A Bellón Alonso, Sara %A Caro Aguilera, Pilar %A Gimeno-Díaz de Atauri, Álvaro %A Valenzuela Soria, Alfredo %A Ayats, Roser %A Martin de Vicente, Carlos %A Velasco González, Valle %A Moure González, José Domingo %A Canino Calderín, Elisa María %A Pastor-Vivero, María Dolores %A Villar Álvarez, María Ángeles %A Rovira-Amigo, Sandra %A Iglesias Serrano, Ignacio %A Díez Izquierdo, Ana %A de Mir Messa, Inés %A Gartner, Silvia %A Navarro, Alexandra %A Baz-Redón, Noelia %A Carmona, Rosario %A Camats-Tarruella, Núria %A Fernández-Cancio, Mónica %A Rapp, Christina %A Dopazo, Joaquin %A Griese, Matthias %A Moreno-Galdó, Antonio %X

BACKGROUND: Children's diffuse lung diseases, also known as children's interstitial lung diseases (chILD), are a heterogeneous group of rare diseases with relevant morbidity and mortality, whose diagnosis and classification are very complex. Epidemiological data are scarce. The aim of this study was to analyse the incidence and prevalence of chILD in Spain.

METHODS: Multicentre observational prospective study in patients from 0 to 18 years of age with chILD to analyse its incidence and prevalence in Spain, based on data reported in 2018 and 2019.

RESULTS: A total of 381 cases of chILD were notified from 51 paediatric pulmonology units all over Spain, covering 91.7% of the paediatric population. The average incidence of chILD was 8.18 (95% CI 6.28-10.48) new cases per million children per year. The average prevalence of chILD was 46.53 (95% CI 41.81-51.62) cases per million children. The age group with the highest prevalence was children under 1 year of age. Different types of disorders were seen in children 2-18 years of age compared with children 0-2 years of age. The most frequent diagnoses were: primary pulmonary interstitial glycogenosis in neonates (17/65), neuroendocrine cell hyperplasia of infancy in infants from 1 to 12 months (44/144), idiopathic pulmonary haemosiderosis in children from 1 to 5 years old (13/74), hypersensitivity pneumonitis in children from 5 to 10 years old (9/51), and scleroderma in children older than 10 years (8/47).

CONCLUSIONS: We found a higher incidence and prevalence of chILD than previously described, probably due to greater understanding and increased clinician awareness of these rare diseases.

%B Arch Bronconeumol %V 58 %P 22-29 %8 2022 Jan %G eng %N 1 %R 10.1016/j.arbres.2021.06.001 %0 Journal Article %J Hum Mol Genet %D 2022 %T Novel genes and sex differences in COVID-19 severity. %A Cruz, Raquel %A Almeida, Silvia Diz-de %A Heredia, Miguel López %A Quintela, Inés %A Ceballos, Francisco C %A Pita, Guillermo %A Lorenzo-Salazar, José M %A González-Montelongo, Rafaela %A Gago-Domínguez, Manuela %A Porras, Marta Sevilla %A Castaño, Jair Antonio Tenorio %A Nevado, Julián %A Aguado, Jose María %A Aguilar, Carlos %A Aguilera-Albesa, Sergio %A Almadana, Virginia %A Almoguera, Berta %A Alvarez, Nuria %A Andreu-Bernabeu, Álvaro %A Arana-Arri, Eunate %A Arango, Celso %A Arranz, María J %A Artiga, Maria-Jesus %A Baptista-Rosas, Raúl C %A Barreda-Sánchez, María %A Belhassen-Garcia, Moncef %A Bezerra, Joao F %A Bezerra, Marcos A C %A Boix-Palop, Lucía %A Brión, Maria %A Brugada, Ramón %A Bustos, Matilde %A Calderón, Enrique J %A Carbonell, Cristina %A Castano, Luis %A Castelao, Jose E %A Conde-Vicente, Rosa %A Cordero-Lorenzana, M Lourdes %A Cortes-Sanchez, Jose L %A Corton, Marta %A Darnaude, M Teresa %A De Martino-Rodríguez, Alba %A Campo-Pérez, Victor %A Bustamante, Aranzazu Diaz %A Domínguez-Garrido, Elena %A Luchessi, André D %A Eirós, Rocío %A Sanabria, Gladys Mercedes Estigarribia %A Fariñas, María Carmen %A Fernández-Robelo, Uxía %A Fernández-Rodríguez, Amanda %A Fernández-Villa, Tania %A Gil-Fournier, Belén %A Gómez-Arrue, Javier %A Álvarez, Beatriz González %A Quirós, Fernan Gonzalez Bernaldo %A González-Peñas, Javier %A Gutiérrez-Bautista, Juan F %A Herrero, María José %A Herrero-Gonzalez, Antonio %A Jimenez-Sousa, María A %A Lattig, María Claudia %A Borja, Anabel Liger %A Lopez-Rodriguez, Rosario %A Mancebo, Esther %A Martín-López, Caridad %A Martín, Vicente %A Martinez-Nieto, Oscar %A Martinez-Lopez, Iciar %A Martinez-Resendez, Michel F %A Martinez-Perez, Ángel %A Mazzeu, Juliana A %A Macías, Eleuterio Merayo %A Minguez, Pablo %A Cuerda, Victor Moreno 
%A Silbiger, Vivian N %A Oliveira, Silviene F %A Ortega-Paino, Eva %A Parellada, Mara %A Paz-Artal, Estela %A Santos, Ney P C %A Pérez-Matute, Patricia %A Perez, Patricia %A Pérez-Tomás, M Elena %A Perucho, Teresa %A Pinsach-Abuin, Mel Lina %A Pompa-Mera, Ericka N %A Porras-Hurtado, Gloria L %A Pujol, Aurora %A León, Soraya Ramiro %A Resino, Salvador %A Fernandes, Marianne R %A Rodríguez-Ruiz, Emilio %A Rodriguez-Artalejo, Fernando %A Rodriguez-Garcia, José A %A Ruiz-Cabello, Francisco %A Ruiz-Hornillos, Javier %A Ryan, Pablo %A Soria, José Manuel %A Souto, Juan Carlos %A Tamayo, Eduardo %A Tamayo-Velasco, Alvaro %A Taracido-Fernandez, Juan Carlos %A Teper, Alejandro %A Torres-Tobar, Lilian %A Urioste, Miguel %A Valencia-Ramos, Juan %A Yáñez, Zuleima %A Zarate, Ruth %A Nakanishi, Tomoko %A Pigazzini, Sara %A Degenhardt, Frauke %A Butler-Laporte, Guillaume %A Maya-Miles, Douglas %A Bujanda, Luis %A Bouysran, Youssef %A Palom, Adriana %A Ellinghaus, David %A Martínez-Bueno, Manuel %A Rolker, Selina %A Amitrano, Sara %A Roade, Luisa %A Fava, Francesca %A Spinner, Christoph D %A Prati, Daniele %A Bernardo, David %A García, Federico %A Darcis, Gilles %A Fernández-Cadenas, Israel %A Holter, Jan Cato %A Banales, Jesus M %A Frithiof, Robert %A Duga, Stefano %A Asselta, Rosanna %A Pereira, Alexandre C %A Romero-Gómez, Manuel %A Nafría-Jiménez, Beatriz %A Hov, Johannes R %A Migeotte, Isabelle %A Renieri, Alessandra %A Planas, Anna M %A Ludwig, Kerstin U %A Buti, Maria %A Rahmouni, Souad %A Alarcón-Riquelme, Marta E %A Schulte, Eva C %A Franke, Andre %A Karlsen, Tom H %A Valenti, Luca %A Zeberg, Hugo %A Richards, Brent %A Ganna, Andrea %A Boada, Mercè %A Rojas, Itziar %A Ruiz, Agustín %A Sánchez, Pascual %A Real, Luis Miguel %A Guillén-Navarro, Encarna %A Ayuso, Carmen %A González-Neira, Anna %A Riancho, José A %A Rojas-Martinez, Augusto %A Flores, Carlos %A Lapunzina, Pablo %A Carracedo, Ángel %X

Here we describe the results of a genome-wide study conducted in 11 939 COVID-19 positive cases with extensive clinical information that were recruited from 34 hospitals across Spain (SCOURGE consortium). In sex-disaggregated genome-wide association studies for COVID-19 hospitalization, genome-wide significance (p < 5×10⁻⁸) was crossed for variants in 3p21.31 and 21q22.11 loci only among males (p = 1.3×10⁻²² and p = 8.1×10⁻¹², respectively), and for variants in 9q21.32 near TLE1 only among females (p = 4.4×10⁻⁸). In a second phase, results were combined with an independent Spanish cohort (1598 COVID-19 cases and 1068 population controls), revealing in the overall analysis two novel risk loci in 9p13.3 and 19q13.12, with fine-mapping prioritized variants functionally associated with AQP3 (p = 2.7×10⁻⁸) and ARHGAP33 (p = 1.3×10⁻⁸), respectively. The meta-analysis of both phases with four European studies stratified by sex from the Host Genetics Initiative (HGI) confirmed the association of the 3p21.31 and 21q22.11 loci predominantly in males and replicated a recently reported variant in 11p13 (ELF5, p = 4.1×10⁻⁸). Six of the loci discovered by the COVID-19 HGI were replicated, and an HGI-based genetic risk score predicted the severity strata in SCOURGE. We also found more SNP-heritability and larger heritability differences by age (<60 or ≥60 years) among males than among females. Parallel genome-wide screening of inbreeding depression in SCOURGE also showed an effect of homozygosity on COVID-19 hospitalization and severity, and this effect was stronger among older males. In summary, new candidate genes for COVID-19 severity and evidence supporting genetic disparities among sexes are provided.

%B Hum Mol Genet %8 2022 Jun 16 %G eng %R 10.1093/hmg/ddac132 %0 Journal Article %J BMC Bioinformatics %D 2021 %T A comprehensive database for integrated analysis of omics data in autoimmune diseases. %A Martorell-Marugán, Jordi %A López-Domínguez, Raúl %A García-Moreno, Adrián %A Toro-Domínguez, Daniel %A Villatoro-García, Juan Antonio %A Barturen, Guillermo %A Martín-Gómez, Adoración %A Troule, Kevin %A Gómez-López, Gonzalo %A Al-Shahrour, Fátima %A González-Rumayor, Víctor %A Peña-Chilet, Maria %A Dopazo, Joaquin %A Saez-Rodriguez, Julio %A Alarcón-Riquelme, Marta E %A Carmona-Sáez, Pedro %K Autoimmune Diseases %K Computational Biology %K Databases, Factual %K Humans %X

BACKGROUND: Autoimmune diseases are heterogeneous pathologies with difficult diagnosis and few therapeutic options. In the last decade, several omics studies have provided significant insights into the molecular mechanisms of these diseases. Nevertheless, data from different cohorts and pathologies are stored independently in public repositories and a unified resource is imperative to assist researchers in this field.

RESULTS: Here, we present Autoimmune Diseases Explorer ( https://adex.genyo.es ), a database that integrates 82 curated transcriptomics and methylation studies covering 5609 samples for some of the most common autoimmune diseases. The database provides, in an easy-to-use environment, advanced data analysis and statistical methods for exploring omics datasets, including meta-analysis, differential expression or pathway analysis.

CONCLUSIONS: This is the first omics database focused on autoimmune diseases. This resource incorporates homogeneously processed data to facilitate integrative analyses among studies.

%B BMC Bioinformatics %V 22 %P 343 %8 2021 Jun 24 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/34167460?dopt=Abstract %R 10.1186/s12859-021-04268-4 %0 Journal Article %J Mol Syst Biol %D 2021 %T COVID19 Disease Map, a computational knowledge repository of virus-host interaction mechanisms. %A Ostaszewski, Marek %A Niarakis, Anna %A Mazein, Alexander %A Kuperstein, Inna %A Phair, Robert %A Orta-Resendiz, Aurelio %A Singh, Vidisha %A Aghamiri, Sara Sadat %A Acencio, Marcio Luis %A Glaab, Enrico %A Ruepp, Andreas %A Fobo, Gisela %A Montrone, Corinna %A Brauner, Barbara %A Frishman, Goar %A Monraz Gómez, Luis Cristóbal %A Somers, Julia %A Hoch, Matti %A Kumar Gupta, Shailendra %A Scheel, Julia %A Borlinghaus, Hanna %A Czauderna, Tobias %A Schreiber, Falk %A Montagud, Arnau %A Ponce de Leon, Miguel %A Funahashi, Akira %A Hiki, Yusuke %A Hiroi, Noriko %A Yamada, Takahiro G %A Dräger, Andreas %A Renz, Alina %A Naveez, Muhammad %A Bocskei, Zsolt %A Messina, Francesco %A Börnigen, Daniela %A Fergusson, Liam %A Conti, Marta %A Rameil, Marius %A Nakonecnij, Vanessa %A Vanhoefer, Jakob %A Schmiester, Leonard %A Wang, Muying %A Ackerman, Emily E %A Shoemaker, Jason E %A Zucker, Jeremy %A Oxford, Kristie %A Teuton, Jeremy %A Kocakaya, Ebru %A Summak, Gökçe Yağmur %A Hanspers, Kristina %A Kutmon, Martina %A Coort, Susan %A Eijssen, Lars %A Ehrhart, Friederike %A Rex, Devasahayam Arokia Balaya %A Slenter, Denise %A Martens, Marvin %A Pham, Nhung %A Haw, Robin %A Jassal, Bijay %A Matthews, Lisa %A Orlic-Milacic, Marija %A Senff Ribeiro, Andrea %A Rothfels, Karen %A Shamovsky, Veronica %A Stephan, Ralf %A Sevilla, Cristoffer %A Varusai, Thawfeek %A Ravel, Jean-Marie %A Fraser, Rupsha %A Ortseifen, Vera %A Marchesi, Silvia %A Gawron, Piotr %A Smula, Ewa %A Heirendt, Laurent %A Satagopam, Venkata %A Wu, Guanming %A Riutta, Anders %A Golebiewski, Martin %A Owen, Stuart %A Goble, Carole %A Hu, Xiaoming %A Overall, Rupert W %A Maier, Dieter %A Bauch, Angela %A Gyori, Benjamin M %A 
Bachman, John A %A Vega, Carlos %A Grouès, Valentin %A Vazquez, Miguel %A Porras, Pablo %A Licata, Luana %A Iannuccelli, Marta %A Sacco, Francesca %A Nesterova, Anastasia %A Yuryev, Anton %A de Waard, Anita %A Turei, Denes %A Luna, Augustin %A Babur, Ozgun %A Soliman, Sylvain %A Valdeolivas, Alberto %A Esteban-Medina, Marina %A Peña-Chilet, Maria %A Rian, Kinza %A Helikar, Tomáš %A Puniya, Bhanwar Lal %A Modos, Dezso %A Treveil, Agatha %A Olbei, Marton %A De Meulder, Bertrand %A Ballereau, Stephane %A Dugourd, Aurélien %A Naldi, Aurélien %A Noël, Vincent %A Calzone, Laurence %A Sander, Chris %A Demir, Emek %A Korcsmaros, Tamas %A Freeman, Tom C %A Augé, Franck %A Beckmann, Jacques S %A Hasenauer, Jan %A Wolkenhauer, Olaf %A Wilighagen, Egon L %A Pico, Alexander R %A Evelo, Chris T %A Gillespie, Marc E %A Stein, Lincoln D %A Hermjakob, Henning %A D'Eustachio, Peter %A Saez-Rodriguez, Julio %A Dopazo, Joaquin %A Valencia, Alfonso %A Kitano, Hiroaki %A Barillot, Emmanuel %A Auffray, Charles %A Balling, Rudi %A Schneider, Reinhard %K Antiviral Agents %K Computational Biology %K Computer Graphics %K COVID-19 %K Cytokines %K Data Mining %K Databases, Factual %K Gene Expression Regulation %K Host Microbial Interactions %K Humans %K Immunity, Cellular %K Immunity, Humoral %K Immunity, Innate %K Lymphocytes %K Metabolic Networks and Pathways %K Myeloid Cells %K Protein Interaction Mapping %K SARS-CoV-2 %K Signal Transduction %K Software %K Transcription Factors %K Viral Proteins %X

We need to effectively combine the knowledge from surging literature with complex datasets to propose mechanistic models of SARS-CoV-2 infection, improving data interpretation and predicting key targets of intervention. Here, we describe a large-scale community effort to build an open access, interoperable and computable repository of COVID-19 molecular mechanisms. The COVID-19 Disease Map (C19DMap) is a graphical, interactive representation of disease-relevant molecular mechanisms linking many knowledge sources. Notably, it is a computational resource for graph-based analyses and disease modelling. To this end, we established a framework of tools, platforms and guidelines necessary for a multifaceted community of biocurators, domain experts, bioinformaticians and computational biologists. The diagrams of the C19DMap, curated from the literature, are integrated with relevant interaction and text mining databases. We demonstrate the application of network analysis and modelling approaches by concrete examples to highlight new testable hypotheses. This framework helps to find signatures of SARS-CoV-2 predisposition, treatment response or prioritisation of drug candidates. Such an approach may help deal with new waves of COVID-19 or similar pandemics in the long-term perspective.

%B Mol Syst Biol %V 17 %P e10387 %8 2021 10 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/34664389?dopt=Abstract %R 10.15252/msb.202110387 %0 Journal Article %J Am J Med Genet A %D 2021 %T De novo small deletion affecting transcription start site of short isoform of AUTS2 gene in a patient with syndromic neurodevelopmental defects. %A Martinez-Delgado, Beatriz %A Lopez-Martin, Estrella %A Lara-Herguedas, Julián %A Monzon, Sara %A Cuesta, Isabel %A Juliá, Miguel %A Aquino, Virginia %A Rodriguez-Martin, Carlos %A Damian, Alejandra %A Gonzalo, Irene %A Gomez-Mariano, Gema %A Baladron, Beatriz %A Cazorla, Rosario %A Iglesias, Gema %A Roman, Enriqueta %A Ros, Purificacion %A Tutor, Pablo %A Mellor, Susana %A Jimenez, Carlos %A Cabrejas, Maria Jose %A Gonzalez-Vioque, Emiliano %A Alonso, Javier %A Bermejo-Sánchez, Eva %A Posada, Manuel %K Child, Preschool %K Cytoskeletal Proteins %K Dwarfism %K Exons %K Gene Expression Regulation %K Genetic Association Studies %K Humans %K Male %K Neurodevelopmental Disorders %K Protein Isoforms %K RNA, Messenger %K Sequence Deletion %K Syndrome %K Transcription Factors %K Transcription Initiation Site %K Transcription, Genetic %X

Disruption of the autism susceptibility candidate 2 (AUTS2) gene through genomic rearrangements, copy number variations (CNVs), and intragenic deletions and mutations has been recurrently implicated in syndromic forms of developmental delay and intellectual disability, known as AUTS2 syndrome. The AUTS2 gene plays an important role in the regulation of neuronal migration and, when altered, is associated with a variable phenotype ranging from severely to mildly affected patients. The more severe phenotypes significantly correlate with the presence of defects affecting the C-terminal part of the gene. This article reports a new patient with a syndromic neurodevelopmental disorder who presents a deletion of 30 nucleotides in exon 9 of the AUTS2 gene. Importantly, this deletion includes the transcription start site for the AUTS2 short transcript isoform, which has an important role in brain development. Gene expression analysis of AUTS2 full-length and short isoforms revealed that the deletion found in this patient causes a remarkable reduction in the expression level, not only of the short isoform, but also of the full-length AUTS2 transcripts. This report adds more evidence for the role of mutated AUTS2 short transcripts in the development of a severe phenotype in AUTS2 syndrome.

%B Am J Med Genet A %V 185 %P 877-883 %8 2021 03 %G eng %N 3 %R 10.1002/ajmg.a.62017 %0 Journal Article %J Nat Methods %D 2021 %T DOME: recommendations for supervised machine learning validation in biology. %A Walsh, Ian %A Fishman, Dmytro %A Garcia-Gasulla, Dario %A Titma, Tiina %A Pollastri, Gianluca %A Harrow, Jennifer %A Psomopoulos, Fotis E %A Tosatto, Silvio C E %K Algorithms %K Computational Biology %K Guidelines as Topic %K Humans %K Models, Biological %K Research Design %K Supervised Machine Learning %B Nat Methods %V 18 %P 1122-1127 %8 2021 10 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/34316068?dopt=Abstract %R 10.1038/s41592-021-01205-4 %0 Journal Article %J Clinical Epigenetics %D 2021 %T Genome-wide analysis of DNA methylation in Hirschsprung enteric precursor cells: unraveling the epigenetic landscape of enteric nervous system development %A Villalba-Benito, Leticia %A López-López, Daniel %A Torroglosa, Ana %A Casimiro-Soriguer, Carlos S. %A Luzón-Toro, Berta %A Fernández, Raquel María %A Moya-Jiménez, María José %A Antiñolo, Guillermo %A Dopazo, Joaquin %A Borrego, Salud %B Clinical Epigenetics %V 13 %8 Jan-12-2021 %G eng %U http://link.springer.com/article/10.1186/s13148-021-01040-6/fulltext.html %N 1 %! Clin Epigenet %R 10.1186/s13148-021-01040-6 %0 Journal Article %J J Pers Med %D 2021 %T Implementing Personalized Medicine in COVID-19 in Andalusia: An Opportunity to Transform the Healthcare System. %A Dopazo, Joaquin %A Maya-Miles, Douglas %A García, Federico %A Lorusso, Nicola %A Calleja, Miguel Ángel %A Pareja, María Jesús %A López-Miranda, José %A Rodríguez-Baño, Jesús %A Padillo, Javier %A Túnez, Isaac %A Romero-Gómez, Manuel %X

The COVID-19 pandemic represents an unprecedented opportunity to exploit the advantages of personalized medicine for the prevention, diagnosis, treatment, surveillance and management of a new challenge in public health. COVID-19 infection is highly variable, ranging from asymptomatic infections to severe, life-threatening manifestations. Personalized medicine can play a key role in elucidating individual susceptibility to the infection as well as inter-individual variability in clinical course, prognosis and response to treatment. Integrating personalized medicine into clinical practice can also transform health care by enabling the design of preventive and therapeutic strategies tailored to individual profiles, improving the detection of outbreaks or defining transmission patterns at an increasingly local level. SARS-CoV-2 genome sequencing, together with the assessment of specific patient genetic variants, will support clinical decision-makers and ultimately provide better ways to fight this disease. Additionally, it would facilitate a better stratification and selection of patients for clinical trials, thus increasing the likelihood of obtaining positive results. Lastly, defining a national strategy to implement in clinical practice all available tools of personalized medicine in COVID-19 could be challenging but would be linked to a positive transformation of the health care system. In this review, we provide an update of the achievements, promises, and challenges of personalized medicine in the fight against COVID-19 from susceptibility to natural history and response to therapy, as well as from surveillance to control measures and vaccination. We also discuss strategies to facilitate the adoption of this new paradigm for medical and public health measures during and after the pandemic in health care systems.

%B J Pers Med %V 11 %8 2021 May 26 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/34073493?dopt=Abstract %R 10.3390/jpm11060475 %0 Journal Article %J Cancers (Basel) %D 2021 %T Mutational Characterization of Cutaneous Melanoma Supports Divergent Pathways Model for Melanoma Development. %A Millán-Esteban, David %A Peña-Chilet, Maria %A García-Casado, Zaida %A Manrique-Silva, Esperanza %A Requena, Celia %A Bañuls, José %A Lopez-Guerrero, Jose Antonio %A Rodríguez-Hernández, Aranzazu %A Traves, Víctor %A Dopazo, Joaquin %A Virós, Amaya %A Kumar, Rajiv %A Nagore, Eduardo %X

According to the divergent pathway model, cutaneous melanoma comprises a nevogenic group with a propensity to melanocyte proliferation and another one associated with cumulative solar damage (CSD). While characterized clinically and epidemiologically, the differences in the molecular profiles between the groups have remained primarily uninvestigated. This study has used a custom gene panel and bioinformatics tools to investigate the potential molecular differences in a thoroughly characterized cohort of 119 melanoma patients belonging to nevogenic and CSD groups. We found that the nevogenic melanomas had a restricted set of mutations, with the prominently mutated gene being . The CSD melanomas, in contrast, showed mutations in a diverse group of genes that included , , , and . We thus provide evidence that nevogenic and CSD melanomas constitute different biological entities and highlight the need to explore new targeted therapies.

%B Cancers (Basel) %V 13 %8 2021 Oct 18 %G eng %N 20 %R 10.3390/cancers13205219 %0 Journal Article %J Nature Genetics %D 2021 %T The NCI Genomic Data Commons %A Heath, Allison P. %A Ferretti, Vincent %A Agrawal, Stuti %A An, Maksim %A Angelakos, James C. %A Arya, Renuka %A Bajari, Rosita %A Baqar, Bilal %A Barnowski, Justin H. B. %A Burt, Jeffrey %A Catton, Ann %A Chan, Brandon F. %A Chu, Fay %A Cullion, Kim %A Davidsen, Tanja %A Do, Phuong-My %A Dompierre, Christian %A Ferguson, Martin L. %A Fitzsimons, Michael S. %A Ford, Michael %A Fukuma, Miyuki %A Gaheen, Sharon %A Ganji, Gajanan L. %A Garcia, Tzintzuni I. %A George, Sameera S. %A Gerhard, Daniela S. %A Gerthoffert, Francois %A Gomez, Fauzi %A Han, Kang %A Hernandez, Kyle M. %A Issac, Biju %A Jackson, Richard %A Jensen, Mark A. %A Joshi, Sid %A Kadam, Ajinkya %A Khurana, Aishmit %A Kim, Kyle M. J. %A Kraft, Victoria E. %A Li, Shenglai %A Lichtenberg, Tara M. %A Lodato, Janice %A Lolla, Laxmi %A Martinov, Plamen %A Mazzone, Jeffrey A. %A Miller, Daniel P. %A Miller, Ian %A Miller, Joshua S. %A Miyauchi, Koji %A Murphy, Mark W. %A Nullet, Thomas %A Ogwara, Rowland O. %A Ortuño, Francisco M. %A Pedrosa, Jesús %A Pham, Phuong L. %A Popov, Maxim Y. %A Porter, James J. %A Powell, Raymond %A Rademacher, Karl %A Reid, Colin P. %A Rich, Samantha %A Rogel, Bessie %A Sahni, Himanso %A Savage, Jeremiah H. %A Schmitt, Kyle A. %A Simmons, Trevar J. %A Sislow, Joseph %A Spring, Jonathan %A Stein, Lincoln %A Sullivan, Sean %A Tang, Yajing %A Thiagarajan, Mathangi %A Troyer, Heather D. %A Wang, Chang %A Wang, Zhining %A West, Bedford L. %A Wilmer, Alex %A Wilson, Shane %A Wu, Kaman %A Wysocki, William P. %A Xiang, Linda %A Yamada, Joseph T. %A Yang, Liming %A Yu, Christine %A Yung, Christina K. %A Zenklusen, Jean Claude %A Zhang, Junjun %A Zhang, Zhenyu %A Zhao, Yuanheng %A Zubair, Ariz %A Staudt, Louis M. %A Grossman, Robert L. %B Nature Genetics %8 Oct-02-2022 %G eng %U http://www.nature.com/articles/s41588-021-00791-5 %! Nat Genet %R 10.1038/s41588-021-00791-5 %0 Journal Article %J Sci Rep %D 2021 %T Real world evidence of calcifediol or vitamin D prescription and mortality rate of COVID-19 in a retrospective cohort of hospitalized Andalusian patients. %A Loucera, Carlos %A Peña-Chilet, Maria %A Esteban-Medina, Marina %A Muñoyerro-Muñiz, Dolores %A Villegas, Román %A López-Miranda, José %A Rodríguez-Baño, Jesús %A Túnez, Isaac %A Bouillon, Roger %A Dopazo, Joaquin %A Quesada Gomez, Jose Manuel %K Calcifediol %K COVID-19 %K Female %K Humans %K Kaplan-Meier Estimate %K Male %K Retrospective Studies %K Spain %K Survival Analysis %K Vitamin D %X

COVID-19 is a major worldwide health problem because of acute respiratory distress syndrome and mortality. Several lines of evidence have suggested a relationship between the vitamin D endocrine system and severity of COVID-19. We present a survival study on a retrospective cohort of 15,968 patients, comprising all COVID-19 patients hospitalized in Andalusia between January and November 2020. Based on a central registry of electronic health records (the Andalusian Population Health Database, BPS), prescription of vitamin D or its metabolites within 15-30 days before hospitalization was recorded. The effect of prescription of vitamin D (metabolites) for other indications prior to hospitalization was studied with respect to patient survival. Kaplan-Meier survival curves and hazard ratios support an association between prescription of these metabolites and patient survival. This association was stronger for calcifediol (Hazard Ratio, HR = 0.67, with 95% confidence interval, CI, of [0.50-0.91]) than for cholecalciferol (HR = 0.75, with 95% CI of [0.61-0.91]), when prescribed 15 days prior to hospitalization. Although the relation is maintained, there is a general decrease of this effect when a longer period of 30 days prior to hospitalization is considered (calcifediol HR = 0.73, with 95% CI [0.57-0.95] and cholecalciferol HR = 0.88, with 95% CI [0.75-1.03]), suggesting that the association was stronger when the prescription was closer to hospitalization.

%B Sci Rep %V 11 %P 23380 %8 2021 12 03 %G eng %N 1 %R 10.1038/s41598-021-02701-5 %0 Journal Article %J Nat Med %D 2021 %T Reporting guidelines for human microbiome research: the STORMS checklist. %A Mirzayi, Chloe %A Renson, Audrey %A Zohra, Fatima %A Elsafoury, Shaimaa %A Geistlinger, Ludwig %A Kasselman, Lora J %A Eckenrode, Kelly %A van de Wijgert, Janneke %A Loughman, Amy %A Marques, Francine Z %A MacIntyre, David A %A Arumugam, Manimozhiyan %A Azhar, Rimsha %A Beghini, Francesco %A Bergstrom, Kirk %A Bhatt, Ami %A Bisanz, Jordan E %A Braun, Jonathan %A Bravo, Hector Corrada %A Buck, Gregory A %A Bushman, Frederic %A Casero, David %A Clarke, Gerard %A Collado, Maria Carmen %A Cotter, Paul D %A Cryan, John F %A Demmer, Ryan T %A Devkota, Suzanne %A Elinav, Eran %A Escobar, Juan S %A Fettweis, Jennifer %A Finn, Robert D %A Fodor, Anthony A %A Forslund, Sofia %A Franke, Andre %A Furlanello, Cesare %A Gilbert, Jack %A Grice, Elizabeth %A Haibe-Kains, Benjamin %A Handley, Scott %A Herd, Pamela %A Holmes, Susan %A Jacobs, Jonathan P %A Karstens, Lisa %A Knight, Rob %A Knights, Dan %A Koren, Omry %A Kwon, Douglas S %A Langille, Morgan %A Lindsay, Brianna %A McGovern, Dermot %A McHardy, Alice C %A McWeeney, Shannon %A Mueller, Noel T %A Nezi, Luigi %A Olm, Matthew %A Palm, Noah %A Pasolli, Edoardo %A Raes, Jeroen %A Redinbo, Matthew R %A Rühlemann, Malte %A Balfour Sartor, R %A Schloss, Patrick D %A Schriml, Lynn %A Segal, Eran %A Shardell, Michelle %A Sharpton, Thomas %A Smirnova, Ekaterina %A Sokol, Harry %A Sonnenburg, Justin L %A Srinivasan, Sujatha %A Thingholm, Louise B %A Turnbaugh, Peter J %A Upadhyay, Vaibhav %A Walls, Ramona L %A Wilmes, Paul %A Yamada, Takuji %A Zeller, Georg %A Zhang, Mingyu %A Zhao, Ni %A Zhao, Liping %A Bao, Wenjun %A Culhane, Aedin %A Devanarayan, Viswanath %A Dopazo, Joaquin %A Fan, Xiaohui %A Fischer, Matthias %A Jones, Wendell %A Kusko, Rebecca %A Mason, Christopher E %A Mercer, Tim R %A Sansone, Susanna-Assunta %A Scherer, Andreas 
%A Shi, Leming %A Thakkar, Shraddha %A Tong, Weida %A Wolfinger, Russ %A Hunter, Christopher %A Segata, Nicola %A Huttenhower, Curtis %A Dowd, Jennifer B %A Jones, Heidi E %A Waldron, Levi %K Computational Biology %K Dysbiosis %K Humans %K Microbiota %K Observational Studies as Topic %K Research Design %K Translational Science, Biomedical %X

The particularly interdisciplinary nature of human microbiome research makes the organization and reporting of results spanning epidemiology, biology, bioinformatics, translational medicine and statistics a challenge. Commonly used reporting guidelines for observational or genetic epidemiology studies lack key features specific to microbiome studies. Therefore, a multidisciplinary group of microbiome epidemiology researchers adapted guidelines for observational and genetic studies to culture-independent human microbiome studies, and also developed new reporting elements for laboratory, bioinformatics and statistical analyses tailored to microbiome studies. The resulting tool, called 'Strengthening The Organization and Reporting of Microbiome Studies' (STORMS), is composed of a 17-item checklist organized into six sections that correspond to the typical sections of a scientific publication, presented as an editable table for inclusion in supplementary materials. The STORMS checklist provides guidance for concise and complete reporting of microbiome studies that will facilitate manuscript preparation, peer review, and reader comprehension of publications and comparative analysis of published results.

%B Nat Med %V 27 %P 1885-1892 %8 2021 11 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/34789871?dopt=Abstract %R 10.1038/s41591-021-01552-x %0 Journal Article %J Genes %D 2021 %T Schuurs–Hoeijmakers Syndrome (PACS1 Neurodevelopmental Disorder): Seven Novel Patients and a Review %A Tenorio-Castaño, Jair %A Morte, Beatriz %A Nevado, Julián %A Martínez-Glez, Víctor %A Santos-Simarro, Fernando %A García-Miñaur, Sixto %A Palomares-Bralo, María %A Pacio-Míguez, Marta %A Gómez, Beatriz %A Arias, Pedro %A Alcochea, Alba %A Carrión, Juan %A Arias, Patricia %A Almoguera, Berta %A López-Grondona, Fermina %A Lorda-Sanchez, Isabel %A Galán-Gómez, Enrique %A Valenzuela, Irene %A Méndez Perez, María %A Cuscó, Ivón %A Barros, Francisco %A Pié, Juan %A Ramos, Sergio %A Ramos, Feliciano %A Kuechler, Alma %A Tizzano, Eduardo %A Ayuso, Carmen %A Kaiser, Frank %A Pérez-Jurado, Luis %A Carracedo, Ángel %A Lapunzina, Pablo %B Genes %V 12 %P 738 %8 Jan-05-2021 %G eng %U https://www.mdpi.com/2073-4425/12/5/738 https://www.mdpi.com/2073-4425/12/5/738/pdf %N 5 %! Genes %R 10.3390/genes12050738 %0 Journal Article %J EPMA J %D 2020 %T 10th Anniversary of the European Association for Predictive, Preventive and Personalised (3P) Medicine - EPMA World Congress Supplement 2020. %A Golubnitschaja, Olga %A Topolcan, Ondrej %A Kucera, Radek %A Costigliola, Vincenzo %X

In 2019, the EPMA celebrated its 10th anniversary at the 5th World Congress in Pilsen, Czech Republic. The history of the International Professional Network dedicated to Predictive, Preventive and Personalised Medicine (PPPM / 3PM) is rich in achievements. In the face of the COVID-19 pandemic, it is becoming evident globally that the predictive approach, targeted prevention and personalisation of medical services constitute the optimal paradigm in healthcare, with high potential to save lives and to benefit society as a whole. The EPMA World Congress Supplement 2020 highlights advances in 3P medicine.

%B EPMA J %P 1-133 %8 2020 Aug 19 %G eng %R 10.1007/s13167-020-00206-1 %0 Journal Article %J F1000Res %D 2020 %T The ELIXIR Human Copy Number Variations Community: building bioinformatics infrastructure for research. %A Salgado, David %A Armean, Irina M %A Baudis, Michael %A Beltran, Sergi %A Capella-Gutíerrez, Salvador %A Carvalho-Silva, Denise %A Dominguez Del Angel, Victoria %A Dopazo, Joaquin %A Furlong, Laura I %A Gao, Bo %A Garcia, Leyla %A Gerloff, Dietlind %A Gut, Ivo %A Gyenesei, Attila %A Habermann, Nina %A Hancock, John M %A Hanauer, Marc %A Hovig, Eivind %A Johansson, Lennart F %A Keane, Thomas %A Korbel, Jan %A Lauer, Katharina B %A Laurie, Steve %A Leskošek, Brane %A Lloyd, David %A Marqués-Bonet, Tomás %A Mei, Hailiang %A Monostory, Katalin %A Piñero, Janet %A Poterlowicz, Krzysztof %A Rath, Ana %A Samarakoon, Pubudu %A Sanz, Ferran %A Saunders, Gary %A Sie, Daoud %A Swertz, Morris A %A Tsukanov, Kirill %A Valencia, Alfonso %A Vidak, Marko %A Yenyxe González, Cristina %A Ylstra, Bauke %A Béroud, Christophe %K Computational Biology %K DNA Copy Number Variations %K High-Throughput Nucleotide Sequencing %K Humans %X

Copy number variations (CNVs) are major causative contributors in the genesis of both genetic diseases and human neoplasias. While "High-Throughput" sequencing technologies are increasingly becoming the primary choice for genomic screening analysis, their ability to efficiently detect CNVs is still heterogeneous and remains to be developed. The aim of this white paper is to provide a guiding framework for the future contributions of ELIXIR's recently established human CNV Community, with implications beyond human disease diagnostics and population genomics. This white paper is the direct result of a strategy meeting that took place in September 2018 in Hinxton (UK) and involved representatives of 11 ELIXIR Nodes. The meeting led to the definition of priority objectives and tasks to address a wide range of CNV-related challenges, ranging from detection and interpretation to sharing and training. Here, we provide suggestions on how to align these tasks within the ELIXIR Platforms strategy, and on how to frame the activities of this new ELIXIR Community in the international context.

%B F1000Res %V 9 %8 2020 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/34367618?dopt=Abstract %& 1229 %R 10.12688/f1000research.24887.1 %0 Journal Article %J J Med Genet %D 2020 %T Optimised molecular genetic diagnostics of Fanconi anaemia by whole exome sequencing and functional studies. %A Bogliolo, Massimo %A Pujol, Roser %A Aza-Carmona, Miriam %A Muñoz-Subirana, Núria %A Rodriguez-Santiago, Benjamin %A Casado, José Antonio %A Rio, Paula %A Bauser, Christopher %A Reina-Castillón, Judith %A Lopez-Sanchez, Marcos %A Gonzalez-Quereda, Lidia %A Gallano, Pia %A Catalá, Albert %A Ruiz-Llobet, Ana %A Badell, Isabel %A Diaz-Heredia, Cristina %A Hladun, Raquel %A Senent, Leonort %A Argiles, Bienvenida %A Bergua Burgues, Juan Miguel %A Bañez, Fatima %A Arrizabalaga, Beatriz %A López Almaraz, Ricardo %A Lopez, Monica %A Figuera, Ángela %A Molinés, Antonio %A Pérez de Soto, Inmaculada %A Hernando, Inés %A Muñoz, Juan Antonio %A Del Rosario Marin, Maria %A Balmaña, Judith %A Stjepanovic, Neda %A Carrasco, Estela %A Cuesta, Isabel %A Cosuelo, José Miguel %A Regueiro, Alexandra %A Moraleda Jimenez, José %A Galera-Miñarro, Ana Maria %A Rosiñol, Laura %A Carrió, Anna %A Beléndez-Bieler, Cristina %A Escudero Soto, Antonio %A Cela, Elena %A de la Mata, Gregorio %A Fernández-Delgado, Rafael %A Garcia-Pardos, Maria Carmen %A Sáez-Villaverde, Raquel %A Barragaño, Marta %A Portugal, Raquel %A Lendinez, Francisco %A Hernadez, Ines %A Vagace, José Manue %A Tapia, Maria %A Nieto, José %A Garcia, Marta %A Gonzalez, Macarena %A Vicho, Cristina %A Galvez, Eva %A Valiente, Alberto %A Antelo, Maria Luisa %A Ancliff, Phil %A García, Francisco %A Dopazo, Joaquin %A Sevilla, Julian %A Paprotka, Tobias %A Pérez-Jurado, Luis Alberto %A Bueren, Juan %A Surralles, Jordi %K Cell Line %K DNA Copy Number Variations %K DNA Repair %K DNA-Binding Proteins %K Fanconi Anemia %K Fanconi Anemia Complementation Group A Protein %K Female %K Gene Knockout Techniques %K Genetic Predisposition to Disease %K Humans 
%K Male %K Mutation, Missense %K Polymorphism, Single Nucleotide %K whole exome sequencing %X

PURPOSE: Patients with Fanconi anaemia (FA), a rare DNA repair genetic disease, exhibit chromosome fragility, bone marrow failure, malformations and cancer susceptibility. FA molecular diagnosis is challenging since FA is caused by point mutations and large deletions in 22 genes following three heritability patterns. To optimise FA patients' characterisation, we developed a simplified but effective methodology based on whole exome sequencing (WES) and functional studies.

METHODS: 68 patients with FA were analysed by commercial WES services. Copy number variations were evaluated by sequencing data analysis with RStudio. To test missense variants, wt FANCA cDNA was cloned and variants were introduced by site-directed mutagenesis. Vectors were then tested for their ability to complement DNA repair defects of a FANCA-KO human cell line generated by TALEN technologies.

RESULTS: We identified 93.3% of mutated alleles including large deletions. We determined the pathogenicity of three FANCA missense variants and demonstrated that two variants reported in mutation databases as 'affecting functions' are SNPs. Deep analysis of sequencing data revealed patients' true mutations, highlighting the importance of functional analysis. In one patient, no pathogenic variant could be identified in any of the 22 known FA genes, and in seven patients, only one deleterious variant could be identified (three patients each with FANCA and FANCD2 and one patient with FANCE mutations).

CONCLUSION: WES and proper bioinformatics analysis are sufficient to effectively characterise patients with FA regardless of the rarity of their complementation group, type of mutations, mosaic condition and DNA source.

%B J Med Genet %V 57 %P 258-268 %8 2020 04 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/31586946?dopt=Abstract %R 10.1136/jmedgenet-2019-106249 %0 Journal Article %J Nature %D 2020 %T Transparency and reproducibility in artificial intelligence. %A Haibe-Kains, Benjamin %A Adam, George Alexandru %A Hosny, Ahmed %A Khodakarami, Farnoosh %A Waldron, Levi %A Wang, Bo %A McIntosh, Chris %A Goldenberg, Anna %A Kundaje, Anshul %A Greene, Casey S %A Broderick, Tamara %A Hoffman, Michael M %A Leek, Jeffrey T %A Korthauer, Keegan %A Huber, Wolfgang %A Brazma, Alvis %A Pineau, Joelle %A Tibshirani, Robert %A Hastie, Trevor %A Ioannidis, John P A %A Quackenbush, John %A Aerts, Hugo J W L %K Algorithms %K Artificial Intelligence %K Reproducibility of Results %B Nature %V 586 %P E14-E16 %8 2020 10 %G eng %N 7829 %1 https://www.ncbi.nlm.nih.gov/pubmed/33057217?dopt=Abstract %R 10.1038/s41586-020-2766-y %0 Journal Article %J Lancet Oncol %D 2019 %T Pazopanib for treatment of advanced malignant and dedifferentiated solitary fibrous tumour: a multicentre, single-arm, phase 2 trial. %A Martin-Broto, Javier %A Stacchiotti, Silvia %A Lopez-Pousa, Antonio %A Redondo, Andres %A Bernabeu, Daniel %A de Alava, Enrique %A Casali, Paolo G %A Italiano, Antoine %A Gutierrez, Antonio %A Moura, David S %A Peña-Chilet, Maria %A Diaz-Martin, Juan %A Biscuola, Michele %A Taron, Miguel %A Collini, Paola %A Ranchere-Vince, Dominique %A Garcia Del Muro, Xavier %A Grignani, Giovanni %A Dumont, Sarah %A Martinez-Trufero, Javier %A Palmerini, Emanuela %A Hindi, Nadia %A Sebio, Ana %A Dopazo, Joaquin %A Dei Tos, Angelo Paolo %A LeCesne, Axel %A Blay, Jean-Yves %A Cruz, Josefina %K Adult %K Aged %K Angiogenesis Inhibitors %K Antineoplastic Agents %K Female %K Humans %K Indazoles %K Male %K Middle Aged %K Multivariate Analysis %K Pyrimidines %K Response Evaluation Criteria in Solid Tumors %K Soft Tissue Neoplasms %K Solitary Fibrous Tumors %K Sulfonamides %K Survival Analysis %X

BACKGROUND: A solitary fibrous tumour is a rare soft-tissue tumour with three clinicopathological variants: typical, malignant, and dedifferentiated. Preclinical experiments and retrospective studies have shown different sensitivities of solitary fibrous tumour to chemotherapy and antiangiogenics. We therefore designed a trial to assess the activity of pazopanib in a cohort of patients with malignant or dedifferentiated solitary fibrous tumour. The clinical and translational results are presented here.

METHODS: In this single-arm, phase 2 trial, adult patients (aged ≥ 18 years) with histologically confirmed metastatic or unresectable malignant or dedifferentiated solitary fibrous tumour at any location, who had progressed (by RECIST and Choi criteria) in the previous 6 months and had an ECOG performance status of 0-2, were enrolled at 16 third-level hospitals with expertise in sarcoma care in Spain, Italy, and France. Patients received pazopanib 800 mg once daily, taken orally without food, at least 1 h before or 2 h after a meal, until progression or intolerance. The primary endpoint of the study was overall response measured by Choi criteria in the subset of the intention-to-treat population (patients who received at least 1 month of treatment with at least one radiological assessment). All patients who received at least one dose of the study drug were included in the safety analyses. This study is registered with ClinicalTrials.gov, number NCT02066285, and with the European Clinical Trials Database, EudraCT number 2013-005456-15.

FINDINGS: From June 26, 2014, to Nov 24, 2016, of 40 patients assessed, 36 were enrolled (34 with malignant solitary fibrous tumour and two with dedifferentiated solitary fibrous tumour). Median follow-up was 27 months (IQR 16-31). Based on central radiology review, 18 (51%) of 35 evaluable patients had partial responses, nine (26%) had stable disease, and eight (23%) had progressive disease according to Choi criteria. Further enrolment of patients with dedifferentiated solitary fibrous tumour was stopped after detection of early and fast progressions in a planned interim analysis. 51% (95% CI 34-69) of 35 patients achieved an overall response according to Choi criteria. Ten (29%) of 35 patients died. There were no deaths related to adverse events and the most frequent grade 3 or higher adverse events were hypertension (11 [31%] of 36 patients), neutropenia (four [11%]), increased concentrations of alanine aminotransferase (four [11%]), and increased concentrations of bilirubin (three [8%]).

INTERPRETATION: To our knowledge, this is the first trial of pazopanib for treatment of malignant solitary fibrous tumour showing activity in this patient group. The manageable toxicity profile and the activity shown by pazopanib suggest that this drug could be an option for systemic treatment of advanced malignant solitary fibrous tumour, and provide a benchmark for future trials.

FUNDING: Spanish Group for Research on Sarcomas (GEIS), Italian Sarcoma Group (ISG), French Sarcoma Group (FSG), GlaxoSmithKline, and Novartis.

%B Lancet Oncol %V 20 %P 134-144 %8 2019 01 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/30578023?dopt=Abstract %R 10.1016/S1470-2045(18)30676-4 %0 Journal Article %J Nature Communications %D 2018 %T A crowdsourced analysis to identify ab initio molecular signatures predictive of susceptibility to viral infection %A Fourati, Slim %A Talla, Aarthi %A Mahmoudian, Mehrad %A Burkhart, Joshua G. %A Klén, Riku %A Henao, Ricardo %A Yu, Thomas %A Aydın, Zafer %A Yeung, Ka Yee %A Ahsen, Mehmet Eren %A Almugbel, Reem %A Jahandideh, Samad %A Liang, Xiao %A Nordling, Torbjörn E. M. %A Shiga, Motoki %A Stanescu, Ana %A Vogel, Robert %A Pandey, Gaurav %A Chiu, Christopher %A McClain, Micah T. %A Woods, Christopher W. %A Ginsburg, Geoffrey S. %A Elo, Laura L. %A Tsalik, Ephraim L. %A Mangravite, Lara M. %A Sieberts, Solveig K. %B Nature Communications %V 9 %8 Jan-12-2018 %G eng %U http://www.nature.com/articles/s41467-018-06735-8 http://www.nature.com/articles/s41467-018-06735-8.pdf %N 1 %! Nat Commun %R 10.1038/s41467-018-06735-8 %0 Journal Article %J Sci Rep %D 2018 %T Evolution of the Quorum network and the mobilome (plasmids and bacteriophages) in clinical strains of Acinetobacter baumannii during a decade. %A López, M %A Rueda, A %A Florido, J P %A Blasco, L %A Fernández-García, L %A Trastoy, R %A Fernández-Cuenca, F %A Martínez-Martínez, L %A Vila, J %A Pascual, A %A Bou, G %A Tomas, M %K Acinetobacter baumannii %K Acinetobacter Infections %K Bacteriophages %K Cross Infection %K Humans %K Plasmids %K Quorum Sensing %K Retrospective Studies %X

In this study, we compared eighteen clinical strains of A. baumannii belonging to the ST-2 clone and isolated from patients in the same intensive care unit (ICU) in 2000 (9 strains referred to collectively as Ab_GEIH-2000) and 2010 (9 strains referred to collectively as Ab_GEIH-2010), during the GEIH-REIPI project (Umbrella BioProject PRJNA422585). We observed two main molecular differences between the Ab_GEIH-2010 and the Ab_GEIH-2000 collections, acquired over the course of the decade-long sampling interval and involving the mobilome: i) a plasmid harbouring genes for bla β-lactamase and abKA/abkB proteins of a toxin-antitoxin system; and ii) two temperate bacteriophages, Ab105-1ϕ (63 proteins) and Ab105-2ϕ (93 proteins), containing important viral defence proteins. Moreover, all Ab_GEIH-2010 strains contained a Quorum functional network of Quorum Sensing (QS) and Quorum Quenching (QQ) mechanisms, including a new QQ enzyme, AidA, which acts as a bacterial defence mechanism against the exogenous 3-oxo-C12-HSL. Interestingly, the infective capacity of the bacteriophages isolated in this study (Ab105-1ϕ and Ab105-2ϕ) was higher in the Ab_GEIH-2010 strains (carrying a functional Quorum network) than in the Ab_GEIH-2000 strains (carrying a deficient Quorum network), in which the bacteriophages showed little or no infectivity. This is the first study of the evolution of the Quorum network and the mobilome in clinical strains of Acinetobacter baumannii over a decade.

%B Sci Rep %V 8 %P 2523 %8 2018 02 06 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29410443?dopt=Abstract %R 10.1038/s41598-018-20847-7 %0 Journal Article %J Nature %D 2018 %T Genomics of the origin and evolution of Citrus. %A Wu, Guohong Albert %A Terol, Javier %A Ibañez, Victoria %A López-García, Antonio %A Pérez-Román, Estela %A Borredá, Carles %A Domingo, Concha %A Tadeo, Francisco R %A Carbonell-Caballero, José %A Alonso, Roberto %A Curk, Franck %A Du, Dongliang %A Ollitrault, Patrick %A Roose, Mikeal L %A Dopazo, Joaquin %A Gmitter, Frederick G %A Rokhsar, Daniel S %A Talon, Manuel %K Asia, Southeastern %K Biodiversity %K citrus %K Crop Production %K Evolution, Molecular %K Genetic Speciation %K Genome, Plant %K Genomics %K Haplotypes %K Heterozygote %K History, Ancient %K Human Migration %K Hybridization, Genetic %K Phylogeny %X

The genus Citrus, comprising some of the most widely cultivated fruit crops worldwide, includes an uncertain number of species. Here we describe ten natural citrus species, using genomic, phylogenetic and biogeographic analyses of 60 accessions representing diverse citrus germ plasms, and propose that citrus diversified during the late Miocene epoch through a rapid southeast Asian radiation that correlates with a marked weakening of the monsoons. A second radiation enabled by migration across the Wallace line gave rise to the Australian limes in the early Pliocene epoch. Further identification and analyses of hybrids and admixed genomes provide insights into the genealogy of major commercial cultivars of citrus. Among mandarins and sweet orange, we find an extensive network of relatedness that illuminates the domestication of these groups. Widespread pummelo admixture among these mandarins and its correlation with fruit size and acidity suggest a plausible role of pummelo introgression in the selection of palatable mandarins. This work provides a new evolutionary framework for the genus Citrus.

%B Nature %V 554 %P 311-316 %8 2018 02 15 %G eng %N 7692 %1 https://www.ncbi.nlm.nih.gov/pubmed/29414943?dopt=Abstract %R 10.1038/nature25447 %0 Journal Article %J Nat Commun %D 2018 %T LRH-1 agonism favours an immune-islet dialogue which protects against diabetes mellitus. %A Cobo-Vuilleumier, Nadia %A Lorenzo, Petra I %A Rodríguez, Noelia García %A Herrera Gómez, Irene de Gracia %A Fuente-Martin, Esther %A López-Noriega, Livia %A Mellado-Gil, José Manuel %A Romero-Zerbo, Silvana-Yanina %A Baquié, Mathurin %A Lachaud, Christian Claude %A Stifter, Katja %A Perdomo, German %A Bugliani, Marco %A De Tata, Vincenzo %A Bosco, Domenico %A Parnaud, Geraldine %A Pozo, David %A Hmadcha, Abdelkrim %A Florido, Javier P %A Toscano, Miguel G %A de Haan, Peter %A Schoonjans, Kristina %A Sánchez Palazón, Luis %A Marchetti, Piero %A Schirmbeck, Reinhold %A Martín-Montalvo, Alejandro %A Meda, Paolo %A Soria, Bernat %A Bermúdez-Silva, Francisco-Javier %A St-Onge, Luc %A Gauthier, Benoit R %K Animals %K Apoptosis %K Cell Communication %K Cell Survival %K Diabetes Mellitus, Experimental %K Diabetes Mellitus, Type 2 %K Female %K Gene Expression Regulation %K Humans %K Hypoglycemic Agents %K Immunity, Innate %K insulin %K Insulin-Secreting Cells %K Islets of Langerhans %K Islets of Langerhans Transplantation %K Macrophages %K Male %K Mice %K Mice, Inbred C57BL %K Phenalenes %K Receptors, Cytoplasmic and Nuclear %K Streptozocin %K T-Lymphocytes, Regulatory %K Transplantation, Heterologous %X

Type 1 diabetes mellitus (T1DM) is due to the selective destruction of islet beta cells by immune cells. Current therapies focused on repressing the immune attack or stimulating beta cell regeneration still have limited clinical efficacy. Therefore, it is timely to identify innovative targets to dampen the immune process, while promoting beta cell survival and function. Liver receptor homologue-1 (LRH-1) is a nuclear receptor that represses inflammation in digestive organs, and protects pancreatic islets against apoptosis. Here, we show that BL001, a small LRH-1 agonist, impedes hyperglycemia progression and the immune-dependent inflammation of pancreas in murine models of T1DM, and beta cell apoptosis in islets of type 2 diabetic patients, while increasing beta cell mass and insulin secretion. Thus, we suggest that LRH-1 agonism favors a dialogue between immune and islet cells, which could be druggable to protect against diabetes mellitus.

%B Nat Commun %V 9 %P 1488 %8 2018 04 16 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29662071?dopt=Abstract %R 10.1038/s41467-018-03943-0 %0 Journal Article %J Oncotarget %D 2017 %T Genomic expression differences between cutaneous cells from red hair color individuals and black hair color individuals based on bioinformatic analysis. %A Puig-Butille, Joan Anton %A Gimenez-Xavier, Pol %A Visconti, Alessia %A Nsengimana, Jérémie %A Garcia-Garcia, Francisco %A Tell-Marti, Gemma %A Escamez, Maria José %A Newton-Bishop, Julia %A Bataille, Veronique %A Del Rio, Marcela %A Dopazo, Joaquin %A Falchi, Mario %A Puig, Susana %K Adult %K Coculture Techniques %K Computational Biology %K gene expression %K Genetic Predisposition to Disease %K Genomics %K Hair Color %K Humans %K Keratinocytes %K Melanocytes %K Middle Aged %K Phenotype %K Receptor, Melanocortin, Type 1 %X

The MC1R gene plays a crucial role in pigmentation synthesis. Loss-of-function MC1R variants, which impair protein function, are associated with red hair color (RHC) phenotype and increased skin cancer risk. Cultured cutaneous cells bearing loss-of-function MC1R variants show a distinct gene expression profile compared to wild-type MC1R cultured cutaneous cells. We analysed the gene signature associated with RHC co-cultured melanocytes and keratinocytes by Protein-Protein interaction (PPI) network analysis to identify genes related to non-functional MC1R variants. From two detected networks, we selected 23 nodes as hub genes based on topological parameters. Differential expression of hub genes was then evaluated in healthy skin biopsies from RHC and black hair color (BHC) individuals. We also compared gene expression in melanoma tumors from individuals with RHC versus BHC. Gene expression in normal skin from RHC cutaneous cells showed dysregulation in 8 out of 23 hub genes (CLN3, ATG10, WIPI2, SNX2, GABARAPL2, YWHA, PCNA and GBAS). Hub genes did not differ between melanoma tumors in RHC versus BHC individuals. The study suggests that healthy skin cells from RHC individuals present a constitutive genomic deregulation associated with the red hair phenotype and identifies novel genes involved in melanocyte biology.

%B Oncotarget %V 8 %P 11589-11599 %8 2017 Feb 14 %G eng %U http://www.impactjournals.com/oncotarget/index.php?journal=oncotarget&page=article&op=view&path%5B%5D=14140&path%5B%5D=45094 %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/28030792?dopt=Abstract %R 10.18632/oncotarget.14140 %0 Journal Article %J Nucleic Acids Res %D 2017 %T HGVA: the Human Genome Variation Archive. %A Lopez, Javier %A Coll, Jacobo %A Haimel, Matthias %A Kandasamy, Swaathi %A Tárraga, Joaquín %A Furio-Tari, Pedro %A Bari, Wasim %A Bleda, Marta %A Rueda, Antonio %A Gräf, Stefan %A Rendon, Augusto %A Dopazo, Joaquin %A Medina, Ignacio %K Genetic Variation %K Genome, Human %K Humans %K Internet %K Software %K User-Computer Interface %X

High-profile genomic variation projects like the 1000 Genomes project or the Exome Aggregation Consortium are generating a wealth of human genomic variation knowledge which can be used as an essential reference for identifying disease-causing genotypes. However, accessing these data, contrasting the various studies and integrating those data in downstream analyses remain cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic data for key reference projects in a clean, fast and integrated fashion. HGVA provides an efficient and intuitive web-interface for easy data mining, a comprehensive RESTful API and client libraries in Python, Java and JavaScript for fast programmatic access to its knowledge base. HGVA calculates population frequencies for these projects and enriches their data with variant annotation provided by CellBase, a rich and fast annotation solution. HGVA serves as a proof-of-concept of the genome analysis developments being carried out by the University of Cambridge together with UK's 100 000 genomes project and the National Institute for Health Research BioResource Rare-Diseases, in particular, deploying the open-source for Computational Biology (OpenCB) software platform for storing and analyzing massive genomic datasets.

%B Nucleic Acids Res %V 45 %P W189-W194 %8 2017 07 03 %G eng %U https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkx445 %N W1 %1 https://www.ncbi.nlm.nih.gov/pubmed/28535294?dopt=Abstract %R 10.1093/nar/gkx445 %0 Journal Article %J Plant Mol Biol %D 2017 %T Integration of transcriptomic and metabolic data reveals hub transcription factors involved in drought stress response in sunflower (Helianthus annuus L.). %A Moschen, Sebastián %A Di Rienzo, Julio A %A Higgins, Janet %A Tohge, Takayuki %A Watanabe, Mutsumi %A Gonzalez, Sergio %A Rivarola, Máximo %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Hopp, H Esteban %A Hoefgen, Rainer %A Fernie, Alisdair R %A Paniego, Norma %A Fernandez, Paula %A Heinz, Ruth A %K Chlorophyll %K Gene Expression Regulation, Plant %K Helianthus %K Plant Leaves %K Plant Proteins %K Protein Array Analysis %K RNA, Plant %K Stress, Physiological %K Transcription Factors %K Water %X

By integration of transcriptional and metabolic profiles we identified pathways and hub transcription factors regulated during drought conditions in sunflower, useful for applications in molecular and/or biotechnological breeding. Drought is one of the most important environmental stresses that affect crop productivity in many agricultural regions. Sunflower is tolerant to drought conditions but the mechanisms involved in this tolerance remain unclear at the molecular level. The aim of this study was to characterize and integrate transcriptional and metabolic pathways related to drought stress in sunflower plants, by using a systems biology approach. Our results showed a delay in plant senescence with an increase in the expression level of photosynthesis related genes as well as higher levels of sugars, osmoprotectant amino acids and ionic nutrients under drought conditions. In addition, we identified transcription factors that were upregulated during drought conditions and that may act as hubs in the transcriptional network. Many of these transcription factors belong to families implicated in the drought response in model species. The integration of transcriptomic and metabolomic data in this study, together with physiological measurements, has improved our understanding of the biological responses during droughts and contributes to elucidating the molecular mechanisms involved under this environmental condition. These findings will provide useful biotechnological tools to improve stress tolerance while maintaining crop yield under restricted water availability.

%B Plant Mol Biol %V 94 %P 549-564 %8 2017 Jul %G eng %N 4-5 %1 https://www.ncbi.nlm.nih.gov/pubmed/28639116?dopt=Abstract %R 10.1007/s11103-017-0625-5 %0 Journal Article %J Hum Mutat %D 2017 %T Mutations in TRAPPC11 are associated with a congenital disorder of glycosylation. %A Matalonga, Leslie %A Bravo, Miren %A Serra-Peinado, Carla %A García-Pelegrí, Elisabeth %A Ugarteburu, Olatz %A Vidal, Silvia %A Llambrich, Maria %A Quintana, Ester %A Fuster-Jorge, Pedro %A Gonzalez-Bravo, Maria Nieves %A Beltran, Sergi %A Dopazo, Joaquin %A Garcia-Garcia, Francisco %A Foulquier, François %A Matthijs, Gert %A Mills, Philippa %A Ribes, Antonia %A Egea, Gustavo %A Briones, Paz %A Tort, Frederic %A Girós, Marisa %K Abnormalities, Multiple %K Alleles %K Amino Acid Substitution %K Brain %K Congenital Disorders of Glycosylation %K Genotype %K Humans %K Magnetic Resonance Imaging %K Male %K mutation %K Phenotype %K Vesicular Transport Proteins %K Whole Genome Sequencing %X

Congenital disorders of glycosylation (CDG) are a heterogeneous and rapidly growing group of diseases caused by abnormal glycosylation of proteins and/or lipids. Mutations in genes involved in the homeostasis of the endoplasmic reticulum (ER), the Golgi apparatus (GA), and the vesicular trafficking from the ER to the ER-Golgi intermediate compartment (ERGIC) have been found to be associated with CDG. Here, we report a patient with defects in both N- and O-glycosylation combined with a delayed vesicular transport in the GA due to mutations in TRAPPC11, a subunit of the TRAPPIII complex. TRAPPIII is implicated in the anterograde transport from the ER to the ERGIC as well as in the vesicle export from the GA. This report expands the spectrum of genetic alterations associated with CDG, providing new insights for the diagnosis and the understanding of the physiopathological mechanisms underlying glycosylation disorders.

%B Hum Mutat %V 38 %P 148-151 %8 2017 02 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/27862579?dopt=Abstract %R 10.1002/humu.23145 %0 Journal Article %J BMC bioinformatics %D 2017 %T A new parallel pipeline for DNA methylation analysis of long reads datasets. %A Olanda, Ricardo %A Pérez, Mariano %A Orduña, Juan M %A Tárraga, Joaquín %A Joaquín Dopazo %K Methyl-Seq %K NGS %X BACKGROUND: DNA methylation is an important mechanism of epigenetic regulation in development and disease. New generation sequencers allow genome-wide measurements of the methylation status by reading short stretches of the DNA sequence (Methyl-seq). Several software tools for methylation analysis have been proposed over recent years. However, the current trend is that the new sequencers and the ones expected in the near future yield sequences of increasing length, making these software tools inefficient and obsolete. RESULTS: In this paper, we propose new software based on a strategy for methylation analysis of Methyl-seq sequencing data that requires much shorter execution times while yielding a better level of sensitivity, particularly for datasets composed of long reads. This strategy can be exported to other methylation, DNA and RNA analysis tools. CONCLUSIONS: The developed software tool achieves execution times one order of magnitude shorter than the existing tools, while yielding equal sensitivity for short reads and even better sensitivity for long reads. %B BMC bioinformatics %V 18 %P 161 %8 2017 Mar 09 %G eng %U http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-017-1574-3 %R 10.1186/s12859-017-1574-3 %0 Journal Article %J BMC Bioinformatics %D 2017 %T VISMapper: ultra-fast exhaustive cartography of viral insertion sites for gene therapy. 
%A Juanes, José M %A Gallego, Asunción %A Tárraga, Joaquín %A Chaves, Felipe J %A Marin-Garcia, Pablo %A Medina, Ignacio %A Arnau, Vicente %A Dopazo, Joaquin %K Base Sequence %K Genetic Therapy %K Genetic Vectors %K High-Throughput Nucleotide Sequencing %K Humans %K Internet %K User-Computer Interface %K Virus Integration %X

BACKGROUND: The possibility of integrating viral vectors to become a persistent part of the host genome makes them a crucial element of clinical gene therapy. However, viral integration has associated risks, such as the unintentional activation of oncogenes that can result in cancer. Therefore, the analysis of integration sites of retroviral vectors is a crucial step in developing safer vectors for therapeutic use.

RESULTS: Here we present VISMapper, a vector integration site analysis web server, to analyze next-generation sequencing data for retroviral vector integration sites. VISMapper can be found at: http://vismapper.babelomics.org .

CONCLUSIONS: Because it uses novel mapping algorithms, VISMapper is remarkably faster than previously available programs. It also provides a useful graphical interface to analyze the integration sites found in the genomic context.

%B BMC Bioinformatics %V 18 %P 421 %8 2017 Sep 20 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/28931371?dopt=Abstract %R 10.1186/s12859-017-1837-z %0 Journal Article %J Genome biology %D 2017 %T Whole exome sequencing coupled with unbiased functional analysis reveals new Hirschsprung disease genes. 
%A Gui, Hongsheng %A Schriemer, Duco %A Cheng, William W %A Chauhan, Rajendra K %A Antiñolo, Guillermo %A Berrios, Courtney %A Bleda, Marta %A Brooks, Alice S %A Brouwer, Rutger W W %A Burns, Alan J %A Cherny, Stacey S %A Dopazo, Joaquin %A Eggen, Bart J L %A Griseri, Paola %A Jalloh, Binta %A Le, Thuy-Linh %A Lui, Vincent C H %A Luzón-Toro, Berta %A Matera, Ivana %A Ngan, Elly S W %A Pelet, Anna %A Ruiz-Ferrer, Macarena %A Sham, Pak C %A Shepherd, Iain T %A So, Man-Ting %A Sribudiani, Yunia %A Tang, Clara S M %A van den Hout, Mirjam C G N %A van der Linde, Herma C %A van Ham, Tjakko J %A van IJcken, Wilfred F J %A Verheij, Joke B G M %A Amiel, Jeanne %A Borrego, Salud %A Ceccherini, Isabella %A Chakravarti, Aravinda %A Lyonnet, Stanislas %A Tam, Paul K H %A Garcia-Barceló, Maria-Mercè %A Hofstra, Robert M W %K Hirschsprung %K Rare Disease %K WES %X BACKGROUND: Hirschsprung disease (HSCR), which is congenital obstruction of the bowel, results from a failure of enteric nervous system (ENS) progenitors to migrate, proliferate, differentiate, or survive within the distal intestine. Previous studies that have searched for genes underlying HSCR have focused on ENS-related pathways and genes not fitting the current knowledge have thus often been ignored. We identify and validate novel HSCR genes using whole exome sequencing (WES), burden tests, in silico prediction, unbiased in vivo analyses of the mutated genes in zebrafish, and expression analyses in zebrafish, mouse, and human. RESULTS: We performed de novo mutation (DNM) screening on 24 HSCR trios. We identify 28 DNMs in 21 different genes. Eight of the DNMs we identified occur in RET, the main HSCR gene, and the remaining 20 DNMs reside in genes not reported in the ENS. Knockdown of all 12 genes with missense or loss-of-function DNMs showed that the orthologs of four genes (DENND3, NCLN, NUP98, and TBATA) are indispensable for ENS development in zebrafish, and these results were confirmed by CRISPR knockout. 
These genes are also expressed in human and mouse gut and/or ENS progenitors. Importantly, the encoded proteins are linked to neuronal processes shared by the central nervous system and the ENS. CONCLUSIONS: Our data open new fields of investigation into HSCR pathology and provide novel insights into the development of the ENS. Moreover, the study demonstrates that functional analyses of genes carrying DNMs are warranted to delineate the full genetic architecture of rare complex diseases. %B Genome biology %V 18 %P 48 %8 2017 Mar 08 %G eng %U http://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1174-6 %R 10.1186/s13059-017-1174-6 %0 Journal Article %J The Journal of molecular diagnostics : JMD %D 2016 %T Assessment of Targeted Next-Generation Sequencing as a Tool for the Diagnosis of Charcot-Marie-Tooth Disease and Hereditary Motor Neuropathy. %A Lupo, Vincenzo %A Garcia-Garcia, Francisco %A Sancho, Paula %A Tello, Cristina %A García-Romero, Mar %A Villarreal, Liliana %A Alberti, Antonia %A Sivera, Rafael %A Joaquín Dopazo %A Pascual-Pascual, Samuel I %A Márquez-Infante, Celedonio %A Casasnovas, Carlos %A Sevilla, Teresa %A Espinós, Carmen %K Charcot-Marie-Tooth %K CMT %K Diagnostic %K NGS %K Panels %K rare diseases %K Targeted resequencing %X Charcot-Marie-Tooth disease is characterized by broad genetic heterogeneity with >50 known disease-associated genes. Mutations in some of these genes can cause a pure motor form of hereditary motor neuropathy, the genetics of which are poorly characterized. We designed a panel comprising 56 genes associated with Charcot-Marie-Tooth disease/hereditary motor neuropathy. We validated this diagnostic tool by first testing 11 patients with pathological mutations. A cohort of 33 affected subjects was selected for this study. The DNAJB2 c.352+1G>A mutation was detected in two cases; novel changes and/or variants with low frequency (<1%) were found in 12 cases. 
There were no candidate variants in 18 cases, and amplification failed for one sample. The DNAJB2 c.352+1G>A mutation was also detected in three additional families. On haplotype analysis, all of the patients from these five families shared the same haplotype; therefore, the DNAJB2 c.352+1G>A mutation may be a founder event. Our gene panel allowed us to perform a very rapid and cost-effective screening of genes involved in Charcot-Marie-Tooth disease/hereditary motor neuropathy. Our diagnostic strategy was robust in terms of both coverage and read depth for all of the genes and patient samples. These findings demonstrate the difficulty in achieving a definitive molecular diagnosis because of the complexity of interpreting new variants and the genetic heterogeneity that is associated with these neuropathies. %B The Journal of molecular diagnostics : JMD %8 2016 Jan 2 %G eng %U http://www.sciencedirect.com/science/article/pii/S1525157815002615 %R 10.1016/j.jmoldx.2015.10.005 %0 Journal Article %J Cell Cycle %D 2016 %T Dysfunctional mitochondrial fission impairs cell reprogramming. %A Prieto, Javier %A León, Marian %A Ponsoda, Xavier %A Garcia-Garcia, Francisco %A Bort, Roque %A Serna, Eva %A Barneo-Muñoz, Manuela %A Palau, Francesc %A Dopazo, Joaquin %A López-García, Carlos %A Torres, Josema %K Animals %K Cell Cycle Checkpoints %K Cellular Reprogramming %K DNA Damage %K G2 Phase %K Gene Knockdown Techniques %K Mice %K Mitochondrial Dynamics %K Mitosis %K Nerve Tissue Proteins %K Pluripotent Stem Cells %K Transcription Factors %X

We have recently shown that mitochondrial fission is induced early in reprogramming in a Drp1-dependent manner; however, the identity of the factors controlling Drp1 recruitment to mitochondria was unexplored. To investigate this, we used a panel of RNAi targeting factors involved in the regulation of mitochondrial dynamics and observed that MiD51, Gdap1 and, to a lesser extent, Mff play key roles in this process. Cells derived from Gdap1-null mice were used to further explore the role of this factor in cell reprogramming. Microarray data revealed a prominent down-regulation of cell cycle pathways in Gdap1-null cells early in reprogramming and cell cycle profiling uncovered a G2/M growth arrest in Gdap1-null cells undergoing reprogramming. High-Content analysis showed that this growth arrest was DNA damage-independent. We propose that lack of efficient mitochondrial fission impairs cell reprogramming by interfering with cell cycle progression in a DNA damage-independent manner.

%B Cell Cycle %V 15 %P 3240-3250 %8 2016 Dec %G eng %N 23 %1 https://www.ncbi.nlm.nih.gov/pubmed/27753531?dopt=Abstract %R 10.1080/15384101.2016.1241930 %0 Journal Article %J Nature communications %D 2016 %T Extension of human lncRNA transcripts by RACE coupled with long-read high-throughput sequencing (RACE-Seq). %A Lagarde, Julien %A Uszczynska-Ratajczak, Barbara %A Santoyo-López, Javier %A Gonzalez, Jose Manuel %A Tapanari, Electra %A Mudge, Jonathan M %A Steward, Charles A %A Wilming, Laurens %A Tanzer, Andrea %A Howald, Cédric %A Chrast, Jacqueline %A Vela-Boza, Alicia %A Antonio Rueda %A López-Domingo, Francisco J %A Dopazo, Joaquin %A Reymond, Alexandre %A Guigó, Roderic %A Harrow, Jennifer %X Long non-coding RNAs (lncRNAs) constitute a large, yet mostly uncharacterized fraction of the mammalian transcriptome. Such characterization requires a comprehensive, high-quality annotation of their gene structure and boundaries, which is currently lacking. Here we describe RACE-Seq, an experimental workflow designed to address this based on RACE (rapid amplification of cDNA ends) and long-read RNA sequencing. 
We apply RACE-Seq to 398 human lncRNA genes in seven tissues, leading to the discovery of 2,556 on-target, novel transcripts. About 60% of the targeted loci are extended in either 5’ or 3’, often reaching genomic hallmarks of gene boundaries. Analysis of the novel transcripts suggests that lncRNAs are as long, have as many exons and undergo as much alternative splicing as protein-coding genes, contrary to current assumptions. Overall, we show that RACE-Seq is an effective tool to annotate an organism’s deep transcriptome, and compares favourably to other targeted sequencing techniques. %B Nature communications %V 7 %P 12339 %8 2016 %G eng %U http://www.nature.com/articles/ncomms12339 %R 10.1038/ncomms12339 %0 Journal Article %J DNA Res %D 2016 %T Highly sensitive and ultrafast read mapping for RNA-seq analysis. %A Medina, I %A Tárraga, J %A Martínez, H %A Barrachina, S %A Castillo, M I %A Paschall, J %A Salavert-Torres, J %A Blanquer-Espert, I %A Hernández-García, V %A Quintana-Ortí, E S %A Dopazo, J %K Genomics %K High-Throughput Nucleotide Sequencing %K Humans %K Sensitivity and Specificity %K Sequence Analysis, RNA %K Transcriptome %X

As sequencing technologies progress, the amount of data produced grows exponentially, shifting the bottleneck of discovery towards the data analysis phase. In particular, currently available mapping solutions for RNA-seq leave room for improvement in terms of sensitivity and performance, hindering an efficient analysis of transcriptomes by massive sequencing. Here, we present an innovative approach that combines re-engineering, optimization and parallelization. This solution results in a significant increase in mapping sensitivity over a wide range of read lengths and substantially shorter runtimes when compared with currently available RNA-seq mapping methods.

%B DNA Res %V 23 %P 93-100 %8 2016 Apr %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/26740642?dopt=Abstract %R 10.1093/dnares/dsv039 %0 Journal Article %J BMC bioinformatics %D 2016 %T HPG pore: an efficient and scalable framework for nanopore sequencing data. %A Tárraga, Joaquín %A Gallego, Asunción %A Arnau, Vicente %A Medina, Ignacio %A Dopazo, Joaquin %K hadoop %K HPC %K nanopore %K NGS %X BACKGROUND: The use of nanopore technologies is expected to spread in the future because they are portable and can sequence long fragments of DNA molecules without prior amplification. The first nanopore sequencer available, the MinION™ from Oxford Nanopore Technologies, is a USB-connected, portable device that allows real-time DNA analysis. In addition, other new instruments are expected to be released soon, which promise to outperform the current short-read technologies in terms of throughput. Despite the flood of data expected from this technology, the data analysis solutions currently available are only designed to manage small projects and are not scalable. RESULTS: Here we present HPG Pore, a toolkit for exploring and analysing nanopore sequencing data. HPG Pore can run on both individual computers and in the Hadoop distributed computing framework, which allows easy scale-up to manage the large amounts of data expected to result from extensive use of nanopore technologies in the future. CONCLUSIONS: HPG Pore allows for virtually unlimited sequencing data scalability, thus guaranteeing its continued management in near-future scenarios. HPG Pore is available on GitHub at http://github.com/opencb/hpg-pore . 
%B BMC bioinformatics %V 17 %P 107 %8 2016 %G eng %U http://www.biomedcentral.com/1471-2105/17/107 %R 10.1186/s12859-016-0966-0 %0 Journal Article %J Plant Biotechnol J %D 2016 %T Integrating transcriptomic and metabolomic analysis to understand natural leaf senescence in sunflower. %A Moschen, Sebastián %A Bengoa Luoni, Sofía %A Di Rienzo, Julio A %A Caro, María Del Pilar %A Tohge, Takayuki %A Watanabe, Mutsumi %A Hollmann, Julien %A Gonzalez, Sergio %A Rivarola, Máximo %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Hopp, Horacio Esteban %A Hoefgen, Rainer %A Fernie, Alisdair R %A Paniego, Norma %A Fernandez, Paula %A Heinz, Ruth A %K Gas Chromatography-Mass Spectrometry %K Gene Expression Profiling %K Gene Expression Regulation, Plant %K Gene ontology %K Genes, Plant %K Helianthus %K Ions %K metabolomics %K Oligonucleotide Array Sequence Analysis %K Plant Leaves %K Principal Component Analysis %K RNA, Messenger %K Transcription Factors %X

Leaf senescence is a complex process, which has dramatic consequences on crop yield. In sunflower, the gap between potential and actual yields reveals the economic impact of senescence. Indeed, sunflower plants are incapable of maintaining their green leaf area over sustained periods. This study characterizes the leaf senescence process in sunflower through a systems biology approach integrating transcriptomic and metabolomic analyses, with plants grown under both glasshouse and field conditions. Our results revealed a correspondence between profile changes detected at the molecular, biochemical and physiological level throughout the progression of leaf senescence measured at different plant developmental stages. Early metabolic changes were detected prior to anthesis and before the onset of the first senescence symptoms, with more pronounced changes observed when physiological and molecular variables were assessed under field conditions. During leaf development, photosynthetic activity and cell growth processes decreased, whereas sucrose, fatty acid, nucleotide and amino acid metabolisms increased. Pathways related to nutrient recycling processes were also up-regulated. Members of the NAC, AP2-EREBP, HB, bZIP and MYB transcription factor families showed high expression levels, and their expression levels were highly correlated, suggesting their involvement in sunflower senescence. The results of this study thus contribute to the elucidation of the molecular mechanisms involved in the onset and progression of leaf senescence in sunflower leaves as well as to the identification of candidate genes involved in this process.

%B Plant Biotechnol J %V 14 %P 719-34 %8 2016 Feb %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/26132509?dopt=Abstract %R 10.1111/pbi.12422 %0 Journal Article %J Am J Med Genet A %D 2016 %T Screening of CD96 and ASXL1 in 11 patients with Opitz C or Bohring-Opitz syndromes. %A Urreizti, Roser %A Roca-Ayats, Neus %A Trepat, Judith %A Garcia-Garcia, Francisco %A Alemán, Alejandro %A Orteschi, Daniela %A Marangi, Giuseppe %A Neri, Giovanni %A Opitz, John M %A Dopazo, Joaquin %A Cormand, Bru %A Vilageliu, Lluïsa %A Balcells, Susana %A Grinberg, Daniel %K Adolescent %K Antigens, CD %K Child %K Child, Preschool %K Craniosynostoses %K Exome %K Female %K High-Throughput Nucleotide Sequencing %K Humans %K Infant %K Intellectual Disability %K Male %K mutation %K Pedigree %K Phenotype %K Prognosis %K Repressor Proteins %X

Opitz C trigonocephaly (or Opitz C syndrome, OTCS) and Bohring-Opitz syndrome (BOS or C-like syndrome) are two rare genetic disorders with phenotypic overlap. The genetic causes of these diseases are not understood. However, two genes have been associated with OTCS or BOS with dominantly inherited de novo mutations. Whereas CD96 has been related to OTCS (one case) and to BOS (one case), ASXL1 has been related to BOS only (several cases). In this study we analyze CD96 and ASXL1 in a group of 11 affected individuals, including 2 sibs; 10 of them were diagnosed with OTCS and one had a BOS phenotype. Exome sequences were available for six patients with OTCS and three parent pairs. Thus, we could analyze the CD96 and ASXL1 sequences in these patients bioinformatically. Sanger sequencing of all exons of CD96 and ASXL1 was carried out in the remaining patients. Detailed scrutiny of the sequences and assessment of variants allowed us to exclude putative pathogenic and private mutations in all but one of the patients. In this patient (with BOS) we identified a de novo mutation in ASXL1 (c.2100dupT). By nature and location within the gene, this mutation resembles those previously described in other BOS patients and we conclude that it may be responsible for the condition. Our results indicate that in 10 of the 11 patients, the disease (OTCS or BOS) cannot be explained by small changes in CD96 or ASXL1. However, the cohort is too small to make generalizations about the genetic etiology of these diseases.

%B Am J Med Genet A %V 170A %P 24-31 %8 2016 Jan %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/26768331?dopt=Abstract %R 10.1002/ajmg.a.37418 %0 Journal Article %J Nucleic acids research %D 2015 %T Babelomics 5.0: functional interpretation for new generations of genomic data. %A Alonso, Roberto %A Salavert, Francisco %A Garcia-Garcia, Francisco %A Carbonell-Caballero, José %A Bleda, Marta %A García-Alonso, Luz %A Sanchis-Juan, Alba %A Perez-Gil, Daniel %A Marin-Garcia, Pablo %A Sánchez, Rubén %A Cubuk, Cankut %A Hidalgo, Marta R %A Amadoz, Alicia %A Hernansaiz-Ballesteros, Rosa D %A Alemán, Alejandro %A Tárraga, Joaquín %A Montaner, David %A Medina, Ignacio %A Dopazo, Joaquin %K babelomics %K data integration %K gene set analysis %K interactome %K network analysis %K NGS %K RNA-seq %K Systems biology %K transcriptomics %X Babelomics has been running for more than one decade offering a user-friendly interface for the functional analysis of gene expression and genomic data. Here we present its fifth release, which includes support for Next Generation Sequencing data including gene expression (RNA-seq), exome or genome resequencing. Babelomics has simplified its interface, which is now more intuitive. Improved visualization options, such as a genome viewer as well as an interactive network viewer, have been implemented. New technical enhancements on both the client and server sides make the user experience faster and more dynamic. Babelomics offers user-friendly access to a full range of methods that cover: (i) primary data analysis, (ii) a variety of tests for different experimental designs and (iii) different enrichment and network analysis algorithms for the interpretation of the results of such tests in the proper functional context. In addition to the public server, local copies of Babelomics can be downloaded and installed. Babelomics is freely available at: http://www.babelomics.org. 
%B Nucleic acids research %V 43 %P W117-W121 %8 2015 Apr 20 %G eng %U http://nar.oxfordjournals.org/content/43/W1/W117 %R 10.1093/nar/gkv384 %0 Journal Article %J Nature methods %D 2015 %T Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection. %A Ewing, Adam D %A Houlahan, Kathleen E %A Hu, Yin %A Ellrott, Kyle %A Caloian, Cristian %A Yamaguchi, Takafumi N %A Bare, J Christopher %A P’ng, Christine %A Waggott, Daryl %A Sabelnykova, Veronica Y %A Kellen, Michael R %A Norman, Thea C %A Haussler, David %A Friend, Stephen H %A Stolovitzky, Gustavo %A Margolin, Adam A %A Stuart, Joshua M %A Boutros, Paul C %E ICGC-TCGA DREAM Somatic Mutation Calling Challenge participants %E Liu Xi %E Ninad Dewal %E Yu Fan %E Wenyi Wang %E David Wheeler %E Andreas Wilm %E Grace Hui Ting %E Chenhao Li %E Denis Bertrand %E Niranjan Nagarajan %E Qing-Rong Chen %E Chih-Hao Hsu %E Ying Hu %E Chunhua Yan %E Warren Kibbe %E Daoud Meerzaman %E Kristian Cibulskis %E Mara Rosenberg %E Louis Bergelson %E Adam Kiezun %E Amie Radenbaugh %E Anne-Sophie Sertier %E Anthony Ferrari %E Laurie Tonton %E Kunal Bhutani %E Nancy F Hansen %E Difei Wang %E Lei Song %E Zhongwu Lai %E Liao, Yang %E Shi, Wei %E Carbonell-Caballero, José %E Joaquín Dopazo %E Cheryl C K Lau %E Justin Guinney %K cancer %K NGS %K variant calling %X The detection of somatic mutations from cancer genome sequences is key to understanding the genetic basis of disease progression, patient survival and response to therapy. Benchmarking is needed for tool assessment and improvement but is complicated by a lack of gold standards, by extensive resource requirements and by difficulties in sharing personal genomic information. To resolve these issues, we launched the ICGC-TCGA DREAM Somatic Mutation Calling Challenge, a crowdsourced benchmark of somatic mutation detection algorithms. 
Here we report the BAMSurgeon tool for simulating cancer genomes and the results of 248 analyses of three in silico tumors created with it. Different algorithms exhibit characteristic error profiles, and, intriguingly, false positives show a trinucleotide profile very similar to one found in human tumors. Although the three simulated tumors differ in sequence contamination (deviation from normal cell sequence) and in subclonality, an ensemble of pipelines outperforms the best individual pipeline in all cases. BAMSurgeon is available at https://github.com/adamewing/bamsurgeon/. %B Nature methods %8 2015 May 18 %G eng %U http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3407.html %R 10.1038/nmeth.3407 %0 Journal Article %J IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM %D 2015 %T Concurrent and Accurate Short Read Mapping on Multicore Processors. %A Martinez, Hector %A Tárraga, Joaquín %A Medina, Ignacio %A Barrachina, Sergio %A Castillo, Maribel %A Dopazo, Joaquin %A Quintana-Orti, Enrique S %K HPC %K NGS %K short read mapping %X We introduce a parallel aligner with a work-flow organization for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, [Formula: see text] (an open-source application available at http://www.opencb.org), exploits a suffix array to rapidly map a large fraction of the RNA fragments (reads), and leverages the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. The aligner is enhanced with a careful strategy to detect splice junctions based on an adaptive division of RNA reads into small segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing crucial information for the successful alignment of the complete reads. 
The experimental results on a platform with Intel multicore technology report the parallel performance of [Formula: see text], on RNA reads of 100-400 nucleotides, which excels in execution time/sensitivity to state-of-the-art aligners such as TopHat 2+Bowtie 2, MapSplice, and STAR. %B IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM %V 12 %P 995-1007 %8 2015 Sep-Oct %G eng %U http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=7010005 %R 10.1109/TCBB.2015.2392077 %0 Journal Article %J Scientific reports %D 2015 %T Exome sequencing reveals a high genetic heterogeneity on familial Hirschsprung disease. 
%A Luzón-Toro, Berta %A Gui, Hongsheng %A Ruiz-Ferrer, Macarena %A Sze-Man Tang, Clara %A Fernández, Raquel M %A Sham, Pak-Chung %A Torroglosa, Ana %A Kwong-Hang Tam, Paul %A Espino-Paisán, Laura %A Cherny, Stacey S %A Bleda, Marta %A Enguix-Riego, María Del Valle %A Joaquín Dopazo %A Antiñolo, Guillermo %A Garcia-Barceló, Maria-Mercè %A Borrego, Salud %K babelomics %K Hirschsprung %K NGS %K prioritization %X Hirschsprung disease (HSCR; OMIM 142623) is a developmental disorder characterized by aganglionosis along variable lengths of the distal gastrointestinal tract, which results in intestinal obstruction. Interactions among known HSCR genes and/or unknown disease susceptibility loci lead to variable severity of phenotype. Neither linkage nor genome-wide association studies have contributed efficiently to fully dissecting the genetic pathways underlying this complex genetic disorder. We have performed whole exome sequencing of 16 HSCR patients from 8 unrelated families with the SOLiD platform. Variants shared by affected relatives were validated by Sanger sequencing. We searched for genes recurrently mutated across families. Only variations in the FAT3 gene were significantly enriched in five families. Within-family analysis identified compound heterozygotes for AHNAK and several genes (N = 23) with heterozygous variants that co-segregated with the phenotype. Network and pathway analyses facilitated the discovery of polygenic inheritance involving FAT3, HSCR known genes and their gene partners. Altogether, our approach has facilitated the detection of more than one damaging variant in biologically plausible genes that could jointly contribute to the phenotype. Our data may contribute to the understanding of the complex interactions that occur during enteric nervous system development and the etiopathology of familial HSCR. 
%B Scientific reports %V 5 %P 16473 %8 2015 %G eng %U http://www.nature.com/articles/srep16473 %R 10.1038/srep16473 %0 Journal Article %J BMC Bioinformatics %D 2015 %T Fast inexact mapping using advanced tree exploration on backward search methods. %A Salavert, José %A Tomás, Andrés %A Tárraga, Joaquín %A Medina, Ignacio %A Dopazo, Joaquin %A Blanquer, Ignacio %K Algorithms %K Genome, Human %K Genomics %K High-Throughput Nucleotide Sequencing %K Humans %K Sequence Alignment %K Sequence Analysis, DNA %K Software %X

BACKGROUND: Short sequence mapping methods for Next Generation Sequencing consist of a combination of seeding techniques followed by local alignment based on dynamic programming approaches. Most seeding algorithms are based on backward search alignment, using the Burrows Wheeler Transform, the Ferragina and Manzini Index or Suffix Arrays. All these backward search algorithms have excellent performance, but their computational cost increases sharply when allowing errors. In this paper, we discuss an inexact mapping algorithm based on pruning strategies for search tree exploration over genomic data.

RESULTS: The proposed algorithm achieves a 13x speed-up over similar algorithms when allowing 6 base errors, including insertions, deletions and mismatches. This algorithm can deal with 400 bp reads with up to 9 errors in a high quality Illumina dataset. In this example, the algorithm works as a preprocessor that reduces the number of reads to be aligned by 55%. Depending on the aligner, the overall execution time is reduced by 20-40%.

CONCLUSIONS: Although not intended as a complete sequence mapping tool, the proposed algorithm could be used as a preprocessing step for modern sequence mappers. This step significantly reduces the number of reads to be aligned, accelerating overall alignment time. Furthermore, this algorithm could be used for accelerating the seeding step of already available sequence mappers. In addition, an out-of-core index has been implemented for working with large genomes on systems without expensive memory configurations.

%B BMC Bioinformatics %V 16 %P 18 %8 2015 Jan 28 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/25626517?dopt=Abstract %R 10.1186/s12859-014-0438-3 %0 Journal Article %J BMC medical genomics %D 2015 %T Identification of epistatic interactions through genome-wide association studies in sporadic medullary and juvenile papillary thyroid carcinomas. %A Luzón-Toro, Berta %A Bleda, Marta %A Navarro, Elena %A García-Alonso, Luz %A Ruiz-Ferrer, Macarena %A Medina, Ignacio %A Martín-Sánchez, Marta %A Gonzalez, Cristina Y %A Fernández, Raquel M %A Torroglosa, Ana %A Antiñolo, Guillermo %A Dopazo, Joaquin %A Borrego, Salud %K epistasis %K GWAS %K Thyroid cancer %X BACKGROUND: The molecular mechanisms leading to sporadic medullary thyroid carcinoma (sMTC) and juvenile papillary thyroid carcinoma (PTC), two rare tumours of the thyroid gland, remain poorly understood. Genetic studies on thyroid carcinomas have been conducted, although just a few loci have been systematically associated. Given the difficulty of obtaining single-locus associations, this work expands its scope to the study of epistatic interactions that could help to understand the genetic architecture of complex diseases and explain new heritable components of genetic risk. METHODS: We carried out the first screening for epistasis by Multifactor-Dimensionality Reduction (MDR) in a genome-wide association study (GWAS) on sMTC and juvenile PTC, to identify the potential simultaneous involvement of pairs of variants in the disease. RESULTS: We have identified two significant epistatic gene interactions in sMTC (CHFR-AC016582.2 and C8orf37-RNU1-55P) and three in juvenile PTC (RP11-648k4.2-DIO1, RP11-648k4.2-DMGDH and RP11-648k4.2-LOXL1). Interestingly, each interacting gene pair included a non-coding RNA, thus supporting the increasing relevance of these elements in explaining carcinoma development and progression.
CONCLUSIONS: Overall, this study contributes to the understanding of the genetic basis of thyroid carcinoma susceptibility in two different scenarios, sMTC and juvenile PTC. %B BMC medical genomics %V 8 %P 83 %8 2015 %G eng %U http://bmcmedgenomics.biomedcentral.com/articles/10.1186/s12920-015-0160-7 %R 10.1186/s12920-015-0160-7 %0 Journal Article %J BMC Medical Genomics %D 2015 %T Identification of epistatic interactions through genome-wide association studies in sporadic medullary and juvenile papillary thyroid carcinomas %A Luzón-Toro, Berta %A Bleda, Marta %A Navarro, Elena %A García-Alonso, Luz %A Ruiz-Ferrer, Macarena %A Medina, Ignacio %A Martín-Sánchez, Marta %A Gonzalez, Cristina Y. %A Fernández, Raquel M. %A Torroglosa, Ana %A Antiñolo, Guillermo %A Dopazo, Joaquin %A Borrego, Salud %X The molecular mechanisms leading to sporadic medullary thyroid carcinoma (sMTC) and juvenile papillary thyroid carcinoma (PTC), two rare tumours of the thyroid gland, remain poorly understood. Genetic studies on thyroid carcinomas have been conducted, although just a few loci have been systematically associated. Given the difficulty of obtaining single-locus associations, this work expands its scope to the study of epistatic interactions that could help to understand the genetic architecture of complex diseases and explain new heritable components of genetic risk. %B BMC Medical Genomics %V 8 %P 83 %8 Dec %G eng %U https://doi.org/10.1186/s12920-015-0160-7 %R 10.1186/s12920-015-0160-7 %0 Journal Article %J BMC genomics %D 2015 %T Involvement of a citrus meiotic recombination TTC-repeat motif in the formation of gross deletions generated by ionizing radiation and MULE activation.
%A Terol, Javier %A Ibañez, Victoria %A Carbonell, José %A Alonso, Roberto %A Estornell, Leandro H %A Licciardello, Concetta %A Gut, Ivo G %A Joaquín Dopazo %A Talon, Manuel %X BACKGROUND: Transposable-element mediated chromosomal rearrangements require the involvement of two transposons and two double-strand breaks (DSB) located in close proximity. In radiobiology, DSB proximity is also a major factor contributing to rearrangements. However, the whole issue of DSB proximity remains virtually unexplored. RESULTS: Based on DNA sequencing analysis we show that the genomes of 2 derived mutations, Arrufatina (sport) and Nero (irradiation), share a similar 2 Mb deletion of chromosome 3. A 7 kb Mutator-like element found in Clemenules was present in Arrufatina in inverted orientation flanking the 5' end of the deletion. The Arrufatina Mule displayed "dissimilar" 9-bp target site duplications separated by 2 Mb. Fine-scale single nucleotide variant analyses of the deleted fragments identified a TTC-repeat sequence motif located in the center of the deletion responsible for a meiotic crossover detected in the citrus reference genome. CONCLUSIONS: Taken together, this information is compatible with the proposal that in both mutants, the TTC-repeat motif formed a triplex DNA structure generating a loop that brought the originally distinct reactive ends into close proximity. In Arrufatina, the loop brought the Mule ends near the 2 distinct insertion target sites and the inverted insertion of the transposable element between these target sites provoked the release of the in-between fragment. This proposal requires the involvement of a unique transposon and sheds light on the unresolved question of how two distinct sites become located in close proximity. These observations confer a crucial role to the TTC-repeats in fundamental plant processes such as meiotic recombination and chromosomal rearrangements.
%B BMC genomics %V 16 %P 69 %8 2015 Feb 13 %G eng %U http://www.biomedcentral.com/1471-2164/16/69 %R 10.1186/s12864-015-1280-3 %0 Journal Article %J BMC Genomics %D 2015 %T Involvement of a citrus meiotic recombination TTC-repeat motif in the formation of gross deletions generated by ionizing radiation and MULE activation %A Terol, Javier %A Ibañez, Victoria %A Carbonell, José %A Alonso, Roberto %A Estornell, Leandro H. %A Licciardello, Concetta %A Gut, Ivo G. %A Dopazo, Joaquin %A Talon, Manuel %X Transposable-element mediated chromosomal rearrangements require the involvement of two transposons and two double-strand breaks (DSB) located in close proximity. In radiobiology, DSB proximity is also a major factor contributing to rearrangements. However, the whole issue of DSB proximity remains virtually unexplored. %B BMC Genomics %V 16 %P 69 %8 Feb %G eng %U https://doi.org/10.1186/s12864-015-1280-3 %R 10.1186/s12864-015-1280-3 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2015 %T A Parallel and Sensitive Software Tool for Methylation Analysis on Multicore Platforms. %A Tárraga, Joaquín %A Pérez, Mariano %A Orduña, Juan M %A Duato, José %A Medina, Ignacio %A Joaquín Dopazo %K BS-seq %K HPC %K methylation %K NGS %X MOTIVATION: DNA methylation analysis suffers from very long processing times, since the advent of Next-Generation Sequencers (NGS) has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently either with the size of the dataset or with the length of the reads to be analyzed. Since it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed.
RESULTS: We present a new software tool, called HPG-Methyl, which efficiently maps bisulfite sequencing reads on DNA, analyzing DNA methylation. The strategy used by this software consists of leveraging the speed of the Burrows-Wheeler Transform to map a large number of DNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm, which is exclusively employed to deal with the most ambiguous and shortest reads. Experimental results on platforms with Intel multicore processors show that HPG-Methyl significantly outperforms state-of-the-art software such as Bismark, BS-Seeker or BSMAP in both execution time and sensitivity, particularly for long bisulfite reads. AVAILABILITY: Software in the form of C libraries and functions, together with instructions to compile and execute this software. Available by sftp to anonymous@clariano.uv.es (password "anonymous"). CONTACT: Juan.Orduna@uv.es. %B Bioinformatics (Oxford, England) %V 31 %P 3130-3138 %8 2015 Jun 10 %G eng %U http://bioinformatics.oxfordjournals.org/content/31/19/3130.long %R 10.1093/bioinformatics/btv357 %0 Journal Article %J Molecular biology and evolution %D 2015 %T A phylogenetic analysis of 34 chloroplast genomes elucidates the relationships between wild and domestic species within the genus Citrus. %A Carbonell-Caballero, José %A Alonso, Roberto %A Ibañez, Victoria %A Terol, Javier %A Talon, Manuel %A Dopazo, Joaquin %K chloroplast %K citrus %K Phylogeny %K WGS %X The genus Citrus includes some of the most important cultivated fruit trees worldwide. Despite being extensively studied because of its commercial relevance, the origin of cultivated citrus species and the history of its domestication still remain an open question. Here we present a phylogenetic analysis of the chloroplast genomes of 34 citrus genotypes which constitutes the most comprehensive and detailed study to date on the evolution and variability of the genus Citrus.
A statistical model was used to estimate divergence times between the major citrus groups. Additionally, a complete map of the variability across the genome of different citrus species was produced, including single nucleotide variants, heteroplasmic positions, indels and large structural variants. The distribution of all these variants provided further independent support to the phylogeny obtained. An unexpected finding was the high level of heteroplasmy found in several of the analysed genomes. The use of the complete chloroplast DNA not only paves the way for a better understanding of the phylogenetic relationships within the Citrus genus, but also provides original insights into other elusive evolutionary processes such as chloroplast inheritance, heteroplasmy and gene selection. %B Molecular biology and evolution %V 32 %P 2015-2035 %8 2015 Apr 14 %G eng %U http://mbe.oxfordjournals.org/content/early/2015/04/27/molbev.msv082.full %R 10.1093/molbev/msv082 %0 Journal Article %J Nature biotechnology %D 2015 %T Prediction of human population responses to toxic compounds by a collaborative competition. %A Eduati, Federica %A Mangravite, Lara M %A Wang, Tao %A Tang, Hao %A Bare, J Christopher %A Huang, Ruili %A Norman, Thea %A Kellen, Mike %A Menden, Michael P %A Yang, Jichen %A Zhan, Xiaowei %A Zhong, Rui %A Xiao, Guanghua %A Xia, Menghang %A Abdo, Nour %A Kosyk, Oksana %X The ability to computationally predict the effects of toxic compounds on humans could help address the deficiencies of current chemical safety testing. Here, we report the results from a community-based DREAM challenge to predict toxicities of environmental compounds with potential adverse health effects for human populations. We measured the cytotoxicity of 156 compounds in 884 lymphoblastoid cell lines for which genotype and transcriptional data are available as part of the Tox21 1000 Genomes Project. 
The challenge participants developed algorithms to predict interindividual variability of toxic response from genomic profiles and population-level cytotoxicity data from structural attributes of the compounds. 179 submitted predictions were evaluated against an experimental data set to which participants were blinded. Individual cytotoxicity predictions were better than random, with modest correlations (Pearson’s r < 0.28), consistent with complex trait genomic prediction. In contrast, correlations for predictions of population-level response to different compounds were higher (r < 0.66). The results highlight the possibility of predicting health risks associated with unknown compounds, although risk estimation accuracy remains suboptimal. %B Nature biotechnology %8 2015 Aug 10 %G eng %U http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3299.html %R 10.1038/nbt.3299 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2014 %T Acceleration of short and long DNA read mapping without loss of accuracy using suffix array. %A Tárraga, Joaquín %A Arnau, Vicente %A Martinez, Hector %A Moreno, Raul %A Cazorla, Diego %A Salavert-Torres, José %A Blanquer-Espert, Ignacio %A Joaquín Dopazo %A Medina, Ignacio %K NGS %K short read mapping %K HPC %K suffix arrays %X HPG Aligner applies suffix arrays for DNA read mapping. This implementation produces a highly sensitive and extremely fast mapping of DNA reads that scales up almost linearly with read length. The approach presented here is faster (over 20x for long reads) and more sensitive (over 98% in a wide range of read lengths) than the current, state-of-the-art mappers. HPG Aligner is not only an optimal alternative for current sequencers but also the only solution available to cope with longer reads and growing throughputs produced by forthcoming sequencing technologies.
%B Bioinformatics (Oxford, England) %V 30 %P 3396-3398 %8 2014 Aug 20 %G eng %U http://bioinformatics.oxfordjournals.org/content/early/2014/08/19/bioinformatics.btu553.long %R 10.1093/bioinformatics/btu553 %0 Journal Article %J Nature communications %D 2014 %T Assessing technical performance in differential gene expression experiments with external spike-in RNA control ratio mixtures. %A Munro, Sarah A %A Lund, Steven P %A Pine, P Scott %A Binder, Hans %A Clevert, Djork-Arné %A Ana Conesa %A Dopazo, Joaquin %A Fasold, Mario %A Hochreiter, Sepp %A Hong, Huixiao %A Jafari, Nadereh %A Kreil, David P %A Labaj, Paweł P %A Li, Sheng %A Liao, Yang %A Lin, Simon M %A Meehan, Joseph %A Mason, Christopher E %A Santoyo-López, Javier %A Setterquist, Robert A %A Shi, Leming %A Shi, Wei %A Smyth, Gordon K %A Stralis-Pavese, Nancy %A Su, Zhenqiang %A Tong, Weida %A Wang, Charles %A Wang, Jian %A Xu, Joshua %A Ye, Zhan %A Yang, Yong %A Yu, Ying %A Salit, Marc %K RNA-seq %X There is a critical need for standard approaches to assess, report and compare the technical performance of genome-scale differential gene expression experiments. Here we assess technical performance with a proposed standard ’dashboard’ of metrics derived from analysis of external spike-in RNA control ratio mixtures. These control ratio mixtures with defined abundance ratios enable assessment of diagnostic performance of differentially expressed transcript lists, limit of detection of ratio (LODR) estimates and expression ratio variability and measurement bias. The performance metrics suite is applicable to analysis of a typical experiment, and here we also apply these metrics to evaluate technical performance among laboratories. An interlaboratory study using identical samples shared among 12 laboratories with three different measurement processes demonstrates generally consistent diagnostic power across 11 laboratories. 
Ratio measurement variability and bias are also comparable among laboratories for the same measurement process. We observe different biases for measurement processes using different mRNA-enrichment protocols. %B Nature communications %V 5 %P 5125 %8 2014 %G eng %U http://www.nature.com/ncomms/2014/140925/ncomms6125/full/ncomms6125.html %R 10.1038/ncomms6125 %0 Journal Article %J PloS one %D 2014 %T Combined genetic and high-throughput strategies for molecular diagnosis of inherited retinal dystrophies. %A de Castro-Miró, Marta %A Pomares, Esther %A Lorés-Motta, Laura %A Tonda, Raul %A Joaquín Dopazo %A Marfany, Gemma %A Gonzàlez-Duarte, Roser %X Most diagnostic laboratories are confronted with the increasing demand for molecular diagnosis from patients and families and the ever-increasing genetic heterogeneity of visual disorders. Concerning Retinal Dystrophies (RD), almost 200 causative genes have been reported to date, and most families carry private mutations. We aimed to approach RD genetic diagnosis using all the available genetic information to prioritize candidates for mutational screening, and then restrict the number of cases to be analyzed by massive sequencing. We constructed and optimized a comprehensive cosegregation RD-chip based on SNP genotyping and haplotype analysis. The RD-chip allows genotyping of 768 selected SNPs (closely linked to 100 RD causative genes) in a single cost- and time-effective step. Full diagnosis was attained in 17/36 Spanish pedigrees, yielding 12 new and 12 previously reported mutations in 9 RD genes. The most frequently mutated genes were USH2A and CRB1. Notably, RD3, up to now only associated with Leber Congenital Amaurosis, was identified as causative of Retinitis Pigmentosa.
The main assets of the RD-chip are: i) the robustness of the genetic information that underscores the most probable candidates, ii) the invaluable clues in cases of shared haplotypes, which are indicative of a common founder effect, and iii) the detection of extended haplotypes over closely mapping genes, which substantiates cosegregation, although the assumptions on which the genetic analysis is based could exceptionally lead astray. The combination of the genetic approach with whole exome sequencing (WES) greatly increases the diagnosis efficiency and revealed novel mutations in USH2A and GUCY2D. Overall, the RD-chip diagnosis efficiency ranges from 16% in dominant to 80% in consanguineous recessive pedigrees, with an average of 47%, well within the upper range of massive sequencing approaches, highlighting the validity of this time- and cost-effective approach whilst high-throughput methodologies become amenable for routine diagnosis in medium sized labs. %B PloS one %V 9 %P e88410 %8 2014 %G eng %U http://dx.plos.org/10.1371/journal.pone.0088410 %R 10.1371/journal.pone.0088410 %0 Journal Article %J Human mutation %D 2014 %T A New Overgrowth Syndrome is Due to Mutations in RNF125.
%A Tenorio, Jair %A Mansilla, Alicia %A Valencia, María %A Martínez-Glez, Víctor %A Romanelli, Valeria %A Arias, Pedro %A Castrejón, Nerea %A Poletta, Fernando %A Guillén-Navarro, Encarna %A Gordo, Gema %A Mansilla, Elena %A García-Santiago, Fé %A González-Casado, Isabel %A Vallespín, Elena %A Palomares, María %A Mori, María A %A Santos-Simarro, Fernando %A García-Miñaur, Sixto %A Fernández, Luis %A Mena, Rocío %A Benito-Sanz, Sara %A Del Pozo, Angela %A Silla, Juan Carlos %A Ibañez, Kristina %A López-Granados, Eduardo %A Martín-Trujillo, Alex %A Montaner, David %A Heath, Karen E %A Campos-Barros, Angel %A Joaquín Dopazo %A Nevado, Julián %A Monk, David %A Ruiz-Pérez, Víctor L %A Lapunzina, Pablo %K NGS %K prioritization %K Rare Disease %X Overgrowth syndromes (OGS) are a group of disorders in which all parameters of growth and physical development are above the mean for age and sex. We evaluated a series of 270 families from the Spanish Overgrowth Syndrome Registry with no known overgrowth syndrome. We identified one de novo deletion and three missense mutations in RNF125 in six patients from 4 families with overgrowth, macrocephaly, intellectual disability, mild hydrocephaly, hypoglycaemia and inflammatory diseases resembling Sjögren syndrome. RNF125 encodes an E3 ubiquitin ligase and is a novel gene of OGS. Our studies of the RNF125 pathway point to upregulation of RIG-I-IPS1-MDA5 and/or disruption of the PI3K-AKT and interferon signaling pathways as the putative final effectors. %B Human mutation %V 35 %P 1436–1441 %8 2014 Sep 5 %G eng %U http://onlinelibrary.wiley.com/doi/10.1002/humu.22689/abstract %R 10.1002/humu.22689 %0 Journal Article %J BMC Syst Biol %D 2014 %T Pathway network inference from gene expression data.
%A Ponzoni, Ignacio %A Nueda, María %A Tarazona, Sonia %A Götz, Stefan %A Montaner, David %A Dussaut, Julieta %A Dopazo, Joaquin %A Conesa, Ana %K Alzheimer Disease %K Cell Cycle %K DNA Replication %K Gene Expression Profiling %K Gene Regulatory Networks %K Gluconeogenesis %K Glycolysis %K Oxidative Phosphorylation %K Proteolysis %K Purines %K Saccharomyces cerevisiae %K Systems biology %K Ubiquitin %X

BACKGROUND: The development of high-throughput omics technologies enabled genome-wide measurements of the activity of cellular elements and provides the analytical resources for the progress of the Systems Biology discipline. Analysis and interpretation of gene expression data has evolved from the gene to the pathway and interaction level, i.e. from the detection of differentially expressed genes, to the establishment of gene interaction networks and the identification of enriched functional categories. Still, the understanding of biological systems requires a further level of analysis that addresses the characterization of the interaction between functional modules.

RESULTS: We present a novel computational methodology to study the functional interconnections among the molecular elements of a biological system. The PANA approach uses high-throughput genomics measurements and a functional annotation scheme to extract an activity profile from each functional block -or pathway- followed by machine-learning methods to infer the relationships between these functional profiles. The result is a global, interconnected network of pathways that represents the functional cross-talk within the molecular system. We have applied this approach to describe the functional transcriptional connections during the yeast cell cycle and to identify pathways that change their connectivity in a disease condition using an Alzheimer example.

CONCLUSIONS: PANA is a useful tool to deepen our understanding of the functional interdependences that operate within complex biological systems. We show the approach is algorithmically consistent and the inferred network is well supported by the available functional data. The method allows the dissection of the molecular basis of the functional connections and we describe the different regulatory mechanisms that explain the network topology obtained for the yeast cell cycle data.

%B BMC Syst Biol %V 8 Suppl 2 %P S7 %8 2014 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/25032889?dopt=Abstract %R 10.1186/1752-0509-8-S2-S7 %0 Journal Article %J PLoS One %D 2014 %T Permanent cardiac sarcomere changes in a rabbit model of intrauterine growth restriction. %A Torre, Iratxe %A González-Tendero, Anna %A García-Cañadilla, Patricia %A Crispi, Fátima %A Garcia-Garcia, Francisco %A Bijnens, Bart %A Iruretagoyena, Igor %A Dopazo, Joaquin %A Amat-Roldán, Ivan %A Gratacós, Eduard %K Animals %K biomarkers %K Blood Pressure %K Body Weight %K Disease Models, Animal %K Echocardiography %K Female %K Fetal Growth Retardation %K Fetal Heart %K Fetus %K Gene Expression Profiling %K Organ Size %K Placenta %K Pregnancy %K Rabbits %K Sarcomeres %X

BACKGROUND: Intrauterine growth restriction (IUGR) induces fetal cardiac remodelling and dysfunction, which persists postnatally and may explain the link between low birth weight and increased cardiovascular mortality in adulthood. However, the cellular and molecular bases for these changes are still not well understood. We tested the hypothesis that IUGR is associated with structural and functional gene expression changes in the fetal sarcomere cytoarchitecture, which remain present in adulthood.

METHODS AND RESULTS: IUGR was induced in pregnant New Zealand rabbits by selective ligation of the utero-placental vessels. Fetal echocardiography demonstrated more globular hearts and signs of cardiac dysfunction in IUGR. Second harmonic generation microscopy (SHGM) showed shorter sarcomere length and shorter A-band and thick-thin filament interaction lengths, which were already present in utero and persisted at 70 postnatal days (adulthood). The sarcomeric M-band (GO:0031430) functional term was over-represented in IUGR fetal hearts.

CONCLUSION: The results suggest that IUGR induces cardiac dysfunction and permanent changes in the sarcomere.

%B PLoS One %V 9 %P e113067 %8 2014 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/25402351?dopt=Abstract %R 10.1371/journal.pone.0113067 %0 Journal Article %J BMC systems biology %D 2014 %T Understanding disease mechanisms with models of signaling pathway activities %A Sebastián-Leon, Patricia %A Vidal, Enrique %A Minguez, Pablo %A Conesa, Ana %A Tarazona, Sonia %A Amadoz, Alicia %A Armero, Carmen %A Salavert Torres, Francisco %A Vidal-Puig, Antonio %A Montaner, David %A Dopazo, Joaquin %B BMC systems biology %V 8 %P 121 %8 10 %G eng %R 10.1186/s12918-014-0121-3 %0 Journal Article %J BMC systems biology %D 2014 %T Understanding disease mechanisms with models of signaling pathway activities. %A Sebastián-Leon, Patricia %A Vidal, Enrique %A Minguez, Pablo %A Ana Conesa %A Sonia Tarazona %A Amadoz, Alicia %A Armero, Carmen %A Salavert, Francisco %A Vidal-Puig, Antonio %A Montaner, David %A Joaquín Dopazo %K Disease mechanism %K pathway %K signalling %K Systems biology %X BACKGROUND: Understanding the aspects of cell functionality that account for disease or drug action mechanisms is one of the main challenges in the analysis of genomic data and is at the basis of the future implementation of precision medicine. RESULTS: Here we propose a simple probabilistic model in which signaling pathways are separated into elementary sub-pathways or signal transmission circuits (which ultimately trigger cell functions) and gene expression measurements are then transformed into probabilities of activation of such signal transmission circuits. Using this model, differential activation of such circuits between biological conditions can be estimated. Thus, circuit activation statuses can be interpreted as biomarkers that discriminate among the compared conditions. This type of mechanism-based biomarker accounts for cell functional activities and can easily be associated with disease or drug action mechanisms.
The accuracy of the proposed model is demonstrated with simulations and real datasets. CONCLUSIONS: The proposed model provides detailed information that enables the interpretation of disease mechanisms as a consequence of the complex combinations of altered gene expression values. Moreover, it offers a framework for suggesting possible ways of therapeutic intervention in a pathologically perturbed system. %B BMC systems biology %V 8 %P 121 %8 2014 Oct 25 %G eng %U http://www.biomedcentral.com/1752-0509/8/121/abstract %R 10.1186/s12918-014-0121-3 %0 Journal Article %J Oncotarget %D 2013 %T Capturing the biological impact of CDKN2A and MC1R genes as an early predisposing event in melanoma and non melanoma skin cancer. %A Puig-Butille, Joan Anton %A Escamez, Maria José %A Garcia-Garcia, Francisco %A Tell-Marti, Gemma %A Fabra, Angels %A Martínez-Santamaría, Lucía %A Badenas, Celia %A Aguilera, Paula %A Pevida, Marta %A Joaquín Dopazo %A Del Rio, Marcela %A Puig, Susana %X Germline mutations in CDKN2A and/or red hair color variants in MC1R genes are associated with an increased susceptibility to develop cutaneous melanoma or non melanoma skin cancer. We studied the impact of the CDKN2A germinal mutation p.G101W and MC1R variants on gene expression and transcription profiles associated with skin cancer. To this end we set up primary skin cell co-cultures from siblings of melanoma-prone families that were later analyzed using the expression array approach. As a result, we found that 1535 transcripts were deregulated in CDKN2A mutated cells, with over-expression of immunity-related genes (HLA-DPB1, CLEC2B, IFI44, IFI44L, IFI27, IFIT1, IFIT2, SP110 and IFNK) and down-regulation of genes playing a role in the Notch signaling pathway. 3570 transcripts were deregulated in MC1R variant carriers.
In particular, genes related to oxidative stress and DNA damage pathways were up-regulated as well as genes associated with neurodegenerative diseases such as Parkinson’s, Alzheimer’s and Huntington’s. Finally, we observed that the expression signatures identified in phenotypically normal cells carrying CDKN2A mutations or MC1R variants are maintained in skin cancer tumors (melanoma and squamous cell carcinoma). These results indicate that transcriptome deregulation represents an early event critical for skin cancer development. %B Oncotarget %8 2013 Dec 16 %G eng %U http://www.impactjournals.com/oncotarget/index.php?journal=oncotarget&page=article&op=view&path%5B%5D=1444&path%5B%5D=1824 %0 Journal Article %J Mol Genet Metab %D 2013 %T Exome sequencing identifies a new mutation in SERAC1 in a patient with 3-methylglutaconic aciduria. %A Tort, Frederic %A García-Silva, María Teresa %A Ferrer-Cortès, Xènia %A Navarro-Sastre, Aleix %A Garcia-Villoria, Judith %A Coll, Maria Josep %A Vidal, Enrique %A Jiménez-Almazán, Jorge %A Dopazo, Joaquin %A Briones, Paz %A Elpeleg, Orly %A Ribes, Antonia %K Adolescent %K Adult %K Carboxylic Ester Hydrolases %K Child %K Exome %K Female %K High-Throughput Nucleotide Sequencing %K Humans %K Infant %K Male %K Metabolism, Inborn Errors %K mutation %X

3-Methylglutaconic aciduria (3-MGA-uria) is a heterogeneous group of syndromes characterized by an increased excretion of 3-methylglutaconic and 3-methylglutaric acids. Five types of 3-MGA-uria (I to V) with different clinical presentations have been described. Causative mutations in TAZ, OPA3, DNAJC19, ATP12, ATP5E, and TMEM70 have been identified. After excluding the known genetic causes of 3-MGA-uria we used exome sequencing to investigate a patient with Leigh syndrome and 3-MGA-uria. We identified a homozygous variant in SERAC1 (c.202C>T; p.Arg68*), that generates a premature stop codon at position 68 of SERAC1 protein. Western blot analysis in patient's fibroblasts showed a complete absence of SERAC1 that was consistent with the prediction of a truncated protein and supports the pathogenic role of the mutation. During the course of this project a parallel study identified mutations in SERAC1 as the genetic cause of the disease in 15 patients with MEGDEL syndrome, which was compatible with the clinical and biochemical phenotypes of the patient described here. In addition, our patient developed microcephaly and optic atrophy, two features not previously reported in MEGDEL syndrome. We highlight the usefulness of exome sequencing to reveal the genetic bases of human rare diseases even if only one affected individual is available.

%B Mol Genet Metab %V 110 %P 73-7 %8 2013 Sep-Oct %G eng %N 1-2 %1 https://www.ncbi.nlm.nih.gov/pubmed/23707711?dopt=Abstract %R 10.1016/j.ymgme.2013.04.021 %0 Journal Article %J Carcinogenesis %D 2013 %T Grape antioxidant dietary fiber (GADF) inhibits intestinal polyposis in ApcMin/+ mice: relation to cell cycle and immune response. %A Sánchez-Tena, Susana %A Lizarraga, Daneida %A Miranda, Anibal %A Vinardell, Maria Pilar %A Garcia-Garcia, Francisco %A Joaquín Dopazo %A Torres, Josep Lluís %A Saura-Calixto, Fulgencio %A Capellà, Gabriel %A Cascante, Marta %X Epidemiological and experimental studies suggest that fiber and phenolic compounds might have a protective effect on the development of colon cancer in humans. Accordingly, we assessed the chemopreventive efficacy and associated mechanisms of action of a lyophilized red grape pomace containing proanthocyanidin-rich dietary fiber (Grape Antioxidant Dietary Fiber, GADF) on spontaneous intestinal tumorigenesis in the Apc(Min/+) mouse model. Mice were fed a standard diet (control group) or a 1% (w/w) GADF-supplemented diet (GADF group) for 6 weeks. GADF supplementation greatly reduced intestinal tumorigenesis, significantly decreasing the total number of polyps by 76%. Moreover, size distribution analysis showed a considerable reduction in all polyp size categories [diameter <1 mm (65%), 1-2 mm (67%) and >2 mm (87%)]. In terms of polyp formation in the proximal, middle and distal portions of the small intestine a decrease of 76%, 81% and 73% was observed respectively. Putative molecular mechanisms underlying the inhibition of intestinal tumorigenesis were investigated by comparison of microarray expression profiles of GADF-treated and non-treated mice. We observed that the effects of GADF are mainly associated with the induction of a G1 cell cycle arrest and the downregulation of genes related to the immune response and inflammation. 
Our findings show for the first time the efficacy and associated mechanisms of action of GADF against intestinal tumorigenesis in Apc(Min/+) mice, suggesting its potential for the prevention of colorectal cancer. %B Carcinogenesis %8 2013 Apr 24 %G eng %U http://carcin.oxfordjournals.org/content/early/2013/04/23/carcin.bgt140.abstract %R 10.1093/carcin/bgt140 %0 Journal Article %J Carcinogenesis %D 2013 %T Grape antioxidant dietary fiber inhibits intestinal polyposis in ApcMin/+ mice: relation to cell cycle and immune response. %A Sánchez-Tena, Susana %A Lizarraga, Daneida %A Miranda, Anibal %A Vinardell, Maria P %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Torres, Josep L %A Saura-Calixto, Fulgencio %A Capellà, Gabriel %A Cascante, Marta %K Animals %K Antioxidants %K Body Weight %K Carcinogenesis %K Cell Cycle %K Cell Cycle Checkpoints %K Colorectal Neoplasms %K Dietary Fiber %K Dietary Supplements %K Down-Regulation %K G1 Phase %K Inflammation %K Intestinal Polyposis %K Intestinal Polyps %K Intestine, Small %K Male %K Mice %K Transcriptome %K Vitis %X

Epidemiological and experimental studies suggest that fiber and phenolic compounds might have a protective effect on the development of colon cancer in humans. Accordingly, we assessed the chemopreventive efficacy and associated mechanisms of action of a lyophilized red grape pomace containing proanthocyanidin (PA)-rich dietary fiber [grape antioxidant dietary fiber (GADF)] on spontaneous intestinal tumorigenesis in the Apc(Min/+) mouse model. Mice were fed a standard diet (control group) or a 1% (w/w) GADF-supplemented diet (GADF group) for 6 weeks. GADF supplementation greatly reduced intestinal tumorigenesis, significantly decreasing the total number of polyps by 76%. Moreover, size distribution analysis showed a considerable reduction in all polyp size categories [diameter <1mm (65%), 1-2mm (67%) and >2mm (87%)]. In terms of polyp formation in the proximal, middle and distal portions of the small intestine, a decrease of 76, 81 and 73% was observed, respectively. Putative molecular mechanisms underlying the inhibition of intestinal tumorigenesis were investigated by comparison of microarray expression profiles of GADF-treated and non-treated mice. We observed that the effects of GADF are mainly associated with the induction of a G1 cell cycle arrest and the downregulation of genes related to the immune response and inflammation. Our findings show for the first time the efficacy and associated mechanisms of action of GADF against intestinal tumorigenesis in Apc(Min/+) mice, suggesting its potential for the prevention of colorectal cancer.

%B Carcinogenesis %V 34 %P 1881-8 %8 2013 Aug %G eng %N 8 %1 https://www.ncbi.nlm.nih.gov/pubmed/23615403?dopt=Abstract %R 10.1093/carcin/bgt140 %0 Journal Article %J Am J Physiol Heart Circ Physiol %D 2013 %T Intrauterine growth restriction is associated with cardiac ultrastructural and gene expression changes related to the energetic metabolism in a rabbit model. %A González-Tendero, Anna %A Torre, Iratxe %A García-Cañadilla, Patricia %A Crispi, Fátima %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Bijnens, Bart %A Gratacós, Eduard %K Animals %K Disease Models, Animal %K Energy Metabolism %K Female %K Fetal Growth Retardation %K gene expression %K Mitochondria %K Myocardium %K Oxidative Phosphorylation %K Placenta %K Pregnancy %K Rabbits %X

Intrauterine growth restriction (IUGR) affects 7-10% of pregnancies and is associated with cardiovascular remodeling and dysfunction, which persists into adulthood. The underlying subcellular remodeling and cardiovascular programming events are still poorly documented. Cardiac muscle is central in the fetal adaptive mechanism to IUGR given its high energetic demands. The energetic homeostasis depends on the correct interaction of several molecular pathways and the adequate arrangement of intracellular energetic units (ICEUs), where mitochondria interact with the contractile machinery and the main cardiac ATPases to enable a quick and efficient energy transfer. We studied subcellular cardiac adaptations to IUGR in an experimental rabbit model. We evaluated the ultrastructure of ICEUs with transmission electron microscopy and observed an altered spatial arrangement in IUGR, with significant increases in cytosolic space between mitochondria and myofilaments. A global decrease of mitochondrial density was also observed. In addition, we conducted a global gene expression profile by advanced bioinformatics tools to assess the expression of genes involved in the cardiomyocyte energetic metabolism and identified four gene modules with a coordinated over-representation in IUGR: oxygen homeostasis (GO:0032364), mitochondrial respiratory chain complex I (GO:0005747), oxidative phosphorylation (GO:0006119), and NADH dehydrogenase activity (GO:0003954). These findings might contribute to changes in energetic homeostasis in IUGR. The potential persistence and role of these changes in long-term cardiovascular programming deserve further investigation.

%B Am J Physiol Heart Circ Physiol %V 305 %P H1752-60 %8 2013 Dec %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/24097427?dopt=Abstract %R 10.1152/ajpheart.00514.2013 %0 Journal Article %J Exp Dermatol %D 2013 %T Role of CPI-17 in restoring skin homoeostasis in cutaneous field of cancerization: effects of topical application of a film-forming medical device containing photolyase and UV filters. %A Puig-Butille, Joan Anton %A Malvehy, Josep %A Potrony, Miriam %A Trullas, Carles %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Puig, Susana %K Administration, Topical %K Adult %K Aged %K Aged, 80 and over %K Biopsy %K Deoxyribodipyrimidine Photo-Lyase %K Female %K Gene Expression Profiling %K Gene Expression Regulation, Enzymologic %K Gene Expression Regulation, Neoplastic %K Homeostasis %K Humans %K Inflammation %K Intracellular Signaling Peptides and Proteins %K Liposomes %K Male %K Middle Aged %K Muscle Proteins %K Phenotype %K Phosphoprotein Phosphatases %K Reactive Oxygen Species %K Skin %K Skin Neoplasms %K Ultraviolet Rays %X

Cutaneous field of cancerization (CFC) is caused in part by the carcinogenic effect of the cyclobutane pyrimidine dimers (CPDs) and 6-4 photoproducts (6-4PPs). Photoreactivation is carried out by photolyases, which specifically recognize and repair both photoproducts. The study evaluates the molecular effects of topical application of a film-forming medical device containing photolyase and UV filters on the precancerous field in actinic keratosis (AK) from seven patients. Skin improvement after treatment was confirmed in all patients by histopathological and molecular assessment. A gene set analysis showed that skin recovery was associated with biological processes involved in tissue homoeostasis and cell maintenance. The CFC response was associated with over-expression of the CPI-17 gene, and a dependence on the initial expression level was observed (P = 0.001). Low CPI-17 levels were directly associated with pro-inflammatory genes such as TNF (P = 0.012) and IL-1B (P = 0.07). Our results suggest a role for CPI-17 in restoring skin homoeostasis in CFC lesions.

%B Exp Dermatol %V 22 %P 494-6 %8 2013 Jul %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/23800065?dopt=Abstract %R 10.1111/exd.12177 %0 Journal Article %J Nucleic acids research %D 2012 %T CellBase, a comprehensive collection of RESTful web services for retrieving relevant biological information from heterogeneous sources. %A Bleda, Marta %A Tárraga, Joaquín %A De Maria, Alejandro %A Salavert, Francisco %A García-Alonso, Luz %A Celma, Matilde %A Martin, Ainoha %A Dopazo, Joaquin %A Medina, Ignacio %X During the past years, the advances in high-throughput technologies have produced an unprecedented growth in the number and size of repositories and databases storing relevant biological data. Today, there is more biological information than ever but, unfortunately, the current status of many of these repositories is far from being optimal. Some of the most common problems are that the information is spread out in many small databases; frequently there are different standards among repositories and some databases are no longer supported or they contain overly specific and unconnected information. In addition, data size is increasingly becoming an obstacle when accessing or storing biological data. All these issues make it very difficult to extract and integrate information from different sources, to analyze experiments or to access and query this information in a programmatic way. CellBase provides a solution to the growing necessity of integration by easing the access to biological data. CellBase implements a set of RESTful web services that query a centralized database containing the most relevant biological data sources. The database is hosted on our servers and is regularly updated. CellBase documentation can be found at http://docs.bioinfo.cipf.es/projects/cellbase. %B Nucleic acids research %V 40 %P W609-14 %8 2012 Jul %G eng %U http://nar.oxfordjournals.org/content/40/W1/W609.long %R 10.1093/nar/gks575 %0 Journal Article %J International journal of cancer. 
Journal international du cancer %D 2012 %T Expression profiling shows differential molecular pathways and provides potential new diagnostic biomarkers for colorectal serrated adenocarcinoma. %A Conesa-Zamora, Pablo %A García-Solano, José %A Garcia-Garcia, Francisco %A Del Carmen Turpin, María %A Trujillo-Santos, Javier %A Torres-Moreno, Daniel %A Oviedo-Ramírez, Isabel %A Carbonell-Muñoz, Rosa %A Muñoz-Delgado, Encarnación %A Rodriguez-Braun, Edith %A Ana Conesa %A Pérez-Guillermo, Miguel %X Serrated adenocarcinoma (SAC) is a recently recognized colorectal cancer (CRC) subtype accounting for 7.5-8.7% of CRCs. It has been shown that SAC has a poorer prognosis and has different molecular and immunohistochemical features compared to conventional carcinoma (CC) but, to date, only one previous study has analysed its mRNA expression profile by microarray. Using a different microarray platform, we have studied the molecular signature of 11 SACs and compared it with that of 15 matched CC with the aim of discerning the functions which characterize SAC biology and validating, at the mRNA and protein level, the most differentially expressed genes which were also tested using a validation set of 70 SACs and 70 CCs to assess their diagnostic and prognostic values. Microarray data showed a higher representation of morphogenesis-, hypoxia-, cytoskeleton- and vesicle transport-related functions and also an over-expression of fascin1 (actin-bundling protein associated with invasion) and the antiapoptotic gene hippocalcin in SAC all of which were validated both by qPCR and immunohistochemistry. Fascin1 expression was statistically associated with KRAS mutation with 88.6% sensitivity and 85.7% specificity for SAC diagnosis and the positivity of fascin1 or hippocalcin was highly suggestive of SAC diagnosis (sensitivity=100%). 
Evaluation of these markers in CRCs showing histological and molecular characteristics of high-level microsatellite instability (MSI-H) also helped to distinguish SACs from MSI-H CRCs. Molecular profiling demonstrates that SAC shows activation of distinct signalling pathways and that immunohistochemical fascin1 and hippocalcin expression can be reliably used for its differentiation from other CRC subtypes. © 2012 Wiley Periodicals, Inc. %B International journal of cancer. Journal international du cancer %8 2012 Jun 14 %G eng %R 10.1002/ijc.27674 %0 Journal Article %J Orphanet J Rare Dis %D 2012 %T Four new loci associations discovered by pathway-based and network analyses of the genome-wide variability profile of Hirschsprung's disease. %A Fernández, Raquel Ma %A Bleda, Marta %A Núñez-Torres, Rocío %A Medina, Ignacio %A Luzón-Toro, Berta %A García-Alonso, Luz %A Torroglosa, Ana %A Marbà, Martina %A Enguix-Riego, Ma Valle %A Montaner, David %A Antiňolo, Guillermo %A Dopazo, Joaquin %A Borrego, Salud %K Female %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Genotype %K Hirschsprung Disease %K Humans %K Male %X

Finding gene associations in rare diseases is frequently hampered by the reduced numbers of patients accessible. Conventional gene-based association tests rely on the availability of large cohorts, which constitutes a serious limitation for their application in this scenario. To overcome this problem we have used here a combined strategy in which a pathway-based analysis (PBA) has been initially conducted to prioritize candidate genes in a Spanish cohort of 53 trios of short-segment Hirschsprung's disease. Candidate genes have been further validated in an independent population of 106 trios. The study revealed a strong association of 11 gene ontology (GO) modules related to signal transduction and its regulation, enteric nervous system (ENS) formation and other HSCR-related processes. Among the preselected candidates, a total of 4 loci, RASGEF1A, IQGAP2, DLC1 and CHRNA7, related to signal transduction and migration processes, were found to be significantly associated with HSCR. Network analysis also confirms their involvement in the network of already known disease genes. This approach, based on the study of functionally-related gene sets, requires lower sample sizes and opens new opportunities for the study of rare diseases.

%B Orphanet J Rare Dis %V 7 %P 103 %8 2012 Dec 28 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/23270508?dopt=Abstract %R 10.1186/1750-1172-7-103 %0 Journal Article %J Orphanet journal of rare diseases %D 2012 %T Four new loci associations discovered by pathway-based and network analyses of the genome-wide variability profile of Hirschsprung’s disease. %A Fernández, Raquel Ma %A Bleda, Marta %A Núñez-Torres, Rocío %A Medina, Ignacio %A Luzón-Toro, Berta %A García-Alonso, Luz %A Torroglosa, Ana %A Marbà, Martina %A Enguix-Riego, Ma Valle %A Montaner, David %A Antiňolo, Guillermo %A Joaquín Dopazo %A Borrego, Salud %X Finding gene associations in rare diseases is frequently hampered by the reduced numbers of patients accessible. Conventional gene-based association tests rely on the availability of large cohorts, which constitutes a serious limitation for their application in this scenario. To overcome this problem we have used here a combined strategy in which a pathway-based analysis (PBA) has been initially conducted to prioritize candidate genes in a Spanish cohort of 53 trios of short-segment Hirschsprung’s disease. Candidate genes have been further validated in an independent population of 106 trios. The study revealed a strong association of 11 gene ontology (GO) modules related to signal transduction and its regulation, enteric nervous system (ENS) formation and other HSCR-related processes. Among the preselected candidates, a total of 4 loci, RASGEF1A, IQGAP2, DLC1 and CHRNA7, related to signal transduction and migration processes, were found to be significantly associated with HSCR. Network analysis also confirms their involvement in the network of already known disease genes. This approach, based on the study of functionally-related gene sets, requires lower sample sizes and opens new opportunities for the study of rare diseases. 
%B Orphanet journal of rare diseases %V 7 %P 103 %8 2012 Dec 28 %G eng %U http://www.ojrd.com/content/7/1/103/abstract %R 10.1186/1750-1172-7-103 %0 Journal Article %J Stem Cell Rev Rep %D 2012 %T IL1β induces mesenchymal stem cells migration and leucocyte chemotaxis through NF-κB. %A Carrero, Rubén %A Cerrada, Inmaculada %A Lledó, Elisa %A Dopazo, Joaquin %A Garcia-Garcia, Francisco %A Rubio, Mari-Paz %A Trigueros, César %A Dorronsoro, Akaitz %A Ruiz-Sauri, Amparo %A Montero, José Anastasio %A Sepúlveda, Pilar %K Cell Adhesion %K Cell Movement %K Cell Proliferation %K Chemokines %K Chemotaxis, Leukocyte %K Collagen %K Fibronectins %K Gene Expression Profiling %K Gene Knockdown Techniques %K HEK293 Cells %K Humans %K I-kappa B Kinase %K Inflammation Mediators %K Intercellular Signaling Peptides and Proteins %K Interleukin-1beta %K Laminin %K Leukocytes %K Mesenchymal Stem Cells %K NF-kappa B %K Oligonucleotide Array Sequence Analysis %K RNA Interference %K Signal Transduction %X

Mesenchymal stem cells (MSC) are often transplanted into inflammatory environments where they are able to survive and modulate host immune responses through a poorly understood mechanism. In this paper we analyzed the responses of MSC to IL-1β, a representative inflammatory mediator. Microarray analysis of MSC treated with IL-1β revealed that this cytokine activated a set of genes related to biological processes such as cell survival, cell migration, cell adhesion, chemokine production, induction of angiogenesis and modulation of the immune response. Further, more detailed analysis by real-time PCR and functional assays revealed that IL-1β mainly increased the production of chemokines such as CCL5, CCL20, CXCL1, CXCL3, CXCL5, CXCL6, CXCL10, CXCL11 and CX(3)CL1, interleukins IL-6, IL-8, IL23A, IL32, Toll-like receptors TLR2, TLR4, CLDN1, metalloproteins MMP1 and MMP3, growth factors CSF2 and TNF-α, together with adhesion molecules ICAM1 and ICAM4. Functional analysis of MSC proliferation, migration and adhesion to extracellular matrix components revealed that IL-1β did not affect proliferation but served to induce the secretion of trophic factors and adhesion to ECM components such as collagen and laminin. IL-1β treatment enhanced the ability of MSC to recruit monocytes and granulocytes in vitro. Blockade of NF-κB transcription factor activation with IκB kinase beta (IKKβ) shRNA impaired the MSC migration, adhesion and leucocyte recruitment induced by IL-1β, demonstrating that the NF-κB pathway is an important downstream regulator of these responses. These findings are relevant to understanding the biological responses of MSC to inflammatory environments.

%B Stem Cell Rev Rep %V 8 %P 905-16 %8 2012 Sep %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/22467443?dopt=Abstract %R 10.1007/s12015-012-9364-9 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2012 %T Qualimap: evaluating next-generation sequencing alignment data. %A García-Alcalde, Fernando %A Okonechnikov, Konstantin %A Carbonell, José %A Cruz, Luis M %A Götz, Stefan %A Sonia Tarazona %A Joaquín Dopazo %A Meyer, Thomas F %A Ana Conesa %K NGS %X MOTIVATION: The sequence alignment/map (SAM) and the binary alignment/map (BAM) formats have become the standard method of representation of nucleotide sequence alignments for next-generation sequencing data. SAM/BAM files usually contain information from tens to hundreds of millions of reads. Often, the sequencing technology, protocol and/or the selected mapping algorithm introduce some unwanted biases in these data. The systematic detection of such biases is a non-trivial task that is crucial to drive appropriate downstream analyses. RESULTS: We have developed Qualimap, a Java application that supports user-friendly quality control of mapping data, by considering sequence features and their genomic properties. Qualimap takes sequence alignment data and provides graphical and statistical analyses for the evaluation of data. Such quality-control data are vital for highlighting problems in the sequencing and/or mapping processes, which must be addressed prior to further analyses. AVAILABILITY: Qualimap is freely available from http://www.qualimap.org. CONTACT: aconesa@cipf.es SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. %B Bioinformatics (Oxford, England) %V 28 %P 2678-9 %8 2012 Oct 15 %G eng %U http://bioinformatics.oxfordjournals.org/content/28/20/2678.long %R 10.1093/bioinformatics/bts503 %0 Journal Article %J PloS one %D 2012 %T Transcriptome profiling of the intoxication response of Tenebrio molitor larvae to Bacillus thuringiensis Cry3Aa protoxin. 
%A Oppert, Brenda %A Dowd, Scot E %A Bouffard, Pascal %A Li, Lewyn %A Ana Conesa %A Lorenzen, Marcé D %A Toutges, Michelle %A Marshall, Jeremy %A Huestis, Diana L %A Fabrick, Jeff %A Oppert, Cris %A Jurat-Fuentes, Juan Luis %K Administration %K Animals %K Bacterial Proteins %K Base Sequence %K Biosynthetic Pathways %K Complementary %K DNA %K Endotoxins %K Energy Metabolism %K Gene Expression Profiling %K Hemolysin Proteins %K Larva %K Microarray Analysis %K Molecular Sequence Data %K Oral %K Sequence Analysis %K Tenebrio %K Time Factors %K Transcriptome %X Bacillus thuringiensis (Bt) crystal (Cry) proteins are effective against a select number of insect pests, but improvements are needed to increase efficacy and decrease time to mortality for coleopteran pests. To gain insight into the Bt intoxication process in Coleoptera, we performed RNA-Seq on cDNA generated from the guts of Tenebrio molitor larvae that consumed either a control diet or a diet containing Cry3Aa protoxin. Approximately 134,090 and 124,287 sequence reads from the control and Cry3Aa-treated groups were assembled into 1,318 and 1,140 contigs, respectively. Enrichment analyses indicated that functions associated with mitochondrial respiration, signalling, maintenance of cell structure, membrane integrity, protein recycling/synthesis, and glycosyl hydrolases were significantly increased in Cry3Aa-treated larvae, whereas functions associated with many metabolic processes were reduced, especially glycolysis, tricarboxylic acid cycle, and fatty acid synthesis. Microarray analysis was used to evaluate temporal changes in gene expression after 6, 12 or 24 h of Cry3Aa exposure. Overall, microarray analysis indicated that transcripts related to allergens, chitin-binding proteins, glycosyl hydrolases, and tubulins were induced, and those related to immunity and metabolism were repressed in Cry3Aa-intoxicated larvae. The 24 h microarray data validated most of the RNA-Seq data. 
Of the three intoxication intervals, larvae demonstrated more differential expression of transcripts after 12 h exposure to Cry3Aa. Gene expression examined by three different methods in control vs. Cry3Aa-treated larvae at the 24 h time point indicated that transcripts encoding proteins with chitin-binding domain 3 were the most differentially expressed in Cry3Aa-intoxicated larvae. Overall, the data suggest that T. molitor larvae mount a complex response to Cry3Aa during the initial 24 h of intoxication. Data from this study represent the largest genetic sequence dataset for T. molitor to date. Furthermore, the methods in this study are useful for comparative analyses in organisms lacking a sequenced genome. %B PloS one %V 7 %P e34624 %8 2012 %G eng %R 10.1371/journal.pone.0034624 %0 Journal Article %J IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM %D 2012 %T Using GPUs for the Exact Alignment of Short-read Genetic Sequences by Means of the Burrows–Wheeler Transform. %A Salavert Torres, Jose %A Blanquer Espert, Ignacio %A Tomas Dominguez, Andres %A Hernandez, Vicente %A Medina, Ignacio %A Tarraga, Joaquin %A Dopazo, Joaquin %K Burrows-Wheeler transform %K CPU execution %K GPGPU %K NGS %X General Purpose Graphic Processing Units (GPGPUs) constitute an inexpensive resource for computing-intensive applications that could exploit an intrinsic fine-grain parallelism. This paper presents the design and implementation in GPGPUs of an exact alignment tool for nucleotide sequences based on the Burrows-Wheeler Transform. We compare this algorithm with state-of-the-art implementations of the same algorithm over standard CPUs, and considering the same conditions in terms of I/O. Excluding disk transfers, the implementation of the algorithm in GPUs shows a speedup larger than 12x, when compared to CPU execution. 
This implementation exploits the parallelism by concurrently searching different sequences on the same reference search tree, maximising memory locality and ensuring a symmetric access to the data. The article describes the behaviour of the algorithm in GPU, showing a good scalability in the performance, only limited by the size of the GPU inner memory. %B IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM %V 9 %P 1245-1256 %8 2012 Mar 20 %G eng %U http://ieeexplore.ieee.org.sire.ub.edu/xpl/articleDetails.jsp?reload=true&arnumber=6175888 %R 10.1109/TCBB.2012.49 %0 Journal Article %J IEEE/ACM Trans Comput Biol Bioinform %D 2012 %T Using GPUs for the exact alignment of short-read genetic sequences by means of the Burrows-Wheeler transform. %A Salavert Torres, Jose %A Blanquer Espert, Ignacio %A Domínguez, Andrés Tomás %A Hernández García, Vicente %A Medina Castelló, Ignacio %A Tárraga Giménez, Joaquín %A Dopazo Blázquez, Joaquín %K Algorithms %K Animals %K Computational Biology %K Computer Graphics %K Data Compression %K Drosophila melanogaster %K Genes, Insect %K Image Processing, Computer-Assisted %K Models, Genetic %K Sequence Alignment %K Sequence Analysis, DNA %X

General Purpose Graphic Processing Units (GPGPUs) constitute an inexpensive resource for computing-intensive applications that could exploit an intrinsic fine-grain parallelism. This paper presents the design and implementation in GPGPUs of an exact alignment tool for nucleotide sequences based on the Burrows-Wheeler Transform. We compare this algorithm with state-of-the-art implementations of the same algorithm over standard CPUs, and considering the same conditions in terms of I/O. Excluding disk transfers, the implementation of the algorithm in GPUs shows a speedup larger than 12, when compared to CPU execution. This implementation exploits the parallelism by concurrently searching different sequences on the same reference search tree, maximizing memory locality and ensuring a symmetric access to the data. The paper describes the behavior of the algorithm in GPU, showing a good scalability in the performance, only limited by the size of the GPU inner memory.

%B IEEE/ACM Trans Comput Biol Bioinform %V 9 %P 1245-56 %8 2012 Jul-Aug %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/22450827?dopt=Abstract %R 10.1109/TCBB.2012.49 %0 Journal Article %J IEEE/ACM Transactions on Computational Biology and Bioinformatics %D 2012 %T Using GPUs for the Exact Alignment of Short-Read Genetic Sequences by Means of the Burrows-Wheeler Transform %A Torres, J. S. %A Espert, I. B. %A Dominguez, A. T. %A Garcia, V. Hernandez %A Castello, I. Medina %A Gimenez, J. Tarraga %A Blazquez, J. Dopazo %B IEEE/ACM Transactions on Computational Biology and Bioinformatics %V 9 %P 1245 - 1256 %8 Jan-07-2012 %G eng %U http://ieeexplore.ieee.org/document/6175888/http://xplorestaging.ieee.org/ielx5/8857/6202798/06175888.pdf?arnumber=6175888 %N 4 %! IEEE/ACM Trans. Comput. Biol. and Bioinf. %R 10.1109/TCBB.2012.49 %0 Journal Article %J PloS one %D 2011 %T Analysis of normal-tumour tissue interaction in tumours: prediction of prostate cancer features from the molecular profile of adjacent normal cells. %A Trevino, Victor %A Tadesse, Mahlet G %A Vannucci, Marina %A Fatima Al-Shahrour %A Antczak, Philipp %A Durant, Sarah %A Bikfalvi, Andreas %A Dopazo, Joaquin %A Campbell, Moray J %A Falciani, Francesco %X

Statistical modelling, in combination with genome-wide expression profiling techniques, has demonstrated that the molecular state of the tumour is sufficient to infer its pathological state. These studies have been extremely important in diagnostics and have contributed to improving our understanding of tumour biology. However, their importance in in-depth understanding of cancer patho-physiology may be limited since they do not explicitly take into consideration the fundamental role of the tissue microenvironment in specifying tumour physiology. Because of the importance of normal cells in shaping the tissue microenvironment, we formulate the hypothesis that molecular components of the profile of normal epithelial cells adjacent to the tumour are predictive of tumour physiology. We addressed this hypothesis by developing statistical models that link gene expression profiles representing the molecular state of adjacent normal epithelial cells to tumour features in prostate cancer. Furthermore, network analysis showed that predictive genes are linked to the activity of important secreted factors, which have the potential to influence tumour biology, such as IL1, IGF1, PDGF BB, AGT, and TGFβ.

%B PloS one %V 6 %P e16492 %8 2011 %G eng %0 Journal Article %J Bioinformatics (Oxford, England) %D 2011 %T B2G-FAR, a species centered GO annotation repository. %A Götz, Stefan %A Arnold, Roland %A Sebastián-Leon, Patricia %A Martín-Rodríguez, Samuel %A Tischler, Patrick %A Jehl, Marc-André %A Joaquín Dopazo %A Rattei, Thomas %A Ana Conesa %X

MOTIVATION: Functional genomics research has expanded enormously in the last decade thanks to the cost-reduction in high-throughput technologies and the development of computational tools that generate, standardize and share information on gene and protein function such as the Gene Ontology (GO). Nevertheless many biologists, especially those working with non-model organisms, still suffer from non-existing or low coverage functional annotation, or simply struggle to retrieve, summarize and query these data. RESULTS: The Blast2GO Functional Annotation Repository (B2G-FAR) is a bioinformatics resource envisaged to provide functional information for otherwise uncharacterized sequence-data and offers data-mining tools to analyze a larger repertoire of species than currently available. This new annotation resource has been created by applying the Blast2GO functional annotation engine in a strongly high-throughput manner to the entire space of publicly available sequences. The resulting repository contains GO term predictions for over 13.2 million non-redundant protein sequences based on BLAST search alignments from the SIMAP database. We generated GO annotation for approximately 150,000 different taxa, making available the 2000 species with the highest coverage through B2G-FAR. A second section within B2G-FAR holds functional annotations for 17 non-model organism Affymetrix GeneChips. CONCLUSIONS: B2G-FAR provides easy access to exhaustive functional annotation for 2000 species offering a good balance between quality and quantity, thereby supporting functional genomics research especially in the case of non-model organisms. AVAILABILITY: The annotation resource is available at http://b2gfar.bioinfo.cipf.es. CONTACT: aconesa@cipf.es, sgoetz@cipf.es.

%B Bioinformatics (Oxford, England) %V 27 %P 919-924 %8 2011 Feb 18 %G eng %0 Journal Article %J Genome Res %D 2011 %T Differential expression in RNA-seq: a matter of depth. %A Tarazona, Sonia %A García-Alcalde, Fernando %A Dopazo, Joaquin %A Ferrer, Alberto %A Conesa, Ana %K Algorithms %K Expressed Sequence Tags %K Gene Expression Profiling %K Gene Expression Regulation %K Humans %K Models, Genetic %K Oligonucleotide Array Sequence Analysis %X

Next-generation sequencing (NGS) technologies are revolutionizing genome research, and in particular, their application to transcriptomics (RNA-seq) is increasingly being used for gene expression profiling as a replacement for microarrays. However, the properties of RNA-seq data have not yet been fully established, and additional research is needed to understand how these data respond to differential expression analysis. In this work, we set out to gain insights into the characteristics of RNA-seq data analysis by studying an important parameter of this technology: the sequencing depth. We have analyzed how sequencing depth affects the detection of transcripts and their identification as differentially expressed, looking at aspects such as transcript biotype, length, expression level, and fold-change. We have evaluated different algorithms available for the analysis of RNA-seq and proposed a novel approach, NOISeq, that differs from existing methods in that it is data-adaptive and nonparametric. Our results reveal that most existing methodologies suffer from a strong dependency on sequencing depth for their differential expression calls and that this results in a considerable number of false positives that increases as the number of reads grows. In contrast, our proposed method models the noise distribution from the actual data, can therefore better adapt to the size of the data set, and is more effective in controlling the rate of false discoveries. This work discusses the true potential of RNA-seq for studying regulation at low expression ranges, the noise within RNA-seq data, and the issue of replication.

%B Genome Res %V 21 %P 2213-23 %8 2011 Dec %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/21903743?dopt=Abstract %R 10.1101/gr.124321.111 %0 Journal Article %J PLoS pathogens %D 2011 %T Discovery of an ebolavirus-like filovirus in Europe. %A Negredo, Ana %A Palacios, Gustavo %A Vázquez-Morón, Sonia %A González, Félix %A Dopazo, Hernán %A Molero, Francisca %A Juste, Javier %A Quetglas, Juan %A Savji, Nazir %A de la Cruz Martínez, Maria %A Herrera, Jesus Enrique %A Pizarro, Manuel %A Hutchison, Stephen K %A Echevarría, Juan E %A Lipkin, W Ian %A Tenorio, Antonio %X

Filoviruses, amongst the most lethal of primate pathogens, have only been reported as natural infections in sub-Saharan Africa and the Philippines. Infections of bats with the ebolaviruses and marburgviruses do not appear to be associated with disease. Here we report identification in dead insectivorous bats of a genetically distinct filovirus, provisionally named Lloviu virus, after the site of detection, Cueva del Lloviu, in Spain.

%B PLoS pathogens %V 7 %P e1002304 %8 2011 Oct %G eng %0 Journal Article %J Protein science : a publication of the Protein Society %D 2011 %T N-glycosylation efficiency is determined by the distance to the C-terminus and the amino acid preceding an Asn-Ser-Thr sequon. %A Bañó-Polo, Manuel %A Baldin, Francesca %A Tamborero, Silvia %A Marti-Renom, Marc A %A Mingarro, Ismael %X

N-glycosylation is the most common and versatile protein modification. In eukaryotic cells, this modification is catalyzed cotranslationally by the enzyme oligosaccharyltransferase, which targets the β-amide of the asparagine in an Asn-Xaa-Ser/Thr consensus sequon (where Xaa is any amino acid but proline) in nascent proteins as they enter the endoplasmic reticulum. Because modification of the glycosylation acceptor site on membrane proteins occurs in a compartment-specific manner, the presence of glycosylation is used to indicate membrane protein topology. Moreover, glycosylation sites can be added to gain topological information. In this study, we explored the determinants of N-glycosylation with the in vitro transcription/translation of a truncated model protein in the presence of microsomes and surveyed 25,488 glycoproteins, of which 2,533 glycosylation sites had been experimentally validated. We found that glycosylation efficiency was dependent on both the distance to the C-terminus and the nature of the amino acid that preceded the consensus sequon. These findings establish a broadly applicable method for membrane protein tagging in topological studies.

%B Protein science : a publication of the Protein Society %V 20 %P 179-86 %8 2011 Jan %G eng %0 Journal Article %J Nucleic Acids Res %D 2011 %T Phylemon 2.0: a suite of web-tools for molecular evolution, phylogenetics, phylogenomics and hypotheses testing. %A Sánchez, Rubén %A Serra, François %A Tárraga, Joaquín %A Medina, Ignacio %A Carbonell, José %A Pulido, Luis %A De Maria, Alejandro %A Capella-Gutíerrez, Salvador %A Huerta-Cepas, Jaime %A Gabaldón, Toni %A Dopazo, Joaquin %A Dopazo, Hernán %K Evolution, Molecular %K Genomics %K Internet %K Phylogeny %K Sequence Alignment %K Software %X

Phylemon 2.0 is a new release of the suite of web tools for molecular evolution, phylogenetics, phylogenomics and hypotheses testing. It has been designed as a response to the increasing demand for molecular sequence analyses from expert and non-expert users. Phylemon 2.0 has several unique features that differentiate it from other similar web resources: (i) it offers an integrated environment that enables evolutionary analyses, format conversion, file storage and editing of results; (ii) it suggests further analyses, thereby guiding the users through the web server; and (iii) it allows users to design and save phylogenetic pipelines to be used over multiple genes (phylogenomics). Altogether, Phylemon 2.0 integrates a suite of 30 tools covering sequence alignment reconstruction and trimming; tree reconstruction, visualization and manipulation; and evolutionary hypotheses testing.

%B Nucleic Acids Res %V 39 %P W470-4 %8 2011 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/21646336?dopt=Abstract %R 10.1093/nar/gkr408 %0 Journal Article %J The Plant journal : for cell and molecular biology %D 2011 %T Role of tomato BRANCHED1-like genes in the control of shoot branching. %A Martín-Trillo, Mar %A Grandío, Eduardo González %A Serra, François %A Marcel, Fabien %A Rodríguez-Buey, María Luisa %A Schmitz, Gregor %A Theres, Klaus %A Bendahmane, Abdelhafid %A Dopazo, Hernán %A Cubas, Pilar %X

In angiosperms, shoot branching greatly determines overall plant architecture and affects fundamental aspects of plant life. Branching patterns are determined by genetic pathways conserved widely across angiosperms. In Arabidopsis thaliana (Brassicaceae, Rosidae) BRANCHED1 (BRC1) plays a central role in this process, acting locally to arrest axillary bud growth. In tomato (Solanum lycopersicum, Solanaceae, Asteridae) we have identified two BRC1-like paralogues, SlBRC1a and SlBRC1b. These genes are expressed in arrested axillary buds and both are down-regulated upon bud activation, although SlBRC1a is transcribed at much lower levels than SlBRC1b. Alternative splicing of SlBRC1a renders two transcripts that encode two BRC1-like proteins with different C-t domains due to a 3’-terminal frameshift. The phenotype of loss-of-function lines suggests that SlBRC1b has retained the ancestral role of BRC1 in shoot branch suppression. We have isolated the BRC1a and BRC1b genes of other Solanum species and have studied their evolution rates across the lineages. These studies indicate that, after duplication of an ancestral BRC1-like gene, BRC1b genes continued to evolve under a strong purifying selection that was consistent with the conserved function of SlBRC1b in shoot branching control. In contrast, the coding sequences of Solanum BRC1a genes have evolved at a higher evolution rate. Branch-site tests indicate that this difference does not reflect relaxation but rather positive selective pressure for adaptation.

%B The Plant journal : for cell and molecular biology %V 67 %P 701-14 %8 2011 Aug %G eng %R 10.1111/j.1365-313X.2011.04629.x %0 Journal Article %J Nucleic Acids Research %D 2010 %T Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling. %A Medina, Ignacio %A Carbonell, José %A Pulido, Luis %A Madeira, Sara C %A Goetz, Stefan %A Ana Conesa %A Tárraga, Joaquín %A Pascual-Montano, Alberto %A Nogales-Cadenas, Ruben %A Santoyo, Javier %A García, Francisco %A Marbà, Martina %A Montaner, David %A Joaquín Dopazo %K babelomics %K gene expression %K genotyping %K gepas %K GSA %K GWAS %X

Babelomics is a response to the growing necessity of integrating and analyzing different types of genomic data in an environment that allows an easy functional interpretation of the results. Babelomics includes a complete suite of methods for the analysis of gene expression data, including normalization (covering most commercial platforms), pre-processing, differential gene expression (case-control, multiclass, survival or continuous values), predictors and clustering, as well as methods for large-scale genotyping assays (case-control and TDT analyses, with population stratification analysis and correction). All these genomic data analysis facilities are integrated and connected to multiple options for the functional interpretation of the experiments. Different methods of functional enrichment or gene set enrichment can be used to understand the functional basis of the experiment analyzed. Many sources of biological information, which include functional (GO, KEGG, Biocarta, Reactome, etc.), regulatory (Transfac, Jaspar, ORegAnno, miRNAs, etc.), text-mining or protein-protein interaction modules can be used for this purpose. Finally, a tool for the de novo functional annotation of sequences has been included in the system. This provides support for the functional analysis of non-model species. Mirrors of Babelomics or command line execution of their individual components are now possible. Babelomics is available at http://www.babelomics.org.

%B Nucleic Acids Research %V 38 %P W210-W213 %8 2010 May 16 %G eng %U http://nar.oxfordjournals.org/content/38/suppl_2/W210.full %& Featured in NAR %0 Journal Article %J The ISME journal %D 2010 %T Fine-scale evolution: genomic, phenotypic and ecological differentiation in two coexisting Salinibacter ruber strains. %A Peña, Arantxa %A Teeling, Hanno %A Huerta-Cepas, Jaime %A Santos, Fernando %A Yarza, Pablo %A Brito-Echeverría, Jocelyn %A Lucio, Marianna %A Schmitt-Kopplin, Philippe %A Meseguer, Inmaculada %A Schenowitz, Chantal %A Dossat, Carole %A Barbe, Valerie %A Joaquín Dopazo %A Rosselló-Mora, Ramon %A Schüler, Margarete %A Glöckner, Frank Oliver %A Amann, Rudolf %A Gabaldón, Toni %A Antón, Josefa %X

Genomic and metagenomic data indicate a high degree of genomic variation within microbial populations, although the ecological and evolutionary meaning of this microdiversity remains unknown. Microevolution analyses, including genomic and experimental approaches, are so far very scarce for non-pathogenic bacteria. In this study, we compare the genomes, metabolomes and selected ecological traits of the strains M8 and M31 of the hyperhalophilic bacterium Salinibacter ruber that contain ribosomal RNA (rRNA) gene and intergenic regions that are identical in sequence and were simultaneously isolated from a Mediterranean solar saltern. Comparative analyses indicate that S. ruber genomes present a mosaic structure with conserved and hypervariable regions (HVRs). The HVRs, or genomic islands, are enriched in transposases, genes related to surface properties, strain-specific genes and highly divergent orthologs. However, the many indels outside the HVRs indicate that genome plasticity extends beyond them. Overall, 10% of the genes encoded in the M8 genome are absent from M31 and could stem from recent acquisitions. S. ruber genomes also harbor 34 genes located outside HVRs that are transcribed during standard growth and probably derive from lateral gene transfers with Archaea preceding the M8/M31 divergence. Metabolomic analyses, phage susceptibility and competition experiments indicate that these genomic differences cannot be considered neutral from an ecological perspective. The results point to the avoidance of competition by micro-niche adaptation and response to viral predation as putative major forces that drive microevolution within these Salinibacter strains. In addition, this work highlights the extent of bacterial functional diversity and environmental adaptation, beyond the resolution of the 16S rRNA and internal transcribed spacer regions. The ISME Journal advance online publication, 18 February 2010; doi:10.1038/ismej.2010.6.

%B The ISME journal %8 2010 Feb 18 %G eng %0 Journal Article %J Pharmacogenomics J %D 2010 %T Functional analysis of multiple genomic signatures demonstrates that classification algorithms choose phenotype-related genes. %A Shi, W %A Bessarabova, M %A Dosymbekov, D %A Dezso, Z %A Nikolskaya, T %A Dudoladova, M %A Serebryiskaya, T %A Bugrim, A %A Guryanov, A %A Brennan, R J %A Shah, R %A Dopazo, J %A Chen, M %A Deng, Y %A Shi, T %A Jurman, G %A Furlanello, C %A Thomas, R S %A Corton, J C %A Tong, W %A Shi, L %A Nikolsky, Y %K Algorithms %K Databases, Genetic %K Endpoint Determination %K Gene Expression Profiling %K Genomics %K Humans %K Neural Networks, Computer %K Oligonucleotide Array Sequence Analysis %K Phenotype %K Predictive Value of Tests %K Proteins %K Quality Control %X

Gene expression signatures of toxicity and clinical response benefit both safety assessment and clinical practice; however, difficulties in connecting signature genes with the predicted end points have limited their application. The Microarray Quality Control Consortium II (MAQCII) project generated 262 signatures for ten clinical and three toxicological end points from six gene expression data sets, an unprecedented collection of diverse signatures that has permitted a wide-ranging analysis on the nature of such predictive models. A comprehensive analysis of the genes of these signatures and their nonredundant unions using ontology enrichment, biological network building and interactome connectivity analyses demonstrated the link between gene signatures and the biological basis of their predictive power. Different signatures for a given end point were more similar at the level of biological properties and transcriptional control than at the gene level. Signatures tended to be enriched in function and pathway in an end point- and model-specific manner, and showed a topological bias for incoming interactions. Importantly, the level of biological similarity between different signatures for a given end point correlated positively with the accuracy of the signature predictions. These findings will aid the understanding and application of predictive genomic signatures, and support their broader application in predictive medicine.

%B Pharmacogenomics J %V 10 %P 310-23 %8 2010 Aug %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/20676069?dopt=Abstract %R 10.1038/tpj.2010.35 %0 Journal Article %J Stem Cells %D 2010 %T Hypoxia promotes efficient differentiation of human embryonic stem cells to functional endothelium. %A Prado-Lopez, Sonia %A Conesa, Ana %A Armiñán, Ana %A Martínez-Losa, Magdalena %A Escobedo-Lucea, Carmen %A Gandia, Carolina %A Tarazona, Sonia %A Melguizo, Dario %A Blesa, David %A Montaner, David %A Sanz-González, Silvia %A Sepúlveda, Pilar %A Götz, Stefan %A O'Connor, José Enrique %A Moreno, Ruben %A Dopazo, Joaquin %A Burks, Deborah J %A Stojkovic, Miodrag %K Angiopoietin-1 %K Animals %K biomarkers %K Cell Culture Techniques %K Cell Differentiation %K Cell Hypoxia %K Cell Transplantation %K Cells, Cultured %K Down-Regulation %K Embryonic Stem Cells %K Endothelial Cells %K Gene Expression Profiling %K Gene Expression Regulation %K Humans %K Male %K Myocardial Infarction %K Neovascularization, Physiologic %K Oxygen %K Pluripotent Stem Cells %K Rats %K Rats, Nude %K Vascular Endothelial Growth Factor A %X

Early development of mammalian embryos occurs in an environment of relative hypoxia. Nevertheless, human embryonic stem cells (hESC), which are derived from the inner cell mass of the blastocyst, are routinely cultured under the same atmospheric conditions (21% O(2)) as somatic cells. We hypothesized that O(2) levels modulate gene expression and differentiation potential of hESC, and thus, we performed gene profiling of hESC maintained under normoxic or hypoxic (1% or 5% O(2)) conditions. Our analysis revealed that hypoxia downregulates expression of pluripotency markers in hESC but significantly increases the expression of genes associated with angio- and vasculogenesis including vascular endothelial growth factor and angiopoietin-like proteins. Consequently, we were able to efficiently differentiate hESC to functional endothelial cells (EC) by varying O(2) levels; after 24 hours at 5% O(2), more than 50% of cells were CD34+. Transplantation of resulting endothelial-like cells improved both systolic function and fractional shortening in a rodent model of myocardial infarction. Moreover, analysis of the infarcted zone revealed that transplanted EC reduced the area of fibrous scar tissue by 50%. Thus, use of hypoxic conditions to specify the endothelial lineage suggests a novel strategy for cellular therapies aimed at repair of damaged vasculature in pathologies such as cerebral ischemia and myocardial infarction.

%B Stem Cells %V 28 %P 407-18 %8 2010 Mar 31 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/20049902?dopt=Abstract %R 10.1002/stem.295 %0 Journal Article %J Nature biotechnology %D 2010 %T The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. %A Shi, Leming %A Campbell, Gregory %A Jones, Wendell D %A Campagne, Fabien %A Wen, Zhining %A Walker, Stephen J %A Su, Zhenqiang %A Chu, Tzu-Ming %A Goodsaid, Federico M %A Pusztai, Lajos %A Shaughnessy, John D %A Oberthuer, André %A Thomas, Russell S %A Paules, Richard S %A Fielden, Mark %A Barlogie, Bart %A Chen, Weijie %A Du, Pan %A Fischer, Matthias %A Furlanello, Cesare %A Gallas, Brandon D %A Ge, Xijin %A Megherbi, Dalila B %A Symmans, W Fraser %A Wang, May D %A Zhang, John %A Bitter, Hans %A Brors, Benedikt %A Bushel, Pierre R %A Bylesjo, Max %A Chen, Minjun %A Cheng, Jie %A Cheng, Jing %A Chou, Jeff %A Davison, Timothy S %A Delorenzi, Mauro %A Deng, Youping %A Devanarayan, Viswanath %A Dix, David J %A Dopazo, Joaquin %A Dorff, Kevin C %A Elloumi, Fathi %A Fan, Jianqing %A Fan, Shicai %A Fan, Xiaohui %A Fang, Hong %A Gonzaludo, Nina %A Hess, Kenneth R %A Hong, Huixiao %A Huan, Jun %A Irizarry, Rafael A %A Judson, Richard %A Juraeva, Dilafruz %A Lababidi, Samir %A Lambert, Christophe G %A Li, Li %A Li, Yanen %A Li, Zhen %A Lin, Simon M %A Liu, Guozhen %A Lobenhofer, Edward K %A Luo, Jun %A Luo, Wen %A McCall, Matthew N %A Nikolsky, Yuri %A Pennello, Gene A %A Perkins, Roger G %A Philip, Reena %A Popovici, Vlad %A Price, Nathan D %A Qian, Feng %A Scherer, Andreas %A Shi, Tieliu %A Shi, Weiwei %A Sung, Jaeyun %A Thierry-Mieg, Danielle %A Thierry-Mieg, Jean %A Thodima, Venkata %A Trygg, Johan %A Vishnuvajjala, Lakshmi %A Wang, Sue Jane %A Wu, Jianping %A Wu, Yichao %A Xie, Qian %A Yousef, Waleed A %A Zhang, Liang %A Zhang, Xuegong %A Zhong, Sheng %A Zhou, Yiming %A Zhu, Sheng %A Arasappan, Dhivya %A Bao, Wenjun %A Lucas, Anne Bergstrom %A Berthold, Frank %A Brennan, Richard J %A Buness, Andreas %A Catalano, Jennifer G %A Chang, Chang %A Chen, Rong %A Cheng, Yiyu %A Cui, Jian %A Czika, Wendy %A Demichelis, Francesca %A Deng, Xutao %A Dosymbekov, Damir %A Eils, Roland %A Feng, Yang %A Fostel, Jennifer %A Fulmer-Smentek, Stephanie %A Fuscoe, James C %A Gatto, Laurent %A Ge, Weigong %A Goldstein, Darlene R %A Guo, Li %A Halbert, Donald N %A Han, Jing %A Harris, Stephen C %A Hatzis, Christos %A Herman, Damir %A Huang, Jianping %A Jensen, Roderick V %A Jiang, Rui %A Johnson, Charles D %A Jurman, Giuseppe %A Kahlert, Yvonne %A Khuder, Sadik A %A Kohl, Matthias %A Li, Jianying %A Li, Li %A Li, Menglong %A Li, Quan-Zhen %A Li, Shao %A Li, Zhiguang %A Liu, Jie %A Liu, Ying %A Liu, Zhichao %A Meng, Lu %A Madera, Manuel %A Martinez-Murillo, Francisco %A Medina, Ignacio %A Meehan, Joseph %A Miclaus, Kelci %A Moffitt, Richard A %A Montaner, David %A Mukherjee, Piali %A Mulligan, George J %A Neville, Padraic %A Nikolskaya, Tatiana %A Ning, Baitang %A Page, Grier P %A Parker, Joel %A Parry, R Mitchell %A Peng, Xuejun %A Peterson, Ron L %A Phan, John H %A Quanz, Brian %A Ren, Yi %A Riccadonna, Samantha %A Roter, Alan H %A Samuelson, Frank W %A Schumacher, Martin M %A Shambaugh, Joseph D %A Shi, Qiang %A Shippy, Richard %A Si, Shengzhu %A Smalter, Aaron %A Sotiriou, Christos %A Soukup, Mat %A Staedtler, Frank %A Steiner, Guido %A Stokes, Todd H %A Sun, Qinglan %A Tan, Pei-Yi %A Tang, Rong %A Tezak, Zivana %A Thorn, Brett %A Tsyganova, Marina %A Turpaz, Yaron %A Vega, Silvia C %A Visintainer, Roberto %A von Frese, Juergen %A Wang, Charles %A Wang, Eric %A Wang, Junwei %A Wang, Wei %A Westermann, Frank %A Willey, James C %A Woods, Matthew %A Wu, Shujian %A Xiao, Nianqing %A Xu, Joshua %A Xu, Lei %A Yang, Lun %A Zeng, Xiao %A Zhang, Jialu %A Zhang, Li %A Zhang, Min %A Zhao, Chen %A Puri, Raj K %A Scherf, Uwe %A Tong, Weida %A Wolfinger, Russell D %X

Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.

%B Nature biotechnology %V 28 %P 827-38 %8 2010 Aug %G eng %U http://www.nature.com/nbt/journal/v28/n8/full/nbt.1665.html %0 Journal Article %J Nucleic acids research %D 2010 %T SIMAP–a comprehensive database of pre-calculated protein sequence similarities, domains, annotations and clusters. %A Rattei, Thomas %A Tischler, Patrick %A Götz, Stefan %A Jehl, Marc-André %A Hoser, Jonathan %A Arnold, Roland %A Ana Conesa %A Mewes, Hans-Werner %X

The prediction of protein function as well as the reconstruction of evolutionary genesis employing sequence comparison at large is still the most powerful tool in sequence analysis. Due to the exponential growth of the number of known protein sequences and the subsequent quadratic growth of the similarity matrix, the computation of the Similarity Matrix of Proteins (SIMAP) becomes a computationally intensive task. The SIMAP database provides a comprehensive and up-to-date pre-calculation of the protein sequence similarity matrix, sequence-based features and sequence clusters. As of September 2009, SIMAP covers 48 million proteins and more than 23 million non-redundant sequences. Novel features of SIMAP include the expansion of the sequence space by including databases such as ENSEMBL as well as the integration of metagenomes based on their consistent processing and annotation. Furthermore, protein function predictions by Blast2GO are pre-calculated for all sequences in SIMAP and the data access and query functions have been improved. SIMAP assists biologists to query the up-to-date sequence space systematically and facilitates large-scale downstream projects in computational biology. Access to SIMAP is freely provided through the web portal for individuals (http://mips.gsf.de/simap/) and for programmatic access through DAS (http://webclu.bio.wzw.tum.de/das/) and Web-Service (http://mips.gsf.de/webservices/services/SimapService2.0?wsdl).

%B Nucleic acids research %V 38 %P D223-6 %8 2010 Jan %G eng %0 Journal Article %J Leuk Lymphoma %D 2009 %T Analysis of chronic lymphotic leukemia transcriptomic profile: differences between molecular subgroups %A Jantus Lewintre, E. %A Reinoso Martin, C. %A Montaner, D. %A Marin, M. %A Jose Terol, M. %A Farras, R. %A Benet, I. %A Calvete, J. J. %A Dopazo, J. %A Garcia-Conde, J. %K cancer %K microarray data analysis %X

B cell chronic lymphocytic leukemia (CLL) is a lymphoproliferative disorder with a variable clinical course. Patients with an unmutated immunoglobulin heavy chain variable region (IgV(H)) gene show shorter progression-free and overall survival than patients with a mutated IgV(H) gene. In addition, BCL6 mutations identify a subgroup of patients with high risk of progression. Gene expression was analysed in 36 early-stage patients using high-density microarrays. Around 150 differentially expressed genes were found according to IgV(H) mutations, whereas no difference was found according to BCL6 mutations. Functional profiling methods allowed us to distinguish KEGG and gene ontology terms showing coordinated gene expression changes across subgroups of CLL. We validated a set of differentially expressed genes according to IgV(H) status, scoring them as putative prognostic markers in CLL. Among them, CRY1, LPL, CD82 and DUSP22 are the ones with at least equal or superior performance to ZAP70, which is currently the most used surrogate marker of IgV(H) status.

%B Leuk Lymphoma %V 50 %P 68-79 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19127482 %0 Journal Article %J BMC Bioinformatics %D 2009 %T Functional assessment of time course microarray data. %A Nueda, Maria José %A Sebastián, Patricia %A Tarazona, Sonia %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Ferrer, Alberto %A Conesa, Ana %K Computer Simulation %K Gene Expression Profiling %K Oligonucleotide Array Sequence Analysis %K Time Factors %X

MOTIVATION: Time-course microarray experiments study the progress of gene expression along time across one or several experimental conditions. Most developed analysis methods focus on the clustering or the differential expression analysis of genes and do not integrate functional information. The assessment of the functional aspects of time-course transcriptomics data requires the use of approaches that exploit the activation dynamics of the functional categories to which genes are annotated.

METHODS: We present three novel methodologies for the functional assessment of time-course microarray data. i) maSigFun derives from the maSigPro method, a regression-based strategy to model time-dependent expression patterns and identify genes with differences across series. maSigFun fits a regression model for groups of genes labeled by a functional class and selects those categories which have a significant model. ii) PCA-maSigFun fits a PCA model of each functional class-defined expression matrix to extract orthogonal patterns of expression change, which are then assessed for their fit to a time-dependent regression model. iii) ASCA-functional uses the ASCA model to rank genes according to their correlation to principal time expression patterns and assesses functional enrichment in a GSA fashion. We used simulated and experimental datasets to study these novel approaches. Results were compared to alternative methodologies.

RESULTS: Synthetic and experimental data showed that the different methods are able to capture different aspects of the relationship between genes, functions and co-expression that are biologically meaningful. The methods should not be considered as competitive but they provide different insights into the molecular and functional dynamic events taking place within the biological system under study.

%B BMC Bioinformatics %V 10 Suppl 6 %P S9 %8 2009 Jun 16 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/19534758?dopt=Abstract %R 10.1186/1471-2105-10-S6-S9 %0 Journal Article %J Nucl. Acids Res. %D 2009 %T Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies %A Medina, Ignacio %A Montaner, David %A Bonifaci, Núria %A Pujana, Miguel Angel %A Carbonell, José %A Tárraga, Joaquín %A Fatima Al-Shahrour %A Dopazo, Joaquin %K babelomics %K gene set %K GESBAP %K pathway-based analysis %K SNP %X

Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high-resolution available today to carry out genotyping studies, the success of its application in real studies has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of the Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in the power testing for this type of studies. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/

%B Nucl. Acids Res. %V 37 %P W340-344 %G eng %U http://nar.oxfordjournals.org/cgi/content/abstract/37/suppl_2/W340 %R 10.1093/nar/gkp481 %0 Journal Article %J Nucleic Acids Res %D 2009 %T Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies. %A Medina, Ignacio %A Montaner, David %A Bonifaci, Núria %A Pujana, Miguel Angel %A Carbonell, José %A Tárraga, Joaquín %A Al-Shahrour, Fátima %A Dopazo, Joaquin %K Biological Phenomena %K Breast Neoplasms %K Female %K Genes %K Genetic Variation %K Genome-Wide Association Study %K Humans %K Polymorphism, Single Nucleotide %K Software %K User-Computer Interface %X

Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high-resolution available today to carry out genotyping studies, the success of its application in real studies has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of the Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in the power testing for this type of studies. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/.

%B Nucleic Acids Res %V 37 %P W340-4 %8 2009 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/19502494?dopt=Abstract %R 10.1093/nar/gkp481 %0 Journal Article %J PLoS Negl Trop Dis %D 2009 %T A kernel for open source drug discovery in tropical diseases %A Orti, L. %A Carbajo, R. J. %A Pieper, U. %A Eswar, N. %A Maurer, S. M. %A Rai, A. K. %A Taylor, G. %A Todd, M. H. %A Pineda-Lucena, A. %A Sali, A. %A M. A. Marti-Renom %X BACKGROUND: Conventional patent-based drug development incentives work badly for the developing world, where commercial markets are usually small to non-existent. For this reason, the past decade has seen extensive experimentation with alternative R&D institutions ranging from private-public partnerships to development prizes. Despite extensive discussion, however, one of the most promising avenues, open source drug discovery, has remained elusive. We argue that the stumbling block has been the absence of a critical mass of preexisting work that volunteers can improve through a series of granular contributions. Historically, open source software collaborations have almost never succeeded without such "kernels". METHODOLOGY/PRINCIPAL FINDINGS: Here, we use a computational pipeline for: (i) comparative structure modeling of target proteins, (ii) predicting the localization of ligand binding sites on their surfaces, and (iii) assessing the similarity of the predicted ligands to known drugs. Our kernel currently contains 143 and 297 protein targets from ten pathogen genomes that are predicted to bind a known drug or a molecule similar to a known drug, respectively. The kernel provides a source of potential drug targets and drug candidates around which an online open source community can nucleate. Using NMR spectroscopy, we have experimentally tested our predictions for two of these targets, confirming one and invalidating the other. 
CONCLUSIONS/SIGNIFICANCE: The TDI kernel, which is being offered under the Creative Commons attribution share-alike license for free and unrestricted use, can be accessed on the World Wide Web at http://www.tropicaldisease.org. We hope that the kernel will facilitate collaborative efforts towards the discovery of new drugs against parasites that cause tropical diseases. %B PLoS Negl Trop Dis %V 3 %P e418 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19381286 %0 Journal Article %J Nat Biotechnol %D 2009 %T A kernel for the Tropical Disease Initiative %A Orti, L. %A Carbajo, R. J. %A Pieper, U. %A Eswar, N. %A Maurer, S. M. %A Rai, A. K. %A Taylor, G. %A Todd, M. H. %A Pineda-Lucena, A. %A Sali, A. %A M. A. Marti-Renom %B Nat Biotechnol %V 27 %P 320-1 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19352362 %0 Journal Article %J Funct Integr Genomics %D 2009 %T Membrane transporters and carbon metabolism implicated in chloride homeostasis differentiate salt stress responses in tolerant and sensitive Citrus rootstocks %A Brumos, J. %A Colmenero-Flores, J. M. %A A. Conesa %A Izquierdo, P. %A Sanchez, G. %A Iglesias, D. J. %A Lopez-Climent, M. F. %A Gomez-Cadenas, A. %A Talon, M. %X

Salinity tolerance in Citrus is strongly related to leaf chloride accumulation. Both chloride homeostasis and specific genetic responses to Cl(-) toxicity are issues scarcely investigated in plants. To discriminate the transcriptomic network related to Cl(-) toxicity and salinity tolerance, we have used two Cl(-) salt treatments (NaCl and KCl) to perform a comparative microarray approach on two Citrus genotypes, the salt-sensitive Carrizo citrange, a poor Cl(-) excluder, and the tolerant Cleopatra mandarin, an efficient Cl(-) excluder. The data indicated that Cl(-) toxicity, rather than Na(+) toxicity and/or the concomitant osmotic perturbation, is the primary factor involved in the molecular responses of citrus plant leaves to salinity. A number of uncharacterized membrane transporter genes, like NRT1-2, were differentially regulated in the tolerant and the sensitive genotypes, suggesting their potential implication in Cl(-) homeostasis. Analyses of enriched functional categories showed that the tolerant rootstock induced wider stress responses in gene expression while repressing central metabolic processes such as photosynthesis and carbon utilization. These features were in agreement with phenotypic changes in the patterns of photosynthesis, transpiration, and stomatal conductance and support the concept that regulation of transpiration and its associated metabolic adjustments configure an adaptive response to salinity that reduces Cl(-) accumulation in the tolerant genotype.

%B Funct Integr Genomics %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19190944 %0 Journal Article %J Nucleic Acids Res %D 2008 %T Babelomics: advanced functional profiling of transcriptomics, proteomics and genomics experiments %A Fatima Al-Shahrour %A Carbonell, J. %A Minguez, P. %A Goetz, S. %A A. Conesa %A Tarraga, J. %A Medina, Ignacio %A Alloza, E. %A Montaner, D. %A Dopazo, J. %K babelomics %K functional profiling %X

We present a new version of Babelomics, a complete suite of web tools for the functional profiling of genome scale experiments, with new and improved methods as well as more types of functional definitions. Babelomics includes different flavours of conventional functional enrichment methods as well as more advanced gene set analysis methods that make it a unique tool among the similar resources available. In addition to the well-known functional definitions (GO, KEGG), Babelomics includes new ones such as Biocarta pathways or text mining-derived functional terms. Regulatory modules implemented include transcriptional control (Transfac, CisRed) and other levels of regulation such as miRNA-mediated interference. Moreover, Babelomics allows for sub-selection of terms in order to test more focused hypotheses. Also, gene annotation correspondence tables can be imported, which allows testing with user-defined functional modules. Finally, a tool for the ’de novo’ functional annotation of sequences has been included in the system. This allows using as yet unannotated organisms in the program. Babelomics has been extensively re-engineered and now it includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. Babelomics is available at http://www.babelomics.org.

%B Nucleic Acids Res %V 36 %P W341-6 %G eng %U http://nar.oxfordjournals.org/content/36/suppl_2/W341.long %0 Journal Article %J Nucleic Acids Res %D 2008 %T GEPAS, a web-based tool for microarray data analysis and interpretation. %A Tárraga, Joaquín %A Medina, Ignacio %A Carbonell, José %A Huerta-Cepas, Jaime %A Minguez, Pablo %A Alloza, Eva %A Al-Shahrour, Fátima %A Vegas-Azcárate, Susana %A Goetz, Stefan %A Escobar, Pablo %A Garcia-Garcia, Francisco %A Conesa, Ana %A Montaner, David %A Dopazo, Joaquin %K Computer Graphics %K Dose-Response Relationship, Drug %K Gene Expression Profiling %K Internet %K Kinetics %K Oligonucleotide Array Sequence Analysis %K Software %X

Gene Expression Profile Analysis Suite (GEPAS) is one of the most complete and extensively used web-based packages for microarray data analysis. During its more than 5 years of activity it has continuously been updated to keep pace with the state-of-the-art in the changing microarray data analysis arena. GEPAS offers diverse analysis options that include well established as well as novel algorithms for normalization, gene selection, class prediction, clustering and functional profiling of the experiment. New options for time-course (or dose-response) experiments, microarray-based class prediction, new clustering methods and new tests for differential expression have been included. The new pipeliner module allows automating the execution of sequential analysis steps by means of a simple but powerful graphic interface. An extensive re-engineering of GEPAS has been carried out which includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. GEPAS is nowadays the most quoted web tool in its field; it is extensively used by researchers in many countries, and its records indicate an average usage rate of 500 experiments per day. GEPAS is available at http://www.gepas.org.

%B Nucleic Acids Res %V 36 %P W308-14 %8 2008 Jul 01 %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/18508806?dopt=Abstract %R 10.1093/nar/gkn303 %0 Journal Article %J Nucleic Acids Res %D 2008 %T GEPAS, a web-based tool for microarray data analysis and interpretation %A Tarraga, J. %A Medina, Ignacio %A Carbonell, J. %A Huerta-Cepas, J. %A Minguez, P. %A Alloza, E. %A Fatima Al-Shahrour %A Vegas-Azcarate, S. %A Goetz, S. %A Escobar, P. %A Garcia-Garcia, F. %A A. Conesa %A Montaner, D. %A Dopazo, J. %K gepas %K microarray data analysis %X

Gene Expression Profile Analysis Suite (GEPAS) is one of the most complete and extensively used web-based packages for microarray data analysis. During its more than 5 years of activity it has continuously been updated to keep pace with the state-of-the-art in the changing microarray data analysis arena. GEPAS offers diverse analysis options that include well established as well as novel algorithms for normalization, gene selection, class prediction, clustering and functional profiling of the experiment. New options for time-course (or dose-response) experiments, microarray-based class prediction, new clustering methods and new tests for differential expression have been included. The new pipeliner module allows automating the execution of sequential analysis steps by means of a simple but powerful graphic interface. An extensive re-engineering of GEPAS has been carried out which includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. GEPAS is nowadays the most quoted web tool in its field; it is extensively used by researchers in many countries, and its records indicate an average usage rate of 500 experiments per day. GEPAS is available at http://www.gepas.org.

%B Nucleic Acids Res %V 36 %P W308-14 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18508806 %0 Journal Article %J Nucleic Acids Res %D 2008 %T High-throughput functional annotation and data mining with the Blast2GO suite. %A Götz, Stefan %A García-Gómez, Juan Miguel %A Terol, Javier %A Williams, Tim D %A Nagaraj, Shivashankar H %A Nueda, Maria José %A Robles, Montserrat %A Talon, Manuel %A Dopazo, Joaquin %A Conesa, Ana %K Animals %K Computational Biology %K Computer Graphics %K Databases, Genetic %K Expressed Sequence Tags %K Genes %K Genomics %K Sequence Analysis, DNA %K Sequence Analysis, Protein %K Software %K Vocabulary, Controlled %X

Functional genomics technologies have been widely adopted in the biological research of both model and non-model species. An efficient functional annotation of DNA or protein sequences is a major requirement for the successful application of these approaches as functional information on gene products is often the key to the interpretation of experimental results. Therefore, there is an increasing need for bioinformatics resources which are able to cope with large amounts of sequence data, produce valuable annotation results and are easily accessible to laboratories where functional genomics projects are being undertaken. We present the Blast2GO suite as an integrated and biologist-oriented solution for the high-throughput and automatic functional annotation of DNA or protein sequences based on the Gene Ontology vocabulary. The most outstanding Blast2GO features are: (i) the combination of various annotation strategies and tools controlling type and intensity of annotation, (ii) the numerous graphical features such as the interactive GO-graph visualization for gene-set function profiling or descriptive charts, (iii) the general sequence management features and (iv) high-throughput capabilities. We used the Blast2GO framework to carry out a detailed analysis of annotation behaviour through homology transfer and its impact in functional genomics research. Our aim is to offer biologists useful information to take into account when addressing the task of functionally characterizing their sequence data.

%B Nucleic Acids Res %V 36 %P 3420-35 %8 2008 Jun %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/18445632?dopt=Abstract %R 10.1093/nar/gkn176 %0 Journal Article %J Brief Bioinform %D 2008 %T Interoperability with Moby 1.0--it's better than sharing your toothbrush! %A Wilkinson, Mark D %A Senger, Martin %A Kawas, Edward %A Bruskiewich, Richard %A Gouzy, Jerome %A Noirot, Celine %A Bardou, Philippe %A Ng, Ambrose %A Haase, Dirk %A Saiz, Enrique de Andres %A Wang, Dennis %A Gibbons, Frank %A Gordon, Paul M K %A Sensen, Christoph W %A Carrasco, Jose Manuel Rodriguez %A Fernández, José M %A Shen, Lixin %A Links, Matthew %A Ng, Michael %A Opushneva, Nina %A Neerincx, Pieter B T %A Leunissen, Jack A M %A Ernst, Rebecca %A Twigger, Simon %A Usadel, Bjorn %A Good, Benjamin %A Wong, Yan %A Stein, Lincoln %A Crosby, William %A Karlsson, Johan %A Royo, Romina %A Párraga, Iván %A Ramírez, Sergio %A Gelpi, Josep Lluis %A Trelles, Oswaldo %A Pisano, David G %A Jimenez, Natalia %A Kerhornou, Arnaud %A Rosset, Roman %A Zamacola, Leire %A Tárraga, Joaquín %A Huerta-Cepas, Jaime %A Carazo, Jose María %A Dopazo, Joaquin %A Guigó, Roderic %A Navarro, Arcadi %A Orozco, Modesto %A Valencia, Alfonso %A Claros, M Gonzalo %A Pérez, Antonio J %A Aldana, Jose %A Rojano, M Mar %A Fernandez-Santa Cruz, Raul %A Navas, Ismael %A Schiltz, Gary %A Farmer, Andrew %A Gessler, Damian %A Schoof, Heiko %A Groscurth, Andreas %K Computational Biology %K Database Management Systems %K Databases, Factual %K Information Storage and Retrieval %K Internet %K Programming Languages %K Systems Integration %X

The BioMoby project was initiated in 2001 from within the model organism database community. It aimed to standardize methodologies to facilitate information exchange and access to analytical resources, using a consensus driven approach. Six years later, the BioMoby development community is pleased to announce the release of the 1.0 version of the interoperability framework, registry Application Programming Interface and supporting Perl and Java code-bases. Together, these provide interoperable access to over 1400 bioinformatics resources worldwide through the BioMoby platform, and this number continues to grow. Here we highlight and discuss the features of BioMoby that make it distinct from other Semantic Web Service and interoperability initiatives, and that have been instrumental to its deployment and use by a wide community of bioinformatics service providers. The standard, client software, and supporting code libraries are all freely available at http://www.biomoby.org/.

%B Brief Bioinform %V 9 %P 220-31 %8 2008 May %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/18238804?dopt=Abstract %R 10.1093/bib/bbn003 %0 Journal Article %J Brief Bioinform %D 2008 %T Interoperability with Moby 1.0–it’s better than sharing your toothbrush! %A Wilkinson, M. D. %A Senger, M. %A Kawas, E. %A Bruskiewich, R. %A Gouzy, J. %A Noirot, C. %A Bardou, P. %A Ng, A. %A Haase, D. %A Saiz Ede, A. %A Wang, D. %A Gibbons, F. %A Gordon, P. M. %A Sensen, C. W. %A Carrasco, J. M. %A Fernandez, J. M. %A Shen, L. %A Links, M. %A Ng, M. %A Opushneva, N. %A Neerincx, P. B. %A Leunissen, J. A. %A Ernst, R. %A Twigger, S. %A Usadel, B. %A Good, B. %A Wong, Y. %A Stein, L. %A Crosby, W. %A Karlsson, J. %A Royo, R. %A Parraga, I. %A Ramirez, S. %A Gelpi, J. L. %A Trelles, O. %A Pisano, D. G. %A Jimenez, N. %A Kerhornou, A. %A Rosset, R. %A Zamacola, L. %A Tarraga, J. %A Huerta-Cepas, J. %A Carazo, J. M. %A Dopazo, J. %A R. Guigo %A Navarro, A. %A Orozco, M. %A Valencia, A. %A Claros, M. G. %A Perez, A. J. %A Aldana, J. %A Rojano, M. M. %A Fernandez-Santa Cruz, R. %A Navas, I. %A Schiltz, G. %A Farmer, A. %A Gessler, D. %A Schoof, H. %A Groscurth, A. %K Computational Biology/*methods %K *Database Management Systems %K *Databases, Factual %K Information Storage and Retrieval/*methods %K *Internet %K *Programming Languages %K Systems Integration %X

The BioMoby project was initiated in 2001 from within the model organism database community. It aimed to standardize methodologies to facilitate information exchange and access to analytical resources, using a consensus driven approach. Six years later, the BioMoby development community is pleased to announce the release of the 1.0 version of the interoperability framework, registry Application Programming Interface and supporting Perl and Java code-bases. Together, these provide interoperable access to over 1400 bioinformatics resources worldwide through the BioMoby platform, and this number continues to grow. Here we highlight and discuss the features of BioMoby that make it distinct from other Semantic Web Service and interoperability initiatives, and that have been instrumental to its deployment and use by a wide community of bioinformatics service providers. The standard, client software, and supporting code libraries are all freely available at http://www.biomoby.org/.

%B Brief Bioinform %V 9 %P 220-31 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18238804 %0 Journal Article %J Nat Genet %D 2008 %T SNP and haplotype mapping for genetic analysis in the rat %A K. Saar %A A. Beck %A M. T. Bihoreau %A E. Birney %A D. Brocklebank %A Y. Chen %A E. Cuppen %A S. Demonchy %A Dopazo, J. %A P. Flicek %A M. Foglio %A A. Fujiyama %A I. G. Gut %A D. Gauguier %A R. Guigo %A V. Guryev %A M. Heinig %A O. Hummel %A N. Jahn %A S. Klages %A V. Kren %A M. Kube %A H. Kuhl %A Kuramoto, T. %A Kuroki, Y. %A Lechner, D. %A Lee, Y. A. %A Lopez-Bigas, N. %A Lathrop, G. M. %A Mashimo, T. %A Medina, Ignacio %A Mott, R. %A Patone, G. %A Perrier-Cornet, J. A. %A Platzer, M. %A Pravenec, M. %A Reinhardt, R. %A Sakaki, Y. %A Schilhabel, M. %A Schulz, H. %A Serikawa, T. %A Shikhagaie, M. %A Tatsumoto, S. %A Taudien, S. %A Toyoda, A. %A Voigt, B. %A Zelenika, D. %A Zimdahl, H. %A Hubner, N. %K Animals %K Chromosome Mapping %K *Databases, Genetic %K Genome %K *Haplotypes %K Linkage Disequilibrium %K Phylogeny %K *Polymorphism, Single Nucleotide %K *Quantitative Trait Loci %K Rats/*genetics %K Rats, Inbred Strains/*genetics %K Recombination, Genetic %X

The laboratory rat is one of the most extensively studied model organisms. Inbred laboratory rat strains originated from limited Rattus norvegicus founder populations, and the inherited genetic variation provides an excellent resource for the correlation of genotype to phenotype. Here, we report a survey of genetic variation based on almost 3 million newly identified SNPs. We obtained accurate and complete genotypes for a subset of 20,238 SNPs across 167 distinct inbred rat strains, two rat recombinant inbred panels and an F2 intercross. Using 81% of these SNPs, we constructed high-density genetic maps, creating a large dataset of fully characterized SNPs for disease gene mapping. Our data characterize the population structure and illustrate the degree of linkage disequilibrium. We provide a detailed SNP map and demonstrate its utility for mapping of quantitative trait loci. This community resource is openly available and augments the genetic tools for this workhorse of physiological studies.

%B Nat Genet %V 40 %P 560-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18443594 %0 Journal Article %J Nat Genet %D 2008 %T SNP and haplotype mapping for genetic analysis in the rat. %A Saar, Kathrin %A Beck, Alfred %A Bihoreau, Marie-Thérèse %A Birney, Ewan %A Brocklebank, Denise %A Chen, Yuan %A Cuppen, Edwin %A Demonchy, Stephanie %A Dopazo, Joaquin %A Flicek, Paul %A Foglio, Mario %A Fujiyama, Asao %A Gut, Ivo G %A Gauguier, Dominique %A Guigó, Roderic %A Guryev, Victor %A Heinig, Matthias %A Hummel, Oliver %A Jahn, Niels %A Klages, Sven %A Kren, Vladimir %A Kube, Michael %A Kuhl, Heiner %A Kuramoto, Takashi %A Kuroki, Yoko %A Lechner, Doris %A Lee, Young-Ae %A Lopez-Bigas, Nuria %A Lathrop, G Mark %A Mashimo, Tomoji %A Medina, Ignacio %A Mott, Richard %A Patone, Giannino %A Perrier-Cornet, Jeanne-Antide %A Platzer, Matthias %A Pravenec, Michal %A Reinhardt, Richard %A Sakaki, Yoshiyuki %A Schilhabel, Markus %A Schulz, Herbert %A Serikawa, Tadao %A Shikhagaie, Medya %A Tatsumoto, Shouji %A Taudien, Stefan %A Toyoda, Atsushi %A Voigt, Birger %A Zelenika, Diana %A Zimdahl, Heike %A Hubner, Norbert %K Animals %K Chromosome Mapping %K Databases, Genetic %K Genome %K Haplotypes %K Linkage Disequilibrium %K Phylogeny %K Polymorphism, Single Nucleotide %K Quantitative Trait Loci %K Rats %K Rats, Inbred Strains %K Recombination, Genetic %X

The laboratory rat is one of the most extensively studied model organisms. Inbred laboratory rat strains originated from limited Rattus norvegicus founder populations, and the inherited genetic variation provides an excellent resource for the correlation of genotype to phenotype. Here, we report a survey of genetic variation based on almost 3 million newly identified SNPs. We obtained accurate and complete genotypes for a subset of 20,238 SNPs across 167 distinct inbred rat strains, two rat recombinant inbred panels and an F2 intercross. Using 81% of these SNPs, we constructed high-density genetic maps, creating a large dataset of fully characterized SNPs for disease gene mapping. Our data characterize the population structure and illustrate the degree of linkage disequilibrium. We provide a detailed SNP map and demonstrate its utility for mapping of quantitative trait loci. This community resource is openly available and augments the genetic tools for this workhorse of physiological studies.

%B Nat Genet %V 40 %P 560-6 %8 2008 May %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/18443594?dopt=Abstract %R 10.1038/ng.124 %0 Journal Article %J Gastroenterology %D 2008 %T Transcriptional profiling of mRNA expression in the mouse distal colon %A Hoogerwerf, W. A. %A Sinha, M. %A A. Conesa %A Luxon, B. A. %A Shahinian, V. B. %A Cornelissen, G. %A Halberg, F. %A Bostwick, J. %A Timm, J. %A Cassone, V. M. %K Animals Blotting %K Genetic %K Inbred C57BL Microarray Analysis Proteins/*genetics/metabolism RNA %K Messenger/biosynthesis/*genetics Reverse Transcriptase Polymerase Chain Reaction *Transcription %K Western Cell Proliferation Circadian Rhythm/*genetics Colon/cytology/*metabolism Male Mice Mice %X BACKGROUND & AIMS: Intestinal epithelial cells and the myenteric plexus of the mouse gastrointestinal tract contain a circadian clock-based intrinsic time-keeping system. Because disruption of the biological clock has been associated with increased susceptibility to colon cancer and gastrointestinal symptoms, we aimed to identify rhythmically expressed genes in the mouse distal colon. METHODS: Microarray analysis was used to identify genes that were rhythmically expressed over a 24-hour light/dark cycle. The transcripts were then classified according to expression pattern, function, and association with physiologic and pathophysiologic processes of the colon. RESULTS: A circadian gene expression pattern was detected in approximately 3.7% of distal colonic genes. A large percentage of these genes were involved in cell signaling, differentiation, and proliferation and cell death. Of all the rhythmically expressed genes in the mouse colon, approximately 7% (64/906) have been associated with colorectal cancer formation (eg, B-cell leukemia/lymphoma-2 [Bcl2]) and 1.8% (18/906) with various colonic functions such as motility and secretion (eg, vasoactive intestinal polypeptide, cystic fibrosis transmembrane conductance regulator). 
CONCLUSIONS: A subset of genes in the murine colon follows a rhythmic expression pattern. These findings may have significant implications for colonic physiology and pathophysiology. %B Gastroenterology %V 135 %P 2019-29 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18848557 %0 Journal Article %J BMC Genomics %D 2007 %T Analysis of 13000 unique Citrus clusters associated with fruit quality, production and salinity tolerance %A Terol, J. %A A. Conesa %A Colmenero, J. M. %A Cercos, M. %A Tadeo, F. %A Agusti, J. %A Alos, E. %A Andres, F. %A Soler, G. %A Brumos, J. %A Iglesias, D. J. %A Gotz, S. %A Legaz, F. %A Argout, X. %A Courtois, B. %A Ollitrault, P. %A Dossat, C. %A Wincker, P. %A Morillon, R. %A Talon, M. %K Acclimatization/*genetics Amino Acid Motifs Citrus/*genetics Cluster Analysis Expressed Sequence Tags Fruit/genetics Gene Duplication *Gene Expression Regulation %K Plant Gene Library Genes %K Plant Genomics Molecular Sequence Data Multigene Family Phylogeny *Salts/adverse effects %X BACKGROUND: Improvement of Citrus, the most economically important fruit crop in the world, is extremely slow and inherently costly because of the long-term nature of tree breeding and an unusual combination of reproductive characteristics. Aside from disease resistance, major commercial traits in Citrus are improved fruit quality, higher yield and tolerance to environmental stresses, especially salinity. RESULTS: A normalized full length and 9 standard cDNA libraries were generated, representing particular treatments and tissues from selected varieties (Citrus clementina and C. sinensis) and rootstocks (C. reshni, and C. sinensis x Poncirus trifoliata) differing in fruit quality, resistance to abscission, and tolerance to salinity. The goal of this work was to provide a large expressed sequence tag (EST) collection enriched with transcripts related to these well appreciated agronomical traits.
Towards this end, more than 54000 ESTs derived from these libraries were analyzed and annotated. Assembly of 52626 useful sequences generated 15664 putative transcription units distributed in 7120 contigs, and 8544 singletons. BLAST annotation produced significant hits for more than 80% of the hypothetical transcription units and suggested that 647 of these might be Citrus specific unigenes. The unigene set, composed of 13000 putative different transcripts, including more than 5000 novel Citrus genes, was assigned with putative functions based on similarity, GO annotations and protein domains. CONCLUSION: Comparative genomics with Arabidopsis revealed the presence of putative conserved orthologs and single copy genes in Citrus and also the occurrence of both gene duplication events and increased number of genes for specific pathways. In addition, phylogenetic analysis performed on the ammonium transporter family and glycosyl transferase family 20 suggested the existence of Citrus paralogs. Analysis of the Citrus gene space showed that the most important metabolic pathways known to affect fruit quality were represented in the unigene set. Overall, the similarity analyses indicated that the sequences of the genes belonging to these varieties and rootstocks were essentially identical, suggesting that the differential behaviour of these species cannot be attributed to major sequence divergences. This Citrus EST assembly contributes both crucial information to discover genes of agronomical interest and tools for genetic and genomic analyses, such as the development of new markers and microarrays. %B BMC Genomics %V 8 %P 31 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17254327 %0 Journal Article %J Bioinformatics %D 2007 %T Discovering gene expression patterns in time course microarray experiments by ANOVA-SCA %A Nueda, M. J. %A A. Conesa %A Westerhuis, J. A. %A Hoefsloot, H. C. %A Smilde, A. K. %A Talon, M.
%A Ferrer, A. %K Algorithms *Analysis of Variance Computational Biology/*methods Computer Simulation Data Interpretation %K Genetic %K Genetic Models %K Statistical Gene Expression Profiling/*methods Models %K Statistical Oligonucleotide Array Sequence Analysis/*methods Principal Component Analysis Time Factors Transcription %X MOTIVATION: Designed microarray experiments are used to investigate the effects that controlled experimental factors have on gene expression and learn about the transcriptional responses associated with external variables. In these datasets, signals of interest coexist with varying sources of unwanted noise in a framework of (co)relation among the measured variables and with the different levels of the studied factors. Discovering experimentally relevant transcriptional changes requires methodologies that take all these elements into account. RESULTS: In this work, we develop the application of the Analysis of variance-simultaneous component analysis (ANOVA-SCA; Smilde et al., Bioinformatics, 2005) to the analysis of multiple series time course microarray data as an example of multifactorial gene expression profiling experiments. We denoted this implementation as ASCA-genes. We show how the combination of ANOVA-modeling and a dimension reduction technique is effective in extracting targeted signals from data by-passing structural noise. The methodology is valuable for identifying main and secondary responses associated with the experimental factors and spotting relevant experimental conditions. We additionally propose a novel approach for gene selection in the context of the relation of individual transcriptional patterns to global gene expression signals. We demonstrate the methodology on both real and synthetic datasets. AVAILABILITY: ASCA-genes has been implemented in the statistical language R and is available at http://www.ivia.es/centrodegenomica/bioinformatics.htm.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. %B Bioinformatics %V 23 %P 1792-800 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17519250 %0 Journal Article %J Nucleic Acids Res %D 2007 %T FatiGO +: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments %A Fatima Al-Shahrour %A Minguez, P. %A Tarraga, J. %A Medina, Ignacio %A Alloza, E. %A Montaner, D. %A Dopazo, J. %K babelomics %K functional enrichment analysis %X

The ultimate goal of any genome-scale experiment is to provide a functional interpretation of the data, relating the available information with the hypotheses that originated the experiment. Thus, functional profiling methods have become essential in diverse scenarios such as microarray experiments, proteomics, etc. We present the FatiGO+, a web-based tool for the functional profiling of genome-scale experiments, specially oriented to the interpretation of microarray experiments. In addition to different functional annotations (gene ontology, KEGG pathways, Interpro motifs, Swissprot keywords and text-mining based bioentities related to diseases and chemical compounds) FatiGO+ includes, as a novelty, regulatory and structural information. The regulatory information used includes predictions of targets for distinct regulatory elements (obtained from the Transfac and CisRed databases). Additionally FatiGO+ uses predictions of target motifs of miRNA to infer which of these can be activated or deactivated in the sample of genes studied. Finally, properties of gene products related to their relative location and connections in the interactome have also been used. Also, enrichment of any of these functional terms can be directly analysed on chromosomal coordinates. FatiGO+ can be found at: http://www.fatigoplus.org and within the Babelomics environment http://www.babelomics.org.

%B Nucleic Acids Res %V 35 %P W91-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17478504 %0 Journal Article %J Nucleic Acids Res %D 2007 %T FatiGO +: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments. %A Al-Shahrour, Fátima %A Minguez, Pablo %A Tárraga, Joaquín %A Medina, Ignacio %A Alloza, Eva %A Montaner, David %A Dopazo, Joaquin %K Amino Acid Motifs %K Animals %K Binding Sites %K Computational Biology %K Gene Expression Profiling %K Genes %K Genomics %K Humans %K Internet %K Oligonucleotide Array Sequence Analysis %K Programming Languages %K Software %K Systems Integration %K Transcription Factors %X

The ultimate goal of any genome-scale experiment is to provide a functional interpretation of the data, relating the available information with the hypotheses that originated the experiment. Thus, functional profiling methods have become essential in diverse scenarios such as microarray experiments, proteomics, etc. We present the FatiGO+, a web-based tool for the functional profiling of genome-scale experiments, specially oriented to the interpretation of microarray experiments. In addition to different functional annotations (gene ontology, KEGG pathways, Interpro motifs, Swissprot keywords and text-mining based bioentities related to diseases and chemical compounds) FatiGO+ includes, as a novelty, regulatory and structural information. The regulatory information used includes predictions of targets for distinct regulatory elements (obtained from the Transfac and CisRed databases). Additionally FatiGO+ uses predictions of target motifs of miRNA to infer which of these can be activated or deactivated in the sample of genes studied. Finally, properties of gene products related to their relative location and connections in the interactome have also been used. Also, enrichment of any of these functional terms can be directly analysed on chromosomal coordinates. FatiGO+ can be found at: http://www.fatigoplus.org and within the Babelomics environment http://www.babelomics.org.

%B Nucleic Acids Res %V 35 %P W91-6 %8 2007 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/17478504?dopt=Abstract %R 10.1093/nar/gkm260 %0 Journal Article %J Bioinformation %D 2007 %T Functional profiling and gene expression analysis of chromosomal copy number alterations %A L. Conde %A Montaner, D. %A Burguet-Castell, J. %A Tarraga, J. %A Fatima Al-Shahrour %A Dopazo, J. %K babelomics %X

Contrary to the traditional view in which only one or a few key genes were supposed to be the causative factors of diseases, we discuss the importance of considering groups of functionally related genes in the study of pathologies characterised by chromosomal copy number alterations. Recent observations have reported the existence of regions in higher eukaryotic chromosomes (including humans) containing genes of related function that show a high degree of coregulation. Copy number alterations will consequently affect clusters of functionally related genes, which will be the final causative agents of the diseased phenotype, in many cases. Therefore, we propose that the functional profiling of the regions affected by copy number alterations must be an important aspect to take into account in the understanding of this type of pathology. To illustrate this, we present an integrated study of DNA copy number variations and gene expression, along with the functional profiling of chromosomal regions, in a case of multiple myeloma.

%B Bioinformation %V 1 %P 432-5 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17597935 %0 Journal Article %J Bioinformation %D 2007 %T Functional profiling and gene expression analysis of chromosomal copy number alterations. %A Conde, Lucia %A Montaner, David %A Burguet-Castell, Jordi %A Tárraga, Joaquín %A Al-Shahrour, Fátima %A Dopazo, Joaquin %X

Contrary to the traditional view in which only one or a few key genes were supposed to be the causative factors of diseases, we discuss the importance of considering groups of functionally related genes in the study of pathologies characterised by chromosomal copy number alterations. Recent observations have reported the existence of regions in higher eukaryotic chromosomes (including humans) containing genes of related function that show a high degree of coregulation. Copy number alterations will consequently affect clusters of functionally related genes, which will be the final causative agents of the diseased phenotype, in many cases. Therefore, we propose that the functional profiling of the regions affected by copy number alterations must be an important aspect to take into account in the understanding of this type of pathology. To illustrate this, we present an integrated study of DNA copy number variations and gene expression, along with the functional profiling of chromosomal regions, in a case of multiple myeloma.

%B Bioinformation %V 1 %P 432-5 %8 2007 Apr 10 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/17597935?dopt=Abstract %R 10.6026/97320630001432 %0 Journal Article %J Nucleic Acids Res %D 2007 %T ISACGH: a web-based environment for the analysis of Array CGH and gene expression which includes functional profiling %A L. Conde %A Montaner, D. %A Burguet-Castell, J. %A Tarraga, J. %A Medina, Ignacio %A Fatima Al-Shahrour %A Dopazo, J. %K Animals Cluster Analysis Computational Biology/*methods Computer Graphics Gene Expression Profiling/*methods Humans Internet Models %K Genetic *Nucleic Acid Hybridization Oligonucleotide Array Sequence Analysis/*methods Programming Languages *Software Systems Integration User-Computer Interface %X We present the ISACGH, a web-based system that allows for the combination of genomic data with gene expression values and provides different options for functional profiling of the regions found. Several visualization options offer a convenient representation of the results. Different efficient methods for accurate estimation of genomic copy number from array-CGH hybridization data have been included in the program. Moreover, the connection to the gene expression analysis package GEPAS allows the use of different facilities for data pre-processing and analysis. A DAS server allows exporting the results to the Ensembl viewer where contextual genomic information can be obtained. The program is freely available at: http://isacgh.bioinfo.cipf.es or within http://www.gepas.org. %B Nucleic Acids Res %V 35 %P W81-5 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17468499 %0 Journal Article %J Nucleic Acids Res %D 2007 %T ISACGH: a web-based environment for the analysis of Array CGH and gene expression which includes functional profiling. 
%A Conde, Lucia %A Montaner, David %A Burguet-Castell, Jordi %A Tárraga, Joaquín %A Medina, Ignacio %A Al-Shahrour, Fátima %A Dopazo, Joaquin %K Animals %K Cluster Analysis %K Computational Biology %K Computer Graphics %K Gene Expression Profiling %K Humans %K Internet %K Models, Genetic %K Nucleic Acid Hybridization %K Oligonucleotide Array Sequence Analysis %K Programming Languages %K Software %K Systems Integration %K User-Computer Interface %X

We present the ISACGH, a web-based system that allows for the combination of genomic data with gene expression values and provides different options for functional profiling of the regions found. Several visualization options offer a convenient representation of the results. Different efficient methods for accurate estimation of genomic copy number from array-CGH hybridization data have been included in the program. Moreover, the connection to the gene expression analysis package GEPAS allows the use of different facilities for data pre-processing and analysis. A DAS server allows exporting the results to the Ensembl viewer where contextual genomic information can be obtained. The program is freely available at: http://isacgh.bioinfo.cipf.es or within http://www.gepas.org.

%B Nucleic Acids Res %V 35 %P W81-5 %8 2007 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/17468499?dopt=Abstract %R 10.1093/nar/gkm257 %0 Journal Article %J Nucleic Acids Res %D 2007 %T Phylemon: a suite of web tools for molecular evolution, phylogenetics and phylogenomics %A Tarraga, J. %A Medina, Ignacio %A Arbiza, L. %A Huerta-Cepas, J. %A Gabaldón, T. %A Dopazo, J. %A H. Dopazo %K Animals Computational Biology/*methods Databases %K DNA Sequence Analysis %K Genetic Evolution %K Molecular Genetic Techniques Humans *Internet Models %K Protein Software User-Computer Interface %K Statistical *Phylogeny Programming Languages Sequence Alignment Sequence Analysis %X Phylemon is an online platform for phylogenetic and evolutionary analyses of molecular sequence data. It has been developed as a web server that integrates a suite of different tools selected among the most popular stand-alone programs in phylogenetic and evolutionary analysis. It has been conceived as a natural response to the increasing demand of data analysis of many experimental scientists wishing to add a molecular evolution and phylogenetics insight into their research. Tools included in Phylemon cover a wide yet selected range of programs: from the most basic for multiple sequence alignment to elaborate statistical methods of phylogenetic reconstruction including methods for evolutionary rates analyses and molecular adaptation. 
Phylemon has several features that differentiate it from other resources: (i) it offers an integrated environment that enables the direct concatenation of evolutionary analyses and the storage of results, and handles required data format conversions; (ii) once an outfile is produced, Phylemon suggests the next possible analyses, thus guiding the user and facilitating the integration of multi-step analyses; and (iii) users can define and save complete pipelines for specific phylogenetic analysis to be automatically used on many genes in subsequent sessions or multiple genes in a single session (phylogenomics). The Phylemon web server is available at http://phylemon.bioinfo.cipf.es. %B Nucleic Acids Res %V 35 %P W38-42 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17452346 %0 Journal Article %J Bioinformatics %D 2007 %T Prophet, a web-based tool for class prediction using microarray data %A Medina, Ignacio %A Montaner, D. %A Tarraga, J. %A Dopazo, J. %K babelomics %K gepas %K predictors %X

Sample classification and class prediction are the aims of many gene expression studies. We present a web-based application, Prophet, which builds prediction rules and allows using them for further sample classification. Prophet automatically chooses the best classifier, along with the optimal selection of genes, using a strategy that renders unbiased cross-validated errors. Prophet is linked to different microarray data analysis modules, and includes a unique feature: the possibility of performing the functional interpretation of the molecular signature found. Availability: Prophet can be found at the URL http://prophet.bioinfo.cipf.es/ or within the GEPAS package at http://www.gepas.org/. Supplementary information: http://gepas.bioinfo.cipf.es/tutorial/prophet.html.

%B Bioinformatics %V 23 %P 390-1 %G eng %U http://bioinformatics.oxfordjournals.org/cgi/content/full/23/3/390?view=long&pmid=17138587 %0 Journal Article %J Eukaryot Cell %D 2007 %T Spatial differentiation in the vegetative mycelium of Aspergillus niger %A Levin, A. M. %A de Vries, R. P. %A A. Conesa %A de Bekker, C. %A Talon, M. %A Menke, H. H. %A van Peij, N. N. %A Wosten, H. A. %K Aspergillus niger/*metabolism Cell Wall/metabolism Fungal Proteins/metabolism *Gene Expression Regulation %K Biological Mycelium/*metabolism Oligonucleotide Array Sequence Analysis RNA %K Fungal Genes %K Fungal Genome %K Fungal Glucans/chemistry Maltose/chemistry Models %K Fungal Time Factors Trans-Activators/metabolism Xylose/chemistry %X Fungal mycelia are exposed to heterogenic substrates. The substrate in the central part of the colony has been (partly) degraded, whereas it is still unexplored at the periphery of the mycelium. We here assessed whether substrate heterogeneity is a main determinant of spatial gene expression in colonies of Aspergillus niger. This question was addressed by analyzing whole-genome gene expression in five concentric zones of 7-day-old maltose- and xylose-grown colonies. Expression profiles at the periphery and the center were clearly different. More than 25% of the active genes showed twofold differences in expression between the inner and outermost zones of the colony. Moreover, 9% of the genes were expressed in only one of the five concentric zones, showing that a considerable part of the genome is active in a restricted part of the colony only. 
Statistical analysis of expression profiles of colonies that had either been or not been transferred to fresh xylose-containing medium showed that differential expression in a colony is due to the heterogeneity of the medium (e.g., genes involved in secretion, genes encoding proteases, and genes involved in xylose metabolism) as well as to medium-independent mechanisms (e.g., genes involved in nitrate metabolism and genes involved in cell wall synthesis and modification). Thus, we conclude that the mycelia of 7-day-old colonies of A. niger are highly differentiated. This conclusion is also indicated by the fact that distinct zones of the colony grow and secrete proteins, even after transfer to fresh medium. %B Eukaryot Cell %V 6 %P 2311-22 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17951513 %0 Journal Article %J Nucleic Acids Res %D 2006 %T BABELOMICS: a systems biology perspective in the functional annotation of genome-scale experiments %A Fatima Al-Shahrour %A Minguez, P. %A Tarraga, J. %A Montaner, D. %A Alloza, E. %A Vaquerizas, J. M. %A L. Conde %A Blaschke, C. %A Vera, J. %A Dopazo, J. %K babelomics %K functional profiling %X

We present a new version of Babelomics, a complete suite of web tools for functional analysis of genome-scale experiments, with new and improved tools. New functionally relevant terms have been included such as CisRed motifs or bioentities obtained by text-mining procedures. An improved indexing has considerably speeded up several of the modules. An improved version of the FatiScan method for studying the coordinate behaviour of groups of functionally related genes is presented, along with a similar tool, the Gene Set Enrichment Analysis. Babelomics is now more oriented to test systems biology inspired hypotheses. Babelomics can be found at http://www.babelomics.org.

%B Nucleic Acids Res %V 34 %P W472-6 %G eng %U http://nar.oxfordjournals.org/content/34/suppl_2/W472.long %0 Journal Article %J Stud Health Technol Inform %D 2006 %T Blast2GO goes grid: developing a grid-enabled prototype for functional genomics analysis %A Aparicio, G. %A Gotz, S. %A A. Conesa %A Segrelles, D. %A Blanquer, I. %A Garcia, J. M. %A Hernandez, V. %A Robles, M. %A Talon, M. %K babelomics %X

The vast amount and complexity of data generated in genomic research imply that new dedicated and powerful computational tools need to be developed to meet their analysis requirements. Blast2GO (B2G) is a bioinformatics tool for Gene Ontology-based DNA or protein sequence annotation and function-based data mining. The application has been developed with the aim of offering an easy-to-use tool for functional genomics research. Typical B2G users are middle-size genomics labs carrying out sequencing, EST and microarray projects, handling datasets of up to several thousand sequences. In the current version of B2G, the power and analytical potential of both annotation and function data mining is somewhat restricted by the computational power behind each particular installation. In order to offer an enhanced computational capacity within this bioinformatics application, a Grid component is being developed. A prototype has been conceived for the particular problem of speeding up the Blast searches to obtain fast results for large datasets. Many efforts have been made in the literature to speed up Blast searches, but few of them deal with the use of large heterogeneous production Grid infrastructures. These are the infrastructures that could reach the largest number of resources and the best load balancing for data access. The Grid Service under development will analyse requests based on the number of sequences, splitting them according to the available resources. Lower-level computation will be performed through MPIBLAST. The software architecture is based on the WSRF standard.

%B Stud Health Technol Inform %V 120 %P 194-204 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16823138 %0 Journal Article %J Bioinformatics %D 2006 %T maSigPro: a method to identify significantly differential expression profiles in time-course microarray experiments %A A. Conesa %A Nueda, M. J. %A Ferrer, A. %A Talon, M. %K *Algorithms Computer Simulation Gene Expression/*physiology Gene Expression Profiling/*methods *Models %K Genetic Models %K Statistical Oligonucleotide Array Sequence Analysis/*methods *Software Time Factors %X MOTIVATION: Multi-series time-course microarray experiments are useful approaches for exploring biological processes. In this type of experiment, the researcher is frequently interested in studying gene expression changes over time and in evaluating trend differences between the various experimental groups. The large amount of data, multiplicity of experimental conditions and the dynamic nature of the experiments pose great challenges to data analysis. RESULTS: In this work, we propose a statistical procedure to identify genes that show different gene expression profiles across analytical groups in time-course experiments. The method is a two-regression step approach where the experimental groups are identified by dummy variables. The procedure first adjusts a global regression model with all the defined variables to identify differentially expressed genes, and second, a variable selection strategy is applied to study differences between groups and to find statistically significantly different profiles. The methodology is illustrated on both a real and a simulated microarray dataset. %B Bioinformatics %V 22 %P 1096-102 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16481333 %0 Journal Article %J Nucleic Acids Res %D 2006 %T Next station in microarray data analysis: GEPAS %A Montaner, D. %A Tarraga, J. %A Huerta-Cepas, J. 
%A Burguet, J. %A Vaquerizas, J. M. %A L. Conde %A Minguez, P. %A Vera, J. %A Mukherjee, S. %A Valls, J. %A Pujana, M. A. %A Alloza, E. %A Herrero, J. %A Fatima Al-Shahrour %A Dopazo, J. %K gepas %K microarray data analysis %X

The Gene Expression Profile Analysis Suite (GEPAS) has been running for more than four years. During this time it has evolved to keep pace with the new interests and trends in the still changing world of microarray data analysis. GEPAS has been designed to provide an intuitive yet powerful web-based interface that offers diverse analysis options from the early step of preprocessing (normalization of Affymetrix and two-colour microarray experiments and other preprocessing options), to the final step of the functional annotation of the experiment (using Gene Ontology, pathways, PubMed abstracts etc.), and includes different possibilities for clustering, gene selection, class prediction and array-comparative genomic hybridization management. GEPAS is extensively used by researchers in many countries and its records indicate an average usage rate of 400 experiments per day. The web-based pipeline for microarray gene expression data, GEPAS, is available at http://www.gepas.org.

%B Nucleic Acids Res %V 34 %P W486-91 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16845056 %0 Journal Article %J Biol Direct %D 2006 %T Origin and evolution of the peroxisomal proteome %A Gabaldón, T. %A B. Snel %A van Zimmeren, F. %A Hemrika, W. %A Tabak, H. %A M. A. Huynen %X BACKGROUND: Peroxisomes are ubiquitous eukaryotic organelles involved in various oxidative reactions. Their enzymatic content varies between species, but the presence of common protein import and organelle biogenesis systems supports a single evolutionary origin. The precise scenario for this origin remains, however, to be established. The ability of peroxisomes to divide and import proteins post-translationally, just like mitochondria and chloroplasts, supports an endosymbiotic origin. However, this view has been challenged by recent discoveries that mutant, peroxisome-less cells restore peroxisomes upon introduction of the wild-type gene, and that peroxisomes are formed from the Endoplasmic Reticulum. The lack of a peroxisomal genome precludes the use of classical analyses, as those performed with mitochondria or chloroplasts, to settle the debate. We therefore conducted large-scale phylogenetic analyses of the yeast and rat peroxisomal proteomes. RESULTS: Our results show that most peroxisomal proteins (39-58%) are of eukaryotic origin, comprising all proteins involved in organelle biogenesis or maintenance. A significant fraction (13-18%), consisting mainly of enzymes, has an alpha-proteobacterial origin and appears to be the result of the recruitment of proteins originally targeted to mitochondria. Consistent with the findings that peroxisomes are formed in the Endoplasmic Reticulum, we find that the most universally conserved peroxisome biogenesis and maintenance proteins are homologous to proteins from the Endoplasmic Reticulum Assisted Decay pathway. 
CONCLUSION: Altogether our results indicate that the peroxisome does not have an endosymbiotic origin and that its proteins were recruited from pools existing within the primitive eukaryote. Moreover the reconstruction of primitive peroxisomal proteomes suggests that ontogenetically as well as phylogenetically, peroxisomes stem from the Endoplasmic Reticulum. REVIEWERS: This article was reviewed by Arcady Mushegian, Gaspar Jekely and John Logsdon. OPEN PEER REVIEW: Reviewed by Arcady Mushegian, Gaspar Jekely and John Logsdon. For the full reviews, please go to the Reviewers’ comments section. %B Biol Direct %V 1 %P 8 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16556314 %0 Journal Article %J J Mol Biol %D 2006 %T Refinement of protein structures by iterative comparative modeling and CryoEM density fitting %A Topf, M. %A Baker, M. L. %A M. A. Marti-Renom %A Chiu, W. %A Sali, A. %K Amino Acid Sequence Cryoelectron Microscopy *Models %K Molecular Molecular Sequence Data Plant Viruses/chemistry *Protein Conformation Software Viral Proteins/*chemistry/genetics %X We developed a method for structure characterization of assembly components by iterative comparative protein structure modeling and fitting into cryo-electron microscopy (cryoEM) density maps. Specifically, we calculate a comparative model of a given component by considering many alternative alignments between the target sequence and a related template structure while optimizing the fit of a model into the corresponding density map. The method relies on the previously developed Moulder protocol that iterates over alignment, model building, and model assessment. The protocol was benchmarked using 20 varied target-template pairs of known structures with less than 30% sequence identity and corresponding simulated density maps at resolutions from 5A to 25A. 
Relative to the models based on the best existing sequence profile alignment methods, the percentage of C(alpha) atoms that are within 5A of the corresponding C(alpha) atoms in the superposed native structure increases on average from 52% to 66%, which is half-way between the starting models and the models from the best possible alignments (82%). The test also reveals that despite the improvements in the accuracy of the fitness function, this function is still the bottleneck in reducing the remaining errors. To demonstrate the usefulness of the protocol, we applied it to the upper domain of the P8 capsid protein of rice dwarf virus that has been studied by cryoEM at 6.8A. The C(alpha) root-mean-square deviation of the model based on the remotely related template, bluetongue virus VP7, improved from 8.7A to 6.0A, while the best possible model has a C(alpha) RMSD value of 5.3A. Moreover, the resulting model fits better into the cryoEM density map than the initial template structure. The method is being implemented in our program MODELLER for protein structure modeling by satisfaction of spatial restraints and will be applicable to the rapidly increasing number of cryoEM density maps of macromolecular assemblies. %B J Mol Biol %V 357 %P 1655-68 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16490207 %0 Book Section %B Discovery of biomolecular mechanisms with theoretical data analyses %D 2006 %T Reliable and specific protein function prediction by combining homology with genomic(s) context %A M. A. Huynen %A B. Snel %A Gabaldón T %B Discovery of biomolecular mechanisms with theoretical data analyses %I F. Eisenhaber, Landes Bioscience %G eng %U http://www.landesbioscience.com/iu/output.php?id=479 %0 Journal Article %J Nature %D 2005 %T An anaerobic mitochondrion that produces hydrogen %A Boxma, B. %A de Graaf, R. M. %A van der Staay, G. W. %A van Alen, T. A. %A Ricard, G. %A Gabaldón, T. %A van Hoek, A. H. 
%A Moon-van der Staay, S. Y. %A Koopman, W. J. %A van Hellemond, J. J. %A Tielens, A. G. %A Friedrich, T. %A Veenhuis, M. %A M. A. Huynen %A Hackstein, J. H. %K *Anaerobiosis Animals Ciliophora/*cytology/genetics/*metabolism/ultrastructure Cockroaches/parasitology DNA %K Mitochondrial/genetics Electron Transport Electron Transport Complex I/antagonists & inhibitors/metabolism Genome Glucose/metabolism Hydrogen/*metabolism Mitochondria/enzymology/genetics/*metabolism/ultrastructure Molecular Sequence Data Open Reading Fra %X Hydrogenosomes are organelles that produce ATP and hydrogen, and are found in various unrelated eukaryotes, such as anaerobic flagellates, chytridiomycete fungi and ciliates. Although all of these organelles generate hydrogen, the hydrogenosomes from these organisms are structurally and metabolically quite different, just like mitochondria where large differences also exist. These differences have led to a continuing debate about the evolutionary origin of hydrogenosomes. Here we show that the hydrogenosomes of the anaerobic ciliate Nyctotherus ovalis, which thrives in the hindgut of cockroaches, have retained a rudimentary genome encoding components of a mitochondrial electron transport chain. Phylogenetic analyses reveal that those proteins cluster with their homologues from aerobic ciliates. In addition, several nucleus-encoded components of the mitochondrial proteome, such as pyruvate dehydrogenase and complex II, were identified. The N. ovalis hydrogenosome is sensitive to inhibitors of mitochondrial complex I and produces succinate as a major metabolic end product–biochemical traits typical of anaerobic mitochondria. The production of hydrogen, together with the presence of a genome encoding respiratory chain components, and biochemical features characteristic of anaerobic mitochondria, identify the N. ovalis organelle as a missing link between mitochondria and hydrogenosomes. 
%B Nature %V 434 %P 74-9 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15744302 %0 Journal Article %J Bioinformatics %D 2005 %T Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research %A A. Conesa %A Gotz, S. %A Garcia-Gomez, J. M. %A Terol, J. %A Talon, M. %A Robles, M. %K babelomics %X

SUMMARY: We present here Blast2GO (B2G), a research tool designed with the main purpose of enabling Gene Ontology (GO) based data mining on sequence data for which no GO annotation is yet available. B2G joins in one application GO annotation based on similarity searches with statistical analysis and highlighted visualization on directed acyclic graphs. This tool offers a suitable platform for functional genomics research in non-model species. B2G is an intuitive and interactive desktop application that allows monitoring and comprehension of the whole annotation and analysis process. AVAILABILITY: Blast2GO is freely available via Java Web Start at http://www.blast2go.de. SUPPLEMENTARY MATERIAL: http://www.blast2go.de -> Evaluation.

%B Bioinformatics %V 21 %P 3674-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16081474 %0 Journal Article %J Plant Mol Biol %D 2005 %T Development of a citrus genome-wide EST collection and cDNA microarray as resources for genomic studies %A J. Forment %A J. Gadea %A Huerta, L. %A Abizanda, L. %A Agusti, J. %A Alamar, S. %A Alos, E. %A Andres, F. %A Arribas, R. %A Beltran, J. P. %A Berbel, A. %A Blazquez, M. A. %A Brumos, J. %A Canas, L. A. %A Cercos, M. %A Colmenero-Flores, J. M. %A A. Conesa %A Estables, B. %A Gandia, M. %A Garcia-Martinez, J. L. %A Gimeno, J. %A Gisbert, A. %A Gomez, G. %A Gonzalez-Candelas, L. %A Granell, A. %A Guerri, J. %A Lafuente, M. T. %A Madueno, F. %A Marcos, J. F. %A Marques, M. C. %A Martinez, F. %A Martinez-Godoy, M. A. %A Miralles, S. %A Moreno, P. %A Navarro, L. %A Pallas, V. %A Perez-Amador, M. A. %A Perez-Valle, J. %A Pons, C. %A Rodrigo, I. %A Rodriguez, P. L. %A Royo, C. %A Serrano, R. %A Soler, G. %A Tadeo, F. %A Talon, M. %A Terol, J. %A Trenor, M. %A Vaello, L. %A Vicente, O. %A Vidal, Ch %A Zacarias, L. %A Conejero, V. %K Citrus/*genetics DNA %K Complementary/chemistry/genetics *Expressed Sequence Tags Gene Expression Profiling Gene Library *Genome %K DNA %K Plant Genomics/*methods Molecular Sequence Data Oligonucleotide Array Sequence Analysis/*methods RNA %K Plant/genetics/metabolism Reproducibility of Results Sequence Analysis %X A functional genomics project has been initiated to approach the molecular characterization of the main biological and agronomical traits of citrus. As a key part of this project, a citrus EST collection has been generated from 25 cDNA libraries covering different tissues, developmental stages and stress conditions. The collection includes a total of 22,635 high-quality ESTs, grouped in 11,836 putative unigenes, which represent at least one third of the estimated number of genes in the citrus genome. 
Functional annotation of unigenes which have Arabidopsis orthologues (68% of all unigenes) revealed gene representation in every major functional category, suggesting that a genome-wide EST collection was obtained. A Citrus clementina Hort. ex Tan. cv. Clemenules genomic library, that will contribute to further characterization of relevant genes, has also been constructed. To initiate the analysis of citrus transcriptome, we have developed a cDNA microarray containing 12,672 probes corresponding to 6875 putative unigenes of the collection. Technical characterization of the microarray showed high intra- and inter-array reproducibility, as well as a good range of sensitivity. We have also validated gene expression data achieved with this microarray through an independent technique such as RNA gel blot analysis. %B Plant Mol Biol %V 57 %P 375-91 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15830128 %0 Journal Article %J J Biotechnol %D 2002 %T Bioinformatics methods for the analysis of expression arrays: data clustering and information extraction %A J. Tamames %A Clark, D. %A Herrero, J. %A Dopazo, J. %A Blaschke, C. %A Fernandez, J. M. %A Oliveros, J. C. %A Valencia, A. %K Abstracting and Indexing as Topic/methods *Cluster Analysis *Database Management Systems Databases %K Computer-Assisted/methods Information Storage and Retrieval/*methods Internet Medline National Library of Medicine (U.S.) Oligonucleotide Array Sequence Analysis/*methods United States %K Genetic Gene Expression Gene Expression Profiling/*methods Image Processing %X Expression arrays facilitate the monitoring of changes in the expression patterns of large collections of genes. The analysis of expression array data has become a computationally-intensive task that requires the development of bioinformatics technology for a number of key stages in the process, such as image analysis, database storage, gene clustering and information extraction. 
Here, we review the current trends in each of these areas, with particular emphasis on the development of the related technology being carried out within our groups. %B J Biotechnol %V 98 %P 269-83 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12141992 %0 Journal Article %J Am J Pathol %D 2002 %T Identification of genes involved in resistance to interferon-alpha in cutaneous T-cell lymphoma %A Tracey, L. %A Villuendas, R. %A Ortiz, P. %A Dopazo, A. %A Spiteri, I. %A Lombardia, L. %A Rodriguez-Peralto, J. L. %A Fernandez-Herrera, J. %A Hernandez, A. %A Fraga, J. %A Dominguez, O. %A Herrero, J. %A Alonso, M. A. %A Dopazo, J. %A Piris, M. A. %K Antineoplastic Agents/*pharmacology/therapeutic use Carrier Proteins/biosynthesis/genetics DNA-Binding Proteins/biosynthesis/genetics Drug Resistance %K Biological Oligonucleotide Array Sequence Analysis RNA %K Cultured %K Cutaneous/diagnosis/drug therapy/*genetics/metabolism *Membrane Glycoproteins Models %K Interleukin-1 Reproducibility of Results STAT1 Transcription Factor STAT3 Transcription Factor Trans-Activators/biosynthesis/genetics Tumor Cells %K Neoplasm Gene Expression Profiling *Gene Expression Regulation %K Neoplasm/biosynthesis *Receptors %K Neoplastic Humans Interferon-alpha/*pharmacology/therapeutic use Kinetics Lymphoma %K T-Cell %X Interferon-alpha therapy has been shown to be active in the treatment of mycosis fungoides although the individual response to this therapy is unpredictable and dependent on essentially unknown factors. In an effort to better understand the molecular mechanisms of interferon-alpha resistance we have developed an interferon-alpha resistant variant from a sensitive cutaneous T-cell lymphoma cell line. We have performed expression analysis to detect genes differentially expressed between both variants using a cDNA microarray including 6386 cancer-implicated genes. 
The experiments showed that resistance to interferon-alpha is consistently associated with changes in the expression of a set of 39 genes, involved in signal transduction, apoptosis, transcription regulation, and cell growth. Additional studies performed confirm that STAT1 and STAT3 expression and interferon-alpha induction and activation are not altered between both variants. The gene MAL, highly overexpressed by resistant cells, was also found to be expressed by tumoral cells in a series of cutaneous T-cell lymphoma patients treated with interferon-alpha and/or photochemotherapy. MAL expression was associated with longer time to complete remission. Time-course experiments of the sensitive and resistant cells showed a differential expression of a subset of genes involved in interferon-response (1 to 4 hours), cell growth and apoptosis (24 to 48 hours), and signal transduction. %B Am J Pathol %V 161 %P 1825-37 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12414529 %0 Book %D 2002 %T Methods of Microarray Data Analysis II: Supervised Neural Networks for Clustering Conditions in DNA Array Data After Reducing Noise by Clustering Gene Expression Profiles %A Mateos, Alvaro %A Herrero, Javier %A Tamames, Javier %A Dopazo, Joaquin %E Lin, Simon M. %E Johnson, Kimberly F. %I Kluwer Academic Publishers %C Boston %P 91-103 %G eng %U http://www.springerlink.com/index/10.1007/b112982 http://link.springer.com/10.1007/0-306-47598-7_7 http://www.springerlink.com/index/pdf/10.1007/0-306-47598-7_7 %R 10.1007/b112982 10.1007/0-306-47598-7_7 %0 Book Section %B Microarray data analysis II %D 2002 %T Supervised Neural Networks For Clustering Conditions In DNA Array Data After Reducing Noise By Clustering Gene Expression Profiles %A A. Mateos %A Herrero, J. %A J. Tamames %A Dopazo, J. 
%B Microarray data analysis II %I Kluwer Academic %P 91-103 %G eng %0 Journal Article %J Genome Res %D 2002 %T Systematic learning of gene functional classes from DNA array expression data by using multilayer perceptrons %A A. Mateos %A Dopazo, J. %A Jansen, R. %A Tu, Y. %A Gerstein, M. %A Stolovitzky, G. %K Algorithms Artificial Intelligence Citric Acid Cycle/genetics Cluster Analysis Computational Biology/methods Gene Expression Profiling/*methods/statistics & numerical data Genes/*physiology Genetic Heterogeneity Neural Networks (Computer) Oligonucleotide %X Recent advances in microarray technology have opened new ways for functional annotation of previously uncharacterised genes on a genomic scale. This has been demonstrated by unsupervised clustering of co-expressed genes and, more importantly, by supervised learning algorithms. Using prior knowledge, these algorithms can assign functional annotations based on more complex expression signatures found in existing functional classes. Previously, support vector machines (SVMs) and other machine-learning methods have been applied to a limited number of functional classes for this purpose. Here we present, for the first time, the comprehensive application of supervised neural networks (SNNs) for functional annotation. Our study is novel in that we report systematic results for 100 classes in the Munich Information Center for Protein Sequences (MIPS) functional catalog. We found that only 10% of these are learnable (based on the rate of false negatives). A closer analysis reveals that false positives (and negatives) in a machine-learning context are not necessarily "false" in a biological sense. We show that the high degree of interconnections among functional classes confounds the signatures that ought to be learned for a unique class. We term this the "Borges effect" and introduce two new numerical indices for its quantification. 
Our analysis indicates that classification systems with a lower Borges effect are better suited to machine learning. Furthermore, we introduce a learning procedure for combining false positives with the original class. We show that in a few iterations this process converges to a gene set that is learnable with considerably low rates of false positives and negatives and contains genes that are biologically related to the original class, allowing for a coarse reconstruction of the interactions between associated biological pathways. We exemplify this methodology using the well-studied tricarboxylic acid cycle. %B Genome Res %V 12 %P 1703-15 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12421757