%0 Journal Article %J Front Immunol %D 2024 %T Drug-target identification in COVID-19 disease mechanisms using computational systems biology approaches. %A Niarakis, Anna %A Ostaszewski, Marek %A Mazein, Alexander %A Kuperstein, Inna %A Kutmon, Martina %A Gillespie, Marc E %A Funahashi, Akira %A Acencio, Marcio Luis %A Hemedan, Ahmed %A Aichem, Michael %A Klein, Karsten %A Czauderna, Tobias %A Burtscher, Felicia %A Yamada, Takahiro G %A Hiki, Yusuke %A Hiroi, Noriko F %A Hu, Finterly %A Pham, Nhung %A Ehrhart, Friederike %A Willighagen, Egon L %A Valdeolivas, Alberto %A Dugourd, Aurélien %A Messina, Francesco %A Esteban-Medina, Marina %A Peña-Chilet, Maria %A Rian, Kinza %A Soliman, Sylvain %A Aghamiri, Sara Sadat %A Puniya, Bhanwar Lal %A Naldi, Aurélien %A Helikar, Tomáš %A Singh, Vidisha %A Fernández, Marco Fariñas %A Bermudez, Viviam %A Tsirvouli, Eirini %A Montagud, Arnau %A Noël, Vincent %A Ponce-de-Leon, Miguel %A Maier, Dieter %A Bauch, Angela %A Gyori, Benjamin M %A Bachman, John A %A Luna, Augustin %A Piñero, Janet %A Furlong, Laura I %A Balaur, Irina %A Rougny, Adrien %A Jarosz, Yohan %A Overall, Rupert W %A Phair, Robert %A Perfetto, Livia %A Matthews, Lisa %A Rex, Devasahayam Arokia Balaya %A Orlic-Milacic, Marija %A Gomez, Luis Cristobal Monraz %A De Meulder, Bertrand %A Ravel, Jean Marie %A Jassal, Bijay %A Satagopam, Venkata %A Wu, Guanming %A Golebiewski, Martin %A Gawron, Piotr %A Calzone, Laurence %A Beckmann, Jacques S %A Evelo, Chris T %A D'Eustachio, Peter %A Schreiber, Falk %A Saez-Rodriguez, Julio %A Dopazo, Joaquin %A Kuiper, Martin %A Valencia, Alfonso %A Wolkenhauer, Olaf %A Kitano, Hiroaki %A Barillot, Emmanuel %A Auffray, Charles %A Balling, Rudi %A Schneider, Reinhard %K Computer Simulation %K COVID-19 %K drug repositioning %K Humans %K SARS-CoV-2 %K Systems biology %X

INTRODUCTION: The COVID-19 Disease Map project is a large-scale community effort uniting 277 scientists from 130 institutions around the globe. We use high-quality, mechanistic content describing SARS-CoV-2-host interactions and develop interoperable bioinformatic pipelines for novel target identification and drug repurposing.

METHODS: Extensive community work allowed an impressive step forward in building interfaces between Systems Biology tools and platforms. Our framework can link biomolecules from omics data analysis and computational modelling to dysregulated pathways in a cell-, tissue- or patient-specific manner. Drug repurposing using text mining and AI-assisted analysis identified potential drugs, chemicals and microRNAs that could target the identified key factors.

RESULTS: Results revealed drugs already tested for anti-COVID-19 efficacy, providing a mechanistic context for their mode of action, and drugs already in clinical trials for treating other diseases, never tested against COVID-19.

DISCUSSION: The key advance is that the proposed framework is versatile and expandable, offering a significant upgrade in the arsenal for virus-host interactions and other complex pathologies.

%B Front Immunol %V 14 %P 1282859 %8 2023 %G eng %R 10.3389/fimmu.2023.1282859 %0 Journal Article %J J Transl Med %D 2024 %T The mechanistic functional landscape of retinitis pigmentosa: a machine learning-driven approach to therapeutic target discovery. %A Esteban-Medina, Marina %A Loucera, Carlos %A Rian, Kinza %A Velasco, Sheyla %A Olivares-González, Lorena %A Rodrigo, Regina %A Dopazo, Joaquin %A Peña-Chilet, Maria %K Animals %K Mice %K Retinitis pigmentosa %K Signal Transduction %X

BACKGROUND: Retinitis pigmentosa is the prevailing genetic cause of blindness in developed nations with no effective treatments. In the pursuit of unraveling the intricate dynamics underlying this complex disease, mechanistic models emerge as a tool of proven efficiency rooted in systems biology, to elucidate the interplay between RP genes and their mechanisms. The integration of mechanistic models and drug-target interactions under the umbrella of machine learning methodologies provides a multifaceted approach that can boost the discovery of novel therapeutic targets, facilitating further drug repurposing in RP.

METHODS: By mapping Retinitis Pigmentosa-related genes (obtained from Orphanet, OMIM and HPO databases) onto KEGG signaling pathways, a collection of signaling functional circuits encompassing Retinitis Pigmentosa molecular mechanisms was defined. Next, a mechanistic model of the so-defined disease map, where the effects of interventions can be simulated, was built. Then, an explainable multi-output random forest regressor was trained using normal tissue transcriptomic data to learn causal connections between targets of approved drugs from DrugBank and the functional circuits of the mechanistic disease map. The involvement of selected target genes was validated in rd10 mice, a murine model of Retinitis Pigmentosa.

RESULTS: A mechanistic functional map of Retinitis Pigmentosa was constructed resulting in 226 functional circuits belonging to 40 KEGG signaling pathways. The method predicted 109 targets of approved drugs in use with a potential effect over circuits corresponding to nine hallmarks identified. Five of those targets were selected and experimentally validated in rd10 mice: Gabre, Gabra1 (GABARα1 protein), Slc12a5 (KCC2 protein), Grin1 (NR1 protein) and Glr2a. As a result, we provide a resource to evaluate the potential impact of drug target genes in Retinitis Pigmentosa.

CONCLUSIONS: The possibility of building actionable disease models in combination with machine learning algorithms to learn causal drug-disease interactions opens new avenues for boosting drug discovery. Such mechanistically-based hypotheses can guide and accelerate experimental validation by prioritizing drug target candidates. In this work, a mechanistic model describing the functional disease map of Retinitis Pigmentosa was developed, identifying five promising therapeutic candidates targeted by approved drugs. Further experimental validation will demonstrate the efficiency of this approach for a systematic application to other rare diseases.

%B J Transl Med %V 22 %P 139 %8 2024 Feb 06 %G eng %N 1 %R 10.1186/s12967-024-04911-7 %0 Journal Article %J Front Med (Lausanne) %D 2023 %T Case report: Analysis of phage therapy failure in a patient with a Pseudomonas aeruginosa prosthetic vascular graft infection %A Blasco, Lucia %A López-Hernández, Inmaculada %A Rodríguez-Fernández, Miguel %A Perez-Florido, Javier %A Casimiro-Soriguer, Carlos S %X

We report the clinical case of a patient with a multidrug-resistant prosthetic vascular graft infection that was treated with a cocktail of phages (PT07, 14/01, and PNM) in combination with ceftazidime-avibactam (CZA). After the application of the phage treatment and in the absence of antimicrobial therapy, a new bloodstream infection (BSI) with a septic residual limb metastasis occurred, now involving a wild-type strain susceptible to β-lactams and quinolones. Clinical strains were analyzed by microbiology and whole genome sequencing techniques. With regard to phage administration, the clinical isolates obtained before phage therapy (HE2011471) and after phage therapy (HE2105886) showed a clonal relationship but with important genomic changes that could be involved in resistance to this therapy. Finally, phenotypic studies showed a decrease in the Minimum Inhibitory Concentration (MIC) of β-lactams and quinolones, as well as an increase in biofilm production and in phage-resistant mutants, in the clinical isolate obtained after phage therapy.

%B Front Med (Lausanne) %V 10 %P 1199657 %8 2023 %G eng %U https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10235614/ %R 10.3389/fmed.2023.1199657 %0 Book %B Lecture Notes in Computer Science. Computational Methods in Systems Biology %D 2023 %T Cell-Level Pathway Scoring Comparison with a Biologically Constrained Variational Autoencoder %A Gundogdu, Pelin %A Payá-Milans, Miriam %A Alamo-Alvarez, Inmaculada %A Nepomuceno-Chamorro, Isabel A. %A Dopazo, Joaquin %A Loucera, Carlos %B Lecture Notes in Computer Science. Computational Methods in Systems Biology %I Springer Nature Switzerland %C Cham %V 14137 %P 62 - 77 %@ 978-3-031-42696-4 %G eng %U https://link.springer.com/chapter/10.1007/978-3-031-42697-1_5 %R 10.1007/978-3-031-42697-1_5 %0 Journal Article %J Pharmaceutics %D 2023 %T A Comprehensive Analysis of 21 Actionable Pharmacogenes in the Spanish Population: From Genetic Characterisation to Clinical Impact. %A Núñez-Torres, Rocío %A Pita, Guillermo %A Peña-Chilet, Maria %A López-López, Daniel %A Zamora, Jorge %A Roldán, Gema %A Herráez, Belén %A Alvarez, Nuria %A Alonso, María Rosario %A Dopazo, Joaquin %A González-Neira, Anna %X

The implementation of pharmacogenetics (PGx) is one of the main milestones of precision medicine today, aiming at safer and more effective therapies. Nevertheless, the implementation of PGx diagnostics is extremely slow and unequal worldwide, in part due to a lack of ethnic PGx information. We analysed genetic data from 3006 Spanish individuals obtained by different high-throughput (HT) techniques. Allele frequencies were determined in our population for the main 21 actionable PGx genes associated with therapeutical changes. We found that 98% of the Spanish population harbours at least one allele associated with a therapeutical change and, thus, there would be a need for a therapeutical change in a mean of 3.31 of the 64 associated drugs. We also identified 326 putative deleterious variants that were not previously related to PGx in 18 out of the 21 main PGx genes evaluated and a total of 7122 putative deleterious variants for the 1045 PGx genes described. Additionally, we performed a comparison of the main HT diagnostic techniques, revealing that after whole genome sequencing, genotyping with the PGx HT array is the most suitable solution for PGx diagnostics. Finally, all this information was integrated in the Collaborative Spanish Variant Server to be available to and updated by the scientific community.
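As an illustration of the two population summaries mentioned above (allele frequency and the share of individuals carrying at least one allele), the following is a minimal Python sketch. It assumes diploid genotypes encoded as the number of alternate alleles per individual (0, 1 or 2); the function names and the toy data are hypothetical and are not taken from the study or from the Collaborative Spanish Variant Server.

```python
def allele_frequency(alt_counts):
    """Alternate-allele frequency for one variant in a diploid sample.

    alt_counts holds, per individual, the number of alternate alleles (0, 1 or 2).
    """
    if not alt_counts:
        raise ValueError("no genotypes supplied")
    if any(c not in (0, 1, 2) for c in alt_counts):
        raise ValueError("diploid genotypes must be 0, 1 or 2")
    # Each individual contributes two alleles to the denominator.
    return sum(alt_counts) / (2 * len(alt_counts))


def carrier_fraction(alt_counts):
    """Fraction of individuals carrying at least one alternate allele."""
    if not alt_counts:
        raise ValueError("no genotypes supplied")
    return sum(c > 0 for c in alt_counts) / len(alt_counts)


# Hypothetical toy genotypes for eight individuals (illustrative only).
genotypes = [0, 1, 2, 0, 1, 0, 0, 1]
print(allele_frequency(genotypes))   # 5 alternate alleles out of 16
print(carrier_fraction(genotypes))   # 4 carriers out of 8
```

The carrier fraction is computed per variant here; a population-level figure such as "at least one actionable allele across all genes" would require combining genotypes across variants for each individual.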

%B Pharmaceutics %V 15 %8 2023 Apr 19 %G eng %N 4 %R 10.3390/pharmaceutics15041286 %0 Journal Article %J Int J Mol Sci %D 2023 %T Crosstalk between Metabolite Production and Signaling Activity in Breast Cancer. %A Cubuk, Cankut %A Loucera, Carlos %A Peña-Chilet, Maria %A Dopazo, Joaquin %X

The reprogramming of metabolism is a recognized cancer hallmark. It is well known that different signaling pathways regulate and orchestrate this reprogramming that contributes to cancer initiation and development. However, recent evidence is accumulating, suggesting that several metabolites could play a relevant role in regulating signaling pathways. To assess the potential role of metabolites in the regulation of signaling pathways, both metabolic and signaling pathway activities of Breast invasive Carcinoma (BRCA) have been modeled using mechanistic models. Gaussian Processes, powerful machine learning methods, were used in combination with SHapley Additive exPlanations (SHAP), a recent methodology that conveys causality, to obtain potential causal relationships between the production of metabolites and the regulation of signaling pathways. A total of 317 metabolites were found to have a strong impact on signaling circuits. The results presented here point to a crosstalk between signaling and metabolic pathways that is more complex than previously thought.

%B Int J Mol Sci %V 24 %8 2023 Apr 18 %G eng %N 8 %R 10.3390/ijms24087450 %0 Journal Article %J Hum Genomics %D 2023 %T A crowdsourcing database for the copy-number variation of the Spanish population. %A López-López, Daniel %A Roldán, Gema %A Fernandez-Rueda, Jose L %A Bostelmann, Gerrit %A Carmona, Rosario %A Aquino, Virginia %A Perez-Florido, Javier %A Ortuno, Francisco %A Pita, Guillermo %A Núñez-Torres, Rocío %A González-Neira, Anna %A Peña-Chilet, Maria %A Dopazo, Joaquin %X

BACKGROUND: Despite being a very common type of genetic variation, the distribution of copy-number variations (CNVs) in the population is still poorly understood. The knowledge of the genetic variability, especially at the level of the local population, is a critical factor for distinguishing pathogenic from non-pathogenic variation in the discovery of new disease variants.

RESULTS: Here, we present the SPAnish Copy Number Alterations Collaborative Server (SPACNACS), which currently contains copy number variation profiles obtained from more than 400 genomes and exomes of unrelated Spanish individuals. By means of a collaborative crowdsourcing effort, whole genome and whole exome sequencing data, produced by local genomic projects and for other purposes, are continuously collected. Once both the Spanish ancestry and the lack of kinship with other individuals in SPACNACS have been checked, the CNVs are inferred for these sequences and used to populate the database. A web interface allows querying the database with different filters that include ICD10 upper categories. This allows discarding samples from the disease under study and obtaining pseudo-control CNV profiles from the local population. We also show here additional studies on the local impact of CNVs in some phenotypes and on pharmacogenomic variants. SPACNACS can be accessed at: http://csvs.clinbioinfosspa.es/spacnacs/ .

CONCLUSION: SPACNACS facilitates disease gene discovery by providing detailed information on the local variability of the population and exemplifies how to reuse genomic data produced for other purposes to build a local reference database.

%B Hum Genomics %V 17 %P 20 %8 2023 Mar 09 %G eng %N 1 %R 10.1186/s40246-023-00466-8 %0 Journal Article %J Cell Rep %D 2023 %T Defective extracellular matrix remodeling in brown adipose tissue is associated with fibro-inflammation and reduced diet-induced thermogenesis. %A Pellegrinelli, Vanessa %A Figueroa-Juárez, Elizabeth %A Samuelson, Isabella %A U-Din, Mueez %A Rodriguez-Fdez, Sonia %A Virtue, Samuel %A Leggat, Jennifer %A Cubuk, Cankut %A Peirce, Vivian J %A Niemi, Tarja %A Campbell, Mark %A Rodriguez-Cuenca, Sergio %A Dopazo, Joaquin %A Carobbio, Stefania %A Virtanen, Kirsi A %A Vidal-Puig, Antonio %X

The relevance of extracellular matrix (ECM) remodeling is reported in white adipose tissue (AT) and obesity-related dysfunctions, but little is known about the importance of ECM remodeling in brown AT (BAT) function. Here, we show that a time course of high-fat diet (HFD) feeding progressively impairs diet-induced thermogenesis concomitantly with the development of fibro-inflammation in BAT. Higher markers of fibro-inflammation are associated with lower cold-induced BAT activity in humans. Similarly, when mice are housed at thermoneutrality, inactivated BAT features fibro-inflammation. We validate the pathophysiological relevance of BAT ECM remodeling in response to temperature challenges and HFD using a model of a primary defect in the collagen turnover mediated by partial ablation of the Pepd prolidase. Pepd-heterozygous mice display exacerbated dysfunction and BAT fibro-inflammation at thermoneutrality and in HFD. Our findings show the relevance of ECM remodeling in BAT activation and provide a mechanism for BAT dysfunction in obesity.

%B Cell Rep %V 42 %P 112640 %8 2023 Jun 13 %G eng %N 6 %R 10.1016/j.celrep.2023.112640 %0 Journal Article %J Int J Mol Sci %D 2023 %T Detection of High Level of Co-Infection and the Emergence of Novel SARS CoV-2 Delta-Omicron and Omicron-Omicron Recombinants in the Epidemiological Surveillance of Andalusia. %A Perez-Florido, Javier %A Casimiro-Soriguer, Carlos S %A Ortuno, Francisco %A Fernandez-Rueda, Jose L %A Aguado, Andrea %A Lara, María %A Riazzo, Cristina %A Rodriguez-Iglesias, Manuel A %A Camacho-Martinez, Pedro %A Merino-Diaz, Laura %A Pupo-Ledo, Inmaculada %A de Salazar, Adolfo %A Viñuela, Laura %A Fuentes, Ana %A Chueca, Natalia %A García, Federico %A Dopazo, Joaquin %A Lepe, Jose A %X

Recombination is an evolutionary strategy to quickly acquire new viral properties inherited from the parental lineages. The systematic survey of the SARS-CoV-2 genome sequences of the Andalusian genomic surveillance strategy has allowed the detection of an unexpectedly high number of co-infections, which constitute the ideal scenario for the emergence of new recombinants. Whole genome sequencing of SARS-CoV-2 has been carried out as part of the genomic surveillance programme. Sample sources included the main hospitals in the Andalusia region. In addition to the increase of co-infections and known recombinants, three novel SARS-CoV-2 delta-omicron and omicron-omicron recombinant variants with two break points have been detected. Our observations document an epidemiological scenario in which co-infection and recombination are detected more frequently. Finally, we describe a family case in which co-infection is followed by the detection of a recombinant made from the two co-infecting variants. This increased number of recombinants raises the risk of emergence of recombinant variants with increased transmissibility and pathogenicity.

%B Int J Mol Sci %V 24 %8 2023 Jan 26 %G eng %N 3 %R 10.3390/ijms24032419 %0 Journal Article %J Front Genet %D 2023 %T Editorial: Critical assessment of massive data analysis (CAMDA) annual conference 2021. %A Łabaj, Paweł P %A Dopazo, Joaquin %A Xiao, Wenzhong %A Kreil, David P %B Front Genet %V 14 %P 1154398 %8 2023 %G eng %R 10.3389/fgene.2023.1154398 %0 Journal Article %J Epidemiol Infect %D 2023 %T Evaluation of a combined detection of SARS-CoV-2 and its variants using real-time allele-specific PCR strategy: an advantage for clinical practice. %A Chaves-Blanco, Lucía %A de Salazar, Adolfo %A Fuentes, Ana %A Viñuela, Laura %A Perez-Florido, Javier %A Dopazo, Joaquin %A García, Federico %K Alleles %K COVID-19 %K COVID-19 Testing %K Humans %K Real-Time Polymerase Chain Reaction %K SARS-CoV-2 %K Sensitivity and Specificity %X

This study aimed to assess the ability of a real-time reverse transcription polymerase chain reaction (RT-PCR) with multiple targets to detect SARS-CoV-2 and its variants in a single test. Nasopharyngeal specimens were collected from patients in Granada, Spain, between January 2021 and December 2022. Five allele-specific RT-PCR kits were used sequentially, with each kit designed to detect a predominant variant at the time. When the Alpha variant was dominant, the kit included the HV69/70 deletion, E and N genes. When Delta replaced Alpha, the kit incorporated the L452R mutation in addition to E and N genes. When Omicron became dominant, L452R was replaced with the N679K mutation. Before incorporating each variant kit, a comparative analysis was carried out with SARS-CoV-2 whole genome sequencing (WGS). The results demonstrated that RT-PCR with multiple targets can provide rapid and effective detection of SARS-CoV-2 and its variants in a single test. A very high degree of agreement (96.2%) was obtained between RT-PCR and WGS. Allele-specific RT-PCR assays make it easier to implement epidemiological surveillance systems for effective public health decision making.
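The agreement figure reported above is a percentage of concordant calls between two methods. A minimal Python sketch of that calculation follows, assuming one variant call per sample from each method; the function name and the toy calls are hypothetical and do not reproduce the study data or its exact definition of agreement.

```python
def percent_agreement(pcr_calls, wgs_calls):
    """Percentage of samples for which both methods report the same variant."""
    if len(pcr_calls) != len(wgs_calls):
        raise ValueError("call lists must have the same length")
    if not pcr_calls:
        raise ValueError("no samples to compare")
    # Count paired samples whose calls are identical.
    matches = sum(p == w for p, w in zip(pcr_calls, wgs_calls))
    return 100.0 * matches / len(pcr_calls)


# Hypothetical toy calls for five samples (illustrative only).
pcr = ["Alpha", "Delta", "Omicron", "Delta", "Omicron"]
wgs = ["Alpha", "Delta", "Omicron", "Alpha", "Omicron"]
print(f"{percent_agreement(pcr, wgs):.1f}%")  # 4 of 5 calls match
```

Raw percent agreement does not correct for agreement expected by chance; a chance-corrected statistic would be a different calculation from the one sketched here.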

%B Epidemiol Infect %V 151 %P e201 %8 2023 Nov 24 %G eng %R 10.1017/S095026882300184X %0 Journal Article %J Med Clin (Barc) %D 2023 %T Evidence of the association between increased use of direct oral anticoagulants and a reduction in the rate of atrial fibrillation-related stroke and major bleeding at the population level (2012-2019). %A Loucera, Carlos %A Carmona, Rosario %A Bostelmann, Gerrit %A Muñoyerro-Muñiz, Dolores %A Villegas, Román %A Gonzalez-Manzanares, Rafael %A Dopazo, Joaquin %A Anguita, Manuel %X

BACKGROUND: The introduction of direct-acting oral anticoagulants (DOACs) has been shown to decrease atrial fibrillation (AF)-related stroke and bleeding rates in clinical studies, but there is no conclusive evidence about their effects at the population level. Our aim was to assess changes in AF-related stroke and major bleeding rates between 2012 and 2019 in Andalusia (Spain), and the association between DOAC use and event rates at the population level.

METHODS: All patients with an AF diagnosis from 2012 to 2019 were identified using the Andalusian Health Population Base, which provides clinical information on all Andalusian people. Annual rates of ischemic and hemorrhagic stroke and of major bleeding, as well as the antithrombotic treatments used, were determined. Marginal hazard ratios (HR) were calculated for each treatment.

RESULTS: A total of 95,085 patients with an AF diagnosis were identified. Mean age was 76.1±10.2 years (49.7% women). An increase in the use of DOACs was observed throughout the study period in both males and females (p<0.001). The annual rate of ischemic stroke decreased by one third, while that of hemorrhagic stroke and major bleeding decreased 2-3-fold from 2012 to 2019. Marginal HR was lower than 0.50 for DOACs compared to VKA for all ischemic or hemorrhagic events.

CONCLUSIONS: In this contemporary population-based study using clinical and administrative databases in Andalusia, a significant reduction in the incidence of AF-related ischemic and hemorrhagic stroke and major bleeding was observed between 2012 and 2019. The increased use of DOACs seems to be associated with this reduction.

%B Med Clin (Barc) %8 2023 Nov 20 %G eng %R 10.1016/j.medcli.2023.10.008 %0 Journal Article %J Commun Biol %D 2023 %T Metabolic reprogramming by Acly inhibition using SB-204990 alters glucoregulation and modulates molecular mechanisms associated with aging. %A Sola-García, Alejandro %A Cáliz-Molina, María Ángeles %A Espadas, Isabel %A Petr, Michael %A Panadero-Morón, Concepción %A González-Morán, Daniel %A Martín-Vázquez, María Eugenia %A Narbona-Pérez, Álvaro Jesús %A López-Noriega, Livia %A Martínez-Corrales, Guillermo %A López-Fernández-Sobrino, Raúl %A Carmona-Marin, Lina M %A Martínez-Force, Enrique %A Yanes, Oscar %A Vinaixa, Maria %A López-López, Daniel %A Reyes, José Carlos %A Dopazo, Joaquin %A Martín, Franz %A Gauthier, Benoit R %A Scheibye-Knudsen, Morten %A Capilla-González, Vivian %A Martín-Montalvo, Alejandro %X

ATP-citrate lyase is a central integrator of cellular metabolism in the interface of protein, carbohydrate, and lipid metabolism. The physiological consequences as well as the molecular mechanisms orchestrating the response to long-term pharmacologically induced Acly inhibition are unknown. We report here that the Acly inhibitor SB-204990 improves metabolic health and physical strength in wild-type mice fed a high-fat diet, whereas in mice fed a healthy diet it results in metabolic imbalance and moderate insulin resistance. By applying a multiomic approach using untargeted metabolomics, transcriptomics, and proteomics, we determined that, in vivo, SB-204990 plays a role in the regulation of molecular mechanisms associated with aging, such as energy metabolism, mitochondrial function, mTOR signaling, and folate cycle, while global alterations on histone acetylation are absent. Our findings indicate a mechanism for regulating molecular pathways of aging that prevents the development of metabolic abnormalities associated with unhealthy dieting. This strategy might be explored for devising therapeutic approaches to prevent metabolic diseases.

%B Commun Biol %V 6 %P 250 %8 2023 Mar 08 %G eng %N 1 %R 10.1038/s42003-023-04625-4 %0 Journal Article %J Aging Cell %D 2023 %T microRNAs-mediated regulation of insulin signaling in white adipose tissue during aging: Role of caloric restriction. %A Corrales, Patricia %A Martin-Taboada, Marina %A Vivas-García, Yurena %A Torres, Lucia %A Ramirez-Jimenez, Laura %A Lopez, Yamila %A Horrillo, Daniel %A Vila-Bedmar, Rocio %A Barber-Cano, Eloisa %A Izquierdo-Lahuerta, Adriana %A Peña-Chilet, Maria %A Martínez, Carmen %A Dopazo, Joaquin %A Ros, Manuel %A Medina-Gomez, Gema %X

Caloric restriction is a non-pharmacological intervention known to ameliorate the metabolic defects associated with aging, including insulin resistance. The levels of miRNA expression may represent a predictive tool for aging-related alterations. In order to investigate the role of miRNAs underlying insulin resistance in adipose tissue during the early stages of aging, 3- and 12-month-old male animals fed ad libitum, and 12-month-old male animals fed with a 20% caloric restricted diet were used. In this work we demonstrate that specific miRNAs may contribute to the impaired insulin-stimulated glucose metabolism specifically in the subcutaneous white adipose tissue, through the regulation of target genes implicated in the insulin signaling cascade. Moreover, the expression of these miRNAs is modified by caloric restriction in middle-aged animals, in accordance with the improvement of the metabolic state. Overall, our work demonstrates that alterations in posttranscriptional gene expression caused by miRNA dysregulation might represent an endogenous mechanism by which insulin response in the subcutaneous fat depot is already affected at middle age. Importantly, caloric restriction could prevent this modulation, demonstrating that certain miRNAs could constitute potential biomarkers of age-related metabolic alterations.

%B Aging Cell %P e13919 %8 2023 Jul 04 %G eng %R 10.1111/acel.13919 %0 Journal Article %J Environ Pollut %D 2023 %T Polystyrene nanoplastics affect transcriptomic and epigenomic signatures of human fibroblasts and derived induced pluripotent stem cells: Implications for human health. %A Stojkovic, Miodrag %A Ortuño Guzmán, Francisco Manuel %A Han, Dongjun %A Stojkovic, Petra %A Dopazo, Joaquin %A Stankovic, Konstantina M %X

Plastic pollution is increasing at an alarming rate, yet the impact of this pollution on human health is poorly understood. Because human induced pluripotent stem cells (hiPSC) are frequently derived from dermal fibroblasts, these cells offer a powerful platform for the identification of molecular biomarkers of environmental pollution in human cells. Here, we describe a novel proof-of-concept for deriving hiPSC from human dermal fibroblasts deliberately exposed to polystyrene (PS) nanoplastic particles; unexposed hiPSC served as controls. In parallel, unexposed hiPSC were exposed to low and high concentrations of PS nanoparticles. Transcriptomic and epigenomic signatures of all fibroblasts and hiPSCs were defined using RNA-seq and whole genome methyl-seq, respectively. Both PS-treated fibroblasts and derived hiPSC showed alterations in expression of ESRRB and HNF1A genes and circuits involved in the pluripotency of stem cells, as well as in pathways involved in cancer, inflammatory disorders, gluconeogenesis, carbohydrate metabolism, innate immunity, and dopaminergic synapse. Similarly, the expression levels of identified key transcriptional and DNA methylation changes (DNMT3A, ESRRB, FAM133CP, HNF1A, SEPTIN7P8, and TTC34) were significantly affected in both PS-exposed fibroblasts and hiPSC. This study illustrates the power of human cellular models of environmental pollution to narrow down and prioritize the list of candidate molecular biomarkers of environmental pollution. This knowledge will facilitate the deciphering of the origins of environmental diseases.

%B Environ Pollut %P 120849 %8 2022 Dec 09 %G eng %R 10.1016/j.envpol.2022.120849 %0 Journal Article %J Cell Death Discov %D 2023 %T Rapid degeneration of iPSC-derived motor neurons lacking Gdap1 engages a mitochondrial-sustained innate immune response. %A León, Marian %A Prieto, Javier %A Molina-Navarro, María Micaela %A Garcia-Garcia, Francisco %A Barneo-Muñoz, Manuela %A Ponsoda, Xavier %A Sáez, Rosana %A Palau, Francesc %A Dopazo, Joaquin %A Izpisua Belmonte, Juan Carlos %A Torres, Josema %X

Charcot-Marie-Tooth disease is a chronic hereditary motor and sensory polyneuropathy targeting Schwann cells and/or motor neurons. Its multifactorial and polygenic origin portrays a complex clinical phenotype of the disease with a wide range of genetic inheritance patterns. The disease-associated gene GDAP1 encodes a mitochondrial outer membrane protein. Mouse and insect models with mutations in Gdap1 have reproduced several traits of the human disease. However, the precise function in the cell types affected by the disease remains unknown. Here, we use induced-pluripotent stem cells derived from a Gdap1 knockout mouse model to better understand the molecular and cellular phenotypes of the disease caused by the loss-of-function of this gene. Gdap1-null motor neurons display a fragile cell phenotype prone to early degeneration showing (1) altered mitochondrial morphology, with an increase in the fragmentation of these organelles, (2) activation of autophagy and mitophagy, (3) abnormal metabolism, characterized by a downregulation of Hexokinase 2 and ATP5b proteins, (4) increased reactive oxygen species and elevated mitochondrial membrane potential, and (5) increased innate immune response and p38 MAP kinase activation. Our data reveal the existence of an underlying Redox-inflammatory axis fueled by altered mitochondrial metabolism in the absence of Gdap1. As this biochemical axis encompasses a wide variety of druggable targets, our results may have implications for developing therapies using combinatorial pharmacological approaches, thereby improving human welfare.

A Redox-immune axis underlying motor neuron degeneration caused by the absence of Gdap1. Our results show that Gdap1 motor neurons have a fragile cellular phenotype that is prone to degeneration. Gdap1 iPSCs differentiated into motor neurons showed an altered metabolic state: decreased glycolysis and increased OXPHOS. These alterations may lead to hyperpolarization of mitochondria and increased ROS levels. Excessive amounts of ROS might be the cause of increased mitophagy, p38 activation and inflammation as a cellular response to oxidative stress. The p38 MAPK pathway and the immune response may, in turn, have feedback mechanisms, leading to the induction of apoptosis and senescence, respectively. CAC, citric acid cycle; ETC, electron transport chain; Glc, glucose; Lac, lactate; Pyr, pyruvate.

%B Cell Death Discov %V 9 %P 217 %8 2023 Jul 01 %G eng %N 1 %R 10.1038/s41420-023-01531-w %0 Journal Article %J Virol J %D 2023 %T Real-world evidence with a retrospective cohort of 15,968 COVID-19 hospitalized patients suggests 21 new effective treatments. %A Loucera, Carlos %A Carmona, Rosario %A Esteban-Medina, Marina %A Bostelmann, Gerrit %A Muñoyerro-Muñiz, Dolores %A Villegas, Román %A Peña-Chilet, Maria %A Dopazo, Joaquin %X

PURPOSE: Despite the extensive vaccination campaigns in many countries, COVID-19 is still a major worldwide health problem because of its associated morbidity and mortality. Therefore, finding efficient treatments as fast as possible is a pressing need. Drug repurposing constitutes a convenient alternative when the need for new drugs in an unexpected medical scenario is urgent, as is the case with COVID-19.

METHODS: Using data from a central registry of electronic health records (the Andalusian Population Health Database), the effect of drugs taken for other indications prior to hospitalization on patient outcomes, including survival and lymphocyte progression, was studied in a retrospective cohort of 15,968 individuals, comprising all COVID-19 patients hospitalized in Andalusia between January and November 2020.

RESULTS: Covariate-adjusted hazard ratios and analysis of lymphocyte progression curves support a significant association between consumption of 21 different drugs and better patient survival. In contrast, one drug, furosemide, was associated with a significant increase in patient mortality.

CONCLUSIONS: In this study we have taken advantage of the availability of a regional clinical database to study the effect of drugs, which patients were taking for other indications, on their survival. The large size of the database allowed us to control covariates effectively.

%B Virol J %V 20 %P 226 %8 2023 Oct 06 %G eng %N 1 %R 10.1186/s12985-023-02195-9 %0 Journal Article %J Nature %D 2023 %T A second update on mapping the human genetic architecture of COVID-19. %K COVID-19 %K Human Genetics %K Humans %B Nature %V 621 %P E7-E26 %8 2023 Sep %G eng %N 7977 %R 10.1038/s41586-023-06355-3 %0 Journal Article %J Biology (Basel) %D 2023 %T SigPrimedNet: A Signaling-Informed Neural Network for scRNA-seq Annotation of Known and Unknown Cell Types. %A Gundogdu, Pelin %A Alamo, Inmaculada %A Nepomuceno-Chamorro, Isabel A %A Dopazo, Joaquin %A Loucera, Carlos %X

Single-cell RNA sequencing is increasing our understanding of the behavior of complex tissues or organs, by providing unprecedented details on the complex cell type landscape at the level of individual cells. Cell type definition and functional annotation are key steps to understanding the molecular processes behind the underlying cellular communication machinery. However, the exponential growth of scRNA-seq data has made the task of manually annotating cells unfeasible, due not only to the unparalleled resolution of the technology but also to the ever-increasing heterogeneity of the data. Many supervised and unsupervised methods have been proposed to automatically annotate cells. Supervised approaches for cell-type annotation outperform unsupervised methods except when new (unknown) cell types are present. Here, we introduce SigPrimedNet, an artificial neural network approach that leverages (i) efficient training by means of a sparsity-inducing signaling circuits-informed layer, (ii) feature representation learning through supervised training, and (iii) unknown cell-type identification by fitting an anomaly detection method on the learned representation. We show that SigPrimedNet can efficiently annotate known cell types while keeping a low false-positive rate for unseen cells across a set of publicly available datasets. In addition, the learned representation acts as a proxy for signaling circuit activity measurements, which provide useful estimations of the cell functionalities.

%B Biology (Basel) %V 12 %8 2023 Apr 10 %G eng %N 4 %R 10.3390/biology12040579 %0 Journal Article %J Front Bioinform %D 2023 %T Visualization of automatically combined disease maps and pathway diagrams for rare diseases. %A Gawron, Piotr %A Hoksza, David %A Piñero, Janet %A Peña-Chilet, Maria %A Esteban-Medina, Marina %A Fernandez-Rueda, Jose Luis %A Colonna, Vincenza %A Smula, Ewa %A Heirendt, Laurent %A Ancien, François %A Grouès, Valentin %A Satagopam, Venkata P %A Schneider, Reinhard %A Dopazo, Joaquin %A Furlong, Laura I %A Ostaszewski, Marek %X

Investigation of the molecular mechanisms of human disorders, especially rare diseases, requires exploration of various knowledge repositories for building precise hypotheses and complex data interpretation. An increasing number of resources now offer diagrammatic representations of such mechanisms, including disease-dedicated schematics in pathway databases and disease maps. However, collecting knowledge across them is challenging, especially for research projects with limited manpower. In this article we present an automated workflow for the construction of maps of molecular mechanisms for rare diseases. The workflow requires a standardized definition of a disease using Orphanet or HPO identifiers to collect relevant genes and variants, and to assemble a functional, visual repository of related mechanisms, including data overlays. The diagrams composing the final map are unified to a common systems biology format from CellDesigner SBML, GPML and SBML+layout+render. The constructed resource contains disease-relevant genes and variants as data overlays for immediate visual exploration, including an embedded genetic variant browser and protein structure viewer. We demonstrate the functionality of our workflow on two examples of rare diseases: Kawasaki disease and retinitis pigmentosa. Two maps are constructed based on their corresponding identifiers. Moreover, for the retinitis pigmentosa use case, we include a list of differentially expressed genes to demonstrate how to tailor the workflow using omics datasets. In summary, our work allows for ad hoc construction of molecular diagrams combined from different sources, preserving their layout and graphical style while integrating them into a single resource. This reduces the time-consuming task of prototyping a molecular disease map, enabling visual exploration, hypothesis building, data visualization and further refinement.
The code of the workflow is open and accessible at https://gitlab.lcsb.uni.lu/minerva/automap/.

%B Front Bioinform %V 3 %P 1101505 %8 2023 %G eng %R 10.3389/fbinf.2023.1101505 %0 Journal Article %J Viruses %D 2022 %T Assessing the Impact of SARS-CoV-2 Lineages and Mutations on Patient Survival. %A Loucera, Carlos %A Perez-Florido, Javier %A Casimiro-Soriguer, Carlos S %A Ortuno, Francisco M %A Carmona, Rosario %A Bostelmann, Gerrit %A Martínez-González, L Javier %A Muñoyerro-Muñiz, Dolores %A Villegas, Román %A Rodríguez-Baño, Jesús %A Romero-Gómez, Manuel %A Lorusso, Nicola %A Garcia-León, Javier %A Navarro-Marí, Jose M %A Camacho-Martinez, Pedro %A Merino-Diaz, Laura %A Salazar, Adolfo de %A Viñuela, Laura %A Lepe, Jose A %A García, Federico %A Dopazo, Joaquin %K COVID-19 %K Genome, Viral %K Humans %K mutation %K Pandemics %K Phylogeny %K SARS-CoV-2 %X

OBJECTIVES: More than two years into the COVID-19 pandemic, SARS-CoV-2 remains a global public health problem. Successive waves of infection have produced new SARS-CoV-2 variants with new mutations whose impact on COVID-19 severity and patient survival is uncertain.

METHODS: A total of 764 SARS-CoV-2 genomes, sequenced from COVID-19 patients hospitalized from 19 February 2020 to 30 April 2021, along with their clinical data, were used for survival analysis.

RESULTS: A significant association of B.1.1.7, the alpha lineage, with patient mortality (log hazard ratio (LHR) = 0.51, C.I. = [0.14,0.88]) was found upon adjustment by all the covariates known to affect COVID-19 prognosis. Moreover, survival analysis of mutations in the SARS-CoV-2 genome revealed 27 of them were significantly associated with higher mortality of patients. Most of these mutations were located in the genes coding for the S, ORF8, and N proteins.
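
The log hazard ratio reported above can be put on the more familiar hazard-ratio scale by exponentiating the estimate and its interval bounds. The following is a minimal sketch of that arithmetic only; the function name is hypothetical and nothing here reproduces the authors' survival analysis.

```python
import math

def log_hr_to_hr(log_hr, ci_low, ci_high):
    """Exponentiate a log hazard ratio and its confidence bounds."""
    return math.exp(log_hr), math.exp(ci_low), math.exp(ci_high)

# Values quoted in the abstract for lineage B.1.1.7: LHR = 0.51, C.I. = [0.14, 0.88]
hr, hr_low, hr_high = log_hr_to_hr(0.51, 0.14, 0.88)
print(f"HR = {hr:.2f} (C.I. {hr_low:.2f}-{hr_high:.2f})")  # HR = 1.67 (C.I. 1.15-2.41)
```

Because the lower bound stays above 1 after exponentiation, the interval is consistent with the reported association between this lineage and higher mortality.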

CONCLUSIONS: This study illustrates how a combination of genomic and clinical data can provide solid evidence for the impact of viral lineage on patient survival.

%B Viruses %V 14 %8 2022 Aug 27 %G eng %N 9 %R 10.3390/v14091893 %0 Journal Article %J Clin Genet %D 2022 %T CIBERER: Spanish National Network for Research on Rare Diseases: a highly productive collaborative initiative. %A Luque, Juan %A Mendes, Ingrid %A Gómez, Beatriz %A Morte, Beatriz %A de Heredia, Miguel López %A Herreras, Enrique %A Corrochano, Virginia %A Bueren, Juan %A Gallano, Pia %A Artuch, Rafael %A Fillat, Cristina %A Pérez-Jurado, Luis A %A Montoliu, Lluis %A Carracedo, Ángel %A Millán, José M %A Webb, Susan M %A Palau, Francesc %A Lapunzina, Pablo %X

CIBER (Center for Biomedical Network Research; Centro de Investigación Biomédica En Red) is a public national consortium created in 2006 under the umbrella of the Spanish National Institute of Health Carlos III (ISCIII). This innovative research structure comprises 11 different specific areas dedicated to the main public health priorities in the National Health System. CIBERER, the thematic area of CIBER focused on Rare Diseases, currently consists of 75 research groups belonging to universities, research centers and hospitals of the entire country. CIBERER's mission is to be a center prioritizing and favoring collaboration and cooperation between biomedical and clinical research groups, with special emphasis on the aspects of genetic, molecular, biochemical and cellular research of rare diseases. This research is the basis for providing new tools for the diagnosis and therapy of low-prevalence diseases, in line with the International Rare Diseases Research Consortium (IRDiRC) objectives, thus favoring translational research between the scientific environment of the laboratory and the clinical setting of health centers. In this paper, we review CIBERER's 15-year journey and summarize the main results obtained in terms of internationalization, scientific production, contributions towards the discovery of new therapies and novel genes associated with diseases, cooperation with patients' associations and many other topics related to rare disease research.

%B Clin Genet %8 2022 Jan 20 %G eng %R 10.1111/cge.14113 %0 Journal Article %J Hum Mol Genet %D 2022 %T Discovering potential interactions between rare diseases and COVID-19 by combining mechanistic models of viral infection with statistical modeling. %A López-Sánchez, Macarena %A Loucera, Carlos %A Peña-Chilet, Maria %A Dopazo, Joaquin %X

Recent studies have demonstrated a relevant role of host genetics in COVID-19 prognosis. Most of the 7000 rare diseases described to date have a genetic component, typically highly penetrant. However, this vast spectrum of genetic variability remains unexplored with respect to possible interactions with COVID-19. Here, a mathematical mechanistic model of the COVID-19 molecular disease mechanism has been used to detect potential interactions between rare disease genes and the COVID-19 infection process and downstream consequences. Out of the 2518 disease genes analyzed, causative of 3854 rare diseases, a total of 254 genes have a direct effect on the COVID-19 molecular disease mechanism and 207 have an indirect effect revealed by a significant strong correlation. This remarkable potential for interaction occurs for more than 300 rare diseases. Mechanistic modeling of the COVID-19 disease map has allowed a holistic, systematic analysis of the potential interactions between the loss of function in known rare disease genes and the pathological consequences of COVID-19 infection. The results identify links between disease genes and COVID-19 hallmarks and demonstrate the usefulness of the proposed approach for future preventive measures in some rare diseases.

%B Hum Mol Genet %8 2022 Jan 12 %G eng %R 10.1093/hmg/ddac007 %0 Journal Article %J Int J Mol Sci %D 2022 %T Endoglin and MMP14 Contribute to Ewing Sarcoma Spreading by Modulation of Cell-Matrix Interactions. %A Puerto-Camacho, Pilar %A Diaz-Martin, Juan %A Olmedo-Pelayo, Joaquín %A Bolado-Carrancio, Alfonso %A Salguero-Aranda, Carmen %A Jordán-Pérez, Carmen %A Esteban-Medina, Marina %A Alamo-Alvarez, Inmaculada %A Delgado-Bellido, Daniel %A Lobo-Selma, Laura %A Dopazo, Joaquin %A Sastre, Ana %A Alonso, Javier %A Grünewald, Thomas G P %A Bernabeu, Carmelo %A Byron, Adam %A Brunton, Valerie G %A Amaral, Ana Teresa %A de Alava, Enrique %K Bone Neoplasms %K Endoglin %K Humans %K Matrix Metalloproteinase 14 %K Proteomics %K Receptors, Growth Factor %K Sarcoma, Ewing %K Signal Transduction %X

Endoglin (ENG) is a mesenchymal stem cell (MSC) marker typically expressed by active endothelium. This transmembrane glycoprotein is shed by matrix metalloproteinase 14 (MMP14). Our previous work demonstrated potent preclinical activity of first-in-class anti-ENG antibody-drug conjugates as a nascent strategy to eradicate Ewing sarcoma (ES), a devastating rare bone/soft tissue cancer with a putative MSC origin. We also defined a correlation between ENG and MMP14 expression in ES. Herein, we show that ENG expression is significantly associated with a dismal prognosis in a large cohort of ES patients. Moreover, both ENG and MMP14 are frequently expressed in primary ES tumors and metastases. To further explore their functional relevance in ES, we conducted transcriptomic and proteomic profiling of in vitro ES models, which unveiled a key role of ENG and MMP14 in cell mechano-transduction. Migration and adhesion assays confirmed that loss of ENG disrupts actin filament assembly and filopodia formation, with a concomitant effect on cell spreading. Furthermore, we observed that ENG regulates cell-matrix interaction through activation of focal adhesion signaling and protein kinase C expression. In turn, loss of MMP14 contributed to a more adhesive phenotype of ES cells by modulating the transcriptional extracellular matrix dynamics. Overall, these results suggest that ENG and MMP14 exert a significant role in mediating correct spreading machinery of ES cells, impacting the aggressiveness of the disease.

%B Int J Mol Sci %V 23 %8 2022 Aug 04 %G eng %N 15 %R 10.3390/ijms23158657 %0 Journal Article %J Arch Bronconeumol %D 2022 %T Incidence and Prevalence of Children's Diffuse Lung Disease in Spain. %A Torrent-Vernetta, Alba %A Gaboli, Mirella %A Castillo-Corullón, Silvia %A Mondéjar-López, Pedro %A Sanz Santiago, Verónica %A Costa-Colomer, Jordi %A Osona, Borja %A Torres-Borrego, Javier %A de la Serna-Blázquez, Olga %A Bellón Alonso, Sara %A Caro Aguilera, Pilar %A Gimeno-Díaz de Atauri, Álvaro %A Valenzuela Soria, Alfredo %A Ayats, Roser %A Martin de Vicente, Carlos %A Velasco González, Valle %A Moure González, José Domingo %A Canino Calderín, Elisa María %A Pastor-Vivero, María Dolores %A Villar Álvarez, María Ángeles %A Rovira-Amigo, Sandra %A Iglesias Serrano, Ignacio %A Díez Izquierdo, Ana %A de Mir Messa, Inés %A Gartner, Silvia %A Navarro, Alexandra %A Baz-Redón, Noelia %A Carmona, Rosario %A Camats-Tarruella, Núria %A Fernández-Cancio, Mónica %A Rapp, Christina %A Dopazo, Joaquin %A Griese, Matthias %A Moreno-Galdó, Antonio %X

BACKGROUND: Children's diffuse lung diseases, also known as children's interstitial lung diseases (chILD), are a heterogeneous group of rare diseases with relevant morbidity and mortality, whose diagnosis and classification are very complex. Epidemiological data are scarce. The aim of this study was to analyse the incidence and prevalence of chILD in Spain.

METHODS: Multicentre observational prospective study in patients from 0 to 18 years of age with chILD to analyse its incidence and prevalence in Spain, based on data reported in 2018 and 2019.

RESULTS: A total of 381 cases of chILD were notified from 51 paediatric pulmonology units all over Spain, covering 91.7% of the paediatric population. The average incidence of chILD was 8.18 (95% CI 6.28-10.48) new cases per million children per year. The average prevalence of chILD was 46.53 (95% CI 41.81-51.62) cases per million children. The age group with the highest prevalence was children under 1 year of age. Different types of disorders were seen in children 2-18 years of age compared with children 0-2 years of age. The most frequent diagnoses were: primary pulmonary interstitial glycogenosis in neonates (17/65), neuroendocrine cell hyperplasia of infancy in infants from 1 to 12 months (44/144), idiopathic pulmonary haemosiderosis in children from 1 to 5 years old (13/74), hypersensitivity pneumonitis in children from 5 to 10 years old (9/51), and scleroderma in children older than 10 years (8/47).

CONCLUSIONS: We found a higher incidence and prevalence of chILD than previously described, probably due to greater understanding and increased clinician awareness of these rare diseases.

%B Arch Bronconeumol %V 58 %P 22-29 %8 2022 Jan %G eng %N 1 %R 10.1016/j.arbres.2021.06.001 %0 Journal Article %J BioData Min %D 2022 %T Integrating pathway knowledge with deep neural networks to reduce the dimensionality in single-cell RNA-seq data. %A Gundogdu, Pelin %A Loucera, Carlos %A Alamo-Alvarez, Inmaculada %A Dopazo, Joaquin %A Nepomuceno, Isabel %X

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) data provide valuable insights into cellular heterogeneity, which is significantly improving the current knowledge on biology and human disease. One of the main applications of scRNA-seq data analysis is the identification of new cell types and cell states. Deep neural networks (DNNs) are among the best methods to address this problem. However, this performance comes at the cost of a lack of interpretability in the results. In this work we propose an intelligible pathway-driven neural network to correctly solve cell-type related problems at single-cell resolution while providing a biologically meaningful representation of the data.

RESULTS: In this study, we explored deep neural networks constrained by several types of prior biological information, e.g. signaling pathway information, as a way to reduce the dimensionality of scRNA-seq data. We tested the proposed biologically-based architectures on thousands of cells of human and mouse origin across a collection of public datasets in order to check the performance of the model. Specifically, we tested the architecture across different validation scenarios that try to mimic how unknown cell types are clustered by the DNN and how it correctly annotates cell types by querying a database in a retrieval problem. Moreover, our approach proved comparable to other, less interpretable DNN approaches constrained by protein-protein interaction and gene regulation data. Finally, we show how the latent structure learned by the network could be used to visualize and to interpret the composition of human single cell datasets.

CONCLUSIONS: Here we demonstrate how the integration of pathways, which convey fundamental information on functional relationships between genes, with DNNs, which provide an excellent classification framework, results in an excellent alternative for learning a biologically meaningful representation of scRNA-seq data. In addition, the introduction of prior biological knowledge in the DNN reduces the size of the network architecture. Comparative results demonstrate a superior performance of this approach with respect to other similar approaches. As an additional advantage, the use of pathways within the DNN structure enables easy interpretability of the results by connecting features to cell functionalities by means of the pathway nodes, as demonstrated with an example with human melanoma tumor cells.
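
The idea of constraining a network layer with pathway membership, so that each hidden node corresponds to one pathway and only receives input from its member genes, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the architecture from the paper: the gene and pathway names, the random weights and the `pathway_layer` function are all hypothetical.

```python
import math
import random

# Hypothetical toy annotation: 4 genes, 2 pathways (assumed for illustration only)
genes = ["G1", "G2", "G3", "G4"]
pathways = {"P1": {"G1", "G2"}, "P2": {"G2", "G3", "G4"}}

# Binary mask: mask[i][j] is 1.0 when gene i belongs to pathway j, else 0.0
mask = [[1.0 if g in members else 0.0 for members in pathways.values()] for g in genes]

random.seed(0)
weights = [[random.gauss(0, 1) for _ in pathways] for _ in genes]

def pathway_layer(cell, weights, mask):
    """One masked dense layer: each pathway node only sees its member genes."""
    n_pathways = len(mask[0])
    return [
        math.tanh(sum(cell[i] * weights[i][j] * mask[i][j] for i in range(len(cell))))
        for j in range(n_pathways)
    ]

cell = [0.5, 1.2, 0.0, 2.0]  # expression of G1..G4 in one cell
activity = pathway_layer(cell, weights, mask)
print(dict(zip(pathways, activity)))
```

The mask removes every gene-to-pathway connection not supported by the annotation, which is what shrinks the layer and lets each output be read as a pathway-level score; in this sketch four gene values are reduced to two pathway values.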

%B BioData Min %V 15 %P 1 %8 2022 Jan 03 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/34980200?dopt=Abstract %R 10.1186/s13040-021-00285-4 %0 Journal Article %J Hum Mol Genet %D 2022 %T Novel genes and sex differences in COVID-19 severity. %A Cruz, Raquel %A Almeida, Silvia Diz-de %A Heredia, Miguel López %A Quintela, Inés %A Ceballos, Francisco C %A Pita, Guillermo %A Lorenzo-Salazar, José M %A González-Montelongo, Rafaela %A Gago-Domínguez, Manuela %A Porras, Marta Sevilla %A Castaño, Jair Antonio Tenorio %A Nevado, Julián %A Aguado, Jose María %A Aguilar, Carlos %A Aguilera-Albesa, Sergio %A Almadana, Virginia %A Almoguera, Berta %A Alvarez, Nuria %A Andreu-Bernabeu, Álvaro %A Arana-Arri, Eunate %A Arango, Celso %A Arranz, María J %A Artiga, Maria-Jesus %A Baptista-Rosas, Raúl C %A Barreda-Sánchez, María %A Belhassen-Garcia, Moncef %A Bezerra, Joao F %A Bezerra, Marcos A C %A Boix-Palop, Lucía %A Brión, Maria %A Brugada, Ramón %A Bustos, Matilde %A Calderón, Enrique J %A Carbonell, Cristina %A Castano, Luis %A Castelao, Jose E %A Conde-Vicente, Rosa %A Cordero-Lorenzana, M Lourdes %A Cortes-Sanchez, Jose L %A Corton, Marta %A Darnaude, M Teresa %A De Martino-Rodríguez, Alba %A Campo-Pérez, Victor %A Bustamante, Aranzazu Diaz %A Domínguez-Garrido, Elena %A Luchessi, André D %A Eirós, Rocío %A Sanabria, Gladys Mercedes Estigarribia %A Fariñas, María Carmen %A Fernández-Robelo, Uxía %A Fernández-Rodríguez, Amanda %A Fernández-Villa, Tania %A Gil-Fournier, Belén %A Gómez-Arrue, Javier %A Álvarez, Beatriz González %A Quirós, Fernan Gonzalez Bernaldo %A González-Peñas, Javier %A Gutiérrez-Bautista, Juan F %A Herrero, María José %A Herrero-Gonzalez, Antonio %A Jimenez-Sousa, María A %A Lattig, María Claudia %A Borja, Anabel Liger %A Lopez-Rodriguez, Rosario %A Mancebo, Esther %A Martín-López, Caridad %A Martín, Vicente %A Martinez-Nieto, Oscar %A Martinez-Lopez, Iciar %A Martinez-Resendez, Michel F %A Martinez-Perez, Ángel %A Mazzeu, Juliana A %A Macías, 
Eleuterio Merayo %A Minguez, Pablo %A Cuerda, Victor Moreno %A Silbiger, Vivian N %A Oliveira, Silviene F %A Ortega-Paino, Eva %A Parellada, Mara %A Paz-Artal, Estela %A Santos, Ney P C %A Pérez-Matute, Patricia %A Perez, Patricia %A Pérez-Tomás, M Elena %A Perucho, Teresa %A Pinsach-Abuin, Mel Lina %A Pompa-Mera, Ericka N %A Porras-Hurtado, Gloria L %A Pujol, Aurora %A León, Soraya Ramiro %A Resino, Salvador %A Fernandes, Marianne R %A Rodríguez-Ruiz, Emilio %A Rodriguez-Artalejo, Fernando %A Rodriguez-Garcia, José A %A Ruiz-Cabello, Francisco %A Ruiz-Hornillos, Javier %A Ryan, Pablo %A Soria, José Manuel %A Souto, Juan Carlos %A Tamayo, Eduardo %A Tamayo-Velasco, Alvaro %A Taracido-Fernandez, Juan Carlos %A Teper, Alejandro %A Torres-Tobar, Lilian %A Urioste, Miguel %A Valencia-Ramos, Juan %A Yáñez, Zuleima %A Zarate, Ruth %A Nakanishi, Tomoko %A Pigazzini, Sara %A Degenhardt, Frauke %A Butler-Laporte, Guillaume %A Maya-Miles, Douglas %A Bujanda, Luis %A Bouysran, Youssef %A Palom, Adriana %A Ellinghaus, David %A Martínez-Bueno, Manuel %A Rolker, Selina %A Amitrano, Sara %A Roade, Luisa %A Fava, Francesca %A Spinner, Christoph D %A Prati, Daniele %A Bernardo, David %A García, Federico %A Darcis, Gilles %A Fernández-Cadenas, Israel %A Holter, Jan Cato %A Banales, Jesus M %A Frithiof, Robert %A Duga, Stefano %A Asselta, Rosanna %A Pereira, Alexandre C %A Romero-Gómez, Manuel %A Nafría-Jiménez, Beatriz %A Hov, Johannes R %A Migeotte, Isabelle %A Renieri, Alessandra %A Planas, Anna M %A Ludwig, Kerstin U %A Buti, Maria %A Rahmouni, Souad %A Alarcón-Riquelme, Marta E %A Schulte, Eva C %A Franke, Andre %A Karlsen, Tom H %A Valenti, Luca %A Zeberg, Hugo %A Richards, Brent %A Ganna, Andrea %A Boada, Mercè %A Rojas, Itziar %A Ruiz, Agustín %A Sánchez, Pascual %A Real, Luis Miguel %A Guillén-Navarro, Encarna %A Ayuso, Carmen %A González-Neira, Anna %A Riancho, José A %A Rojas-Martinez, Augusto %A Flores, Carlos %A Lapunzina, Pablo %A Carracedo, Ángel %X

Here we describe the results of a genome-wide study conducted in 11 939 COVID-19 positive cases with extensive clinical information who were recruited from 34 hospitals across Spain (SCOURGE consortium). In sex-disaggregated genome-wide association studies for COVID-19 hospitalization, genome-wide significance (p < 5 × 10⁻⁸) was crossed for variants in 3p21.31 and 21q22.11 loci only among males (p = 1.3 × 10⁻²² and p = 8.1 × 10⁻¹², respectively), and for variants in 9q21.32 near TLE1 only among females (p = 4.4 × 10⁻⁸). In a second phase, results were combined with an independent Spanish cohort (1598 COVID-19 cases and 1068 population controls), revealing in the overall analysis two novel risk loci in 9p13.3 and 19q13.12, with fine-mapping prioritized variants functionally associated with AQP3 (p = 2.7 × 10⁻⁸) and ARHGAP33 (p = 1.3 × 10⁻⁸), respectively. The meta-analysis of both phases with four European studies stratified by sex from the Host Genetics Initiative confirmed the association of the 3p21.31 and 21q22.11 loci predominantly in males and replicated a recently reported variant in 11p13 (ELF5, p = 4.1 × 10⁻⁸). Six of the COVID-19 HGI discovered loci were replicated and an HGI-based genetic risk score predicted the severity strata in SCOURGE. We also found more SNP-heritability and larger heritability differences by age (<60 or ≥ 60 years) among males than among females. Parallel genome-wide screening of inbreeding depression in SCOURGE also showed an effect of homozygosity in COVID-19 hospitalization and severity and this effect was stronger among older males. In summary, new candidate genes for COVID-19 severity and evidence supporting genetic disparities among sexes are provided.

%B Hum Mol Genet %8 2022 Jun 16 %G eng %R 10.1093/hmg/ddac132 %0 Journal Article %J Sci Rep %D 2022 %T Protein and functional isoform levels and genetic variants of the BAFF and APRIL pathway components in systemic lupus erythematosus. %A Ortiz-Aljaro, Pilar %A Montes-Cano, Marco Antonio %A García-Lozano, José-Raúl %A Aquino, Virginia %A Carmona, Rosario %A Perez-Florido, Javier %A García-Hernández, Francisco José %A Dopazo, Joaquin %A González-Escribano, María Francisca %X

Systemic lupus erythematosus (SLE) is the prototype of an autoimmune disease. Belimumab, a monoclonal antibody that targets BAFF, is the only biologic approved for SLE and active lupus nephritis. BAFF is a cytokine with a key regulatory role in B cell homeostasis, which acts by binding to three receptors: BAFF-R, TACI and BCMA. TACI and BCMA also bind APRIL. Many studies reported elevated soluble BAFF and APRIL levels in the sera of SLE patients, but other questions about the role of this system in the disease remain open. The study aimed to investigate the utility of the cytokine levels in serum and urine as biomarkers, the role of non-functional isoforms, and the association of gene variants with the disease. This case-control study includes a cohort (women, 18-60 years old) of 100 patients (48% with nephritis) and 100 healthy controls. We used ELISA assays to measure the cytokine concentrations in serum (sBAFF and sAPRIL) and urine (uBAFF and uAPRIL); TaqMan Gene Expression Assays to quantify the relative mRNA expression of ΔBAFF, βAPRIL, and εAPRIL; and next-generation sequencing to genotype the cytokine (TNFSF13 and TNFSF13B) and receptor (TNFRSF13B, TNFRSF17 and TNFRSF13C) genes. The statistical tests used were: Kruskal-Wallis (qualitative variables), the Spearman Rho coefficient (correlations), the Chi-square and SKAT (association of common and rare genetic variants, respectively). As expected, sBAFF and sAPRIL levels were higher in patients than in controls (p ≤ 0.001), but we found differences between patient subgroups. sBAFF and sAPRIL significantly correlated only in patients with nephritis (r = 0.67, p ≤ 0.001); βAPRIL levels were lower in patients with nephritis (p = 0.04), and ΔBAFF levels were lower in patients with dsDNA antibodies (p = 0.04). Rare variants of TNFSF13 and TNFRSF13B and TNFSF13 p.Gly67Arg and TNFRSF13B p.Val220Ala were associated with SLE.
Our study supports differences among SLE patient subgroups with diverse clinical features in the BAFF/APRIL pathway. In addition, it suggests the involvement of genetic variants in the susceptibility to the disease.

%B Sci Rep %V 12 %P 11219 %8 2022 Jul 02 %G eng %N 1 %R 10.1038/s41598-022-15549-0 %0 Journal Article %J Antioxidants (Basel) %D 2022 %T An SPM-Enriched Marine Oil Supplement Shifted Microglia Polarization toward M2, Ameliorating Retinal Degeneration in Mice. %A Olivares-González, Lorena %A Velasco, Sheyla %A Gallego, Idoia %A Esteban-Medina, Marina %A Puras, Gustavo %A Loucera, Carlos %A Martínez-Romero, Alicia %A Peña-Chilet, Maria %A Pedraz, José Luis %A Rodrigo, Regina %X

Retinitis pigmentosa (RP) is the most common inherited retinal dystrophy causing progressive vision loss. It is accompanied by chronic and sustained inflammation, including M1 microglia activation. This study evaluated the effect of an essential fatty acid (EFA) supplement containing specialized pro-resolving mediators (SPMs) on retinal degeneration and microglia activation in mice, a model of RP, as well as on LPS-stimulated BV2 cells. The EFA supplement was orally administered to mice from postnatal day (P)9 to P18. At P18, the electrical activity of the retina was examined by electroretinography (ERG) and innate behavior in response to light was measured. Retinal degeneration was studied via histology including the TUNEL assay and microglia immunolabeling. Microglia polarization (M1/M2) was assessed by flow cytometry, qPCR, ELISA and histology. Redox status was analyzed by measuring antioxidant enzymes and markers of oxidative damage. Interestingly, the EFA supplement ameliorated retinal dysfunction and degeneration by improving ERG recording and sensitivity to light, and reducing photoreceptor cell loss. The EFA supplement reduced inflammation and microglia activation, attenuating M1 markers as well as inducing a shift to the M2 phenotype in mouse retinas and LPS-stimulated BV2 cells. It also reduced oxidative stress markers of lipid peroxidation and carbonylation. These findings could open up new therapeutic opportunities based on resolving inflammation with oral supplementation with SPMs such as the EFA supplement.

%B Antioxidants (Basel) %V 12 %8 2022 Dec 30 %G eng %N 1 %R 10.3390/antiox12010098 %0 Journal Article %J Sci Rep %D 2022 %T Towards a metagenomics machine learning interpretable model for understanding the transition from adenoma to colorectal cancer. %A Casimiro-Soriguer, Carlos S %A Loucera, Carlos %A Peña-Chilet, Maria %A Dopazo, Joaquin %X

The gut microbiome is gaining interest because of its links with several diseases, including colorectal cancer (CRC), as well as the possibility of being used to obtain non-intrusive predictive disease biomarkers. Here we performed a meta-analysis of 1042 fecal metagenomic samples from seven publicly available studies. We used an interpretable machine learning approach based on functional profiles, instead of the conventional taxonomic profiles, to produce a highly accurate predictor of CRC with better precision than those of previous proposals. Moreover, this approach is also able to discriminate samples with adenoma, which makes it very promising for CRC prevention by detecting early stages in which intervention is easier and more effective. In addition, interpretable machine learning methods allow extracting features relevant for the classification, which reveals basic molecular mechanisms accounting for the changes undergone by the microbiome functional landscape in the transition from healthy gut to adenoma and CRC conditions. Functional profiles have demonstrated higher accuracy in predicting CRC and adenoma conditions than taxonomic profiles and additionally, in a context of explainable machine learning, provide useful hints on the molecular mechanisms operating in the microbiota behind these conditions.

%B Sci Rep %V 12 %P 450 %8 2022 Jan 10 %G eng %N 1 %R 10.1038/s41598-021-04182-y %0 Journal Article %J BMC Bioinformatics %D 2021 %T A comprehensive database for integrated analysis of omics data in autoimmune diseases. %A Martorell-Marugán, Jordi %A López-Domínguez, Raúl %A García-Moreno, Adrián %A Toro-Domínguez, Daniel %A Villatoro-García, Juan Antonio %A Barturen, Guillermo %A Martín-Gómez, Adoración %A Troule, Kevin %A Gómez-López, Gonzalo %A Al-Shahrour, Fátima %A González-Rumayor, Víctor %A Peña-Chilet, Maria %A Dopazo, Joaquin %A Saez-Rodriguez, Julio %A Alarcón-Riquelme, Marta E %A Carmona-Sáez, Pedro %K Autoimmune Diseases %K Computational Biology %K Databases, Factual %K Humans %X

BACKGROUND: Autoimmune diseases are heterogeneous pathologies with difficult diagnosis and few therapeutic options. In the last decade, several omics studies have provided significant insights into the molecular mechanisms of these diseases. Nevertheless, data from different cohorts and pathologies are stored independently in public repositories and a unified resource is imperative to assist researchers in this field.

RESULTS: Here, we present Autoimmune Diseases Explorer (https://adex.genyo.es), a database that integrates 82 curated transcriptomics and methylation studies covering 5609 samples for some of the most common autoimmune diseases. The database provides, in an easy-to-use environment, advanced data analysis and statistical methods for exploring omics datasets, including meta-analysis, differential expression or pathway analysis.

CONCLUSIONS: This is the first omics database focused on autoimmune diseases. This resource incorporates homogeneously processed data to facilitate integrative analyses among studies.

%B BMC Bioinformatics %V 22 %P 343 %8 2021 Jun 24 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/34167460?dopt=Abstract %R 10.1186/s12859-021-04268-4 %0 Journal Article %J Mol Syst Biol %D 2021 %T COVID19 Disease Map, a computational knowledge repository of virus-host interaction mechanisms. %A Ostaszewski, Marek %A Niarakis, Anna %A Mazein, Alexander %A Kuperstein, Inna %A Phair, Robert %A Orta-Resendiz, Aurelio %A Singh, Vidisha %A Aghamiri, Sara Sadat %A Acencio, Marcio Luis %A Glaab, Enrico %A Ruepp, Andreas %A Fobo, Gisela %A Montrone, Corinna %A Brauner, Barbara %A Frishman, Goar %A Monraz Gómez, Luis Cristóbal %A Somers, Julia %A Hoch, Matti %A Kumar Gupta, Shailendra %A Scheel, Julia %A Borlinghaus, Hanna %A Czauderna, Tobias %A Schreiber, Falk %A Montagud, Arnau %A Ponce de Leon, Miguel %A Funahashi, Akira %A Hiki, Yusuke %A Hiroi, Noriko %A Yamada, Takahiro G %A Dräger, Andreas %A Renz, Alina %A Naveez, Muhammad %A Bocskei, Zsolt %A Messina, Francesco %A Börnigen, Daniela %A Fergusson, Liam %A Conti, Marta %A Rameil, Marius %A Nakonecnij, Vanessa %A Vanhoefer, Jakob %A Schmiester, Leonard %A Wang, Muying %A Ackerman, Emily E %A Shoemaker, Jason E %A Zucker, Jeremy %A Oxford, Kristie %A Teuton, Jeremy %A Kocakaya, Ebru %A Summak, Gökçe Yağmur %A Hanspers, Kristina %A Kutmon, Martina %A Coort, Susan %A Eijssen, Lars %A Ehrhart, Friederike %A Rex, Devasahayam Arokia Balaya %A Slenter, Denise %A Martens, Marvin %A Pham, Nhung %A Haw, Robin %A Jassal, Bijay %A Matthews, Lisa %A Orlic-Milacic, Marija %A Senff Ribeiro, Andrea %A Rothfels, Karen %A Shamovsky, Veronica %A Stephan, Ralf %A Sevilla, Cristoffer %A Varusai, Thawfeek %A Ravel, Jean-Marie %A Fraser, Rupsha %A Ortseifen, Vera %A Marchesi, Silvia %A Gawron, Piotr %A Smula, Ewa %A Heirendt, Laurent %A Satagopam, Venkata %A Wu, Guanming %A Riutta, Anders %A Golebiewski, Martin %A Owen, Stuart %A Goble, Carole %A Hu, Xiaoming %A Overall, Rupert W %A Maier, Dieter %A Bauch, Angela %A Gyori, Benjamin M %A 
Bachman, John A %A Vega, Carlos %A Grouès, Valentin %A Vazquez, Miguel %A Porras, Pablo %A Licata, Luana %A Iannuccelli, Marta %A Sacco, Francesca %A Nesterova, Anastasia %A Yuryev, Anton %A de Waard, Anita %A Turei, Denes %A Luna, Augustin %A Babur, Ozgun %A Soliman, Sylvain %A Valdeolivas, Alberto %A Esteban-Medina, Marina %A Peña-Chilet, Maria %A Rian, Kinza %A Helikar, Tomáš %A Puniya, Bhanwar Lal %A Modos, Dezso %A Treveil, Agatha %A Olbei, Marton %A De Meulder, Bertrand %A Ballereau, Stephane %A Dugourd, Aurélien %A Naldi, Aurélien %A Noël, Vincent %A Calzone, Laurence %A Sander, Chris %A Demir, Emek %A Korcsmaros, Tamas %A Freeman, Tom C %A Augé, Franck %A Beckmann, Jacques S %A Hasenauer, Jan %A Wolkenhauer, Olaf %A Willighagen, Egon L %A Pico, Alexander R %A Evelo, Chris T %A Gillespie, Marc E %A Stein, Lincoln D %A Hermjakob, Henning %A D'Eustachio, Peter %A Saez-Rodriguez, Julio %A Dopazo, Joaquin %A Valencia, Alfonso %A Kitano, Hiroaki %A Barillot, Emmanuel %A Auffray, Charles %A Balling, Rudi %A Schneider, Reinhard %K Antiviral Agents %K Computational Biology %K Computer Graphics %K COVID-19 %K Cytokines %K Data Mining %K Databases, Factual %K Gene Expression Regulation %K Host Microbial Interactions %K Humans %K Immunity, Cellular %K Immunity, Humoral %K Immunity, Innate %K Lymphocytes %K Metabolic Networks and Pathways %K Myeloid Cells %K Protein Interaction Mapping %K SARS-CoV-2 %K Signal Transduction %K Software %K Transcription Factors %K Viral Proteins %X

We need to effectively combine the knowledge from surging literature with complex datasets to propose mechanistic models of SARS-CoV-2 infection, improving data interpretation and predicting key targets of intervention. Here, we describe a large-scale community effort to build an open access, interoperable and computable repository of COVID-19 molecular mechanisms. The COVID-19 Disease Map (C19DMap) is a graphical, interactive representation of disease-relevant molecular mechanisms linking many knowledge sources. Notably, it is a computational resource for graph-based analyses and disease modelling. To this end, we established a framework of tools, platforms and guidelines necessary for a multifaceted community of biocurators, domain experts, bioinformaticians and computational biologists. The diagrams of the C19DMap, curated from the literature, are integrated with relevant interaction and text mining databases. We demonstrate the application of network analysis and modelling approaches by concrete examples to highlight new testable hypotheses. This framework helps to find signatures of SARS-CoV-2 predisposition, treatment response or prioritisation of drug candidates. Such an approach may help deal with new waves of COVID-19 or similar pandemics in the long-term perspective.

%B Mol Syst Biol %V 17 %P e10387 %8 2021 10 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/34664389?dopt=Abstract %R 10.15252/msb.202110387 %0 Journal Article %J Nucleic Acids Res %D 2021 %T CSVS, a crowdsourcing database of the Spanish population genetic variability. %A Peña-Chilet, Maria %A Roldán, Gema %A Perez-Florido, Javier %A Ortuno, Francisco M %A Carmona, Rosario %A Aquino, Virginia %A López-López, Daniel %A Loucera, Carlos %A Fernandez-Rueda, Jose L %A Gallego, Asunción %A Garcia-Garcia, Francisco %A González-Neira, Anna %A Pita, Guillermo %A Núñez-Torres, Rocío %A Santoyo-López, Javier %A Ayuso, Carmen %A Minguez, Pablo %A Avila-Fernandez, Almudena %A Corton, Marta %A Moreno-Pelayo, Miguel Ángel %A Morin, Matías %A Gallego-Martinez, Alvaro %A Lopez-Escamez, Jose A %A Borrego, Salud %A Antiñolo, Guillermo %A Amigo, Jorge %A Salgado-Garrido, Josefa %A Pasalodos-Sanchez, Sara %A Morte, Beatriz %A Carracedo, Ángel %A Alonso, Ángel %A Dopazo, Joaquin %K Alleles %K Chromosome Mapping %K Crowdsourcing %K Databases, Genetic %K Exome %K Gene Frequency %K Genetic Variation %K Genetics, Population %K Genome, Human %K Genomics %K Humans %K Internet %K Precision Medicine %K Software %K Spain %X

The knowledge of the genetic variability of the local population is of utmost importance in personalized medicine and has been revealed as a critical factor for the discovery of new disease variants. Here, we present the Collaborative Spanish Variability Server (CSVS), which currently contains more than 2000 genomes and exomes of unrelated Spanish individuals. This database has been generated in a collaborative crowdsourcing effort collecting sequencing data produced by local genomic projects and for other purposes. Sequences have been grouped by ICD10 upper categories. A web interface allows querying the database removing one or more ICD10 categories. In this way, aggregated counts of allele frequencies of the pseudo-control Spanish population can be obtained for diseases belonging to the category removed. Interestingly, in addition to pseudo-control studies, some population studies can be made, as, for example, prevalence of pharmacogenomic variants, etc. In addition, this genomic data has been used to define the first Spanish Genome Reference Panel (SGRP1.0) for imputation. This is the first local repository of variability entirely produced by a crowdsourcing effort and constitutes an example for future initiatives to characterize local variability worldwide. CSVS is also part of the GA4GH Beacon network. CSVS can be accessed at: http://csvs.babelomics.org/.

%B Nucleic Acids Res %V 49 %P D1130-D1137 %8 2021 01 08 %G eng %N D1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32990755?dopt=Abstract %R 10.1093/nar/gkaa794 %0 Journal Article %J Am J Med Genet A %D 2021 %T De novo small deletion affecting transcription start site of short isoform of AUTS2 gene in a patient with syndromic neurodevelopmental defects. %A Martinez-Delgado, Beatriz %A Lopez-Martin, Estrella %A Lara-Herguedas, Julián %A Monzon, Sara %A Cuesta, Isabel %A Juliá, Miguel %A Aquino, Virginia %A Rodriguez-Martin, Carlos %A Damian, Alejandra %A Gonzalo, Irene %A Gomez-Mariano, Gema %A Baladron, Beatriz %A Cazorla, Rosario %A Iglesias, Gema %A Roman, Enriqueta %A Ros, Purificacion %A Tutor, Pablo %A Mellor, Susana %A Jimenez, Carlos %A Cabrejas, Maria Jose %A Gonzalez-Vioque, Emiliano %A Alonso, Javier %A Bermejo-Sánchez, Eva %A Posada, Manuel %K Child, Preschool %K Cytoskeletal Proteins %K Dwarfism %K Exons %K Gene Expression Regulation %K Genetic Association Studies %K Humans %K Male %K Neurodevelopmental Disorders %K Protein Isoforms %K RNA, Messenger %K Sequence Deletion %K Syndrome %K Transcription Factors %K Transcription Initiation Site %K Transcription, Genetic %X

Disruption of the autism susceptibility candidate 2 (AUTS2) gene through genomic rearrangements, copy number variations (CNVs), and intragenic deletions and mutations has been recurrently involved in syndromic forms of developmental delay and intellectual disability, known as AUTS2 syndrome. The AUTS2 gene plays an important role in the regulation of neuronal migration and, when altered, is associated with a variable phenotype ranging from severely to mildly affected patients. The more severe phenotypes significantly correlate with the presence of defects affecting the C-terminal part of the gene. This article reports a new patient with a syndromic neurodevelopmental disorder who presents a deletion of 30 nucleotides in exon 9 of the AUTS2 gene. Importantly, this deletion includes the transcription start site for the AUTS2 short transcript isoform, which has an important role in brain development. Gene expression analysis of AUTS2 full-length and short isoforms revealed that the deletion found in this patient causes a remarkable reduction in the expression level, not only of the short isoform, but also of the full AUTS2 transcripts. This report adds more evidence for the role of mutated AUTS2 short transcripts in the development of a severe phenotype in the AUTS2 syndrome.

%B Am J Med Genet A %V 185 %P 877-883 %8 2021 03 %G eng %N 3 %R 10.1002/ajmg.a.62017 %0 Journal Article %J Mathematics %D 2021 %T Deciphering Genomic Heterogeneity and the Internal Composition of Tumour Activities through a Hierarchical Factorisation Model %A Carbonell-Caballero, José %A López-Quílez, Antonio %A Conesa, David %A Dopazo, Joaquin %B Mathematics %V 9 %P 2833 %8 Jan-11-2021 %G eng %U https://www.mdpi.com/2227-7390/9/21/2833 https://www.mdpi.com/2227-7390/9/21/2833/pdf %N 21 %! Mathematics %R 10.3390/math9212833 %0 Journal Article %J Mol Oncol %D 2021 %T A DNA damage repair gene-associated signature predicts responses of patients with advanced soft-tissue sarcoma to treatment with trabectedin. %A Moura, David S %A Peña-Chilet, Maria %A Cordero Varela, Juan Antonio %A Alvarez-Alegret, Ramiro %A Agra-Pujol, Carolina %A Izquierdo, Francisco %A Ramos, Rafael %A Ortega-Medina, Luis %A Martin-Davila, Francisco %A Castilla-Ramirez, Carolina %A Hernandez-Leon, Carmen Nieves %A Romagosa, Cleofe %A Vaz Salgado, Maria Angeles %A Lavernia, Javier %A Bagué, Silvia %A Mayodormo-Aranda, Empar %A Vicioso, Luis %A Hernández Barceló, Jose Emilio %A Rubio-Casadevall, Jordi %A de Juan, Ana %A Fiaño-Valverde, Maria Concepcion %A Hindi, Nadia %A Lopez-Alvarez, Maria %A Lacerenza, Serena %A Dopazo, Joaquin %A Gutierrez, Antonio %A Alvarez, Rosa %A Valverde, Claudia %A Martinez-Trufero, Javier %A Martin-Broto, Javier %X

Predictive biomarkers of trabectedin represent an unmet need in advanced soft-tissue sarcomas (STS). DNA damage repair (DDR) genes, involved in homologous recombination or nucleotide excision repair, had been previously described as biomarkers of trabectedin resistance or sensitivity, respectively. The majority of these studies only focused on specific factors (ERCC1, ERCC5, and BRCA1) and did not evaluate several other DDR-related genes that could have a relevant role for trabectedin efficacy. In this retrospective translational study, 118 genes involved in DDR were evaluated to determine, by transcriptomics, a predictive gene signature of trabectedin efficacy. A six-gene predictive signature of trabectedin efficacy was built in a series of 139 tumor samples from patients with advanced STS. Patients in the high-risk gene signature group showed a significantly worse progression-free survival compared with patients in the low-risk group (2.1 vs 6.0 months, respectively). Differential gene expression analysis defined new potential predictive biomarkers of trabectedin sensitivity (PARP3 and CCNH) or resistance (DNAJB11 and PARP1). Our study identified a new gene signature that significantly predicts patients with higher probability to respond to treatment with trabectedin. Targeting some genes of this signature emerges as a potential strategy to enhance trabectedin efficacy.

%B Mol Oncol %V 15 %P 3691-3705 %8 2021 12 %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/33983674?dopt=Abstract %R 10.1002/1878-0261.12996 %0 Journal Article %J Nat Methods %D 2021 %T DOME: recommendations for supervised machine learning validation in biology. %A Walsh, Ian %A Fishman, Dmytro %A Garcia-Gasulla, Dario %A Titma, Tiina %A Pollastri, Gianluca %A Harrow, Jennifer %A Psomopoulos, Fotis E %A Tosatto, Silvio C E %K Algorithms %K Computational Biology %K Guidelines as Topic %K Humans %K Models, Biological %K Research Design %K Supervised Machine Learning %B Nat Methods %V 18 %P 1122-1127 %8 2021 10 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/34316068?dopt=Abstract %R 10.1038/s41592-021-01205-4 %0 Journal Article %J Computational and Structural Biotechnology Journal %D 2021 %T Genome-scale mechanistic modeling of signaling pathways made easy: A bioconductor/cytoscape/web server framework for the analysis of omic data %A Rian, Kinza %A Hidalgo, Marta R. %A Cubuk, Cankut %A Falco, Matias M. %A Loucera, Carlos %A Esteban-Medina, Marina %A Alamo-Alvarez, Inmaculada %A Peña-Chilet, Maria %A Dopazo, Joaquin %B Computational and Structural Biotechnology Journal %V 19 %P 2968 - 2978 %8 Jan-01-2021 %G eng %U https://linkinghub.elsevier.com/retrieve/pii/S2001037021002038 %! Computational and Structural Biotechnology Journal %R 10.1016/j.csbj.2021.05.022 %0 Journal Article %J Clinical Epigenetics %D 2021 %T Genome-wide analysis of DNA methylation in Hirschsprung enteric precursor cells: unraveling the epigenetic landscape of enteric nervous system development %A Villalba-Benito, Leticia %A López-López, Daniel %A Torroglosa, Ana %A Casimiro-Soriguer, Carlos S. 
%A Luzón-Toro, Berta %A Fernández, Raquel María %A Moya-Jiménez, María José %A Antiñolo, Guillermo %A Dopazo, Joaquin %A Borrego, Salud %B Clinical Epigenetics %V 13 %8 Jan-12-2021 %G eng %U http://link.springer.com/article/10.1186/s13148-021-01040-6/fulltext.html %N 1 %! Clin Epigenet %R 10.1186/s13148-021-01040-6 %0 Journal Article %J Gigascience %D 2021 %T Highly accurate whole-genome imputation of SARS-CoV-2 from partial or low-quality sequences. %A Ortuno, Francisco M %A Loucera, Carlos %A Casimiro-Soriguer, Carlos S %A Lepe, Jose A %A Camacho Martinez, Pedro %A Merino Diaz, Laura %A de Salazar, Adolfo %A Chueca, Natalia %A García, Federico %A Perez-Florido, Javier %A Dopazo, Joaquin %K Genome, Viral %K Phylogeny %K SARS-CoV-2 %K Whole Genome Sequencing %X

BACKGROUND: The current SARS-CoV-2 pandemic has emphasized the utility of viral whole-genome sequencing in the surveillance and control of the pathogen. An unprecedented ongoing global initiative is producing hundreds of thousands of sequences worldwide. However, the complex circumstances in which viruses are sequenced, along with the demand for urgent results, cause a high rate of incomplete and, therefore, useless sequences. Viral sequences evolve in the context of a complex phylogeny and different positions along the genome are in linkage disequilibrium. Therefore, an imputation method would be able to predict missing positions from the available sequencing data.

RESULTS: We have developed the impuSARS application, which takes advantage of the enormous number of SARS-CoV-2 genomes available, using a reference panel containing 239,301 sequences, to impute missing data in viral genomes. ImpuSARS was tested in a wide range of conditions (continuous fragments, amplicons or sparse individual positions missing), showing great fidelity when reconstructing the original sequences and recovering the lineage with 100% precision for almost all lineages, even in very poorly covered genomes (<20%).

CONCLUSIONS: Imputation can improve the pace of SARS-CoV-2 sequencing production by recovering many incomplete or low-quality sequences that would be otherwise discarded. ImpuSARS can be incorporated in any primary data processing pipeline for SARS-CoV-2 whole-genome sequencing.

%B Gigascience %V 10 %8 2021 12 02 %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/34865008?dopt=Abstract %R 10.1093/gigascience/giab078 %0 Journal Article %J Cancer Immunol Immunother %D 2021 %T Immunotherapy in nonsmall-cell lung cancer: current status and future prospects for liquid biopsy. %A Brozos-Vázquez, Elena María %A Díaz-Peña, Roberto %A García-González, Jorge %A León-Mateos, Luis %A Mondelo-Macía, Patricia %A Peña-Chilet, Maria %A López-López, Rafael %K Animals %K Biomarkers, Tumor %K Carcinoma, Non-Small-Cell Lung %K Cell-Free Nucleic Acids %K Exosomes %K Humans %K Immunotherapy %K Liquid Biopsy %K Lung Neoplasms %X

Immunotherapy has been one of the great advances of recent years in the treatment of advanced tumors, with nonsmall-cell lung cancer (NSCLC) being one of the cancers that has benefited most from this approach. Currently, the only validated companion diagnostic test for first-line immunotherapy in metastatic NSCLC patients is testing for programmed death ligand 1 (PD-L1) expression in tumor tissues. However, not all patients experience an effective response with the established selection criteria and immune checkpoint inhibitors (ICIs). Liquid biopsy offers a noninvasive opportunity to monitor disease in patients with cancer and identify those who would benefit the most from immunotherapy. This review focuses on the use of liquid biopsy in immunotherapy treatment of NSCLC patients. Circulating tumor cells (CTCs), cell-free DNA (cfDNA) and exosomes are promising tools for developing new biomarkers. We discuss the current application and future implementation of these parameters to improve therapeutic decision-making and identify the patients who will benefit most from immunotherapy.

%B Cancer Immunol Immunother %V 70 %P 1177-1188 %8 2021 May %G eng %N 5 %R 10.1007/s00262-020-02752-z %0 Journal Article %J J Pers Med %D 2021 %T Implementing Personalized Medicine in COVID-19 in Andalusia: An Opportunity to Transform the Healthcare System. %A Dopazo, Joaquin %A Maya-Miles, Douglas %A García, Federico %A Lorusso, Nicola %A Calleja, Miguel Ángel %A Pareja, María Jesús %A López-Miranda, José %A Rodríguez-Baño, Jesús %A Padillo, Javier %A Túnez, Isaac %A Romero-Gómez, Manuel %X

The COVID-19 pandemic represents an unprecedented opportunity to exploit the advantages of personalized medicine for the prevention, diagnosis, treatment, surveillance and management of a new challenge in public health. COVID-19 infection is highly variable, ranging from asymptomatic infections to severe, life-threatening manifestations. Personalized medicine can play a key role in elucidating individual susceptibility to the infection as well as inter-individual variability in clinical course, prognosis and response to treatment. Integrating personalized medicine into clinical practice can also transform health care by enabling the design of preventive and therapeutic strategies tailored to individual profiles, improving the detection of outbreaks or defining transmission patterns at an increasingly local level. SARS-CoV-2 genome sequencing, together with the assessment of specific patient genetic variants, will support clinical decision-makers and ultimately provide better ways to fight this disease. Additionally, it would facilitate a better stratification and selection of patients for clinical trials, thus increasing the likelihood of obtaining positive results. Lastly, defining a national strategy to implement all available tools of personalized medicine for COVID-19 in clinical practice could be challenging but would be linked to a positive transformation of the health care system. In this review, we provide an update of the achievements, promises, and challenges of personalized medicine in the fight against COVID-19, from susceptibility to natural history and response to therapy, as well as from surveillance to control measures and vaccination. We also discuss strategies to facilitate the adoption of this new paradigm for medical and public health measures in health care systems during and after the pandemic.

%B J Pers Med %V 11 %8 2021 May 26 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/34073493?dopt=Abstract %R 10.3390/jpm11060475 %0 Journal Article %J Nature %D 2021 %T Mapping the human genetic architecture of COVID-19. %X

The genetic make-up of an individual contributes to the susceptibility and response to viral infection. Although environmental, clinical and social factors have a role in the chance of exposure to SARS-CoV-2 and the severity of COVID-19, host genetics may also be important. Identifying host-specific genetic factors may reveal biological mechanisms of therapeutic relevance and clarify causal relationships of modifiable environmental risk factors for SARS-CoV-2 infection and outcomes. We formed a global network of researchers to investigate the role of human genetics in SARS-CoV-2 infection and COVID-19 severity. Here we describe the results of three genome-wide association meta-analyses that consist of up to 49,562 patients with COVID-19 from 46 studies across 19 countries. We report 13 genome-wide significant loci that are associated with SARS-CoV-2 infection or severe manifestations of COVID-19. Several of these loci correspond to previously documented associations to lung or autoimmune and inflammatory diseases. They also represent potentially actionable mechanisms in response to infection. Mendelian randomization analyses support a causal role for smoking and body-mass index for severe COVID-19 although not for type II diabetes. The identification of novel host genetic factors associated with COVID-19 was made possible by the community of human genetics researchers coming together to prioritize the sharing of data, results, resources and analytical frameworks. This working model of international collaboration underscores what is possible for future genetic discoveries in emerging pandemics, or indeed for any complex human disease.

%B Nature %V 600 %P 472-477 %8 2021 Dec %G eng %N 7889 %1 https://www.ncbi.nlm.nih.gov/pubmed/34237774?dopt=Abstract %R 10.1038/s41586-021-03767-x %0 Journal Article %J BioData Min %D 2021 %T Mechanistic modeling of the SARS-CoV-2 disease map. %A Rian, Kinza %A Esteban-Medina, Marina %A Hidalgo, Marta R %A Cubuk, Cankut %A Falco, Matias M %A Loucera, Carlos %A Gunyel, Devrim %A Ostaszewski, Marek %A Peña-Chilet, Maria %A Dopazo, Joaquin %X

Here we present a web interface that implements a comprehensive mechanistic model of the SARS-CoV-2 disease map. In this framework, the detailed activity of the human signaling circuits related to the viral infection, ranging from the entry and replication mechanisms to downstream consequences such as inflammation and antigenic response, can be inferred from gene expression experiments. Moreover, the effect of potential interventions, such as knock-downs or drug effects (currently the system models the effect of more than 8000 DrugBank drugs), can be studied. This freely available tool not only provides an unprecedentedly detailed view of the mechanisms of viral invasion and the consequences in the cell but also has the potential to become an invaluable asset in the search for efficient antiviral treatments.

%B BioData Min %V 14 %P 5 %8 2021 Jan 21 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/33478554?dopt=Abstract %R 10.1186/s13040-021-00234-1 %0 Journal Article %J Cancers (Basel) %D 2021 %T Mutational Characterization of Cutaneous Melanoma Supports Divergent Pathways Model for Melanoma Development. %A Millán-Esteban, David %A Peña-Chilet, Maria %A García-Casado, Zaida %A Manrique-Silva, Esperanza %A Requena, Celia %A Bañuls, José %A Lopez-Guerrero, Jose Antonio %A Rodríguez-Hernández, Aranzazu %A Traves, Víctor %A Dopazo, Joaquin %A Virós, Amaya %A Kumar, Rajiv %A Nagore, Eduardo %X

According to the divergent pathway model, cutaneous melanoma comprises a nevogenic group with a propensity to melanocyte proliferation and another one associated with cumulative solar damage (CSD). While characterized clinically and epidemiologically, the differences in the molecular profiles between the groups have remained primarily uninvestigated. This study has used a custom gene panel and bioinformatics tools to investigate the potential molecular differences in a thoroughly characterized cohort of 119 melanoma patients belonging to nevogenic and CSD groups. We found that the nevogenic melanomas had a restricted set of mutations, with the prominently mutated gene being . The CSD melanomas, in contrast, showed mutations in a diverse group of genes that included , , , and . We thus provide evidence that nevogenic and CSD melanomas constitute different biological entities and highlight the need to explore new targeted therapies.

%B Cancers (Basel) %V 13 %8 2021 Oct 18 %G eng %N 20 %R 10.3390/cancers13205219 %0 Journal Article %J Nature Genetics %D 2021 %T The NCI Genomic Data Commons %A Heath, Allison P. %A Ferretti, Vincent %A Agrawal, Stuti %A An, Maksim %A Angelakos, James C. %A Arya, Renuka %A Bajari, Rosita %A Baqar, Bilal %A Barnowski, Justin H. B. %A Burt, Jeffrey %A Catton, Ann %A Chan, Brandon F. %A Chu, Fay %A Cullion, Kim %A Davidsen, Tanja %A Do, Phuong-My %A Dompierre, Christian %A Ferguson, Martin L. %A Fitzsimons, Michael S. %A Ford, Michael %A Fukuma, Miyuki %A Gaheen, Sharon %A Ganji, Gajanan L. %A Garcia, Tzintzuni I. %A George, Sameera S. %A Gerhard, Daniela S. %A Gerthoffert, Francois %A Gomez, Fauzi %A Han, Kang %A Hernandez, Kyle M. %A Issac, Biju %A Jackson, Richard %A Jensen, Mark A. %A Joshi, Sid %A Kadam, Ajinkya %A Khurana, Aishmit %A Kim, Kyle M. J. %A Kraft, Victoria E. %A Li, Shenglai %A Lichtenberg, Tara M. %A Lodato, Janice %A Lolla, Laxmi %A Martinov, Plamen %A Mazzone, Jeffrey A. %A Miller, Daniel P. %A Miller, Ian %A Miller, Joshua S. %A Miyauchi, Koji %A Murphy, Mark W. %A Nullet, Thomas %A Ogwara, Rowland O. %A Ortuño, Francisco M. %A Pedrosa, Jesús %A Pham, Phuong L. %A Popov, Maxim Y. %A Porter, James J. %A Powell, Raymond %A Rademacher, Karl %A Reid, Colin P. %A Rich, Samantha %A Rogel, Bessie %A Sahni, Himanso %A Savage, Jeremiah H. %A Schmitt, Kyle A. %A Simmons, Trevar J. %A Sislow, Joseph %A Spring, Jonathan %A Stein, Lincoln %A Sullivan, Sean %A Tang, Yajing %A Thiagarajan, Mathangi %A Troyer, Heather D. %A Wang, Chang %A Wang, Zhining %A West, Bedford L. %A Wilmer, Alex %A Wilson, Shane %A Wu, Kaman %A Wysocki, William P. %A Xiang, Linda %A Yamada, Joseph T. %A Yang, Liming %A Yu, Christine %A Yung, Christina K. %A Zenklusen, Jean Claude %A Zhang, Junjun %A Zhang, Zhenyu %A Zhao, Yuanheng %A Zubair, Ariz %A Staudt, Louis M. %A Grossman, Robert L. %B Nature Genetics %8 Oct-02-2022 %G eng %U http://www.nature.com/articles/s41588-021-00791-5 %! 
Nat Genet %R 10.1038/s41588-021-00791-5 %0 Journal Article %J Nat Commun %D 2021 %T Orchestrating and sharing large multimodal data for transparent and reproducible research. %A Mammoliti, Anthony %A Smirnov, Petr %A Nakano, Minoru %A Safikhani, Zhaleh %A Eeles, Christopher %A Seo, Heewon %A Nair, Sisira Kadambat %A Mer, Arvind S %A Smith, Ian %A Ho, Chantal %A Beri, Gangesh %A Kusko, Rebecca %A Lin, Eva %A Yu, Yihong %A Martin, Scott %A Hafner, Marc %A Haibe-Kains, Benjamin %X

Reproducibility is essential to open science, as there is limited relevance for findings that cannot be reproduced by independent research groups, regardless of their validity. It is therefore crucial for scientists to describe their experiments in sufficient detail so they can be reproduced, scrutinized, challenged, and built upon. However, the intrinsic complexity and continuous growth of biomedical data make it increasingly difficult to process, analyze, and share with the community in a FAIR (findable, accessible, interoperable, and reusable) manner. To overcome these issues, we created a cloud-based platform called ORCESTRA (orcestra.ca), which provides a flexible framework for the reproducible processing of multimodal biomedical data. It enables processing of clinical, genomic and perturbation profiles of cancer samples through automated processing pipelines that are user-customizable. ORCESTRA creates integrated and fully documented data objects with persistent identifiers (DOI) and manages multiple dataset versions, which can be shared for future studies.

%B Nat Commun %V 12 %P 5797 %8 2021 10 04 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/34608132?dopt=Abstract %R 10.1038/s41467-021-25974-w %0 Journal Article %J Viruses %D 2021 %T Phylogenetic Analysis of the 2020 West Nile Virus (WNV) Outbreak in Andalusia (Spain) %A Casimiro-Soriguer, Carlos S. %A Perez-Florido, Javier %A Fernandez-Rueda, Jose L. %A Pedrosa-Corral, Irene %A Guillot-Sulay, Vicente %A Lorusso, Nicola %A Martinez-Gonzalez, Luis Javier %A Navarro-Marí, Jose M. %A Dopazo, Joaquin %A Sanbonmatsu-Gámez, Sara %B Viruses %V 13 %P 836 %8 Jan-05-2021 %G eng %U https://www.mdpi.com/1999-4915/13/5/836 %N 5 %! Viruses %R 10.3390/v13050836 %0 Journal Article %J Front Mol Neurosci %D 2021 %T Presenilin-1 Mutations Are a Cause of Primary Lateral Sclerosis-Like Syndrome. %A Vázquez-Costa, Juan Francisco %A Payá-Montes, María %A Martínez-Molina, Marina %A Jaijo, Teresa %A Szymanski, Jazek %A Mazón, Miguel %A Sopena-Novales, Pablo %A Pérez-Tur, Jordi %A Sevilla, Teresa %X

Background and Purpose: Primary lateral sclerosis (PLS) is a progressive upper motor neuron (UMN) disorder. It is debated whether PLS is part of the amyotrophic lateral sclerosis (ALS) spectrum, or a syndrome encompassing different neurodegenerative diseases. Recently, new diagnostic criteria for PLS have been proposed. We describe four patients of two pedigrees, meeting definite PLS criteria and harboring two different mutations in presenilin 1 (PSEN1).

Methods: Patients underwent neurological and neuropsychological examination, MRI, 18F-fluorodeoxyglucose positron emission tomography (FDG-PET), amyloid-related biomarkers, and next-generation sequencing (NGS) testing.

Results: Four patients, aged 25-45 years old, presented with a progressive UMN syndrome meeting clinical criteria of definite PLS. Cognitive symptoms and signs were mild or absent during the first year of the disease but appeared or progressed later in the disease course. Brain MRI showed microbleeds in two siblings, but iron-related hypointensities in the motor cortex were absent. Brain FDG-PET showed variable areas of hypometabolism, including the motor cortex and frontotemporal lobes. Amyloid deposition was confirmed with either cerebrospinal fluid (CSF) or imaging biomarkers. Two heterozygous likely pathogenic mutations in PSEN1 (p.Pro88Leu and p.Leu166Pro) were found in the NGS testing.

Conclusion: Clinically defined PLS is a syndrome encompassing different neurodegenerative diseases. The NGS testing should be part of the diagnostic workup in patients with PLS, at least in those with red flags, such as early-onset, cognitive impairment, and/or family history of neurodegenerative diseases.

%B Front Mol Neurosci %V 14 %P 721047 %8 2021 %G eng %R 10.3389/fnmol.2021.721047 %0 Journal Article %J Sci Rep %D 2021 %T Real world evidence of calcifediol or vitamin D prescription and mortality rate of COVID-19 in a retrospective cohort of hospitalized Andalusian patients. %A Loucera, Carlos %A Peña-Chilet, Maria %A Esteban-Medina, Marina %A Muñoyerro-Muñiz, Dolores %A Villegas, Román %A López-Miranda, José %A Rodríguez-Baño, Jesús %A Túnez, Isaac %A Bouillon, Roger %A Dopazo, Joaquin %A Quesada Gomez, Jose Manuel %K Calcifediol %K COVID-19 %K Female %K Humans %K Kaplan-Meier Estimate %K Male %K Retrospective Studies %K Spain %K Survival Analysis %K Vitamin D %X

COVID-19 is a major worldwide health problem because of acute respiratory distress syndrome and mortality. Several lines of evidence have suggested a relationship between the vitamin D endocrine system and severity of COVID-19. We present a survival study on a retrospective cohort of 15,968 patients, comprising all COVID-19 patients hospitalized in Andalusia between January and November 2020. Based on a central registry of electronic health records (the Andalusian Population Health Database, BPS), prescription of vitamin D or its metabolites within 15-30 days before hospitalization was recorded. The effect of prescription of vitamin D (metabolites) for other indications prior to hospitalization was studied with respect to patient survival. Kaplan-Meier survival curves and hazard ratios support an association between prescription of these metabolites and patient survival. This association was stronger for calcifediol (Hazard Ratio, HR = 0.67, with 95% confidence interval, CI, of [0.50-0.91]) than for cholecalciferol (HR = 0.75, with 95% CI of [0.61-0.91]), when prescribed 15 days prior to hospitalization. Although the relation is maintained, there is a general decrease of this effect when a longer period of 30 days prior to hospitalization is considered (calcifediol HR = 0.73, with 95% CI [0.57-0.95] and cholecalciferol HR = 0.88, with 95% CI [0.75-1.03]), suggesting that the association was stronger when the prescription was closer to hospitalization.

%B Sci Rep %V 11 %P 23380 %8 2021 12 03 %G eng %N 1 %R 10.1038/s41598-021-02701-5 %0 Journal Article %J Nat Med %D 2021 %T Reporting guidelines for human microbiome research: the STORMS checklist. %A Mirzayi, Chloe %A Renson, Audrey %A Zohra, Fatima %A Elsafoury, Shaimaa %A Geistlinger, Ludwig %A Kasselman, Lora J %A Eckenrode, Kelly %A van de Wijgert, Janneke %A Loughman, Amy %A Marques, Francine Z %A MacIntyre, David A %A Arumugam, Manimozhiyan %A Azhar, Rimsha %A Beghini, Francesco %A Bergstrom, Kirk %A Bhatt, Ami %A Bisanz, Jordan E %A Braun, Jonathan %A Bravo, Hector Corrada %A Buck, Gregory A %A Bushman, Frederic %A Casero, David %A Clarke, Gerard %A Collado, Maria Carmen %A Cotter, Paul D %A Cryan, John F %A Demmer, Ryan T %A Devkota, Suzanne %A Elinav, Eran %A Escobar, Juan S %A Fettweis, Jennifer %A Finn, Robert D %A Fodor, Anthony A %A Forslund, Sofia %A Franke, Andre %A Furlanello, Cesare %A Gilbert, Jack %A Grice, Elizabeth %A Haibe-Kains, Benjamin %A Handley, Scott %A Herd, Pamela %A Holmes, Susan %A Jacobs, Jonathan P %A Karstens, Lisa %A Knight, Rob %A Knights, Dan %A Koren, Omry %A Kwon, Douglas S %A Langille, Morgan %A Lindsay, Brianna %A McGovern, Dermot %A McHardy, Alice C %A McWeeney, Shannon %A Mueller, Noel T %A Nezi, Luigi %A Olm, Matthew %A Palm, Noah %A Pasolli, Edoardo %A Raes, Jeroen %A Redinbo, Matthew R %A Rühlemann, Malte %A Balfour Sartor, R %A Schloss, Patrick D %A Schriml, Lynn %A Segal, Eran %A Shardell, Michelle %A Sharpton, Thomas %A Smirnova, Ekaterina %A Sokol, Harry %A Sonnenburg, Justin L %A Srinivasan, Sujatha %A Thingholm, Louise B %A Turnbaugh, Peter J %A Upadhyay, Vaibhav %A Walls, Ramona L %A Wilmes, Paul %A Yamada, Takuji %A Zeller, Georg %A Zhang, Mingyu %A Zhao, Ni %A Zhao, Liping %A Bao, Wenjun %A Culhane, Aedin %A Devanarayan, Viswanath %A Dopazo, Joaquin %A Fan, Xiaohui %A Fischer, Matthias %A Jones, Wendell %A Kusko, Rebecca %A Mason, Christopher E %A Mercer, Tim R %A Sansone, Susanna-Assunta %A Scherer, Andreas 
%A Shi, Leming %A Thakkar, Shraddha %A Tong, Weida %A Wolfinger, Russ %A Hunter, Christopher %A Segata, Nicola %A Huttenhower, Curtis %A Dowd, Jennifer B %A Jones, Heidi E %A Waldron, Levi %K Computational Biology %K Dysbiosis %K Humans %K Microbiota %K Observational Studies as Topic %K Research Design %K Translational Science, Biomedical %X

The particularly interdisciplinary nature of human microbiome research makes the organization and reporting of results spanning epidemiology, biology, bioinformatics, translational medicine and statistics a challenge. Commonly used reporting guidelines for observational or genetic epidemiology studies lack key features specific to microbiome studies. Therefore, a multidisciplinary group of microbiome epidemiology researchers adapted guidelines for observational and genetic studies to culture-independent human microbiome studies, and also developed new reporting elements for laboratory, bioinformatics and statistical analyses tailored to microbiome studies. The resulting tool, called 'Strengthening The Organization and Reporting of Microbiome Studies' (STORMS), is composed of a 17-item checklist organized into six sections that correspond to the typical sections of a scientific publication, presented as an editable table for inclusion in supplementary materials. The STORMS checklist provides guidance for concise and complete reporting of microbiome studies that will facilitate manuscript preparation, peer review, and reader comprehension of publications and comparative analysis of published results.

%B Nat Med %V 27 %P 1885-1892 %8 2021 11 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/34789871?dopt=Abstract %R 10.1038/s41591-021-01552-x %0 Journal Article %J Genes %D 2021 %T Schuurs–Hoeijmakers Syndrome (PACS1 Neurodevelopmental Disorder): Seven Novel Patients and a Review %A Tenorio-Castaño, Jair %A Morte, Beatriz %A Nevado, Julián %A Martínez-Glez, Víctor %A Santos-Simarro, Fernando %A García-Miñaur, Sixto %A Palomares-Bralo, María %A Pacio-Míguez, Marta %A Gómez, Beatriz %A Arias, Pedro %A Alcochea, Alba %A Carrión, Juan %A Arias, Patricia %A Almoguera, Berta %A López-Grondona, Fermina %A Lorda-Sanchez, Isabel %A Galán-Gómez, Enrique %A Valenzuela, Irene %A Méndez Perez, María %A Cuscó, Ivón %A Barros, Francisco %A Pié, Juan %A Ramos, Sergio %A Ramos, Feliciano %A Kuechler, Alma %A Tizzano, Eduardo %A Ayuso, Carmen %A Kaiser, Frank %A Pérez-Jurado, Luis %A Carracedo, Ángel %A Lapunzina, Pablo %B Genes %V 12 %P 738 %8 Jan-05-2021 %G eng %U https://www.mdpi.com/2073-4425/12/5/738 https://www.mdpi.com/2073-4425/12/5/738/pdf %N 5 %! Genes %R 10.3390/genes12050738 %0 Journal Article %J Mol Med %D 2021 %T Taxonomic variations in the gut microbiome of gout patients with and without tophi might have a functional impact on urate metabolism. 
%A Méndez-Salazar, Eder Orlando %A Vázquez-Mellado, Janitzia %A Casimiro-Soriguer, Carlos S %A Dopazo, Joaquin %A Cubuk, Cankut %A Zamudio-Cuevas, Yessica %A Francisco-Balderas, Adriana %A Martínez-Flores, Karina %A Fernández-Torres, Javier %A Lozada-Pérez, Carlos %A Pineda, Carlos %A Sánchez-González, Austreberto %A Silveira, Luis H %A Burguete-García, Ana I %A Orbe-Orihuela, Citlalli %A Lagunas-Martínez, Alfredo %A Vazquez-Gomez, Alonso %A López-Reyes, Alberto %A Palacios-González, Berenice %A Martínez-Nava, Gabriela Angélica %K Biodiversity %K Computational Biology %K Dysbiosis %K Gastrointestinal Microbiome %K Gout %K Humans %K Metagenome %K metagenomics %K Protein Interaction Mapping %K Protein Interaction Maps %K Uric Acid %X

OBJECTIVE: To evaluate the taxonomic composition of the gut microbiome in gout patients with and without tophi formation, and predict bacterial functions that might have an impact on urate metabolism.

METHODS: Hypervariable V3-V4 regions of the bacterial 16S rRNA gene from fecal samples of gout patients with and without tophi (n = 33 and n = 25, respectively) were sequenced and compared to fecal samples from 53 healthy controls. We explored predictive functional profiles using bioinformatics in order to identify differences in taxonomy and metabolic pathways.

RESULTS: We identified a microbiome characterized by the lowest richness and a higher abundance of Phascolarctobacterium, Bacteroides, Akkermansia, and Ruminococcus_gnavus_group genera in patients with gout without tophi when compared to controls. The Proteobacteria phylum and the Escherichia-Shigella genus were more abundant in patients with tophaceous gout than in controls. Fold change analysis detected nine genera enriched in healthy controls compared to gout groups (Bifidobacterium, Butyricicoccus, Oscillobacter, Ruminococcaceae_UCG_010, Lachnospiraceae_ND2007_group, Haemophilus, Ruminococcus_1, Clostridium_sensu_stricto_1, and Ruminococcaceae_UCG_013). We found that the core microbiota of both gout groups shared Bacteroides caccae, Bacteroides stercoris ATCC 43183, and Bacteroides coprocola DSM 17136. These bacteria might perform functions linked to one-carbon metabolism, nucleotide binding, amino acid biosynthesis, and purine biosynthesis. Finally, we observed differences in key bacterial enzymes involved in urate synthesis, degradation, and elimination.

CONCLUSION: Our findings revealed that taxonomic variations in the gut microbiome of gout patients with and without tophi might have a functional impact on urate metabolism.

%B Mol Med %V 27 %P 50 %8 2021 05 24 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/34030623?dopt=Abstract %R 10.1186/s10020-021-00311-5 %0 Journal Article %J Nature Communications %D 2021 %T Uniform genomic data analysis in the NCI Genomic Data Commons %A Zhang, Zhenyu %A Hernandez, Kyle %A Savage, Jeremiah %A Li, Shenglai %A Miller, Dan %A Agrawal, Stuti %A Ortuno, Francisco %A Staudt, Louis M. %A Heath, Allison %A Grossman, Robert L. %B Nature Communications %V 12 %8 Jan-12-2021 %G eng %U http://www.nature.com/articles/s41467-021-21254-9 %N 1 %! Nat Commun %R 10.1038/s41467-021-21254-9 %0 Journal Article %J PLoS Comput Biol %D 2021 %T A versatile workflow to integrate RNA-seq genomic and transcriptomic data into mechanistic models of signaling pathways. %A Garrido-Rodriguez, Martín %A López-López, Daniel %A Ortuno, Francisco M %A Peña-Chilet, Maria %A Muñoz, Eduardo %A Calzado, Marco A %A Dopazo, Joaquin %K Algorithms %K Cell Line, Tumor %K Computational Biology %K Databases, Factual %K Gene Expression Profiling %K Genomics %K High-Throughput Nucleotide Sequencing %K Humans %K Models, Theoretical %K mutation %K RNA-seq %K Signal Transduction %K Software %K Transcriptome %K whole exome sequencing %K Workflow %X

MIGNON is a workflow for the analysis of RNA-Seq experiments, which not only efficiently manages the estimation of gene expression levels from raw sequencing reads, but also calls genomic variants present in the transcripts analyzed. Moreover, this is the first workflow that provides a framework for the integration of transcriptomic and genomic data based on a mechanistic model of signaling pathway activities that allows a detailed biological interpretation of the results, including a comprehensive functional profiling of cell activity. MIGNON covers the whole process, from reads to signaling circuit activity estimations, using state-of-the-art tools. It is easy to use and deployable in different computational environments, allowing an optimized use of the resources available.

%B PLoS Comput Biol %V 17 %P e1008748 %8 2021 02 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/33571195?dopt=Abstract %R 10.1371/journal.pcbi.1008748 %0 Journal Article %J EPMA J %D 2020 %T 10th Anniversary of the European Association for Predictive, Preventive and Personalised (3P) Medicine - EPMA World Congress Supplement 2020. %A Golubnitschaja, Olga %A Topolcan, Ondrej %A Kucera, Radek %A Costigliola, Vincenzo %X

In 2019, the EPMA celebrated its 10th anniversary at the 5th World Congress in Pilsen, Czech Republic. The history of the International Professional Network dedicated to Predictive, Preventive and Personalised Medicine (PPPM / 3PM) is rich in achievements. In the face of the COVID-19 pandemic, it is becoming evident globally that the predictive approach, targeted prevention and personalisation of medical services together form the optimal paradigm in healthcare, with high potential to save lives and to benefit society as a whole. The EPMA World Congress Supplement 2020 highlights advances in 3P medicine.

%B EPMA J %P 1-133 %8 2020 Aug 19 %G eng %R 10.1007/s13167-020-00206-1 %0 Journal Article %J Clin Microbiol Infect %D 2020 %T Association of a single nucleotide polymorphism in the ubxn6 gene with long-term non-progression phenotype in HIV-positive individuals. %A Díez-Fuertes, F %A De La Torre-Tarazona, H E %A Calonge, E %A Pernas, M %A Bermejo, M %A García-Pérez, J %A Álvarez, A %A Capa, L %A García-García, F %A Saumoy, M %A Riera, M %A Boland-Auge, A %A López-Galíndez, C %A Lathrop, M %A Dopazo, J %A Sakuntabhai, A %A Alcamí, J %K Adaptor Proteins, Vesicular Transport %K Autophagy-Related Proteins %K Caveolin 1 %K Cohort Studies %K Dendritic Cells %K Disease Progression %K Gene Frequency %K Gene Knockdown Techniques %K Genetic Association Studies %K HeLa Cells %K HIV Infections %K HIV Long-Term Survivors %K HIV-1 %K Humans %K Macrophages %K Oligonucleotide Array Sequence Analysis %K Phenotype %K Polymorphism, Single Nucleotide %K whole exome sequencing %X

OBJECTIVES: The long-term non-progressors (LTNPs) are a heterogeneous group of HIV-positive individuals characterized by their ability to maintain high CD4 T-cell counts and partially control viral replication for years in the absence of antiretroviral therapy. The present study aims to identify host single nucleotide polymorphisms (SNPs) associated with non-progression in a cohort of 352 individuals.

METHODS: DNA microarrays and exome sequencing were used for genotyping about 240 000 functional polymorphisms throughout more than 20 000 human genes. The allele frequencies of 85 LTNPs were compared with a control population. SNPs associated with LTNPs were confirmed in a population of typical progressors. Functional analyses in the affected gene were carried out through knockdown experiments in HeLa-P4, macrophages and dendritic cells.

RESULTS: Several SNPs located within the major histocompatibility complex region previously related to LTNPs were confirmed in this new cohort. The SNP rs1127888 (UBXN6) surpassed the statistical significance of these markers after Bonferroni correction (q = 2.11 × 10). An uncommon allelic frequency of rs1127888 among LTNPs was confirmed by comparison with typical progressors and other publicly available populations. UBXN6 knockdown experiments caused an increase in CAV1 expression and its accumulation in the plasma membrane. In vitro infection of different cell types with HIV-1 replication-competent recombinant viruses caused a reduction of the viral replication capacity compared with their corresponding wild-type cells expressing UBXN6.

CONCLUSIONS: A higher prevalence of Ala31Thr in UBXN6 was found among LTNPs within its N-terminal region, which is crucial for UBXN6/VCP protein complex formation. UBXN6 knockdown affected CAV1 turnover and HIV-1 replication capacity.

%B Clin Microbiol Infect %V 26 %P 107-114 %8 2020 Jan %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/31158522?dopt=Abstract %R 10.1016/j.cmi.2019.05.015 %0 Journal Article %J Cell Syst %D 2020 %T Community Assessment of the Predictability of Cancer Protein and Phosphoprotein Levels from Genomics and Transcriptomics. %A Yang, Mi %A Petralia, Francesca %A Li, Zhi %A Li, Hongyang %A Ma, Weiping %A Song, Xiaoyu %A Kim, Sunkyu %A Lee, Heewon %A Yu, Han %A Lee, Bora %A Bae, Seohui %A Heo, Eunji %A Kaczmarczyk, Jan %A Stępniak, Piotr %A Warchoł, Michał %A Yu, Thomas %A Calinawan, Anna P %A Boutros, Paul C %A Payne, Samuel H %A Reva, Boris %A Boja, Emily %A Rodriguez, Henry %A Stolovitzky, Gustavo %A Guan, Yuanfang %A Kang, Jaewoo %A Wang, Pei %A Fenyö, David %A Saez-Rodriguez, Julio %K Crowdsourcing %K Female %K Genomics %K Humans %K Machine Learning %K Male %K Neoplasms %K Phosphoproteins %K Proteins %K Proteomics %K Transcriptome %X

Cancer is driven by genomic alterations, but the processes causing this disease are largely performed by proteins. However, proteins are harder and more expensive to measure than genes and transcripts. To catalyze developments of methods to infer protein levels from other omics measurements, we leveraged crowdsourcing via the NCI-CPTAC DREAM proteogenomic challenge. We asked for methods to predict protein and phosphorylation levels from genomic and transcriptomic data in cancer patients. The best performance was achieved by an ensemble of models, including as predictors transcript level of the corresponding genes, interaction between genes, conservation across tumor types, and phosphosite proximity for phosphorylation prediction. Proteins from metabolic pathways and complexes were the best and worst predicted, respectively. The performance of even the best-performing model was modest, suggesting that many proteins are strongly regulated through translational control and degradation. Our results set a reference for the limitations of computational inference in proteogenomics. A record of this paper's transparent peer review process is included in the Supplemental Information.

%B Cell Syst %V 11 %P 186-195.e9 %8 2020 08 26 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/32710834?dopt=Abstract %R 10.1016/j.cels.2020.06.013 %0 Journal Article %J Sci Data %D 2020 %T COVID-19 Disease Map, building a computational repository of SARS-CoV-2 virus-host interaction mechanisms. %A Ostaszewski, Marek %A Mazein, Alexander %A Gillespie, Marc E %A Kuperstein, Inna %A Niarakis, Anna %A Hermjakob, Henning %A Pico, Alexander R %A Willighagen, Egon L %A Evelo, Chris T %A Hasenauer, Jan %A Schreiber, Falk %A Dräger, Andreas %A Demir, Emek %A Wolkenhauer, Olaf %A Furlong, Laura I %A Barillot, Emmanuel %A Dopazo, Joaquin %A Orta-Resendiz, Aurelio %A Messina, Francesco %A Valencia, Alfonso %A Funahashi, Akira %A Kitano, Hiroaki %A Auffray, Charles %A Balling, Rudi %A Schneider, Reinhard %K Betacoronavirus %K Computational Biology %K Coronavirus Infections %K COVID-19 %K Databases, Factual %K Host Microbial Interactions %K Host-Pathogen Interactions %K Humans %K International Cooperation %K Models, Biological %K Pandemics %K Pneumonia, Viral %K SARS-CoV-2 %B Sci Data %V 7 %P 136 %8 2020 05 05 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/32371892?dopt=Abstract %R 10.1038/s41597-020-0477-8 %0 Journal Article %J Signal Transduct Target Ther %D 2020 %T Drug repurposing for COVID-19 using machine learning and mechanistic models of signal transduction circuits related to SARS-CoV-2 infection. 
%A Loucera, Carlos %A Esteban-Medina, Marina %A Rian, Kinza %A Falco, Matias M %A Dopazo, Joaquin %A Peña-Chilet, Maria %K Computational Chemistry %K COVID-19 %K drug repositioning %K Humans %K Machine Learning %K Molecular Docking Simulation %K Molecular Targeted Therapy %K Proteins %K SARS-CoV-2 %K Signal Transduction %B Signal Transduct Target Ther %V 5 %P 290 %8 2020 12 11 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/33311438?dopt=Abstract %R 10.1038/s41392-020-00417-y %0 Journal Article %J F1000Res %D 2020 %T The ELIXIR Human Copy Number Variations Community: building bioinformatics infrastructure for research. %A Salgado, David %A Armean, Irina M %A Baudis, Michael %A Beltran, Sergi %A Capella-Gutíerrez, Salvador %A Carvalho-Silva, Denise %A Dominguez Del Angel, Victoria %A Dopazo, Joaquin %A Furlong, Laura I %A Gao, Bo %A Garcia, Leyla %A Gerloff, Dietlind %A Gut, Ivo %A Gyenesei, Attila %A Habermann, Nina %A Hancock, John M %A Hanauer, Marc %A Hovig, Eivind %A Johansson, Lennart F %A Keane, Thomas %A Korbel, Jan %A Lauer, Katharina B %A Laurie, Steve %A Leskošek, Brane %A Lloyd, David %A Marqués-Bonet, Tomás %A Mei, Hailiang %A Monostory, Katalin %A Piñero, Janet %A Poterlowicz, Krzysztof %A Rath, Ana %A Samarakoon, Pubudu %A Sanz, Ferran %A Saunders, Gary %A Sie, Daoud %A Swertz, Morris A %A Tsukanov, Kirill %A Valencia, Alfonso %A Vidak, Marko %A Yenyxe González, Cristina %A Ylstra, Bauke %A Béroud, Christophe %K Computational Biology %K DNA Copy Number Variations %K High-Throughput Nucleotide Sequencing %K Humans %X

Copy number variations (CNVs) are major causative contributors in the genesis of both genetic diseases and human neoplasias. While "High-Throughput" sequencing technologies are increasingly becoming the primary choice for genomic screening analysis, their ability to efficiently detect CNVs is still heterogeneous and remains to be developed. The aim of this white paper is to provide a guiding framework for the future contributions of ELIXIR's recently established Human Copy Number Variations Community, with implications beyond human disease diagnostics and population genomics. This white paper is the direct result of a strategy meeting that took place in September 2018 in Hinxton (UK) and involved representatives of 11 ELIXIR Nodes. The meeting led to the definition of priority objectives and tasks, to address a wide range of CNV-related challenges ranging from detection and interpretation to sharing and training. Here, we provide suggestions on how to align these tasks within the ELIXIR Platforms strategy, and on how to frame the activities of this new ELIXIR Community in the international context.

%B F1000Res %V 9 %8 2020 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/34367618?dopt=Abstract %& 1229 %R 10.12688/f1000research.24887.1 %0 Journal Article %J iScience %D 2020 %T Immune Cell Associations with Cancer Risk. %A Palomero, Luis %A Galván-Femenía, Ivan %A de Cid, Rafael %A Espín, Roderic %A Barnes, Daniel R %A Blommaert, Eline %A Gil-Gil, Miguel %A Falo, Catalina %A Stradella, Agostina %A Ouchi, Dan %A Roso-Llorach, Albert %A Violan, Concepció %A Peña-Chilet, Maria %A Dopazo, Joaquin %A Extremera, Ana Isabel %A García-Valero, Mar %A Herranz, Carmen %A Mateo, Francesca %A Mereu, Elisabetta %A Beesley, Jonathan %A Chenevix-Trench, Georgia %A Roux, Cecilia %A Mak, Tak %A Brunet, Joan %A Hakem, Razq %A Gorrini, Chiara %A Antoniou, Antonis C %A Lázaro, Conxi %A Pujana, Miquel Angel %X

Proper immune system function hinders cancer development, but little is known about whether genetic variants linked to cancer risk alter immune cells. Here, we report 57 cancer risk loci associated with differences in immune and/or stromal cell contents in the corresponding tissue. Predicted target genes show expression and regulatory associations with immune features. Polygenic risk scores also reveal associations with immune and/or stromal cell contents, and breast cancer scores show consistent results in normal and tumor tissue. SH2B3 links peripheral alterations of several immune cell types to the risk of this malignancy. Pleiotropic SH2B3 variants are associated with breast cancer risk in BRCA1/2 mutation carriers. A retrospective case-cohort study indicates a positive association between blood counts of basophils, leukocytes, and monocytes and age at breast cancer diagnosis. These findings broaden our knowledge of the role of the immune system in cancer and highlight promising prevention strategies for individuals at high risk.

%B iScience %V 23 %P 101296 %8 2020 Jul 24 %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/32622267?dopt=Abstract %R 10.1016/j.isci.2020.101296 %0 Journal Article %J Gac Sanit %D 2020 %T [Impact assessment on data protection in research projects]. %A García-León, Francisco Javier %A Villegas-Portero, Román %A Goicoechea-Salazar, Juan Antonio %A Muñoyerro-Muñiz, Dolores %A Dopazo, Joaquin %K Computer Security %K Humans %X

Recent changes in European regulations for personal data protection still allow the use of health data for research purposes, but they have set the Impact Assessment on Data Protection as an instrument for reflection and risk analysis in the process of data processing. The publication of a guide for facilitates this impact assessment, although it is not directly applicable to research projects. Experience in a specific project is detailed, showing how the context of the treatment becomes relevant with respect to the data characteristics. Carrying out an impact assessment is an opportunity to ensure compliance with the principles of data protection in an increasingly complex environment with greater ethical challenges.

%B Gac Sanit %V 34 %P 521-523 %8 2020 Sep - Oct %G spa %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/31980148?dopt=Abstract %R 10.1016/j.gaceta.2019.10.006 %0 Journal Article %J NAR Cancer %D 2020 %T Mechanistic models of signaling pathways deconvolute the glioblastoma single-cell functional landscape %A Falco, Matias M %A Peña-Chilet, Maria %A Loucera, Carlos %A Hidalgo, Marta R %A Dopazo, Joaquin %B NAR Cancer %V 2 %8 Jan-06-2020 %G eng %U https://academic.oup.com/narcancer/article/doi/10.1093/narcan/zcaa011/5862620 http://academic.oup.com/narcancer/article-pdf/2/2/zcaa011/33428092/zcaa011.pdf %N 2 %R 10.1093/narcan/zcaa011 %0 Journal Article %J Cells %D 2020 %T Mechanistic Models of Signaling Pathways Reveal the Drug Action Mechanisms behind Gender-Specific Gene Expression for Cancer Treatments. %A Cubuk, Cankut %A Can, Fatma E %A Peña-Chilet, Maria %A Dopazo, Joaquin %K Female %K Gene Expression Regulation, Neoplastic %K Humans %K Male %K Neoplasms %K Signal Transduction %X

Although differences in gene expression across numerous genes between males and females have been known for a long time, they have been mostly ignored in many studies, including those on drug development and therapeutic use. In fact, the consequences of such differences for disease mechanisms or drug action mechanisms are completely unknown. Here we applied mechanistic mathematical models of signaling activity to reveal the ultimate functional consequences that gender-specific gene expression activities have for cell functionality and fate. Moreover, we also used the mechanistic modeling framework to simulate drug interventions and unravel how drug action mechanisms are affected by gender-specific differential gene expression. Interestingly, some cancers have many biological processes significantly affected by these gender-specific differences (e.g., bladder or head and neck carcinomas), while others (e.g., glioblastoma or rectum cancer) are almost insensitive to them. We found that many of these gender-specific differences affect cancer-specific pathways or physiological signaling pathways that are also involved in cancer origin and development. Finally, mechanistic models have the potential to be used for finding alternative therapeutic interventions on the pathways targeted by the drug, which lead to similar results by compensating for the downstream consequences of gender-specific differences in gene expression.

%B Cells %V 9 %8 2020 06 29 %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/32610626?dopt=Abstract %R 10.3390/cells9071579 %0 Journal Article %J J Immunother Cancer %D 2020 %T Nivolumab and sunitinib combination in advanced soft tissue sarcomas: a multicenter, single-arm, phase Ib/II trial. %A Martin-Broto, Javier %A Hindi, Nadia %A Grignani, Giovanni %A Martinez-Trufero, Javier %A Redondo, Andres %A Valverde, Claudia %A Stacchiotti, Silvia %A Lopez-Pousa, Antonio %A D'Ambrosio, Lorenzo %A Gutierrez, Antonio %A Perez-Vega, Herminia %A Encinas-Tobajas, Victor %A de Alava, Enrique %A Collini, Paola %A Peña-Chilet, Maria %A Dopazo, Joaquin %A Carrasco-Garcia, Irene %A Lopez-Alvarez, Maria %A Moura, David S %A Lopez-Martin, Jose A %K Adult %K Aged %K Antineoplastic Agents, Immunological %K Female %K Humans %K Male %K Middle Aged %K Nivolumab %K Sarcoma %K Sunitinib %K Young Adult %X

BACKGROUND: Sarcomas exhibit low expression of factors related to immune response, which could explain the modest activity of PD-1 inhibitors. A potential strategy to convert a cold into an inflamed microenvironment lies in combination therapy. As tumor angiogenesis promotes immunosuppression, we designed a phase Ib/II trial to test the double inhibition of angiogenesis (sunitinib) and the PD-1/PD-L1 axis (nivolumab).

METHODS: This single-arm, phase Ib/II trial enrolled adult patients with selected subtypes of sarcoma. Phase Ib established two dose levels: level 0 with sunitinib 37.5 mg daily from day 1, plus nivolumab 3 mg/kg intravenously on day 15, and then every 2 weeks; and level -1 with sunitinib 37.5 mg on the first 14 days (induction) and then 25 mg per day plus nivolumab on the same schedule. The primary endpoint was to determine the recommended dose for phase II (phase I) and the 6-month progression-free survival rate, according to Response Evaluation Criteria in Solid Tumors 1.1 (phase II).

RESULTS: From May 2017 to April 2019, 68 patients were enrolled: 16 in phase Ib and 52 in phase II. The recommended dose of sunitinib for phase II was 37.5 mg as induction and then 25 mg in combination with nivolumab. After a median follow-up of 17 months (4-26), the 6-month progression-free survival rate was 48% (95% CI 41% to 55%). The most common grade 3-4 adverse events included transaminitis (17.3%) and neutropenia (11.5%).

CONCLUSIONS: Sunitinib plus nivolumab is an active scheme with manageable toxicity in the treatment of selected patients with advanced soft tissue sarcoma, with almost half of patients free of progression at 6 months. NCT03277924.

%B J Immunother Cancer %V 8 %8 2020 11 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/33203665?dopt=Abstract %R 10.1136/jitc-2020-001561 %0 Journal Article %J J Med Genet %D 2020 %T Optimised molecular genetic diagnostics of Fanconi anaemia by whole exome sequencing and functional studies. %A Bogliolo, Massimo %A Pujol, Roser %A Aza-Carmona, Miriam %A Muñoz-Subirana, Núria %A Rodriguez-Santiago, Benjamin %A Casado, José Antonio %A Rio, Paula %A Bauser, Christopher %A Reina-Castillón, Judith %A Lopez-Sanchez, Marcos %A Gonzalez-Quereda, Lidia %A Gallano, Pia %A Catalá, Albert %A Ruiz-Llobet, Ana %A Badell, Isabel %A Diaz-Heredia, Cristina %A Hladun, Raquel %A Senent, Leonort %A Argiles, Bienvenida %A Bergua Burgues, Juan Miguel %A Bañez, Fatima %A Arrizabalaga, Beatriz %A López Almaraz, Ricardo %A Lopez, Monica %A Figuera, Ángela %A Molinés, Antonio %A Pérez de Soto, Inmaculada %A Hernando, Inés %A Muñoz, Juan Antonio %A Del Rosario Marin, Maria %A Balmaña, Judith %A Stjepanovic, Neda %A Carrasco, Estela %A Cuesta, Isabel %A Cosuelo, José Miguel %A Regueiro, Alexandra %A Moraleda Jimenez, José %A Galera-Miñarro, Ana Maria %A Rosiñol, Laura %A Carrió, Anna %A Beléndez-Bieler, Cristina %A Escudero Soto, Antonio %A Cela, Elena %A de la Mata, Gregorio %A Fernández-Delgado, Rafael %A Garcia-Pardos, Maria Carmen %A Sáez-Villaverde, Raquel %A Barragaño, Marta %A Portugal, Raquel %A Lendinez, Francisco %A Hernadez, Ines %A Vagace, José Manue %A Tapia, Maria %A Nieto, José %A Garcia, Marta %A Gonzalez, Macarena %A Vicho, Cristina %A Galvez, Eva %A Valiente, Alberto %A Antelo, Maria Luisa %A Ancliff, Phil %A García, Francisco %A Dopazo, Joaquin %A Sevilla, Julian %A Paprotka, Tobias %A Pérez-Jurado, Luis Alberto %A Bueren, Juan %A Surralles, Jordi %K Cell Line %K DNA Copy Number Variations %K DNA Repair %K DNA-Binding Proteins %K Fanconi Anemia %K Fanconi Anemia Complementation Group A Protein %K Female %K Gene Knockout Techniques %K Genetic Predisposition to Disease %K Humans %K Male %K Mutation, Missense %K Polymorphism, Single Nucleotide %K whole exome sequencing %X

PURPOSE: Patients with Fanconi anaemia (FA), a rare DNA repair genetic disease, exhibit chromosome fragility, bone marrow failure, malformations and cancer susceptibility. FA molecular diagnosis is challenging since FA is caused by point mutations and large deletions in 22 genes following three heritability patterns. To optimise FA patients' characterisation, we developed a simplified but effective methodology based on whole exome sequencing (WES) and functional studies.

METHODS: 68 patients with FA were analysed by commercial WES services. Copy number variations were evaluated by sequencing data analysis with RStudio. To test missense variants, wt FANCA cDNA was cloned and variants were introduced by site-directed mutagenesis. Vectors were then tested for their ability to complement DNA repair defects of a FANCA-KO human cell line generated by TALEN technologies.

RESULTS: We identified 93.3% of mutated alleles including large deletions. We determined the pathogenicity of three FANCA missense variants and demonstrated that two variants reported in mutations databases as 'affecting functions' are SNPs. Deep analysis of sequencing data revealed patients' true mutations, highlighting the importance of functional analysis. In one patient, no pathogenic variant could be identified in any of the 22 known FA genes, and in seven patients, only one deleterious variant could be identified (three patients each with FANCA and FANCD2 and one patient with FANCE mutations).

CONCLUSION: WES and proper bioinformatics analysis are sufficient to effectively characterise patients with FA regardless of the rarity of their complementation group, type of mutations, mosaic condition and DNA source.

%B J Med Genet %V 57 %P 258-268 %8 2020 04 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/31586946?dopt=Abstract %R 10.1136/jmedgenet-2019-106249 %0 Journal Article %J Lancet Oncol %D 2020 %T Pazopanib for treatment of typical solitary fibrous tumours: a multicentre, single-arm, phase 2 trial. %A Martin-Broto, Javier %A Cruz, Josefina %A Penel, Nicolas %A Le Cesne, Axel %A Hindi, Nadia %A Luna, Pablo %A Moura, David S %A Bernabeu, Daniel %A de Alava, Enrique %A Lopez-Guerrero, Jose Antonio %A Dopazo, Joaquin %A Peña-Chilet, Maria %A Gutierrez, Antonio %A Collini, Paola %A Karanian, Marie %A Redondo, Andres %A Lopez-Pousa, Antonio %A Grignani, Giovanni %A Diaz-Martin, Juan %A Marcilla, David %A Fernandez-Serra, Antonio %A Gonzalez-Aguilera, Cristina %A Casali, Paolo G %A Blay, Jean-Yves %A Stacchiotti, Silvia %K Aged %K Female %K Follow-Up Studies %K Humans %K Indazoles %K Male %K Middle Aged %K Neoplasm Metastasis %K Prognosis %K Prospective Studies %K Protein Kinase Inhibitors %K Pyrimidines %K Response Evaluation Criteria in Solid Tumors %K Solitary Fibrous Tumors %K Sulfonamides %K Survival Rate %X

BACKGROUND: Solitary fibrous tumour is an ultra-rare sarcoma, which encompasses different clinicopathological subgroups. The dedifferentiated subgroup shows an aggressive course with resistance to pazopanib, whereas in the malignant subgroup, pazopanib shows higher activity than in previous studies with chemotherapy. We designed a trial to test pazopanib activity in two different cohorts of solitary fibrous tumour: the malignant-dedifferentiated cohort, which was previously published, and the typical cohort, which is presented here.

METHODS: In this single-arm, phase 2 trial, adult patients (aged ≥18 years) diagnosed with confirmed metastatic or unresectable typical solitary fibrous tumour of any location, who had progressed in the previous 6 months (by Choi criteria or Response Evaluation Criteria in Solid Tumors [RECIST]) and an Eastern Cooperative Oncology Group (ECOG) performance status of 0-2 were enrolled at 11 tertiary hospitals in Italy, France, and Spain. Patients received pazopanib 800 mg once daily, taken orally, until progression, unacceptable toxicity, withdrawal of consent, non-compliance, or a delay in pazopanib administration of longer than 3 weeks. The primary endpoint was proportion of patients achieving an overall response measured by Choi criteria in patients who received at least 1 month of treatment with at least one radiological assessment. All patients who received at least one dose of the study drug were included in the safety analyses. This study is registered in ClinicalTrials.gov, NCT02066285, and with the European Clinical Trials Database, EudraCT 2013-005456-15.

FINDINGS: From June 26, 2014, to Dec 13, 2018, of 40 patients who were assessed, 34 patients were enrolled and 31 patients were included in the response analysis. Median follow-up was 18 months (IQR 14-34), and 18 (58%) of 31 patients had a partial response, 12 (39%) had stable disease, and one (3%) showed progressive disease according to Choi criteria and central review. The proportion of overall response based on Choi criteria was 58% (95% CI 34-69). There were no deaths caused by toxicity, and the most frequent adverse events were diarrhoea (18 [53%] of 34 patients), fatigue (17 [50%]), and hypertension (17 [50%]).

INTERPRETATION: To our knowledge, this is the first prospective trial of pazopanib for advanced typical solitary fibrous tumour. The manageable toxicity and activity shown by pazopanib in this cohort suggest that this drug could be considered as first-line treatment for advanced typical solitary fibrous tumour.

FUNDING: Spanish Group for Research on Sarcomas (GEIS), Italian Sarcoma Group (ISG), French Sarcoma Group (FSG), GlaxoSmithKline, and Novartis.

%B Lancet Oncol %V 21 %P 456-466 %8 2020 03 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/32066540?dopt=Abstract %R 10.1016/S1470-2045(19)30826-5 %0 Journal Article %J Stem Cells %D 2020 %T Platform to study intracellular polystyrene nanoplastic pollution and clinical outcomes. %A Bojic, Sanja %A Falco, Matias M %A Stojkovic, Petra %A Ljujic, Biljana %A Gazdic Jankovic, Marina %A Armstrong, Lyle %A Markovic, Nebojsa %A Dopazo, Joaquin %A Lako, Majlinda %A Bauer, Roman %A Stojkovic, Miodrag %K Environmental Pollution %K Humans %K Induced Pluripotent Stem Cells %K Intracellular Space %K Nanoparticles %K Plastics %K Polystyrenes %K Transcriptome %K Treatment Outcome %X

Increased pollution by plastics has become a serious global environmental problem, and concerns for human health have been raised after the reported presence of microplastics (MPs) and nanoplastics (NPs) in food and beverages. Unfortunately, few studies have investigated the potentially harmful effects of MPs/NPs on early human development and human health. Therefore, we used a new platform to study possible effects of polystyrene NPs (PSNPs) on the transcription profile of preimplantation human embryos and human induced pluripotent stem cells (hiPSCs). Two pluripotency genes, LEFTY1 and LEFTY2, which encode secreted ligands of the transforming growth factor-beta, were downregulated, while CA4 and OCLM, which are related to eye development, were upregulated in both samples. The gene set enrichment analysis showed that the development of atrioventricular heart valves and the dysfunction of cellular components, including extracellular matrix, were significantly affected after exposure of hiPSCs to PSNPs. Finally, using the HiPathia method, which uncovers disease mechanisms and predicts clinical outcomes, we determined the APOC3 circuit, which is responsible for increased risk for ischemic cardiovascular disease. These results clearly demonstrate that a better understanding of NP bioactivities and their implications for human health is of extreme importance. Thus, the presented platform opens further aspects to study interactions between different environmental and intracellular pollutions with the aim to decipher the mechanism and origin of human diseases.

%B Stem Cells %V 38 %P 1321-1325 %8 2020 10 01 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/32614127?dopt=Abstract %R 10.1002/stem.3244 %0 Journal Article %J Hum Mutat %D 2020 %T SMN1 copy-number and sequence variant analysis from next-generation sequencing data. %A López-López, Daniel %A Loucera, Carlos %A Carmona, Rosario %A Aquino, Virginia %A Salgado, Josefa %A Pasalodos, Sara %A Miranda, María %A Alonso, Ángel %A Dopazo, Joaquin %K Base Sequence %K DNA Copy Number Variations %K High-Throughput Nucleotide Sequencing %K Humans %K Reproducibility of Results %K Software %K Survival of Motor Neuron 1 Protein %X

Spinal muscular atrophy (SMA) is a severe neuromuscular autosomal recessive disorder affecting 1/10,000 live births. Most SMA patients present homozygous deletion of SMN1, while the vast majority of SMA carriers present only a single SMN1 copy. The sequence similarity between SMN1 and SMN2 and the complexity of the SMN locus make the estimation of the SMN1 copy-number by next-generation sequencing (NGS) very difficult. Here, we present SMAca, the first Python tool to detect SMA carriers and estimate the absolute SMN1 copy-number using NGS data. Moreover, SMAca takes advantage of the knowledge of certain variants specific to SMN1 duplication to also identify silent carriers. This tool has been validated with a cohort of 326 samples from the Navarra 1000 Genomes Project (NAGEN1000). SMAca was developed with a focus on execution speed and easy installation. This combination makes it especially suitable to be integrated into production NGS pipelines. Source code and documentation are available at https://www.github.com/babelomics/SMAca.

%B Hum Mutat %V 41 %P 2073-2077 %8 2020 12 %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/33058415?dopt=Abstract %R 10.1002/humu.24120 %0 Journal Article %J IEEE J Biomed Health Inform %D 2020 %T Towards Improving Skin Cancer Diagnosis by Integrating Microarray and RNA-Seq Datasets. %A Galvez, Juan M %A Castillo-Secilla, Daniel %A Herrera, Luis J %A Valenzuela, Olga %A Caba, Octavio %A Prados, Jose C %A Ortuno, Francisco M %A Rojas, Ignacio %K Biomarkers, Tumor %K Computational Biology %K Diagnosis, Computer-Assisted %K Gene Expression Profiling %K Humans %K Machine Learning %K RNA-seq %K Skin Neoplasms %X

Many clinical studies have revealed the high biological similarities existing among different skin pathological states. These similarities create difficulties in the efficient diagnosis of skin cancer, and encourage the study and design of new intelligent clinical decision support systems. In this sense, gene expression analysis can help find differentially expressed genes (DEGs) simultaneously discerning multiple skin pathological states in a single test. The integration of multiple heterogeneous transcriptomic datasets requires different pipeline stages to be properly designed: from suitable batch merging and efficient biomarker selection to automated classification assessment. This article presents a novel approach addressing all these technical issues, with the intention of providing new insights into skin cancer diagnosis. Although further efforts will have to be made in the search for better biomarkers recognizing specific skin pathological states, our study found a panel of 8 highly relevant multiclass DEGs for discerning up to 10 skin pathological states: 2 healthy skin conditions a priori, 2 cataloged precancerous skin diseases and 6 cancerous skin states. Their power of diagnosis over new samples was widely tested by previously well-trained classification models. Robust performance metrics such as overall and mean multiclass F1-score outperformed recognition rates of 94% and 80%, respectively. Clinicians should give special attention to highlighted multiclass DEGs that have high gene expression changes present among them, and understand their biological relationship to different skin pathological states.

%B IEEE J Biomed Health Inform %V 24 %P 2119-2130 %8 2020 07 %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/31871000?dopt=Abstract %R 10.1109/JBHI.2019.2953978 %0 Journal Article %J Genes (Basel) %D 2020 %T Transcriptomic Analysis of a Diabetic Skin-Humanized Mouse Model Dissects Molecular Pathways Underlying the Delayed Wound Healing Response. %A León, Carlos %A Garcia-Garcia, Francisco %A Llames, Sara %A García-Pérez, Eva %A Carretero, Marta %A Arriba, María Del Carmen %A Dopazo, Joaquin %A Del Rio, Marcela %A Escamez, Maria José %A Martínez-Santamaría, Lucía %K Animals %K Diabetes Mellitus, Experimental %K Gene Expression Profiling %K Gene Expression Regulation %K Gene ontology %K Humans %K Metabolic Networks and Pathways %K Mice %K Mice, Nude %K Microarray Analysis %K Molecular Sequence Annotation %K Principal Component Analysis %K Signal Transduction %K Skin %K Skin Transplantation %K Skin Ulcer %K Streptozocin %K Tissue Engineering %K Transcriptome %K Transplantation, Heterologous %K Wound Healing %X

Defective healing leading to cutaneous ulcer formation is one of the most feared complications of diabetes due to its consequences on patients' quality of life and on the healthcare system. A more in-depth analysis of the underlying molecular pathophysiology is required to develop effective healing-promoting therapies for those patients. Major architectural and functional differences with human epidermis limit extrapolation of results coming from rodents and other small mammal-healing models. Therefore, the search for reliable humanized models has become mandatory. Previously, we developed a diabetes-induced delayed humanized wound healing model that faithfully recapitulated the major histological features of this skin repair-deficient condition. Herein, we present the results of a transcriptomic and functional enrichment analysis followed by a mechanistic analysis performed in this humanized wound healing model. The deregulation of genes implicated in functions such as angiogenesis, apoptosis, and inflammatory signaling processes was evidenced, confirming published data in diabetic patients that in fact might also underlie some of the histological features previously reported in the delayed skin-humanized healing model. Altogether, these molecular findings support the utility of this preclinical model as a valuable tool to gain insight into the molecular basis of the delayed diabetic healing with potential impact in the translational medicine field.

%B Genes (Basel) %V 12 %8 2020 12 31 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/33396192?dopt=Abstract %R 10.3390/genes12010047 %0 Journal Article %J Nature %D 2020 %T Transparency and reproducibility in artificial intelligence. %A Haibe-Kains, Benjamin %A Adam, George Alexandru %A Hosny, Ahmed %A Khodakarami, Farnoosh %A Waldron, Levi %A Wang, Bo %A McIntosh, Chris %A Goldenberg, Anna %A Kundaje, Anshul %A Greene, Casey S %A Broderick, Tamara %A Hoffman, Michael M %A Leek, Jeffrey T %A Korthauer, Keegan %A Huber, Wolfgang %A Brazma, Alvis %A Pineau, Joelle %A Tibshirani, Robert %A Hastie, Trevor %A Ioannidis, John P A %A Quackenbush, John %A Aerts, Hugo J W L %K Algorithms %K Artificial Intelligence %K Reproducibility of Results %B Nature %V 586 %P E14-E16 %8 2020 10 %G eng %N 7829 %1 https://www.ncbi.nlm.nih.gov/pubmed/33057217?dopt=Abstract %R 10.1038/s41586-020-2766-y %0 Journal Article %J Bioinformatics %D 2020 %T Using AnABlast for intergenic sORF prediction in the Caenorhabditis elegans genome. %A Casimiro-Soriguer, C S %A Rigual, M M %A Brokate-Llanos, A M %A Muñoz, M J %A Garzón, A %A Pérez-Pulido, A J %A Jimenez, J %K Animals %K Caenorhabditis elegans %K Computational Biology %K Genome %K Open Reading Frames %K Software %X

MOTIVATION: Short bioactive peptides encoded by small open reading frames (sORFs) play important roles in eukaryotes. Bioinformatics prediction of ORFs is an early step in a genome sequence analysis, but sORFs encoding short peptides, often using non-AUG initiation codons, are not easily discriminated from false ORFs occurring by chance.

RESULTS: AnABlast is a computational tool designed to highlight putative protein-coding regions in genomic DNA sequences. This protein-coding finder is independent of ORF length and reading frame shifts, making AnABlast a potentially useful tool to predict sORFs. Using this algorithm, here, we report the identification of 82 putative new intergenic sORFs in the Caenorhabditis elegans genome. Sequence similarity, motif presence, expression data and RNA interference experiments support that the underlined sORFs likely encode functional peptides, encouraging the use of AnABlast as a new approach for the accurate prediction of intergenic sORFs in annotated eukaryotic genomes.

AVAILABILITY AND IMPLEMENTATION: AnABlast is freely available at http://www.bioinfocabd.upo.es/ab/. The C.elegans genome browser with AnABlast results, annotated genes and all data used in this study is available at http://www.bioinfocabd.upo.es/celegans.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

%B Bioinformatics %V 36 %P 4827-4832 %8 2020 12 08 %G eng %N 19 %1 https://www.ncbi.nlm.nih.gov/pubmed/32614398?dopt=Abstract %R 10.1093/bioinformatics/btaa608 %0 Journal Article %J Biol Direct %D 2019 %T Antibiotic resistance and metabolic profiles as functional biomarkers that accurately predict the geographic origin of city metagenomics samples. %A Casimiro-Soriguer, Carlos S %A Loucera, Carlos %A Perez Florido, Javier %A López-López, Daniel %A Dopazo, Joaquin %K biomarkers %K Cities %K Drug Resistance, Microbial %K Machine Learning %K Metabolome %K Metagenome %K metagenomics %K Microbiota %X

BACKGROUND: The availability of hundreds of city microbiome profiles allows the development of increasingly accurate predictors of the origin of a sample based on its microbiota composition. Typical microbiome studies involve the analysis of bacterial abundance profiles.

RESULTS: Here we use a transformation of the conventional bacterial strain or gene abundance profiles to functional profiles that account for bacterial metabolism and other cell functionalities. These profiles are used as features for city classification in a machine learning algorithm that allows the extraction of the most relevant features for the classification.

CONCLUSIONS: We demonstrate here that the use of functional profiles not only accurately predicts the most likely origin of a sample but also provides an interesting functional point of view of the biogeography of the microbiota. Interestingly, we show how cities can be classified based on the observed profile of antibiotic resistances.

REVIEWERS: Open peer review: Reviewed by Jin Zhuang Dou, Jing Zhou, Torsten Semmler and Eran Elhaik.

%B Biol Direct %V 14 %P 15 %8 2019 08 20 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/31429791?dopt=Abstract %R 10.1186/s13062-019-0246-9 %0 Journal Article %J Nat Commun %D 2019 %T Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen. %A Menden, Michael P %A Wang, Dennis %A Mason, Mike J %A Szalai, Bence %A Bulusu, Krishna C %A Guan, Yuanfang %A Yu, Thomas %A Kang, Jaewoo %A Jeon, Minji %A Wolfinger, Russ %A Nguyen, Tin %A Zaslavskiy, Mikhail %A Jang, In Sock %A Ghazoui, Zara %A Ahsen, Mehmet Eren %A Vogel, Robert %A Neto, Elias Chaibub %A Norman, Thea %A Tang, Eric K Y %A Garnett, Mathew J %A Veroli, Giovanni Y Di %A Fawell, Stephen %A Stolovitzky, Gustavo %A Guinney, Justin %A Dry, Jonathan R %A Saez-Rodriguez, Julio %K ADAM17 Protein %K Antineoplastic Combined Chemotherapy Protocols %K Benchmarking %K Biomarkers, Tumor %K Cell Line, Tumor %K Computational Biology %K Datasets as Topic %K Drug Antagonism %K Drug Resistance, Neoplasm %K Drug Synergism %K Genomics %K Humans %K Molecular Targeted Therapy %K mutation %K Neoplasms %K pharmacogenetics %K Phosphatidylinositol 3-Kinases %K Phosphoinositide-3 Kinase Inhibitors %K Treatment Outcome %X

The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca's large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. 160 teams participated to provide a comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationales for synergy predictions are identified, including ADAM17 inhibitor antagonism when combined with PIK3CB/D inhibition contrasting to synergy when combined with other PI3K-pathway inhibitors in PIK3CA mutant cells.

%B Nat Commun %V 10 %P 2674 %8 2019 06 17 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/31209238?dopt=Abstract %R 10.1038/s41467-019-09799-2 %0 Journal Article %J Brief Bioinform %D 2019 %T A comparison of mechanistic signaling pathway activity analysis methods. %A Amadoz, Alicia %A Hidalgo, Marta R %A Cubuk, Cankut %A Carbonell-Caballero, José %A Dopazo, Joaquin %K Algorithms %K Humans %K Postmortem Changes %K Signal Transduction %K Systems biology %K Transcriptome %X

Understanding the aspects of cell functionality that account for disease mechanisms or drug modes of action is a main challenge for precision medicine. Classical gene-based approaches ignore the modular nature of most human traits, whereas conventional pathway enrichment approaches produce only illustrative results of limited practical utility. Recently, a family of new methods has emerged that changes the focus from the whole pathways to the definition of elementary subpathways within them that have any mechanistic significance and to the study of their activities. Thus, mechanistic pathway activity (MPA) methods constitute a new paradigm that allows recoding poorly informative genomic measurements into cell activity quantitative values and relating them to phenotypes. Here we provide a review on the MPA methods available and explain their contribution to systems medicine approaches for addressing challenges in the diagnosis and treatment of complex diseases.

%B Brief Bioinform %V 20 %P 1655-1668 %8 2019 09 27 %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/29868818?dopt=Abstract %R 10.1093/bib/bby040 %0 Journal Article %J NPJ Syst Biol Appl %D 2019 %T Differential metabolic activity and discovery of therapeutic targets using summarized metabolic pathway models. %A Cubuk, Cankut %A Hidalgo, Marta R %A Amadoz, Alicia %A Rian, Kinza %A Salavert, Francisco %A Pujana, Miguel A %A Mateo, Francesca %A Herranz, Carmen %A Carbonell-Caballero, José %A Dopazo, Joaquin %K Computational Biology %K Computer Simulation %K Drug discovery %K Gene Regulatory Networks %K Humans %K Internet %K Metabolic Networks and Pathways %K Models, Biological %K Neoplasms %K Phenotype %K Software %K Transcriptome %X

In spite of the increasing availability of genomic and transcriptomic data, there is still a gap between the detection of perturbations in gene expression and the understanding of their contribution to the molecular mechanisms that ultimately account for the phenotype studied. Alterations in the metabolism are behind the initiation and progression of many diseases, including cancer. The wealth of available knowledge on metabolic processes can therefore be used to derive mechanistic models that link gene expression perturbations to changes in metabolic activity that provide relevant clues on molecular mechanisms of disease and drug modes of action (MoA). In particular, pathway modules, which recapitulate the main aspects of metabolism, are especially suitable for this type of modeling. We present Metabolizer, a web-based application that offers an intuitive, easy-to-use interactive interface to analyze differences in pathway metabolic module activities that can also be used for class prediction and in silico prediction of knock-out (KO) effects. Moreover, Metabolizer can automatically predict the optimal KO intervention for restoring a diseased phenotype. We provide different types of validations of some of the predictions made by Metabolizer. Metabolizer is a web tool that allows understanding molecular mechanisms of disease or the MoA of drugs within the context of the metabolism by using gene expression measurements. In addition, this tool automatically suggests potential therapeutic targets for individualized therapeutic interventions.

%B NPJ Syst Biol Appl %V 5 %P 7 %8 2019 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/30854222?dopt=Abstract %R 10.1038/s41540-019-0087-2 %0 Journal Article %J BMC Bioinformatics %D 2019 %T Exploring the druggable space around the Fanconi anemia pathway using machine learning and mechanistic models. %A Esteban-Medina, Marina %A Peña-Chilet, Maria %A Loucera, Carlos %A Dopazo, Joaquin %K Databases, Factual %K Fanconi Anemia %K Genomics %K Humans %K Machine Learning %K Phenotype %K Proteins %K Signal Transduction %X

BACKGROUND: In spite of the abundance of genomic data, predictive models that describe phenotypes as a function of gene expression or mutations are difficult to obtain because they are affected by the curse of dimensionality, given the imbalance between samples and candidate genes. This is especially dramatic in scenarios in which samples are difficult to obtain, such as the case of rare diseases.

RESULTS: The application of multi-output regression machine learning methodologies to predict the potential effect of external proteins over the signaling circuits that trigger Fanconi anemia related cell functionalities, inferred with a mechanistic model, allowed us to detect over 20 potential therapeutic targets.

CONCLUSIONS: The use of artificial intelligence methods for the prediction of potentially causal relationships between proteins of interest and cell activities related with disease-related phenotypes opens promising avenues for the systematic search of new targets in rare diseases.

%B BMC Bioinformatics %V 20 %P 370 %8 2019 07 02 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/31266445?dopt=Abstract %R 10.1186/s12859-019-2969-0 %0 Journal Article %J Br J Dermatol %D 2019 %T Fibroblast activation and abnormal extracellular matrix remodelling as common hallmarks in three cancer-prone genodermatoses. %A Chacón-Solano, E %A León, C %A Díaz, F %A García-García, F %A García, M %A Escámez, M J %A Guerrero-Aspizua, S %A Conti, C J %A Mencía, Á %A Martínez-Santamaría, L %A Llames, S %A Pévida, M %A Carbonell-Caballero, J %A Puig-Butillé, J A %A Maseda, R %A Puig, S %A de Lucas, R %A Baselga, E %A Larcher, F %A Dopazo, J %A Del Rio, M %K Adolescent %K Adult %K Biopsy %K Blister %K Case-Control Studies %K Cells, Cultured %K Child %K Child, Preschool %K Epidermolysis Bullosa %K Epidermolysis Bullosa Dystrophica %K Extracellular Matrix %K Extracellular Matrix Proteins %K Female %K Fibroblasts %K Fibrosis %K Gene Expression Regulation %K Healthy Volunteers %K Humans %K Infant %K Infant, Newborn %K Male %K Middle Aged %K mutation %K Periodontal Diseases %K Photosensitivity Disorders %K Primary Cell Culture %K RNA-seq %K Skin %K Xeroderma Pigmentosum %K Young Adult %X

BACKGROUND: Recessive dystrophic epidermolysis bullosa (RDEB), Kindler syndrome (KS) and xeroderma pigmentosum complementation group C (XPC) are three cancer-prone genodermatoses whose causal genetic mutations cannot fully explain, on their own, the array of associated phenotypic manifestations. Recent evidence highlights the role of the stromal microenvironment in the pathology of these disorders.

OBJECTIVES: To investigate, by means of comparative gene expression analysis, the role played by dermal fibroblasts in the pathogenesis of RDEB, KS and XPC.

METHODS: We conducted RNA-Seq analysis, which included a thorough examination of the differentially expressed genes, a functional enrichment analysis and a description of affected signalling circuits. Transcriptomic data were validated at the protein level in cell cultures, serum samples and skin biopsies.

RESULTS: Interdisease comparisons against control fibroblasts revealed a unifying signature of 186 differentially expressed genes and four signalling pathways in the three genodermatoses. Remarkably, some of the uncovered expression changes suggest a synthetic fibroblast phenotype characterized by the aberrant expression of extracellular matrix (ECM) proteins. Western blot and immunofluorescence in situ analyses validated the RNA-Seq data. In addition, enzyme-linked immunosorbent assay revealed increased circulating levels of periostin in patients with RDEB.

CONCLUSIONS: Our results suggest that the different causal genetic defects converge into common changes in gene expression, possibly due to injury-sensitive events. These, in turn, trigger a cascade of reactions involving abnormal ECM deposition and underexpression of antioxidant enzymes. The elucidated expression signature provides new potential biomarkers and common therapeutic targets in RDEB, XPC and KS.

What's already known about this topic? Recessive dystrophic epidermolysis bullosa (RDEB), Kindler syndrome (KS) and xeroderma pigmentosum complementation group C (XPC) are three genodermatoses with high predisposition to cancer development. Although their causal genetic mutations mainly affect epithelia, the dermal microenvironment likely contributes to the physiopathology of these disorders.

What does this study add? We disclose a large overlapping transcription profile between XPC, KS and RDEB fibroblasts that points towards an activated phenotype with high matrix-synthetic capacity. This common signature seems to be independent of the primary causal deficiency, but reflects an underlying derangement of the extracellular matrix via transforming growth factor-β signalling activation and oxidative state imbalance.

What is the translational message? This study broadens the current knowledge about the pathology of these diseases and highlights new targets and biomarkers for effective therapeutic intervention. It is suggested that high levels of circulating periostin could represent a potential biomarker in RDEB.

%B Br J Dermatol %V 181 %P 512-522 %8 2019 09 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/30693469?dopt=Abstract %R 10.1111/bjd.17698 %0 Journal Article %J Lancet Oncol %D 2019 %T Pazopanib for treatment of advanced malignant and dedifferentiated solitary fibrous tumour: a multicentre, single-arm, phase 2 trial. %A Martin-Broto, Javier %A Stacchiotti, Silvia %A Lopez-Pousa, Antonio %A Redondo, Andres %A Bernabeu, Daniel %A de Alava, Enrique %A Casali, Paolo G %A Italiano, Antoine %A Gutierrez, Antonio %A Moura, David S %A Peña-Chilet, Maria %A Diaz-Martin, Juan %A Biscuola, Michele %A Taron, Miguel %A Collini, Paola %A Ranchere-Vince, Dominique %A Garcia Del Muro, Xavier %A Grignani, Giovanni %A Dumont, Sarah %A Martinez-Trufero, Javier %A Palmerini, Emanuela %A Hindi, Nadia %A Sebio, Ana %A Dopazo, Joaquin %A Dei Tos, Angelo Paolo %A LeCesne, Axel %A Blay, Jean-Yves %A Cruz, Josefina %K Adult %K Aged %K Angiogenesis Inhibitors %K Antineoplastic Agents %K Female %K Humans %K Indazoles %K Male %K Middle Aged %K Multivariate Analysis %K Pyrimidines %K Response Evaluation Criteria in Solid Tumors %K Soft Tissue Neoplasms %K Solitary Fibrous Tumors %K Sulfonamides %K Survival Analysis %X

BACKGROUND: A solitary fibrous tumour is a rare soft-tissue tumour with three clinicopathological variants: typical, malignant, and dedifferentiated. Preclinical experiments and retrospective studies have shown different sensitivities of solitary fibrous tumour to chemotherapy and antiangiogenics. We therefore designed a trial to assess the activity of pazopanib in a cohort of patients with malignant or dedifferentiated solitary fibrous tumour. The clinical and translational results are presented here.

METHODS: In this single-arm, phase 2 trial, adult patients (aged ≥ 18 years) with histologically confirmed metastatic or unresectable malignant or dedifferentiated solitary fibrous tumour at any location, who had progressed (by RECIST and Choi criteria) in the previous 6 months and had an ECOG performance status of 0-2, were enrolled at 16 third-level hospitals with expertise in sarcoma care in Spain, Italy, and France. Patients received pazopanib 800 mg once daily, taken orally without food, at least 1 h before or 2 h after a meal, until progression or intolerance. The primary endpoint of the study was overall response measured by Choi criteria in the subset of the intention-to-treat population (patients who received at least 1 month of treatment with at least one radiological assessment). All patients who received at least one dose of the study drug were included in the safety analyses. This study is registered with ClinicalTrials.gov, number NCT02066285, and with the European Clinical Trials Database, EudraCT number 2013-005456-15.

FINDINGS: From June 26, 2014, to Nov 24, 2016, of 40 patients assessed, 36 were enrolled (34 with malignant solitary fibrous tumour and two with dedifferentiated solitary fibrous tumour). Median follow-up was 27 months (IQR 16-31). Based on central radiology review, 18 (51%) of 35 evaluable patients had partial responses, nine (26%) had stable disease, and eight (23%) had progressive disease according to Choi criteria. Further enrolment of patients with dedifferentiated solitary fibrous tumour was stopped after detection of early and fast progressions in a planned interim analysis. 51% (95% CI 34-69) of 35 patients achieved an overall response according to Choi criteria. Ten (29%) of 35 patients died. There were no deaths related to adverse events and the most frequent grade 3 or higher adverse events were hypertension (11 [31%] of 36 patients), neutropenia (four [11%]), increased concentrations of alanine aminotransferase (four [11%]), and increased concentrations of bilirubin (three [8%]).

INTERPRETATION: To our knowledge, this is the first trial of pazopanib for treatment of malignant solitary fibrous tumour showing activity in this patient group. The manageable toxicity profile and the activity shown by pazopanib suggest that this drug could be an option for systemic treatment of advanced malignant solitary fibrous tumour, and provide a benchmark for future trials.

FUNDING: Spanish Group for Research on Sarcomas (GEIS), Italian Sarcoma Group (ISG), French Sarcoma Group (FSG), GlaxoSmithKline, and Novartis.

%B Lancet Oncol %V 20 %P 134-144 %8 2019 01 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/30578023?dopt=Abstract %R 10.1016/S1470-2045(18)30676-4 %0 Journal Article %J Brief Bioinform %D 2019 %T Precision medicine needs pioneering clinical bioinformaticians. %A Gómez-López, Gonzalo %A Dopazo, Joaquin %A Cigudosa, Juan C %A Valencia, Alfonso %A Al-Shahrour, Fátima %K Cohort Studies %K Computational Biology %K Humans %K Precision Medicine %X

Success in precision medicine depends on accessing high-quality genetic and molecular data from large, well-annotated patient cohorts that couple biological samples to comprehensive clinical data, which in conjunction can lead to effective therapies. From such a scenario emerges the need for a new professional profile, an expert bioinformatician with training in clinical areas who can make sense of multi-omics data to improve therapeutic interventions in patients, and the design of optimized basket trials. In this review, we first describe the main policies and international initiatives that focus on precision medicine. Secondly, we review the currently ongoing clinical trials in precision medicine, introducing the concept of 'precision bioinformatics', and we describe current pioneering bioinformatics efforts aimed at implementing tools and computational infrastructures for precision medicine in health institutions around the world. Thirdly, we discuss the challenges related to the clinical training of bioinformaticians, and the urgent need for computational specialists capable of assimilating medical terminologies and protocols to address real clinical questions. We also propose some skills required to carry out common tasks in clinical bioinformatics and some tips for emergent groups. Finally, we explore the future perspectives and the challenges faced by precision medicine bioinformatics.

%B Brief Bioinform %V 20 %P 752-766 %8 2019 05 21 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/29077790?dopt=Abstract %R 10.1093/bib/bbx144 %0 Journal Article %J BMC Bioinformatics %D 2019 %T PyCellBase, an efficient python package for easy retrieval of biological data from heterogeneous sources. %A Perez-Gil, Daniel %A Lopez, Francisco J %A Dopazo, Joaquin %A Marin-Garcia, Pablo %A Rendon, Augusto %A Medina, Ignacio %K Computational Biology %K Databases, Factual %K Software %K User-Computer Interface %X

BACKGROUND: Biological databases and repositories have been growing in diversity and complexity over the years. This rapid expansion of current and new sources of biological knowledge raises serious problems of data accessibility and integration. To handle the growing need for unification, CellBase was created as an integrative solution. CellBase provides a centralized NoSQL database containing biological information from different and heterogeneous sources. This information is accessed through a RESTful web service API, which provides an efficient interface to the data.

RESULTS: In this work we present PyCellBase, a Python package that provides programmatic access to the rich RESTful web service API offered by CellBase. This package offers fast and user-friendly access to biological information without the need to install any local database. In addition, a series of command-line tools are provided to perform common bioinformatic tasks, such as variant annotation. CellBase data is always available through a high-availability cluster, and queries have been tuned to ensure real-time performance.

CONCLUSION: PyCellBase is an open-source Python package that provides efficient access to heterogeneous biological information. It allows users to perform tasks that require a comprehensive set of knowledge resources, such as variant annotation. Queries can be easily fine-tuned to retrieve the desired information about particular biological features. PyCellBase offers the convenience of an object-oriented scripting language and provides the ability to integrate the obtained results into other Python applications and pipelines.

%B BMC Bioinformatics %V 20 %P 159 %8 2019 Mar 28 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/30922213?dopt=Abstract %R 10.1186/s12859-019-2726-4 %0 Journal Article %J Scientific Reports %D 2019 %T Using mechanistic models for the clinical interpretation of complex genomic variation %A Peña-Chilet, Maria %A Esteban-Medina, Marina %A Falco, Matias M. %A Rian, Kinza %A Hidalgo, Marta R. %A Loucera, Carlos %A Dopazo, Joaquin %B Scientific Reports %V 9 %8 Jan-12-2019 %G eng %U http://www.nature.com/articles/s41598-019-55454-7 http://www.nature.com/articles/s41598-019-55454-7.pdf %N 1 %! Sci Rep %R 10.1038/s41598-019-55454-7 %0 Journal Article %J Nature Communications %D 2018 %T A crowdsourced analysis to identify ab initio molecular signatures predictive of susceptibility to viral infection %A Fourati, Slim %A Talla, Aarthi %A Mahmoudian, Mehrad %A Burkhart, Joshua G. %A Klén, Riku %A Henao, Ricardo %A Yu, Thomas %A Aydın, Zafer %A Yeung, Ka Yee %A Ahsen, Mehmet Eren %A Almugbel, Reem %A Jahandideh, Samad %A Liang, Xiao %A Nordling, Torbjörn E. M. %A Shiga, Motoki %A Stanescu, Ana %A Vogel, Robert %A Pandey, Gaurav %A Chiu, Christopher %A McClain, Micah T. %A Woods, Christopher W. %A Ginsburg, Geoffrey S. %A Elo, Laura L. %A Tsalik, Ephraim L. %A Mangravite, Lara M. %A Sieberts, Solveig K. %B Nature Communications %V 9 %8 Jan-12-2018 %G eng %U http://www.nature.com/articles/s41467-018-06735-8 http://www.nature.com/articles/s41467-018-06735-8.pdf %N 1 %! Nat Commun %R 10.1038/s41467-018-06735-8 %0 Journal Article %J Nat Commun %D 2018 %T The effects of death and post-mortem cold ischemia on human tissue transcriptomes. 
%A Ferreira, Pedro G %A Muñoz-Aguirre, Manuel %A Reverter, Ferran %A Sá Godinho, Caio P %A Sousa, Abel %A Amadoz, Alicia %A Sodaei, Reza %A Hidalgo, Marta R %A Pervouchine, Dmitri %A Carbonell-Caballero, José %A Nurtdinov, Ramil %A Breschi, Alessandra %A Amador, Raziel %A Oliveira, Patrícia %A Cubuk, Cankut %A Curado, João %A Aguet, François %A Oliveira, Carla %A Dopazo, Joaquin %A Sammeth, Michael %A Ardlie, Kristin G %A Guigó, Roderic %K Blood %K Cold Ischemia %K Death %K Female %K gene expression %K Humans %K Models, Biological %K Postmortem Changes %K RNA, Messenger %K Stochastic Processes %K Transcriptome %X

Post-mortem tissue samples are a key resource for investigating patterns of gene expression. However, the processes triggered by death and the post-mortem interval (PMI) can significantly alter physiologically normal RNA levels. We investigate the impact of PMI on gene expression using data from multiple tissues of post-mortem donors obtained from the GTEx project. We find that many genes change expression over relatively short PMIs in a tissue-specific manner, but this potentially confounding effect in a biological analysis can be minimized by taking into account appropriate covariates. By comparing ante- and post-mortem blood samples, we identify the cascade of transcriptional events triggered by death of the organism. These events do not appear to simply reflect stochastic variation resulting from mRNA degradation, but active and ongoing regulation of transcription. Finally, we develop a model to predict the time since death from the analysis of the transcriptome of a few readily accessible tissues.

%B Nat Commun %V 9 %P 490 %8 2018 02 13 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29440659?dopt=Abstract %R 10.1038/s41467-017-02772-x %0 Journal Article %J Sci Rep %D 2018 %T Evolution of the Quorum network and the mobilome (plasmids and bacteriophages) in clinical strains of Acinetobacter baumannii during a decade. %A López, M %A Rueda, A %A Florido, J P %A Blasco, L %A Fernández-García, L %A Trastoy, R %A Fernández-Cuenca, F %A Martínez-Martínez, L %A Vila, J %A Pascual, A %A Bou, G %A Tomas, M %K Acinetobacter baumannii %K Acinetobacter Infections %K Bacteriophages %K Cross Infection %K Humans %K Plasmids %K Quorum Sensing %K Retrospective Studies %X

In this study, we compared eighteen clinical strains of A. baumannii belonging to the ST-2 clone and isolated from patients in the same intensive care unit (ICU) in 2000 (9 strains referred to collectively as Ab_GEIH-2000) and 2010 (9 strains referred to collectively as Ab_GEIH-2010), during the GEIH-REIPI project (Umbrella BioProject PRJNA422585). We observed two main molecular differences between the Ab_GEIH-2010 and the Ab_GEIH-2000 collections, acquired over the course of the decade-long sampling interval and involving the mobilome: i) a plasmid harbouring genes for bla β-lactamase and abKA/abkB proteins of a toxin-antitoxin system; and ii) two temperate bacteriophages, Ab105-1ϕ (63 proteins) and Ab105-2ϕ (93 proteins), containing important viral defence proteins. Moreover, all Ab_GEIH-2010 strains contained a Quorum functional network of Quorum Sensing (QS) and Quorum Quenching (QQ) mechanisms, including a new QQ enzyme, AidA, which acts as a bacterial defence mechanism against the exogenous 3-oxo-C12-HSL. Interestingly, the infective capacity of the bacteriophages isolated in this study (Ab105-1ϕ and Ab105-2ϕ) was higher in the Ab_GEIH-2010 strains (carrying a functional Quorum network) than in the Ab_GEIH-2000 strains (carrying a deficient Quorum network), in which the bacteriophages showed little or no infectivity. This is the first study about the evolution of the Quorum network and the mobilome in clinical strains of Acinetobacter baumannii during a decade.

%B Sci Rep %V 8 %P 2523 %8 2018 02 06 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29410443?dopt=Abstract %R 10.1038/s41598-018-20847-7 %0 Journal Article %J Microb Genom %D 2018 %T The first complete genomic structure of Butyrivibrio fibrisolvens and its chromid. %A Rodríguez Hernáez, Javier %A Cerón Cucchi, Maria Esperanza %A Cravero, Silvio %A Martinez, Maria Carolina %A Gonzalez, Sergio %A Puebla, Andrea %A Dopazo, Joaquin %A Farber, Marisa %A Paniego, Norma %A Rivarola, Máximo %K Animals %K Butyrivibrio fibrisolvens %K Cattle %K Genome, Bacterial %K Genomics %K Humans %K Milk %K Rumen %K Sequence Analysis, DNA %X

Butyrivibrio fibrisolvens forms part of the gastrointestinal microbiome of ruminants and other mammals, including humans. Indeed, it is one of the most common bacteria found in the rumen and plays an important role in ruminal fermentation of polysaccharides, yet, to date, there is no closed reference genome published for this species in any ruminant animal. We successfully assembled the nearly complete genome sequence of B. fibrisolvens strain INBov1 isolated from cow rumen using Illumina paired-end reads, 454 Roche single-end and mate pair sequencing technology. Additionally, we constructed an optical restriction map of this strain to aid in scaffold ordering and positioning, and completed the first genomic structure of this species. Moreover, we identified and assembled the first chromid of this species (pINBov266). The INBov1 genome encodes a large set of genes involved in the cellulolytic process but lacks key genes. This seems to indicate that B. fibrisolvens plays an important role in ruminal cellulolytic processes, but does not have autonomous cellulolytic capacity. When searching for genes involved in the biohydrogenation of unsaturated fatty acids, no linoleate isomerase gene was found in this strain. INBov1 does encode oleate hydratase genes known to participate in the hydrogenation of oleic acids. Furthermore, INBov1 contains an enolase gene, which has been recently determined to participate in the synthesis of conjugated linoleic acids. This work confirms the presence of a novel chromid in B. fibrisolvens and provides a new potential reference genome sequence for this species, providing new insight into its role in biohydrogenation and carbohydrate degradation.

%B Microb Genom %V 4 %8 2018 10 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/30216146?dopt=Abstract %R 10.1099/mgen.0.000216 %0 Journal Article %J Cancer Res %D 2018 %T Gene Expression Integration into Pathway Modules Reveals a Pan-Cancer Metabolic Landscape. %A Cubuk, Cankut %A Hidalgo, Marta R %A Amadoz, Alicia %A Pujana, Miguel A %A Mateo, Francesca %A Herranz, Carmen %A Carbonell-Caballero, José %A Dopazo, Joaquin %K Cell Line, Tumor %K Cluster Analysis %K Disease Progression %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Gene Regulatory Networks %K Humans %K Kaplan-Meier Estimate %K Metabolome %K mutation %K Neoplasms %K Oncogenes %K Phenotype %K Prognosis %K RNA, Small Interfering %K Sequence Analysis, RNA %K Transcriptome %K Treatment Outcome %X

Metabolic reprogramming plays an important role in cancer development and progression and is a well-established hallmark of cancer. Despite its inherent complexity, cellular metabolism can be decomposed into functional modules that represent fundamental metabolic processes. Here, we performed a pan-cancer study involving 9,428 samples from 25 cancer types to reveal metabolic modules whose individual or coordinated activity predicts cancer type and outcome, in turn highlighting novel therapeutic opportunities. Integration of gene expression levels into metabolic modules suggests that the activity of specific modules differs between cancers and the corresponding tissues of origin. Some modules may cooperate, as indicated by the positive correlation of their activity across a range of tumors. The activity of many metabolic modules was significantly associated with prognosis at a stronger magnitude than any of their constituent genes. Thus, modules may be classified as tumor suppressors and oncomodules according to their potential impact on cancer progression. Using this modeling framework, we also propose novel potential therapeutic targets that constitute alternative ways of treating cancer by inhibiting their reprogrammed metabolism. Collectively, this study provides an extensive resource of predicted cancer metabolic profiles and dependencies. Combining gene expression with metabolic modules identifies molecular mechanisms of cancer undetected on an individual gene level and allows discovery of new potential therapeutic targets.

%B Cancer Res %V 78 %P 6059-6072 %8 2018 11 01 %G eng %N 21 %1 https://www.ncbi.nlm.nih.gov/pubmed/30135189?dopt=Abstract %R 10.1158/0008-5472.CAN-17-2705 %0 Journal Article %J Nature %D 2018 %T Genomics of the origin and evolution of Citrus. %A Wu, Guohong Albert %A Terol, Javier %A Ibañez, Victoria %A López-García, Antonio %A Pérez-Román, Estela %A Borredá, Carles %A Domingo, Concha %A Tadeo, Francisco R %A Carbonell-Caballero, José %A Alonso, Roberto %A Curk, Franck %A Du, Dongliang %A Ollitrault, Patrick %A Roose, Mikeal L %A Dopazo, Joaquin %A Gmitter, Frederick G %A Rokhsar, Daniel S %A Talon, Manuel %K Asia, Southeastern %K Biodiversity %K citrus %K Crop Production %K Evolution, Molecular %K Genetic Speciation %K Genome, Plant %K Genomics %K Haplotypes %K Heterozygote %K History, Ancient %K Human Migration %K Hybridization, Genetic %K Phylogeny %X

The genus Citrus, comprising some of the most widely cultivated fruit crops worldwide, includes an uncertain number of species. Here we describe ten natural citrus species, using genomic, phylogenetic and biogeographic analyses of 60 accessions representing diverse citrus germ plasms, and propose that citrus diversified during the late Miocene epoch through a rapid southeast Asian radiation that correlates with a marked weakening of the monsoons. A second radiation enabled by migration across the Wallace line gave rise to the Australian limes in the early Pliocene epoch. Further identification and analyses of hybrids and admixed genomes provide insights into the genealogy of major commercial cultivars of citrus. Among mandarins and sweet orange, we find an extensive network of relatedness that illuminates the domestication of these groups. Widespread pummelo admixture among these mandarins and its correlation with fruit size and acidity suggest a plausible role of pummelo introgression in the selection of palatable mandarins. This work provides a new evolutionary framework for the genus Citrus.

%B Nature %V 554 %P 311-316 %8 2018 02 15 %G eng %N 7692 %1 https://www.ncbi.nlm.nih.gov/pubmed/29414943?dopt=Abstract %R 10.1038/nature25447 %0 Journal Article %J Nat Commun %D 2018 %T LRH-1 agonism favours an immune-islet dialogue which protects against diabetes mellitus. %A Cobo-Vuilleumier, Nadia %A Lorenzo, Petra I %A Rodríguez, Noelia García %A Herrera Gómez, Irene de Gracia %A Fuente-Martin, Esther %A López-Noriega, Livia %A Mellado-Gil, José Manuel %A Romero-Zerbo, Silvana-Yanina %A Baquié, Mathurin %A Lachaud, Christian Claude %A Stifter, Katja %A Perdomo, German %A Bugliani, Marco %A De Tata, Vincenzo %A Bosco, Domenico %A Parnaud, Geraldine %A Pozo, David %A Hmadcha, Abdelkrim %A Florido, Javier P %A Toscano, Miguel G %A de Haan, Peter %A Schoonjans, Kristina %A Sánchez Palazón, Luis %A Marchetti, Piero %A Schirmbeck, Reinhold %A Martín-Montalvo, Alejandro %A Meda, Paolo %A Soria, Bernat %A Bermúdez-Silva, Francisco-Javier %A St-Onge, Luc %A Gauthier, Benoit R %K Animals %K Apoptosis %K Cell Communication %K Cell Survival %K Diabetes Mellitus, Experimental %K Diabetes Mellitus, Type 2 %K Female %K Gene Expression Regulation %K Humans %K Hypoglycemic Agents %K Immunity, Innate %K insulin %K Insulin-Secreting Cells %K Islets of Langerhans %K Islets of Langerhans Transplantation %K Macrophages %K Male %K Mice %K Mice, Inbred C57BL %K Phenalenes %K Receptors, Cytoplasmic and Nuclear %K Streptozocin %K T-Lymphocytes, Regulatory %K Transplantation, Heterologous %X

Type 1 diabetes mellitus (T1DM) is due to the selective destruction of islet beta cells by immune cells. Current therapies focused on repressing the immune attack or stimulating beta cell regeneration still have limited clinical efficacy. Therefore, it is timely to identify innovative targets to dampen the immune process, while promoting beta cell survival and function. Liver receptor homologue-1 (LRH-1) is a nuclear receptor that represses inflammation in digestive organs, and protects pancreatic islets against apoptosis. Here, we show that BL001, a small LRH-1 agonist, impedes hyperglycemia progression and the immune-dependent inflammation of pancreas in murine models of T1DM, and beta cell apoptosis in islets of type 2 diabetic patients, while increasing beta cell mass and insulin secretion. Thus, we suggest that LRH-1 agonism favors a dialogue between immune and islet cells, which could be druggable to protect against diabetes mellitus.

%B Nat Commun %V 9 %P 1488 %8 2018 04 16 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29662071?dopt=Abstract %R 10.1038/s41467-018-03943-0 %0 Journal Article %J Biol Direct %D 2018 %T Models of cell signaling uncover molecular mechanisms of high-risk neuroblastoma and predict disease outcome. %A Hidalgo, Marta R %A Amadoz, Alicia %A Cubuk, Cankut %A Carbonell-Caballero, José %A Dopazo, Joaquin %K Computational Biology %K Gene Expression Regulation, Neoplastic %K Humans %K JNK Mitogen-Activated Protein Kinases %K Models, Theoretical %K Neuroblastoma %K Signal Transduction %X

BACKGROUND: Despite the progress in neuroblastoma therapies the mortality of high-risk patients is still high (40-50%) and the molecular basis of the disease remains poorly known. Recently, a mathematical model was used to demonstrate that the network regulating stress signaling by the c-Jun N-terminal kinase pathway played a crucial role in survival of patients with neuroblastoma irrespective of their MYCN amplification status. This demonstrates the enormous potential of computational models of biological modules for the discovery of underlying molecular mechanisms of diseases.

RESULTS: Since signaling is known to be highly relevant in cancer, we have used a computational model of the whole cell signaling network to understand the molecular determinants of poor prognosis in neuroblastoma. Our model produced a comprehensive view of the molecular mechanisms of neuroblastoma tumorigenesis and progression.

CONCLUSION: We have also shown how the activity of signaling circuits can be considered a reliable model-based prognostic biomarker.

REVIEWERS: This article was reviewed by Tim Beissbarth, Wenzhong Xiao and Joanna Polanska. For the full reviews, please go to the Reviewers' comments section.

%B Biol Direct %V 13 %P 16 %8 2018 08 22 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/30134948?dopt=Abstract %R 10.1186/s13062-018-0219-4 %0 Journal Article %J PLoS One %D 2018 %T The modular network structure of the mutational landscape of Acute Myeloid Leukemia. %A Ibáñez, Mariam %A Carbonell-Caballero, José %A Such, Esperanza %A García-Alonso, Luz %A Liquori, Alessandro %A López-Pavía, María %A LLop, Marta %A Alonso, Carmen %A Barragán, Eva %A Gómez-Seguí, Inés %A Neef, Alexander %A Hervás, David %A Montesinos, Pau %A Sanz, Guillermo %A Sanz, Miguel Angel %A Dopazo, Joaquin %A Cervera, José %K Adult %K Aged %K Cytodiagnosis %K Female %K Gene Regulatory Networks %K Genetic Association Studies %K Genetic Heterogeneity %K Humans %K Karyotype %K Leukemia, Myeloid, Acute %K Male %K Middle Aged %K mutation %K Neoplasm Proteins %K Nucleophosmin %K Prognosis %K whole exome sequencing %X

Acute myeloid leukemia (AML) is associated with the sequential accumulation of acquired genetic alterations. Although at diagnosis cytogenetic alterations are frequent in AML, roughly 50% of patients present an apparently normal karyotype (NK), leading to a highly heterogeneous prognosis. Due to this significant heterogeneity, it has been suggested that different molecular mechanisms may trigger the disease with diverse prognostic implications. We performed whole-exome sequencing (WES) of tumor-normal matched samples of de novo AML-NK patients lacking mutations in NPM1, CEBPA or FLT3-ITD to identify new gene mutations with potential prognostic and therapeutic relevance to patients with AML. Novel candidate genes, together with others previously described, underwent targeted resequencing in an independent cohort of 100 de novo AML patients classified in the cytogenetic intermediate-risk (IR) category. A mean of 4.89 mutations per sample was detected in 73 genes, 35 of which were mutated in more than one patient. After a network enrichment analysis, we defined a single in silico model and established a set of seed genes that may trigger leukemogenesis in patients with normal karyotype. The high heterogeneity of gene mutations observed in AML patients suggested that a specific alteration might not be as essential as the interaction of deregulated pathways.

%B PLoS One %V 13 %P e0202926 %8 2018 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/30303964?dopt=Abstract %R 10.1371/journal.pone.0202926 %0 Journal Article %J BMC Bioinformatics %D 2017 %T ATGC transcriptomics: a web-based application to integrate, explore and analyze de novo transcriptomic data. %A Gonzalez, Sergio %A Clavijo, Bernardo %A Rivarola, Máximo %A Moreno, Patricio %A Fernandez, Paula %A Dopazo, Joaquin %A Paniego, Norma %K Animals %K Databases, Genetic %K Gene Expression Profiling %K High-Throughput Nucleotide Sequencing %K Internet %K Sequence Analysis, RNA %K Transcriptome %K User-Computer Interface %X

BACKGROUND: In recent years, applications based on massively parallelized RNA sequencing (RNA-seq) have become valuable approaches for studying non-model species, e.g., those without a fully sequenced genome. RNA-seq is a useful tool for detecting novel transcripts and genetic variations and for evaluating differential gene expression by digital measurements. The large and complex datasets resulting from functional genomic experiments represent a challenge in data processing, management, and analysis. This problem is especially significant for small research groups working with non-model species.

RESULTS: We developed a web-based application, called ATGC transcriptomics, with a flexible and adaptable interface that allows users to work with next-generation sequencing (NGS) transcriptomic analysis results using an ontology-driven database. This new application simplifies data exploration, visualization, and integration for a better comprehension of the results.

CONCLUSIONS: ATGC transcriptomics provides non-expert computer users and small research groups with access to a scalable storage option and simple data integration, including database administration and management. The software is freely available under the terms of the GNU public license at http://atgcinta.sourceforge.net .

%B BMC Bioinformatics %V 18 %P 121 %8 2017 Feb 22 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/28222698?dopt=Abstract %R 10.1186/s12859-017-1494-2 %0 Journal Article %J Oncotarget %D 2017 %T Genomic expression differences between cutaneous cells from red hair color individuals and black hair color individuals based on bioinformatic analysis. %A Puig-Butille, Joan Anton %A Gimenez-Xavier, Pol %A Visconti, Alessia %A Nsengimana, Jérémie %A Garcia-Garcia, Francisco %A Tell-Marti, Gemma %A Escamez, Maria José %A Newton-Bishop, Julia %A Bataille, Veronique %A Del Rio, Marcela %A Dopazo, Joaquin %A Falchi, Mario %A Puig, Susana %K Adult %K Coculture Techniques %K Computational Biology %K gene expression %K Genetic Predisposition to Disease %K Genomics %K Hair Color %K Humans %K Keratinocytes %K Melanocytes %K Middle Aged %K Phenotype %K Receptor, Melanocortin, Type 1 %X

The MC1R gene plays a crucial role in pigmentation synthesis. Loss-of-function MC1R variants, which impair protein function, are associated with red hair color (RHC) phenotype and increased skin cancer risk. Cultured cutaneous cells bearing loss-of-function MC1R variants show a distinct gene expression profile compared to wild-type MC1R cultured cutaneous cells. We analysed the gene signature associated with RHC co-cultured melanocytes and keratinocytes by protein-protein interaction (PPI) network analysis to identify genes related to non-functional MC1R variants. From two detected networks, we selected 23 nodes as hub genes based on topological parameters. Differential expression of hub genes was then evaluated in healthy skin biopsies from RHC and black hair color (BHC) individuals. We also compared gene expression in melanoma tumors from individuals with RHC versus BHC. Gene expression in normal skin from RHC cutaneous cells showed dysregulation in 8 out of 23 hub genes (CLN3, ATG10, WIPI2, SNX2, GABARAPL2, YWHA, PCNA and GBAS). Hub genes did not differ between melanoma tumors in RHC versus BHC individuals. The study suggests that healthy skin cells from RHC individuals present a constitutive genomic deregulation associated with the red hair phenotype and identifies novel genes involved in melanocyte biology.

%B Oncotarget %V 8 %P 11589-11599 %8 2017 Feb 14 %G eng %U http://www.impactjournals.com/oncotarget/index.php?journal=oncotarget&page=article&op=view&path%5B%5D=14140&path%5B%5D=45094 %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/28030792?dopt=Abstract %R 10.18632/oncotarget.14140 %0 Journal Article %J N Engl J Med %D 2017 %T GGPS1 Mutation and Atypical Femoral Fractures with Bisphosphonates. %A Roca-Ayats, Neus %A Balcells, Susana %A Garcia-Giralt, Natàlia %A Falcó-Mascaró, Maite %A Martínez-Gil, Núria %A Abril, Josep F %A Urreizti, Roser %A Dopazo, Joaquin %A Quesada-Gómez, José M %A Nogués, Xavier %A Mellibovsky, Leonardo %A Prieto-Alhambra, Daniel %A Dunford, James E %A Javaid, Muhammad K %A Russell, R Graham %A Grinberg, Daniel %A Díez-Pérez, Adolfo %K Aged %K Amino Acid Sequence %K Bone Density Conservation Agents %K Dimethylallyltranstransferase %K Diphosphonates %K Exome %K Farnesyltranstransferase %K Female %K Femoral Fractures %K Geranyltranstransferase %K Humans %K Middle Aged %K mutation %B N Engl J Med %V 376 %P 1794-1795 %8 2017 05 04 %G eng %U http://www.nejm.org/doi/full/10.1056/NEJMc1612804 %N 18 %1 https://www.ncbi.nlm.nih.gov/pubmed/28467865?dopt=Abstract %R 10.1056/NEJMc1612804 %0 Journal Article %J Cereb Cortex %D 2017 %T Global Transcriptome Analysis of Primary Cerebrocortical Cells: Identification of Genes Regulated by Triiodothyronine in Specific Cell Types. %A Gil-Ibañez, Pilar %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Bernal, Juan %A Morte, Beatriz %K Animals %K Astrocytes %K Cells, Cultured %K Cerebral Cortex %K Fluorescent Antibody Technique %K Gene Expression Profiling %K Mice, 129 Strain %K Mice, Inbred BALB C %K Mice, Inbred C57BL %K Neurons %K Piperazines %K Transcriptome %K Triiodothyronine %X

Thyroid hormones, thyroxine, and triiodothyronine (T3) are crucial for cerebral cortex development acting through regulation of gene expression. To define the transcriptional program under T3 regulation, we have performed RNA-Seq of T3-treated and untreated primary mouse cerebrocortical cells. The expression of 1145 genes or 7.7% of expressed genes was changed upon T3 addition, of which 371 responded to T3 in the presence of cycloheximide indicating direct transcriptional regulation. The results were compared with available transcriptomic datasets of defined cellular types. In this way, we could identify targets of T3 within genes enriched in astrocytes and neurons, in specific layers including the subplate, and in specific neurons such as prepronociceptin, cholecystokinin, or cortistatin neurons. The subplate and the prepronociceptin neurons appear as potentially major targets of T3 action. T3 upregulates mostly genes related to cell membrane events, such as G-protein signaling, neurotransmission, and ion transport and downregulates genes involved in nuclear events associated with the M phase of cell cycle, such as chromosome organization and segregation. Remarkably, the transcriptomic changes induced by T3 sustain the transition from fetal to adult patterns of gene expression. The results allow defining in molecular terms the elusive role of thyroid hormones on neocortical development.

%B Cereb Cortex %V 27 %P 706-717 %8 2017 01 01 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/26534908?dopt=Abstract %R 10.1093/cercor/bhv273 %0 Journal Article %J BMC Syst Biol %D 2017 %T Graph-theoretical comparison of normal and tumor networks in identifying BRCA genes. %A Dopazo, Joaquin %A Erten, Cesim %K Breast Neoplasms %K Female %K Gene Regulatory Networks %K Genes, Tumor Suppressor %K Humans %K Models, Theoretical %K mutation %X

BACKGROUND: Identification of driver genes related to certain types of cancer is an important research topic. Several systems biology approaches have been suggested, in particular for the identification of breast cancer (BRCA) related genes. Such approaches usually rely on differential gene expression and/or mutational landscape data. In some cases interaction network data is also integrated to identify cancer-related modules computationally.

RESULTS: We provide a framework for the comparative graph-theoretical analysis of networks integrating the relevant gene expression, mutations, and protein-protein interaction network data. The comparisons involve a graph-theoretical analysis of normal and tumor network pairs across all instances of a given set of breast cancer samples. The network measures under consideration are based on appropriate formulations of various centrality measures: betweenness, clustering coefficients, degree centrality, random walk distances, graph-theoretical distances, and Jaccard index centrality.

CONCLUSIONS: Among all the studied centrality-based graph-theoretical properties, we show that a betweenness-based measure differentiates BRCA genes across all normal versus tumor network pairs better than the rest of the popular centrality-based measures. The AUROC and AUPR values of the gene lists ordered with respect to the measures under study, as compared to the NCBI BioSystems pathway and the COSMIC database of cancer genes, are the largest with the betweenness-based differentiation, followed by the measure based on degree centrality. In order to test the robustness of the suggested measures in prioritizing cancer genes, we further tested the two most promising measures, those based on betweenness and degree centralities, on randomly rewired networks. We show that both measures are quite resilient to noise in the input interaction network. We also compared the same measures against a state-of-the-art alternative disease gene prioritization method, MUFFINN. We show that both our graph-theoretical measures outperform MUFFINN prioritizations in terms of ROC and precision/recall analysis. Finally, we filter the ordered list of the best measure, the betweenness-based differentiation, via a maximum-weight independent set formulation and investigate the top 50 genes with regard to literature verification. We show that almost all genes in the list are verified by the breast cancer literature and three genes are presented as novel genes that may potentially be BRCA-related but are missing from the literature.

%B BMC Syst Biol %V 11 %P 110 %8 2017 Nov 22 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/29166896?dopt=Abstract %R 10.1186/s12918-017-0495-0 %0 Journal Article %J BMC Systems Biology %D 2017 %T Graph-theoretical comparison of normal and tumor networks in identifying BRCA genes %A Dopazo, Joaquin %A Erten, Cesim %B BMC Systems Biology %V 11 %8 Jan-12-2017 %G eng %U https://bmcsystbiol.biomedcentral.com/articles/10.1186/s12918-017-0495-0 http://link.springer.com/content/pdf/10.1186/s12918-017-0495-0.pdf %N 1 %! BMC Syst Biol %R 10.1186/s12918-017-0495-0 %0 Journal Article %J Nucleic Acids Res %D 2017 %T HGVA: the Human Genome Variation Archive. %A Lopez, Javier %A Coll, Jacobo %A Haimel, Matthias %A Kandasamy, Swaathi %A Tárraga, Joaquín %A Furio-Tari, Pedro %A Bari, Wasim %A Bleda, Marta %A Rueda, Antonio %A Gräf, Stefan %A Rendon, Augusto %A Dopazo, Joaquin %A Medina, Ignacio %K Genetic Variation %K Genome, Human %K Humans %K Internet %K Software %K User-Computer Interface %X

High-profile genomic variation projects like the 1000 Genomes project or the Exome Aggregation Consortium are generating a wealth of human genomic variation knowledge which can be used as an essential reference for identifying disease-causing genotypes. However, accessing these data, contrasting the various studies and integrating those data in downstream analyses remain cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic data for key reference projects in a clean, fast and integrated fashion. HGVA provides an efficient and intuitive web-interface for easy data mining, a comprehensive RESTful API and client libraries in Python, Java and JavaScript for fast programmatic access to its knowledge base. HGVA calculates population frequencies for these projects and enriches their data with variant annotation provided by CellBase, a rich and fast annotation solution. HGVA serves as a proof-of-concept of the genome analysis developments being carried out by the University of Cambridge together with UK's 100 000 genomes project and the National Institute for Health Research BioResource Rare-Diseases, in particular, deploying the open-source for Computational Biology (OpenCB) software platform for storing and analyzing massive genomic datasets.

%B Nucleic Acids Res %V 45 %P W189-W194 %8 2017 07 03 %G eng %U https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkx445 %N W1 %1 https://www.ncbi.nlm.nih.gov/pubmed/28535294?dopt=Abstract %R 10.1093/nar/gkx445 %0 Journal Article %J Oncotarget %D 2017 %T High throughput estimation of functional cell activities reveals disease mechanisms and predicts relevant clinical outcomes. %A Hidalgo, Marta R %A Cubuk, Cankut %A Amadoz, Alicia %A Salavert, Francisco %A Carbonell-Caballero, José %A Dopazo, Joaquin %K Computational Biology %K gene expression %K Gene Regulatory Networks %K Humans %K mutation %K Neoplasms %K Precision Medicine %K Sequence Analysis, RNA %K Signal Transduction %X

Understanding the aspects of the cell functionality that account for disease or drug action mechanisms is a main challenge for precision medicine. Here we propose a new method that models cell signaling using biological knowledge on signal transduction. The method recodes individual gene expression values (and/or gene mutations) into accurate measurements of changes in the activity of signaling circuits, which ultimately constitute high-throughput estimations of cell functionalities caused by gene activity within the pathway. Moreover, such estimations can be obtained either at cohort-level, in case/control comparisons, or personalized for individual patients. The accuracy of the method is demonstrated in an extensive analysis involving 5640 patients from 12 different cancer types. Circuit activity measurements not only have a high diagnostic value but also can be related to relevant disease outcomes such as survival, and can be used to assess therapeutic interventions.
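
The recoding of gene expression into circuit activity can be pictured with a toy signal-propagation sketch. The abstract does not state the propagation rule, so the rule below is an assumption chosen for illustration: each node passes on its own normalized expression multiplied by the combined activating input and damped by inhibiting input. The circuit, the gene names and the function `propagate` are hypothetical.

```python
def propagate(circuit, expr):
    """Propagate a signal through a circuit listed in topological order.

    circuit: list of (node, activators, inhibitors) tuples.
    expr:    normalized expression value in [0, 1] for every node.
    Assumed rule: signal = expr * (1 - prod(1 - s_act)) * prod(1 - s_inh);
    a node with no activators receives a full incoming signal of 1.
    """
    signal = {}
    for node, activators, inhibitors in circuit:
        if activators:
            none_active = 1.0
            for a in activators:
                none_active *= 1.0 - signal[a]
            activation = 1.0 - none_active
        else:
            activation = 1.0
        inhibition = 1.0
        for i in inhibitors:
            inhibition *= 1.0 - signal[i]
        signal[node] = expr[node] * activation * inhibition
    return signal


# Hypothetical circuit: receptor R activates kinase K, K activates effector E,
# and an independent node I inhibits E.
circuit = [
    ("R", [], []),
    ("I", [], []),
    ("K", ["R"], []),
    ("E", ["K"], ["I"]),
]
expr = {"R": 0.8, "I": 0.5, "K": 0.5, "E": 1.0}

activity = propagate(circuit, expr)
print(activity["E"])  # signal reaching the effector, taken as the circuit activity
```

Comparing the effector value between case and control samples, or computing it for a single patient, corresponds to the cohort-level and personalized uses mentioned in the abstract.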

%B Oncotarget %V 8 %P 5160-5178 %8 2017 Jan 17 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/28042959?dopt=Abstract %R 10.18632/oncotarget.14107 %0 Journal Article %J Plant Mol Biol %D 2017 %T Integration of transcriptomic and metabolic data reveals hub transcription factors involved in drought stress response in sunflower (Helianthus annuus L.). %A Moschen, Sebastián %A Di Rienzo, Julio A %A Higgins, Janet %A Tohge, Takayuki %A Watanabe, Mutsumi %A Gonzalez, Sergio %A Rivarola, Máximo %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Hopp, H Esteban %A Hoefgen, Rainer %A Fernie, Alisdair R %A Paniego, Norma %A Fernandez, Paula %A Heinz, Ruth A %K Chlorophyll %K Gene Expression Regulation, Plant %K Helianthus %K Plant Leaves %K Plant Proteins %K Protein Array Analysis %K RNA, Plant %K Stress, Physiological %K Transcription Factors %K Water %X

By integrating transcriptional and metabolic profiles, we identified pathways and hub transcription factors regulated during drought conditions in sunflower, useful for applications in molecular and/or biotechnological breeding. Drought is one of the most important environmental stresses that affects crop productivity in many agricultural regions. Sunflower is tolerant to drought conditions but the mechanisms involved in this tolerance remain unclear at the molecular level. The aim of this study was to characterize and integrate transcriptional and metabolic pathways related to drought stress in sunflower plants, by using a systems biology approach. Our results showed a delay in plant senescence with an increase in the expression level of photosynthesis-related genes as well as higher levels of sugars, osmoprotectant amino acids and ionic nutrients under drought conditions. In addition, we identified transcription factors that were upregulated during drought conditions and that may act as hubs in the transcriptional network. Many of these transcription factors belong to families implicated in the drought response in model species. The integration of transcriptomic and metabolomic data in this study, together with physiological measurements, has improved our understanding of the biological responses during droughts and contributes to elucidating the molecular mechanisms involved under this environmental condition. These findings will provide useful biotechnological tools to improve stress tolerance while maintaining crop yield under restricted water availability.

%B Plant Mol Biol %V 94 %P 549-564 %8 2017 Jul %G eng %N 4-5 %1 https://www.ncbi.nlm.nih.gov/pubmed/28639116?dopt=Abstract %R 10.1007/s11103-017-0625-5 %0 Journal Article %J Hum Mutat %D 2017 %T Mutations in TRAPPC11 are associated with a congenital disorder of glycosylation. %A Matalonga, Leslie %A Bravo, Miren %A Serra-Peinado, Carla %A García-Pelegrí, Elisabeth %A Ugarteburu, Olatz %A Vidal, Silvia %A Llambrich, Maria %A Quintana, Ester %A Fuster-Jorge, Pedro %A Gonzalez-Bravo, Maria Nieves %A Beltran, Sergi %A Dopazo, Joaquin %A Garcia-Garcia, Francisco %A Foulquier, François %A Matthijs, Gert %A Mills, Philippa %A Ribes, Antonia %A Egea, Gustavo %A Briones, Paz %A Tort, Frederic %A Girós, Marisa %K Abnormalities, Multiple %K Alleles %K Amino Acid Substitution %K Brain %K Congenital Disorders of Glycosylation %K Genotype %K Humans %K Magnetic Resonance Imaging %K Male %K mutation %K Phenotype %K Vesicular Transport Proteins %K Whole Genome Sequencing %X

Congenital disorders of glycosylation (CDG) are a heterogeneous and rapidly growing group of diseases caused by abnormal glycosylation of proteins and/or lipids. Mutations in genes involved in the homeostasis of the endoplasmic reticulum (ER), the Golgi apparatus (GA), and the vesicular trafficking from the ER to the ER-Golgi intermediate compartment (ERGIC) have been found to be associated with CDG. Here, we report a patient with defects in both N- and O-glycosylation combined with a delayed vesicular transport in the GA due to mutations in TRAPPC11, a subunit of the TRAPPIII complex. TRAPPIII is implicated in the anterograde transport from the ER to the ERGIC as well as in the vesicle export from the GA. This report expands the spectrum of genetic alterations associated with CDG, providing new insights for the diagnosis and the understanding of the physiopathological mechanisms underlying glycosylation disorders.

%B Hum Mutat %V 38 %P 148-151 %8 2017 02 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/27862579?dopt=Abstract %R 10.1002/humu.23145 %0 Journal Article %J BMC bioinformatics %D 2017 %T A new parallel pipeline for DNA methylation analysis of long reads datasets. %A Olanda, Ricardo %A Pérez, Mariano %A Orduña, Juan M %A Tárraga, Joaquín %A Joaquín Dopazo %K Methyl-Seq %K NGS %X BACKGROUND: DNA methylation is an important mechanism of epigenetic regulation in development and disease. New generation sequencers allow genome-wide measurements of the methylation status by reading short stretches of the DNA sequence (Methyl-seq). Several software tools for methylation analysis have been proposed over recent years. However, the current trend is that the new sequencers and the ones expected for an upcoming future yield sequences of increasing length, making these software tools inefficient and obsolete. RESULTS: In this paper, we propose a new software based on a strategy for methylation analysis of Methyl-seq sequencing data that requires much shorter execution times while yielding a better level of sensitivity, particularly for datasets composed of long reads. This strategy can be exported to other methylation, DNA and RNA analysis tools. CONCLUSIONS: The developed software tool achieves execution times one order of magnitude shorter than the existing tools, while yielding equal sensitivity for short reads and even better sensitivity for long reads. %B BMC bioinformatics %V 18 %P 161 %8 2017 Mar 09 %G eng %U http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-017-1574-3 %R 10.1186/s12859-017-1574-3 %0 Journal Article %J Bioinformatics %D 2017 %T Reference genome assessment from a population scale perspective: an accurate profile of variability and noise. 
%A Carbonell-Caballero, José %A Amadoz, Alicia %A Alonso, Roberto %A Hidalgo, Marta R %A Cubuk, Cankut %A Conesa, David %A López-Quílez, Antonio %A Dopazo, Joaquin %K Animals %K Genetic Variation %K Genome %K Genomics %K Genotype %K Humans %K Models, Statistical %K Quality Control %K Reproducibility of Results %K Software %X

Motivation: Current plant and animal genomic studies are often based on newly assembled genomes that have not been properly consolidated. In this scenario, misassembled regions can easily lead to false-positive findings. Although quality control scores are included within genotyping protocols, they are usually employed to evaluate individual sample quality rather than reference sequence reliability. We propose a statistical model that combines quality control scores across samples in order to detect incongruent patterns at every genomic region. Our model is inherently robust since common artifact signals are expected to be shared between independent samples over misassembled regions of the genome.

Results: The reliability of our protocol has been extensively tested through different experiments and organisms with accurate results, improving on state-of-the-art methods. Our analysis demonstrates synergistic relations between quality control scores and allelic variability estimators that improve the detection of misassembled regions, and it is able to find strong artifact signals even within the human reference assembly. Furthermore, we demonstrated how our model can be trained to properly rank the confidence of a set of candidate variants obtained from new independent samples.

Availability and implementation: This tool is freely available at http://gitlab.com/carbonell/ces.

Contact: jcarbonell.cipf@gmail.com or joaquin.dopazo@juntadeandalucia.es.

Supplementary information: Supplementary data are available at Bioinformatics online.

%B Bioinformatics %V 33 %P 3511-3517 %8 2017 Nov 15 %G eng %U https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btx482 %N 22 %1 https://www.ncbi.nlm.nih.gov/pubmed/28961772?dopt=Abstract %R 10.1093/bioinformatics/btx482 %0 Journal Article %J BMC Bioinformatics %D 2017 %T VISMapper: ultra-fast exhaustive cartography of viral insertion sites for gene therapy. %A Juanes, José M %A Gallego, Asunción %A Tárraga, Joaquín %A Chaves, Felipe J %A Marin-Garcia, Pablo %A Medina, Ignacio %A Arnau, Vicente %A Dopazo, Joaquin %K Base Sequence %K Genetic Therapy %K Genetic Vectors %K High-Throughput Nucleotide Sequencing %K Humans %K Internet %K User-Computer Interface %K Virus Integration %X

BACKGROUND: The possibility of integrating viral vectors to become a persistent part of the host genome makes them a crucial element of clinical gene therapy. However, viral integration has associated risks, such as the unintentional activation of oncogenes that can result in cancer. Therefore, the analysis of integration sites of retroviral vectors is a crucial step in developing safer vectors for therapeutic use.

RESULTS: Here we present VISMapper, a vector integration site analysis web server, to analyze next-generation sequencing data for retroviral vector integration sites. VISMapper can be found at: http://vismapper.babelomics.org .

CONCLUSIONS: Because it uses novel mapping algorithms, VISMapper is remarkably faster than previously available programs. It also provides a useful graphical interface to analyze the integration sites found in the genomic context.

%B BMC Bioinformatics %V 18 %P 421 %8 2017 Sep 20 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/28931371?dopt=Abstract %R 10.1186/s12859-017-1837-z %0 Journal Article %J Genome biology %D 2017 %T Whole exome sequencing coupled with unbiased functional analysis reveals new Hirschsprung disease genes. %A Gui, Hongsheng %A Schriemer, Duco %A Cheng, William W %A Chauhan, Rajendra K %A Antiňolo, Guillermo %A Berrios, Courtney %A Bleda, Marta %A Brooks, Alice S %A Brouwer, Rutger W W %A Burns, Alan J %A Cherny, Stacey S %A Dopazo, Joaquin %A Eggen, Bart J L %A Griseri, Paola %A Jalloh, Binta %A Le, Thuy-Linh %A Lui, Vincent C H %A Luzón-Toro, Berta %A Matera, Ivana %A Ngan, Elly S W %A Pelet, Anna %A Ruiz-Ferrer, Macarena %A Sham, Pak C %A Shepherd, Iain T %A So, Man-Ting %A Sribudiani, Yunia %A Tang, Clara S M %A van den Hout, Mirjam C G N %A van der Linde, Herma C %A van Ham, Tjakko J %A van IJcken, Wilfred F J %A Verheij, Joke B G M %A Amiel, Jeanne %A Borrego, Salud %A Ceccherini, Isabella %A Chakravarti, Aravinda %A Lyonnet, Stanislas %A Tam, Paul K H %A Garcia-Barceló, Maria-Mercè %A Hofstra, Robert Mw %K Hirschprung %K Rare Disease %K WES %X BACKGROUND: Hirschsprung disease (HSCR), which is congenital obstruction of the bowel, results from a failure of enteric nervous system (ENS) progenitors to migrate, proliferate, differentiate, or survive within the distal intestine. Previous studies that have searched for genes underlying HSCR have focused on ENS-related pathways and genes not fitting the current knowledge have thus often been ignored. We identify and validate novel HSCR genes using whole exome sequencing (WES), burden tests, in silico prediction, unbiased in vivo analyses of the mutated genes in zebrafish, and expression analyses in zebrafish, mouse, and human. RESULTS: We performed de novo mutation (DNM) screening on 24 HSCR trios. We identify 28 DNMs in 21 different genes. 
Eight of the DNMs we identified occur in RET, the main HSCR gene, and the remaining 20 DNMs reside in genes not reported in the ENS. Knockdown of all 12 genes with missense or loss-of-function DNMs showed that the orthologs of four genes (DENND3, NCLN, NUP98, and TBATA) are indispensable for ENS development in zebrafish, and these results were confirmed by CRISPR knockout. These genes are also expressed in human and mouse gut and/or ENS progenitors. Importantly, the encoded proteins are linked to neuronal processes shared by the central nervous system and the ENS. CONCLUSIONS: Our data open new fields of investigation into HSCR pathology and provide novel insights into the development of the ENS. Moreover, the study demonstrates that functional analyses of genes carrying DNMs are warranted to delineate the full genetic architecture of rare complex diseases. %B Genome biology %V 18 %P 48 %8 2017 Mar 08 %G eng %U http://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1174-6 %R 10.1186/s13059-017-1174-6 %0 Journal Article %J Genome Biology %D 2017 %T Whole exome sequencing coupled with unbiased functional analysis reveals new Hirschsprung disease genes %A Gui, Hongsheng %A Schriemer, Duco %A Cheng, William W. %A Chauhan, Rajendra K. %A Antiňolo, Guillermo %A Berrios, Courtney %A Bleda, Marta %A Brooks, Alice S. %A Brouwer, Rutger W. W. %A Burns, Alan J. %A Cherny, Stacey S. %A Dopazo, Joaquin %A Eggen, Bart J. L. %A Griseri, Paola %A Jalloh, Binta %A Le, Thuy-Linh %A Lui, Vincent C. H. %A Luzón-Toro, Berta %A Matera, Ivana %A Ngan, Elly S. W. %A Pelet, Anna %A Ruiz-Ferrer, Macarena %A Sham, Pak C. %A Shepherd, Iain T. %A So, Man-Ting %A Sribudiani, Yunia %A Tang, Clara S. M. %A van den Hout, Mirjam C. G. N. %A van der Linde, Herma C. %A van Ham, Tjakko J. %A van IJcken, Wilfred F. J. %A Verheij, Joke B. G. M. %A Amiel, Jeanne %A Borrego, Salud %A Ceccherini, Isabella %A Chakravarti, Aravinda %A Lyonnet, Stanislas %A Tam, Paul K. H. 
%A Garcia-Barceló, Maria-Mercè %A Hofstra, Robert M. W. %B Genome Biology %V 18 %8 Jan-12-2017 %G eng %U http://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1174-6 http://link.springer.com/content/pdf/10.1186/s13059-017-1174-6.pdf %N 1 %! Genome Biol %R 10.1186/s13059-017-1174-6 %0 Journal Article %J Molecular biology and evolution %D 2016 %T 267 Spanish exomes reveal population-specific differences in disease-related genetic variation. %A Joaquín Dopazo %A Amadoz, Alicia %A Bleda, Marta %A García-Alonso, Luz %A Alemán, Alejandro %A Garcia-Garcia, Francisco %A Rodriguez, Juan A %A Daub, Josephine T %A Muntané, Gerard %A Antonio Rueda %A Vela-Boza, Alicia %A López-Domingo, Francisco J %A Florido, Javier P %A Arce, Pablo %A Ruiz-Ferrer, Macarena %A Méndez-Vidal, Cristina %A Arnold, Todd E %A Spleiss, Olivia %A Alvarez-Tejado, Miguel %A Navarro, Arcadi %A Bhattacharya, Shomi S %A Borrego, Salud %A Santoyo-López, Javier %A Antiňolo, Guillermo %K disease %K NGS %K polymorphisms %K Population genomics %K prioritization %K SNP %X Recent results from large-scale genomic projects suggest that allele frequencies, which are highly relevant for medical purposes, differ considerably across different populations. The need for a detailed catalogue of local variability motivated the whole exome sequencing of 267 unrelated individuals, representative of the healthy Spanish population. As in other studies, a considerable number of rare variants were found (almost one third of the described variants). There were also relevant differences in allelic frequencies in polymorphic variants, including about 10,000 polymorphisms private to the Spanish population. The allelic frequencies of variants conferring susceptibility to complex diseases (including cancer, schizophrenia, Alzheimer disease, type 2 diabetes and other pathologies) were overall similar to those of other populations. 
However, the trend is the opposite for variants linked to Mendelian and rare diseases (including several retinal degenerative dystrophies and cardiomyopathies) that show marked frequency differences between populations. Interestingly, a correspondence between differences in allelic frequencies and disease prevalence was found, highlighting the relevance of frequency differences in disease risk. These differences are also observed in variants that disrupt known drug binding sites, suggesting an important role for local variability in population-specific drug resistances or adverse effects. We have made publicly available the Spanish population variant server web page (http://spv.babelomics.org/), which contains population frequency information for the complete list of 170,888 variant positions we found. We show that it is fundamental to determine population-specific variant frequencies in order to distinguish real disease associations from population-specific polymorphisms. %B Molecular biology and evolution %8 2016 Jan 13 %G eng %U https://mbe.oxfordjournals.org/content/early/2016/02/17/molbev.msw005.full %R 10.1093/molbev/msw005 %0 Journal Article %J Nucleic acids research %D 2016 %T Actionable pathways: interactive discovery of therapeutic targets using signaling pathway models. %A Salavert, Francisco %A Hidago, Marta R %A Amadoz, Alicia %A Cubuk, Cankut %A Medina, Ignacio %A Crespo, Daniel %A Carbonell-Caballero, José %A Joaquín Dopazo %K actionable genes %K Disease mechanism %K drug action mechanism %K Drug discovery %K pathway analysis %K personalized medicine %K signalling %K therapeutic targets %X The discovery of actionable targets is crucial for targeted therapies and is also a constituent part of the drug discovery process. The success of an intervention over a target depends critically on its contribution, within the complex network of gene interactions, to the cellular processes responsible for disease progression or therapeutic response. 
Here we present PathAct, a web server that predicts the effect that interventions over genes (inhibitions or activations that simulate knock-outs, drug treatments or over-expressions) can have over signal transmission within signaling pathways and, ultimately, over the cell functionalities triggered by them. PathAct implements an advanced graphical interface that provides a unique interactive working environment in which the suitability of potentially actionable genes, that could eventually become drug targets for personalized or individualized therapies, can be easily tested. The PathAct tool can be found at: http://pathact.babelomics.org. %B Nucleic acids research %8 2016 May 2 %G eng %U http://nar.oxfordjournals.org/content/early/2016/05/02/nar.gkw369.full %R 10.1093/nar/gkw369 %0 Journal Article %J The Journal of molecular diagnostics : JMD %D 2016 %T Assessment of Targeted Next-Generation Sequencing as a Tool for the Diagnosis of Charcot-Marie-Tooth Disease and Hereditary Motor Neuropathy. %A Lupo, Vincenzo %A Garcia-Garcia, Francisco %A Sancho, Paula %A Tello, Cristina %A García-Romero, Mar %A Villarreal, Liliana %A Alberti, Antonia %A Sivera, Rafael %A Joaquín Dopazo %A Pascual-Pascual, Samuel I %A Márquez-Infante, Celedonio %A Casasnovas, Carlos %A Sevilla, Teresa %A Espinós, Carmen %K Charcot-Marie-Tooth %K CMT %K Diagnostic %K NGS %K Panels %K rare diseases %K Targeted resequencing %X Charcot-Marie-Tooth disease is characterized by broad genetic heterogeneity with >50 known disease-associated genes. Mutations in some of these genes can cause a pure motor form of hereditary motor neuropathy, the genetics of which are poorly characterized. We designed a panel comprising 56 genes associated with Charcot-Marie-Tooth disease/hereditary motor neuropathy. We validated this diagnostic tool by first testing 11 patients with pathological mutations. A cohort of 33 affected subjects was selected for this study. 
The DNAJB2 c.352+1G>A mutation was detected in two cases; novel changes and/or variants with low frequency (<1%) were found in 12 cases. There were no candidate variants in 18 cases, and amplification failed for one sample. The DNAJB2 c.352+1G>A mutation was also detected in three additional families. On haplotype analysis, all of the patients from these five families shared the same haplotype; therefore, the DNAJB2 c.352+1G>A mutation may be a founder event. Our gene panel allowed us to perform a very rapid and cost-effective screening of genes involved in Charcot-Marie-Tooth disease/hereditary motor neuropathy. Our diagnostic strategy was robust in terms of both coverage and read depth for all of the genes and patient samples. These findings demonstrate the difficulty in achieving a definitive molecular diagnosis because of the complexity of interpreting new variants and the genetic heterogeneity that is associated with these neuropathies. %B The Journal of molecular diagnostics : JMD %8 2016 Jan 2 %G eng %U http://www.sciencedirect.com/science/article/pii/S1525157815002615 %R 10.1016/j.jmoldx.2015.10.005 %0 Journal Article %J Stress (Amsterdam, Netherlands) %D 2016 %T Chronic subordination stress selectively downregulates the insulin signaling pathway in liver and skeletal muscle but not in adipose tissue of male mice. %A Sanghez, Valentina %A Cubuk, Cankut %A Sebastián-Leon, Patricia %A Carobbio, Stefania %A Dopazo, Joaquin %A Vidal-Puig, Antonio %A Bartolomucci, Alessandro %K Adipose tissue %K insulin %K IRS1 %K IRS2 %K metabolic syndrome %K obesity %K pathway analysis %X Chronic stress has been associated with obesity, glucose intolerance, and insulin resistance. We developed a model of chronic psychosocial stress (CPS) in which subordinate mice are vulnerable to obesity and the metabolic-like syndrome while dominant mice exhibit a healthy metabolic phenotype. 
Here we tested the hypothesis that the metabolic difference between subordinate and dominant mice is associated with changes in functional pathways relevant for insulin sensitivity, glucose and lipid homeostasis. Male mice were exposed to CPS for four weeks and fed either a standard diet or a high-fat diet (HFD). We first measured candidate genes by real-time PCR in the liver, skeletal muscle, and the perigonadal white adipose tissue (pWAT). Subsequently, we used a probabilistic analysis approach to analyze different ways in which signals can be transmitted across the pathways in each tissue. Results showed that subordinate mice displayed a drastic downregulation of the insulin pathway in liver and muscle, indicative of insulin resistance, already on standard diet. Conversely, pWAT showed molecular changes suggestive of facilitated fat deposition in an otherwise insulin-sensitive tissue. The molecular changes in subordinate mice fed a standard diet were greater compared to HFD-fed controls. Finally, dominant mice maintained a substantially normal metabolic and molecular phenotype even when fed a HFD. Overall, our data demonstrate that subordination stress is a potent stimulus for the downregulation of the insulin signaling pathway in liver and muscle and a major risk factor for the development of obesity, insulin resistance, and type 2 diabetes mellitus. %B Stress (Amsterdam, Netherlands) %P 1-11 %8 2016 Mar 7 %G eng %U http://www.tandfonline.com/doi/abs/10.3109/10253890.2016.1151491?journalCode=ists20 %R 10.3109/10253890.2016.1151491 %0 Journal Article %J Cell Cycle %D 2016 %T Dysfunctional mitochondrial fission impairs cell reprogramming. 
%A Prieto, Javier %A León, Marian %A Ponsoda, Xavier %A Garcia-Garcia, Francisco %A Bort, Roque %A Serna, Eva %A Barneo-Muñoz, Manuela %A Palau, Francesc %A Dopazo, Joaquin %A López-García, Carlos %A Torres, Josema %K Animals %K Cell Cycle Checkpoints %K Cellular Reprogramming %K DNA Damage %K G2 Phase %K Gene Knockdown Techniques %K Mice %K Mitochondrial Dynamics %K Mitosis %K Nerve Tissue Proteins %K Pluripotent Stem Cells %K Transcription Factors %X

We have recently shown that mitochondrial fission is induced early in reprogramming in a Drp1-dependent manner; however, the identity of the factors controlling Drp1 recruitment to mitochondria was unexplored. To investigate this, we used a panel of RNAi targeting factors involved in the regulation of mitochondrial dynamics and observed that MiD51, Gdap1 and, to a lesser extent, Mff play key roles in this process. Cells derived from Gdap1-null mice were used to further explore the role of this factor in cell reprogramming. Microarray data revealed a prominent down-regulation of cell cycle pathways in Gdap1-null cells early in reprogramming, and cell cycle profiling uncovered a G2/M growth arrest in Gdap1-null cells undergoing reprogramming. High-content analysis showed that this growth arrest was DNA damage-independent. We propose that lack of efficient mitochondrial fission impairs cell reprogramming by interfering with cell cycle progression in a DNA damage-independent manner.

%B Cell Cycle %V 15 %P 3240-3250 %8 2016 Dec %G eng %N 23 %1 https://www.ncbi.nlm.nih.gov/pubmed/27753531?dopt=Abstract %R 10.1080/15384101.2016.1241930 %0 Journal Article %J Nature communications %D 2016 %T Extension of human lncRNA transcripts by RACE coupled with long-read high-throughput sequencing (RACE-Seq). %A Lagarde, Julien %A Uszczynska-Ratajczak, Barbara %A Santoyo-López, Javier %A Gonzalez, Jose Manuel %A Tapanari, Electra %A Mudge, Jonathan M %A Steward, Charles A %A Wilming, Laurens %A Tanzer, Andrea %A Howald, Cédric %A Chrast, Jacqueline %A Vela-Boza, Alicia %A Antonio Rueda %A López-Domingo, Francisco J %A Dopazo, Joaquin %A Reymond, Alexandre %A Guigó, Roderic %A Harrow, Jennifer %X Long non-coding RNAs (lncRNAs) constitute a large, yet mostly uncharacterized fraction of the mammalian transcriptome. Such characterization requires a comprehensive, high-quality annotation of their gene structure and boundaries, which is currently lacking. Here we describe RACE-Seq, an experimental workflow designed to address this based on RACE (rapid amplification of cDNA ends) and long-read RNA sequencing. We apply RACE-Seq to 398 human lncRNA genes in seven tissues, leading to the discovery of 2,556 on-target, novel transcripts. About 60% of the targeted loci are extended in either 5’ or 3’, often reaching genomic hallmarks of gene boundaries. Analysis of the novel transcripts suggests that lncRNAs are as long, have as many exons and undergo as much alternative splicing as protein-coding genes, contrary to current assumptions. Overall, we show that RACE-Seq is an effective tool to annotate an organism’s deep transcriptome, and compares favourably to other targeted sequencing techniques. 
%B Nature communications %V 7 %P 12339 %8 2016 %G eng %U http://www.nature.com/articles/ncomms12339 %R 10.1038/ncomms12339 %0 Journal Article %J Nature Communications %D 2016 %T Extension of human lncRNA transcripts by RACE coupled with long-read high-throughput sequencing (RACE-Seq) %A Lagarde, Julien %A Uszczynska-Ratajczak, Barbara %A Santoyo-López, Javier %A Gonzalez, Jose Manuel %A Tapanari, Electra %A Mudge, Jonathan M. %A Steward, Charles A. %A Wilming, Laurens %A Tanzer, Andrea %A Howald, Cédric %A Chrast, Jacqueline %A Vela-Boza, Alicia %A Rueda, Antonio %A Lopez-Domingo, Francisco J. %A Dopazo, Joaquin %A Reymond, Alexandre %A Guigó, Roderic %A Harrow, Jennifer %B Nature Communications %V 7 %8 Jan-11-2016 %G eng %U http://www.nature.com/articles/ncomms12339 http://www.nature.com/articles/ncomms12339.pdf %N 1 %! Nat Commun %R 10.1038/ncomms12339 %0 Journal Article %J DNA Res %D 2016 %T Highly sensitive and ultrafast read mapping for RNA-seq analysis. %A Medina, I %A Tárraga, J %A Martínez, H %A Barrachina, S %A Castillo, M I %A Paschall, J %A Salavert-Torres, J %A Blanquer-Espert, I %A Hernández-García, V %A Quintana-Ortí, E S %A Dopazo, J %K Genomics %K High-Throughput Nucleotide Sequencing %K Humans %K Sensitivity and Specificity %K Sequence Analysis, RNA %K Transcriptome %X

As sequencing technologies progress, the amount of data produced grows exponentially, shifting the bottleneck of discovery towards the data analysis phase. In particular, currently available mapping solutions for RNA-seq leave room for improvement in terms of sensitivity and performance, hindering an efficient analysis of transcriptomes by massive sequencing. Here, we present an innovative approach that combines re-engineering, optimization and parallelization. This solution results in a significant increase of mapping sensitivity over a wide range of read lengths and substantially shorter runtimes when compared with currently available RNA-seq mapping methods.

%B DNA Res %V 23 %P 93-100 %8 2016 Apr %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/26740642?dopt=Abstract %R 10.1093/dnares/dsv039 %0 Journal Article %J BMC Bioinformatics %D 2016 %T HPG pore: an efficient and scalable framework for nanopore sequencing data %A Tárraga, Joaquín %A Gallego, Asunción %A Arnau, Vicente %A Medina, Ignacio %A Dopazo, Joaquin %B BMC Bioinformatics %V 17 %8 Jan-12-2016 %G eng %U http://www.biomedcentral.com/1471-2105/17/107 http://link.springer.com/content/pdf/10.1186/s12859-016-0966-0 %N 1 %! BMC Bioinformatics %R 10.1186/s12859-016-0966-0 %0 Journal Article %J BMC bioinformatics %D 2016 %T HPG pore: an efficient and scalable framework for nanopore sequencing data. %A Tárraga, Joaquín %A Gallego, Asunción %A Arnau, Vicente %A Medina, Ignacio %A Dopazo, Joaquin %K hadoop %K HPC %K nanopore %K NGS %X BACKGROUND: The use of nanopore technologies is expected to spread in the future because they are portable and can sequence long fragments of DNA molecules without prior amplification. The first nanopore sequencer available, the MinION™ from Oxford Nanopore Technologies, is a USB-connected, portable device that allows real-time DNA analysis. In addition, other new instruments are expected to be released soon, which promise to outperform the current short-read technologies in terms of throughput. Despite the flood of data expected from this technology, the data analysis solutions currently available are only designed to manage small projects and are not scalable. RESULTS: Here we present HPG Pore, a toolkit for exploring and analysing nanopore sequencing data. HPG Pore can run on both individual computers and in the Hadoop distributed computing framework, which allows easy scale-up to manage the large amounts of data expected to result from extensive use of nanopore technologies in the future. 
CONCLUSIONS: HPG Pore allows for virtually unlimited sequencing data scalability, thus guaranteeing its continued management in near future scenarios. HPG Pore is available in GitHub at http://github.com/opencb/hpg-pore . %B BMC bioinformatics %V 17 %P 107 %8 2016 %G eng %U http://www.biomedcentral.com/1471-2105/17/107 %R 10.1186/s12859-016-0966-0 %0 Journal Article %J Transl Psychiatry %D 2016 %T Human DNA methylomes of neurodegenerative diseases show common epigenomic patterns. %A Sanchez-Mut, J V %A Heyn, H %A Vidal, E %A Moran, S %A Sayols, S %A Delgado-Morales, R %A Schultz, M D %A Ansoleaga, B %A Garcia-Esparcia, P %A Pons-Espinal, M %A de Lagran, M M %A Dopazo, J %A Rabano, A %A Avila, J %A Dierssen, M %A Lott, I %A Ferrer, I %A Ecker, J R %A Esteller, M %K Adult %K Aged %K Aged, 80 and over %K DNA Methylation %K Epigenomics %K Female %K Humans %K Male %K Middle Aged %K neurodegenerative diseases %K Prefrontal Cortex %K Tissue Array Analysis %X

Different neurodegenerative disorders often show similar lesions, such as the presence of amyloid plaques, TAU-neurotangles and synuclein inclusions. The genetically inherited forms are rare, so we wondered whether shared epigenetic aberrations, such as those affecting DNA methylation, might also exist. The studied samples were gray matter samples from the prefrontal cortex of control and neurodegenerative disease-associated cases. We performed the DNA methylation analyses of Alzheimer's disease, dementia with Lewy bodies, Parkinson's disease and Alzheimer-like neurodegenerative profile associated with Down's syndrome samples. The DNA methylation landscapes obtained show that neurodegenerative diseases share similar aberrant CpG methylation shifts targeting a defined gene set. Our findings suggest that neurodegenerative disorders might have similar pathogenetic mechanisms that subsequently evolve into different clinical entities. The identified aberrant DNA methylation changes can be used as biomarkers of the disorders and as potential new targets for the development of new therapies.

%B Transl Psychiatry %V 6 %P e718 %8 2016 Jan 19 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/26784972?dopt=Abstract %R 10.1038/tp.2015.214 %0 Journal Article %J Sci Rep %D 2016 %T Identification of the Photoreceptor Transcriptional Co-Repressor SAMD11 as Novel Cause of Autosomal Recessive Retinitis Pigmentosa. %A Corton, M %A Avila-Fernández, A %A Campello, L %A Sánchez, M %A Benavides, B %A López-Molina, M I %A Fernández-Sánchez, L %A Sánchez-Alcudia, R %A da Silva, L R J %A Reyes, N %A Martín-Garrido, E %A Zurita, O %A Fernández-San José, P %A Pérez-Carro, R %A García-García, F %A Dopazo, J %A García-Sandoval, B %A Cuenca, N %A Ayuso, C %K Aged %K Animals %K Co-Repressor Proteins %K Codon, Nonsense %K Cohort Studies %K Comparative Genomic Hybridization %K Consanguinity %K DNA Mutational Analysis %K Exome %K Eye Proteins %K Female %K Gene Expression Regulation %K Genes, Recessive %K Homeodomain Proteins %K Homozygote %K Humans %K Male %K Mice %K Middle Aged %K Polymorphism, Single Nucleotide %K Protein Interaction Mapping %K Retina %K Retinal Dystrophies %K Retinal Rod Photoreceptor Cells %K Retinitis pigmentosa %K Spain %K Trans-Activators %K Transcription Factors %X

Retinitis pigmentosa (RP), the most frequent form of inherited retinal dystrophy, is characterized by progressive photoreceptor degeneration. Many genes have been implicated in RP development, but several others remain to be identified. Using a combination of homozygosity mapping, whole-exome and targeted next-generation sequencing, we found a novel homozygous nonsense mutation in SAMD11 in five individuals diagnosed with adult-onset RP from two unrelated consanguineous Spanish families. SAMD11 is orthologous to the mouse major retinal SAM domain (mr-s) protein that is implicated in CRX-mediated transcriptional regulation in the retina. Accordingly, protein-protein network analysis revealed a significant interaction of SAMD11 with CRX. Immunoblotting analysis confirmed strong expression of SAMD11 in human retina. Immunolocalization studies revealed that SAMD11 was detected in the three nuclear layers of the human retina and, interestingly, differential expression between cone and rod photoreceptors was observed. Our study strongly implicates SAMD11 as a novel cause of RP, playing an important role in the pathogenesis of human degeneration of photoreceptors.

%B Sci Rep %V 6 %P 35370 %8 2016 10 13 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/27734943?dopt=Abstract %R 10.1038/srep35370 %0 Journal Article %J Sci Rep %D 2016 %T Improving the management of Inherited Retinal Dystrophies by targeted sequencing of a population-specific gene panel. %A Bravo-Gil, Nereida %A Méndez-Vidal, Cristina %A Romero-Pérez, Laura %A González-del Pozo, María %A Rodríguez-de la Rúa, Enrique %A Dopazo, Joaquin %A Borrego, Salud %A Antiňolo, Guillermo %K Alleles %K Computer Simulation %K DNA Copy Number Variations %K DNA Mutational Analysis %K Eye Proteins %K Gene Library %K Genetic Association Studies %K Genetic Heterogeneity %K Genetic Therapy %K High-Throughput Nucleotide Sequencing %K Humans %K mutation %K Phenotype %K Retinal Dystrophies %X

Next-generation sequencing (NGS) has overcome important limitations to the molecular diagnosis of Inherited Retinal Dystrophies (IRD) such as the high clinical and genetic heterogeneity and the overlapping phenotypes. The purpose of this study was the identification of the genetic defect in 32 Spanish families with different forms of IRD. With that aim, we implemented a custom NGS panel comprising 64 IRD-associated genes in our population, and three disease-associated intronic regions. A total of 37 pathogenic mutations (14 novel) were found in 73% of IRD patients ranging from 50% for autosomal dominant cases, 75% for syndromic cases, 83% for autosomal recessive cases, and 100% for X-linked cases. Additionally, unexpected phenotype-genotype correlations were found in 6 probands, which led to the refinement of their clinical diagnoses. Furthermore, intra- and interfamilial phenotypic variability was observed in two cases. Moreover, two cases unsuccessfully analysed by exome sequencing were resolved by applying this panel. Our results demonstrate that this hypothesis-free approach based on frequently mutated, population-specific loci is highly cost-efficient for the routine diagnosis of this heterogeneous condition and allows the unbiased analysis of a miscellaneous cohort. The molecular information found here has aided clinical diagnosis and has improved genetic counselling and patient management.

%B Sci Rep %V 6 %P 23910 %8 2016 Apr 01 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/27032803?dopt=Abstract %R 10.1038/srep23910 %0 Journal Article %J Bioinformatics %D 2016 %T Integrated gene set analysis for microRNA studies. %A Garcia-Garcia, Francisco %A Panadero, Joaquin %A Dopazo, Joaquin %A Montaner, David %K Computational Biology %K Gene Expression Profiling %K Gene ontology %K Gene Regulatory Networks %K High-Throughput Nucleotide Sequencing %K Humans %K MicroRNAs %K Neoplasms %K Reproducibility of Results %X

MOTIVATION: Functional interpretation of miRNA expression data is currently done in a three-step procedure: select differentially expressed miRNAs, find their target genes, and carry out gene set overrepresentation analysis. Nevertheless, major limitations of this approach have already been described at the gene level, while some newer ones arise in the miRNA scenario. Here, we propose an enhanced methodology that builds on the well-established gene set analysis paradigm. Evidence for differential expression at the miRNA level is transferred to a gene differential inhibition score which is easily interpretable in terms of gene sets or pathways. Such transferred indexes account for the additive effect of several miRNAs targeting the same gene, and also incorporate cancellation effects between cases and controls. Together, these two desirable characteristics allow for more accurate modeling of regulatory processes.

RESULTS: We analyze high-throughput sequencing data from 20 different cancer types and provide exhaustive reports of gene and Gene Ontology-term deregulation by miRNA action.

AVAILABILITY AND IMPLEMENTATION: The proposed methodology was implemented in the Bioconductor library mdgsa (http://bioconductor.org/packages/mdgsa). For the purpose of reproducibility, all of the scripts are available at https://github.com/dmontaner-papers/gsa4mirna

CONTACT: david.montaner@gmail.com

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

%B Bioinformatics %V 32 %P 2809-16 %8 2016 09 15 %G eng %N 18 %1 https://www.ncbi.nlm.nih.gov/pubmed/27324197?dopt=Abstract %R 10.1093/bioinformatics/btw334 %0 Journal Article %J Plant Biotechnol J %D 2016 %T Integrating transcriptomic and metabolomic analysis to understand natural leaf senescence in sunflower. %A Moschen, Sebastián %A Bengoa Luoni, Sofía %A Di Rienzo, Julio A %A Caro, María Del Pilar %A Tohge, Takayuki %A Watanabe, Mutsumi %A Hollmann, Julien %A Gonzalez, Sergio %A Rivarola, Máximo %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Hopp, Horacio Esteban %A Hoefgen, Rainer %A Fernie, Alisdair R %A Paniego, Norma %A Fernandez, Paula %A Heinz, Ruth A %K Gas Chromatography-Mass Spectrometry %K Gene Expression Profiling %K Gene Expression Regulation, Plant %K Gene ontology %K Genes, Plant %K Helianthus %K Ions %K metabolomics %K Oligonucleotide Array Sequence Analysis %K Plant Leaves %K Principal Component Analysis %K RNA, Messenger %K Transcription Factors %X

Leaf senescence is a complex process, which has dramatic consequences on crop yield. In sunflower, the gap between potential and actual yields reveals the economic impact of senescence. Indeed, sunflower plants are incapable of maintaining their green leaf area over sustained periods. This study characterizes the leaf senescence process in sunflower through a systems biology approach integrating transcriptomic and metabolomic analyses, with plants grown under both glasshouse and field conditions. Our results revealed a correspondence between profile changes detected at the molecular, biochemical and physiological level throughout the progression of leaf senescence measured at different plant developmental stages. Early metabolic changes were detected prior to anthesis and before the onset of the first senescence symptoms, with more pronounced changes observed when physiological and molecular variables were assessed under field conditions. During leaf development, photosynthetic activity and cell growth processes decreased, whereas sucrose, fatty acid, nucleotide and amino acid metabolisms increased. Pathways related to nutrient recycling processes were also up-regulated. Members of the NAC, AP2-EREBP, HB, bZIP and MYB transcription factor families showed high expression levels, and their expression levels were highly correlated, suggesting their involvement in sunflower senescence. The results of this study thus contribute to the elucidation of the molecular mechanisms involved in the onset and progression of leaf senescence in sunflower leaves as well as to the identification of candidate genes involved in this process.

%B Plant Biotechnol J %V 14 %P 719-34 %8 2016 Feb %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/26132509?dopt=Abstract %R 10.1111/pbi.12422 %0 Journal Article %J PLoS One %D 2016 %T The Mutational Landscape of Acute Promyelocytic Leukemia Reveals an Interacting Network of Co-Occurrences and Recurrent Mutations. %A Ibáñez, Mariam %A Carbonell-Caballero, José %A García-Alonso, Luz %A Such, Esperanza %A Jiménez-Almazán, Jorge %A Vidal, Enrique %A Barragán, Eva %A López-Pavía, María %A LLop, Marta %A Martín, Iván %A Gómez-Seguí, Inés %A Montesinos, Pau %A Sanz, Miguel A %A Dopazo, Joaquin %A Cervera, José %K Exome %K Gene Regulatory Networks %K Genome, Human %K Humans %K INDEL Mutation %K Leukemia, Promyelocytic, Acute %K mutation %K Mutation Rate %K Polymorphism, Single Nucleotide %K Reproducibility of Results %X

Preliminary Acute Promyelocytic Leukemia (APL) whole exome sequencing (WES) studies have identified a huge number of somatic mutations affecting more than a hundred different genes mainly in a non-recurrent manner, suggesting that APL is a heterogeneous disease with secondary relevant changes not yet defined. To extend our knowledge of subtle genetic alterations involved in APL that might cooperate with PML/RARA in the leukemogenic process, we performed a comprehensive analysis of somatic mutations in APL combining WES with sequencing of a custom panel of targeted genes by next-generation sequencing. To select a reduced subset of high confidence candidate driver genes, further in silico analyses were carried out. After prioritization and network analysis we found recurrent deleterious mutations in 8 individual genes (STAG2, U2AF1, SMC1A, USP9X, IKZF1, LYN, MYCBP2 and PTPN11) with a strong potential of being involved in APL pathogenesis. Our network analysis of multiple mutations provides a reliable approach to prioritize genes for additional analysis, improving our knowledge of the leukemogenesis interactome. Additionally, we have defined a functional module in the interactome of APL. The hypothesis is that the number, or the specific combinations, of mutations harbored in each patient might not be as important as the disturbance caused in biological key functions, triggered by several not necessarily recurrent mutations.

%B PLoS One %V 11 %P e0148346 %8 2016 %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/26886259?dopt=Abstract %R 10.1371/journal.pone.0148346 %0 Journal Article %J Brain %D 2016 %T Mutations in the MORC2 gene cause axonal Charcot-Marie-Tooth disease. %A Sevilla, Teresa %A Lupo, Vincenzo %A Martínez-Rubio, Dolores %A Sancho, Paula %A Sivera, Rafael %A Chumillas, María J %A García-Romero, Mar %A Pascual-Pascual, Samuel I %A Muelas, Nuria %A Dopazo, Joaquin %A Vílchez, Juan J %A Palau, Francesc %A Espinós, Carmen %K Adult %K Aged %K Animals %K Axons %K Charcot-Marie-Tooth Disease %K Female %K gene expression %K Humans %K Infant %K Male %K Mice %K Middle Aged %K mutation %K Pedigree %K Phenotype %K Sciatic Nerve %K Sural Nerve %K Transcription Factors %K Young Adult %X

Charcot-Marie-Tooth disease (CMT) is a complex disorder with wide genetic heterogeneity. Here we present a new axonal Charcot-Marie-Tooth disease form, associated with the gene microrchidia family CW-type zinc finger 2 (MORC2). Whole-exome sequencing in a family with autosomal dominant segregation identified the novel MORC2 p.R190W change in four patients. Further mutational screening in our axonal Charcot-Marie-Tooth disease clinical series detected two additional sporadic cases, one patient who also carried the same MORC2 p.R190W mutation and another patient that harboured a MORC2 p.S25L mutation. Genetic and in silico studies strongly supported the pathogenicity of these sequence variants. The phenotype was variable and included patients with congenital or infantile onset, as well as others whose symptoms started in the second decade. The patients with early onset developed a spinal muscular atrophy-like picture, whereas in the later onset cases, the initial symptoms were cramps, distal weakness and sensory impairment. Weakness and atrophy progressed in a random and asymmetric fashion and involved limb girdle muscles, leading to a severe incapacity in adulthood. Sensory loss was always prominent and proportional to disease severity. Electrophysiological studies were consistent with an asymmetric axonal motor and sensory neuropathy, while fasciculations and myokymia were recorded rather frequently by needle electromyography. Sural nerve biopsy revealed pronounced multifocal depletion of myelinated fibres with some regenerative clusters and occasional small onion bulbs. Morc2 is expressed in both axons and Schwann cells of mouse peripheral nerve. Different roles in biological processes have been described for MORC2. 
As the silencing of Charcot-Marie-Tooth disease genes has been associated with DNA damage response, it is tempting to speculate that a deregulation of this pathway may be linked to the axonal degeneration observed in MORC2 neuropathy, thus adding a new pathogenic mechanism to the long list of causes of Charcot-Marie-Tooth disease.

%B Brain %V 139 %P 62-72 %8 2016 Jan %G eng %N Pt 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/26497905?dopt=Abstract %R 10.1093/brain/awv311 %0 Journal Article %J Scientific Reports %D 2016 %T The pan-cancer pathological regulatory landscape %A Falco, Matias M. %A Bleda, Marta %A Carbonell-Caballero, José %A Dopazo, Joaquin %B Scientific Reports %V 6 %8 Jan-12-2016 %G eng %U http://www.nature.com/articles/srep39709 http://www.nature.com/articles/srep39709.pdf %N 1 %! Sci Rep %R 10.1038/srep39709 %0 Journal Article %J Scientific reports %D 2016 %T The pan-cancer pathological regulatory landscape. %A Falco, Matias M %A Bleda, Marta %A Carbonell-Caballero, José %A Joaquín Dopazo %X Dysregulation of the normal gene expression program is the cause of a broad range of diseases, including cancer. Detecting the specific perturbed regulators that have an effect on the generation and the development of the disease is crucial for understanding the disease mechanism and for taking decisions on efficient preventive and curative therapies. Moreover, detecting such perturbations at the patient level is even more important from the perspective of personalized medicine. We applied the Transcription Factor Target Enrichment Analysis, a method that detects the activity of transcription factors based on the quantification of the collective transcriptional activation of their targets, to a large collection of 5607 cancer samples covering eleven cancer types. We produced for the first time a comprehensive catalogue of altered transcription factor activities in cancer, a considerable number of them significantly associated with patient survival.
Moreover, we described several interesting TFs whose activity does not change substantially in the cancer with respect to the normal tissue but which ultimately play an important role in patient prognostic determination, which suggests they might be promising therapeutic targets. An additional advantage of this method is that it allows obtaining personalized TF activity estimations for individual patients. %B Scientific reports %V 6 %P 39709 %8 2016 Dec 21 %G eng %U http://www.nature.com/articles/srep39709 %R 10.1038/srep39709 %0 Journal Article %J Drug Metab Pers Ther %D 2016 %T Progress in pharmacogenetics: consortiums and new strategies. %A Maroñas, Olalla %A Latorre, Ana %A Dopazo, Joaquin %A Pirmohamed, Munir %A Rodríguez-Antona, Cristina %A Siest, Gérard %A Carracedo, Ángel %A LLerena, Adrián %K Cooperative Behavior %K Genome-Wide Association Study %K High-Throughput Screening Assays %K Humans %K Patient Care Team %K pharmacogenetics %K Polymorphism, Single Nucleotide %K Precision Medicine %X

Pharmacogenetics (PGx), as a field dedicated to achieving the goal of personalized medicine (PM), is devoted to the study of genes involved in inter-individual response to drugs. Due to its nature, PGx requires access to large samples; therefore, in order to progress, the formation of collaborative consortia seems to be crucial. Some examples of this collective effort are the European Society of Pharmacogenomics and personalized Therapy and the Ibero-American network of Pharmacogenetics. As an emerging field, one of the major challenges that PGx faces is translating its discoveries from research bench to bedside. The development of genomic high-throughput technologies is generating a revolution and offers the possibility of producing vast amounts of genome-wide single nucleotide polymorphisms for each patient. Moreover, there is a need to identify and replicate associations of new biomarkers, and, in addition, a greater effort must be invested in developing regulatory organizations to accomplish a correct standardization. In this review, we outline the current progress in PGx using examples to highlight both the importance of polymorphisms and the research strategies for their detection. These concepts need to be applied together with a proper dissemination of knowledge to improve clinician and patient understanding, in a multidisciplinary team-based approach.

%B Drug Metab Pers Ther %V 31 %P 17-23 %8 2016 Mar %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/26913460?dopt=Abstract %R 10.1515/dmpt-2015-0039 %0 Journal Article %J Am J Med Genet A %D 2016 %T Screening of CD96 and ASXL1 in 11 patients with Opitz C or Bohring-Opitz syndromes. %A Urreizti, Roser %A Roca-Ayats, Neus %A Trepat, Judith %A Garcia-Garcia, Francisco %A Alemán, Alejandro %A Orteschi, Daniela %A Marangi, Giuseppe %A Neri, Giovanni %A Opitz, John M %A Dopazo, Joaquin %A Cormand, Bru %A Vilageliu, Lluïsa %A Balcells, Susana %A Grinberg, Daniel %K Adolescent %K Antigens, CD %K Child %K Child, Preschool %K Craniosynostoses %K Exome %K Female %K High-Throughput Nucleotide Sequencing %K Humans %K Infant %K Intellectual Disability %K Male %K mutation %K Pedigree %K Phenotype %K Prognosis %K Repressor Proteins %X

Opitz C trigonocephaly (or Opitz C syndrome, OTCS) and Bohring-Opitz syndrome (BOS or C-like syndrome) are two rare genetic disorders with phenotypic overlap. The genetic causes of these diseases are not understood. However, two genes have been associated with OTCS or BOS with dominantly inherited de novo mutations. Whereas CD96 has been related to OTCS (one case) and to BOS (one case), ASXL1 has been related to BOS only (several cases). In this study we analyze CD96 and ASXL1 in a group of 11 affected individuals, including 2 sibs; 10 of them were diagnosed with OTCS, and one had a BOS phenotype. Exome sequences were available on six patients with OTCS and three parent pairs. Thus, we could analyze the CD96 and ASXL1 sequences in these patients bioinformatically. Sanger sequencing of all exons of CD96 and ASXL1 was carried out in the remaining patients. Detailed scrutiny of the sequences and assessment of variants allowed us to exclude putative pathogenic and private mutations in all but one of the patients. In this patient (with BOS) we identified a de novo mutation in ASXL1 (c.2100dupT). By nature and location within the gene, this mutation resembles those previously described in other BOS patients and we conclude that it may be responsible for the condition. Our results indicate that in 10 of the 11 patients, the disease (OTCS or BOS) cannot be explained by small changes in CD96 or ASXL1. However, the cohort is too small to make generalizations about the genetic etiology of these diseases.

%B Am J Med Genet A %V 170A %P 24-31 %8 2016 Jan %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/26768331?dopt=Abstract %R 10.1002/ajmg.a.37418 %0 Journal Article %J Oncotarget %D 2016 %T Serum metabolomic profiling facilitates the non-invasive identification of metabolic biomarkers associated with the onset and progression of non-small cell lung cancer. %A Puchades-Carrasco, Leonor %A Jantus-Lewintre, Eloisa %A Pérez-Rambla, Clara %A Garcia-Garcia, Francisco %A Lucas, Rut %A Calabuig, Silvia %A Blasco, Ana %A Dopazo, Joaquin %A Camps, Carlos %A Pineda-Lucena, Antonio %K Adult %K Aged %K Biomarkers, Tumor %K Carcinoma, Non-Small-Cell Lung %K Disease Progression %K Female %K Humans %K Lung Neoplasms %K Male %K metabolomics %K Middle Aged %K Proton Magnetic Resonance Spectroscopy %X

Lung cancer (LC) is responsible for most cancer deaths. One of the main factors contributing to the lethality of this disease is the fact that a large proportion of patients are diagnosed at advanced stages when a clinical intervention is unlikely to succeed. In this study, we evaluated the potential of metabolomics by 1H-NMR to facilitate the identification of accurate and reliable biomarkers to support the early diagnosis and prognosis of non-small cell lung cancer (NSCLC). We found that the metabolic profile of NSCLC patients, compared with healthy individuals, is characterized by statistically significant changes in the concentration of 18 metabolites representing different amino acids, organic acids and alcohols, as well as different lipids and molecules involved in lipid metabolism. Furthermore, the analysis of the differences between the metabolic profiles of NSCLC patients at different stages of the disease revealed the existence of 17 metabolites involved in metabolic changes associated with disease progression. Our results underscore the potential of metabolomics profiling to uncover pathophysiological mechanisms that could be useful to objectively discriminate NSCLC patients from healthy individuals, as well as between different stages of the disease.

%B Oncotarget %V 7 %P 12904-16 %8 2016 Mar 15 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/26883203?dopt=Abstract %R 10.18632/oncotarget.7354 %0 Journal Article %J Mol Metab %D 2016 %T Stress-induced activation of brown adipose tissue prevents obesity in conditions of low adaptive thermogenesis. %A Razzoli, Maria %A Frontini, Andrea %A Gurney, Allison %A Mondini, Eleonora %A Cubuk, Cankut %A Katz, Liora S %A Cero, Cheryl %A Bolan, Patrick J %A Dopazo, Joaquin %A Vidal-Puig, Antonio %A Cinti, Saverio %A Bartolomucci, Alessandro %X

BACKGROUND: Stress-associated conditions such as psychoemotional reactivity and depression have been paradoxically linked to either weight gain or weight loss. This bi-directional effect of stress is not understood at the functional level. Here we tested the hypothesis that pre-stress level of adaptive thermogenesis and brown adipose tissue (BAT) functions explain the vulnerability or resilience to stress-induced obesity.

METHODS: We used wt and triple β1,β2,β3-Adrenergic Receptors knockout (β-less) mice exposed to a model of chronic subordination stress (CSS) at either room temperature (22 °C) or murine thermoneutrality (30 °C). A combined behavioral, physiological, molecular, and immunohistochemical analysis was conducted to determine stress-induced modulation of energy balance and BAT structure and function. Immortalized brown adipocytes were used for in vitro assays.

RESULTS: Departing from our initial observation that βARs are dispensable for cold-induced BAT browning, we demonstrated that under physiological conditions promoting low adaptive thermogenesis and BAT activity (e.g. thermoneutrality or genetic deletion of the βARs), exposure to CSS acted as a stimulus for BAT activation and thermogenesis, resulting in resistance to diet-induced obesity despite the presence of hyperphagia. Conversely, in wt mice acclimatized to room temperature, and therefore characterized by sustained BAT function, exposure to CSS increased vulnerability to obesity. Exposure to CSS enhanced the sympathetic innervation of BAT in wt acclimatized to thermoneutrality and in β-less mice. Despite increased sympathetic innervation suggesting adrenergic-mediated browning, norepinephrine did not promote browning in βARs knockout brown adipocytes, which led us to identify an alternative sympathetic/brown adipocytes purinergic pathway in the BAT. This pathway is downregulated under conditions of low adaptive thermogenesis requirements, is induced by stress, and elicits activation of UCP1 in wt and β-less brown adipocytes. Importantly, this purinergic pathway is conserved in human BAT.

CONCLUSION: Our findings demonstrate that thermogenesis and BAT function are determinants of the resilience or vulnerability to stress-induced obesity. Our data support a model in which adrenergic and purinergic pathways exert complementary/synergistic functions in BAT, thus suggesting an alternative to βARs agonists for the activation of human BAT.

%B Mol Metab %V 5 %P 19-33 %8 2016 Jan %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/26844204?dopt=Abstract %R 10.1016/j.molmet.2015.10.005 %0 Journal Article %J Sci Rep %D 2016 %T The transcriptomics of an experimentally evolved plant-virus interaction. %A Hillung, Julia %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Cuevas, José M %A Elena, Santiago F %K Arabidopsis %K Ecotype %K Gene Expression Profiling %K Host-Pathogen Interactions %K Potyvirus %X

Models of plant-virus interaction assume that the ability of a virus to infect a host genotype depends on the matching between virulence and resistance genes. Recently, we evolved tobacco etch potyvirus (TEV) lineages on different ecotypes of Arabidopsis thaliana, and found that some ecotypes selected for specialist viruses whereas others selected for generalists. Here we sought to evaluate the transcriptomic basis of such relationships. We have characterized the transcriptomic responses of five ecotypes infected with the ancestral and evolved viruses. Genes and functional categories differentially expressed by plants infected with local TEV isolates were identified, showing heterogeneous responses among ecotypes, although significant parallelism existed among lineages evolved in the same ecotype. Although genes involved in immune responses were altered upon infection, other functional groups were also pervasively over-represented, suggesting that plant resistance genes were not the only drivers of viral adaptation. Finally, the transcriptomic consequences of infection with the generalist and specialist lineages were compared. Whilst the generalist induced very similar perturbations in the transcriptomes of the different ecotypes, the perturbations induced by the specialist were divergent. Plant defense mechanisms were activated when the infecting virus was a specialist but they were down-regulated when the infecting virus was a generalist.

%B Sci Rep %V 6 %P 24901 %8 2016 04 26 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/27113435?dopt=Abstract %R 10.1038/srep24901 %0 Journal Article %J Bioinformatics %D 2016 %T Web-based network analysis and visualization using CellMaps. %A Salavert, Francisco %A García-Alonso, Luz %A Sánchez, Rubén %A Alonso, Roberto %A Bleda, Marta %A Medina, Ignacio %A Dopazo, Joaquin %K Biochemical Phenomena %K Internet %K Software %X

UNLABELLED: CellMaps is an HTML5 open-source web tool that allows displaying, editing, exploring and analyzing biological networks as well as integrating metadata into them. Computations and analyses are remotely executed in high-end servers, and all the functionalities are available through RESTful web services. CellMaps can easily be integrated in any web page by using an available JavaScript API.

AVAILABILITY AND IMPLEMENTATION: The application is available at http://cellmaps.babelomics.org/ and the code can be found at https://github.com/opencb/cell-maps . The client is implemented in JavaScript and the server in C and Java.

CONTACT: jdopazo@cipf.es

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

%B Bioinformatics %V 32 %P 3041-3 %8 2016 10 01 %G eng %N 19 %1 https://www.ncbi.nlm.nih.gov/pubmed/27296979?dopt=Abstract %R 10.1093/bioinformatics/btw332 %0 Journal Article %J Hum Genet %D 2016 %T Whole exome sequencing of Rett syndrome-like patients reveals the mutational diversity of the clinical phenotype. %A Lucariello, Mario %A Vidal, Enrique %A Vidal, Silvia %A Saez, Mauricio %A Roa, Laura %A Huertas, Dori %A Pineda, Mercè %A Dalfó, Esther %A Dopazo, Joaquin %A Jurado, Paola %A Armstrong, Judith %A Esteller, Manel %K Adolescent %K Adult %K Animals %K Caenorhabditis elegans %K Carrier Proteins %K Cell Cycle Proteins %K Child %K Child, Preschool %K DNA Mutational Analysis %K Exome %K Female %K Forkhead Transcription Factors %K Genetic Variation %K High-Throughput Nucleotide Sequencing %K Humans %K Methyl-CpG-Binding Protein 2 %K mutation %K Nerve Tissue Proteins %K Protein Serine-Threonine Kinases %K Receptors, Nicotinic %K Rett Syndrome %X

Classical Rett syndrome (RTT) is a neurodevelopmental disorder in which most cases carry MECP2 mutations. Atypical RTT variants involve mutations in CDKL5 and FOXG1. However, a subset of RTT patients do not carry any mutation in the described genes. Whole exome sequencing was carried out in a cohort of 21 female probands with clinical features overlapping with those of RTT, but without mutations in the customarily studied genes. Candidates were functionally validated by assessing the appearance of a neurological phenotype in Caenorhabditis elegans upon disruption of the corresponding ortholog gene. We detected pathogenic variants that accounted for the RTT-like phenotype in 14 (66.6%) patients. Five patients were carriers of mutations in genes already known to be associated with other syndromic neurodevelopmental disorders. We determined that the other patients harbored mutations in genes that have not previously been linked to RTT or other neurodevelopmental syndromes, such as the ankyrin repeat containing protein ANKRD31 or the neuronal acetylcholine receptor subunit alpha-5 (CHRNA5). Furthermore, worm assays demonstrated that mutations in the studied candidate genes caused locomotion defects. Our findings indicate that mutations in a variety of genes contribute to the development of RTT-like phenotypes.

%B Hum Genet %V 135 %P 1343-1354 %8 2016 12 %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/27541642?dopt=Abstract %R 10.1007/s00439-016-1721-3 %0 Journal Article %J Nucleic acids research %D 2015 %T Assessing the impact of mutations found in next generation sequencing data over human signaling pathways. %A Hernansaiz-Ballesteros, Rosa D %A Salavert, Francisco %A Sebastián-Leon, Patricia %A Alemán, Alejandro %A Medina, Ignacio %A Joaquín Dopazo %K NGS %K pathways %K signalling %K Systems biology %X Modern sequencing technologies produce increasingly detailed data on genomic variation. However, conventional methods for relating either individual variants or mutated genes to phenotypes present known limitations given the complex, multigenic nature of many diseases or traits. Here we present PATHiVar, a web-based tool that integrates genomic variation data with gene expression tissue information. PATHiVar constitutes a new generation of genomic data analysis methods that allow studying variants found in next generation sequencing experiments in the context of signaling pathways. Simple Boolean models of pathways provide detailed descriptions of the impact of mutations on cell functionality, so that recurrences in functionality failures can easily be related to diseases, even if they are produced by mutations in different genes. Patterns of changes in signal transmission circuits, often unpredictable from the individual genes mutated, correspond to patterns of affected functionalities that can be related to complex traits such as disease progression, drug response, etc. PATHiVar is available at: http://pathivar.babelomics.org. %B Nucleic acids research %V 43 %P W270-W275 %8 2015 Apr 16 %G eng %U http://nar.oxfordjournals.org/content/43/W1/W270 %R 10.1093/nar/gkv349 %0 Journal Article %J Nucleic acids research %D 2015 %T Babelomics 5.0: functional interpretation for new generations of genomic data. 
%A Alonso, Roberto %A Salavert, Francisco %A Garcia-Garcia, Francisco %A Carbonell-Caballero, José %A Bleda, Marta %A García-Alonso, Luz %A Sanchis-Juan, Alba %A Perez-Gil, Daniel %A Marin-Garcia, Pablo %A Sánchez, Rubén %A Cubuk, Cankut %A Hidalgo, Marta R %A Amadoz, Alicia %A Hernansaiz-Ballesteros, Rosa D %A Alemán, Alejandro %A Tárraga, Joaquín %A Montaner, David %A Medina, Ignacio %A Dopazo, Joaquin %K babelomics %K data integration %K gene set analysis %K interactome %K network analysis %K NGS %K RNA-seq %K Systems biology %K transcriptomics %X Babelomics has been running for more than a decade, offering a user-friendly interface for the functional analysis of gene expression and genomic data. Here we present its fifth release, which includes support for Next Generation Sequencing data including gene expression (RNA-seq), exome or genome resequencing. The Babelomics interface has been simplified and is now more intuitive. Improved visualization options, such as a genome viewer as well as an interactive network viewer, have been implemented. New technical enhancements on both the client and server sides make the user experience faster and more dynamic. Babelomics offers user-friendly access to a full range of methods that cover: (i) primary data analysis, (ii) a variety of tests for different experimental designs and (iii) different enrichment and network analysis algorithms for the interpretation of the results of such tests in the proper functional context. In addition to the public server, local copies of Babelomics can be downloaded and installed. Babelomics is freely available at: http://www.babelomics.org. %B Nucleic acids research %V 43 %P W117-W121 %8 2015 Apr 20 %G eng %U http://nar.oxfordjournals.org/content/43/W1/W117 %R 10.1093/nar/gkv384 %0 Journal Article %J BMC cancer %D 2015 %T BRCA1 Alternative splicing landscape in breast tissue samples. 
%A Romero, Atocha %A Garcia-Garcia, Francisco %A López-Perolio, Irene %A Ruiz de Garibay, Gorka %A García-Sáenz, José A %A Garre, Pilar %A Ayllón, Patricia %A Benito, Esperanza %A Joaquín Dopazo %A Díaz-Rubio, Eduardo %A Caldés, Trinidad %A de la Hoya, Miguel %X BACKGROUND: BRCA1 is a key protein in the cell network, involved in DNA repair pathways and the cell cycle. Recently, the ENIGMA consortium has reported a high number of alternative splicing (AS) events at this locus in blood-derived samples. However, the BRCA1 splicing pattern in breast tissue samples is unknown. Here, we provide an accurate description of the distribution of BRCA1 splicing events in breast tissue samples. METHODS: BRCA1 splicing events were scanned in 70 breast tumor samples, 4 breast samples from healthy individuals and in 72 blood-derived samples by capillary electrophoresis (capillary EP). Molecular subtype was identified in all tumor samples. Splicing events were considered predominant if their relative expression level was at least 10% of the full-length reference signal. RESULTS: 54 BRCA1 AS events were identified, 27 of which were annotated as predominant in at least one sample. Δ5q, Δ13, Δ9, Δ5 and ▼1aA were significantly more frequently annotated as predominant in breast tumor samples than in blood-derived samples. Predominant splicing events were, on average, more frequent in tumor samples than in normal breast tissue samples (P = 0.010). Similarly, likely inactivating splicing events (PTC-NMDs, Non-Coding, Δ5 and Δ18) were more frequently annotated as predominant in tumor than in normal breast samples (P = 0.020), whereas there were no significant differences in the frequency distribution of other splicing events (No-Fs) between tumor and normal breast samples (P = 0.689). CONCLUSIONS: Our results complement recent findings by the ENIGMA consortium, demonstrating that BRCA1 AS, despite its tremendous complexity, is similar in breast and blood samples, with no evidence for tissue-specific AS events. 
Furthermore, we conclude that somatic inactivation of BRCA1 through spliceogenic mutations is, at best, a rare mechanism in breast carcinogenesis, although our data detect an excess of likely inactivating AS events in breast tumor samples. %B BMC cancer %V 15 %P 219 %8 2015 %G eng %U http://www.biomedcentral.com/1471-2407/15/219 %R 10.1186/s12885-015-1145-9 %0 Journal Article %J Nature methods %D 2015 %T Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection. %A Ewing, Adam D %A Houlahan, Kathleen E %A Hu, Yin %A Ellrott, Kyle %A Caloian, Cristian %A Yamaguchi, Takafumi N %A Bare, J Christopher %A P’ng, Christine %A Waggott, Daryl %A Sabelnykova, Veronica Y %A Kellen, Michael R %A Norman, Thea C %A Haussler, David %A Friend, Stephen H %A Stolovitzky, Gustavo %A Margolin, Adam A %A Stuart, Joshua M %A Boutros, Paul C %E ICGC-TCGA DREAM Somatic Mutation Calling Challenge participants %E Liu Xi %E Ninad Dewal %E Yu Fan %E Wenyi Wang %E David Wheeler %E Andreas Wilm %E Grace Hui Ting %E Chenhao Li %E Denis Bertrand %E Niranjan Nagarajan %E Qing-Rong Chen %E Chih-Hao Hsu %E Ying Hu %E Chunhua Yan %E Warren Kibbe %E Daoud Meerzaman %E Kristian Cibulskis %E Mara Rosenberg %E Louis Bergelson %E Adam Kiezun %E Amie Radenbaugh %E Anne-Sophie Sertier %E Anthony Ferrari %E Laurie Tonton %E Kunal Bhutani %E Nancy F Hansen %E Difei Wang %E Lei Song %E Zhongwu Lai %E Liao, Yang %E Shi, Wei %E Carbonell-Caballero, José %E Joaquín Dopazo %E Cheryl C K Lau %E Justin Guinney %K cancer %K NGS %K variant calling %X The detection of somatic mutations from cancer genome sequences is key to understanding the genetic basis of disease progression, patient survival and response to therapy. Benchmarking is needed for tool assessment and improvement but is complicated by a lack of gold standards, by extensive resource requirements and by difficulties in sharing personal genomic information. 
To resolve these issues, we launched the ICGC-TCGA DREAM Somatic Mutation Calling Challenge, a crowdsourced benchmark of somatic mutation detection algorithms. Here we report the BAMSurgeon tool for simulating cancer genomes and the results of 248 analyses of three in silico tumors created with it. Different algorithms exhibit characteristic error profiles, and, intriguingly, false positives show a trinucleotide profile very similar to one found in human tumors. Although the three simulated tumors differ in sequence contamination (deviation from normal cell sequence) and in subclonality, an ensemble of pipelines outperforms the best individual pipeline in all cases. BAMSurgeon is available at https://github.com/adamewing/bamsurgeon/. %B Nature methods %8 2015 May 18 %G eng %U http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3407.html %R 10.1038/nmeth.3407 %0 Journal Article %J Hearing research %D 2015 %T Comparative gene expression study of the vestibular organ of the Igf1 deficient mouse using whole-transcript arrays. %A Rodríguez-de la Rosa, Lourdes %A Sánchez-Calderón, Hortensia %A Contreras, Julio %A Murillo-Cuesta, Silvia %A Falagan, Sandra %A Avendaño, Carlos %A Joaquín Dopazo %A Varela-Nieto, Isabel %A Milo, Marta %X The auditory and vestibular organs form the inner ear and have a common developmental origin. Insulin like growth factor 1 (IGF-1) has a central role in the development of the cochlea and maintenance of hearing. Its deficiency causes sensorineural hearing loss in man and mice. During chicken early development, IGF-1 modulates neurogenesis of the cochleovestibular ganglion but no further studies have been conducted to explore the potential role of IGF-1 in the vestibular system. In this study we have compared the whole transcriptome of the vestibular organ from wild type and Igf1(-/-) mice at different developmental and postnatal times. 
RNA was prepared from E18.5, P15 and P90 vestibular organs of Igf1(-/-) and Igf1(+/+) mice and the transcriptome analysed in triplicate using Affymetrix® Mouse Gene 1.1 ST Array Plates. These plates are whole-transcript arrays that include probes to measure both messenger (mRNA) and long intergenic non-coding RNA transcripts (lincRNA), with a coverage of over 28 thousand coding transcripts and over 7 thousand non-coding transcripts. Given the complexity of the data, we used two different methods, VSN-RMA and mmBGX, to analyse and compare the data. This allowed us to better evaluate the number of false positives and to quantify the uncertainty of low signals. We identified a number of differentially expressed genes that we described using functional analysis and validated using RT-qPCR. The morphology of the vestibular organ did not show differences between genotypes and no evident alterations were observed in the vestibular sensory areas of the null mice. However, well-defined cellular alterations were found in the vestibular neurons with respect to their number and size. Although these mice did not show a dramatic vestibular phenotype, we conducted a functional analysis on differentially expressed genes between genotypes and across time. The aim was to identify new pathways that are involved in the development of the vestibular organ, as well as pathways that may be affected by the lack of IGF-1 and be associated with the morphological changes of the vestibular neurons that we observed in the Igf1(-/-) mice. %B Hearing research %8 2015 Sep 1 %G eng %U http://www.sciencedirect.com/science/article/pii/S0378595515001835 %R 10.1016/j.heares.2015.08.016 %0 Journal Article %J IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM %D 2015 %T Concurrent and Accurate Short Read Mapping on Multicore Processors. 
%A Martinez, Hector %A Tárraga, Joaquín %A Medina, Ignacio %A Barrachina, Sergio %A Castillo, Maribel %A Dopazo, Joaquin %A Quintana-Orti, Enrique S %K HPC %K NGS %K short read mapping %X We introduce a parallel aligner with a work-flow organization for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, [Formula: see text] ([Formula: see text] is an open-source application; the software is available at http://www.opencb.org), exploits a suffix array to rapidly map a large fraction of the RNA fragments (reads), and leverages the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. The aligner is enhanced with a careful strategy to detect splice junctions based on an adaptive division of RNA reads into small segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing crucial information for the successful alignment of the complete reads. The experimental results on a platform with Intel multicore technology report the parallel performance of [Formula: see text] on RNA reads of 100-400 nucleotides, which surpasses state-of-the-art aligners such as TopHat 2+Bowtie 2, MapSplice, and STAR in execution time/sensitivity. %B IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM %V 12 %P 995-1007 %8 2015 Sep-Oct %G eng %U http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=7010005 %R 10.1109/TCBB.2015.2392077 %0 Journal Article %J Gene %D 2015 %T Deregulation of key signaling pathways involved in oocyte maturation in FMR1 premutation carriers with Fragile X-associated primary ovarian insufficiency. 
%A Alvarez-Mora, M I %A Rodriguez-Revenga, L %A Madrigal, I %A García-García, F %A Duran, M %A Dopazo, J %A Estivill, X %A Milà, M %K Adult %K Aged %K Female %K Fragile X Mental Retardation Protein %K Fragile X Syndrome %K Gene Expression Profiling %K Gene Expression Regulation, Developmental %K Gene ontology %K Genome-Wide Association Study %K Heterozygote %K Humans %K Middle Aged %K Models, Genetic %K mutation %K Oligonucleotide Array Sequence Analysis %K Oocytes %K Primary Ovarian Insufficiency %K Signal Transduction %X

FMR1 premutation female carriers are at risk for Fragile X-associated primary ovarian insufficiency (FXPOI). Insights from a knock-in mouse model have recently demonstrated that FXPOI is due to an increased rate of follicle depletion or an impaired development of the growing follicles. The molecular mechanisms responsible for this reduced viability are still unknown. In an attempt to provide new data on the mechanisms that lead to FXPOI, we report the first investigation involving transcription profiling of total blood from FMR1 premutation female carriers with and without FXPOI. A total of 16 unrelated female individuals (6 FMR1 premutated females with FXPOI; 6 FMR1 premutated females without FXPOI; and 4 no-FXPOI females) were studied by whole human genome oligonucleotide microarray (Agilent Technologies). Fold change analysis did not show any genes with significant differential expression. However, functional profiling by gene set analysis showed a large number of statistically significant deregulated GO annotations as well as numerous KEGG pathways in FXPOI females. These results suggest that the impairment of fertility in these females might be due to a generalized deregulation of key signaling pathways involved in oocyte maturation. In particular, the vasoendothelial growth factor signaling, the inositol phosphate metabolism, the cell cycle, and the MAPK signaling pathways were found to be down-regulated in FXPOI females. Furthermore, a high statistical enrichment of biological processes involved in cell death and survival was found among the deregulated processes in FXPOI females. Our results provide new strategic approaches to further investigate the molecular mechanisms and potential therapeutic targets for FXPOI, focused not on a single gene but rather on the set of genes involved in these pathways.

%B Gene %V 571 %P 52-7 %8 2015 Oct 15 %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/26095811?dopt=Abstract %R 10.1016/j.gene.2015.06.039 %0 Journal Article %J J Invest Dermatol %D 2015 %T Differential Features Between Chronic Skin Inflammatory Diseases Revealed in Skin-Humanized Psoriasis and Atopic Dermatitis Mouse Models. %A Carretero, M %A Guerrero-Aspizua, S %A Illera, N %A Galvez, V %A Navarro, M %A García-García, F %A Dopazo, J %A Jorcano, J L %A Larcher, F %A Del Rio, M %X

Psoriasis (PS) and atopic dermatitis (AD) are chronic and relapsing inflammatory diseases of the skin affecting a large number of patients worldwide. Psoriasis is characterized by a Th1/Th17 immunological response whereas acute AD lesions exhibit Th2-dominant inflammation. Current single gene and signaling pathways-based models of inflammatory skin diseases are incomplete. Previous work allowed us to model psoriasis in skin-humanized mice through proper combinations of inflammatory cell components and disruption of barrier function. Herein we describe and characterize an animal model for AD using similar bioengineered-based approaches, by intradermal injection of human Th2 lymphocytes in regenerated human skin after partial removal of stratum corneum. In the present work we have extensively compared this model with the previous and an improved version of the PS model, in which Th17/Th1 lymphocytes replace exogenous cytokines. Comparative expression analyses revealed marked differences in specific epidermal proliferation and differentiation markers and immune-related molecules including antimicrobial peptides. Likewise, the composition of the dermal inflammatory infiltrate presented important differences. Availability of accurate and reliable animal models for these diseases will contribute to the understanding of the pathogenesis and provide valuable tools for drug development and testing. Journal of Investigative Dermatology accepted article preview online, 23 September 2015. doi:10.1038/jid.2015.362.

%B J Invest Dermatol %8 2015 Sep 23 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/26398345?dopt=Abstract %R 10.1038/jid.2015.362 %0 Journal Article %J Eur J Neurol %D 2015 %T The EGR2 gene is involved in axonal Charcot-Marie-Tooth disease. %A Sevilla, T %A Sivera, R %A Martínez-Rubio, D %A Lupo, V %A Chumillas, M J %A Calpena, E %A Dopazo, J %A Vílchez, J J %A Palau, F %A Espinós, C %K Adult %K Aged %K Aged, 80 and over %K Axons %K Charcot-Marie-Tooth Disease %K Early Growth Response Protein 2 %K Exome %K Female %K Humans %K Male %K Middle Aged %K mutation %K Pedigree %K Phenotype %K Severity of Illness Index %K Young Adult %X

BACKGROUND AND PURPOSE: A three-generation family affected by axonal Charcot-Marie-Tooth disease (CMT) was investigated with the aim of discovering genetic defects and further characterizing the phenotype.

METHODS: The clinical data, nerve conduction studies and muscle magnetic resonance images of the patients were reviewed. Whole exome sequencing was performed and the changes were investigated by genetic studies, in silico analysis and luciferase reporter assays.

RESULTS: A novel c.1226G>A change (p.R409Q) in the EGR2 gene was identified. Patients presented with a typical, late-onset axonal CMT phenotype with variable severity that was confirmed in the ancillary tests. The in silico studies showed that the residue R409 is an evolutionarily conserved amino acid. The p.R409Q mutation, which is predicted to be probably damaging, would alter the conformation of the protein slightly and would cause a decrease in gene expression.

CONCLUSIONS: This is the first report of an EGR2 mutation presenting as an axonal CMT phenotype with variable severity. This study broadens the phenotype of the EGR2-related neuropathies and suggests that the genetic testing of patients suffering from axonal CMT should include the EGR2 gene.

%B Eur J Neurol %V 22 %P 1548-55 %8 2015 Dec %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/26204789?dopt=Abstract %R 10.1111/ene.12782 %0 Journal Article %J Scientific Reports %D 2015 %T Exome sequencing reveals a high genetic heterogeneity on familial Hirschsprung disease %A Luzón-Toro, Berta %A Gui, Hongsheng %A Ruiz-Ferrer, Macarena %A Sze-Man Tang, Clara %A Fernández, Raquel M. %A Sham, Pak-Chung %A Torroglosa, Ana %A Kwong-Hang Tam, Paul %A Espino-Paisán, Laura %A Cherny, Stacey S. %A Bleda, Marta %A Enguix-Riego, María Del Valle %A Dopazo, Joaquin %A Antiňolo, Guillermo %A Garcia-Barceló, Maria-Mercè %A Borrego, Salud %B Scientific Reports %V 5 %8 Jan-12-2015 %G eng %U http://www.nature.com/articles/srep16473 http://www.nature.com/articles/srep16473.pdf %N 1 %! Sci Rep %R 10.1038/srep16473 %0 Journal Article %J Scientific reports %D 2015 %T Exome sequencing reveals a high genetic heterogeneity on familial Hirschsprung disease. %A Luzón-Toro, Berta %A Gui, Hongsheng %A Ruiz-Ferrer, Macarena %A Sze-Man Tang, Clara %A Fernández, Raquel M %A Sham, Pak-Chung %A Torroglosa, Ana %A Kwong-Hang Tam, Paul %A Espino-Paisán, Laura %A Cherny, Stacey S %A Bleda, Marta %A Enguix-Riego, María Del Valle %A Joaquín Dopazo %A Antiňolo, Guillermo %A Garcia-Barceló, Maria-Mercè %A Borrego, Salud %K babelomics %K Hirschprung %K NGS %K prioritization %X Hirschsprung disease (HSCR; OMIM 142623) is a developmental disorder characterized by aganglionosis along variable lengths of the distal gastrointestinal tract, which results in intestinal obstruction. Interactions among known HSCR genes and/or unknown disease susceptibility loci lead to variable severity of phenotype. Neither linkage nor genome-wide association studies have efficiently contributed to completely dissect the genetic pathways underlying this complex genetic disorder. 
We have performed whole exome sequencing of 16 HSCR patients from 8 unrelated families with the SOLiD platform. Variants shared by affected relatives were validated by Sanger sequencing. We searched for genes recurrently mutated across families. Only variations in the FAT3 gene were significantly enriched in five families. Within-family analysis identified compound heterozygotes for AHNAK and several genes (N = 23) with heterozygous variants that co-segregated with the phenotype. Network and pathway analyses facilitated the discovery of polygenic inheritance involving FAT3, HSCR known genes and their gene partners. Altogether, our approach has facilitated the detection of more than one damaging variant in biologically plausible genes that could jointly contribute to the phenotype. Our data may contribute to the understanding of the complex interactions that occur during enteric nervous system development and the etiopathology of familial HSCR. %B Scientific reports %V 5 %P 16473 %8 2015 %G eng %U http://www.nature.com/articles/srep16473 %R 10.1038/srep16473 %0 Journal Article %J Eur J Oral Sci %D 2015 %T Family-based genome-wide association study in Patagonia confirms the association of the DMD locus and cleft lip and palate. %A Fonseca, Renata F %A de Carvalho, Flávia M %A Poletta, Fernando A %A Montaner, David %A Dopazo, Joaquin %A Mereb, Juan C %A Moreira, Miguel A M %A Seuanez, Hector N %A Vieira, Alexandre R %A Castilla, Eduardo E %A Orioli, Iêda M %X

The etiology of cleft lip with or without cleft palate (CL±P) is complex and heterogeneous, and multiple genetic and environmental factors are involved. Some candidate genes reported to be associated with oral clefts are located on the X chromosome. At least three genes causing X-linked syndromes [midline 1 (MID1), oral-facial-digital syndrome 1 (OFD1), and dystrophin (DMD)] were previously found to be associated with isolated CL±P. We attempted to confirm the role of X-linked genes in the etiology of isolated CL±P in a South American population through a family-based genome-wide scan. We studied 27 affected children and their mothers, from 26 families, in a Patagonian population with a high prevalence of CL±P. We conducted an exploratory analysis of the X chromosome to identify candidate regions associated with CL±P. Four genomic segments were identified, two of which showed a statistically significant association with CL±P. One is an 11-kb region of Xp21.1 containing the DMD gene, and the other is an intergenic region (8.7 kb; Xp11.4). Our results are consistent with recent data on the involvement of the DMD gene in the etiology of CL±P. The MID1 and OFD1 genes were not included in the four potential CL±P-associated X-chromosome genomic segments.

%B Eur J Oral Sci %V 123 %P 381-384 %8 2015 Oct %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/26331285?dopt=Abstract %R 10.1111/eos.12212 %0 Journal Article %J BMC Bioinformatics %D 2015 %T Fast inexact mapping using advanced tree exploration on backward search methods. %A Salavert, José %A Tomás, Andrés %A Tárraga, Joaquín %A Medina, Ignacio %A Dopazo, Joaquin %A Blanquer, Ignacio %K Algorithms %K Genome, Human %K Genomics %K High-Throughput Nucleotide Sequencing %K Humans %K Sequence Alignment %K Sequence Analysis, DNA %K Software %X

BACKGROUND: Short sequence mapping methods for Next Generation Sequencing consist of a combination of seeding techniques followed by local alignment based on dynamic programming approaches. Most seeding algorithms are based on backward search alignment, using the Burrows Wheeler Transform, the Ferragina and Manzini Index or Suffix Arrays. All these backward search algorithms have excellent performance, but their computational cost increases sharply when allowing errors. In this paper, we discuss an inexact mapping algorithm based on pruning strategies for search tree exploration over genomic data.

RESULTS: The proposed algorithm achieves a 13x speed-up over similar algorithms when allowing 6 base errors, including insertions, deletions and mismatches. This algorithm can deal with 400 bp reads with up to 9 errors in a high quality Illumina dataset. In this example, the algorithm works as a preprocessor that reduces the number of reads to be aligned by 55%. Depending on the aligner, the overall execution time is reduced by 20-40%.

CONCLUSIONS: Although not intended as a complete sequence mapping tool, the proposed algorithm could be used as a preprocessing step for modern sequence mappers. This step significantly reduces the number of reads to be aligned, accelerating overall alignment time. Furthermore, this algorithm could be used for accelerating the seeding step of already available sequence mappers. In addition, an out-of-core index has been implemented for working with large genomes on systems without expensive memory configurations.

%B BMC Bioinformatics %V 16 %P 18 %8 2015 Jan 28 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/25626517?dopt=Abstract %R 10.1186/s12859-014-0438-3 %0 Journal Article %J BMC medical genomics %D 2015 %T Identification of epistatic interactions through genome-wide association studies in sporadic medullary and juvenile papillary thyroid carcinomas. %A Luzón-Toro, Berta %A Bleda, Marta %A Navarro, Elena %A García-Alonso, Luz %A Ruiz-Ferrer, Macarena %A Medina, Ignacio %A Martín-Sánchez, Marta %A Gonzalez, Cristina Y %A Fernández, Raquel M %A Torroglosa, Ana %A Antiňolo, Guillermo %A Dopazo, Joaquin %A Borrego, Salud %K epistasis %K GWAS %K Thyroid cancer %X BACKGROUND: The molecular mechanisms leading to sporadic medullary thyroid carcinoma (sMTC) and juvenile papillary thyroid carcinoma (PTC), two rare tumours of the thyroid gland, remain poorly understood. Genetic studies on thyroid carcinomas have been conducted, although just a few loci have been systematically associated. Given the difficulties to obtain single-loci associations, this work expands its scope to the study of epistatic interactions that could help to understand the genetic architecture of complex diseases and explain new heritable components of genetic risk. METHODS: We carried out the first screening for epistasis by Multifactor-Dimensionality Reduction (MDR) in genome-wide association study (GWAS) on sMTC and juvenile PTC, to identify the potential simultaneous involvement of pairs of variants in the disease. RESULTS: We have identified two significant epistatic gene interactions in sMTC (CHFR-AC016582.2 and C8orf37-RNU1-55P) and three in juvenile PTC (RP11-648k4.2-DIO1, RP11-648k4.2-DMGDH and RP11-648k4.2-LOXL1). Interestingly, each interacting gene pair included a non-coding RNA, providing thus support to the relevance that these elements are increasingly gaining to explain carcinoma development and progression. 
CONCLUSIONS: Overall, this study contributes to the understanding of the genetic basis of thyroid carcinoma susceptibility in two different case scenarios such as sMTC and juvenile PTC. %B BMC medical genomics %V 8 %P 83 %8 2015 %G eng %U http://bmcmedgenomics.biomedcentral.com/articles/10.1186/s12920-015-0160-7 %R 10.1186/s12920-015-0160-7 %0 Journal Article %J BMC Medical Genomics %D 2015 %T Identification of epistatic interactions through genome-wide association studies in sporadic medullary and juvenile papillary thyroid carcinomas %A Luzón-Toro, Berta %A Bleda, Marta %A Navarro, Elena %A García-Alonso, Luz %A Ruiz-Ferrer, Macarena %A Medina, Ignacio %A Martín-Sánchez, Marta %A Gonzalez, Cristina Y. %A Fernández, Raquel M. %A Torroglosa, Ana %A Antiňolo, Guillermo %A Dopazo, Joaquin %A Borrego, Salud %X The molecular mechanisms leading to sporadic medullary thyroid carcinoma (sMTC) and juvenile papillary thyroid carcinoma (PTC), two rare tumours of the thyroid gland, remain poorly understood. Genetic studies on thyroid carcinomas have been conducted, although just a few loci have been systematically associated. Given the difficulties to obtain single-loci associations, this work expands its scope to the study of epistatic interactions that could help to understand the genetic architecture of complex diseases and explain new heritable components of genetic risk. %B BMC Medical Genomics %V 8 %P 83 %8 Dec %G eng %U https://doi.org/10.1186/s12920-015-0160-7 %R 10.1186/s12920-015-0160-7 %0 Journal Article %J BMC Genomics %D 2015 %T Involvement of a citrus meiotic recombination TTC-repeat motif in the formation of gross deletions generated by ionizing radiation and MULE activation %A Terol, Javier %A Ibañez, Victoria %A Carbonell, José %A Alonso, Roberto %A Estornell, Leandro H. %A Licciardello, Concetta %A Gut, Ivo G. 
%A Dopazo, Joaquin %A Talon, Manuel %X Transposable-element mediated chromosomal rearrangements require the involvement of two transposons and two double-strand breaks (DSB) located in close proximity. In radiobiology, DSB proximity is also a major factor contributing to rearrangements. However, the whole issue of DSB proximity remains virtually unexplored. %B BMC Genomics %V 16 %P 69 %8 Feb %G eng %U https://doi.org/10.1186/s12864-015-1280-3 %R 10.1186/s12864-015-1280-3 %0 Journal Article %J BMC genomics %D 2015 %T Involvement of a citrus meiotic recombination TTC-repeat motif in the formation of gross deletions generated by ionizing radiation and MULE activation. %A Terol, Javier %A Ibañez, Victoria %A Carbonell, José %A Alonso, Roberto %A Estornell, Leandro H %A Licciardello, Concetta %A Gut, Ivo G %A Joaquín Dopazo %A Talon, Manuel %X BACKGROUND: Transposable-element mediated chromosomal rearrangements require the involvement of two transposons and two double-strand breaks (DSB) located in close proximity. In radiobiology, DSB proximity is also a major factor contributing to rearrangements. However, the whole issue of DSB proximity remains virtually unexplored. RESULTS: Based on DNA sequencing analysis we show that the genomes of 2 derived mutations, Arrufatina (sport) and Nero (irradiation), share a similar 2 Mb deletion of chromosome 3. A 7 kb Mutator-like element found in Clemenules was present in Arrufatina in inverted orientation flanking the 5’ end of the deletion. The Arrufatina Mule displayed "dissimilar" 9-bp target site duplications separated by 2 Mb. Fine-scale single nucleotide variant analyses of the deleted fragments identified a TTC-repeat sequence motif located in the center of the deletion that is responsible for a meiotic crossover detected in the citrus reference genome. 
CONCLUSIONS: Taken together, this information is compatible with the proposal that in both mutants, the TTC-repeat motif formed a triplex DNA structure generating a loop that brought the originally distinct reactive ends into close proximity. In Arrufatina, the loop brought the Mule ends near the 2 distinct insertion target sites, and the inverted insertion of the transposable element between these target sites provoked the release of the intervening fragment. This proposal requires the involvement of a unique transposon and sheds light on the unresolved question of how two distinct sites become located in close proximity. These observations confer a crucial role to the TTC-repeats in fundamental plant processes such as meiotic recombination and chromosomal rearrangements. %B BMC genomics %V 16 %P 69 %8 2015 Feb 13 %G eng %U http://www.biomedcentral.com/1471-2164/16/69 %R 10.1186/s12864-015-1280-3 %0 Journal Article %J PLoS Comput Biol %D 2015 %T A Pan-Cancer Catalogue of Cancer Driver Protein Interaction Interfaces. %A Porta-Pardo, Eduard %A García-Alonso, Luz %A Hrabe, Thomas %A Dopazo, Joaquin %A Godzik, Adam %K Animals %K Base Sequence %K Biomarkers, Tumor %K Catalogs as Topic %K Chromosome Mapping %K Computer Simulation %K DNA Mutational Analysis %K Genetic Predisposition to Disease %K Humans %K Models, Genetic %K Molecular Sequence Data %K mutation %K Neoplasm Proteins %K Neoplasms %K Polymorphism, Single Nucleotide %K Protein Interaction Mapping %K Signal Transduction %X

Despite their importance in maintaining the integrity of all cellular pathways, the role of mutations in protein-protein interaction (PPI) interfaces as cancer drivers has not been systematically studied. Here we analyzed the mutation patterns of the PPI interfaces from 10,028 proteins in a pan-cancer cohort of 5,989 tumors from 23 projects of The Cancer Genome Atlas (TCGA) to find interfaces enriched in somatic missense mutations. To that end we used e-Driver, an algorithm to analyze the mutation distribution of specific protein functional regions. We identified 103 PPI interfaces enriched in somatic cancer mutations. Thirty-two of these interfaces are found in proteins encoded by known cancer driver genes. The remaining 71 interfaces are found in proteins that have not been previously identified as cancer drivers even though, in most cases, there is extensive literature suggesting they play an important role in cancer. Finally, we integrate these findings with clinical information to show how tumors apparently driven by the same gene have different behaviors, including patient outcomes, depending on which specific interfaces are mutated.

%B PLoS Comput Biol %V 11 %P e1004518 %8 2015 Oct %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/26485003?dopt=Abstract %R 10.1371/journal.pcbi.1004518 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2015 %T A Parallel and Sensitive Software Tool for Methylation Analysis on Multicore Platforms. %A Tárraga, Joaquín %A Pérez, Mariano %A Orduña, Juan M %A Duato, José %A Medina, Ignacio %A Joaquín Dopazo %K BS-seq %K HPC %K methylation %K NGS %X MOTIVATION: DNA methylation analysis suffers from very long processing time, since the advent of Next-Generation Sequencers (NGS) has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently neither with the size of the dataset nor with the length of the reads to be analyzed. Since it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. RESULTS: We present a new software tool, called HPG-Methyl, which efficiently maps bisulfite sequencing reads on DNA, analyzing DNA methylation. The strategy used by this software consists of leveraging the speed of the Burrows-Wheeler Transform to map a large number of DNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm, which is exclusively employed to deal with the most ambiguous and shortest reads. Experimental results on platforms with Intel multicore processors show that HPGMethyl significantly outperforms in both execution time and sensitivity state-of-the-art software such as Bismark, BS-Seeker or BSMAP, particularly for long bisulfite reads. AVAILABILITY: Software in the form of C libraries and functions, together with instructions to compile and execute this software. Available by sftp to anonymous@clariano.uv.es (password "anonymous"). CONTACT: Juan.Orduna@uv.es. 
%B Bioinformatics (Oxford, England) %V 31 %P 3130-3138 %8 2015 Jun 10 %G eng %U http://bioinformatics.oxfordjournals.org/content/31/19/3130.long %R 10.1093/bioinformatics/btv357 %0 Journal Article %J Molecular biology and evolution %D 2015 %T A phylogenetic analysis of 34 chloroplast genomes elucidates the relationships between wild and domestic species within the genus Citrus. %A Carbonell-Caballero, José %A Alonso, Roberto %A Ibañez, Victoria %A Terol, Javier %A Talon, Manuel %A Dopazo, Joaquin %K chloroplast %K citrus %K Phylogeny %K WGS %X Citrus genus includes some of the most important cultivated fruit trees worldwide. Despite being extensively studied because of its commercial relevance, the origin of cultivated citrus species and the history of its domestication still remain an open question. Here we present a phylogenetic analysis of the chloroplast genomes of 34 citrus genotypes which constitutes the most comprehensive and detailed study to date on the evolution and variability of the genus Citrus. A statistical model was used to estimate divergence times between the major citrus groups. Additionally, a complete map of the variability across the genome of different citrus species was produced, including single nucleotide variants, heteroplasmic positions, indels and large structural variants. The distribution of all these variants provided further independent support to the phylogeny obtained. An unexpected finding was the high level of heteroplasmy found in several of the analysed genomes. The use of the complete chloroplast DNA not only paves the way for a better understanding of the phylogenetic relationships within the Citrus genus, but also provides original insights into other elusive evolutionary processes such as chloroplast inheritance, heteroplasmy and gene selection. 
%B Molecular biology and evolution %V 32 %P 2015-2035 %8 2015 Apr 14 %G eng %U http://mbe.oxfordjournals.org/content/early/2015/04/27/molbev.msv082.full %R 10.1093/molbev/msv082 %0 Journal Article %J Nature biotechnology %D 2015 %T Prediction of human population responses to toxic compounds by a collaborative competition. %A Eduati, Federica %A Mangravite, Lara M %A Wang, Tao %A Tang, Hao %A Bare, J Christopher %A Huang, Ruili %A Norman, Thea %A Kellen, Mike %A Menden, Michael P %A Yang, Jichen %A Zhan, Xiaowei %A Zhong, Rui %A Xiao, Guanghua %A Xia, Menghang %A Abdo, Nour %A Kosyk, Oksana %X The ability to computationally predict the effects of toxic compounds on humans could help address the deficiencies of current chemical safety testing. Here, we report the results from a community-based DREAM challenge to predict toxicities of environmental compounds with potential adverse health effects for human populations. We measured the cytotoxicity of 156 compounds in 884 lymphoblastoid cell lines for which genotype and transcriptional data are available as part of the Tox21 1000 Genomes Project. The challenge participants developed algorithms to predict interindividual variability of toxic response from genomic profiles and population-level cytotoxicity data from structural attributes of the compounds. 179 submitted predictions were evaluated against an experimental data set to which participants were blinded. Individual cytotoxicity predictions were better than random, with modest correlations (Pearson’s r < 0.28), consistent with complex trait genomic prediction. In contrast, predictions of population-level response to different compounds were higher (r < 0.66). The results highlight the possibility of predicting health risks associated with unknown compounds, although risk estimation accuracy remains suboptimal. 
%B Nature biotechnology %8 2015 Aug 10 %G eng %U http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3299.html %R 10.1038/nbt.3299 %0 Journal Article %J Nucleic Acids Res %D 2015 %T PTMcode v2: a resource for functional associations of post-translational modifications within and between proteins. %A Minguez, Pablo %A Letunic, Ivica %A Parca, Luca %A García-Alonso, Luz %A Dopazo, Joaquin %A Huerta-Cepas, Jaime %A Bork, Peer %K Databases, Protein %K Internet %K Protein Interaction Mapping %K Protein Processing, Post-Translational %X

The post-translational regulation of proteins is mainly driven by two molecular events: their modification by several types of moieties and their interaction with other proteins. These two processes are interdependent and together are responsible for the function of the protein in a particular cell state. Several databases focus on the prediction and compilation of protein-protein interactions (PPIs), and no fewer on the collection and analysis of protein post-translational modifications (PTMs); however, there are no resources that concentrate on describing the regulatory role of PTMs in PPIs. We developed several methods based on residue co-evolution and proximity to predict the functional associations of pairs of PTMs, which we apply to modifications in the same protein and between two interacting proteins. In order to make data available for understudied organisms, PTMcode v2 (http://ptmcode.embl.de) includes a new strategy to propagate PTMs from validated modified sites through orthologous proteins. The second release of PTMcode covers 19 eukaryotic species from which we collected more than 300,000 experimentally verified PTMs (>1,300,000 propagated) of 69 types, extracting the post-translational regulation of >100,000 proteins and >100,000 interactions. In total, we report 8 million associations of PTMs regulating single proteins and over 9.4 million interplays tuning PPIs.

%B Nucleic Acids Res %V 43 %P D494-502 %8 2015 Jan %G eng %N Database issue %1 https://www.ncbi.nlm.nih.gov/pubmed/25361965?dopt=Abstract %R 10.1093/nar/gku1081 %0 Journal Article %J Am J Med Genet A %D 2015 %T Re-evaluation casts doubt on the pathogenicity of homozygous USH2A p.C759F. %A Pozo, María González-Del %A Bravo-Gil, Nereida %A Méndez-Vidal, Cristina %A Montero-de-Espinosa, Ignacio %A Millán, José M %A Dopazo, Joaquin %A Borrego, Salud %A Antiňolo, Guillermo %K Base Sequence %K Cyclic Nucleotide Phosphodiesterases, Type 6 %K Extracellular Matrix Proteins %K Gene Library %K Humans %K Molecular Sequence Data %K Mutation, Missense %K Pedigree %K Retinitis pigmentosa %K Sequence Analysis, DNA %K Spain %X

Mutations in USH2A are a common cause of Retinitis Pigmentosa (RP). Among the most frequently reported USH2A variants, c.2276G>T (p.C759F) has been found in both affected and healthy individuals. The pathogenicity of this variant remains controversial since it was detected in homozygosity in two healthy siblings of a Spanish family (S23) eleven years ago. The fact that these individuals remain asymptomatic today prompted us to study the presence of other pathogenic variants in this family using targeted resequencing of 26 retinal genes in one of the affected individuals. This approach allowed us to identify one novel pathogenic homozygous mutation in exon 13 of PDE6B (c.1678C>T; p.R560C). This variant cosegregated with the disease and was absent in 200 control individuals. Remarkably, the identified variant in PDE6B corresponds to the mutation responsible for the retinal degeneration in naturally occurring rd10 mutant mice. To our knowledge, this is the first report of the identification of the rd10 mouse mutation in an RP family. These findings, together with a review of the literature, support the hypothesis that homozygous p.C759F mutations are not pathogenic and led us to exclude the implication of p.C759F in the RP of family S23. Our results indicate the need to re-evaluate all families genetically diagnosed with this mutation.

%B Am J Med Genet A %V 167 %P 1597-600 %8 2015 Jul %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/25823529?dopt=Abstract %R 10.1002/ajmg.a.37003 %0 Journal Article %J Molecular immunology %D 2015 %T Therapeutic targets for olive pollen allergy defined by gene markers modulated by Ole e 1-derived peptides. %A Calzada, David %A Aguerri, Miriam %A Baos, Selene %A Montaner, David %A Mata, Manuel %A Joaquín Dopazo %A Quiralte, Joaquín %A Florido, Fernando %A Lahoz, Carlos %A Cárdaba, Blanca %X Two regions of Ole e 1, the major olive-pollen allergen, have been characterized as T-cell epitopes, one as immunodominant region (aa91-130) and the other, as mainly recognized by non-allergic subjects (aa10-31). This report tries to characterize the specific relevance of these epitopes in the allergic response to olive pollen by analyzing the secreted cytokines and the gene expression profiles induced after specific stimulation of peripheral blood mononuclear cells (PBMCs). PBMCs from olive pollen-allergic and non-allergic control subjects were stimulated with olive-pollen extract and Ole e 1 dodecapeptides containing relevant T-cell epitopes. Levels of cytokines were measured in cellular supernatants and gene expression was determined by microarrays, on the RNAs extracted from PBMCs. One hundred eighty-nine differential genes (fold change >2 or <-2, P<0.05) were validated by qRT-PCR in a large population. It was not possible to define a pattern of response according the overall cytokine results but interesting differences were observed, mainly in the regulatory cytokines. Principal component (PCA) gene-expression analysis defined clusters that correlated with the experimental conditions in the group of allergic subjects. Gene expression and functional analyses revealed differential genes and pathways among the experimental conditions. A set of 51 genes (many essential to T-cell tolerance and homeostasis) correlated with the response to aa10-31 of Ole e 1. 
In conclusion, two peptides derived from Ole e 1 could regulate the immune response in allergic patients by modifying the expression of several regulation-related genes. These results open new avenues of research into the regulation of allergy caused by Oleaceae family members. %B Molecular immunology %V 64 %P 252-61 %8 2015 Apr %G eng %U http://www.sciencedirect.com/science/article/pii/S0161589014003356 %R 10.1016/j.molimm.2014.12.002 %0 Journal Article %J Sci Rep %D 2015 %T Using activation status of signaling pathways as mechanism-based biomarkers to predict drug sensitivity. %A Amadoz, Alicia %A Sebastián-Leon, Patricia %A Vidal, Enrique %A Salavert, Francisco %A Dopazo, Joaquin %K Algorithms %K Antineoplastic Agents %K biomarkers %K Cell Line, Tumor %K Cell Survival %K gene expression %K Humans %K Lethal Dose 50 %K Neoplasms %K Phosphorylation %K Proteins %K Signal Transduction %X

Many complex traits, such as drug response, are associated with changes in biological pathways rather than being caused by single gene alterations. Here, a predictive framework is presented in which gene expression data are recoded into activity statuses of signal transduction circuits (sub-pathways within signaling pathways that connect receptor proteins to final effector proteins that trigger cell actions). Such activity values are used as features by a prediction algorithm which can efficiently predict a continuous variable such as the IC50 value. The main advantage of this prediction method is that the features selected by the predictor, the signaling circuits, are themselves information-rich, mechanism-based biomarkers which provide insight into drug molecular mechanisms of action (MoA).

%B Sci Rep %V 5 %P 18494 %8 2015 Dec 18 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/26678097?dopt=Abstract %R 10.1038/srep18494 %0 Journal Article %J Human molecular genetics %D 2015 %T Whole Exome Sequencing Reveals ZNF408 as a New Gene Associated With Autosomal Recessive Retinitis Pigmentosa with Vitreal Alterations. %A Avila-Fernandez, Almudena %A Perez-Carro, Raquel %A Corton, Marta %A Lopez-Molina, Maria Isabel %A Campello, Laura %A Garanto, Alex %A Fernadez-Sanchez, Laura %A Duijkers, Lonneke %A Lopez-Martinez, Miguel Angel %A Riveiro-Alvarez, Rosa %A da Silva, Luciana Rodrigues Jacy %A Sanchez-Alcudia, Rocío %A Martin-Garrido, Esther %A Reyes, Noelia %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Garcia-Sandoval, Blanca %A Collin, Rob W %A Cuenca, Nicolas %A Ayuso, Carmen %X Retinitis Pigmentosa (RP) is a group of progressive inherited retinal dystrophies that cause visual impairment as a result of photoreceptor cell death. RP is heterogeneous, both clinically and genetically making difficult to establish precise genotype-phenotype correlations. In a Spanish family with autosomal recessive RP (arRP), homozygosity mapping and whole exome sequencing led to the identification of a homozygous mutation (c.358_359delGT; p.Ala122Leufs*2) in the ZNF408 gene. A screening performed in 217 additional unrelated families revealed another homozygous mutation (c.1621C>T; p.Arg541Cys) in an isolated RP case. ZNF408 encodes a transcription factor that harbors ten predicted C2H2-type fingers thought to be implicated in DNA binding. To elucidate the ZNF408 role in the retina and the pathogenesis of these mutations we have performed different functional studies. By immunohistochemical analysis in healthy human retina, we identified that ZNF408 is expressed in both cone and rod photoreceptors, in a specific type of amacrine and ganglion cells, and in retinal blood vessels. 
ZNF408 revealed a cytoplasmic localization and a nuclear distribution in areas corresponding with the euchromatin fraction. Immunolocalization studies showed a partial mislocalization of the p.Arg541Cys mutant protein retaining part of the WT protein in the cytoplasm. Our study demonstrates that ZNF408, previously associated with Familial Exudative Vitreoretinopathy (FEVR), is a new gene causing arRP with vitreous condensations, supporting the evidence that this protein has additional functions in the human retina. %B Human molecular genetics %V 24 %P 4037-4048 %8 2015 Apr 16 %G eng %U http://hmg.oxfordjournals.org/content/early/2015/04/16/hmg.ddv140.abstract %R 10.1093/hmg/ddv140 %0 Journal Article %J Hum Mol Genet %D 2015 %T Whole-exome sequencing reveals ZNF408 as a new gene associated with autosomal recessive retinitis pigmentosa with vitreal alterations. %A Avila-Fernandez, Almudena %A Perez-Carro, Raquel %A Corton, Marta %A Lopez-Molina, Maria Isabel %A Campello, Laura %A Garanto, Alejandro %A Fernandez-Sanchez, Laura %A Duijkers, Lonneke %A Lopez-Martinez, Miguel Angel %A Riveiro-Alvarez, Rosa %A da Silva, Luciana Rodrigues Jacy %A Sanchez-Alcudia, Rocío %A Martin-Garrido, Esther %A Reyes, Noelia %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Garcia-Sandoval, Blanca %A Collin, Rob W J %A Cuenca, Nicolas %A Ayuso, Carmen %K Amino Acid Sequence %K Animals %K Chlorocebus aethiops %K Chromosome Mapping %K COS Cells %K DNA-Binding Proteins %K Exome %K Genome-Wide Association Study %K High-Throughput Nucleotide Sequencing %K Homozygote %K Humans %K Molecular Sequence Data %K Mutant Proteins %K Pedigree %K Retina %K Retinal Cone Photoreceptor Cells %K Retinal Rod Photoreceptor Cells %K Retinitis pigmentosa %K Transcription Factors %X

Retinitis pigmentosa (RP) is a group of progressive inherited retinal dystrophies that cause visual impairment as a result of photoreceptor cell death. RP is heterogeneous, both clinically and genetically, making it difficult to establish precise genotype-phenotype correlations. In a Spanish family with autosomal recessive RP (arRP), homozygosity mapping and whole-exome sequencing led to the identification of a homozygous mutation (c.358_359delGT; p.Ala122Leufs*2) in the ZNF408 gene. A screening performed in 217 additional unrelated families revealed another homozygous mutation (c.1621C>T; p.Arg541Cys) in an isolated RP case. ZNF408 encodes a transcription factor that harbors 10 predicted C2H2-type fingers thought to be implicated in DNA binding. To elucidate the role of ZNF408 in the retina and the pathogenesis of these mutations, we have performed different functional studies. By immunohistochemical analysis in healthy human retina, we identified that ZNF408 is expressed in both cone and rod photoreceptors, in a specific type of amacrine and ganglion cells, and in retinal blood vessels. ZNF408 revealed a cytoplasmic localization and a nuclear distribution in areas corresponding with the euchromatin fraction. Immunolocalization studies showed a partial mislocalization of the p.Arg541Cys mutant protein retaining part of the WT protein in the cytoplasm. Our study demonstrates that ZNF408, previously associated with Familial Exudative Vitreoretinopathy (FEVR), is a new gene causing arRP with vitreous condensations, supporting the evidence that this protein has additional functions in the human retina.

%B Hum Mol Genet %V 24 %P 4037-48 %8 2015 Jul 15 %G eng %N 14 %1 https://www.ncbi.nlm.nih.gov/pubmed/25882705?dopt=Abstract %R 10.1093/hmg/ddv140 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2014 %T Acceleration of short and long DNA read mapping without loss of accuracy using suffix array. %A Tárraga, Joaquín %A Arnau, Vicente %A Martinez, Hector %A Moreno, Raul %A Cazorla, Diego %A Salavert-Torres, José %A Blanquer-Espert, Ignacio %A Joaquín Dopazo %A Medina, Ignacio %K NGS %K short read mapping. HPC. suffix arrays %X HPG Aligner applies suffix arrays for DNA read mapping. This implementation produces a highly sensitive and extremely fast mapping of DNA reads that scales up almost linearly with read length. The approach presented here is faster (over 20x for long reads) and more sensitive (over 98% in a wide range of read lengths) than the current, state-of-the-art mappers. HPG Aligner is not only an optimal alternative for current sequencers but also the only solution available to cope with longer reads and growing throughputs produced by forthcoming sequencing technologies. %B Bioinformatics (Oxford, England) %V 30 %P 3396-3398 %8 2014 Aug 20 %G eng %U http://bioinformatics.oxfordjournals.org/content/early/2014/08/19/bioinformatics.btu553.long %R 10.1093/bioinformatics/btu553 %0 Journal Article %J Front Oncol %D 2014 %T The Activation of the Sox2 RR2 Pluripotency Transcriptional Reporter in Human Breast Cancer Cell Lines is Dynamic and Labels Cells with Higher Tumorigenic Potential. %A Iglesias, Juan Manuel %A Leis, Olatz %A Pérez Ruiz, Estíbaliz %A Gumuzio Barrie, Juan %A Garcia-Garcia, Francisco %A Aduriz, Ariane %A Beloqui, Izaskun %A Hernandez-Garcia, Susana %A Lopez-Mato, Maria Paz %A Dopazo, Joaquin %A Pandiella, Atanasio %A Menendez, Javier A %A Martin, Angel Garcia %X

The striking similarity displayed at the mechanistic level between tumorigenesis and the generation of induced pluripotent stem cells, and the fact that genes and pathways relevant for embryonic development are reactivated during tumor progression, highlight the link between pluripotency and cancer. Based on these observations, we tested whether it is possible to use a pluripotency-associated transcriptional reporter, whose activation is driven by the SRR2 enhancer from the Sox2 gene promoter (named S4+ reporter), to isolate cancer stem cells (CSCs) from breast cancer cell lines. The S4+ pluripotency transcriptional reporter allows the isolation of cells with enhanced tumorigenic potential and its activation was switched on and off in the cell lines studied, reflecting a plastic cellular process. Microarray analysis comparing the populations in which the reporter construct is active versus inactive showed that positive cells expressed higher mRNA levels of cytokines (IL-8, IL-6, TNF) and genes (such as ATF3, SNAI2, and KLF6) previously related to the CSC phenotype in breast cancer.

%B Front Oncol %V 4 %P 308 %8 2014 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/25414831?dopt=Abstract %R 10.3389/fonc.2014.00308 %0 Journal Article %J Nature communications %D 2014 %T Assessing technical performance in differential gene expression experiments with external spike-in RNA control ratio mixtures. %A Munro, Sarah A %A Lund, Steven P %A Pine, P Scott %A Binder, Hans %A Clevert, Djork-Arné %A Ana Conesa %A Dopazo, Joaquin %A Fasold, Mario %A Hochreiter, Sepp %A Hong, Huixiao %A Jafari, Nadereh %A Kreil, David P %A Labaj, Paweł P %A Li, Sheng %A Liao, Yang %A Lin, Simon M %A Meehan, Joseph %A Mason, Christopher E %A Santoyo-López, Javier %A Setterquist, Robert A %A Shi, Leming %A Shi, Wei %A Smyth, Gordon K %A Stralis-Pavese, Nancy %A Su, Zhenqiang %A Tong, Weida %A Wang, Charles %A Wang, Jian %A Xu, Joshua %A Ye, Zhan %A Yang, Yong %A Yu, Ying %A Salit, Marc %K RNA-seq %X There is a critical need for standard approaches to assess, report and compare the technical performance of genome-scale differential gene expression experiments. Here we assess technical performance with a proposed standard ’dashboard’ of metrics derived from analysis of external spike-in RNA control ratio mixtures. These control ratio mixtures with defined abundance ratios enable assessment of diagnostic performance of differentially expressed transcript lists, limit of detection of ratio (LODR) estimates and expression ratio variability and measurement bias. The performance metrics suite is applicable to analysis of a typical experiment, and here we also apply these metrics to evaluate technical performance among laboratories. An interlaboratory study using identical samples shared among 12 laboratories with three different measurement processes demonstrates generally consistent diagnostic power across 11 laboratories. Ratio measurement variability and bias are also comparable among laboratories for the same measurement process. 
We observe different biases for measurement processes using different mRNA-enrichment protocols. %B Nature communications %V 5 %P 5125 %8 2014 %G eng %U http://www.nature.com/ncomms/2014/140925/ncomms6125/full/ncomms6125.html %R 10.1038/ncomms6125 %0 Journal Article %J PloS one %D 2014 %T Combined genetic and high-throughput strategies for molecular diagnosis of inherited retinal dystrophies. %A de Castro-Miró, Marta %A Pomares, Esther %A Lorés-Motta, Laura %A Tonda, Raul %A Joaquín Dopazo %A Marfany, Gemma %A Gonzàlez-Duarte, Roser %X Most diagnostic laboratories are confronted with the increasing demand for molecular diagnosis from patients and families and the ever-increasing genetic heterogeneity of visual disorders. Concerning Retinal Dystrophies (RD), almost 200 causative genes have been reported to date, and most families carry private mutations. We aimed to approach RD genetic diagnosis using all the available genetic information to prioritize candidates for mutational screening, and then restrict the number of cases to be analyzed by massive sequencing. We constructed and optimized a comprehensive cosegregation RD-chip based on SNP genotyping and haplotype analysis. The RD-chip allows to genotype 768 selected SNPs (closely linked to 100 RD causative genes) in a single cost-, time-effective step. Full diagnosis was attained in 17/36 Spanish pedigrees, yielding 12 new and 12 previously reported mutations in 9 RD genes. The most frequently mutated genes were USH2A and CRB1. Notably, RD3-up to now only associated to Leber Congenital Amaurosis- was identified as causative of Retinitis Pigmentosa. 
The main assets of the RD-chip are: i) the robustness of the genetic information that underscores the most probable candidates, ii) the invaluable clues in cases of shared haplotypes, which are indicative of a common founder effect, and iii) the detection of extended haplotypes over closely mapping genes, which substantiates cosegregation, although the assumptions in which the genetic analysis is based could exceptionally lead astray. The combination of the genetic approach with whole exome sequencing (WES) greatly increases the diagnosis efficiency, and revealed novel mutations in USH2A and GUCY2D. Overall, the RD-chip diagnosis efficiency ranges from 16% in dominant, to 80% in consanguineous recessive pedigrees, with an average of 47%, well within the upper range of massive sequencing approaches, highlighting the validity of this time- and cost-effective approach whilst high-throughput methodologies become amenable for routine diagnosis in medium sized labs. %B PloS one %V 9 %P e88410 %8 2014 %G eng %U http://dx.plos.org/10.1371/journal.pone.0088410 %R 10.1371/journal.pone.0088410 %0 Journal Article %J Nature biotechnology %D 2014 %T A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. %A Su, Z. %A Labaj, P.P. %A .... %A Dopazo, J. %A .... %A Mason, C.E. %A Shi, L %K NGS %K RNA-seq %K SEQC %X We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the US Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. 
We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for all examined platforms, including qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings. %B Nature biotechnology %V 32 %P 903–914 %8 2014 Aug 24 %G eng %U http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.2957.html %R 10.1038/nbt.2957 %0 Journal Article %J Cancer research %D 2014 %T A Comprehensive DNA Methylation Profile of Epithelial-to-Mesenchymal Transition. %A Carmona, F Javier %A Davalos, Veronica %A Vidal, Enrique %A Gomez, Antonio %A Heyn, Holger %A Hashimoto, Yutaka %A Vizoso, Miguel %A Martinez-Cardus, Anna %A Sayols, Sergi %A Ferreira, Humberto %A Sanchez-Mut, Jose %A Moran, Sebastian %A Margeli, Mireia %A Castella, Eva %A Berdasco, Maria %A Stefansson, Olafur Andri %A Eyfjord, Jorunn E %A Gonzalez-Suarez, Eva %A Dopazo, Joaquin %A Orozco, Modesto %A Gut, Ivo %A Esteller, Manel %K Methyl-Seq %K Methylomics %K Next Generation Sequencing %X Epithelial-to-mesenchymal transition (EMT) is a plastic process in which fully differentiated epithelial cells are converted into poorly differentiated, migratory and invasive mesenchymal cells and it has been related to the metastasis potential of tumors. This is a reversible process and cells can also eventually undergo mesenchymal-to-epithelial transition (MET). The existence of a dynamic EMT process suggests the involvement of epigenetic shifts in the phenotype. 
Herein, we obtained the DNA methylomes at single-base resolution of MDCK cells undergoing epithelial-to-mesenchymal transition (EMT) and translated the identified differentially methylated regions (DMRs) to human breast cancer cells undergoing a gain of migratory and invasive capabilities associated with the EMT phenotype. We noticed dynamic and reversible changes of DNA methylation, both on promoter sequences and gene-bodies in association with transcription regulation of EMT-related genes. Most importantly, the identified DNA methylation markers of EMT were present in primary mammary tumors in association with the epithelial or the mesenchymal phenotype of the studied breast cancer samples. %B Cancer research %V 74 %P 5608–19 %8 2014 Aug 8 %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/25106427 %R 10.1158/0008-5472.CAN-13-3659 %0 Journal Article %J Molecular Genetics & Genomic Medicine %D 2014 %T Deciphering intrafamilial phenotypic variability by exome sequencing in a Bardet–Biedl family %A González-del Pozo, María %A Méndez-Vidal, Cristina %A Santoyo-López, Javier %A Vela-Boza, Alicia %A Nereida Bravo-Gil %A Antonio Rueda %A García-Alonso, Luz %A Vázquez-Marouschek, Carmen %A Joaquín Dopazo %A Borrego, Salud %A Antiňolo, Guillermo %X Bardet–Biedl syndrome (BBS) is a model ciliopathy characterized by a wide range of clinical variability. The heterogeneity of this condition is reflected in the number of underlying gene defects and the epistatic interactions between the proteins encoded. BBS is generally inherited in an autosomal recessive trait. However, in some families, mutations across different loci interact to modulate the expressivity of the phenotype. In order to investigate the magnitude of epistasis in one BBS family with remarkable intrafamilial phenotypic variability, we designed an exome sequencing–based approach using SOLID 5500xl platform. 
This strategy allowed the reliable detection of the primary causal mutations in our family consisting of two novel compound heterozygous mutations in McKusick–Kaufman syndrome (MKKS) gene (p.D90G and p.V396F). Additionally, exome sequencing enabled the detection of one novel heterozygous NPHP4 variant which is predicted to activate a cryptic acceptor splice site and is only present in the most severely affected patient. Here, we provide an exome sequencing analysis of a BBS family and show the potential utility of this tool, in combination with network analysis, to detect disease-causing mutations and second-site modifiers. Our data demonstrate how next-generation sequencing (NGS) can facilitate the dissection of epistatic phenomena, and shed light on the genetic basis of phenotypic variability. %B Molecular Genetics & Genomic Medicine %V 2 %P 124-133 %G eng %U http://onlinelibrary.wiley.com/doi/10.1002/mgg3.50/full %R 10.1002/mgg3.50 %0 Journal Article %J Mol Genet Genomic Med %D 2014 %T Deciphering intrafamilial phenotypic variability by exome sequencing in a Bardet-Biedl family. %A González-del Pozo, María %A Méndez-Vidal, Cristina %A Santoyo-López, Javier %A Vela-Boza, Alicia %A Bravo-Gil, Nereida %A Rueda, Antonio %A García-Alonso, Luz %A Vázquez-Marouschek, Carmen %A Dopazo, Joaquin %A Borrego, Salud %A Antiňolo, Guillermo %X

Bardet-Biedl syndrome (BBS) is a model ciliopathy characterized by a wide range of clinical variability. The heterogeneity of this condition is reflected in the number of underlying gene defects and the epistatic interactions between the proteins encoded. BBS is generally inherited in an autosomal recessive trait. However, in some families, mutations across different loci interact to modulate the expressivity of the phenotype. In order to investigate the magnitude of epistasis in one BBS family with remarkable intrafamilial phenotypic variability, we designed an exome sequencing-based approach using SOLID 5500xl platform. This strategy allowed the reliable detection of the primary causal mutations in our family consisting of two novel compound heterozygous mutations in McKusick-Kaufman syndrome (MKKS) gene (p.D90G and p.V396F). Additionally, exome sequencing enabled the detection of one novel heterozygous NPHP4 variant which is predicted to activate a cryptic acceptor splice site and is only present in the most severely affected patient. Here, we provide an exome sequencing analysis of a BBS family and show the potential utility of this tool, in combination with network analysis, to detect disease-causing mutations and second-site modifiers. Our data demonstrate how next-generation sequencing (NGS) can facilitate the dissection of epistatic phenomena, and shed light on the genetic basis of phenotypic variability.

%B Mol Genet Genomic Med %V 2 %P 124-33 %8 2014 Mar %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/24689075?dopt=Abstract %R 10.1002/mgg3.50 %0 Journal Article %J PLoS One %D 2014 %T Exome sequencing reveals novel and recurrent mutations with clinical significance in inherited retinal dystrophies. %A González-del Pozo, María %A Méndez-Vidal, Cristina %A Bravo-Gil, Nereida %A Vela-Boza, Alicia %A Dopazo, Joaquin %A Borrego, Salud %A Antiňolo, Guillermo %K Adolescent %K Adult %K Amino Acid Sequence %K Base Sequence %K Child %K Chromosome Segregation %K DNA Mutational Analysis %K Exome %K Family %K Female %K Humans %K Inheritance Patterns %K Male %K Middle Aged %K Molecular Sequence Data %K mutation %K Pedigree %K Retinal Dystrophies %K Rhodopsin %X

This study aimed to identify the underlying molecular genetic cause in four Spanish families clinically diagnosed with Retinitis Pigmentosa (RP), comprising one autosomal dominant RP (adRP), two autosomal recessive RP (arRP) and one with two possible modes of inheritance: arRP or X-Linked RP (XLRP). We performed whole exome sequencing (WES) using the NimbleGen SeqCap EZ Exome V3 sample preparation kit and the SOLID 5500xl platform. All variants passing filter criteria were validated by Sanger sequencing to confirm familial segregation and their absence in a local control population. This strategy allowed the detection of: (i) one novel heterozygous splice-site deletion in RHO, c.937-2_944del, (ii) one rare homozygous mutation in C2orf71, c.1795T>C; p.Cys599Arg, not previously associated with the disease, (iii) two heterozygous null mutations in ABCA4, c.2041C>T; p.R681* and c.6088C>T; p.R2030*, and (iv) one mutation, c.2405-2406delAG; p.Glu802Glyfs*31 in the ORF15 of RPGR. The molecular findings for RHO and C2orf71 confirmed the initial diagnosis of adRP and arRP, respectively, while patients with the two ABCA4 mutations, both previously associated with Stargardt disease, presented symptoms of RP with early macular involvement. Finally, the X-Linked inheritance was confirmed for the family with the RPGR mutation. This latter finding allowed the inclusion of carrier sisters in our preimplantation genetic diagnosis program.

%B PLoS One %V 9 %P e116176 %8 2014 %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/25544989?dopt=Abstract %R 10.1371/journal.pone.0116176 %0 Journal Article %J Drug discovery today %D 2014 %T Genomics and transcriptomics in drug discovery. %A Dopazo, Joaquin %K adverse effects %K Drug discovery %K drug repositioning %K metagenomics %K modeling %K network analysis %K pathway analysis %X The popularization of genomic high-throughput technologies is causing a revolution in biomedical research and, particularly, is transforming the field of drug discovery. Systems biology offers a framework to understand the extensive human genetic heterogeneity revealed by genomic sequencing in the context of the network of functional, regulatory and physical protein-drug interactions. Thus, approaches to find biomarkers and therapeutic targets will have to take into account the complex system nature of the relationships of the proteins with the disease. Pharmaceutical companies will have to reorient their drug discovery strategies considering the human genetic heterogeneity. Consequently, modeling and computational data analysis will have an increasingly important role in drug discovery. %B Drug discovery today %V 19 %P 126-32 %8 2013 Jun 14 %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/23773860 %R 10.1016/j.drudis.2013.06.003 %0 Journal Article %J Annals of Applied Biology %D 2014 %T Molecular interactions between sugar beet and Polymyxa betae during its life cycle %A N. Desoignies %A Carbonell, J. %A J.-S. Moreau %A A. Conesa %A Dopazo, J. %A A. Legrève %X Polymyxa betae is a biotrophic obligate sugar beet parasite that belongs to plasmodiophorids. The infection of sugar beet roots by this parasite is asymptomatic, except when it transmits Beet necrotic yellow vein virus (BNYVV), the causal agent of rhizomania. To date, there has been little work on P. 
betae–sugar beet molecular interactions, mainly because of the obligate nature of the parasite and also because research on rhizomania has tended to focus on the virus. In this study, we investigated these interactions through differential transcript analysis, using suppressive subtractive hybridization. The analysis included 76 P. betae and 120 sugar beet expressed sequence tags (ESTs). The expression of selected ESTs from both organisms was monitored during the protist life cycle, revealing a potential role of two P. betae proteins, profilin and a Von Willebrand factor domain-containing protein, in the early phase of infection. This study also revealed an over-expression of some sugar beet genes involved in defence, such as those encoding PR proteins, stress resistance proteins or lectins, especially during the plasmodial stage of the P. betae life cycle. In addition to providing new information on the molecular aspects of P. betae–sugar beet interactions, this study also enabled previously unknown ESTs of P. betae to be sequenced, thus enhancing our knowledge of the genome of this protist. %B Annals of Applied Biology %V 164 %P 244–256 %G eng %U http://onlinelibrary.wiley.com/doi/10.1111/aab.12095/abstract %R 10.1111/aab.12095 %0 Journal Article %J Human mutation %D 2014 %T A New Overgrowth Syndrome is Due to Mutations in RNF125. 
%A Tenorio, Jair %A Mansilla, Alicia %A Valencia, María %A Martínez-Glez, Víctor %A Romanelli, Valeria %A Arias, Pedro %A Castrejón, Nerea %A Poletta, Fernando %A Guillén-Navarro, Encarna %A Gordo, Gema %A Mansilla, Elena %A García-Santiago, Fé %A González-Casado, Isabel %A Vallespín, Elena %A Palomares, María %A Mori, María A %A Santos-Simarro, Fernando %A García-Miñaur, Sixto %A Fernández, Luis %A Mena, Rocío %A Benito-Sanz, Sara %A Del Pozo, Angela %A Silla, Juan Carlos %A Ibañez, Kristina %A López-Granados, Eduardo %A Martín-Trujillo, Alex %A Montaner, David %A Heath, Karen E %A Campos-Barros, Angel %A Joaquín Dopazo %A Nevado, Julián %A Monk, David %A Ruiz-Pérez, Víctor L %A Lapunzina, Pablo %K NGS %K prioritization %K Rare Disease %X Overgrowth syndromes (OGS) are a group of disorders in which all parameters of growth and physical development are above the mean for age and sex. We evaluated a series of 270 families from the Spanish Overgrowth Syndrome Registry with no known overgrowth syndrome. We identified one de novo deletion and three missense mutations in RNF125 in six patients from 4 families with overgrowth, macrocephaly, intellectual disability, mild hydrocephaly, hypoglycaemia and inflammatory diseases resembling Sjögren syndrome. RNF125 encodes an E3 ubiquitin ligase and is a novel gene of OGS. Our studies of the RNF125 pathway point to upregulation of RIG-I-IPS1-MDA5 and/or disruption of the PI3K-AKT and interferon signaling pathways as the putative final effectors. This article is protected by copyright. All rights reserved. %B Human mutation %V 35 %P 1436–1441 %8 2014 Sep 5 %G eng %U http://onlinelibrary.wiley.com/doi/10.1002/humu.22689/abstract %R 10.1002/humu.22689 %0 Journal Article %J Bioinformatics %D 2014 %T ngsCAT: a tool to assess the efficiency of targeted enrichment sequencing. 
%A López-Domingo, Francisco J %A Florido, Javier P %A Rueda, Antonio %A Dopazo, Joaquin %A Santoyo-López, Javier %K Exome %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Sequence Analysis, DNA %K Software %X

MOTIVATION: Targeted enrichment sequencing by next-generation sequencing is a common approach to interrogate specific loci or the whole exome in the human genome. The efficiency and the lack of bias in the enrichment process need to be assessed as a quality control step before performing downstream analysis of the sequence data. Tools that can report on the sensitivity, specificity, uniformity and other enrichment-specific features are needed.

RESULTS: We have implemented the next-generation sequencing data Capture Assessment Tool (ngsCAT), a tool that takes the information of the mapped reads and the coordinates of the targeted regions as input files, and generates a report with metrics and figures that allows the evaluation of the efficiency of the enrichment process. The tool can also take as input the information of two samples allowing the comparison of two different experiments.

AVAILABILITY AND IMPLEMENTATION: Documentation and downloads for ngsCAT can be found at http://www.bioinfomgp.org/ngscat.

%B Bioinformatics %V 30 %P 1767-8 %8 2014 Jun 15 %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/24578402?dopt=Abstract %R 10.1093/bioinformatics/btu108 %0 Journal Article %J Neuromuscular disorders : NMD %D 2014 %T A novel locus for a hereditary recurrent neuropathy on chromosome 21q21. %A Calpena, E %A Martínez-Rubio, D %A Arpa, J %A García-Peñas, J J %A Montaner, D. %A Dopazo, J. %A Palau, F %A Espinós, C %X Hereditary recurrent neuropathies are uncommon. Disorders with a known molecular basis falling within this group include hereditary neuropathy with liability to pressure palsies (HNPP) due to the deletion of the PMP22 gene or to mutations in this same gene, and hereditary neuralgic amyotrophy (HNA) caused by mutations in the SEPT9 gene. We report a three-generation family presenting a hereditary recurrent neuropathy without pathological changes in either PMP22 or SEPT9 genes. We performed a genome-wide mapping, which yielded a locus of 12.4Mb on chromosome 21q21. The constructed haplotype fully segregated with the disease and we found significant evidence of linkage. After mutational screening of genes located within this locus, encoding for proteins and microRNAs, as well as analysis of large deletions/insertions, we identified 71 benign polymorphisms. Our findings suggest a novel genetic locus for a recurrent hereditary neuropathy of which the molecular defect remains elusive. Our results further underscore the clinical and genetic heterogeneity of this group of neuropathies. %B Neuromuscular disorders : NMD %V 24 %P 660-5 %8 2014 May 9 %G eng %U http://www.sciencedirect.com/science/article/pii/S0960896614001060# %R 10.1016/j.nmd.2014.04.004 %0 Journal Article %J BMC Genet %D 2014 %T Novel RP1 mutations and a recurrent BBS1 variant explain the co-existence of two distinct retinal phenotypes in the same pedigree. 
%A Méndez-Vidal, Cristina %A Bravo-Gil, Nereida %A González-del Pozo, María %A Vela-Boza, Alicia %A Dopazo, Joaquin %A Borrego, Salud %A Antiňolo, Guillermo %K Bardet-Biedl Syndrome %K Base Sequence %K Case-Control Studies %K DNA Mutational Analysis %K Eye Proteins %K Genes, Recessive %K Genetic Association Studies %K Humans %K Microsatellite Repeats %K Microtubule-Associated Proteins %K Mutation, Missense %K Pedigree %K Phenotype %K Retina %K Retinitis pigmentosa %X

BACKGROUND: Molecular diagnosis of Inherited Retinal Dystrophies (IRD) has long been challenging due to the extensive clinical and genetic heterogeneity present in this group of disorders. Here, we describe the clinical application of an integrated next-generation sequencing approach to determine the underlying genetic defects in a Spanish family with a provisional clinical diagnosis of autosomal recessive Retinitis Pigmentosa (arRP).

RESULTS: Exome sequencing of the index patient resulted in the identification of the homozygous BBS1 p.M390R mutation. Sanger sequencing of additional members of the family showed lack of co-segregation of the p.M390R variant in some individuals. Clinical reanalysis indicated co-occurrence of two different phenotypes in the same family: Bardet-Biedl syndrome in the individual harboring the BBS1 mutation and non-syndromic arRP in extended family members. To identify possible causative mutations underlying arRP, we conducted disease-targeted gene sequencing using a panel of 26 IRD genes. The in-house custom panel was validated using 18 DNA samples known to harbor mutations in relevant genes. All variants were redetected, indicating a high mutation detection rate. This approach allowed the identification of two novel heterozygous null mutations in RP1 (c.4582_4585delATCA; p.I1528Vfs*10 and c.5962dupA; p.I1988Nfs*3) which co-segregated with the disease in arRP patients. Additionally, a mutational screening in 96 patients of our cohort with genetically unresolved IRD revealed the presence of the c.5962dupA mutation in one unrelated family.

CONCLUSIONS: The combination of molecular findings for RP1 and BBS1 genes through exome and gene panel sequencing enabled us to explain the co-existence of two different retinal phenotypes in a family. The identification of two novel variants in RP1 suggests that the use of panels containing the prevalent genes of a particular population, together with an optimized data analysis pipeline, is an efficient and cost-effective approach that can be reliably implemented into the routine diagnostic process of diverse inherited retinal disorders. Moreover, the identification of these novel variants in two unrelated families supports the relatively high prevalence of RP1 mutations in Spanish population and the role of private mutations for commonly mutated genes, while extending the mutational spectrum of RP1.

%B BMC Genet %V 15 %P 143 %8 2014 Dec 14 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/25494902?dopt=Abstract %R 10.1186/s12863-014-0143-2 %0 Journal Article %J BMC Syst Biol %D 2014 %T Pathway network inference from gene expression data. %A Ponzoni, Ignacio %A Nueda, María %A Tarazona, Sonia %A Götz, Stefan %A Montaner, David %A Dussaut, Julieta %A Dopazo, Joaquin %A Conesa, Ana %K Alzheimer Disease %K Cell Cycle %K DNA Replication %K Gene Expression Profiling %K Gene Regulatory Networks %K Gluconeogenesis %K Glycolysis %K Oxidative Phosphorylation %K Proteolysis %K Purines %K Saccharomyces cerevisiae %K Systems biology %K Ubiquitin %X

BACKGROUND: The development of high-throughput omics technologies enabled genome-wide measurements of the activity of cellular elements and provides the analytical resources for the progress of the Systems Biology discipline. Analysis and interpretation of gene expression data has evolved from the gene to the pathway and interaction level, i.e. from the detection of differentially expressed genes, to the establishment of gene interaction networks and the identification of enriched functional categories. Still, the understanding of biological systems requires a further level of analysis that addresses the characterization of the interaction between functional modules.

RESULTS: We present a novel computational methodology to study the functional interconnections among the molecular elements of a biological system. The PANA approach uses high-throughput genomics measurements and a functional annotation scheme to extract an activity profile from each functional block -or pathway- followed by machine-learning methods to infer the relationships between these functional profiles. The result is a global, interconnected network of pathways that represents the functional cross-talk within the molecular system. We have applied this approach to describe the functional transcriptional connections during the yeast cell cycle and to identify pathways that change their connectivity in a disease condition using an Alzheimer example.

CONCLUSIONS: PANA is a useful tool for deepening our understanding of the functional interdependences that operate within complex biological systems. We show the approach is algorithmically consistent and the inferred network is well supported by the available functional data. The method allows the dissection of the molecular basis of the functional connections and we describe the different regulatory mechanisms that explain the network's topology obtained for the yeast cell cycle data.

%B BMC Syst Biol %V 8 Suppl 2 %P S7 %8 2014 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/25032889?dopt=Abstract %R 10.1186/1752-0509-8-S2-S7 %0 Journal Article %J PLoS One %D 2014 %T Permanent cardiac sarcomere changes in a rabbit model of intrauterine growth restriction. %A Torre, Iratxe %A González-Tendero, Anna %A García-Cañadilla, Patricia %A Crispi, Fátima %A Garcia-Garcia, Francisco %A Bijnens, Bart %A Iruretagoyena, Igor %A Dopazo, Joaquin %A Amat-Roldán, Ivan %A Gratacós, Eduard %K Animals %K biomarkers %K Blood Pressure %K Body Weight %K Disease Models, Animal %K Echocardiography %K Female %K Fetal Growth Retardation %K Fetal Heart %K Fetus %K Gene Expression Profiling %K Organ Size %K Placenta %K Pregnancy %K Rabbits %K Sarcomeres %X

BACKGROUND: Intrauterine growth restriction (IUGR) induces fetal cardiac remodelling and dysfunction, which persists postnatally and may explain the link between low birth weight and increased cardiovascular mortality in adulthood. However, the cellular and molecular bases for these changes are still not well understood. We tested the hypothesis that IUGR is associated with structural and functional gene expression changes in the fetal sarcomere cytoarchitecture, which remain present in adulthood.

METHODS AND RESULTS: IUGR was induced in New Zealand pregnant rabbits by selective ligation of the utero-placental vessels. Fetal echocardiography demonstrated more globular hearts and signs of cardiac dysfunction in IUGR. Second harmonic generation microscopy (SHGM) showed shorter sarcomere length and shorter A-band and thick-thin filament interaction lengths, that were already present in utero and persisted at 70 postnatal days (adulthood). Sarcomeric M-band (GO: 0031430) functional term was over-represented in IUGR fetal hearts.

CONCLUSION: The results suggest that IUGR induces cardiac dysfunction and permanent changes on the sarcomere.

%B PLoS One %V 9 %P e113067 %8 2014 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/25402351?dopt=Abstract %R 10.1371/journal.pone.0113067 %0 Journal Article %J Journal of experimental botany %D 2014 %T Programmed cell death activated by Rose Bengal in Arabidopsis thaliana cell suspension cultures requires functional chloroplasts. %A Gutiérrez, Jorge %A González-Pérez, Sergio %A Garcia-Garcia, Francisco %A Daly, Cara T %A Lorenzo, Oscar %A Revuelta, José L %A McCabe, Paul F %A Arellano, Juan B %X Light-grown Arabidopsis thaliana cell suspension culture (ACSC) were subjected to mild photooxidative damage with Rose Bengal (RB) with the aim of gaining a better understanding of singlet oxygen-mediated defence responses in plants. Additionally, ACSC were treated with H2O2 at concentrations that induced comparable levels of protein oxidation damage. Under low to medium light conditions, both RB and H2O2 treatments activated transcriptional defence responses and inhibited photosynthetic activity, but they differed in that programmed cell death (PCD) was only observed in cells treated with RB. When dark-grown ACSC were subjected to RB in the light, PCD was suppressed, indicating that the singlet oxygen-mediated signalling pathway in ACSC requires functional chloroplasts. Analysis of up-regulated transcripts in light-grown ACSC, treated with RB in the light, showed that both singlet oxygen-responsive transcripts and transcripts with a key role in hormone-activated PCD (i.e. ethylene and jasmonic acid) were present. A co-regulation analysis proved that ACSC treated with RB exhibited higher correlation with the conditional fluorescence (flu) mutant than with other singlet oxygen-producing mutants or wild-type plants subjected to high light. However, there was no evidence for the up-regulation of EDS1, suggesting that activation of PCD was not associated with the EXECUTER- and EDS1-dependent signalling pathway described in the flu mutant. 
Indigo Carmine and Methylene Violet, two photosensitizers unable to enter chloroplasts, did not activate transcriptional defence responses in ACSC; however, whether this was due to their location or to their inherently low singlet oxygen quantum efficiencies was not determined. %B Journal of experimental botany %8 2014 Apr 10 %G eng %U http://jxb.oxfordjournals.org/content/early/2014/04/09/jxb.eru151.long %R 10.1093/jxb/eru151 %0 Journal Article %J Mol Syst Biol %D 2014 %T The role of the interactome in the maintenance of deleterious variability in human populations. %A García-Alonso, Luz %A Jiménez-Almazán, Jorge %A Carbonell-Caballero, José %A Vela-Boza, Alicia %A Santoyo-López, Javier %A Antiňolo, Guillermo %A Dopazo, Joaquin %K Alleles %K Exome %K Gene Library %K Genetic Variation %K Genetics, Population %K Genome, Human %K Genomics %K Humans %K Models, Genetic %K mutation %K Phenotype %K Protein Conformation %K Protein Interaction Maps %K Sequence Analysis, DNA %K Whites %X

Recent genomic projects have revealed the existence of an unexpectedly large amount of deleterious variability in the human genome. Several hypotheses have been proposed to explain such an apparently high mutational load. However, the mechanisms by which deleterious mutations in some genes cause a pathological effect but are apparently innocuous in other genes remain largely unknown. This study searched for deleterious variants in the 1,000 genomes populations, as well as in a newly sequenced population of 252 healthy Spanish individuals. In addition, variants causative of monogenic diseases and somatic variants from 41 chronic lymphocytic leukaemia patients were analysed. The deleterious variants found were analysed in the context of the interactome to understand the role of network topology in the maintenance of the observed mutational load. Our results suggest that one of the mechanisms whereby the effect of these deleterious variants on the phenotype is suppressed could be related to the configuration of the protein interaction network. Most of the deleterious variants observed in healthy individuals are concentrated in peripheral regions of the interactome, in combinations that preserve their connectivity, and have a marginal effect on interactome integrity. On the contrary, likely pathogenic cancer somatic deleterious variants tend to occur in internal regions of the interactome, often with associated structural consequences. Finally, variants causative of monogenic diseases seem to occupy an intermediate position. Our observations suggest that the real pathological potential of a variant might be more a systems property rather than an intrinsic property of individual proteins.

%B Mol Syst Biol %V 10 %P 752 %8 2014 Sep 26 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/25261458?dopt=Abstract %R 10.15252/msb.20145222 %0 Journal Article %J Fungal Genet Biol %D 2014 %T Sequencing and functional analysis of the genome of a nematode egg-parasitic fungus, Pochonia chlamydosporia. %A Larriba, Eduardo %A Jaime, María D L A %A Carbonell-Caballero, José %A Conesa, Ana %A Dopazo, Joaquin %A Nislow, Corey %A Martín-Nieto, José %A Lopez-Llorca, Luis Vicente %K Animals %K Ascomycota %K Female %K Gene Expression Regulation, Fungal %K Gene ontology %K Genome, Fungal %K Hordeum %K Host-Pathogen Interactions %K Nematoda %K Ovum %K Phylogeny %K Plant Roots %K Sequence Analysis, DNA %K Signal Transduction %K Transcriptome %X

Pochonia chlamydosporia is a worldwide-distributed soil fungus with a great capacity to infect and destroy the eggs and kill females of plant-parasitic nematodes. Additionally, it has the ability to colonize endophytically roots of economically-important crop plants, thereby promoting their growth and eliciting plant defenses. This multitrophic behavior makes P. chlamydosporia a potentially useful tool for sustainable agriculture approaches. We sequenced and assembled ∼41 Mb of P. chlamydosporia genomic DNA and predicted 12,122 gene models, of which many were homologous to genes of fungal pathogens of invertebrates and fungal plant pathogens. Predicted genes (65%) were functionally annotated according to Gene Ontology, and 16% of them found to share homology with genes in the Pathogen Host Interactions (PHI) database. The genome of this fungus is highly enriched in genes encoding hydrolytic enzymes, such as proteases, glycoside hydrolases and carbohydrate esterases. We used RNA-Seq technology in order to identify the genes expressed during endophytic behavior of P. chlamydosporia when colonizing barley roots. Functional annotation of these genes showed that hydrolytic enzymes and transporters are expressed during endophytism. This structural and functional analysis of the P. chlamydosporia genome provides a starting point for understanding the molecular mechanisms involved in the multitrophic lifestyle of this fungus. The genomic information provided here should also prove useful for enhancing the capabilities of this fungus as a biocontrol agent of plant-parasitic nematodes and as a plant growth-promoting organism.

%B Fungal Genet Biol %V 65 %P 69-80 %8 2014 Apr %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/24530791?dopt=Abstract %R 10.1016/j.fgb.2014.02.002 %0 Journal Article %J Hum Mutat %D 2014 %T Two novel mutations in the BCKDK (branched-chain keto-acid dehydrogenase kinase) gene are responsible for a neurobehavioral deficit in two pediatric unrelated patients. %A García-Cazorla, Angels %A Oyarzabal, Alfonso %A Fort, Joana %A Robles, Concepción %A Castejón, Esperanza %A Ruiz-Sala, Pedro %A Bodoy, Susanna %A Merinero, Begoña %A Lopez-Sala, Anna %A Dopazo, Joaquin %A Nunes, Virginia %A Ugarte, Magdalena %A Artuch, Rafael %A Palacín, Manuel %A Rodríguez-Pombo, Pilar %A Alcaide, Patricia %A Navarrete, Rosa %A Sanz, Paloma %A Font-Llitjós, Mariona %A Vilaseca, Ma Antonia %A Ormaizabal, Aida %A Pristoupilova, Anna %A Agulló, Sergi Beltran %K Amino Acids, Branched-Chain %K Developmental Disabilities %K Fibroblasts %K Humans %K Male %K Mutation, Missense %K Nervous System Diseases %K Pediatrics %K Protein Kinases %X

Inactivating mutations in the BCKDK gene, which codes for the kinase responsible for the negative regulation of the branched-chain α-keto acid dehydrogenase complex (BCKD), have recently been associated with a form of autism in three families. In this work, two novel exonic BCKDK mutations, c.520C>G/p.R174G and c.1166T>C/p.L389P, were identified at the homozygous state in two unrelated children with persistently reduced body fluid levels of branched-chain amino acids (BCAAs), developmental delay, microcephaly, and neurobehavioral abnormalities. Functional analysis of the mutations confirmed the missense character of the c.1166T>C change and showed a splicing defect r.[520c>g;521_543del]/p.R174Gfs1*, for c.520C>G due to the presence of a new donor splice site. Mutation p.L389P showed total loss of kinase activity. Moreover, patient-derived fibroblasts showed undetectable (p.R174Gfs1*) or barely detectable (p.L389P) levels of BCKDK protein and its phosphorylated substrate (phospho-E1α), resulting in increased BCKD activity and the very rapid BCAA catabolism manifested by the patients' clinical phenotype. Based on these results, a protein-rich diet plus oral BCAA supplementation was implemented in the patient homozygous for p.R174Gfs1*. This treatment normalized plasma BCAA levels and improved growth, developmental and behavioral variables. Our results demonstrate that BCKDK mutations can result in neurobehavioral deficits in humans and support the rationale for dietary intervention.

%B Hum Mutat %V 35 %P 470-7 %8 2014 Apr %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/24449431?dopt=Abstract %R 10.1002/humu.22513 %0 Journal Article %J Human mutation %D 2014 %T Two Novel Mutations in the BCKDK Gene (Branched-Chain Keto-Acid Dehydrogenase Kinase) are Responsible of a Neurobehavioral Deficit in two Pediatric Unrelated Patients. %A García-Cazorla, Angels %A Oyarzabal, Alfonso %A Fort, Joana %A Robles, Concepción %A Castejón, Esperanza %A Ruiz-Sala, Pedro %A Bodoy, Susanna %A Merinero, Begoña %A Lopez-Sala, Anna %A Joaquín Dopazo %A Nunes, Virginia %A Ugarte, Magdalena %A Artuch, Rafael %A Palacín, Manuel %A Rodríguez-Pombo, Pilar %X Inactivating mutations in the BCKDK gene, which codes for the kinase responsible for the negative regulation of the branched-chain keto-acid dehydrogenase complex (BCKD), have recently been associated with a form of autism in three families. In this work, two novel exonic BCKDK mutations, c.520C>G/p.R174G and c.1166T>C/p.L389P, were identified at the homozygous state in two unrelated children with persistently reduced body fluid levels of branched-chain amino acids (BCAAs), developmental delay, microcephaly and neurobehavioral abnormalities. Functional analysis of the mutations confirmed the missense character of the c.1166T>C change and showed a splicing defect r.[520c>g;521_543del]/p.R174Gfs1*, for c.520C>G due to the presence of a new donor splice site. Mutation p.L389P showed total loss of kinase activity. Moreover, patient-derived fibroblasts showed undetectable (p.R174Gfs1*) or barely detectable (p.L389P) levels of BCKDK protein and its phosphorylated substrate (phospho-E1α), resulting in increased BCKD activity and the very rapid BCAA catabolism manifested by the patients’ clinical phenotype. Based on these results, a protein-rich diet plus oral BCAA supplementation was implemented in the patient homozygous for p.R174Gfs1*. 
This treatment normalized plasma BCAA levels and improved growth, developmental and behavioral variables. Our results demonstrate that BCKDK mutations can result in neurobehavioral deficits in humans and support the rationale for dietary intervention. %B Human mutation %V 35 %P 470-7 %8 2014 Jan 21 %G eng %U http://onlinelibrary.wiley.com/doi/10.1002/humu.22513/abstract %R 10.1002/humu.22513 %0 Journal Article %J BMC systems biology %D 2014 %T Understanding disease mechanisms with models of signaling pathway activities. %A Sebastián-Leon, Patricia %A Vidal, Enrique %A Minguez, Pablo %A Ana Conesa %A Sonia Tarazona %A Amadoz, Alicia %A Armero, Carmen %A Salavert, Francisco %A Vidal-Puig, Antonio %A Montaner, David %A Joaquín Dopazo %K Disease mechanism %K pathway %K signalling %K Systems biology %X Background: Understanding the aspects of cell functionality that account for disease or drug action mechanisms is one of the main challenges in the analysis of genomic data and is the basis for the future implementation of precision medicine. Results: Here we propose a simple probabilistic model in which signaling pathways are separated into elementary sub-pathways or signal transmission circuits (which ultimately trigger cell functions) and which then transforms gene expression measurements into probabilities of activation of such signal transmission circuits. Using this model, differential activation of such circuits between biological conditions can be estimated. Thus, circuit activation statuses can be interpreted as biomarkers that discriminate among the compared conditions. This type of mechanism-based biomarker accounts for cell functional activities and can easily be associated with disease or drug action mechanisms. 
The accuracy of the proposed model is demonstrated with simulations and real datasets. Conclusions: The proposed model provides detailed information that enables the interpretation of disease mechanisms as a consequence of the complex combinations of altered gene expression values. Moreover, it offers a framework for suggesting possible ways of therapeutic intervention in a pathologically perturbed system. %B BMC systems biology %V 8 %P 121 %8 2014 Oct 25 %G eng %U http://www.biomedcentral.com/1752-0509/8/121/abstract %R 10.1186/s12918-014-0121-3 %0 Journal Article %J BMC systems biology %D 2014 %T Understanding disease mechanisms with models of signaling pathway activities %A Sebastián-Leon, Patricia %A Vidal, Enrique %A Minguez, Pablo %A Conesa, Ana %A Tarazona, Sonia %A Amadoz, Alicia %A Armero, Carmen %A Salavert Torres, Francisco %A Vidal-Puig, Antonio %A Montaner, David %A Dopazo, Joaquin %B BMC systems biology %V 8 %P 121 %8 10 %G eng %R 10.1186/s12918-014-0121-3 %0 Journal Article %J Nucleic acids research %D 2014 %T A web tool for the design and management of panels of genes for targeted enrichment and massive sequencing for clinical applications. %A Alemán, Alejandro %A Garcia-Garcia, Francisco %A Medina, Ignacio %A Joaquín Dopazo %K Diagnostic %K Targeted enrichment sequencing %K WES %X Disease targeted sequencing is gaining importance as a powerful and cost-effective application of high throughput sequencing technologies to diagnosis. However, the lack of proper tools to process the data hinders its extensive adoption. Here we present TEAM, an intuitive and easy-to-use web tool that fills the gap between the predicted mutations and the final diagnosis in targeted enrichment sequencing analysis. The tool searches for known diagnostic mutations, corresponding to a disease panel, among the patient’s predicted variants. Diagnostic variants for the disease are taken from four databases of disease-related variants (HGMD-public, HUMSAVAR, ClinVar and COSMIC). 
If no primary diagnostic variant is found, then a list of secondary findings that can help to establish a diagnosis is produced. TEAM also provides an interface for the definition and customization of panels, by means of which genes and mutations can be added or discarded to adjust panel definitions. TEAM is freely available at: http://team.babelomics.org. %B Nucleic acids research %V 42 %P W83-W87 %8 2014 May 26 %G eng %U http://nar.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=24861626 %R 10.1093/nar/gku472 %0 Journal Article %J Nucleic acids research %D 2014 %T A web-based interactive framework to assist in the prioritization of disease candidate genes in whole-exome sequencing studies. %A Alemán, Alejandro %A Garcia-Garcia, Francisco %A Salavert, Francisco %A Medina, Ignacio %A Joaquín Dopazo %K NGS. prioritization %X Whole-exome sequencing has become a fundamental tool for the discovery of disease-related genes of familial diseases and the identification of somatic driver variants in cancer. However, finding the causal mutation among the enormous background of individual variability in a small number of samples is still a big challenge. Here we describe a web-based tool, BiERapp, which efficiently helps in the identification of causative variants in family and sporadic genetic diseases. The program reads lists of predicted variants (nucleotide substitutions and indels) in affected individuals or tumor samples and controls. In family studies, different modes of inheritance can easily be defined to filter out variants that do not segregate with the disease along the family. Moreover, BiERapp integrates additional information such as allelic frequencies in the general population and the most popular damaging scores to further narrow down the number of putative variants in successive filtering steps. 
BiERapp provides an interactive and user-friendly interface that implements the filtering strategy used in the context of a large-scale genomic project carried out by the Spanish Network for Research in Rare Diseases (CIBERER) in which more than 800 exomes have been analyzed. BiERapp is freely available at: http://bierapp.babelomics.org/ %B Nucleic acids research %V 42 %P W88-W93 %8 2014 May 6 %G eng %U http://nar.oxfordjournals.org/content/42/W1/W88 %R 10.1093/nar/gku407 %0 Journal Article %J Omics : a journal of integrative biology %D 2013 %T Assessing Differential Expression Measurements by Highly Parallel Pyrosequencing and DNA Microarrays: A Comparative Study. %A Ariño, Joaquín %A Casamayor, Antonio %A Pérez, Julián Perez %A Pedrola, Laia %A Alvarez-Tejado, Miguel %A Marbà, Martina %A Santoyo, Javier %A Joaquín Dopazo %X

To explore the feasibility of pyrosequencing for quantitative differential gene expression analysis, we have performed a comparative study of the results of the sequencing experiments and those obtained by a conventional DNA microarray platform. A conclusion from our analysis is that, over a threshold of 35 normalized reads per gene, the measurements of gene expression display a good correlation with the references. The observed concordance between pyrosequencing and DNA microarray platforms beyond the threshold was 0.8, measured as a Pearson’s correlation coefficient. In differential gene expression the initial aim is the quantification of the differences among transcripts when comparing experimental conditions. Thus, even in a scenario of low coverage the concordance in the measurements is quite acceptable. On the other hand, the comparatively longer read size obtained by pyrosequencing allows detecting unconventional splicing forms.

%B Omics : a journal of integrative biology %8 2011 Sep 15 %G eng %U http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3545353/ %R 10.1089/omi.2011.0065 %0 Journal Article %J Oncotarget %D 2013 %T Capturing the biological impact of CDKN2A and MC1R genes as an early predisposing event in melanoma and non melanoma skin cancer. %A Puig-Butille, Joan Anton %A Escamez, Maria José %A Garcia-Garcia, Francisco %A Tell-Marti, Gemma %A Fabra, Angels %A Martínez-Santamaría, Lucía %A Badenas, Celia %A Aguilera, Paula %A Pevida, Marta %A Joaquín Dopazo %A Del Rio, Marcela %A Puig, Susana %X Germline mutations in CDKN2A and/or red hair color variants in MC1R genes are associated with an increased susceptibility to develop cutaneous melanoma or non melanoma skin cancer. We studied the impact of the CDKN2A germinal mutation p.G101W and MC1R variants on gene expression and transcription profiles associated with skin cancer. To this end we set up primary skin cell co-cultures from siblings of melanoma-prone families that were later analyzed using the expression array approach. As a result, we found that 1535 transcripts were deregulated in CDKN2A mutated cells, with over-expression of immunity-related genes (HLA-DPB1, CLEC2B, IFI44, IFI44L, IFI27, IFIT1, IFIT2, SP110 and IFNK) and down-regulation of genes playing a role in the Notch signaling pathway. 3570 transcripts were deregulated in MC1R variant carriers. In particular, genes related to oxidative stress and DNA damage pathways were up-regulated as well as genes associated with neurodegenerative diseases such as Parkinson’s, Alzheimer’s and Huntington’s. Finally, we observed that the expression signatures identified in phenotypically normal cells carrying CDKN2A mutations or MC1R variants are maintained in skin cancer tumors (melanoma and squamous cell carcinoma). These results indicate that transcriptome deregulation represents an early event critical for skin cancer development. 
%B Oncotarget %8 2013 Dec 16 %G eng %U http://www.impactjournals.com/oncotarget/index.php?journal=oncotarget&page=article&op=view&path%5B%5D=1444&path%5B%5D=1824 %0 Journal Article %J PLoS One %D 2013 %T Defining the genomic signature of totipotency and pluripotency during early human development. %A Galan, Amparo %A Diaz-Gimeno, Patricia %A Poo, Maria Eugenia %A Valbuena, Diana %A Sanchez, Eva %A Ruiz, Veronica %A Dopazo, Joaquin %A Montaner, David %A Conesa, Ana %A Simon, Carlos %K Blastocyst Inner Cell Mass %K Blastomeres %K Cell Differentiation %K Embryonic Development %K Embryonic Stem Cells %K Gene Expression Profiling %K Gene Regulatory Networks %K Genome, Human %K Humans %K Molecular Sequence Annotation %K Pluripotent Stem Cells %K Totipotent Stem Cells %X

The genetic mechanisms governing human pre-implantation embryo development and the in vitro counterparts, human embryonic stem cells (hESCs), remain incompletely understood. Previous global genome studies demonstrated that totipotent blastomeres from day-3 human embryos and pluripotent inner cell masses (ICMs) from blastocysts display unique and differing transcriptomes. Nevertheless, comparative gene expression analysis has revealed that no significant differences exist between hESCs derived from blastomeres versus those obtained from ICMs, suggesting that pluripotent hESCs involve a new developmental progression. To understand the evolution of early human stages, we developed an undifferentiation network signature (UNS) and applied it to a differential gene expression profile between single blastomeres from day-3 embryos, ICMs and hESCs. This allowed us to establish a unique signature composed of highly interconnected genes characteristic of totipotency (61 genes), in vivo pluripotency (20 genes), and in vitro pluripotency (107 genes), and which are also proprietary according to functional analysis. This systems biology approach has led to an improved understanding of the molecular and signaling processes governing human pre-implantation embryo development, as well as enabling us to comprehend how hESCs might adapt to in vitro culture conditions.

%B PLoS One %V 8 %P e62135 %8 2013 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/23614026?dopt=Abstract %R 10.1371/journal.pone.0062135 %0 Journal Article %J J Biol Regul Homeost Agents %D 2013 %T Differential gene-expression analysis defines a molecular pattern related to olive pollen allergy. %A Aguerri, M %A Calzada, D %A Montaner, D %A Mata, M %A Florido, F %A Quiralte, J %A Dopazo, J %A Lahoz, C %A Cardaba, B %K Adult %K Female %K Gene Expression Profiling %K Humans %K Male %K Middle Aged %K Olea %K Principal Component Analysis %K Rhinitis, Allergic, Seasonal %X

Analysis of gene-expression profiles by microarrays is useful for characterization of candidate genes, key regulatory networks, and to define phenotypes or molecular signatures which improve the diagnosis and/or classification of the allergic processes. We have used this approach in the study of olive pollen response in order to find differential molecular markers among responders and non-responders to this allergenic source. Five clinical groups (non-allergic, asymptomatic, allergic but not to olive pollen, untreated olive-pollen allergic patients and olive-pollen allergic patients under specific immunotherapy) were assessed during and outside pollen seasons. Whole-genome gene expression analysis was performed in RNAs extracted from PBMCs. After assessment of data quality and principal components analysis (PCA), differential gene-expression analysis (by multiple testing) and functional analyses (KEGG for pathways and Gene Ontology for biological processes) were performed. Relevance was defined by fold change and corrected P values (less than 0.05). The most differential genes were validated by qRT-PCR in a larger set of individuals. Interestingly, gene-expression profiling obtained by PCA clearly showed five clusters of samples that correlated with the five clinical groups. Furthermore, differential gene expression and functional analyses revealed differential genes and pathways in the five clinical groups. The 93 most significant genes found were validated, and one set of 35 genes was able to discriminate profiles of olive pollen response. Our results, in addition to providing new information on allergic response, define a possible molecular signature for olive pollen allergy which could be useful for the diagnosis and treatment of this and other sensitizations.

%B J Biol Regul Homeost Agents %V 27 %P 337-50 %8 2013 Apr-Jun %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/23830385?dopt=Abstract %0 Book Section %B III Jornadas de Intercambio de Experiencias de Innovación Educativa en Estadística %D 2013 %T Docencia en Estadística: Experiencias de Innovación %A Garcia-Garcia, Francisco %A Montaner, David %B III Jornadas de Intercambio de Experiencias de Innovación Educativa en Estadística %V 1 %P 201-210 %G eng %0 Journal Article %J Mol Genet Metab %D 2013 %T Exome sequencing identifies a new mutation in SERAC1 in a patient with 3-methylglutaconic aciduria. %A Tort, Frederic %A García-Silva, María Teresa %A Ferrer-Cortès, Xènia %A Navarro-Sastre, Aleix %A Garcia-Villoria, Judith %A Coll, Maria Josep %A Vidal, Enrique %A Jiménez-Almazán, Jorge %A Dopazo, Joaquin %A Briones, Paz %A Elpeleg, Orly %A Ribes, Antonia %K Adolescent %K Adult %K Carboxylic Ester Hydrolases %K Child %K Exome %K Female %K High-Throughput Nucleotide Sequencing %K Humans %K Infant %K Male %K Metabolism, Inborn Errors %K mutation %X

3-Methylglutaconic aciduria (3-MGA-uria) is a heterogeneous group of syndromes characterized by an increased excretion of 3-methylglutaconic and 3-methylglutaric acids. Five types of 3-MGA-uria (I to V) with different clinical presentations have been described. Causative mutations in TAZ, OPA3, DNAJC19, ATP12, ATP5E, and TMEM70 have been identified. After excluding the known genetic causes of 3-MGA-uria we used exome sequencing to investigate a patient with Leigh syndrome and 3-MGA-uria. We identified a homozygous variant in SERAC1 (c.202C>T; p.Arg68*), that generates a premature stop codon at position 68 of SERAC1 protein. Western blot analysis in patient's fibroblasts showed a complete absence of SERAC1 that was consistent with the prediction of a truncated protein and supports the pathogenic role of the mutation. During the course of this project a parallel study identified mutations in SERAC1 as the genetic cause of the disease in 15 patients with MEGDEL syndrome, which was compatible with the clinical and biochemical phenotypes of the patient described here. In addition, our patient developed microcephaly and optic atrophy, two features not previously reported in MEGDEL syndrome. We highlight the usefulness of exome sequencing to reveal the genetic bases of human rare diseases even if only one affected individual is available.

%B Mol Genet Metab %V 110 %P 73-7 %8 2013 Sep-Oct %G eng %N 1-2 %1 https://www.ncbi.nlm.nih.gov/pubmed/23707711?dopt=Abstract %R 10.1016/j.ymgme.2013.04.021 %0 Journal Article %J Nucleic acids research %D 2013 %T Genome Maps, a new generation genome browser. %A Medina, Ignacio %A Salavert, Francisco %A Sánchez, Rubén %A De Maria, Alejandro %A Alonso, Roberto %A Escobar, Pablo %A Bleda, Marta %A Joaquín Dopazo %K BAM %K genome viewer %K HTML5 %K javascript %K Next Generation Sequencing %K NGS %K SVG %K VCF %X Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. 
Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org. %B Nucleic acids research %V 41 %P W41-W46 %8 2013 Jun 8 %G eng %U http://nar.oxfordjournals.org/content/41/W1/W41 %R 10.1093/nar/gkt530 %0 Journal Article %J Carcinogenesis %D 2013 %T Grape antioxidant dietary fiber (GADF) inhibits intestinal polyposis in ApcMin/+ mice: relation to cell cycle and immune response. %A Sánchez-Tena, Susana %A Lizarraga, Daneida %A Miranda, Anibal %A Vinardell, Maria Pilar %A Garcia-Garcia, Francisco %A Joaquín Dopazo %A Torres, Josep Lluís %A Saura-Calixto, Fulgencio %A Capellà, Gabriel %A Cascante, Marta %X Epidemiological and experimental studies suggest that fiber and phenolic compounds might have a protective effect on the development of colon cancer in humans. Accordingly, we assessed the chemopreventive efficacy and associated mechanisms of action of a lyophilized red grape pomace containing proanthocyanidin-rich dietary fiber (Grape Antioxidant Dietary Fiber, GADF) on spontaneous intestinal tumorigenesis in the Apc(Min/+) mouse model. Mice were fed a standard diet (control group) or a 1% (w/w) GADF-supplemented diet (GADF group) for 6 weeks. GADF supplementation greatly reduced intestinal tumorigenesis, significantly decreasing the total number of polyps by 76%. Moreover, size distribution analysis showed a considerable reduction in all polyp size categories [diameter <1 mm (65%), 1-2 mm (67%) and >2 mm (87%)]. In terms of polyp formation in the proximal, middle and distal portions of the small intestine a decrease of 76%, 81% and 73% was observed respectively. Putative molecular mechanisms underlying the inhibition of intestinal tumorigenesis were investigated by comparison of microarray expression profiles of GADF-treated and non-treated mice. 
We observed that the effects of GADF are mainly associated with the induction of a G1 cell cycle arrest and the downregulation of genes related to the immune response and inflammation. Our findings show for the first time the efficacy and associated mechanisms of action of GADF against intestinal tumorigenesis in Apc(Min/+) mice, suggesting its potential for the prevention of colorectal cancer. %B Carcinogenesis %8 2013 Apr 24 %G eng %U http://carcin.oxfordjournals.org/content/early/2013/04/23/carcin.bgt140.abstract %R 10.1093/carcin/bgt140 %0 Journal Article %J Carcinogenesis %D 2013 %T Grape antioxidant dietary fiber inhibits intestinal polyposis in ApcMin/+ mice: relation to cell cycle and immune response. %A Sánchez-Tena, Susana %A Lizarraga, Daneida %A Miranda, Anibal %A Vinardell, Maria P %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Torres, Josep L %A Saura-Calixto, Fulgencio %A Capellà, Gabriel %A Cascante, Marta %K Animals %K Antioxidants %K Body Weight %K Carcinogenesis %K Cell Cycle %K Cell Cycle Checkpoints %K Colorectal Neoplasms %K Dietary Fiber %K Dietary Supplements %K Down-Regulation %K G1 Phase %K Inflammation %K Intestinal Polyposis %K Intestinal Polyps %K Intestine, Small %K Male %K Mice %K Transcriptome %K Vitis %X

Epidemiological and experimental studies suggest that fiber and phenolic compounds might have a protective effect on the development of colon cancer in humans. Accordingly, we assessed the chemopreventive efficacy and associated mechanisms of action of a lyophilized red grape pomace containing proanthocyanidin (PA)-rich dietary fiber [grape antioxidant dietary fiber (GADF)] on spontaneous intestinal tumorigenesis in the Apc(Min/+) mouse model. Mice were fed a standard diet (control group) or a 1% (w/w) GADF-supplemented diet (GADF group) for 6 weeks. GADF supplementation greatly reduced intestinal tumorigenesis, significantly decreasing the total number of polyps by 76%. Moreover, size distribution analysis showed a considerable reduction in all polyp size categories [diameter <1mm (65%), 1-2mm (67%) and >2mm (87%)]. In terms of polyp formation in the proximal, middle and distal portions of the small intestine, a decrease of 76, 81 and 73% was observed, respectively. Putative molecular mechanisms underlying the inhibition of intestinal tumorigenesis were investigated by comparison of microarray expression profiles of GADF-treated and non-treated mice. We observed that the effects of GADF are mainly associated with the induction of a G1 cell cycle arrest and the downregulation of genes related to the immune response and inflammation. Our findings show for the first time the efficacy and associated mechanisms of action of GADF against intestinal tumorigenesis in Apc(Min/+) mice, suggesting its potential for the prevention of colorectal cancer.

%B Carcinogenesis %V 34 %P 1881-8 %8 2013 Aug %G eng %N 8 %1 https://www.ncbi.nlm.nih.gov/pubmed/23615403?dopt=Abstract %R 10.1093/carcin/bgt140 %0 Journal Article %J Nucleic Acids Res %D 2013 %T Inferring the functional effect of gene expression changes in signaling pathways. %A Sebastián-Leon, Patricia %A Carbonell, José %A Salavert, Francisco %A Sánchez, Rubén %A Medina, Ignacio %A Dopazo, Joaquin %K Animals %K Humans %K Internet %K Mice %K Models, Statistical %K Receptors, Cell Surface %K Signal Transduction %K Software %K Transcriptome %X

Signaling pathways constitute a valuable source of information that allows interpreting the way in which alterations in gene activities affect particular cell functionalities. There are web tools available that allow viewing and editing pathways, as well as representing experimental data on them. However, few methods exist that aim to identify the signaling circuits, within a pathway, associated with the biological problem studied, and none of them provides a convenient graphical web interface. We present PATHiWAYS, a web-based signaling pathway visualization system that infers changes in signaling that affect cell functionality from the measurements of gene expression values in typical expression microarray case-control experiments. A simple probabilistic model of the pathway is used to estimate the probabilities for signal transmission from any receptor to any final effector molecule (taking into account the pathway topology), using the individual probabilities of gene product presence/absence inferred from gene expression values. Significant changes in these probabilities allow linking different cell functionalities triggered by the pathway to the biological problem studied. PATHiWAYS is available at: http://pathiways.babelomics.org/.

%B Nucleic Acids Res %V 41 %P W213-7 %8 2013 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/23748960?dopt=Abstract %R 10.1093/nar/gkt451 %0 Journal Article %J Am J Physiol Heart Circ Physiol %D 2013 %T Intrauterine growth restriction is associated with cardiac ultrastructural and gene expression changes related to the energetic metabolism in a rabbit model. %A González-Tendero, Anna %A Torre, Iratxe %A García-Cañadilla, Patricia %A Crispi, Fátima %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Bijnens, Bart %A Gratacós, Eduard %K Animals %K Disease Models, Animal %K Energy Metabolism %K Female %K Fetal Growth Retardation %K gene expression %K Mitochondria %K Myocardium %K Oxidative Phosphorylation %K Placenta %K Pregnancy %K Rabbits %X

Intrauterine growth restriction (IUGR) affects 7-10% of pregnancies and is associated with cardiovascular remodeling and dysfunction, which persists into adulthood. The underlying subcellular remodeling and cardiovascular programming events are still poorly documented. Cardiac muscle is central in the fetal adaptive mechanism to IUGR given its high energetic demands. The energetic homeostasis depends on the correct interaction of several molecular pathways and the adequate arrangement of intracellular energetic units (ICEUs), where mitochondria interact with the contractile machinery and the main cardiac ATPases to enable a quick and efficient energy transfer. We studied subcellular cardiac adaptations to IUGR in an experimental rabbit model. We evaluated the ultrastructure of ICEUs with transmission electron microscopy and observed an altered spatial arrangement in IUGR, with significant increases in cytosolic space between mitochondria and myofilaments. A global decrease of mitochondrial density was also observed. In addition, we conducted a global gene expression profile by advanced bioinformatics tools to assess the expression of genes involved in the cardiomyocyte energetic metabolism and identified four gene modules with a coordinated over-representation in IUGR: oxygen homeostasis (GO: 0032364), mitochondrial respiratory chain complex I (GO:0005747), oxidative phosphorylation (GO: 0006119), and NADH dehydrogenase activity (GO:0003954). These findings might contribute to changes in energetic homeostasis in IUGR. The potential persistence and role of these changes in long-term cardiovascular programming deserves further investigation.

%B Am J Physiol Heart Circ Physiol %V 305 %P H1752-60 %8 2013 Dec %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/24097427?dopt=Abstract %R 10.1152/ajpheart.00514.2013 %0 Journal Article %J PLoS One %D 2013 %T Mammosphere formation in breast carcinoma cell lines depends upon expression of E-cadherin. %A Manuel Iglesias, Juan %A Beloqui, Izaskun %A Garcia-Garcia, Francisco %A Leis, Olatz %A Vazquez-Martin, Alejandro %A Eguiara, Arrate %A Cufi, Silvia %A Pavon, Andres %A Menendez, Javier A %A Dopazo, Joaquin %A Martin, Angel G %K Breast Neoplasms %K Cadherins %K Cell Line, Tumor %K Cell Proliferation %K Cluster Analysis %K Female %K gene expression %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Gene Knockdown Techniques %K Humans %K MCF-7 Cells %K Neoplastic Stem Cells %K Spheroids, Cellular %K Tumor Cells, Cultured %X

Tumors are heterogeneous at the cellular level where the ability to maintain tumor growth resides in discrete cell populations. Floating sphere-forming assays are broadly used to test stem cell activity in tissues, tumors and cell lines. Spheroids originate from a small population of cells with stem cell features that are able to grow in suspension culture and behave as tumorigenic in mice. We tested the ability of eleven common breast cancer cell lines representing the major breast cancer subtypes to grow as mammospheres, measuring the ability to maintain cell viability upon serial non-adherent passage. Only MCF7, T47D, BT474, MDA-MB-436 and JIMT1 were successfully propagated as long-term mammosphere cultures, measured as the increase in the number of viable cells upon serial non-adherent passages. Other cell lines tested (SKBR3, MDA-MB-231, MDA-MB-468 and MDA-MB-435) formed cell clumps that could be disaggregated mechanically, but cell viability dropped dramatically on their second passage. HCC1937 and HCC1569 cells formed typical mammospheres, although they could not be propagated as long-term mammosphere cultures. All the sphere-forming lines but MDA-MB-436 express E-cadherin on their surface. Knockdown of E-cadherin expression in MCF-7 cells abrogated their ability to grow as mammospheres, while re-expression of E-cadherin in SKBR3 cells allowed them to form mammospheres. Therefore, the mammosphere assay is suitable to reveal stem-like features in breast cancer cell lines that express E-cadherin.

%B PLoS One %V 8 %P e77281 %8 2013 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/24124614?dopt=Abstract %R 10.1371/journal.pone.0077281 %0 Journal Article %J PLoS ONE %D 2013 %T Mammosphere Formation in Breast Carcinoma Cell Lines Depends upon Expression of E-cadherin %A Manuel Iglesias, Juan %A Beloqui, Izaskun %A Garcia-Garcia, Francisco %A Leis, Olatz %A Vazquez-Martin, Alejandro %A Eguiara, Arrate %A Cufi, Silvia %A Pavon, Andres %A Menendez, Javier A. %A Dopazo, Joaquin %A Martin, Angel G. %X

Tumors are heterogeneous at the cellular level where the ability to maintain tumor growth resides in discrete cell populations. Floating sphere-forming assays are broadly used to test stem cell activity in tissues, tumors and cell lines. Spheroids originate from a small population of cells with stem cell features that are able to grow in suspension culture and behave as tumorigenic in mice. We tested the ability of eleven common breast cancer cell lines representing the major breast cancer subtypes to grow as mammospheres, measuring the ability to maintain cell viability upon serial non-adherent passage. Only MCF7, T47D, BT474, MDA-MB-436 and JIMT1 were successfully propagated as long-term mammosphere cultures, measured as the increase in the number of viable cells upon serial non-adherent passages. Other cell lines tested (SKBR3, MDA-MB-231, MDA-MB-468 and MDA-MB-435) formed cell clumps that could be disaggregated mechanically, but cell viability dropped dramatically on their second passage. HCC1937 and HCC1569 cells formed typical mammospheres, although they could not be propagated as long-term mammosphere cultures. All the sphere-forming lines but MDA-MB-436 express E-cadherin on their surface. Knockdown of E-cadherin expression in MCF-7 cells abrogated their ability to grow as mammospheres, while re-expression of E-cadherin in SKBR3 cells allowed them to form mammospheres. Therefore, the mammosphere assay is suitable to reveal stem-like features in breast cancer cell lines that express E-cadherin.

%B PLoS ONE %I Public Library of Science %V 8 %P e77281 - %8 2013/10/04 %G eng %U http://dx.doi.org/10.1371%2Fjournal.pone.0077281 %0 Journal Article %J PloS one %D 2013 %T Maslinic Acid-Enriched Diet Decreases Intestinal Tumorigenesis in Apc(Min/+) Mice through Transcriptomic and Metabolomic Reprogramming. %A Sánchez-Tena, Susana %A Reyes-Zurita, Fernando J %A Díaz-Moralli, Santiago %A Vinardell, Maria Pilar %A Reed, Michelle %A Garcia-Garcia, Francisco %A Joaquín Dopazo %A Lupiáñez, José A %A Günther, Ulrich %A Cascante, Marta %X Chemoprevention is a pragmatic approach to reduce the risk of colorectal cancer, one of the leading causes of cancer-related death in western countries. In this regard, maslinic acid (MA), a pentacyclic triterpene extracted from wax-like coatings of olives, is known to inhibit proliferation and induce apoptosis in colon cancer cell lines without affecting normal intestinal cells. The present study evaluated the chemopreventive efficacy and associated mechanisms of maslinic acid treatment on spontaneous intestinal tumorigenesis in Apc(Min/+) mice. Twenty-two mice were randomized into 2 groups: control group and MA group, fed with a maslinic acid-supplemented diet for six weeks. MA treatment reduced total intestinal polyp formation by 45% (P<0.01). Putative molecular mechanisms associated with suppressing intestinal polyposis in Apc(Min/+) mice were investigated by comparing microarray expression profiles of MA-treated and control mice and by analyzing the serum metabolic profile using NMR techniques. The different expression phenotype induced by MA suggested that it exerts its chemopreventive action mainly by inhibiting cell-survival signaling and inflammation. These changes eventually induce G1-phase cell cycle arrest and apoptosis. Moreover, the metabolic changes induced by MA treatment were associated with a protective profile against intestinal tumorigenesis. 
These results show the efficacy and underlying mechanisms of MA against intestinal tumor development in the Apc(Min/+) mice model, suggesting its chemopreventive potential against colorectal cancer. %B PloS one %V 8 %P e59392 %8 2013 %G eng %R 10.1371/journal.pone.0059392 %0 Conference Paper %B Proceedings of the 18th International Conference on Parallel Processing Workshops %D 2013 %T Multicore and Cloud-based Solutions for Genomic Variant Analysis %A Gonzalez, Cristina Y. %A Bleda, Marta %A Salavert, Francisco %A Sánchez, Rubén %A Dopazo, Joaquin %A Medina, Ignacio %K genomic variant analysis %K multicore %K mutation %K OpenMP %K web service %B Proceedings of the 18th International Conference on Parallel Processing Workshops %I Springer-Verlag %C Berlin, Heidelberg %@ 978-3-642-36948-3 %G eng %U http://dx.doi.org/10.1007/978-3-642-36949-0_30 %R 10.1007/978-3-642-36949-0_30 %0 Journal Article %J Clin Chim Acta %D 2013 %T Novel genes detected by transcriptional profiling from whole-blood cells in patients with early onset of acute coronary syndrome. %A Silbiger, Vivian N %A Luchessi, André D %A Hirata, Rosário D C %A Lima-Neto, Lídio G %A Cavichioli, Débora %A Carracedo, Ángel %A Brión, Maria %A Dopazo, Joaquin %A Garcia-Garcia, Francisco %A Dos Santos, Elizabete S %A Ramos, Rui F %A Sampaio, Marcelo F %A Armaganijan, Dikran %A Sousa, Amanda G M R %A Hirata, Mario H %K Acute Coronary Syndrome %K Acute-Phase Proteins %K Adult %K biomarkers %K Blood Cells %K Early Diagnosis %K gene expression %K Gene Expression Profiling %K Humans %K Male %K Middle Aged %K Oligonucleotide Array Sequence Analysis %K RNA, Messenger %K Transcriptome %X

BACKGROUND: Genome-wide expression analysis using microarrays has been used as a research strategy to discover new biomarkers and candidate genes for a number of diseases. We aim to find new biomarkers for the prediction of acute coronary syndrome (ACS) with a differentially expressed mRNA profiling approach using whole genomic expression analysis in a peripheral blood cell model from patients with early ACS.

METHODS AND RESULTS: This study was carried out in two phases. In phase 1, restricted clinical criteria (ACS-Ph1, n=9 and CG-Ph1, n=6) were used to select potential mRNA biomarker candidates. A subsequent phase 2 study was performed on selected phase 1 markers, analyzed by RT-qPCR in a larger, independent case series (ACS-Ph2, n=74 and CG-Ph2, n=41). A total of 549 genes were found to be differentially expressed in the first 48 h after the ACS-Ph1. Technical and biological validation further confirmed that ALOX15, AREG, BCL2A1, BCL2L1, CA1, COX7B, ECHDC3, IL18R1, IRS2, KCNE1, MMP9, MYL4 and TREML4 are differentially expressed in both phases of this study.

CONCLUSIONS: Transcriptomic analysis by microarray technology demonstrated differential expression during a 48 h time course suggesting a potential use of some of these genes as biomarkers for very early stages of ACS, as well as for monitoring early cardiac ischemic recovery.

%B Clin Chim Acta %V 421 %P 184-90 %8 2013 Jun 05 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/23535507?dopt=Abstract %R 10.1016/j.cca.2013.03.011 %0 Journal Article %J Orphanet journal of rare diseases %D 2013 %T Pathways systematically associated to Hirschsprung’s disease. %A Fernández, Raquel M %A Bleda, Marta %A Luzón-Toro, Berta %A García-Alonso, Luz %A Arnold, Stacey %A Sribudiani, Yunia %A Besmond, Claude %A Lantieri, Francesca %A Doan, Betty %A Ceccherini, Isabella %A Lyonnet, Stanislas %A Hofstra, Robert Mw %A Chakravarti, Aravinda %A Antiňolo, Guillermo %A Joaquín Dopazo %A Borrego, Salud %K GWAS %K Hirschprung %K network analysis %K Pathway Based Analysis %X Although several loci have been reported to be involved in Hirschsprung’s disease, the molecular basis of the disease remains essentially unknown. The study of collective properties of modules of functionally related genes provides an efficient and sensitive statistical framework that can overcome sample size limitations in the study of rare diseases. Here, we present the extension of a previous study of a Spanish series of HSCR trios to an international cohort of 162 HSCR trios to validate the generality of the underlying functional basis of the Hirschsprung’s disease mechanisms previously found. The Pathway-Based Analysis (PBA) confirms a strong association of gene ontology (GO) modules related to signal transduction and its regulation, enteric nervous system (ENS) formation and other processes related to the disease. In addition, network analysis recovers sub-networks significantly associated with the disease, which contain genes related to the same functionalities, thus providing an independent validation of these findings. The functional profiles of association obtained for patient populations from different countries were compared to each other. While gene associations differed between series, the main functional associations were identical in all five populations. 
These observations would also explain the reported low reproducibility of associations of individual disease genes across populations. %B Orphanet journal of rare diseases %V 8 %P 187 %8 2013 Dec 2 %G eng %U http://www.ojrd.com/content/8/1/187/abstract %R 10.1186/1750-1172-8-187 %0 Journal Article %J Orphanet J Rare Dis %D 2013 %T Pathways systematically associated to Hirschsprung's disease. %A Fernández, Raquel M %A Bleda, Marta %A Luzón-Toro, Berta %A García-Alonso, Luz %A Arnold, Stacey %A Sribudiani, Yunia %A Besmond, Claude %A Lantieri, Francesca %A Doan, Betty %A Ceccherini, Isabella %A Lyonnet, Stanislas %A Hofstra, Robert Mw %A Chakravarti, Aravinda %A Antiňolo, Guillermo %A Dopazo, Joaquin %A Borrego, Salud %K Female %K Genetic Predisposition to Disease %K Genotype %K Hirschsprung Disease %K Humans %K Male %K Polymorphism, Single Nucleotide %X

Although several loci have been reported to be involved in Hirschsprung's disease, the molecular basis of the disease remains essentially unknown. The study of collective properties of modules of functionally related genes provides an efficient and sensitive statistical framework that can overcome sample size limitations in the study of rare diseases. Here, we present the extension of a previous study of a Spanish series of HSCR trios to an international cohort of 162 HSCR trios to validate the generality of the underlying functional basis of the Hirschsprung's disease mechanisms previously found. The Pathway-Based Analysis (PBA) confirms a strong association of gene ontology (GO) modules related to signal transduction and its regulation, enteric nervous system (ENS) formation and other processes related to the disease. In addition, network analysis recovers sub-networks significantly associated with the disease, which contain genes related to the same functionalities, thus providing an independent validation of these findings. The functional profiles of association obtained for patient populations from different countries were compared to each other. While gene associations differed between series, the main functional associations were identical in all five populations. These observations would also explain the reported low reproducibility of associations of individual disease genes across populations.

%B Orphanet J Rare Dis %V 8 %P 187 %8 2013 Dec 02 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/24289864?dopt=Abstract %R 10.1186/1750-1172-8-187 %0 Journal Article %J Exp Dermatol %D 2013 %T Role of CPI-17 in restoring skin homoeostasis in cutaneous field of cancerization: effects of topical application of a film-forming medical device containing photolyase and UV filters. %A Puig-Butille, Joan Anton %A Malvehy, Josep %A Potrony, Miriam %A Trullas, Carles %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Puig, Susana %K Administration, Topical %K Adult %K Aged %K Aged, 80 and over %K Biopsy %K Deoxyribodipyrimidine Photo-Lyase %K Female %K Gene Expression Profiling %K Gene Expression Regulation, Enzymologic %K Gene Expression Regulation, Neoplastic %K Homeostasis %K Humans %K Inflammation %K Intracellular Signaling Peptides and Proteins %K Liposomes %K Male %K Middle Aged %K Muscle Proteins %K Phenotype %K Phosphoprotein Phosphatases %K Reactive Oxygen Species %K Skin %K Skin Neoplasms %K Ultraviolet Rays %X

Cutaneous field of cancerization (CFC) is caused in part by the carcinogenic effect of cyclobutane pyrimidine dimers (CPDs) and 6-4 photoproducts (6-4PPs). Photoreactivation is carried out by photolyases, which specifically recognize and repair both photoproducts. The study evaluates the molecular effects of topical application of a film-forming medical device containing photolyase and UV filters on the precancerous field in actinic keratosis (AK) from seven patients. Skin improvement after treatment was confirmed in all patients by histopathological and molecular assessment. A gene set analysis showed that skin recovery was associated with biological processes involved in tissue homoeostasis and cell maintenance. The CFC response was associated with over-expression of the CPI-17 gene, and a dependence on the initial expression level was observed (P = 0.001). Low CPI-17 levels were directly associated with pro-inflammatory genes such as TNF (P = 0.012) and IL-1B (P = 0.07). Our results suggest a role for CPI-17 in restoring skin homoeostasis in CFC lesions.

%B Exp Dermatol %V 22 %P 494-6 %8 2013 Jul %G eng %N 7 %1 https://www.ncbi.nlm.nih.gov/pubmed/23800065?dopt=Abstract %R 10.1111/exd.12177 %0 Journal Article %J Molecular vision %D 2013 %T Whole-exome sequencing identifies novel compound heterozygous mutations in USH2A in Spanish patients with autosomal recessive retinitis pigmentosa. %A Méndez-Vidal, Cristina %A González-del Pozo, María %A Vela-Boza, Alicia %A Santoyo-López, Javier %A López-Domingo, Francisco J %A Vázquez-Marouschek, Carmen %A Dopazo, Joaquin %A Borrego, Salud %A Antiňolo, Guillermo %X PURPOSE: Retinitis pigmentosa (RP) is an inherited retinal dystrophy characterized by extreme genetic and clinical heterogeneity. Thus, the diagnosis is not always easily performed due to phenotypic and genetic overlap. Current clinical practices have focused on the systematic evaluation of a set of known genes for each phenotype, but this approach may fail in patients with inaccurate diagnosis or infrequent genetic cause. In the present study, we investigated the genetic cause of autosomal recessive RP (arRP) in a Spanish family in which the causal mutation had not yet been identified with primer extension technology and resequencing. METHODS: We designed a whole-exome sequencing (WES)-based approach using the NimbleGen SeqCap EZ Exome V3 sample preparation kit and the SOLiD 5500xl next-generation sequencing platform. We sequenced the exomes of both unaffected parents and two affected siblings. Exome analysis resulted in the identification of 43,204 variants in the index patient. All variants passing filter criteria were validated with Sanger sequencing to confirm familial segregation and absence in the control population. In silico prediction tools were used to determine mutational impact on protein function and the structure of the identified variants. 
RESULTS: Novel Usher syndrome type 2A (USH2A) compound heterozygous mutations, c.4325T>C (p.F1442S) and c.15188T>G (p.L5063R), located in exons 20 and 70, respectively, were identified as probable causative mutations for RP in this family. Family segregation of the variants showed the presence of both mutations in all affected members and in two siblings who were apparently asymptomatic at the time of family ascertainment. Clinical reassessment confirmed the diagnosis of RP in these patients. CONCLUSIONS: Using WES, we identified two heterozygous novel mutations in USH2A as the most likely disease-causing variants in a Spanish family diagnosed with arRP in which the cause of the disease had not yet been identified with commonly used techniques. Our data reinforce the clinical role of WES in the molecular diagnosis of highly heterogeneous genetic diseases where conventional genetic approaches have previously failed in achieving a proper diagnosis. %B Molecular vision %V 19 %P 2187-95 %8 2013 %G eng %U http://www.molvis.org/molvis/v19/2187/ %0 Journal Article %J Nucleic acids research %D 2012 %T CellBase, a comprehensive collection of RESTful web services for retrieving relevant biological information from heterogeneous sources. %A Bleda, Marta %A Tárraga, Joaquín %A De Maria, Alejandro %A Salavert, Francisco %A García-Alonso, Luz %A Celma, Matilde %A Martin, Ainoha %A Dopazo, Joaquin %A Medina, Ignacio %X During the past years, the advances in high-throughput technologies have produced an unprecedented growth in the number and size of repositories and databases storing relevant biological data. Today, there is more biological information than ever but, unfortunately, the current status of many of these repositories is far from being optimal. 
Some of the most common problems are that the information is spread out across many small databases; different standards are frequently used among repositories, and some databases are no longer supported or contain information that is too specific and unconnected. In addition, data size is increasingly becoming an obstacle when accessing or storing biological data. All these issues make it very difficult to extract and integrate information from different sources, to analyze experiments or to access and query this information in a programmatic way. CellBase provides a solution to the growing need for integration by easing access to biological data. CellBase implements a set of RESTful web services that query a centralized database containing the most relevant biological data sources. The database is hosted on our servers and is regularly updated. CellBase documentation can be found at http://docs.bioinfo.cipf.es/projects/cellbase. %B Nucleic acids research %V 40 %P W609-14 %8 2012 Jul %G eng %U http://nar.oxfordjournals.org/content/40/W1/W609.long %R 10.1093/nar/gks575 %0 Journal Article %J PloS one %D 2012 %T Development, Characterization and Experimental Validation of a Cultivated Sunflower (Helianthus annuus L.) Gene Expression Oligonucleotide Microarray. %A Fernandez, Paula %A Soria, Marcelo %A Blesa, David %A Dirienzo, Julio %A Moschen, Sebastián %A Rivarola, Máximo %A Clavijo, Bernardo Jose %A Gonzalez, Sergio %A Peluffo, Lucila %A Príncipi, Dario %A Dosio, Guillermo %A Aguirrezabal, Luis %A Garcia-Garcia, Francisco %A Ana Conesa %A Hopp, Esteban %A Joaquín Dopazo %A Heinz, Ruth Amelia %A Paniego, Norma %X Oligonucleotide-based microarrays with accurate gene coverage represent a key strategy for transcriptional studies in orphan species such as sunflower, H. annuus L., which lacks full genome sequences. 
The goal of this study was the development and functional annotation of a comprehensive sunflower unigene collection and the design and validation of a custom sunflower oligonucleotide-based microarray. A large scale EST (>130,000 ESTs) curation, assembly and sequence annotation was performed using Blast2GO (www.blast2go.de). The EST assembly comprises 41,013 putative transcripts (12,924 contigs and 28,089 singletons). The resulting Sunflower Unigen Resource (SUR version 1.0) was used to design an oligonucleotide-based Agilent microarray for cultivated sunflower. This microarray includes a total of 42,326 features: 1,417 Agilent controls, 74 control probes for sunflower replicated 10 times (740 controls) and 40,169 different non-control probes. Microarray performance was validated using a model experiment examining the induction of senescence by water deficit. Pre-processing and differential expression analysis of Agilent microarrays was performed using the Bioconductor limma package. The analyses based on p-values calculated by eBayes (p<0.01) allowed the detection of 558 differentially expressed genes between water stress and control conditions; from these, ten genes were further validated by qPCR. Over-represented ontologies were identified using FatiScan in the Babelomics suite. This work generated a curated and trustable sunflower unigene collection, and a custom, validated sunflower oligonucleotide-based microarray using Agilent technology. Both the curated unigene collection and the validated oligonucleotide microarray provide key resources for sunflower genome analysis, transcriptional studies, and molecular breeding for crop improvement. %B PloS one %V 7 %P e45899 %8 2012 %G eng %U http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0045899 %R 10.1371/journal.pone.0045899 %0 Journal Article %J Nucleic Acids Res %D 2012 %T Discovering the hidden sub-network component in a ranked list of genes or proteins derived from genomic experiments. 
%A García-Alonso, Luz %A Alonso, Roberto %A Vidal, Enrique %A Amadoz, Alicia %A De Maria, Alejandro %A Minguez, Pablo %A Medina, Ignacio %A Dopazo, Joaquin %K Bipolar Disorder %K Fanconi Anemia %K Gene Regulatory Networks %K Genes, Neoplasm %K Genome-Wide Association Study %K Genomics %K Humans %K Protein Interaction Mapping %X

Genomic experiments (e.g. differential gene expression, single-nucleotide polymorphism association) typically produce ranked lists of genes. We present a simple but powerful approach which uses protein-protein interaction data to detect sub-networks within such ranked lists of genes or proteins. We performed an exhaustive study of network parameters that allowed us to conclude that the average number of components and the average number of nodes per component are the parameters that best discriminate between real and random networks. A novel aspect that increases the efficiency of this strategy in finding sub-networks is that, in addition to direct connections, connections mediated by intermediate nodes are also considered when building the sub-networks. The possibility of using such intermediate nodes makes this approach more robust to noise. It also overcomes some limitations intrinsic to experimental designs based on differential expression, in which some nodes are invariant across conditions. The proposed approach can also be used for candidate disease-gene prioritization. Here, we demonstrate the usefulness of the approach by means of several case examples that include a differential expression analysis in Fanconi Anemia, a genome-wide association study of bipolar disorder and a genome-scale study of essentiality in cancer genes. An efficient and easy-to-use web interface (available at http://www.babelomics.org) based on HTML5 technologies is also provided to run the algorithm and represent the network.

%B Nucleic Acids Res %V 40 %P e158 %8 2012 Nov 01 %G eng %N 20 %1 https://www.ncbi.nlm.nih.gov/pubmed/22844098?dopt=Abstract %R 10.1093/nar/gks699 %0 Journal Article %J BMC Evol Biol %D 2012 %T Diversification of the expanded teleost-specific toll-like receptor family in Atlantic cod, Gadus morhua. %A Sundaram, Arvind Y M %A Kiron, Viswanath %A Dopazo, Joaquin %A Fernandes, Jorge M O %K Amino Acid Sequence %K Animals %K Binding Sites %K Evolution, Molecular %K Fish Diseases %K Fish Proteins %K Gadus morhua %K Gene Expression Profiling %K Genetic Variation %K Gills %K Head Kidney %K Host-Pathogen Interactions %K Models, Molecular %K Molecular Sequence Data %K Multigene Family %K Phylogeny %K Protein Structure, Tertiary %K Reverse Transcriptase Polymerase Chain Reaction %K Selection, Genetic %K Sequence Analysis, DNA %K Sequence Homology, Amino Acid %K Temperature %K Toll-Like Receptors %K Vibrio %X

BACKGROUND: Toll-like receptors (Tlrs) are major molecular pattern recognition receptors of the innate immune system. Atlantic cod (Gadus morhua) is the first vertebrate known to have lost most of the mammalian Tlr orthologues, particularly all bacterial recognising and other cell surface Tlrs. On the other hand, its genome encodes a unique repertoire of teleost-specific Tlrs. The aim of this study was to investigate if these duplicate Tlrs have been retained through adaptive evolution to compensate for the lack of other cell surface Tlrs in the cod genome.

RESULTS: In this study, one tlr21, 12 tlr22 and two tlr23 genes representing the teleost-specific Tlr family have been cloned and characterised in cod. Phylogenetic analysis grouped all tlr22 genes under a single clade, indicating that the multiple cod paralogues have arisen through lineage-specific duplications. All tlrs examined were transcribed in immune-related tissues as well as in stomach, gut and gonads of adult cod and were differentially expressed during early development. These tlrs were also differentially regulated following immune challenge by immersion with Vibrio anguillarum, indicating their role in the immune response. An increase in water temperature from 4 to 12°C was associated with a 5.5-fold down-regulation of tlr22d transcript levels in spleen. Maximum likelihood analysis with different evolution models revealed that tlr22 genes are under positive selection. A total of 24 codons were found to be positively selected, of which 19 are in the ligand binding region of ectodomain.

CONCLUSION: Positive selection pressure coupled with experimental evidence of differential expression strongly support the hypothesis that teleost-specific tlr paralogues in cod are undergoing neofunctionalisation and can recognise bacterial pathogen-associated molecular patterns to compensate for the lack of other cell surface Tlrs.

%B BMC Evol Biol %V 12 %P 256 %8 2012 Dec 29 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/23273344?dopt=Abstract %R 10.1186/1471-2148-12-256 %0 Journal Article %J Evolutionary bioinformatics online %D 2012 %T Evolutionary Genomics of Genes Involved in Olfactory Behavior in the Drosophila melanogaster Species Group. %A Lavagnino, Nicolás %A Serra, François %A Arbiza, Leonardo %A Dopazo, Hernán %A Hasson, Esteban %X Previous comparative genomic studies of genes involved in olfactory behavior in Drosophila focused only on particular gene families such as odorant receptor and/or odorant binding proteins. However, olfactory behavior has a complex genetic architecture that is orchestrated by many interacting genes. In this paper, we present a comparative genomic study of olfactory behavior in Drosophila including an extended set of genes known to affect olfactory behavior. We took advantage of the recent burst of whole genome sequences and the development of powerful statistical tools to analyze genomic data and test evolutionary and functional hypotheses of olfactory genes in the six species of the Drosophila melanogaster species group for which whole genome sequences are available. Our study reveals widespread purifying selection and limited incidence of positive selection on olfactory genes. We show that the pace of evolution of olfactory genes is mostly independent of the life cycle stage, and of the number of life cycle stages, in which they participate in olfaction. However, we detected a relationship between evolutionary rates and the position that the gene products occupy in the olfactory system, genes occupying central positions tend to be more constrained than peripheral genes. Finally, we demonstrate that specialization to one host does not seem to be associated with bursts of adaptive evolution in olfactory genes in D. sechellia and D. 
erecta, the two specialists species analyzed, but rather different lineages have idiosyncratic evolutionary histories in which both historical and ecological factors have been involved. %B Evolutionary bioinformatics online %V 8 %P 89-104 %8 2012 %G eng %U http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3273929/?tool=pubmed %R 10.4137/EBO.S8484 %0 Journal Article %J International journal of cancer. Journal international du cancer %D 2012 %T Expression profiling shows differential molecular pathways and provides potential new diagnostic biomarkers for colorectal serrated adenocarcinoma. %A Conesa-Zamora, Pablo %A García-Solano, José %A Garcia-Garcia, Francisco %A Del Carmen Turpin, María %A Trujillo-Santos, Javier %A Torres-Moreno, Daniel %A Oviedo-Ramírez, Isabel %A Carbonell-Muñoz, Rosa %A Muñoz-Delgado, Encarnación %A Rodriguez-Braun, Edith %A Ana Conesa %A Pérez-Guillermo, Miguel %X Serrated adenocarcinoma (SAC) is a recently recognized colorectal cancer (CRC) subtype accounting for 7.5-8.7% of CRCs. It has been shown that SAC has a poorer prognosis and has different molecular and immunohistochemical features compared to conventional carcinoma (CC) but, to date, only one previous study has analysed its mRNA expression profile by microarray. Using a different microarray platform, we have studied the molecular signature of 11 SACs and compared it with that of 15 matched CC with the aim of discerning the functions which characterize SAC biology and validating, at the mRNA and protein level, the most differentially expressed genes which were also tested using a validation set of 70 SACs and 70 CCs to assess their diagnostic and prognostic values. 
Microarray data showed a higher representation of morphogenesis-, hypoxia-, cytoskeleton- and vesicle transport-related functions and also an over-expression of fascin1 (actin-bundling protein associated with invasion) and the antiapoptotic gene hippocalcin in SAC all of which were validated both by qPCR and immunohistochemistry. Fascin1 expression was statistically associated with KRAS mutation with 88.6% sensitivity and 85.7% specificity for SAC diagnosis and the positivity of fascin1 or hippocalcin was highly suggestive of SAC diagnosis (sensitivity=100%). Evaluation of these markers in CRCs showing histological and molecular characteristics of high-level microsatellite instability (MSI-H) also helped to distinguish SACs from MSI-H CRCs. Molecular profiling demonstrates that SAC shows activation of distinct signalling pathways and that immunohistochemical fascin1 and hippocalcin expression can be reliably used for its differentiation from other CRC subtypes. © 2012 Wiley Periodicals, Inc. %B International journal of cancer. Journal international du cancer %8 2012 Jun 14 %G eng %R 10.1002/ijc.27674 %0 Journal Article %J PLoS One %D 2012 %T Extensive translatome remodeling during ER stress response in mammalian cells. %A Ventoso, Iván %A Kochetov, Alex %A Montaner, David %A Dopazo, Joaquin %A Santoyo, Javier %K Animals %K Endoplasmic Reticulum Stress %K Humans %K Jurkat Cells %K Mice %K NIH 3T3 Cells %K Oligonucleotide Array Sequence Analysis %K Protein Biosynthesis %K RNA, Messenger %K Transcription, Genetic %X

In this work we have described the translatome of two mammalian cell lines, NIH3T3 and Jurkat, by scoring the relative polysome association of ∼10,000 mRNA under normal and ER stress conditions. We have found that translation efficiencies of mRNA correlated poorly with transcript abundance, although a general tendency was observed for the highest translation efficiencies to be found in abundant mRNA. Despite the differences found between mouse (NIH3T3) and human (Jurkat) cells, both cell types share a common translatome composed of ∼800-900 mRNA that encode proteins involved in basic cellular functions. Upon stress, an extensive remodeling in translatomes was observed so that translation of ∼50% of mRNA was inhibited in both cell types, this effect being more dramatic for those mRNA that accounted for most of the cell translation. Interestingly, we found two subsets comprising 1000-1500 mRNA whose translation resisted stress or was induced by it. The translation arrest-resistant class includes many mRNA encoding aminoacyl tRNA synthetases, ATPases and enzymes involved in DNA replication and stress response such as BiP. This class of mRNA is characterized by high translation rates in both control and stress conditions. The translation-inducible class includes mRNA whose translation was relieved after stress, showing a high enrichment in early response transcription factors of bZIP and zinc finger C2H2 classes. Unlike in yeast, a general coordination between changes in translation and transcription upon stress (potentiation) was not observed in mammalian cells. Among the different features of mRNA analyzed, we found a relevant association of translation efficiency with the presence of upstream ATG in the 5'UTR and with the length of coding sequence of mRNA, and a looser association with other parameters such as the length and the G+C content of 5'UTR. A model for translatome remodeling during the acute phase of stress response in mammalian cells is proposed.

%B PLoS One %V 7 %P e35915 %8 2012 %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/22574127?dopt=Abstract %R 10.1371/journal.pone.0035915 %0 Journal Article %J Orphanet journal of rare diseases %D 2012 %T Four new loci associations discovered by pathway-based and network analyses of the genome-wide variability profile of Hirschsprung’s disease. %A Fernández, Raquel Ma %A Bleda, Marta %A Núñez-Torres, Rocío %A Medina, Ignacio %A Luzón-Toro, Berta %A García-Alonso, Luz %A Torroglosa, Ana %A Marbà, Martina %A Enguix-Riego, Ma Valle %A Montaner, David %A Antiňolo, Guillermo %A Joaquín Dopazo %A Borrego, Salud %X Finding gene associations in rare diseases is frequently hampered by the small number of patients available. Conventional gene-based association tests rely on the availability of large cohorts, which constitutes a serious limitation for their application in this scenario. To overcome this problem we have used here a combined strategy in which a pathway-based analysis (PBA) has been initially conducted to prioritize candidate genes in a Spanish cohort of 53 trios of short-segment Hirschsprung’s disease. Candidate genes have been further validated in an independent population of 106 trios. The study revealed a strong association of 11 gene ontology (GO) modules related to signal transduction and its regulation, enteric nervous system (ENS) formation and other HSCR-related processes. Among the preselected candidates, a total of 4 loci, RASGEF1A, IQGAP2, DLC1 and CHRNA7, related to signal transduction and migration processes, were found to be significantly associated with HSCR. Network analysis also confirms their involvement in the network of already known disease genes. This approach, based on the study of functionally related gene sets, requires smaller sample sizes and opens new opportunities for the study of rare diseases. 
%B Orphanet journal of rare diseases %V 7 %P 103 %8 2012 Dec 28 %G eng %U http://www.ojrd.com/content/7/1/103/abstract %R 10.1186/1750-1172-7-103 %0 Journal Article %J Orphanet J Rare Dis %D 2012 %T Four new loci associations discovered by pathway-based and network analyses of the genome-wide variability profile of Hirschsprung's disease. %A Fernández, Raquel Ma %A Bleda, Marta %A Núñez-Torres, Rocío %A Medina, Ignacio %A Luzón-Toro, Berta %A García-Alonso, Luz %A Torroglosa, Ana %A Marbà, Martina %A Enguix-Riego, Ma Valle %A Montaner, David %A Antiňolo, Guillermo %A Dopazo, Joaquin %A Borrego, Salud %K Female %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Genotype %K Hirschsprung Disease %K Humans %K Male %X

Finding gene associations in rare diseases is frequently hampered by the small number of accessible patients. Conventional gene-based association tests rely on the availability of large cohorts, which constitutes a serious limitation for their application in this scenario. To overcome this problem we have used here a combined strategy in which a pathway-based analysis (PBA) has been initially conducted to prioritize candidate genes in a Spanish cohort of 53 trios of short-segment Hirschsprung's disease (HSCR). Candidate genes have been further validated in an independent population of 106 trios. The study revealed a strong association of 11 gene ontology (GO) modules related to signal transduction and its regulation, enteric nervous system (ENS) formation and other HSCR-related processes. Among the preselected candidates, a total of 4 loci, RASGEF1A, IQGAP2, DLC1 and CHRNA7, related to signal transduction and migration processes, were found to be significantly associated with HSCR. Network analysis also confirms their involvement in the network of already known disease genes. This approach, based on the study of functionally-related gene sets, requires smaller sample sizes and opens new opportunities for the study of rare diseases.

%B Orphanet J Rare Dis %V 7 %P 103 %8 2012 Dec 28 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/23270508?dopt=Abstract %R 10.1186/1750-1172-7-103 %0 Journal Article %J BMC genomics %D 2012 %T Identification of yeast genes that confer resistance to chitosan oligosaccharide (COS) using chemogenomics. %A Jaime, María D L A %A Lopez-Llorca, Luis Vicente %A Ana Conesa %A Lee, Anna Y %A Proctor, Michael %A Heisler, Lawrence E %A Gebbia, Marinella %A Giaever, Guri %A Westwood, J Timothy %A Nislow, Corey %X BACKGROUND: Chitosan oligosaccharide (COS), a deacetylated derivative of chitin, is an abundant, and renewable natural polymer. COS has higher antimicrobial properties than chitosan and is presumed to act by disrupting/permeabilizing the cell membranes of bacteria, yeast and fungi. COS is relatively non-toxic to mammals. By identifying the molecular and genetic targets of COS, we hope to gain a better understanding of the antifungal mode of action of COS. RESULTS: Three different chemogenomic fitness assays, haploinsufficiency (HIP), homozygous deletion (HOP), and multicopy suppression (MSP) profiling were combined with a transcriptomic analysis to gain insight in to the mode of action and mechanisms of resistance to chitosan oligosaccharides. The fitness assays identified 39 yeast deletion strains sensitive to COS and 21 suppressors of COS sensitivity. The genes identified are involved in processes such as RNA biology (transcription, translation and regulatory mechanisms), membrane functions (e.g. signalling, transport and targeting), membrane structural components, cell division, and proteasome processes. The transcriptomes of control wild type and 5 suppressor strains overexpressing ARL1, BCK2, ERG24, MSG5, or RBA50, were analyzed in the presence and absence of COS. Some of the up-regulated transcripts in the suppressor overexpressing strains exposed to COS included genes involved in transcription, cell cycle, stress response and the Ras signal transduction pathway. 
Down-regulated transcripts included those encoding protein folding components and respiratory chain proteins. The COS-induced transcriptional response is distinct from previously described environmental stress responses (i.e. thermal, salt, osmotic and oxidative stress) and pre-treatment with these well characterized environmental stressors provided little or any resistance to COS. CONCLUSIONS: Overexpression of the ARL1 gene, a member of the Ras superfamily that regulates membrane trafficking, provides protection against COS-induced cell membrane permeability and damage. We found that the ARL1 COS-resistant over-expression strain was as sensitive to Amphotericin B, Fluconazole and Terbinafine as the wild type cells and that when COS and Fluconazole are used in combination they act in a synergistic fashion. The gene targets of COS identified in this study indicate that COS’s mechanism of action is different from other commonly studied fungicides that target membranes, suggesting that COS may be an effective fungicide for drug-resistant fungal pathogens. %B BMC genomics %V 13 %P 267 %8 2012 %G eng %R 10.1186/1471-2164-13-267 %0 Journal Article %J Stem Cell Rev Rep %D 2012 %T IL1β induces mesenchymal stem cells migration and leucocyte chemotaxis through NF-κB. %A Carrero, Rubén %A Cerrada, Inmaculada %A Lledó, Elisa %A Dopazo, Joaquin %A Garcia-Garcia, Francisco %A Rubio, Mari-Paz %A Trigueros, César %A Dorronsoro, Akaitz %A Ruiz-Sauri, Amparo %A Montero, José Anastasio %A Sepúlveda, Pilar %K Cell Adhesion %K Cell Movement %K Cell Proliferation %K Chemokines %K Chemotaxis, Leukocyte %K Collagen %K Fibronectins %K Gene Expression Profiling %K Gene Knockdown Techniques %K HEK293 Cells %K Humans %K I-kappa B Kinase %K Inflammation Mediators %K Intercellular Signaling Peptides and Proteins %K Interleukin-1beta %K Laminin %K Leukocytes %K Mesenchymal Stem Cells %K NF-kappa B %K Oligonucleotide Array Sequence Analysis %K RNA Interference %K Signal Transduction %X

Mesenchymal stem cells are often transplanted into inflammatory environments where they are able to survive and modulate host immune responses through a poorly understood mechanism. In this paper we analyzed the responses of MSC to IL-1β, a representative inflammatory mediator. Microarray analysis of MSC treated with IL-1β revealed that this cytokine activated a set of genes related to biological processes such as cell survival, cell migration, cell adhesion, chemokine production, induction of angiogenesis and modulation of the immune response. Further, more detailed analysis by real-time PCR and functional assays revealed that IL-1β mainly increased the production of chemokines such as CCL5, CCL20, CXCL1, CXCL3, CXCL5, CXCL6, CXCL10, CXCL11 and CX(3)CL1, interleukins IL-6, IL-8, IL23A, IL32, Toll-like receptors TLR2, TLR4, CLDN1, metalloproteins MMP1 and MMP3, growth factors CSF2 and TNF-α, together with adhesion molecules ICAM1 and ICAM4. Functional analysis of MSC proliferation, migration and adhesion to extracellular matrix components revealed that IL-1β did not affect proliferation but served to induce the secretion of trophic factors and adhesion to ECM components such as collagen and laminin. IL-1β treatment enhanced the ability of MSC to recruit monocytes and granulocytes in vitro. Blockade of NF-κB transcription factor activation with IκB kinase beta (IKKβ) shRNA impaired MSC migration, adhesion and leucocyte recruitment induced by IL-1β, demonstrating that the NF-κB pathway is an important downstream regulator of these responses. These findings are relevant to understanding the biological responses of MSC to inflammatory environments.

%B Stem Cell Rev Rep %V 8 %P 905-16 %8 2012 Sep %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/22467443?dopt=Abstract %R 10.1007/s12015-012-9364-9 %0 Journal Article %J Nucleic Acids Res %D 2012 %T Inferring the regulatory network behind a gene expression experiment. %A Bleda, Marta %A Medina, Ignacio %A Alonso, Roberto %A De Maria, Alejandro %A Salavert, Francisco %A Dopazo, Joaquin %K Binding Sites %K Databases, Genetic %K Fanconi Anemia %K Gene Regulatory Networks %K Internet %K MicroRNAs %K Software %K Transcription Factors %K Transcriptome %X

Transcription factors (TFs) and miRNAs are the most important dynamic regulators in the control of gene expression in multicellular organisms. These regulatory elements play crucial roles in development, cell cycling and cell signaling, and they have also been associated with many diseases. The Regulatory Network Analysis Tool (RENATO) web server makes the exploration of regulatory networks easy, enabling a better understanding of functional modularity and network integrity under specific perturbations. RENATO is suitable for the analysis of the results of expression profiling experiments. The program analyses lists of genes and searches for the regulators compatible with their activation or deactivation. Tests of single enrichment or gene set enrichment allow the selection of the subset of TFs or miRNAs significantly involved in the regulation of the query genes. RENATO also offers an interactive advanced graphical interface that allows exploring the regulatory network found. RENATO is available at: http://renato.bioinfo.cipf.es/.

%B Nucleic Acids Res %V 40 %P W168-72 %8 2012 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/22693210?dopt=Abstract %R 10.1093/nar/gks573 %0 Journal Article %J Genome medicine %D 2012 %T A map of human microRNA variation uncovers unexpectedly high levels of variability. %A Carbonell, José %A Alloza, Eva %A Arce, Pablo %A Borrego, Salud %A Santoyo, Javier %A Ruiz-Ferrer, Macarena %A Medina, Ignacio %A Jiménez-Almazán, Jorge %A Méndez-Vidal, Cristina %A González-del Pozo, María %A Vela, Alicia %A Bhattacharya, Shomi S %A Antiňolo, Guillermo %A Dopazo, Joaquin %K NGS %X ABSTRACT: BACKGROUND: MicroRNAs (miRNAs) are key components of the gene regulatory network in many species. During the past few years, these regulatory elements have been shown to be involved in an increasing number and range of diseases. Consequently, the compilation of a comprehensive map of natural variability in healthy population seems an obvious requirement for future research on miRNA-related pathologies. METHODS: Data on 14 populations from the 1000 Genomes Project were analysed, along with new data extracted from 60 exomes of healthy individuals from a southern Spain population, sequenced in the context of the Medical Genome Project, to derive an accurate map of miRNA variability. RESULTS: Despite the common belief that miRNAs are highly conserved elements, analysis of the sequences of the 1,152 individuals indicated that the observed level of variability is double what was expected. A total of 527 variants were found. Among these, 45 variants affected the recognition region of the corresponding miRNA and were found in 43 different miRNAs, 26 of which are known to be involved in 57 diseases. Different parts of the mature structure of the miRNA were affected to different degrees by variants, which suggests the existence of a selective pressure related to the relative functional impact of the change. 
Moreover, 41 variants showed a significant deviation from the Hardy-Weinberg equilibrium, which supports the existence of a selective process against some alleles. The average number of variants per individual in miRNAs was 28. CONCLUSIONS: Despite an expectation that miRNAs would be highly conserved genomic elements, our study reports a level of variability comparable to that observed for coding genes. %B Genome medicine %V 4 %P 62 %8 2012 Aug 20 %G eng %U http://genomemedicine.com/content/4/8/62/abstract %R 10.1186/gm363 %0 Journal Article %J Molecular plant pathology %D 2012 %T Microarray analysis of Etrog citron (Citrus medica L.) reveals changes in chloroplast, cell wall, peroxidase and symporter activities in response to viroid infection. %A Rizza, Serena %A Ana Conesa %A Juarez, José %A Catara, Antonino %A Navarro, Luis %A Duran-Vila, Nuria %A Ancillo, Gema %X Viroids are small (246-401 nucleotides), single-stranded, circular RNA molecules that infect several crop plants and can cause diseases of economic importance. Citrus are the hosts in which the largest number of viroids have been identified. Citrus exocortis viroid (CEVd), the causal agent of citrus exocortis disease, induces considerable losses in citrus crops. Changes in the gene expression profile during the early (pre-symptomatic) and late (post-symptomatic) stages of Etrog citron infected with CEVd were investigated using a citrus cDNA microarray. MaSigPro analysis was performed and, on the basis of gene expression profiles as a function of the time after infection, the differentially expressed genes were classified into five clusters. FatiScan analysis revealed significant enrichment of functional categories for each cluster, indicating that viroid infection triggers important changes in chloroplast, cell wall, peroxidase and symporter activities. 
%B Molecular plant pathology %8 2012 Mar 15 %G eng %R 10.1111/j.1364-3703.2012.00794.x %0 Journal Article %J FASEB J %D 2012 %T The protease MT1-MMP drives a combinatorial proteolytic program in activated endothelial cells. %A Koziol, Agnieszka %A Gonzalo, Pilar %A Mota, Alba %A Pollán, Angela %A Lorenzo, Cristina %A Colomé, Nuria %A Montaner, David %A Dopazo, Joaquin %A Arribas, Joaquín %A Canals, Francesc %A Arroyo, Alicia G %K Animals %K Blotting, Western %K Combinatorial Chemistry Techniques %K Computational Biology %K Endothelial Cells %K Gene Expression Regulation, Enzymologic %K Inflammation %K Matrix Metalloproteinase 14 %K Mice %K Protein Array Analysis %K Reverse Transcriptase Polymerase Chain Reaction %K RNA Interference %K RNA, Small Interfering %K Transcriptome %K Tumor Necrosis Factor-alpha %X

The mechanism by which proteolytic events translate into biological responses is not well understood. To explore the link of pericellular proteolysis to events relevant to capillary sprouting within the inflammatory context, we aimed at the identification of the collection of substrates of the protease MT1-MMP in endothelial tip cells induced by inflammatory stimuli. We applied quantitative proteomics to endothelial cells (ECs) derived from wild-type and MT1-MMP-null mice to identify the substrate repertoire of this protease in TNF-α-activated ECs. Bioinformatics analysis revealed a combinatorial MT1-MMP proteolytic program, in which combined rather than single substrate processing would determine biological decisions by activated ECs, including chemotaxis, cell motility and adhesion, and vasculature development. MT1-MMP-deficient ECs inefficiently processed several of these substrates (TSP1, CYR61, NID1, and SEM3C), validating the model. This novel concept of MT1-MMP-driven combinatorial proteolysis in angiogenesis might be extendable to proteolytic actions in other cellular contexts.

%B FASEB J %V 26 %P 4481-94 %8 2012 Nov %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/22859368?dopt=Abstract %R 10.1096/fj.12-205906 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2012 %T Qualimap: evaluating next-generation sequencing alignment data. %A García-Alcalde, Fernando %A Okonechnikov, Konstantin %A Carbonell, José %A Cruz, Luis M %A Götz, Stefan %A Sonia Tarazona %A Joaquín Dopazo %A Meyer, Thomas F %A Ana Conesa %K NGS %X MOTIVATION: The sequence alignment/map (SAM) and the binary alignment/map (BAM) formats have become the standard method of representation of nucleotide sequence alignments for next-generation sequencing data. SAM/BAM files usually contain information from tens to hundreds of millions of reads. Often, the sequencing technology, protocol and/or the selected mapping algorithm introduce some unwanted biases in these data. The systematic detection of such biases is a non-trivial task that is crucial to drive appropriate downstream analyses. RESULTS: We have developed Qualimap, a Java application that supports user-friendly quality control of mapping data, by considering sequence features and their genomic properties. Qualimap takes sequence alignment data and provides graphical and statistical analyses for the evaluation of data. Such quality-control data are vital for highlighting problems in the sequencing and/or mapping processes, which must be addressed prior to further analyses. AVAILABILITY: Qualimap is freely available from http://www.qualimap.org. CONTACT: aconesa@cipf.es SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. %B Bioinformatics (Oxford, England) %V 28 %P 2678-9 %8 2012 Oct 15 %G eng %U http://bioinformatics.oxfordjournals.org/content/28/20/2678.long %R 10.1093/bioinformatics/bts503 %0 Journal Article %J International journal of data mining and bioinformatics %D 2012 %T Select your SNPs (SYSNPs): a web tool for automatic and massive selection of SNPs. 
%A Lorente-Galdos, Belén %A Medina, Ignacio %A Morcillo-Suarez, Carlos %A Heredia, Txema %A Carreño-Torres, Angel %A Sangrós, Ricardo %A Alegre, Josep %A Pita, Guillermo %A Vellalta, Gemma %A Malats, Nuria %A Pisano, David G %A Joaquín Dopazo %A Navarro, Arcadi %X Association studies are the choice approach in the discovery of the genomic basis of complex traits. To carry out such analysis, researchers frequently need to (1) select optimally informative sets of Single Nucleotide Polymorphisms (SNPs) in candidate regions and (2) annotate the results of associations found by means of genome-wide SNP arrays. These are complex tasks, since many criteria have to be considered, including the SNPs’ functional properties, technological information and haplotype frequencies in given populations. SYSNPs implements algorithms that allow for efficient and simultaneous consideration of all the relevant criteria to obtain sets of SNPs that properly cover arbitrarily large lists of genes or genomic regions. Complementarily, SYSNPs allows for comprehensive functional annotation of SNPs linked to any given marker SNP. SYSNPs dramatically reduces the effort needed for SNP selection from days of searching various databases to a few minutes using a simple browser. %B International journal of data mining and bioinformatics %V 6 %P 324-34 %8 2012 %G eng %U http://inderscience.metapress.com/content/f76740x8071u513n/ %0 Journal Article %J Nucleic Acids Res %D 2012 %T SNPeffect 4.0: on-line prediction of molecular and structural effects of protein-coding variants. %A De Baets, Greet %A Van Durme, Joost %A Reumers, Joke %A Maurer-Stroh, Sebastian %A Vanhee, Peter %A Dopazo, Joaquin %A Schymkowitz, Joost %A Rousseau, Frederic %K Databases, Protein %K Humans %K Internet %K Meta-Analysis as Topic %K Phenotype %K Polymorphism, Single Nucleotide %K Protein Conformation %K Proteins %X

Single nucleotide variants (SNVs) are, together with copy number variation, the primary source of variation in the human genome and are associated with phenotypic variation such as altered response to drug treatment and susceptibility to disease. Linking structural effects of non-synonymous SNVs to functional outcomes is a major issue in structural bioinformatics. The SNPeffect database (http://snpeffect.switchlab.org) uses sequence- and structure-based bioinformatics tools to predict the effect of protein-coding SNVs on the structural phenotype of proteins. It integrates aggregation prediction (TANGO), amyloid prediction (WALTZ), chaperone-binding prediction (LIMBO) and protein stability analysis (FoldX) for structural phenotyping. Additionally, SNPeffect holds information on affected catalytic sites and a number of post-translational modifications. The database contains all known human protein variants from UniProt, but users can now also submit custom protein variants for a SNPeffect analysis, including automated structure modeling. The new meta-analysis application allows plotting correlations between phenotypic features for a user-selected set of variants.

%B Nucleic Acids Res %V 40 %P D935-9 %8 2012 Jan %G eng %N Database issue %1 https://www.ncbi.nlm.nih.gov/pubmed/22075996?dopt=Abstract %R 10.1093/nar/gkr996 %0 Journal Article %J PloS one %D 2012 %T Transcriptome profiling of the intoxication response of Tenebrio molitor larvae to Bacillus thuringiensis Cry3Aa protoxin. %A Oppert, Brenda %A Dowd, Scot E %A Bouffard, Pascal %A Li, Lewyn %A Ana Conesa %A Lorenzen, Marcé D %A Toutges, Michelle %A Marshall, Jeremy %A Huestis, Diana L %A Fabrick, Jeff %A Oppert, Cris %A Jurat-Fuentes, Juan Luis %K Administration %K Animals %K Bacterial Proteins %K Base Sequence %K Biosynthetic Pathways %K Complementary %K DNA %K Endotoxins %K Energy Metabolism %K Gene Expression Profiling %K Hemolysin Proteins %K Larva %K Microarray Analysis %K Molecular Sequence Data %K Oral %K Sequence Analysis %K Tenebrio %K Time Factors %K Transcriptome %X Bacillus thuringiensis (Bt) crystal (Cry) proteins are effective against a select number of insect pests, but improvements are needed to increase efficacy and decrease time to mortality for coleopteran pests. To gain insight into the Bt intoxication process in Coleoptera, we performed RNA-Seq on cDNA generated from the guts of Tenebrio molitor larvae that consumed either a control diet or a diet containing Cry3Aa protoxin. Approximately 134,090 and 124,287 sequence reads from the control and Cry3Aa-treated groups were assembled into 1,318 and 1,140 contigs, respectively. Enrichment analyses indicated that functions associated with mitochondrial respiration, signalling, maintenance of cell structure, membrane integrity, protein recycling/synthesis, and glycosyl hydrolases were significantly increased in Cry3Aa-treated larvae, whereas functions associated with many metabolic processes were reduced, especially glycolysis, tricarboxylic acid cycle, and fatty acid synthesis. Microarray analysis was used to evaluate temporal changes in gene expression after 6, 12 or 24 h of Cry3Aa exposure. 
Overall, microarray analysis indicated that transcripts related to allergens, chitin-binding proteins, glycosyl hydrolases, and tubulins were induced, and those related to immunity and metabolism were repressed in Cry3Aa-intoxicated larvae. The 24 h microarray data validated most of the RNA-Seq data. Of the three intoxication intervals, larvae demonstrated more differential expression of transcripts after 12 h exposure to Cry3Aa. Gene expression examined by three different methods in control vs. Cry3Aa-treated larvae at the 24 h time point indicated that transcripts encoding proteins with chitin-binding domain 3 were the most differentially expressed in Cry3Aa-intoxicated larvae. Overall, the data suggest that T. molitor larvae mount a complex response to Cry3Aa during the initial 24 h of intoxication. Data from this study represent the largest genetic sequence dataset for T. molitor to date. Furthermore, the methods in this study are useful for comparative analyses in organisms lacking a sequenced genome. %B PloS one %V 7 %P e34624 %8 2012 %G eng %R 10.1371/journal.pone.0034624 %0 Journal Article %J SpringerPlus %D 2012 %T Transdifferentiation of MALME-3M and MCF-7 Cells toward Adipocyte-like Cells is Dependent on Clathrin-mediated Endocytosis. %A Carcel-Trullols, Jaime %A Aguilar-Gallardo, Cristóbal %A García-Alcalde, Fernando %A Pardo-Cea, Miguel Angel %A Dopazo, Joaquin %A Ana Conesa %A Simon, Carlos %X ABSTRACT: Enforced cell transdifferentiation of human cancer cells is a promising alternative to conventional chemotherapy. We previously identified albumin-associated lipid- and, more specifically, saturated fatty acid-induced transdifferentiation programs in human cancer cells (HCCLs). In this study, we further characterized the adipocyte-like cells, resulting from the transdifferentiation of human cancer cell lines MCF-7 and MALME-3M, and proposed a common mechanistic approach for these transdifferentiating programs. 
We showed the loss of pigmentation in MALME-3M cells treated with albumin-associated lipids, based on electron microscopic analysis, and the overexpression of perilipin 2 (PLIN2) by western blotting in MALME-3M and MCF-7 cells treated with unsaturated fatty acids. Comparing the gene expression profiles of naive melanoma MALME-3M cells and albumin-associated lipid-treated cells, based on RNA sequencing, we confirmed the transcriptional upregulation of some key adipogenic gene markers and also an alternative splicing of the adipogenic master regulator PPARG, that is probably related to the reported up regulated expression of the protein. Most importantly, these results also showed the upregulation of genes responsible for Clathrin (CLTC) and other adaptor-related proteins. An increase in CLTC expression in the transdifferentiated cells was confirmed by western blotting. Inactivation of CLTC by chlorpromazine (CHP), an inhibitor of CTLC mediated endocytosis (CME), and gene silencing by siRNAs, partially reversed the accumulation of neutral lipids observed in the transdifferentiated cells. These findings give a deeper insight into the phenotypic changes observed in HCCL to adipocyte-like transdifferentiation and point towards CME as a key pathway in distinct transdifferentiation programs. DISCLOSURES: Simon C and Aguilar-Gallardo C are co-inventors of the International Patent Application No. PCT/EP2011/004941 entitled "Methods for tumor treatment and adipogenesis differentiation". %B SpringerPlus %V 1 %P 44 %8 2012 %G eng %U http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3725915/ %R 10.1186/2193-1801-1-44 %0 Journal Article %J IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM %D 2012 %T Using GPUs for the Exact Alignment of Short-read Genetic Sequences by Means of the Burrows–Wheeler Transform. 
%A Salavert Torres, Jose %A Blanquer Espert, Ignacio %A Tomas Dominguez, Andres %A Hernandez, Vicente %A Medina, Ignacio %A Tarraga, Joaquin %A Dopazo, Joaquin %K Burrows-Wheeler transform %K CPU execution %K GPGPU %K NGS %X General Purpose Graphic Processing Units (GPGPUs) constitute an inexpensive resource for computing-intensive applications that could exploit an intrinsic fine-grain parallelism. This paper presents the design and implementation in GPGPUs of an exact alignment tool for nucleotide sequences based on the Burrows-Wheeler Transform. We compare this algorithm with state-of-the-art implementations of the same algorithm over standard CPUs, and considering the same conditions in terms of I/O. Excluding disk transfers, the implementation of the algorithm in GPUs shows a speedup larger than 12x, when compared to CPU execution. This implementation exploits the parallelism by concurrently searching different sequences on the same reference search tree, maximising memory locality and ensuring a symmetric access to the data. The article describes the behaviour of the algorithm in GPU, showing a good scalability in the performance, only limited by the size of the GPU inner memory. %B IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM %V 9 %P 1245-1256 %8 2012 Mar 20 %G eng %U http://ieeexplore.ieee.org.sire.ub.edu/xpl/articleDetails.jsp?reload=true&arnumber=6175888 %R 10.1109/TCBB.2012.49 %0 Journal Article %J IEEE/ACM Trans Comput Biol Bioinform %D 2012 %T Using GPUs for the exact alignment of short-read genetic sequences by means of the Burrows-Wheeler transform. 
%A Salavert Torres, Jose %A Blanquer Espert, Ignacio %A Domínguez, Andrés Tomás %A Hernández García, Vicente %A Medina Castelló, Ignacio %A Tárraga Giménez, Joaquín %A Dopazo Blázquez, Joaquín %K Algorithms %K Animals %K Computational Biology %K Computer Graphics %K Data Compression %K Drosophila melanogaster %K Genes, Insect %K Image Processing, Computer-Assisted %K Models, Genetic %K Sequence Alignment %K Sequence Analysis, DNA %X

General Purpose Graphic Processing Units (GPGPUs) constitute an inexpensive resource for computing-intensive applications that could exploit an intrinsic fine-grain parallelism. This paper presents the design and implementation in GPGPUs of an exact alignment tool for nucleotide sequences based on the Burrows-Wheeler Transform. We compare this algorithm with state-of-the-art implementations of the same algorithm over standard CPUs, and considering the same conditions in terms of I/O. Excluding disk transfers, the implementation of the algorithm in GPUs shows a speedup larger than 12, when compared to CPU execution. This implementation exploits the parallelism by concurrently searching different sequences on the same reference search tree, maximizing memory locality and ensuring a symmetric access to the data. The paper describes the behavior of the algorithm in GPU, showing a good scalability in the performance, only limited by the size of the GPU inner memory.

%B IEEE/ACM Trans Comput Biol Bioinform %V 9 %P 1245-56 %8 2012 Jul-Aug %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/22450827?dopt=Abstract %R 10.1109/TCBB.2012.49 %0 Journal Article %J IEEE/ACM Transactions on Computational Biology and Bioinformatics %D 2012 %T Using GPUs for the Exact Alignment of Short-Read Genetic Sequences by Means of the Burrows-Wheeler Transform %A Torres, J. S. %A Espert, I. B. %A Dominguez, A. T. %A Garcia, V. Hernandez %A Castello, I. Medina %A Gimenez, J. Tarraga %A Blazquez, J. Dopazo %B IEEE/ACM Transactions on Computational Biology and Bioinformatics %V 9 %P 1245 - 1256 %8 Jan-07-2012 %G eng %U http://ieeexplore.ieee.org/document/6175888/http://xplorestaging.ieee.org/ielx5/8857/6202798/06175888.pdf?arnumber=6175888 %N 4 %! IEEE/ACM Trans. Comput. Biol. and Bioinf. %R 10.1109/TCBB.2012.49 %0 Journal Article %J Nucleic Acids Res %D 2012 %T VARIANT: Command Line, Web service and Web interface for fast and accurate functional characterization of variants found by Next-Generation Sequencing. %A Medina, Ignacio %A De Maria, Alejandro %A Bleda, Marta %A Salavert, Francisco %A Alonso, Roberto %A Gonzalez, Cristina Y %A Dopazo, Joaquin %K Databases, Nucleic Acid %K Genetic Variation %K High-Throughput Nucleotide Sequencing %K Internet %K Molecular Sequence Annotation %K mutation %K Polymorphism, Single Nucleotide %K Software %K User-Computer Interface %X

The massive use of Next-Generation Sequencing (NGS) technologies is uncovering an unexpected amount of variability. The functional characterization of such variability, particularly in the most common form of variation found, the Single Nucleotide Variants (SNVs), has become a priority that needs to be addressed in a systematic way. VARIANT (VARIant ANalysis Tool) reports information on the variants found that includes consequence type and annotations taken from different databases and repositories (SNPs and variants from dbSNP and 1000 genomes, and disease-related variants from the Genome-Wide Association Study (GWAS) catalog, Online Mendelian Inheritance in Man (OMIM), Catalog of Somatic Mutations in Cancer (COSMIC) mutations, etc.). VARIANT also produces a rich variety of annotations that include information on the regulatory (transcription factor or miRNA-binding sites, etc.) or structural roles, or on the selective pressures on the sites affected by the variation. This information allows extending the conventional reports beyond the coding regions and expands the knowledge on the contribution of non-coding or synonymous variants to the phenotype studied. Contrary to other tools, VARIANT uses a remote database and operates through efficient RESTful Web Services that optimize search and transaction operations. In this way, local problems of installation, update or disk size limitations are overcome without the need to sacrifice speed (thousands of variants are processed per minute). VARIANT is available at: http://variant.bioinfo.cipf.es.

%B Nucleic Acids Res %V 40 %P W54-8 %8 2012 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/22693211?dopt=Abstract %R 10.1093/nar/gks572 %0 Journal Article %J Epigenetics %D 2012 %T Whole-genome bisulfite DNA sequencing of a DNMT3B mutant patient. %A Heyn, Holger %A Vidal, Enrique %A Sayols, Sergi %A Sanchez-Mut, Jose V %A Moran, Sebastian %A Medina, Ignacio %A Sandoval, Juan %A Simó-Riudalbas, Laia %A Szczesna, Karolina %A Huertas, Dori %A Gatto, Sole %A Matarazzo, Maria R %A Dopazo, Joaquin %A Esteller, Manel %K B-Lymphocytes %K Cell Line, Transformed %K Child, Preschool %K DNA (Cytosine-5-)-Methyltransferases %K DNA Methylation %K Epigenesis, Genetic %K Face %K Female %K Genome, Human %K High-Throughput Nucleotide Sequencing %K Humans %K Immunologic Deficiency Syndromes %K mutation %K Primary Immunodeficiency Diseases %K Sequence Analysis, DNA %K Sulfites %X

The immunodeficiency, centromere instability and facial anomalies (ICF) syndrome is associated with mutations of the DNA methyl-transferase DNMT3B, resulting in a reduction of enzyme activity. Aberrant expression of immune system genes and hypomethylation of pericentromeric regions accompanied by chromosomal instability were determined as alterations driving the disease phenotype. However, so far only technologies capable of analyzing single loci were applied to determine epigenetic alterations in ICF patients. In the current study, we performed whole-genome bisulphite sequencing to assess alteration in DNA methylation at base pair resolution. Genome-wide we detected a decrease of methylation level of 42%, with the most profound changes occurring in inactive heterochromatic regions, satellite repeats and transposons. Interestingly, transcriptionally active loci and ribosomal RNA repeats escaped global hypomethylation. Despite a genome-wide loss of DNA methylation the epigenetic landscape and crucial regulatory structures were conserved. Remarkably, we revealed a mislocated activity of mutant DNMT3B to H3K4me1 loci resulting in hypermethylation of active promoters. Functionally, we could associate alterations in promoter methylation with the ICF syndrome immunodeficient phenotype by detecting changes in genes related to the B-cell receptor mediated maturation pathway.

%B Epigenetics %V 7 %P 542-50 %8 2012 Jun 01 %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/22595875?dopt=Abstract %R 10.4161/epi.20523 %0 Journal Article %J PloS one %D 2011 %T Analysis of normal-tumour tissue interaction in tumours: prediction of prostate cancer features from the molecular profile of adjacent normal cells. %A Trevino, Victor %A Tadesse, Mahlet G %A Vannucci, Marina %A Al-Shahrour, Fatima %A Antczak, Philipp %A Durant, Sarah %A Bikfalvi, Andreas %A Dopazo, Joaquin %A Campbell, Moray J %A Falciani, Francesco %X

Statistical modelling, in combination with genome-wide expression profiling techniques, has demonstrated that the molecular state of the tumour is sufficient to infer its pathological state. These studies have been extremely important in diagnostics and have contributed to improving our understanding of tumour biology. However, their importance in in-depth understanding of cancer patho-physiology may be limited since they do not explicitly take into consideration the fundamental role of the tissue microenvironment in specifying tumour physiology. Because of the importance of normal cells in shaping the tissue microenvironment, we formulate the hypothesis that molecular components of the profile of normal epithelial cells adjacent to the tumour are predictive of tumour physiology. We addressed this hypothesis by developing statistical models that link gene expression profiles representing the molecular state of adjacent normal epithelial cells to tumour features in prostate cancer. Furthermore, network analysis showed that predictive genes are linked to the activity of important secreted factors, which have the potential to influence tumour biology, such as IL1, IGF1, PDGF BB, AGT, and TGFβ.

%B PloS one %V 6 %P e16492 %8 2011 %G eng %0 Journal Article %J Biostatistics (Oxford, England) %D 2011 %T ARSyN: a method for the identification and removal of systematic noise in multifactorial time course microarray experiments. %A Nueda, Maria J %A Ferrer, Alberto %A Conesa, Ana %X Transcriptomic profiling experiments that aim to identify responsive genes in specific biological conditions are commonly set up under defined experimental designs that try to assess the effects of factors and their interactions on gene expression. Data from these controlled experiments, however, may also contain sources of unwanted noise that can distort the signal under study, affect the residuals of applied statistical models, and hamper data analysis. Commonly, normalization methods are applied to transcriptomics data to remove technical artifacts, but these are normally based on general assumptions of transcript distribution and largely ignore both the characteristics of the experiment under consideration and the coordinative nature of gene expression. In this paper, we propose a novel methodology, ARSyN, for the preprocessing of microarray data that takes into account these last 2 aspects. By combining analysis of variance (ANOVA) modeling of gene expression values and multivariate analysis of estimated effects, the method identifies the nonstructured part of the signal associated with the experimental factors (the noise within the signal) and the structured variation of the ANOVA errors (the signal of the noise). By removing these noise fractions from the original data, we create a filtered data set that is rich in the information of interest and includes only the random noise required for inferential analysis. In this work, we focus on multifactorial time course microarray (MTCM) experiments with 2 factors: one quantitative, such as time or dosage, and the other qualitative, such as tissue, strain, or treatment.
However, the method can be used in other situations such as experiments with only one factor or more complex designs with more than 2 factors. The filtered data obtained after applying ARSyN can be further analyzed with the appropriate statistical technique to obtain the biological information required. To evaluate the performance of the filtering strategy, we have applied different statistical approaches for MTCM analysis to several real and simulated data sets, studying also the efficiency of these techniques. By comparing the results obtained with the original and ARSyN filtered data and also with other filtering techniques, we can conclude that the proposed method increases the statistical power to detect biological signals, especially in cases where there are high levels of structural noise. Software for ARSyN is freely available at http://www.ua.es/personal/mj.nueda. %B Biostatistics (Oxford, England) %8 2011 Nov 14 %G eng %0 Journal Article %J PloS one %D 2011 %T Assessing the biological significance of gene expression signatures and co-expression modules by studying their network properties. %A Minguez, Pablo %A Dopazo, Joaquin %X

Microarray experiments have been extensively used to define signatures, which are sets of genes that can be considered markers of experimental conditions (typically diseases). Paradoxically, in spite of the apparent functional role that might be attributed to such gene sets, signatures do not seem to be reproducible across experiments. Given the close relationship between function and protein interaction, network properties can be used to study to what extent signatures are composed of genes whose resulting proteins show a considerable level of interaction (and consequently a putative common functional role). We have analysed 618 signatures and 507 modules of co-expression in cancer looking for significant values of four main protein-protein interaction (PPI) network parameters: connection degree, cluster coefficient, betweenness and number of components. A total of 3904 gene ontology (GO) modules, 146 KEGG pathways, and 263 Biocarta pathways have been used as functional modules of reference. Co-expression modules found in microarray experiments display a high level of connectivity, similar to the one shown by conventional modules based on functional definitions (GO, KEGG and Biocarta). A general observation for all the classes studied is that the networks formed by the modules improve their topological parameters when an external protein is allowed to be introduced within the paths (up to 70% of GO modules show network parameters beyond the random expectation). This fact suggests that functional definitions are incomplete and some genes might still be missing. Conversely, signatures are clearly not capturing the altered functions in the corresponding studies. This is probably because the way in which the genes have been selected in the signatures is too conservative.
These results suggest that gene selection methods which take into account relationships among genes should be superior to methods that assume independence among genes outside their functional contexts.

%B PloS one %V 6 %P e17474 %8 2011 %G eng %U http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0017474 %R 10.1371/journal.pone.0017474 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2011 %T B2G-FAR, a species centered GO annotation repository. %A Götz, Stefan %A Arnold, Roland %A Sebastián-Leon, Patricia %A Martín-Rodríguez, Samuel %A Tischler, Patrick %A Jehl, Marc-André %A Dopazo, Joaquin %A Rattei, Thomas %A Conesa, Ana %X

MOTIVATION: Functional genomics research has expanded enormously in the last decade thanks to the cost-reduction in high-throughput technologies and the development of computational tools that generate, standardize and share information on gene and protein function such as the Gene Ontology (GO). Nevertheless many biologists, especially working with non-model organisms, still suffer from non-existing or low coverage functional annotation, or simply struggle retrieving, summarizing and querying these data. RESULTS: The Blast2GO Functional Annotation Repository (B2G-FAR) is a bioinformatics resource envisaged to provide functional information for otherwise uncharacterized sequence-data and offers data-mining tools to analyze a larger repertoire of species than currently available. This new annotation resource has been created by applying the Blast2GO functional annotation engine in a strongly high-throughput manner to the entire space of publicly available sequences. The resulting repository contains GO term predictions for over 13.2 million non-redundant protein sequences based on BLAST search alignments from the SIMAP database. We generated GO annotation for approximately 150,000 different taxa, making available the 2000 species with the highest coverage through B2G-FAR. A second section within B2G-FAR holds functional annotations for 17 non-model organism Affymetrix GeneChips. CONCLUSIONS: B2G-FAR provides easy access to exhaustive functional annotation for 2000 species offering a good balance between quality and quantity, thereby supporting functional genomics research especially in the case of non-model organisms. AVAILABILITY: The annotation resource is available at http://b2gfar.bioinfo.cipf.es. CONTACT: aconesa@cipf.es, sgoetz@cipf.es.

%B Bioinformatics (Oxford, England) %V 27 %P 919-924 %8 2011 Feb 18 %G eng %0 Journal Article %J Genome Res %D 2011 %T Differential expression in RNA-seq: a matter of depth. %A Tarazona, Sonia %A García-Alcalde, Fernando %A Dopazo, Joaquin %A Ferrer, Alberto %A Conesa, Ana %K Algorithms %K Expressed Sequence Tags %K Gene Expression Profiling %K Gene Expression Regulation %K Humans %K Models, Genetic %K Oligonucleotide Array Sequence Analysis %X

Next-generation sequencing (NGS) technologies are revolutionizing genome research, and in particular, their application to transcriptomics (RNA-seq) is increasingly being used for gene expression profiling as a replacement for microarrays. However, the properties of RNA-seq data have not yet been fully established, and additional research is needed for understanding how these data respond to differential expression analysis. In this work, we set out to gain insights into the characteristics of RNA-seq data analysis by studying an important parameter of this technology: the sequencing depth. We have analyzed how sequencing depth affects the detection of transcripts and their identification as differentially expressed, looking at aspects such as transcript biotype, length, expression level, and fold-change. We have evaluated different algorithms available for the analysis of RNA-seq and proposed a novel approach--NOISeq--that differs from existing methods in that it is data-adaptive and nonparametric. Our results reveal that most existing methodologies suffer from a strong dependency on sequencing depth for their differential expression calls and that this results in a considerable number of false positives that increases as the number of reads grows. In contrast, our proposed method models the noise distribution from the actual data, can therefore better adapt to the size of the data set, and is more effective in controlling the rate of false discoveries. This work discusses the true potential of RNA-seq for studying regulation at low expression ranges, the noise within RNA-seq data, and the issue of replication.

%B Genome Res %V 21 %P 2213-23 %8 2011 Dec %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/21903743?dopt=Abstract %R 10.1101/gr.124321.111 %0 Journal Article %J Diabetes %D 2011 %T Differential Lipid Partitioning Between Adipocytes and Tissue Macrophages Modulates Macrophage Lipotoxicity and M2/M1 Polarization in Obese Mice. %A Prieur, Xavier %A Mok, Crystal Y L %A Velagapudi, Vidya R %A Núñez, Vanessa %A Fuentes, Lucía %A Montaner, David %A Ishikawa, Ko %A Camacho, Alberto %A Barbarroja, Nuria %A O’Rahilly, Stephen %A Sethi, Jaswinder %A Dopazo, Joaquin %A Oresic, Matej %A Ricote, Mercedes %A Vidal-Puig, Antonio %X

OBJECTIVE Obesity-associated insulin resistance is characterized by a state of chronic, low-grade inflammation that is associated with the accumulation of M1 proinflammatory macrophages in adipose tissue. Although different evidence explains the mechanisms linking the expansion of adipose tissue and adipose tissue macrophage (ATM) polarization, in the current study we investigated the concept of lipid-induced toxicity as the pathogenic link that could explain the trigger of this response. RESEARCH DESIGN AND METHODS We addressed this question using isolated ATMs and adipocytes from genetic and diet-induced murine models of obesity. Through transcriptomic and lipidomic analysis, we created a model integrating transcript and lipid species networks simultaneously occurring in adipocytes and ATMs and their reversibility by thiazolidinedione treatment. RESULTS We show that polarization of ATMs is associated with lipid accumulation and the consequent formation of foam cell-like cells in adipose tissue. Our study reveals that early stages of adipose tissue expansion are characterized by M2-polarized ATMs and that progressive lipid accumulation within ATMs heralds the M1 polarization, a macrophage phenotype associated with severe obesity and insulin resistance. Furthermore, rosiglitazone treatment, which promotes redistribution of lipids toward adipocytes and extends the M2 ATM polarization state, prevents the lipid alterations associated with M1 ATM polarization. CONCLUSIONS Our data indicate that the M1 ATM polarization in obesity might be a macrophage-specific manifestation of a more general lipotoxic pathogenic mechanism. This indicates that strategies to optimize fat deposition and repartitioning toward adipocytes might improve insulin sensitivity by preventing ATM lipotoxicity and M1 polarization.

%B Diabetes %V 60 %P 797-809 %8 2011 Jan 24 %G eng %0 Journal Article %J PLoS pathogens %D 2011 %T Discovery of an ebolavirus-like filovirus in europe. %A Negredo, Ana %A Palacios, Gustavo %A Vázquez-Morón, Sonia %A González, Félix %A Dopazo, Hernán %A Molero, Francisca %A Juste, Javier %A Quetglas, Juan %A Savji, Nazir %A de la Cruz Martínez, Maria %A Herrera, Jesus Enrique %A Pizarro, Manuel %A Hutchison, Stephen K %A Echevarría, Juan E %A Lipkin, W Ian %A Tenorio, Antonio %X

Filoviruses, amongst the most lethal of primate pathogens, have only been reported as natural infections in sub-Saharan Africa and the Philippines. Infections of bats with the ebolaviruses and marburgviruses do not appear to be associated with disease. Here we report identification in dead insectivorous bats of a genetically distinct filovirus, provisionally named Lloviu virus, after the site of detection, Cueva del Lloviu, in Spain.

%B PLoS pathogens %V 7 %P e1002304 %8 2011 Oct %G eng %0 Journal Article %J Plant signaling & behavior %D 2011 %T Does singlet oxygen activate cell death in Arabidopsis cell suspension cultures? Analysis of the early transcriptional defence responses to high light stress. %A Gutiérrez, Jorge %A González-Pérez, Sergio %A Garcia-Garcia, Francisco %A Lorenzo, Oscar %A Arellano, Juan B %X

Can Arabidopsis cell suspension cultures (ACSC) provide a useful working model to investigate genetically-controlled defence responses with signalling cascades starting in chloroplasts? In order to provide a convincing answer, we analysed the early transcriptional profile of Arabidopsis cells at high light (HL). The results showed that ACSC respond to HL in a manner that resembles the singlet oxygen ((1)O(2))-mediated defence responses described for the conditional fluorescent (flu) mutant of Arabidopsis thaliana. The flu mutant is characterized by the accumulation of free protochlorophyllide (Pchlide) in plastids when put into darkness and the subsequent production of (1)O(2) when the light is on. In ACSC, (1)O(2) is produced in chloroplasts at HL when excess excitation energy flows into photosystem II (PSII). Other reactive oxygen species are also produced in ACSC at HL, but to a lesser extent. When the HL stress ceases, ACSC recovers the initial rate of oxygen evolution and cell growth continues. We can conclude that chloroplasts of ACSC are both photosynthetically active and capable of initiating (1)O(2)-mediated signalling cascades that activate a broad range of genetically-controlled defence responses. The up-regulation of transcripts associated with the biosynthesis and signalling pathways of OPDA (12-oxophytodienoic acid) and ethylene (ET) suggests that the activated defence responses at HL are governed by these two hormones. In contrast to the flu mutant, the (1)O(2)-mediated defence responses were independent of the up-regulation of EDS1 (enhanced disease susceptibility) required for the accumulation of salicylic acid (SA) and genetically-controlled cell death.

%B Plant signaling & behavior %V 6 %8 2011 Dec 1 %G eng %0 Journal Article %J BMC Med Genomics %D 2011 %T Early peroxisome proliferator-activated receptor gamma regulated genes involved in expansion of pancreatic beta cell mass. %A Vivas, Yurena %A Martinez-Garcia, Cristina %A Izquierdo, Adriana %A Garcia-Garcia, Francisco %A Callejas, Sergio %A Velasco, Ismael %A Campbell, Mark %A Ros, Manuel %A Dopazo, Ana %A Dopazo, Joaquin %A Vidal-Puig, Antonio %A Medina-Gomez, Gema %K Animals %K Cell Proliferation %K Cell Survival %K Cholesterol %K Down-Regulation %K Female %K Gene Expression Regulation %K Gene Knockout Techniques %K Insulin Resistance %K Insulin-Secreting Cells %K Mice %K obesity %K Oxidation-Reduction %K Phosphorylation %K PPAR gamma %K Signal Transduction %K Transcription, Genetic %K Transforming Growth Factor beta %X

BACKGROUND: The progression towards type 2 diabetes depends on the allostatic response of pancreatic beta cells to synthesise and secrete enough insulin to compensate for insulin resistance. The endocrine pancreas is a plastic tissue able to expand or regress in response to the requirements imposed by physiological and pathophysiological states associated with insulin resistance such as pregnancy, obesity or ageing, but the mechanisms mediating beta cell mass expansion in these scenarios are not well defined. We have recently shown that ob/ob mice with genetic ablation of PPARγ2, a mouse model known as the POKO mouse, failed to expand their beta cell mass. This phenotype contrasted with the appropriate expansion of the beta cell mass observed in their obese littermate ob/ob mice. Thus, comparison of these models' islets, particularly at early ages, could provide some new insights on early PPARγ dependent transcriptional responses involved in the process of beta cell mass expansion.

RESULTS: Here we have investigated PPARγ dependent transcriptional responses occurring during the early stages of beta cell adaptation to insulin resistance in wild type, ob/ob, PPARγ2 KO and POKO mice. We have identified genes known to regulate both the rate of proliferation and the survival signals of beta cells. Moreover we have also identified new pathways induced in ob/ob islets that remained unchanged in POKO islets, suggesting an important role for PPARγ in maintenance/activation of mechanisms essential for the continued function of the beta cell.

CONCLUSIONS: Our data suggest that the expansion of beta cell mass observed in ob/ob islets is associated with the activation of an immune response that fails to occur in POKO islets. We have also identified other PPARγ dependent differentially regulated pathways including cholesterol biosynthesis, apoptosis through TGF-β signaling and decreased oxidative phosphorylation.

%B BMC Med Genomics %V 4 %P 86 %8 2011 Dec 30 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/22208362?dopt=Abstract %R 10.1186/1755-8794-4-86 %0 Journal Article %J Plant physiology %D 2011 %T Early transcriptional defence responses in Arabidopsis cell suspension culture under high light conditions. %A González-Pérez, Sergio %A Gutiérrez, Jorge %A Garcia-Garcia, Francisco %A Osuna, Daniel %A Dopazo, Joaquin %A Lorenzo, Oscar %A Revuelta, José L %A Arellano, Juan B %X

The early transcriptional defence responses and ROS production in Arabidopsis cell suspension culture (ACSC), containing functional chloroplasts, were examined at high light (HL). The transcriptional analysis revealed that most of the ROS markers identified among the 449 transcripts with significant differential expression were transcripts specifically up-regulated by singlet oxygen (1O2). On the contrary, minimal correlation was established with transcripts specifically up-regulated by superoxide radical (O2•) or hydrogen peroxide (H2O2). The transcriptional analysis was supported by fluorescence microscopy experiments. The incubation of ACSC with the 1O2 sensor green reagent and 2’,7’-dichlorofluorescein diacetate showed that the 30-min-HL-treated cultures emitted fluorescence that corresponded with the production of 1O2, but not of H2O2. Furthermore, the in vivo photodamage of the D1 protein of photosystem II (PSII) indicated that the photogeneration of 1O2 took place within the PSII reaction centre. Functional enrichment analyses identified transcripts that are key components of the ROS signalling transduction pathway in plants as well as others encoding transcription factors that regulate both ROS scavenging and water deficit stress. A meta-analysis examining the transcriptional profiles of mutants and hormone treatments in Arabidopsis showed a high correlation between ACSC at HL and the flu mutant family of Arabidopsis, a producer of 1O2 in plastids. Intriguingly, a high correlation was also observed with aba1 and max4, two mutants with defects in the biosynthesis pathways of two key (apo)carotenoid-derived plant hormones (i.e. ABA and strigolactones, respectively). ACSC has proven to be a valuable system for studying early transcriptional responses to HL stress.

%B Plant physiology %V 156 %P 1439-56 %8 2011 Apr 29 %G eng %U http://www.plantphysiol.org/content/early/2011/04/29/pp.111.177766.short?keytype=ref&ijkey=ph5B6J2khjnqwzN %0 Journal Article %J Plant Physiol %D 2011 %T Early transcriptional defense responses in Arabidopsis cell suspension culture under high-light conditions. %A González-Pérez, Sergio %A Gutiérrez, Jorge %A Garcia-Garcia, Francisco %A Osuna, Daniel %A Dopazo, Joaquin %A Lorenzo, Oscar %A Revuelta, José L %A Arellano, Juan B %K Arabidopsis %K Blotting, Western %K Cell Culture Techniques %K Cells, Cultured %K Chloroplasts %K Cluster Analysis %K Gene Expression Profiling %K Gene Expression Regulation, Plant %K Hydrogen Peroxide %K Light %K mutation %K Oligonucleotide Array Sequence Analysis %K Photosystem II Protein Complex %K Plant Growth Regulators %K Reproducibility of Results %K Reverse Transcriptase Polymerase Chain Reaction %K RNA, Messenger %K Signal Transduction %K Stress, Physiological %K Transcription, Genetic %X

The early transcriptional defense responses and reactive oxygen species (ROS) production in Arabidopsis (Arabidopsis thaliana) cell suspension culture (ACSC), containing functional chloroplasts, were examined at high light (HL). The transcriptional analysis revealed that most of the ROS markers identified among the 449 transcripts with significant differential expression were transcripts specifically up-regulated by singlet oxygen ((1)O(2)). On the contrary, minimal correlation was established with transcripts specifically up-regulated by superoxide radical or hydrogen peroxide. The transcriptional analysis was supported by fluorescence microscopy experiments. The incubation of ACSC with the (1)O(2) sensor green reagent and 2',7'-dichlorofluorescein diacetate showed that the 30-min-HL-treated cultures emitted fluorescence that corresponded with the production of (1)O(2) but not of hydrogen peroxide. Furthermore, the in vivo photodamage of the D1 protein of photosystem II indicated that the photogeneration of (1)O(2) took place within the photosystem II reaction center. Functional enrichment analyses identified transcripts that are key components of the ROS signaling transduction pathway in plants as well as others encoding transcription factors that regulate both ROS scavenging and water deficit stress. A meta-analysis examining the transcriptional profiles of mutants and hormone treatments in Arabidopsis showed a high correlation between ACSC at HL and the fluorescent mutant family of Arabidopsis, a producer of (1)O(2) in plastids. Intriguingly, a high correlation was also observed with ABA deficient1 and more axillary growth4, two mutants with defects in the biosynthesis pathways of two key (apo)carotenoid-derived plant hormones (i.e. abscisic acid and strigolactones, respectively). ACSC has proven to be a valuable system for studying early transcriptional responses to HL stress.

%B Plant Physiol %V 156 %P 1439-56 %8 2011 Jul %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/21531897?dopt=Abstract %R 10.1104/pp.111.177766 %0 Journal Article %J Brief Bioinform %D 2011 %T Evidence for short-time divergence and long-time conservation of tissue-specific expression after gene duplication. %A Huerta-Cepas, Jaime %A Dopazo, Joaquin %A Huynen, Martijn A %A Gabaldón, Toni %K Animals %K Conserved Sequence %K Evolution, Molecular %K Gene Duplication %K gene expression %K Genome %K Humans %K Mice %K Organ Specificity %X

Gene duplication is one of the main mechanisms by which genomes can acquire novel functions. It has been proposed that the retention of gene duplicates can be associated with processes of tissue expression divergence. These models predict that divergent expression patterns should be acquired shortly after the duplication, and that larger divergence in tissue expression would be expected for paralogs, as compared to orthologs of a similar age. Many studies have shown that gene duplicates tend to have divergent expression patterns and that gene family expansions are associated with high levels of tissue specificity. However, the timeframe in which these processes occur has rarely been investigated in detail, particularly in vertebrates, and most analyses do not include direct comparisons of orthologs as a baseline for the expected levels of tissue specificity in the absence of duplications. To assess the specific contribution of duplications to expression divergence, we combine here phylogenetic analyses and expression data from human and mouse. In particular, we study differences in spatial expression among human-mouse paralogs, specifically duplicated after the radiation of mammals, and compare them to pairs of orthologs in the same species. Our results show that gene duplication leads to increased levels of tissue specificity and that this tends to occur promptly after the duplication event.

%B Brief Bioinform %V 12 %P 442-8 %8 2011 Sep %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/21515902?dopt=Abstract %R 10.1093/bib/bbr022 %0 Journal Article %J Environmental microbiology %D 2011 %T Evolution of the biosynthesis of di-myo-inositol phosphate, a marker of adaptation to hot marine environments. %A Gonçalves, Luís G %A Borges, Nuno %A Serra, François %A Fernandes, Pedro L %A Dopazo, Hernán %A Santos, Helena %X

The synthesis of di-myo-inositol phosphate (DIP), a common compatible solute in hyperthermophiles, involves the consecutive actions of inositol-1-phosphate cytidylyltransferase (IPCT) and di-myo-inositol phosphate phosphate synthase (DIPPS). In most cases, both activities are present in a single gene product, but separate genes are also found in a few organisms. Genes for IPCT and DIPPS were found in the genomes of 33 organisms, all with thermophilic/hyperthermophilic lifestyles. Phylogeny of IPCT/DIPPS revealed an incongruent topology with 16S RNA phylogeny, thus suggesting horizontal gene transfer. The phylogenetic tree of the DIPPS domain was rooted by using phosphatidylinositol phosphate synthase sequences as out-group. The root locates at the separation of genomes with fused and split genes. We propose that the gene encoding DIPPS was recruited from the biosynthesis of phosphatidylinositol. The last DIP-synthesizing ancestor harboured separated genes for IPCT and DIPPS and this architecture was maintained in a crenarchaeal lineage, and transferred by horizontal gene transfer to hyperthermophilic marine Thermotoga species. It is plausible that the driving force for the assembly of those two genes in the early ancestor is related to the acquired advantage of DIP producers to cope with high temperature. This work corroborates the view that Archaea were the first hyperthermophilic organisms.

%B Environmental microbiology %8 2011 Oct 26 %G eng %R 10.1111/j.1462-2920.2011.02621.x %0 Journal Article %J PLoS computational biology %D 2011 %T An evolutionary trade-off between protein turnover rate and protein aggregation favors a higher aggregation propensity in fast degrading proteins. %A De Baets, Greet %A Reumers, Joke %A Delgado Blanco, Javier %A Dopazo, Joaquin %A Schymkowitz, Joost %A Rousseau, Frederic %X

We previously showed the existence of selective pressure against protein aggregation by the enrichment of aggregation-opposing ’gatekeeper’ residues at strategic places along the sequence of proteins. Here we analyzed the relationship between protein lifetime and protein aggregation by combining experimentally determined turnover rates, expression data, structural data and chaperone interaction data on a set of more than 500 proteins. We find that selective pressure on protein sequences against aggregation is not homogeneous but that short-living proteins on average have a higher aggregation propensity and fewer chaperone interactions than long-living proteins. We also find that short-living proteins are more often associated to deposition diseases. These findings suggest that the efficient degradation of high-turnover proteins is sufficient to preclude aggregation, but also that factors that inhibit proteasomal activity, such as physiological ageing, will primarily affect the aggregation of short-living proteins.

%B PLoS computational biology %V 7 %P e1002090 %8 2011 Jun %G eng %R 10.1371/journal.pcbi.1002090 %0 Journal Article %J BMC plant biology %D 2011 %T Fortunella margarita Transcriptional Reprogramming Triggered by Xanthomonas citri subsp. citri. %A Khalaf, Abeer A %A Gmitter, Frederick G %A Conesa, Ana %A Dopazo, Joaquin %A Moore, Gloria A %B BMC plant biology %V 11 %P 159 %8 2011 %G eng %0 Journal Article %J Genome biology and evolution %D 2011 %T Genome-wide heterogeneity of nucleotide substitution model fit. %A Arbiza, Leonardo %A Patricio, Mateus %A Dopazo, Hernán %A Posada, David %X

At a genomic scale, the patterns that have shaped molecular evolution are believed to be largely heterogeneous. Consequently, comparative analyses should use appropriate probabilistic substitution models that capture the main features under which different genomic regions have evolved. While efforts have concentrated on the development and understanding of model selection techniques, no descriptions of overall relative substitution model fit at the genome level have been reported. Here, we provide a characterization of best-fit substitution models across three genomic data sets including coding regions from mammals, vertebrates, and Drosophila (24,000 alignments). According to the Akaike Information Criterion (AIC), 82 of 88 models considered were selected as best-fit models on at least one occasion, although with very different frequencies. Most parameter estimates also varied broadly among genes. Patterns found for vertebrates and Drosophila were quite similar and often more complex than those found in mammals. Phylogenetic trees derived from models in the 95% confidence interval set showed much less variance and were significantly closer to the tree estimated under the best-fit model than trees derived from models outside this interval. Although alternative criteria selected simpler models than the AIC, they suggested similar patterns. Altogether, our results show that at a genomic scale, different gene alignments for the same set of taxa are best explained by a large variety of different substitution models and that model choice has implications for different parameter estimates including the inferred phylogenetic trees. After taking into account the differences related to sample size, our results suggest a noticeable diversity in the underlying evolutionary process. We conclude that the use of model selection techniques is important to obtain consistent phylogenetic estimates from real data at a genomic scale.

%B Genome biology and evolution %V 3 %P 896-908 %8 2011 %G eng %0 Journal Article %J The New phytologist %D 2011 %T Histone modifications and expression of DAM6 gene in peach are modulated during bud dormancy release in a cultivar-dependent manner. %A Leida, Carmen %A Ana Conesa %A Llácer, Gerardo %A Badenes, María Luisa %A Ríos, Gabino %X

• Bud dormancy release in many woody perennial plants responds to the seasonal accumulation of chilling stimulus. MADS-box transcription factors encoded by DORMANCY ASSOCIATED MADS-box (DAM) genes in peach (Prunus persica) are implicated in this pathway, but other regulatory factors remain to be identified. In addition, the regulation of DAM gene expression is not well known at the molecular level. • A microarray hybridization approach was performed to identify genes whose expression correlates with the bud dormancy-related behaviour in 10 different peach cultivars. Histone modifications in DAM6 gene were investigated by chromatin immunoprecipitation in two different cultivars. • The expression of DAM4-DAM6 and several genes related to abscisic acid and drought stress response correlated with the dormancy behaviour of peach cultivars. The trimethylation of histone H3 at K27 in the DAM6 promoter, coding region and the second large intron was preceded by a decrease in acetylated H3 and trimethylated H3K4 in the region of translation start, coinciding with repression of DAM6 during dormancy release. • Analysis of chromatin modifications reinforced the role of epigenetic mechanisms in DAM6 regulation and bud dormancy release, and highlighted common features with the vernalization process in Arabidopsis thaliana and cereals.

%B The New phytologist %8 2011 Sep 7 %G eng %R 10.1111/j.1469-8137.2011.03863.x %0 Journal Article %J BMC Medical Genomics %D 2011 %T A large scale survey reveals that chromosomal copy-number alterations significantly affect gene modules involved in cancer initiation and progression %A Alloza, E. %A Fatima Al-Shahrour %A Cigudosa, J. C. %A Dopazo, J. %X

Background

Recent observations point towards the existence of a large number of neighborhoods composed of functionally-related gene modules that lie together in the genome. This local component in the distribution of functionality across chromosomes probably affects chromosomal architecture itself by limiting the ways in which genes can be arranged and distributed across the genome. As a direct consequence, it is presumable that diseases such as cancer, which harbor DNA copy number alterations (CNAs), will have a symptomatology strongly dependent on modules of functionally-related genes rather than on a unique "important" gene.

Methods

We carried out a systematic analysis of more than 140,000 observations of CNAs in cancers and searched for enrichment of gene functional modules associated with high frequencies of losses or gains.

Results

The analysis of CNAs in cancers clearly demonstrates the existence of a significant pattern of loss of gene modules functionally related to cancer initiation and progression along with the amplification of modules of genes related to unspecific defense against xenobiotics (probably chemotherapeutical agents). With the extension of this analysis to an Array-CGH dataset (glioblastomas) from The Cancer Genome Atlas we demonstrate the validity of this approach to investigate the functional impact of CNAs.

Conclusions

The presented results indicate promising clinical and therapeutic implications. Our findings also point directly to the necessity of adopting a function-centric, rather than a gene-centric, view in the understanding of phenotypes or diseases harboring CNAs.

%B BMC Medical Genomics %V 4 %P 37 %8 06/05/2011 %G eng %U http://www.biomedcentral.com/1755-8794/4/37 %9 Research article %R 10.1186/1755-8794-4-37 %0 Journal Article %J Hum Mol Genet %D 2011 %T Large-scale transcriptional profiling and functional assays reveal important roles for Rho-GTPase signalling and SCL during haematopoietic differentiation of human embryonic stem cells. %A Yung, Sun %A Ledran, Maria %A Moreno-Gimeno, Inmaculada %A Conesa, Ana %A Montaner, David %A Dopazo, Joaquin %A Dimmick, Ian %A Slater, Nicholas J %A Marenah, Lamin %A Real, Pedro J %A Paraskevopoulou, Iliana %A Bisbal, Viviana %A Burks, Deborah %A Santibanez-Koref, Mauro %A Moreno, Ruben %A Mountford, Joanne %A Menendez, Pablo %A Armstrong, Lyle %A Lako, Majlinda %K Acute Disease %K Anemia, Hemolytic %K Animals %K Basic Helix-Loop-Helix Transcription Factors %K Cell Differentiation %K Cell Line %K Cell Lineage %K Cluster Analysis %K Embryonic Stem Cells %K Erythroid Cells %K Flow Cytometry %K Gene Expression Profiling %K Hematopoietic Stem Cells %K Humans %K Mice %K Myeloid Cells %K Paracrine Communication %K Proto-Oncogene Proteins %K Reverse Transcriptase Polymerase Chain Reaction %K rho GTP-Binding Proteins %K Signal Transduction %K Stem Cell Transplantation %K T-Cell Acute Lymphocytic Leukemia Protein 1 %K Transcriptome %X

Understanding the transcriptional cues that direct differentiation of human embryonic stem cells (hESCs) and human-induced pluripotent stem cells to defined and functional cell types is essential for future clinical applications. In this study, we have compared transcriptional profiles of haematopoietic progenitors derived from hESCs at various developmental stages of a feeder- and serum-free differentiation method and show that the largest transcriptional changes occur during the first 4 days of differentiation. Data mining on the basis of molecular function revealed Rho-GTPase signalling as a key regulator of differentiation. Inhibition of this pathway resulted in a significant reduction in the numbers of emerging haematopoietic progenitors throughout the differentiation window, thereby uncovering a previously unappreciated role for Rho-GTPase signalling during human haematopoietic development. Our analysis indicated that SCL was the 11th most upregulated transcript during the first 4 days of the hESC differentiation process. Overexpression of SCL in hESCs promoted differentiation to meso-endodermal lineages, the emergence of haematopoietic and erythro-megakaryocytic progenitors and accelerated erythroid differentiation. Importantly, intrasplenic transplantation of SCL-overexpressing hESC-derived haematopoietic cells enhanced recovery from induced acute anaemia without significant cell engraftment, suggesting a paracrine-mediated effect.

%B Hum Mol Genet %V 20 %P 4932-46 %8 2011 Dec 15 %G eng %N 24 %1 https://www.ncbi.nlm.nih.gov/pubmed/21937587?dopt=Abstract %R 10.1093/hmg/ddr431 %0 Journal Article %J The Journal of clinical endocrinology and metabolism %D 2011 %T Modeling human endometrial decidualization from the interaction between proteome and secretome. %A Garrido-Gomez, Tamara %A Dominguez, Francisco %A Lopez, Juan Antonio %A Camafeita, Emilio %A Quiñonero, Alicia %A Martinez-Conejero, Jose Antonio %A Pellicer, Antonio %A Ana Conesa %A Simon, Carlos %X

Decidualization of the human endometrium, which involves morphological and biochemical modifications of the endometrial stromal cells (ESCs), is a prerequisite for adequate trophoblast invasion and placenta formation.

%B The Journal of clinical endocrinology and metabolism %V 96 %P 706-16 %8 2011 Mar %G eng %0 Journal Article %J PLoS One %D 2011 %T Mutation screening of multiple genes in Spanish patients with autosomal recessive retinitis pigmentosa by targeted resequencing. %A González-del Pozo, María %A Borrego, Salud %A Barragán, Isabel %A Pieras, Juan I %A Santoyo, Javier %A Matamala, Nerea %A Naranjo, Belén %A Dopazo, Joaquin %A Antiñolo, Guillermo %K Alleles %K DNA Mutational Analysis %K Exons %K Genetic Variation %K Genome %K Hispanic or Latino %K Humans %K Introns %K Language %K mutation %K Mutation, Missense %K Oligonucleotide Array Sequence Analysis %K Polymerase Chain Reaction %K Reproducibility of Results %K Retinitis pigmentosa %K United States %X

Retinitis Pigmentosa (RP) is a heterogeneous group of inherited retinal dystrophies characterised ultimately by the loss of photoreceptor cells. RP is the leading cause of visual loss in individuals younger than 60 years, with a prevalence of about 1 in 4000. The molecular genetic diagnosis of autosomal recessive RP (arRP) is challenging due to the large genetic and clinical heterogeneity. Traditional methods for sequencing arRP genes are often laborious and not easily available, and a screening technique that enables the rapid detection of the genetic cause would be very helpful in clinical practice. The goal of this study was to develop and apply microarray-based resequencing technology capable of detecting both known and novel mutations on a single high-throughput platform. Hence, the coding regions and exon/intron boundaries of 16 arRP genes were resequenced using microarrays in 102 Spanish patients with a clinical diagnosis of arRP. All the detected variations were confirmed by direct sequencing, and potential pathogenicity was assessed by functional predictions and frequency in controls. For validation purposes, 4 positive controls for variants consisting of previously identified changes were hybridized on the array. As a result of the screening, we detected 44 variants, of which 15 are very likely pathogenic; these were detected in 14 arRP families (14%). Finally, the design of this array can easily be transformed into an equivalent diagnostic system based on targeted enrichment followed by next-generation sequencing.

%B PLoS One %V 6 %P e27894 %8 2011 %G eng %N 12 %1 https://www.ncbi.nlm.nih.gov/pubmed/22164218?dopt=Abstract %R 10.1371/journal.pone.0027894 %0 Journal Article %J PLoS One %D 2011 %T myKaryoView: a light-weight client for visualization of genomic data. %A Jimenez, Rafael C %A Salazar, Gustavo A %A Gel, Bernat %A Dopazo, Joaquin %A Mulder, Nicola %A Corpas, Manuel %K Computer Graphics %K Databases, Genetic %K Genomics %K Internet %K Molecular Sequence Annotation %K User-Computer Interface %X

The Distributed Annotation System (DAS) is a protocol for easy sharing and integration of biological annotations. In order to visualize feature annotations in a genomic context a client is required. Here we present myKaryoView, a simple light-weight DAS tool for visualization of genomic annotation. myKaryoView has been specifically configured to help analyse data derived from personal genomics, although it can also be used as a generic genome browser visualization. Several well-known data sources are provided to facilitate comparison of known genes and normal variation regions. The navigation experience is enhanced by simultaneous rendering of different levels of detail across chromosomes. A simple interface is provided to allow searches for any SNP, gene or chromosomal region. User-defined DAS data sources may also be added when querying the system. We demonstrate myKaryoView capabilities for adding user-defined sources with a set of genetic profiles of family-related individuals downloaded directly from 23andMe. myKaryoView is a web tool for visualization of genomic data specifically designed for direct-to-consumer genomic data that uses publicly available data distributed throughout the Internet. It does not require data to be held locally and it is capable of rendering any feature as long as it conforms to DAS specifications. Configuration and addition of sources to myKaryoView can be done through the interface. Here we show a proof of principle of myKaryoView's ability to display personal genomics data with 23andMe genome data sources. The tool is available at: http://mykaryoview.com.

%B PLoS One %V 6 %P e26345 %8 2011 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/22046276?dopt=Abstract %R 10.1371/journal.pone.0026345 %0 Journal Article %J PLoS Comput Biol %D 2011 %T Natural selection on functional modules, a genome-wide analysis. %A Serra, François %A Arbiza, Leonardo %A Dopazo, Joaquin %A Dopazo, Hernán %K Animals %K Databases, Genetic %K Drosophila %K Genome, Insect %K Genome-Wide Association Study %K Genomics %K Mammals %K Phylogeny %K Selection, Genetic %K Sequence Analysis, DNA %X

Classically, the functional consequences of natural selection over genomes have been analyzed as the compound effects of individual genes. The current paradigm for large-scale analysis of adaptation is based on the observed significant deviations of rates of individual genes from neutral evolutionary expectation. This approach, which assumes independence among genes, has not been able to identify biological functions significantly enriched in positively selected genes in individual species. Alternatively, pooling related species has enhanced the search for signatures of selection. However, grouping signatures does not allow testing for adaptive differences between species. Here we introduce the Gene-Set Selection Analysis (GSSA), a new genome-wide approach to test for evidence of natural selection on functional modules. GSSA is able to detect lineage-specific evolutionary rate changes in a notable number of functional modules. For example, in nine mammal and Drosophila genomes GSSA identifies hundreds of functional modules with significant associations to high and low rates of evolution. Many of the detected functional modules with high evolutionary rates have been previously identified as biological functions under positive selection. Notably, GSSA identifies conserved functional modules with many positively selected genes, which questions whether they are exclusively selected for fitting genomes to environmental changes. Our results agree with previous studies suggesting that adaptation requires positive selection, but not every mutation under positive selection contributes to the adaptive dynamical process of the evolution of species.

%B PLoS Comput Biol %V 7 %P e1001093 %8 2011 Mar %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/21390268?dopt=Abstract %R 10.1371/journal.pcbi.1001093 %0 Journal Article %J Protein science : a publication of the Protein Society %D 2011 %T N-glycosylation efficiency is determined by the distance to the C-terminus and the amino acid preceding an Asn-Ser-Thr sequon. %A Bañó-Polo, Manuel %A Baldin, Francesca %A Tamborero, Silvia %A Marti-Renom, Marc A %A Mingarro, Ismael %X

N-glycosylation is the most common and versatile protein modification. In eukaryotic cells, this modification is catalyzed cotranslationally by the enzyme oligosaccharyltransferase, which targets the β-amide of the asparagine in an Asn-Xaa-Ser/Thr consensus sequon (where Xaa is any amino acid but proline) in nascent proteins as they enter the endoplasmic reticulum. Because modification of the glycosylation acceptor site on membrane proteins occurs in a compartment-specific manner, the presence of glycosylation is used to indicate membrane protein topology. Moreover, glycosylation sites can be added to gain topological information. In this study, we explored the determinants of N-glycosylation with the in vitro transcription/translation of a truncated model protein in the presence of microsomes and surveyed 25,488 glycoproteins, of which 2,533 glycosylation sites had been experimentally validated. We found that glycosylation efficiency was dependent on both the distance to the C-terminus and the nature of the amino acid that preceded the consensus sequon. These findings establish a broadly applicable method for membrane protein tagging in topological studies.

%B Protein science : a publication of the Protein Society %V 20 %P 179-86 %8 2011 Jan %G eng %0 Journal Article %J Bioinformatics (Oxford, England) %D 2011 %T Paintomics: a web based tool for the joint visualization of transcriptomics and metabolomics data. %A García-Alcalde, Fernando %A García-López, Federico %A Joaquín Dopazo %A Ana Conesa %X

The development of omics technologies such as transcriptomics, proteomics and metabolomics has made possible systems biology studies in which biological systems are interrogated at different levels of biochemical activity (gene expression, protein activity and/or metabolite concentration). An effective approach to the analysis of these complex datasets is the joint visualization of the disparate biomolecular data within the framework of known biological pathways.

%B Bioinformatics (Oxford, England) %V 27 %P 137-9 %8 2011 Jan 1 %G eng %0 Journal Article %J Nucleic Acids Res %D 2011 %T Phylemon 2.0: a suite of web-tools for molecular evolution, phylogenetics, phylogenomics and hypotheses testing. %A Sánchez, Rubén %A Serra, François %A Tárraga, Joaquín %A Medina, Ignacio %A Carbonell, José %A Pulido, Luis %A De Maria, Alejandro %A Capella-Gutiérrez, Salvador %A Huerta-Cepas, Jaime %A Gabaldón, Toni %A Dopazo, Joaquin %A Dopazo, Hernán %K Evolution, Molecular %K Genomics %K Internet %K Phylogeny %K Sequence Alignment %K Software %X

Phylemon 2.0 is a new release of the suite of web tools for molecular evolution, phylogenetics, phylogenomics and hypotheses testing. It has been designed as a response to the increasing demand for molecular sequence analyses from expert and non-expert users. Phylemon 2.0 has several unique features that differentiate it from other similar web resources: (i) it offers an integrated environment that enables evolutionary analyses, format conversion, file storage and editing of results; (ii) it suggests further analyses, thereby guiding the users through the web server; and (iii) it allows users to design and save phylogenetic pipelines to be used over multiple genes (phylogenomics). Altogether, Phylemon 2.0 integrates a suite of 30 tools covering sequence alignment reconstruction and trimming; tree reconstruction, visualization and manipulation; and evolutionary hypotheses testing.

%B Nucleic Acids Res %V 39 %P W470-4 %8 2011 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/21646336?dopt=Abstract %R 10.1093/nar/gkr408 %0 Journal Article %J Human mutation %D 2011 %T Phylogenetic and in silico structural analysis of the Parkinson disease-related kinase PINK1. %A Cardona, Fernando %A Sánchez-Mut, Jose Vicente %A Dopazo, Hernán %A Pérez-Tur, Jordi %X

Parkinson disease (PD) is the second most common neurodegenerative disorder and is characterized by the loss of dopaminergic neurons in the substantia nigra. Mutations in PINK1 were shown to cause recessive familial PD, and today are proposed to be associated with the disease via mitochondrial dysfunction and oxidative damage. The PINK1 gene comprises eight exons, which encode a ubiquitously expressed 581 amino acid protein that contains an N-terminal mitochondrial targeting domain and a serine/threonine protein kinase. To better understand the relationship between PINK1 and PD, we first analyzed the evolutionary history of the gene, showing its late emergence in evolution. In addition, we modeled the three-dimensional structure of PINK1 and found evidence that helps to explain the effect of some PD-related mutations on this protein’s function.

%B Human mutation %V 32 %P 369-78 %8 2011 Apr %G eng %R 10.1002/humu.21444 %0 Journal Article %J BMC genomics %D 2011 %T Profiling the venom gland transcriptomes of Costa Rican snakes by 454 pyrosequencing. %A Durban, Jordi %A Juárez, Paula %A Angulo, Yamileth %A Lomonte, Bruno %A Flores-Diaz, Marietta %A Alape-Girón, Alberto %A Sasa, Mahmood %A Sanz, Libia %A Gutiérrez, José M %A Joaquín Dopazo %A Ana Conesa %A Calvete, Juan J %X

A long-term research goal of venomics, of applied importance for improving current antivenom therapy and for drug discovery, is to understand the pharmacological potential of venoms. Individually or combined, proteomic and transcriptomic studies have proven able to explore in depth the molecular diversity of venoms. In the absence of a genome sequence, transcriptomes also represent valuable searchable databases for proteomic projects.

%B BMC genomics %V 12 %P 259 %8 2011 %G eng %0 Journal Article %J BMC genomics %D 2011 %T Recent human evolution has shaped geographical differences in susceptibility to disease. %A Marigorta, Urko M %A Lao, Oscar %A Casals, Ferran %A Calafell, Francesc %A Morcillo-Suarez, Carlos %A Faria, Rui %A Bosch, Elena %A Serra, François %A Bertranpetit, Jaume %A Dopazo, Hernán %A Navarro, Arcadi %X

Searching for associations between genetic variants and complex diseases has been a very active area of research for over two decades. More than 51,000 potential associations have been studied and published, a figure that keeps increasing, especially with the recent explosion of array-based Genome-Wide Association Studies. Even if the number of true associations described so far is high, many of the putative risk variants detected so far have failed to be consistently replicated and are widely considered false positives. Here, we focus on the world-wide patterns of replicability of published association studies.

%B BMC genomics %V 12 %P 55 %8 2011 %G eng %0 Journal Article %J The Plant journal : for cell and molecular biology %D 2011 %T Role of tomato BRANCHED1-like genes in the control of shoot branching. %A Martín-Trillo, Mar %A Grandío, Eduardo González %A Serra, François %A Marcel, Fabien %A Rodríguez-Buey, María Luisa %A Schmitz, Gregor %A Theres, Klaus %A Bendahmane, Abdelhafid %A Dopazo, Hernán %A Cubas, Pilar %X

In angiosperms, shoot branching greatly determines overall plant architecture and affects fundamental aspects of plant life. Branching patterns are determined by genetic pathways conserved widely across angiosperms. In Arabidopsis thaliana (Brassicaceae, Rosidae) BRANCHED1 (BRC1) plays a central role in this process, acting locally to arrest axillary bud growth. In tomato (Solanum lycopersicum, Solanaceae, Asteridae) we have identified two BRC1-like paralogues, SlBRC1a and SlBRC1b. These genes are expressed in arrested axillary buds and both are down-regulated upon bud activation, although SlBRC1a is transcribed at much lower levels than SlBRC1b. Alternative splicing of SlBRC1a renders two transcripts that encode two BRC1-like proteins with different C-t domains due to a 3’-terminal frameshift. The phenotype of loss-of-function lines suggests that SlBRC1b has retained the ancestral role of BRC1 in shoot branch suppression. We have isolated the BRC1a and BRC1b genes of other Solanum species and have studied their evolution rates across the lineages. These studies indicate that, after duplication of an ancestral BRC1-like gene, BRC1b genes continued to evolve under a strong purifying selection that was consistent with the conserved function of SlBRC1b in shoot branching control. In contrast, the coding sequences of Solanum BRC1a genes have evolved at a higher evolution rate. Branch-site tests indicate that this difference does not reflect relaxation but rather positive selective pressure for adaptation.

%B The Plant journal : for cell and molecular biology %V 67 %P 701-14 %8 2011 Aug %G eng %R 10.1111/j.1365-313X.2011.04629.x %0 Journal Article %J PloS one %D 2011 %T Sexual selection halts the relaxation of protamine 2 among rodents. %A Lüke, Lena %A Vicens, Alberto %A Serra, François %A Luque-Larena, Juan Jose %A Dopazo, Hernán %A Roldan, Eduardo R S %A Gomendio, Montserrat %X Sexual selection has been proposed as the driving force promoting the rapid evolutionary changes observed in some reproductive genes including protamines. We test this hypothesis in a group of rodents which show marked differences in the intensity of sexual selection. Levels of sperm competition were not associated with the evolutionary rates of protamine 1 but, contrary to expectations, were negatively related to the evolutionary rate of cleaved- and mature-protamine 2. Since both domains were found to be under relaxation, our findings reveal an unforeseen role of sexual selection: to halt the degree of degeneration that proteins within families may experience due to functional redundancy. The degree of relaxation of protamine 2 in this group of rodents is such that in some species it has become dysfunctional and it is not expressed in mature spermatozoa. In contrast, protamine 1 is functionally conserved but shows directed positive selection on specific sites which are functionally relevant such as DNA-anchoring domains and phosphorylation sites. We conclude that in rodents protamine 2 is under relaxation and that sexual selection removes deleterious mutations among species with high levels of sperm competition to maintain the protein functional and the spermatozoa competitive. 
%B PloS one %V 6 %P e29247 %8 2011 %G eng %U http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0029247 %R 10.1371/journal.pone.0029247 %0 Journal Article %J Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology %D 2011 %T Structure determination of genomic domains by satisfaction of spatial restraints. %A Baù, Davide %A Marti-Renom, Marc A %X

The three-dimensional (3D) architecture of a genome is non-random and known to facilitate the spatial colocalization of regulatory elements with the genes they regulate. Determining the 3D structure of a genome may therefore prove an essential step in characterizing how genes are regulated. Currently, there are several experimental and theoretical approaches that aim at determining the 3D structure of genomes and genomic domains; however, approaches integrating experiments and computation to identify the most likely 3D folding of a genome at medium to high resolutions have not been widely explored. Here, we review existing methodologies and propose that the Integrative Modeling Platform (http://www.integrativemodeling.org), a computational package developed for structurally characterizing protein assemblies, could be used for integrating diverse experimental data towards the determination of the 3D architecture of genomic domains and entire genomes at unprecedented resolution. Our approach, through the visualization of looping interactions between distal regulatory elements, will allow for the characterization of global chromatin features and their relation to gene expression. We illustrate our work by outlining the recent determination of the 3D architecture of the α-globin domain in the human genome.

%B Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology %V 19 %P 25-35 %8 2011 Jan %G eng %0 Journal Article %J Nucleic acids research %D 2011 %T SUS1 introns are required for efficient mRNA nuclear export in yeast. %A Cuenca-Bono, Bernardo %A García-Molinero, Varinia %A Pascual-García, Pau %A Dopazo, Hernán %A Llopis, Ana %A Vilardell, Josep %A Rodríguez-Navarro, Susana %X

Efficient coupling between mRNA synthesis and export is essential for gene expression. Sus1/ENY2, a component of the SAGA and TREX-2 complexes, is involved in both transcription and mRNA export. While most yeast genes lack introns, we previously reported that yeast SUS1 bears two. Here we show that this feature is evolutionarily conserved and critical for Sus1 function. We determine that while SUS1 splicing is inefficient, it responds to cellular conditions, and intronic mutations either promoting or blocking splicing lead to defects in mRNA export and cell growth. Consistent with this, we find that an intron-less SUS1 only partially rescues sus1Δ phenotypes. Remarkably, splicing of each SUS1 intron is also affected by the presence of the other and by SUS1 exonic sequences. Moreover, by following SUS1 RNA and protein levels we establish that nonsense-mediated decay (NMD) pathway and the splicing factor Mud2 both play a role in SUS1 expression. Our data (and those of the accompanying work by Hossain et al.) provide evidence of the involvement of splicing, translation, and decay in the regulation of early events in mRNP biogenesis; and imply the additional requirement for a balance in splicing isoforms from a single gene.

%B Nucleic acids research %V 39 %P 8599-611 %8 2011 Oct 1 %G eng %0 Journal Article %J Nature structural & molecular biology %D 2011 %T The three-dimensional folding of the α-globin gene domain reveals formation of chromatin globules. %A Baù, Davide %A Sanyal, Amartya %A Lajoie, Bryan R %A Capriotti, Emidio %A Byron, Meg %A Lawrence, Jeanne B %A Dekker, Job %A Marti-Renom, Marc A %X

We developed a general approach that combines chromosome conformation capture carbon copy (5C) with the Integrative Modeling Platform (IMP) to generate high-resolution three-dimensional models of chromatin at the megabase scale. We applied this approach to the ENm008 domain on human chromosome 16, containing the α-globin locus, which is expressed in K562 cells and silenced in lymphoblastoid cells (GM12878). The models accurately reproduce the known looping interactions between the α-globin genes and their distal regulatory elements. Further, we find using our approach that the domain folds into a single globular conformation in GM12878 cells, whereas two globules are formed in K562 cells. The central cores of these globules are enriched for transcribed genes, whereas nontranscribed chromatin is more peripheral. We propose that globule formation represents a higher-order folding state related to clustering of transcribed genes around shared transcription machineries, as previously observed by microscopy.

%B Nature structural & molecular biology %V 18 %P 107-14 %8 2011 Jan %G eng %0 Journal Article %J Nucleic Acids Research %D 2010 %T Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling. %A Medina, Ignacio %A Carbonell, José %A Pulido, Luis %A Madeira, Sara C %A Goetz, Stefan %A Ana Conesa %A Tárraga, Joaquín %A Pascual-Montano, Alberto %A Nogales-Cadenas, Ruben %A Santoyo, Javier %A García, Francisco %A Marbà, Martina %A Montaner, David %A Joaquín Dopazo %K babelomics %K gene expression %K genotyping %K gepas %K GSA %K GWAS %X

Babelomics is a response to the growing necessity of integrating and analyzing different types of genomic data in an environment that allows an easy functional interpretation of the results. Babelomics includes a complete suite of methods for the analysis of gene expression data, covering normalization (for most commercial platforms), pre-processing, differential gene expression (case-control, multiclass, survival or continuous values), predictors and clustering, as well as methods for large-scale genotyping assays (case-control and TDTs, with population stratification analysis and correction). All these genomic data analysis facilities are integrated and connected to multiple options for the functional interpretation of the experiments. Different methods of functional enrichment or gene set enrichment can be used to understand the functional basis of the experiment analyzed. Many sources of biological information, which include functional (GO, KEGG, Biocarta, Reactome, etc.), regulatory (Transfac, Jaspar, ORegAnno, miRNAs, etc.), text-mining or protein-protein interaction modules, can be used for this purpose. Finally, a tool for the de novo functional annotation of sequences has been included in the system. This provides support for the functional analysis of non-model species. Mirrors of Babelomics or command line execution of its individual components are now possible. Babelomics is available at http://www.babelomics.org.

%B Nucleic Acids Research %V 38 %P W210-W213. Featured in NAR %8 2010 May 16 %G eng %U http://nar.oxfordjournals.org/content/38/suppl_2/W210.full %& Featured in NAR %0 Journal Article %J Genome research %D 2010 %T Changes in the pattern of DNA methylation associate with twin discordance in systemic lupus erythematosus. %A Javierre, Biola M %A Fernandez, Agustin F %A Richter, Julia %A Fatima Al-Shahrour %A Martin-Subero, J Ignacio %A Rodriguez-Ubreva, Javier %A Berdasco, Maria %A Fraga, Mario F %A O’Hanlon, Terrance P %A Rider, Lisa G %A Jacinto, Filipe V %A Lopez-Longo, F Javier %A Dopazo, Joaquin %A Forn, Marta %A Peinado, Miguel A %A Carreño, Luis %A Sawalha, Amr H %A Harley, John B %A Siebert, Reiner %A Esteller, Manel %A Miller, Frederick W %A Ballestar, Esteban %X

Monozygotic (MZ) twins are partially concordant for most complex diseases, including autoimmune disorders. Whereas phenotypic concordance can be used to study heritability, discordance suggests the role of non-genetic factors. In autoimmune diseases, environmentally driven epigenetic changes are thought to contribute to their etiology. Here we report the first high-throughput and candidate sequence analyses of DNA methylation to investigate discordance for autoimmune disease in twins. We used a cohort of MZ twins discordant for three diseases whose clinical signs often overlap: systemic lupus erythematosus (SLE), rheumatoid arthritis, and dermatomyositis. Only MZ twins discordant for SLE featured widespread changes in the DNA methylation status of a significant number of genes. Gene ontology analysis revealed enrichment in categories associated with immune function. Individual analysis confirmed the existence of DNA methylation and expression changes in genes relevant to SLE pathogenesis. These changes occurred in parallel with a global decrease in the 5-methylcytosine content that was concomitantly accompanied with changes in DNA methylation and expression levels of ribosomal RNA genes, although no changes in repetitive sequences were found. Our findings not only identify potentially relevant DNA methylation markers for the clinical characterization of SLE patients but also support the notion that epigenetic changes may be critical in the clinical manifestations of autoimmune disease.

%B Genome research %V 20 %P 170-9 %8 2010 Feb %G eng %0 Journal Article %J Breast Cancer Res %D 2010 %T DNA methylation epigenotypes in breast cancer molecular subtypes. %A Bediaga, Naiara G %A Acha-Sagredo, Amelia %A Guerra, Isabel %A Viguri, Amparo %A Albaina, Carmen %A Ruiz Diaz, Irune %A Rezola, Ricardo %A Alberdi, Maria Jesus %A Dopazo, Joaquin %A Montaner, David %A Renobales, Mertxe %A Fernandez, Agustin F %A Field, John K %A Fraga, Mario F %A Liloglou, Triantafillos %A de Pancorbo, Marian M %K Aged %K Breast Neoplasms %K CpG Islands %K DNA Methylation %K Epigenesis, Genetic %K Female %K Gene Expression Profiling %K Genes, p53 %K Genotype %K Humans %K Ki-67 Antigen %K Middle Aged %K mutation %K Neoplasm Grading %K Oligonucleotide Array Sequence Analysis %K Receptor, ErbB-2 %K Tumor Suppressor Protein p53 %X

INTRODUCTION: Identification of gene expression-based breast cancer subtypes is considered a critical means of prognostication. Genetic mutations along with epigenetic alterations contribute to gene expression changes occurring in breast cancer. So far, these epigenetic contributions to sporadic breast cancer subtypes have not been well characterized, and there is only a limited understanding of the epigenetic mechanisms affected in those particular breast cancer subtypes. The present study was undertaken to dissect the breast cancer methylome and deliver specific epigenotypes associated with particular breast cancer subtypes.

METHODS: Using a microarray approach we analyzed DNA methylation in regulatory regions of 806 cancer related genes in 28 breast cancer paired samples. We subsequently performed substantial technical and biological validation by Pyrosequencing, investigating the top qualifying 19 CpG regions in independent cohorts encompassing 47 basal-like, 44 ERBB2+ overexpressing, 48 luminal A and 48 luminal B paired breast cancer/adjacent tissues. Using all-subset selection method, we identified the most subtype predictive methylation profiles in multivariable logistic regression analysis.

RESULTS: The approach efficiently recognized 15 individual CpG loci differentially methylated in breast cancer tumor subtypes. We further identify novel subtype-specific epigenotypes which clearly demonstrate the differences in the methylation profiles of basal-like and human epidermal growth factor receptor 2 (HER2)-overexpressing tumors.

CONCLUSIONS: Our results provide evidence that well-defined DNA methylation profiles enable breast cancer subtype prediction and support the utilization of this biomarker for prognostication and therapeutic stratification of patients with breast cancer.

%B Breast Cancer Res %V 12 %P R77 %8 2010 %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/20920229?dopt=Abstract %R 10.1186/bcr2721 %0 Journal Article %J BMC Bioinformatics %D 2010 %T ETE: a python Environment for Tree Exploration. %A Huerta-Cepas, Jaime %A Dopazo, Joaquin %A Gabaldón, Toni %K Computational Biology %K Databases, Genetic %K Phylogeny %K Software %X

BACKGROUND: Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale.

RESULTS: Here we present the Environment for Tree Exploration (ETE), a Python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, has support for the extended newick format, provides an integrated node annotation system and permits linking trees to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations.

CONCLUSIONS: ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from http://ete.cgenomics.org.
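To make the idea of programmatic tree handling concrete, here is a minimal standard-library sketch that parses a simple Newick string and lists its leaves. It is not ETE's API: the function names `parse_newick` and `leaf_names` are hypothetical, and the parser assumes well-formed input without quoted labels or comments.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (name, children) tuples."""
    pos = 0

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume "("
            children.append(node())
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
        start = pos
        # Read the label (and optional ":length") up to the next delimiter.
        while pos < len(text) and text[pos] not in ",();":
            pos += 1
        label = text[start:pos].split(":")[0]  # branch length is discarded
        return (label, children)

    return node()


def leaf_names(tree):
    """Return leaf labels in left-to-right order."""
    name, children = tree
    if not children:
        return [name]
    return [leaf for child in children for leaf in leaf_names(child)]


if __name__ == "__main__":
    tree = parse_newick("((A:0.1,B:0.2)X:0.3,C:0.4);")
    print(leaf_names(tree))
```

A dedicated toolkit adds what this sketch lacks: node annotation, traversal strategies, links to external data and rendering.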

%B BMC Bioinformatics %V 11 %P 24 %8 2010 Jan 13 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/20070885?dopt=Abstract %R 10.1186/1471-2105-11-24 %0 Journal Article %J PLoS One %D 2010 %T Exploring the link between germline and somatic genetic alterations in breast carcinogenesis. %A Bonifaci, Núria %A Górski, Bohdan %A Masojć, Bartlomiej %A Wokołorczyk, Dominika %A Jakubowska, Anna %A Dębniak, Tadeusz %A Berenguer, Antoni %A Serra Musach, Jordi %A Brunet, Joan %A Dopazo, Joaquin %A Narod, Steven A %A Lubiński, Jan %A Lázaro, Conxi %A Cybulski, Cezary %A Pujana, Miguel Angel %K Adult %K Bone Morphogenetic Protein Receptors, Type I %K Breast %K Breast Neoplasms %K Calcium-Calmodulin-Dependent Protein Kinases %K Case-Control Studies %K Cyclin-Dependent Kinases %K Disease Progression %K Estrogen Receptor alpha %K Female %K Gene Frequency %K Genetic Predisposition to Disease %K Genome-Wide Association Study %K Genotype %K Germ-Line Mutation %K Humans %K Odds Ratio %K Poland %K Polymorphism, Single Nucleotide %K Protein Serine-Threonine Kinases %K Protein-Tyrosine Kinases %K Receptor Protein-Tyrosine Kinases %K Receptor, EphA3 %K Receptor, EphA7 %K Receptor, EphB1 %K Risk Factors %X

Recent genome-wide association studies (GWASs) have identified candidate genes contributing to cancer risk through low-penetrance mutations. Many of these genes were unexpected and, intriguingly, included well-known players in carcinogenesis at the somatic level. To assess the hypothesis of a germline-somatic link in carcinogenesis, we evaluated the distribution of somatic gene labels within the ordered results of a breast cancer risk GWAS. This analysis suggested frequent influence on risk of genetic variation in loci encoding for "driver kinases" (i.e., kinases encoded by genes that showed higher somatic mutation rates than expected by chance and, therefore, whose deregulation may contribute to cancer development and/or progression). Assessment of these predictions using a population-based case-control study in Poland replicated the association for rs3732568 in EPHB1 (odds ratio (OR) = 0.79; 95% confidence interval (CI): 0.63-0.98; P(trend) = 0.031). Analyses by early age at diagnosis and by estrogen receptor α (ERα) tumor status indicated potential associations for rs6852678 in CDKL2 (OR = 0.32, 95% CI: 0.10-1.00; P(recessive) = 0.044) and rs10878640 in DYRK2 (OR = 2.39, 95% CI: 1.32-4.30; P(dominant) = 0.003), and for rs12765929, rs9836340, rs4707795 in BMPR1A, EPHA3 and EPHA7, respectively (ERα tumor status P(interaction)<0.05). The identification of three novel candidates as EPH receptor genes might indicate a link between perturbed compartmentalization of early neoplastic lesions and breast cancer risk and progression. Together, these data may lay the foundations for replication in additional populations and could potentially increase our knowledge of the underlying molecular mechanisms of breast carcinogenesis.
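As a reading aid for the odds ratios and confidence intervals quoted above, the sketch below computes a crude odds ratio with a Wald (Woolf) 95% interval from a 2x2 table. This is an assumption for illustration only: the study reports trend, recessive and dominant genetic models, which were presumably fitted differently, and the counts used here are hypothetical rather than taken from the paper.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval.

    a, b : cases with / without the risk genotype
    c, d : controls with / without the risk genotype
    z    : normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not from the study.
    print(odds_ratio_ci(20, 80, 10, 90))
```

An interval that excludes 1 corresponds to a nominally significant association at the chosen level, which is how estimates such as OR = 0.79 (0.63-0.98) are usually read.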

%B PLoS One %V 5 %P e14078 %8 2010 Nov 22 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/21124932?dopt=Abstract %R 10.1371/journal.pone.0014078 %0 Journal Article %J The ISME journal %D 2010 %T Fine-scale evolution: genomic, phenotypic and ecological differentiation in two coexisting Salinibacter ruber strains. %A Peña, Arantxa %A Teeling, Hanno %A Huerta-Cepas, Jaime %A Santos, Fernando %A Yarza, Pablo %A Brito-Echeverría, Jocelyn %A Lucio, Marianna %A Schmitt-Kopplin, Philippe %A Meseguer, Inmaculada %A Schenowitz, Chantal %A Dossat, Carole %A Barbe, Valerie %A Dopazo, Joaquín %A Rosselló-Mora, Ramon %A Schüler, Margarete %A Glöckner, Frank Oliver %A Amann, Rudolf %A Gabaldón, Toni %A Antón, Josefa %X

Genomic and metagenomic data indicate a high degree of genomic variation within microbial populations, although the ecological and evolutionary meaning of this microdiversity remains unknown. Microevolution analyses, including genomic and experimental approaches, are so far very scarce for non-pathogenic bacteria. In this study, we compare the genomes, metabolomes and selected ecological traits of the strains M8 and M31 of the hyperhalophilic bacterium Salinibacter ruber that contain ribosomal RNA (rRNA) gene and intergenic regions that are identical in sequence and were simultaneously isolated from a Mediterranean solar saltern. Comparative analyses indicate that S. ruber genomes present a mosaic structure with conserved and hypervariable regions (HVRs). The HVRs, or genomic islands, are enriched in transposases, genes related to surface properties, strain-specific genes and highly divergent orthologs. However, the many indels outside the HVRs indicate that genome plasticity extends beyond them. Overall, 10% of the genes encoded in the M8 genome are absent from M31 and could stem from recent acquisitions. S. ruber genomes also harbor 34 genes located outside HVRs that are transcribed during standard growth and probably derive from lateral gene transfers with Archaea preceding the M8/M31 divergence. Metabolomic analyses, phage susceptibility and competition experiments indicate that these genomic differences cannot be considered neutral from an ecological perspective. The results point to the avoidance of competition by micro-niche adaptation and response to viral predation as putative major forces that drive microevolution within these Salinibacter strains. In addition, this work highlights the extent of bacterial functional diversity and environmental adaptation, beyond the resolution of the 16S rRNA and internal transcribed spacer regions. The ISME Journal advance online publication, 18 February 2010; doi:10.1038/ismej.2010.6.

%B The ISME journal %8 2010 Feb 18 %G eng %0 Journal Article %J The Journal of biological chemistry %D 2010 %T FM19G11, a new hypoxia-inducible factor (HIF) modulator, affects stem cell differentiation status. %A Moreno-Manzano, Victoria %A Rodríguez-Jiménez, Francisco J %A Aceña-Bonilla, Jose L %A Fustero-Lardíes, Santos %A Erceg, Slaven %A Dopazo, Joaquin %A Montaner, David %A Stojkovic, Miodrag %A Sánchez-Puelles, Jose M %X

The biology of the alpha subunits of hypoxia-inducible factors (HIFalpha) has expanded from their role in angiogenesis to their current position in the self-renewal and differentiation of stem cells. The results reported in this article show the discovery of FM19G11, a novel chemical entity that inhibits HIFalpha proteins that repress target genes of the two alpha subunits, in various tumor cell lines as well as in adult and embryonic stem cell models from rodents and humans, respectively. FM19G11 inhibits, in the nanomolar range, the transcriptional and protein expression of the undifferentiating factors Oct4, Sox2, Nanog, and Tgf-alpha in adult rat and human embryonic stem cells. FM19G11 activity in ependymal progenitor stem cells from rats (epSPC), a cell model reported for spinal cord regeneration, allows the progression of oligodendrocyte cell differentiation in a hypoxic environment and has created interest in its characterization for pharmacological research. Experiments using small interfering RNA showed a significant depletion in Sox2 protein only in the case of HIF2alpha silencing, but not in HIF1alpha-mediated ablation. Moreover, chromatin immunoprecipitation data, together with the significant presence of functional hypoxia response element consensus sequences in the promoter region of Sox2, strongly validated that this factor behaves as a target gene of HIF2alpha in epSPCs. FM19G11 causes a reduction of overall histone acetylation with significant repression of p300, a histone acetyltransferase required as a co-factor for HIF-transcription activation. Arrays carried out in the presence and absence of the inhibitor showed the predominant involvement of epigenetic-associated events mediated by the drug.

%B The Journal of biological chemistry %V 285 %P 1333-42 %8 2010 Jan 8 %G eng %0 Journal Article %J Pharmacogenomics J %D 2010 %T Functional analysis of multiple genomic signatures demonstrates that classification algorithms choose phenotype-related genes. %A Shi, W %A Bessarabova, M %A Dosymbekov, D %A Dezso, Z %A Nikolskaya, T %A Dudoladova, M %A Serebryiskaya, T %A Bugrim, A %A Guryanov, A %A Brennan, R J %A Shah, R %A Dopazo, J %A Chen, M %A Deng, Y %A Shi, T %A Jurman, G %A Furlanello, C %A Thomas, R S %A Corton, J C %A Tong, W %A Shi, L %A Nikolsky, Y %K Algorithms %K Databases, Genetic %K Endpoint Determination %K Gene Expression Profiling %K Genomics %K Humans %K Neural Networks, Computer %K Oligonucleotide Array Sequence Analysis %K Phenotype %K Predictive Value of Tests %K Proteins %K Quality Control %X

Gene expression signatures of toxicity and clinical response benefit both safety assessment and clinical practice; however, difficulties in connecting signature genes with the predicted end points have limited their application. The Microarray Quality Control Consortium II (MAQCII) project generated 262 signatures for ten clinical and three toxicological end points from six gene expression data sets, an unprecedented collection of diverse signatures that has permitted a wide-ranging analysis on the nature of such predictive models. A comprehensive analysis of the genes of these signatures and their nonredundant unions using ontology enrichment, biological network building and interactome connectivity analyses demonstrated the link between gene signatures and the biological basis of their predictive power. Different signatures for a given end point were more similar at the level of biological properties and transcriptional control than at the gene level. Signatures tended to be enriched in function and pathway in an end point- and model-specific manner, and showed a topological bias for incoming interactions. Importantly, the level of biological similarity between different signatures for a given end point correlated positively with the accuracy of the signature predictions. These findings will aid the understanding and application of predictive genomic signatures, and support their broader application in predictive medicine.

%B Pharmacogenomics J %V 10 %P 310-23 %8 2010 Aug %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/20676069?dopt=Abstract %R 10.1038/tpj.2010.35 %0 Journal Article %J Expert Rev Proteomics %D 2010 %T Functional genomics and networks: new approaches in the extraction of complex gene modules. %A Minguez, Pablo %A Dopazo, Joaquin %K Gene Expression Regulation %K Gene Regulatory Networks %K Genomics %K Protein Binding %K Proteins %K Systems biology %X

The engine that makes the cell work is made of an intricate network of molecular interactions. Nowadays, the elements and relationships of this complex network can be studied with several types of high-throughput techniques. The dream of having a global picture of the cell from different perspectives that can jointly explain cell behavior is, at least technically, feasible. However, this task can only be accomplished by filling the gap between data and information. The availability of methods capable of accurately managing, integrating and analyzing the results from these experiments is crucial for this purpose. Here, we review the new challenges raised by the availability of different genomic data, as well as the new proposals presented to cope with the increasing data complexity. Special emphasis is given to approaches that explore the transcriptome trying to describe the modules of genes that account for the traits studied.

%B Expert Rev Proteomics %V 7 %P 55-63 %8 2010 Feb %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/20121476?dopt=Abstract %R 10.1586/epr.09.103 %0 Journal Article %J PLoS One %D 2010 %T Functional genomics of 5- to 8-cell stage human embryos by blastomere single-cell cDNA analysis. %A Galan, Amparo %A Montaner, David %A Póo, M Eugenia %A Valbuena, Diana %A Ruiz, Veronica %A Aguilar, Cristóbal %A Dopazo, Joaquin %A Simon, Carlos %K Blastomeres %K DNA, Complementary %K Gene Expression Profiling %K Genomics %K Humans %K Oligonucleotide Array Sequence Analysis %X

Blastomere fate and embryonic genome activation (EGA) during human embryonic development are unsolved areas of high scientific and clinical interest. Forty-nine blastomeres from 5- to 8-cell human embryos have been investigated following an efficient single-cell cDNA amplification protocol to provide a template for high-density microarray analysis. The previously described markers, characteristic of Inner Cell Mass (ICM) (n = 120), stemness (n = 190) and Trophectoderm (TE) (n = 45), were analyzed, and a housekeeping pattern of 46 genes was established. All the human blastomeres from the 5- to 8-cell stage embryo displayed a common gene expression pattern corresponding to ICM markers (e.g., DDX3, FOXD3, LEFTY1, MYC, NANOG, POU5F1), stemness (e.g., POU5F1, DNMT3B, GABRB3, SOX2, ZFP42, TERT), and TE markers (e.g., GATA6, EOMES, CDX2, LHCGR). The EGA profile was also investigated between the 5-6- and 8-cell stage embryos, and compared to the blastocyst stage. Known genes (n = 92) such as depleted maternal transcripts (e.g., CCNA1, CCNB1, DPPA2) and embryo-specific activation (e.g., POU5F1, CDH1, DPPA4), as well as novel genes, were confirmed. In summary, the global single-cell cDNA amplification microarray analysis of the 5- to 8-cell stage human embryos reveals that blastomere fate is not committed to ICM or TE. Finally, new EGA features in human embryogenesis are presented.

%B PLoS One %V 5 %P e13615 %8 2010 Oct 26 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/21049019?dopt=Abstract %R 10.1371/journal.pone.0013615 %0 Book Section %B Methods in molecular biology (Clifton, N.J.) %D 2010 %T Functional profiling methods in cancer. %A Dopazo, Joaquín %E Grützmann, Robert %E Pilarsky, Christian %X

The introduction of new high-throughput methodologies such as DNA microarrays constitutes a major breakthrough in cancer research. The unprecedented amount of data produced by such technologies has opened new avenues for interrogating living systems although, at the same time, it has demanded the development of new data analytical methods as well as new strategies for testing hypotheses. A history of early successful applications in cancer boosted the use of microarrays and fostered further applications in other fields. Keeping pace with these technologies, bioinformatics offers new solutions for data analysis and, more importantly, permits the formulation of a new class of hypotheses inspired by systems biology, more oriented to pathways or, in general, to modules of functionally related genes. Although these analytical methodologies are new, some options are already available and are discussed in this chapter.

%B Methods in molecular biology (Clifton, N.J.) %V 576 %P 363-74 %8 2010 %G eng %0 Journal Article %J Stem Cells %D 2010 %T Hypoxia promotes efficient differentiation of human embryonic stem cells to functional endothelium. %A Prado-Lopez, Sonia %A Conesa, Ana %A Armiñán, Ana %A Martínez-Losa, Magdalena %A Escobedo-Lucea, Carmen %A Gandia, Carolina %A Tarazona, Sonia %A Melguizo, Dario %A Blesa, David %A Montaner, David %A Sanz-González, Silvia %A Sepúlveda, Pilar %A Götz, Stefan %A O'Connor, José Enrique %A Moreno, Ruben %A Dopazo, Joaquin %A Burks, Deborah J %A Stojkovic, Miodrag %K Angiopoietin-1 %K Animals %K biomarkers %K Cell Culture Techniques %K Cell Differentiation %K Cell Hypoxia %K Cell Transplantation %K Cells, Cultured %K Down-Regulation %K Embryonic Stem Cells %K Endothelial Cells %K Gene Expression Profiling %K Gene Expression Regulation %K Humans %K Male %K Myocardial Infarction %K Neovascularization, Physiologic %K Oxygen %K Pluripotent Stem Cells %K Rats %K Rats, Nude %K Vascular Endothelial Growth Factor A %X

Early development of mammalian embryos occurs in an environment of relative hypoxia. Nevertheless, human embryonic stem cells (hESC), which are derived from the inner cell mass of blastocyst, are routinely cultured under the same atmospheric conditions (21% O(2)) as somatic cells. We hypothesized that O(2) levels modulate gene expression and differentiation potential of hESC, and thus, we performed gene profiling of hESC maintained under normoxic or hypoxic (1% or 5% O(2)) conditions. Our analysis revealed that hypoxia downregulates expression of pluripotency markers in hESC but increases significantly the expression of genes associated with angio- and vasculogenesis including vascular endothelial growth factor and angiopoietin-like proteins. Consequently, we were able to efficiently differentiate hESC to functional endothelial cells (EC) by varying O(2) levels; after 24 hours at 5% O(2), more than 50% of cells were CD34+. Transplantation of resulting endothelial-like cells improved both systolic function and fractional shortening in a rodent model of myocardial infarction. Moreover, analysis of the infarcted zone revealed that transplanted EC reduced the area of fibrous scar tissue by 50%. Thus, use of hypoxic conditions to specify the endothelial lineage suggests a novel strategy for cellular therapies aimed at repair of damaged vasculature in pathologies such as cerebral ischemia and myocardial infarction.

%B Stem Cells %V 28 %P 407-18 %8 2010 Mar 31 %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/20049902?dopt=Abstract %R 10.1002/stem.295 %0 Journal Article %J PLoS genetics %D 2010 %T Initial genomics of the human nucleolus. %A Németh, Attila %A Conesa, Ana %A Santoyo-López, Javier %A Medina, Ignacio %A Montaner, David %A Péterfia, Bálint %A Solovei, Irina %A Cremer, Thomas %A Dopazo, Joaquin %A Längst, Gernot %K NGS %K nucleolus %X

We report for the first time the genomics of a nuclear compartment of the eukaryotic cell. 454 sequencing and microarray analysis revealed the pattern of nucleolus-associated chromatin domains (NADs) in the linear human genome and identified different gene families and certain satellite repeats as the major building blocks of NADs, which constitute about 4% of the genome. Bioinformatic evaluation showed that NAD-localized genes take part in specific biological processes, like the response to other organisms, odor perception, and tissue development. 3D FISH and immunofluorescence experiments illustrated the spatial distribution of NAD-specific chromatin within interphase nuclei and its alteration upon transcriptional changes. Altogether, our findings describe the nature of DNA sequences associated with the human nucleolus and provide insights into the function of the nucleolus in genome organization and establishment of nuclear architecture.

%B PLoS genetics %V 6 %P e1000889 %8 2010 %G eng %U http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000889 %R 10.1371/journal.pgen.1000889 %0 Journal Article %J Nature Biotechnology %D 2010 %T The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models %B Nature Biotechnology %V 28 %P 827 - 838 %8 Jan-08-2010 %G eng %U http://www.nature.com/articles/nbt.1665 %N 8 %! Nat Biotechnol %R 10.1038/nbt.1665 %0 Journal Article %J Nature biotechnology %D 2010 %T The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. %A Shi, Leming %A Campbell, Gregory %A Jones, Wendell D %A Campagne, Fabien %A Wen, Zhining %A Walker, Stephen J %A Su, Zhenqiang %A Chu, Tzu-Ming %A Goodsaid, Federico M %A Pusztai, Lajos %A Shaughnessy, John D %A Oberthuer, André %A Thomas, Russell S %A Paules, Richard S %A Fielden, Mark %A Barlogie, Bart %A Chen, Weijie %A Du, Pan %A Fischer, Matthias %A Furlanello, Cesare %A Gallas, Brandon D %A Ge, Xijin %A Megherbi, Dalila B %A Symmans, W Fraser %A Wang, May D %A Zhang, John %A Bitter, Hans %A Brors, Benedikt %A Bushel, Pierre R %A Bylesjo, Max %A Chen, Minjun %A Cheng, Jie %A Cheng, Jing %A Chou, Jeff %A Davison, Timothy S %A Delorenzi, Mauro %A Deng, Youping %A Devanarayan, Viswanath %A Dix, David J %A Dopazo, Joaquin %A Dorff, Kevin C %A Elloumi, Fathi %A Fan, Jianqing %A Fan, Shicai %A Fan, Xiaohui %A Fang, Hong %A Gonzaludo, Nina %A Hess, Kenneth R %A Hong, Huixiao %A Huan, Jun %A Irizarry, Rafael A %A Judson, Richard %A Juraeva, Dilafruz %A Lababidi, Samir %A Lambert, Christophe G %A Li, Li %A Li, Yanen %A Li, Zhen %A Lin, Simon M %A Liu, Guozhen %A Lobenhofer, Edward K %A Luo, Jun %A Luo, Wen %A McCall, Matthew N %A Nikolsky, Yuri %A Pennello, Gene A %A Perkins, Roger G %A Philip, Reena %A Popovici, Vlad %A Price, Nathan D %A Qian, Feng %A Scherer, Andreas %A Shi, Tieliu %A Shi, Weiwei %A Sung, Jaeyun %A Thierry-Mieg, Danielle %A Thierry-Mieg, Jean %A Thodima, Venkata %A Trygg, Johan %A Vishnuvajjala, Lakshmi %A Wang, Sue Jane %A Wu, Jianping %A Wu, Yichao %A Xie, Qian %A Yousef, Waleed A %A Zhang, Liang %A Zhang, Xuegong %A Zhong, Sheng %A Zhou, Yiming %A Zhu, Sheng %A Arasappan, Dhivya %A Bao, Wenjun %A Lucas, Anne Bergstrom %A Berthold, Frank %A Brennan, Richard J %A Buness, Andreas %A Catalano, Jennifer G %A Chang, Chang %A Chen, Rong %A Cheng, Yiyu %A Cui, Jian %A Czika, Wendy %A Demichelis, Francesca %A Deng, Xutao %A Dosymbekov, Damir %A Eils, Roland %A Feng, Yang %A Fostel, Jennifer %A Fulmer-Smentek, Stephanie %A Fuscoe, James C %A Gatto, Laurent %A Ge, Weigong %A Goldstein, Darlene R %A Guo, Li %A Halbert, Donald N %A Han, Jing %A Harris, Stephen C %A Hatzis, Christos %A Herman, Damir %A Huang, Jianping %A Jensen, Roderick V %A Jiang, Rui %A Johnson, Charles D %A Jurman, Giuseppe %A Kahlert, Yvonne %A Khuder, Sadik A %A Kohl, Matthias %A Li, Jianying %A Li, Li %A Li, Menglong %A Li, Quan-Zhen %A Li, Shao %A Li, Zhiguang %A Liu, Jie %A Liu, Ying %A Liu, Zhichao %A Meng, Lu %A Madera, Manuel %A Martinez-Murillo, Francisco %A Medina, Ignacio %A Meehan, Joseph %A Miclaus, Kelci %A Moffitt, Richard A %A Montaner, David %A Mukherjee, Piali %A Mulligan, George J %A Neville, Padraic %A Nikolskaya, Tatiana %A Ning, Baitang %A Page, Grier P %A Parker, Joel %A Parry, R Mitchell %A Peng, Xuejun %A Peterson, Ron L %A Phan, John H %A Quanz, Brian %A Ren, Yi %A Riccadonna, Samantha %A Roter, Alan H %A Samuelson, Frank W %A Schumacher, Martin M %A Shambaugh, Joseph D %A Shi, Qiang %A Shippy, Richard %A Si, Shengzhu %A Smalter, Aaron %A Sotiriou, Christos %A Soukup, Mat %A Staedtler, Frank %A Steiner, Guido %A Stokes, Todd H %A Sun, Qinglan %A Tan, Pei-Yi %A Tang, Rong %A Tezak, Zivana %A Thorn, Brett %A Tsyganova, Marina %A Turpaz, Yaron %A Vega, Silvia C %A Visintainer, Roberto %A von Frese, Juergen %A Wang, Charles %A Wang, Eric %A Wang, Junwei %A Wang, Wei %A Westermann, Frank %A Willey, James C %A Woods, Matthew %A Wu, Shujian %A Xiao, Nianqing %A Xu, Joshua %A Xu, Lei %A Yang, Lun %A Zeng, Xiao %A Zhang, Jialu %A Zhang, Li %A Zhang, Min %A Zhao, Chen %A Puri, Raj K %A Scherf, Uwe %A Tong, Weida %A Wolfinger, Russell D %X

Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.

%B Nature biotechnology %V 28 %P 827-38 %8 2010 Aug %G eng %U http://www.nature.com/nbt/journal/v28/n8/full/nbt.1665.html %0 Journal Article %J PLoS One %D 2010 %T Multidimensional gene set analysis of genomic data. %A Montaner, David %A Dopazo, Joaquin %K Databases, Genetic %K Gene Expression Profiling %K Gene Regulatory Networks %K Genome, Human %K Genomics %K Humans %K Models, Statistical %X

Understanding the functional implications of changes in gene expression, mutations, etc., is the aim of most genomic experiments. To achieve this, several functional profiling methods have been proposed. Such methods study the behaviour of different gene modules (e.g. gene ontology terms) in response to one particular variable (e.g. differential gene expression). In spite of the wealth of information provided by functional profiling methods, a common limitation to all of them is their inherent unidimensional nature. In order to overcome this restriction we present a multidimensional logistic model that allows studying the relationship of gene modules with different genome-scale measurements (e.g. differential expression, genotyping association, methylation, copy number alterations, heterozygosity, etc.) simultaneously. Moreover, the relationship of such functional modules with the interactions among the variables can also be studied, which produces novel results impossible to derive from the conventional unidimensional functional profiling methods. We report sound results of gene set associations that remained undetected by the conventional one-dimensional gene set analysis in several examples. Our findings demonstrate the potential of the proposed approach for the discovery of new cell functionalities with complex dependences on more than one variable.
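One plausible way to write down a multidimensional logistic model of this kind, for two genome-scale measurements, is given below. The notation is an assumption for illustration and is not taken from the paper: $M$ is a gene module, $x_g$ and $y_g$ are two per-gene scores (for example a differential expression statistic and a genotyping association statistic), and the $\beta$ terms are coefficients to be estimated.

```latex
\log \frac{P(g \in M)}{1 - P(g \in M)}
  = \beta_0 + \beta_1 x_g + \beta_2 y_g + \beta_{12}\, x_g y_g
```

Under this reading, $\beta_1$ and $\beta_2$ capture the association of the module with each measurement separately, while $\beta_{12}$ captures the interaction between the two, which a one-dimensional analysis cannot detect.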

%B PLoS One %V 5 %P e10348 %8 2010 Apr 27 %G eng %N 4 %1 https://www.ncbi.nlm.nih.gov/pubmed/20436964?dopt=Abstract %R 10.1371/journal.pone.0010348 %0 Journal Article %J Hum Mutat %D 2010 %T Mutation spectrum of EYS in Spanish patients with autosomal recessive retinitis pigmentosa. %A Barragán, Isabel %A Borrego, Salud %A Pieras, Juan Ignacio %A González-del Pozo, María %A Santoyo, Javier %A Ayuso, Carmen %A Baiget, Montserrat %A Millán, José M %A Mena, Marcela %A Abd El-Aziz, Mai M %A Audo, Isabelle %A Zeitz, Christina %A Littink, Karin W %A Dopazo, Joaquin %A Bhattacharya, Shomi S %A Antiñolo, Guillermo %K Amino Acid Sequence %K Animals %K Case-Control Studies %K DNA Mutational Analysis %K Drosophila Proteins %K Evolution, Molecular %K Eye Proteins %K Female %K Genes, Recessive %K Genetic Variation %K Humans %K Male %K Molecular Sequence Data %K mutation %K Pedigree %K Polymorphism, Single Nucleotide %K Protein Structure, Tertiary %K Retinitis pigmentosa %K Spain %K Structural Homology, Protein %X

Retinitis pigmentosa (RP) is a heterogeneous group of inherited retinal dystrophies characterised ultimately by the loss of photoreceptor cells. We have recently identified a new gene (EYS) encoding an ortholog of Drosophila space maker (spam) as a commonly mutated gene in autosomal recessive RP. In the present study, we report the identification of 73 sequence variations in EYS, of which 28 are novel. Of these, 42.9% (12/28) are very likely pathogenic, 17.9% (5/28) are possibly pathogenic, whereas 39.3% (11/28) are SNPs. In addition, we have detected 3 pathogenic changes previously reported in other populations. We also present the characterisation of EYS homologues in different species, and a detailed analysis of the EYS domains, with the identification of an interesting novel feature: a putative coiled-coil domain. The majority of the mutations in the arRP patients have been found within the domain structures of EYS. The minimum observed prevalence of distinct EYS mutations in our group of patients is 15.9% (15/94), confirming a major involvement of EYS in the pathogenesis of arRP in the Spanish population. Along with the detection of three recurrent mutations in the Caucasian population, our hypothesis of EYS being the first prevalent gene in arRP has been reinforced in the present study.

%B Hum Mutat %V 31 %P E1772-800 %8 2010 Nov %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/21069908?dopt=Abstract %R 10.1002/humu.21334 %0 Journal Article %J BMC bioinformatics %D 2010 %T Quantifying the relationship between sequence and three-dimensional structure conservation in RNA. %A E. Capriotti %A M. A. Marti-Renom %X

BACKGROUND: In recent years, the number of available RNA structures has grown rapidly, reflecting the increased interest in RNA biology. Similarly to the studies carried out two decades ago for proteins, which gave the fundamental grounds for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. RESULTS: Here we introduce an all-against-all sequence- and three-dimensional (3D) structure-based comparison of a representative set of RNA structures, which has allowed us to quantitatively confirm that: (i) there is a measurable relationship between sequence and structure conservation that weakens for alignments resulting in below 60% sequence identity, (ii) evolution tends to conserve more RNA structure than sequence, and (iii) there is a twilight zone for RNA homology detection. DISCUSSION: The computational analysis presented here quantitatively describes the relationship between sequence and structure for RNA molecules and defines a twilight zone region for detecting RNA homology. Our work could represent the theoretical basis and limitations for future developments in comparative RNA 3D structure prediction.

%B BMC bioinformatics %V 11 %P 322 %8 2010 Jun 15 %G eng %0 Journal Article %J PLoS Comput. Biol. %D 2010 %T Selection upon Genome Architecture: Conservation of Functional Neighborhoods with Changing Genes %A Al-Shahrour, Fátima %A Minguez, Pablo %A Marqués-Bonet, Tomás %A Gazave, Elodie %A Navarro, Arcadi %A Dopazo, Joaquin %B PLoS Comput. Biol. %V 6 %P e1000953 %G eng %U http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000953 %R doi:10.1371/journal.pcbi.1000953 %0 Journal Article %J Nucleic Acids Res %D 2010 %T Serial Expression Analysis: a web tool for the analysis of serial gene expression data. %A Nueda, Maria José %A Carbonell, José %A Medina, Ignacio %A Dopazo, Joaquin %A Conesa, Ana %K Algorithms %K Gene Expression Profiling %K Internet %K Kinetics %K Linear Models %K Oligonucleotide Array Sequence Analysis %K Software %X

Serial transcriptomics experiments investigate the dynamics of gene expression changes associated with a quantitative variable such as time or dosage. The statistical analysis of these data implies the study of global and gene-specific expression trends, the identification of significant serial changes, the comparison of expression profiles and the assessment of transcriptional changes in terms of cellular processes. We have created the SEA (Serial Expression Analysis) suite to provide a complete web-based resource for the analysis of serial transcriptomics data. SEA offers five different algorithms based on univariate, multivariate and functional profiling strategies framed within a user-friendly interface and a project-oriented architecture to facilitate the analysis of serial gene expression data sets from different perspectives. SEA is available at sea.bioinfo.cipf.es.

%B Nucleic Acids Res %V 38 %P W239-45 %8 2010 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/20525784?dopt=Abstract %R 10.1093/nar/gkq488 %0 Journal Article %J Nucleic acids research %D 2010 %T SIMAP–a comprehensive database of pre-calculated protein sequence similarities, domains, annotations and clusters. %A Rattei, Thomas %A Tischler, Patrick %A Götz, Stefan %A Jehl, Marc-André %A Hoser, Jonathan %A Arnold, Roland %A Ana Conesa %A Mewes, Hans-Werner %X

The prediction of protein function as well as the reconstruction of evolutionary genesis employing sequence comparison at large is still the most powerful tool in sequence analysis. Due to the exponential growth of the number of known protein sequences and the subsequent quadratic growth of the similarity matrix, the computation of the Similarity Matrix of Proteins (SIMAP) becomes a computationally intensive task. The SIMAP database provides a comprehensive and up-to-date pre-calculation of the protein sequence similarity matrix, sequence-based features and sequence clusters. As of September 2009, SIMAP covers 48 million proteins and more than 23 million non-redundant sequences. Novel features of SIMAP include the expansion of the sequence space by including databases such as ENSEMBL as well as the integration of metagenomes based on their consistent processing and annotation. Furthermore, protein function predictions by Blast2GO are pre-calculated for all sequences in SIMAP and the data access and query functions have been improved. SIMAP assists biologists to query the up-to-date sequence space systematically and facilitates large-scale downstream projects in computational biology. Access to SIMAP is freely provided through the web portal for individuals (http://mips.gsf.de/simap/) and for programmatic access through DAS (http://webclu.bio.wzw.tum.de/das/) and Web-Service (http://mips.gsf.de/webservices/services/SimapService2.0?wsdl).

%B Nucleic acids research %V 38 %P D223-6 %8 2010 Jan %G eng %0 Journal Article %J Protein engineering, design & selection : PEDS %D 2009 %T Alignment of multiple protein structures based on sequence and structure features. %A Madhusudhan, M. S. %A Webb, Benjamin M %A Marti-Renom, Marc A %A Eswar, Narayanan %A Sali, Andrej %X

Comparing the structures of proteins is crucial to gaining insight into protein evolution and function. Here, we align the sequences of multiple protein structures by a dynamic programming optimization of a scoring function that is a sum of an affine gap penalty and terms dependent on various sequence and structure features (SALIGN). The features include amino acid residue type, residue position, residue accessible surface area, residue secondary structure state and the conformation of a short segment centered on the residue. The multiple alignment is built by following the 'guide' tree constructed from the matrix of all pairwise protein alignment scores. Importantly, the method does not depend on the exact values of various parameters, such as feature weights and gap penalties, because the optimal alignment across a range of parameter values is found. Using multiple structure alignments in the HOMSTRAD database, SALIGN was benchmarked against MUSTANG for multiple alignments as well as against TM-align and CE for pairwise alignments. On average, SALIGN produces a 15% improvement in structural overlap over HOMSTRAD and 14% over MUSTANG, and yields more equivalent structural positions than TM-align and CE in 90% and 95% of cases, respectively. The utility of accurate multiple structure alignment is illustrated by its application to comparative protein structure modeling.
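
The dynamic programming step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the SALIGN implementation: the function name `affine_gap_score`, the match/mismatch scores and the gap parameters are assumptions chosen for illustration, and the sketch scores plain character identity between two sequences rather than the sequence and structure features the abstract lists.

```python
NEG_INF = float("-inf")


def affine_gap_score(a, b, match=1.0, mismatch=-1.0, gap_open=-2.0, gap_extend=-0.5):
    """Best global alignment score of a and b under an affine gap penalty.

    A gap of length k costs gap_open + (k - 1) * gap_extend. All default
    values are illustrative assumptions, not SALIGN parameters.
    """
    n, m = len(a), len(b)
    # Three tables (Gotoh-style recurrence):
    #   M[i][j]: best score with a[i-1] aligned to b[j-1]
    #   X[i][j]: best score with a[i-1] aligned to a gap
    #   Y[i][j]: best score with b[j-1] aligned to a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a gap pays gap_open; continuing one pays gap_extend.
            X[i][j] = max(M[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend,
                          Y[i - 1][j] + gap_open)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend,
                          X[i][j - 1] + gap_open)
    return max(M[n][m], X[n][m], Y[n][m])


print(affine_gap_score("ACGT", "ACGT"))  # 4.0: four matches, no gaps
print(affine_gap_score("ACGT", "AT"))    # -0.5: two matches and one gap of length 2
```

Because extending a gap is cheaper than opening a new one, the second call prefers a single two-residue gap over two separate gaps, which is the behaviour an affine penalty is meant to produce.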

%B Protein engineering, design & selection : PEDS %V 22 %P 569-74 %8 2009 Sep %G eng %0 Journal Article %J Leuk Lymphoma %D 2009 %T Analysis of chronic lymphotic leukemia transcriptomic profile: differences between molecular subgroups %A Jantus Lewintre, E. %A Reinoso Martin, C. %A Montaner, D. %A Marin, M. %A Jose Terol, M. %A Farras, R. %A Benet, I. %A Calvete, J. J. %A Dopazo, J. %A Garcia-Conde, J. %K cancer %K microarray data analysis %X

B cell chronic lymphocytic leukemia (CLL) is a lymphoproliferative disorder with a variable clinical course. Patients with an unmutated immunoglobulin heavy chain variable region (IgV(H)) gene show shorter progression-free and overall survival than patients with a mutated IgV(H) gene. In addition, BCL6 mutations identify a subgroup of patients with high risk of progression. Gene expression was analysed in 36 early-stage patients using high-density microarrays. Around 150 differentially expressed genes were found according to IgV(H) mutations, whereas no difference was found according to BCL6 mutations. Functional profiling methods allowed us to distinguish KEGG and gene ontology terms showing coordinated gene expression changes across subgroups of CLL. We validated a set of differentially expressed genes according to IgV(H) status, scoring them as putative prognostic markers in CLL. Among them, CRY1, LPL, CD82 and DUSP22 perform at least as well as ZAP70, which is currently the most widely used surrogate marker of IgV(H) status.

%B Leuk Lymphoma %V 50 %P 68-79 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19127482 %0 Journal Article %J Ciencia Hoy %D 2009 %T Bioinformática, Genómica y Evolución. Una alianza estratégica para la biología de este siglo. %A H. Dopazo %B Ciencia Hoy %V 19 %P 88-93 %G eng %0 Book %D 2009 %T Evolución y Adaptación. 150 años después del Origen de las Especies %E H. Dopazo %E Navarro, A. %X

Evolución y Adaptación: 150 años después del Origen de las Especies is a tribute to Charles Darwin on the 200th anniversary of his birth and the 150th anniversary of the publication that made him world famous. In this edition, 101 authors convened by the Sociedad Española de Biología Evolutiva have summarised their research in 51 articles. These are grouped into themes covering the problems of molecular evolution, morphological change, the evolution of development, the origin of species and their interactions, biological diversity, the evolution of behaviour, palaeobiology, experimental evolution, cultural evolution, and evolution in philosophy and teaching. Many of these works represent decades of sustained research in the laboratory and in the field. The common denominator of the articles in this book is the effort to convey, to an audience that is not necessarily expert, the current state of the research on adaptation and evolution being carried out in different laboratories. The work therefore summarises a large part of the research on biological evolution conducted in Spain. This edition thus bears witness to the "Fact of Evolution" and to the continued relevance of modern evolutionary theory as an explanatory framework for the biological world 150 years after the Origin of Species.

%I Obrapropia. %C Valencia. España %P 510 %G eng %0 Journal Article %J Microbiology (Reading) %D 2009 %T Exploring the antimicrobial action of a carbon monoxide-releasing compound through whole-genome transcription profiling of Escherichia coli. %A Nobre, Lígia S %A Al-Shahrour, Fátima %A Dopazo, Joaquin %A Saraiva, Lígia M %K Biofilms %K Carbon Monoxide %K Escherichia coli %K Escherichia coli Proteins %K Gene Expression Profiling %K Gene Expression Regulation, Bacterial %K Genes, Bacterial %K Genes, Regulator %K Genetic Complementation Test %K Methionine %K Microbial Viability %K mutation %K Oligonucleotide Array Sequence Analysis %K Organometallic Compounds %K Phenotype %K RNA, Bacterial %X

We recently reported that carbon monoxide (CO) has bactericidal activity. To understand its mode of action we analysed the gene expression changes occurring when Escherichia coli, grown aerobically and anaerobically, is treated with the CO-releasing molecule CORM-2 (tricarbonyldichlororuthenium(II) dimer). Microarray analysis shows that the E. coli CORM-2 response is multifaceted, with a high number of differentially regulated genes spread through several functional categories, namely genes involved in inorganic ion transport and metabolism, regulators, and genes implicated in post-translational modification, such as chaperones. CORM-2 has a higher impact in E. coli cells grown anaerobically, as judged by the repression of genes belonging to eight functional classes which are not seen in the response of aerobically CORM-2-treated cells. The biological relevance of the variations caused by CORM-2 was substantiated by studying the CORM-2 sensitivity of selected E. coli mutants. The results show that the deletion of redox-sensing regulators SoxS and OxyR increased the sensitivity to CORM-2 and suggest that while SoxS plays an important role in protection against CORM-2 under both growth conditions, OxyR seems to participate only in the aerobic CORM-2 response. Under anaerobic conditions, we found that the heat-shock proteins IbpA and IbpB contribute to CORM-2 defence since the deletion of these genes increases the sensitivity of the strain. The induction of several met genes and the hypersensitivity to CORM-2 of the ΔmetR, ΔmetI and ΔmetN mutant strains suggest that CO has effects on the methionine metabolism of E. coli. CORM-2 also affects the transcription of several E. coli biofilm-related genes and increases biofilm formation in E. coli. In particular, the absence of tqsA or bhsA increases the resistance of E. coli to CORM-2, and deletion of tqsA leads to a strain that has lost its capacity to form biofilm upon treatment with CORM-2.
In spite of the relatively stable nature of the CO molecule, our results show that CO is able to trigger a significant alteration in the transcriptome of E. coli which necessarily has effects in several key metabolic pathways.

%B Microbiology (Reading) %V 155 %P 813-824 %8 2009 Mar %G eng %N Pt 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/19246752?dopt=Abstract %R 10.1099/mic.0.023911-0 %0 Journal Article %J Artif Intell Med %D 2009 %T Formulating and testing hypotheses in functional genomics. %A Dopazo, Joaquin %K Biological Evolution %K Computational Biology %K Genomics %K Genotype %K Models, Theoretical %X

OBJECTIVE: The ultimate goal of any genome-scale experiment is to provide a functional interpretation of the results, relating the available genomic information to the hypotheses that originated the experiment.

METHODS AND RESULTS: Initially, this interpretation has been made on a pre-selection of relevant genes, based on the experimental values, followed by the study of the enrichment in some functional properties. Nevertheless, functional enrichment methods were shown to have a flaw: the first step of gene selection was too stringent given that the cooperation among genes was ignored. The assumption that modules of genes related by relevant biological properties (functionality, co-regulation, chromosomal location, etc.) are the real actors of cell biology led to the development of new procedures, inspired by systems biology criteria, generically known as gene-set methods. These methods have been successfully used to analyze transcriptomic and large-scale genotyping experiments as well as to test other genome-scale hypotheses in other fields such as phylogenomics.

%B Artif Intell Med %V 45 %P 97-107 %8 2009 Feb-Mar %G eng %N 2-3 %1 https://www.ncbi.nlm.nih.gov/pubmed/18789659?dopt=Abstract %R 10.1016/j.artmed.2008.08.003 %0 Journal Article %J BMC Bioinformatics %D 2009 %T Functional assessment of time course microarray data. %A Nueda, Maria José %A Sebastián, Patricia %A Tarazona, Sonia %A Garcia-Garcia, Francisco %A Dopazo, Joaquin %A Ferrer, Alberto %A Conesa, Ana %K Computer Simulation %K Gene Expression Profiling %K Oligonucleotide Array Sequence Analysis %K Time Factors %X

MOTIVATION: Time-course microarray experiments study the progress of gene expression over time across one or several experimental conditions. Most developed analysis methods focus on the clustering or the differential expression analysis of genes and do not integrate functional information. The assessment of the functional aspects of time-course transcriptomics data requires the use of approaches that exploit the activation dynamics of the functional categories to which genes are annotated.

METHODS: We present three novel methodologies for the functional assessment of time-course microarray data. i) maSigFun derives from the maSigPro method, a regression-based strategy to model time-dependent expression patterns and identify genes with differences across series. maSigFun fits a regression model for groups of genes labeled by a functional class and selects those categories which have a significant model. ii) PCA-maSigFun fits a PCA model of each functional class-defined expression matrix to extract orthogonal patterns of expression change, which are then assessed for their fit to a time-dependent regression model. iii) ASCA-functional uses the ASCA model to rank genes according to their correlation to principal time expression patterns and assesses functional enrichment in a GSA fashion. We used simulated and experimental datasets to study these novel approaches. Results were compared to alternative methodologies.

RESULTS: Synthetic and experimental data showed that the different methods are able to capture different aspects of the relationship between genes, functions and co-expression that are biologically meaningful. The methods should not be considered as competitive but they provide different insights into the molecular and functional dynamic events taking place within the biological system under study.

%B BMC Bioinformatics %V 10 Suppl 6 %P S9 %8 2009 Jun 16 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/19534758?dopt=Abstract %R 10.1186/1471-2105-10-S6-S9 %0 Journal Article %J Leuk Lymphoma %D 2009 %T Functional signatures identified in B-cell non-Hodgkin lymphoma profiles. %A Aggarwal, Mohit %A Sánchez-Beato, Margarita %A Gómez-López, Gonzalo %A Al-Shahrour, Fátima %A Martínez, Nerea %A Rodríguez, Antonia %A Ruiz-Ballesteros, Elena %A Camacho, Francisca I %A Pérez-Rosado, Alberto %A de la Cueva, Paloma %A Artiga, María J %A Pisano, David G %A Kimby, Eva %A Dopazo, Joaquin %A Villuendas, Raquel %A Piris, Miguel A %K Adult %K Cluster Analysis %K Gene Expression Profiling %K Gene Expression Regulation, Leukemic %K Genetic Heterogeneity %K Humans %K Lymphoma, B-Cell %K Neoplasm Proteins %K Oligonucleotide Array Sequence Analysis %K RNA, Messenger %K RNA, Neoplasm %K Transcription, Genetic %X

Gene-expression profiling in B-cell lymphomas has provided crucial data on specific lymphoma types, which can contribute to the identification of essential lymphoma survival genes and pathways. In this study, the gene-expression profiling data of all major B-cell lymphoma types were analyzed by unsupervised clustering. The transcriptome classification so obtained was explored using gene set enrichment analysis, generating a heatmap for B-cell lymphoma that identifies common lymphoma survival mechanisms and potential therapeutic targets, recognizing sets of coregulated genes and functional pathways expressed in different lymphoma types. Some of the most relevant signatures (stroma, cell cycle, B-cell receptor (BCR)) are shared by multiple lymphoma types or subclasses. Specific attention was paid to the analysis of BCR and coregulated pathways, defining molecular heterogeneity within multiple B-cell lymphoma types.

%B Leuk Lymphoma %V 50 %P 1699-708 %8 2009 Oct %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/19863341?dopt=Abstract %R 10.1080/10428190903189035 %0 Journal Article %J BMC Genomics %D 2009 %T Gene set internal coherence in the context of functional profiling. %A Montaner, David %A Minguez, Pablo %A Al-Shahrour, Fátima %A Dopazo, Joaquin %K Algorithms %K Breast Neoplasms %K Carcinoma, Intraductal, Noninfiltrating %K Computational Biology %K Databases, Nucleic Acid %K Female %K Gene Expression Profiling %K Genomics %K Humans %K Oligonucleotide Array Sequence Analysis %K Papillomavirus Infections %K Reproducibility of Results %X

BACKGROUND: Functional profiling methods have been extensively used in the context of high-throughput experiments and, in particular, in microarray data analysis. Such methods use available biological information to define different types of functional gene modules (e.g. gene ontology -GO-, KEGG pathways, etc.) whose representation in a pre-defined list of genes is further studied. In the most popular type of microarray experimental designs (e.g. up- or down-regulated genes, clusters of co-expressing genes, etc.) or in other genomic experiments (e.g. Chip-on-chip, epigenomics, etc.) these lists are composed of genes with a high degree of co-expression. Therefore, an implicit assumption in the application of functional profiling methods within this context is that the genes corresponding to the modules tested are effectively defining sets of co-expressing genes. Nevertheless, not all the functional modules are biologically coherent entities in terms of co-expression, which will eventually hinder their detection with conventional methods of functional enrichment.

RESULTS: Using a large collection of microarray data we have carried out a detailed survey of internal correlation in GO terms and KEGG pathways, providing a coherence index to be used for measuring functional module co-regulation. An unexpectedly low level of internal correlation was found among the modules studied. Only around 30% of the modules defined by GO terms and 57% of the modules defined by KEGG pathways display an internal correlation higher than expected by chance. This information on the internal correlation of the genes within the functional modules can be used in the context of a logistic regression model in a simple way to improve their detection in gene expression experiments.

CONCLUSION: For the first time, an exhaustive study on the internal co-expression of the most popular functional categories has been carried out. Interestingly, the real level of coexpression within many of them is lower than expected (or even nonexistent), which will preclude their detection by means of most conventional functional profiling methods. If the gene-to-function correlation information is used in functional profiling methods, the results improve on those obtained by conventional enrichment methods.
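
The idea of testing whether a functional module is internally correlated beyond chance can be sketched as follows. This is not the coherence index defined in the paper: the mean pairwise Pearson correlation, the random-gene-set null, the function names and the toy expression values are all assumptions made for illustration.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    vx = sum((xi - mx) ** 2 for xi in x)
    vy = sum((yi - my) ** 2 for yi in y)
    return cov / sqrt(vx * vy)


def mean_pairwise_correlation(expr, genes):
    """Average correlation over all gene pairs in a module (assumed index)."""
    pairs = list(combinations(genes, 2))
    return sum(pearson(expr[g], expr[h]) for g, h in pairs) / len(pairs)


def coherence_p_value(expr, gene_set, n_random=1000, seed=0):
    """Compare a module with random gene sets of the same size."""
    rng = random.Random(seed)
    observed = mean_pairwise_correlation(expr, gene_set)
    all_genes = sorted(expr)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(all_genes, len(gene_set))
        # Small tolerance so the same set in another order still counts.
        if mean_pairwise_correlation(expr, sample) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the empirical p-value above zero.
    return observed, (hits + 1) / (n_random + 1)


# Hypothetical toy expression matrix: gene -> values across five samples.
expr = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],
    "g3": [1.5, 2.5, 3.5, 4.5, 5.5],
    "g4": [5, 1, 4, 2, 3],
    "g5": [2, 5, 1, 4, 3],
    "g6": [3, 3.5, 1, 5, 2],
    "g7": [4, 2, 5, 1, 3],
    "g8": [1, 4, 2, 5, 3],
}

observed, p_value = coherence_p_value(expr, ["g1", "g2", "g3"])
print(observed, p_value)
```

In this toy matrix the module g1, g2, g3 is perfectly co-expressed, so few random sets match it; a module whose genes are no more correlated than random sets would yield a large empirical p-value and would be hard to detect with enrichment methods that assume co-expression.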

%B BMC Genomics %V 10 %P 197 %8 2009 Apr 27 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/19397819?dopt=Abstract %R 10.1186/1471-2164-10-197 %0 Journal Article %J Nucleic Acids Res %D 2009 %T Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies. %A Medina, Ignacio %A Montaner, David %A Bonifaci, Núria %A Pujana, Miguel Angel %A Carbonell, José %A Tárraga, Joaquín %A Al-Shahrour, Fátima %A Dopazo, Joaquin %K Biological Phenomena %K Breast Neoplasms %K Female %K Genes %K Genetic Variation %K Genome-Wide Association Study %K Humans %K Polymorphism, Single Nucleotide %K Software %K User-Computer Interface %X

Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high resolution available today to carry out genotyping studies, the success of its application in real studies has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in testing power for this type of study. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/.

%B Nucleic Acids Res %V 37 %P W340-4 %8 2009 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/19502494?dopt=Abstract %R 10.1093/nar/gkp481 %0 Book Section %B Evolución y Adaptación. 150 años después del Origen de las Especies %D 2009 %T Genómica Comparativa y Selección Natural. Aplicaciones en el Genoma Humano. Capítulo 1.6 %A Serra, François %A Arbiza, L. %A H. Dopazo %E H. Dopazo %E Navarro, A. %X

The search for the adaptive events at the molecular level that have differentiated the human genome from that of our closest living relative, the chimpanzee, has been one of the most actively researched areas in comparative genomics. In parallel, the functional prediction of genetic variants in our species has been an area of intense development in bioinformatics. In this work we discuss earlier and more recent results that reflect these developments. We will see that in all cases the estimation of selective pressures, at the level of individual genes or of protein residues, is the common denominator for discussing both aspects. Finally, we show how the analysis of these selective pressures by functional groups of genes is a viable alternative with sufficient statistical power for the analysis of adaptation and of evolutionary constraints at the genomic level.

%B Evolución y Adaptación. 150 años después del Origen de las Especies %I Obrapropia. %C Valencia %P 51-59 %G eng %& 19 %0 Journal Article %J PLoS Negl Trop Dis %D 2009 %T A kernel for open source drug discovery in tropical diseases %A Orti, L. %A Carbajo, R. J. %A Pieper, U. %A Eswar, N. %A Maurer, S. M. %A Rai, A. K. %A Taylor, G. %A Todd, M. H. %A Pineda-Lucena, A. %A Sali, A. %A M. A. Marti-Renom %X BACKGROUND: Conventional patent-based drug development incentives work badly for the developing world, where commercial markets are usually small to non-existent. For this reason, the past decade has seen extensive experimentation with alternative R&D institutions ranging from private-public partnerships to development prizes. Despite extensive discussion, however, one of the most promising avenues, open source drug discovery, has remained elusive. We argue that the stumbling block has been the absence of a critical mass of preexisting work that volunteers can improve through a series of granular contributions. Historically, open source software collaborations have almost never succeeded without such "kernels". METHODOLOGY/PRINCIPAL FINDINGS: Here, we use a computational pipeline for: (i) comparative structure modeling of target proteins, (ii) predicting the localization of ligand binding sites on their surfaces, and (iii) assessing the similarity of the predicted ligands to known drugs. Our kernel currently contains 143 and 297 protein targets from ten pathogen genomes that are predicted to bind a known drug or a molecule similar to a known drug, respectively. The kernel provides a source of potential drug targets and drug candidates around which an online open source community can nucleate. Using NMR spectroscopy, we have experimentally tested our predictions for two of these targets, confirming one and invalidating the other.
CONCLUSIONS/SIGNIFICANCE: The TDI kernel, which is being offered under the Creative Commons attribution share-alike license for free and unrestricted use, can be accessed on the World Wide Web at http://www.tropicaldisease.org. We hope that the kernel will facilitate collaborative efforts towards the discovery of new drugs against parasites that cause tropical diseases. %B PLoS Negl Trop Dis %V 3 %P e418 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19381286 %0 Journal Article %J Nat Biotechnol %D 2009 %T A kernel for the Tropical Disease Initiative %A Orti, L. %A Carbajo, R. J. %A Pieper, U. %A Eswar, N. %A Maurer, S. M. %A Rai, A. K. %A Taylor, G. %A Todd, M. H. %A Pineda-Lucena, A. %A Sali, A. %A M. A. Marti-Renom %B Nat Biotechnol %V 27 %P 320-1 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19352362 %0 Journal Article %J Funct Integr Genomics %D 2009 %T Membrane transporters and carbon metabolism implicated in chloride homeostasis differentiate salt stress responses in tolerant and sensitive Citrus rootstocks %A Brumos, J. %A Colmenero-Flores, J. M. %A A. Conesa %A Izquierdo, P. %A Sanchez, G. %A Iglesias, D. J. %A Lopez-Climent, M. F. %A Gomez-Cadenas, A. %A Talon, M. %X

Salinity tolerance in Citrus is strongly related to leaf chloride accumulation. Both chloride homeostasis and specific genetic responses to Cl(-) toxicity are issues scarcely investigated in plants. To discriminate the transcriptomic network related to Cl(-) toxicity and salinity tolerance, we have used two Cl(-) salt treatments (NaCl and KCl) to perform a comparative microarray approach on two Citrus genotypes, the salt-sensitive Carrizo citrange, a poor Cl(-) excluder, and the tolerant Cleopatra mandarin, an efficient Cl(-) excluder. The data indicated that Cl(-) toxicity, rather than Na(+) toxicity and/or the concomitant osmotic perturbation, is the primary factor involved in the molecular responses of citrus plant leaves to salinity. A number of uncharacterized membrane transporter genes, like NRT1-2, were differentially regulated in the tolerant and the sensitive genotypes, suggesting its potential implication in Cl(-) homeostasis. Analyses of enriched functional categories showed that the tolerant rootstock induced wider stress responses in gene expression while repressing central metabolic processes such as photosynthesis and carbon utilization. These features were in agreement with phenotypic changes in the patterns of photosynthesis, transpiration, and stomatal conductance and support the concept that regulation of transpiration and its associated metabolic adjustments configure an adaptive response to salinity that reduces Cl(-) accumulation in the tolerant genotype.

%B Funct Integr Genomics %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19190944 %0 Journal Article %J Nucleic Acids Res %D 2009 %T MODBASE, a database of annotated comparative protein structure models and associated resources %A Pieper, U. %A Eswar, N. %A Webb, B. M. %A Eramian, D. %A Kelly, L. %A Barkan, D. T. %A Carter, H. %A Mankoo, P. %A Karchin, R. %A M. A. Marti-Renom %A Davis, F. P. %A Sali, A. %K *Databases, Protein %K Genomics %K Humans %K Ligands %K *Models, Molecular %K Mutation %K Polymorphism, Single Nucleotide %K Protein Folding %K Protein Interaction Domains and Motifs %K *Protein Structure, Tertiary %K Proteins/genetics %K *Structural Homology, Protein %K User-Computer Interface %X MODBASE (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by MODPIPE, an automated modeling pipeline that relies primarily on MODELLER for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE currently contains 5,152,695 reliable models for domains in 1,593,209 unique protein sequences; only models based on statistically significant alignments and/or models assessed to have the correct fold are included. MODBASE also allows users to calculate comparative models on demand, through an interface to the MODWEB modeling server (http://salilab.org/modweb). Other resources integrated with MODBASE include databases of multiple protein structure alignments (DBAli), structurally defined ligand binding sites (LIGBASE), predicted ligand binding sites (AnnoLyze), structurally defined binary domain interfaces (PIBASE) and annotated single nucleotide polymorphisms and somatic mutations found in human proteins (LS-SNP, LS-Mut). MODBASE models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/).
%B Nucleic Acids Res %V 37 %P D347-54 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18948282 %0 Journal Article %J OMICS %D 2009 %T Modeling and managing experimental data using FuGE. %A Andrew R Jones %A Allyson L Lister %A Leandro Hermida %A Peter Wilkinson %A Martin Eisenacher %A Khalid Belhajjame %A Frank Gibson %A Phil Lord %A Matthew Pocock %A Heiko Rosenfelder %A Santoyo-López, Javier %A Anil Wipat %A Norman W Paton %B OMICS %V 13 %P 239-51 %G eng %0 Journal Article %J Bioinformatics %D 2009 %T ModLink+: Improving fold recognition by using protein-protein interactions %A Fornes, O. %A Aragues, R. %A Espadaler, J. %A M. A. Marti-Renom %A Sali, A. %A Oliva, B. %K protein folding %X

MOTIVATION: Several strategies have been developed to predict the fold of a target protein sequence, most of which are based on aligning the target sequence to other sequences of known structure. Previously, we demonstrated that the consideration of protein-protein interactions significantly increases the accuracy of fold assignment compared to PSI-BLAST sequence comparisons. A drawback of our method was the low number of proteins to which a fold could be assigned. Here, we present an improved version of the method that addresses this limitation. We also compare our method to other state-of-the-art fold assignment methodologies. RESULTS: Our approach (ModLink+) has been tested on 3,716 proteins with domain folds classified in the Structural Classification Of Proteins (SCOP) as well as known interacting partners in the Database of Interacting Proteins (DIP). For this test set, the ratio of success (PPV) on fold assignment increases from 75% for PSI-BLAST, 83% for HHSearch and 81% for PRC to more than 90% for ModLink+ at the e-value cutoff of 10(-3). Under this e-value, ModLink+ can assign a fold to 30-45% of the proteins in the test set, while our previous method could cover less than 25%. When applied to 6,384 proteins with unknown fold in the yeast proteome, ModLink+ combined with PSI-BLAST assigns a fold for domains in 3,738 proteins, while PSI-BLAST alone only covers 2,122 proteins, HHSearch 2,969 and PRC 2,826 proteins, using a threshold e-value that would represent a PPV higher than 82% for each method in the test set. AVAILABILITY: The ModLink+ server is freely accessible in the World Wide Web at http://sbi.imim.es/modlink/. CONTACT: boliva@imim.es.
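The two figures of merit quoted in this abstract, the ratio of success (PPV) and the fraction of proteins that receive a fold assignment, can be illustrated with a minimal Python sketch. The function names and the counts below are hypothetical and chosen only for illustration; only the test-set size of 3,716 proteins comes from the abstract, and this is not the ModLink+ implementation.

```python
def ppv(true_positives, false_positives):
    """Positive predictive value: correct fold assignments / all assignments made."""
    assigned = true_positives + false_positives
    if assigned == 0:
        raise ValueError("no fold assignments were made")
    return true_positives / assigned


def coverage(assigned, total):
    """Fraction of the test set that received any fold assignment."""
    if total <= 0:
        raise ValueError("total must be positive")
    return assigned / total


# Test-set size reported in the abstract.
TOTAL_PROTEINS = 3716

# Hypothetical counts, NOT taken from the paper.
CORRECT, WRONG = 1350, 150

example_ppv = ppv(CORRECT, WRONG)
example_coverage = coverage(CORRECT + WRONG, TOTAL_PROTEINS)
print(f"PPV = {example_ppv:.2f}, coverage = {example_coverage:.2f}")
```

The sketch makes the trade-off in the abstract explicit: a stricter e-value cutoff tends to raise PPV while lowering coverage, so the two numbers are only meaningful when reported together.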

%B Bioinformatics %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19357100 %0 Journal Article %J BMC Res Notes %D 2009 %T Parallel changes in gene expression in peripheral blood mononuclear cells and the brain after maternal separation in the mouse. %A Johan H van Heerden %A Ana Conesa %A Dan J Stein %A Montaner, David %A Vivienne Russell %A Nicola Illing %B BMC Res Notes %V 2 %P 195 %G eng %0 Journal Article %J Biological Theory %D 2009 %T Pere Alberch: Originator of EvoDevo %A Reiss, JO %A Burke, A C %A Archer, C %A De Renzi, M %A H. Dopazo %A Etxeberria, A %A Gale, E A %A Hinchliffe, J R %A Nuño de la Rosa, L %A Rose, C S %A Rasskin-Gutman, D %A Müller, G %B Biological Theory %V 3 %P 351-353 %G eng %0 Conference Paper %D 2009 %T Peripheral blood cells transcriptome to study new biomarkers for myocardial infarction follow up %A Silbiger, Vivian %A Luchessi, André %A Hirata, Rosario %A Carracedo, Ángel %A Brión, Maria %A Lima Neto, Lidio %A P. Pastorelli, C %A Dopazo, Joaquin %A Montaner, David %A Garcia, F %A P. Sampaio, M %A P. Pereira, M %A S. Santos, E %A Armaganijan, Dikran %A Hirata, Mario %8 06 %G eng %0 Book Section %B Biological Data Mining in Protein Interaction Networks %D 2009 %T Protein Interactions for Functional Genomics %A Minguez, P. %A Dopazo, J. %E Li, Xiao-Li %E Ng, See-Kiong %B Biological Data Mining in Protein Interaction Networks %I Idea Group Inc (IGI) %C Hershey, USA %P 223-238 %G eng %U http://books.google.es/books?id=pNyCy5GsqtkC %0 Journal Article %J Nucl. Acids Res. %D 2009 %T SARA: a server for function annotation of RNA structures %A Capriotti, Emidio %A M. A. Marti-Renom %K RNA %K RNA structure %X

Recent interest in non-coding RNA transcripts has resulted in a rapid increase of deposited RNA structures in the Protein Data Bank. However, a characterization and functional classification of the RNA structure and function space have only been partially addressed. Here, we introduce the SARA program for pair-wise alignment of RNA structures as a web server for structure-based RNA function assignment. The SARA server relies on the SARA program, which aligns two RNA structures based on a unit-vector root-mean-square approach. The likely accuracy of the SARA alignments is assessed by three different P-values estimating the statistical significance of the sequence, secondary structure and tertiary structure identity scores, respectively. Our benchmarks, which relied on a set of 419 RNA structures with known SCOR structural class, indicate that at a negative logarithm of mean P-value greater than or equal to 2.5, SARA can assign the correct or a similar SCOR class to 81.4% and 95.3% of the benchmark set, respectively. The SARA server is freely accessible via the World Wide Web at http://sgu.bioinfo.cipf.es/services/SARA/.

%B Nucl. Acids Res. %P gkp433 %G eng %U http://nar.oxfordjournals.org/cgi/content/abstract/gkp433v1 %R 10.1093/nar/gkp433 %0 Journal Article %J Proc Biol Sci %D 2009 %T Sexual selection drives weak positive selection in protamine genes and high promoter divergence, enhancing sperm competitiveness %A Martin-Coello, J. %A H. Dopazo %A Arbiza, L. %A Ausio, J. %A Roldan, E. R. %A Gomendio, M. %K Adaptation %K positive selection %K sperm competition %X

Phenotypic adaptations may be the result of changes in gene structure or gene regulation, but little is known about the evolution of gene expression. In addition, it is unclear whether the same selective forces may operate at both levels simultaneously. Reproductive proteins evolve rapidly, but the underlying selective forces promoting such rapid changes are still a matter of debate. In particular, the role of sexual selection in driving positive selection among reproductive proteins remains controversial, whereas its potential influence on changes in promoter regions has not been explored. Protamines are responsible for maintaining DNA in a compacted form in chromosomes in sperm and the available evidence suggests that they evolve rapidly. Because protamines condense DNA within the sperm nucleus, they influence sperm head shape. Here, we examine the influence of sperm competition upon protamine 1 and protamine 2 genes and their promoters, by comparing closely related species of Mus that differ in relative testes size, a reliable indicator of levels of sperm competition. We find evidence of positive selection in the protamine 2 gene in the species with the highest inferred levels of sperm competition. In addition, sperm competition levels across all species are strongly associated with high divergence in protamine 2 promoters that, in turn, are associated with sperm swimming speed. We suggest that changes in protamine 2 promoters are likely to enhance sperm swimming speed by making sperm heads more hydrodynamic. Such phenotypic changes are adaptive because sperm swimming speed may be a major determinant of fertilization success under sperm competition. Thus, when species have diverged recently, few changes in gene-coding sequences are found, while high divergence in promoters seems to be associated with the intensity of sexual selection.

%B Proc Biol Sci %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19364735 %0 Journal Article %J Nucleic Acids Res %D 2009 %T SNOW, a web-based tool for the statistical analysis of protein-protein interaction networks. %A Minguez, Pablo %A Götz, Stefan %A Montaner, David %A Al-Shahrour, Fátima %A Dopazo, Joaquin %K Computer Graphics %K Data Interpretation, Statistical %K Databases, Protein %K Humans %K Internet %K Protein Interaction Mapping %K Software %X

Understanding the structure and the dynamics of the complex intercellular network of interactions that contributes to the structure and function of a living cell is one of the main challenges of today's biology. SNOW inputs a collection of protein (or gene) identifiers and, by using the interactome as scaffold, draws the connections among them, calculates several relevant network parameters and, as a novelty among the rest of tools of its class, it estimates their statistical significance. The parameters calculated for each node are: connectivity, betweenness and clustering coefficient. It also calculates the number of components, number of bicomponents and articulation points. An interactive network viewer is also available to explore the resulting network. SNOW is available at http://snow.bioinfo.cipf.es.
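Some of the network parameters named in this abstract can be illustrated with a short, dependency-free Python sketch. It computes connectivity (node degree), the clustering coefficient and the number of connected components on a toy undirected graph. The edge list and function names are hypothetical, and this is not SNOW's code; betweenness, bicomponents, articulation points and the significance estimation are left out.

```python
from itertools import combinations


def build_graph(edges):
    """Return an adjacency map {node: set(neighbours)} for an undirected graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def connectivity(graph, node):
    """Number of direct interaction partners (node degree)."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Fraction of neighbour pairs that are themselves connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2.0 * linked / (k * (k - 1))


def count_components(graph):
    """Number of connected components, found by iterative depth-first search."""
    seen = set()
    components = 0
    for start in graph:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(graph[node] - seen)
    return components


# Hypothetical toy interactome: triangle A-B-C, a tail C-D, and a separate pair E-F.
EDGES = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("E", "F")]
GRAPH = build_graph(EDGES)

for name in sorted(GRAPH):
    print(name, connectivity(GRAPH, name), round(clustering_coefficient(GRAPH, name), 3))
print("components:", count_components(GRAPH))
```

In practice a graph library would normally be used for these measures; the plain-dictionary version is meant only to make the definitions concrete.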

%B Nucleic Acids Res %V 37 %P W109-14 %8 2009 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/19454602?dopt=Abstract %R 10.1093/nar/gkp402 %0 Journal Article %J Nucl. Acids Res. %D 2009 %T SNOW, a web-based tool for the statistical analysis of protein-protein interaction networks %A Minguez, Pablo %A Gotz, S. %A Montaner, David %A Fatima Al-Shahrour %A Dopazo, Joaquin %K interactome %K network %K snow %X

Understanding the structure and the dynamics of the complex intercellular network of interactions that contributes to the structure and function of a living cell is one of the main challenges of today’s biology. SNOW inputs a collection of protein (or gene) identifiers and, by using the interactome as scaffold, draws the connections among them, calculates several relevant network parameters and, as a novelty among the rest of tools of its class, it estimates their statistical significance. The parameters calculated for each node are: connectivity, betweenness and clustering coefficient. It also calculates the number of components, number of bicomponents and articulation points. An interactive network viewer is also available to explore the resulting network. SNOW is available at http://snow.bioinfo.cipf.es.

%B Nucl. Acids Res. %V 37 %P W109-114 %G eng %U http://nar.oxfordjournals.org/content/early/2009/05/19/nar.gkp402.full %R 10.1093/nar/gkp402 %0 Journal Article %J Nature Methods %D 2009 %T Statistical methods for analysis of high-throughput RNA interference screens %A Birmingham, Amanda %A Selfors, Laura M %A Forster, Thorsten %A Wrobel, David %A Kennedy, Caleb J %A Shanks, Emma %A Santoyo-López, Javier %A Dunican, Dara J %A Long, Aideen %A Kelleher, Dermot %A Smith, Queta %A Beijersbergen, Roderick L %A Ghazal, Peter %A Shamu, Caroline E %K gene silencing %K regulation %K siRNA %B Nature Methods %I Nature Publishing Group %V 6 %P 569-575 %8 2009/08//print %@ 1548-7091 %G eng %U http://dx.doi.org/10.1038/nmeth.1351 %0 Book Section %B Structural Bioinformatics %D 2009 %T Structural Comparison and Alignment %A M. A. Marti-Renom %A E. Capriotti %A Shindyalov, I. %A Bourne, P. %K Structural Bioinformatics %B Structural Bioinformatics %7 2nd %I Wiley-Blackwell %C New Jersey, USA %G eng %U http://www.amazon.com/gp/product/0470181052/ %0 Journal Article %J Molecular and Cellular Toxicology %D 2009 %T On the Use of Functional Module Definitions in the Analysis of Genomic Experiments %A Dopazo, Joaquin %B Molecular and Cellular Toxicology %V 5 %P 47-47 %8 09 %G eng %0 Book Section %B Computational Structural Biology %D 2008 %T Assessment of protein structure predictions %A E. Capriotti %A M. A. Marti-Renom %B Computational Structural Biology %I World Scientific Publishing Company %C New Jersey, USA %G eng %U http://www.amazon.com/dp/9812778772/ %0 Journal Article %J Nucleic Acids Res %D 2008 %T Babelomics: advanced functional profiling of transcriptomics, proteomics and genomics experiments %A Fatima Al-Shahrour %A Carbonell, J. %A Minguez, P. %A Goetz, S. %A A. Conesa %A Tarraga, J. %A Medina, Ignacio %A Alloza, E. %A Montaner, D. %A Dopazo, J. %K babelomics %K functional profiling %X

We present a new version of Babelomics, a complete suite of web tools for the functional profiling of genome scale experiments, with new and improved methods as well as more types of functional definitions. Babelomics includes different flavours of conventional functional enrichment methods as well as more advanced gene set analysis methods that make it a unique tool among the similar resources available. In addition to the well-known functional definitions (GO, KEGG), Babelomics includes new ones such as Biocarta pathways or text mining-derived functional terms. Regulatory modules implemented include transcriptional control (Transfac, CisRed) and other levels of regulation such as miRNA-mediated interference. Moreover, Babelomics allows for sub-selection of terms in order to test more focused hypotheses. Gene annotation correspondence tables can also be imported, which allows testing with user-defined functional modules. Finally, a tool for the 'de novo' functional annotation of sequences has been included in the system. This allows as-yet unannotated organisms to be used in the program. Babelomics has been extensively re-engineered and now includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. Babelomics is available at http://www.babelomics.org.

%B Nucleic Acids Res %V 36 %P W341-6 %G eng %U http://nar.oxfordjournals.org/content/36/suppl_2/W341.long %0 Journal Article %J BMC Med Genomics %D 2008 %T Biological processes, properties and molecular wiring diagrams of candidate low-penetrance breast cancer susceptibility genes %A Bonifaci, N. %A Berenguer, A. %A Diez, J. %A Reina, O. %A Medina, Ignacio %A Dopazo, J. %A Moreno, V. %A Pujana, M. A. %K gene set %K GWAS %K SNP %X

BACKGROUND: Recent advances in whole-genome association studies (WGASs) for human cancer risk are beginning to provide the part lists of low-penetrance susceptibility genes. However, statistical analysis in these studies is complicated by the vast number of genetic variants examined and the weak effects observed, as a result of which constraints must be incorporated into the study design and analytical approach. In this scenario, biological attributes beyond the adjusted statistics generally receive little attention and, more importantly, the fundamental biological characteristics of low-penetrance susceptibility genes have yet to be determined. METHODS: We applied an integrative approach for identifying candidate low-penetrance breast cancer susceptibility genes, their characteristics and molecular networks through the analysis of diverse sources of biological evidence. RESULTS: First, examination of the distribution of Gene Ontology terms in ordered WGAS results identified asymmetrical distribution of Cell Communication and Cell Death processes linked to risk. Second, analysis of 11 different types of molecular or functional relationships in genomic and proteomic data sets defined the "omic" properties of candidate genes: i/ differential expression in tumors relative to normal tissue; ii/ somatic genomic copy number changes correlating with gene expression levels; iii/ differential expression across age at diagnosis; and iv/ expression changes after BRCA1 perturbation. Finally, network modeling of the effects of variants on germline gene expression showed higher connectivity than expected by chance between novel candidates and with known susceptibility genes, which supports functional relationships and provides mechanistic hypotheses of risk.
CONCLUSION: This study proposes that cell communication and cell death are major biological processes perturbed in risk of breast cancer conferred by low-penetrance variants, and defines the common omic properties, molecular interactions and possible functional effects of candidate genes and proteins.

%B BMC Med Genomics %V 1 %P 62 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19094230 %0 Journal Article %J Int J Plant Genomics %D 2008 %T Blast2GO: A Comprehensive Suite for Functional Analysis in Plant Genomics %A A. Conesa %A Gotz, S. %X

Functional annotation of novel sequence data is a primary requirement for the utilization of functional genomics approaches in plant research. In this paper, we describe the Blast2GO suite as a comprehensive bioinformatics tool for functional annotation of sequences and data mining on the resulting annotations, primarily based on the gene ontology (GO) vocabulary. Blast2GO optimizes function transfer from homologous sequences through an elaborate algorithm that considers similarity, the extension of the homology, the database of choice, the GO hierarchy, and the quality of the original annotations. The tool includes numerous functions for the visualization, management, and statistical analysis of annotation results, including gene set enrichment analysis. The application supports InterPro, enzyme codes, KEGG pathways, GO directed acyclic graphs (DAGs), and GOSlim. Blast2GO is a suitable tool for plant genomics research because of its versatility, easy installation, and user-friendliness.

%B Int J Plant Genomics %V 2008 %P 619832 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18483572 %0 Journal Article %J J Biomed Inform %D 2008 %T CLEAR-test: combining inference for differential expression and variability in microarray data analysis %A Valls, J. %A Grau, M. %A Sole, X. %A Hernandez, P. %A Montaner, D. %A Dopazo, J. %A Peinado, M. A. %A Capella, G. %A Moreno, V. %A Pujana, M. A. %K *Algorithms %K Artificial Intelligence %K *Data Interpretation, Statistical %K Gene Expression Profiling/*methods %K Gene Expression Regulation/*physiology %K Oligonucleotide Array Sequence Analysis/*methods %K Proteome/*metabolism %K Signal Transduction/*physiology %X

A common goal of microarray experiments is to detect genes that are differentially expressed under distinct experimental conditions. Several statistical tests have been proposed to determine whether the observed changes in gene expression are significant. The t-test assigns a score to each gene on the basis of changes in its expression relative to its estimated variability, in such a way that genes with a higher score (in absolute value) are more likely to be significant. Most variants of the t-test use the complete set of genes to influence the variance estimate for each single gene. However, no inference is made in terms of the variability itself. Here, we highlight the problem of low observed variances in the t-test, when genes with relatively small changes are declared differentially expressed. Alternatively, the z-test could be used although, unlike the t-test, it can declare differentially expressed genes with high observed variances. To overcome this, we propose to combine the z-test, which focuses on large changes, with a chi-squared test to evaluate variability. We call this procedure CLEAR-test and we provide a combined p-value that offers a compromise between both aspects. Analysis of three publicly available microarray datasets reveals the greater performance of the CLEAR-test relative to the t-test and alternative methods. Finally, empirical and simulated data analyses demonstrate the greater reproducibility and statistical power of the CLEAR-test and z-test with respect to current alternative methods. In addition, the CLEAR-test improves the z-test by capturing reproducible genes with high variability.

%B J Biomed Inform %V 41 %P 33-45 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17597009 %0 Book Section %B Methods in Molecular Biology %D 2008 %T Comparative genomics-based prediction of protein function %A Gabaldón, T. %B Methods in Molecular Biology %I M. Starkey and R. Elaswarapu, Humana press %V 439 %G eng %U http://www.springerprotocols.com/Abstract/doi/10.1007/978-1-59745-188-8_26 %0 Journal Article %J J Clin Endocrinol Metab %D 2008 %T Controlled ovarian stimulation induces a functional genomic delay of the endometrium with potential clinical implications %A Horcajadas, J. A. %A Minguez, P. %A Dopazo, J. %A Esteban, F. J. %A Dominguez, F. %A Giudice, L. C. %A Pellicer, A. %A Simon, C. %K Algorithms %K Chorionic Gonadotropin/genetics %K Endometrium/cytology/pathology/*physiology/physiopathology %K Female %K Gene Expression Regulation %K Genome, Human %K Glutathione Peroxidase/genetics %K Humans %K Insulin-Like Growth Factor Binding Proteins/genetics %K Luteal Phase/physiology %K Luteinizing Hormone/genetics %K Menstrual Cycle %K Oligonucleotide Array Sequence Analysis %K Ovulation Induction/*methods %K RNA/genetics/isola %X

CONTEXT: Controlled ovarian stimulation induces morphological, biochemical, and functional genomic modifications of the human endometrium during the window of implantation. OBJECTIVE: Our objective was to compare the gene expression profile of the human endometrium in natural vs. controlled ovarian stimulation cycles throughout the early-mid secretory transition using microarray technology. METHOD: Microarray data from 49 endometrial biopsies obtained from LH+1 to LH+9 (n=25) in natural cycles and from human chorionic gonadotropin (hCG) +1 to hCG+9 in controlled ovarian stimulation cycles (n=24) were analyzed using different methods, such as clustering, profiling of biological processes, and selection of differentially expressed genes, as implemented in Gene Expression Pattern Analysis Suite and Babelomics programs. RESULTS: Endometria from natural cycles followed different genomic patterns compared with controlled ovarian stimulation cycles in the transition from the pre-receptive (days LH/hCG+1 until LH/hCG+5) to the receptive phase (day LH+7/hCG+7). Specifically, we have demonstrated the existence of a 2-day delay in the activation/repression of two clusters composed of 218 and 133 genes, respectively, on day hCG+7 vs. LH+7. Many of these delayed genes belong to the class of window-of-implantation genes affecting basic biological processes in the receptive endometrium. CONCLUSIONS: These results demonstrate that gene expression profiling of the endometrium is different between natural and controlled ovarian stimulation cycles in the receptive phase. Identification of these differentially regulated genes can be used to understand the different developmental profiles of receptive endometrium during controlled ovarian stimulation and to search for the best controlled ovarian stimulation treatment in terms of minimal endometrial impact.

%B J Clin Endocrinol Metab %V 93 %P 4500-10 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18697870 %0 Book Section %B Protocells: Bridging nonliving and living matter %D 2008 %T The core of a minimal gene set: insights from natural reduced genomes %A Gabaldón, T. %A Gil, R. %A Peretó, J. %A Latorre, A. %A Moya, A. %B Protocells: Bridging nonliving and living matter %I The MIT Press %C USA %P 347-366 %G eng %0 Journal Article %J Genomics %D 2008 %T Direct functional assessment of the composite phenotype through multivariate projection strategies %A A. Conesa %A Bro, R. %A Garcia-Garcia, F. %A Prats, J. M. %A Gotz, S. %A Kjeldahl, K. %A Montaner, D. %A Dopazo, J. %K Breast Neoplasms/genetics %K Computational Biology/*methods %K Databases, Genetic %K Female %K Gene Expression Profiling/*statistics & numerical data %K Humans %K Mathematical Computing %K Multivariate Analysis %K Phenotype %X

We present a novel approach for the analysis of transcriptomics data that integrates functional annotation of gene sets with expression values in a multivariate fashion, and directly assesses the relation of functional features to a multivariate space of response phenotypical variables. Multivariate projection methods are used to obtain new correlated variables for a set of genes that share a given function. These new functional variables are then related to the response variables of interest. The analysis of the principal directions of the multivariate regression allows for the identification of gene function features correlated with the phenotype. Two different transcriptomics studies are used to illustrate the statistical and interpretative aspects of the methodology. We demonstrate the superiority of the proposed method over equivalent approaches.

%B Genomics %V 92 %P 373-83 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18652888 %0 Journal Article %J Genomics %D 2008 %T Direct functional assessment of the composite phenotype through multivariate projection strategies. %A Conesa, Ana %A Bro, Rasmus %A Garcia-Garcia, Francisco %A Prats, José Manuel %A Götz, Stefan %A Kjeldahl, Karin %A Montaner, David %A Dopazo, Joaquin %K Breast Neoplasms %K Computational Biology %K Databases, Genetic %K Female %K Gene Expression Profiling %K Humans %K Mathematical Computing %K Multivariate Analysis %K Phenotype %X

We present a novel approach for the analysis of transcriptomics data that integrates functional annotation of gene sets with expression values in a multivariate fashion, and directly assesses the relation of functional features to a multivariate space of response phenotypical variables. Multivariate projection methods are used to obtain new correlated variables for a set of genes that share a given function. These new functional variables are then related to the response variables of interest. The analysis of the principal directions of the multivariate regression allows for the identification of gene function features correlated with the phenotype. Two different transcriptomics studies are used to illustrate the statistical and interpretative aspects of the methodology. We demonstrate the superiority of the proposed method over equivalent approaches.

%B Genomics %V 92 %P 373-83 %8 2008 Dec %G eng %N 6 %1 https://www.ncbi.nlm.nih.gov/pubmed/18652888?dopt=Abstract %R 10.1016/j.ygeno.2008.05.015 %0 Journal Article %J Genome Biol %D 2008 %T Evolutionary potentials: structure specific knowledge-based potentials exploiting the evolutionary record of sequence homologs %A Panjkovich, A. %A Melo, F. %A M. A. Marti-Renom %X ABSTRACT: We introduce a new type of knowledge-based potentials for protein structure prediction, called ’evolutionary potentials’, which are derived using a single experimental protein structure and all three-dimensional models of its homologous sequences. The new potentials have been benchmarked against other knowledge-based potentials, resulting in a significant increase in accuracy for model assessment. In contrast to standard knowledge-based potentials, we propose that evolutionary potentials capture key determinants of thermodynamic stability and specific sequence constraints required for fast folding. %B Genome Biol %V 9 %P R68 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18397517 %0 Journal Article %J Methods Mol Biol %D 2008 %T Expression and microarrays. %A Dopazo, Joaquin %A Al-Shahrour, Fátima %K Animals %K Computational Biology %K gene expression %K Gene Expression Profiling %K Humans %K Oligonucleotide Array Sequence Analysis %X

High throughput methodologies have increased by several orders of magnitude the amount of experimental microarray data available. Nevertheless, translating these data into useful biological knowledge remains a challenge. There is a risk of perceiving these methodologies as mere factories that produce never-ending quantities of data if a proper biological interpretation is not provided. Methods of interpreting these data are continuously evolving. Typically, a simple two-step approach has been used, in which genes of interest are first selected based on thresholds for the experimental values, and then enrichment in biologically relevant terms in the annotations of these genes is analyzed in a second step. For various reasons, such methods are quite poor in terms of performance and new procedures inspired by systems biology that directly address sets of functionally related genes are currently under development.

%B Methods Mol Biol %V 453 %P 245-55 %8 2008 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/18712307?dopt=Abstract %R 10.1007/978-1-60327-429-6_12 %0 Journal Article %J Nucleic Acids Res %D 2008 %T GEPAS, a web-based tool for microarray data analysis and interpretation %A Tarraga, J. %A Medina, Ignacio %A Carbonell, J. %A Huerta-Cepas, J. %A Minguez, P. %A Alloza, E. %A Fatima Al-Shahrour %A Vegas-Azcarate, S. %A Goetz, S. %A Escobar, P. %A Garcia-Garcia, F. %A A. Conesa %A Montaner, D. %A Dopazo, J. %K gepas %K microarray data analysis %X

Gene Expression Profile Analysis Suite (GEPAS) is one of the most complete and extensively used web-based packages for microarray data analysis. During its more than 5 years of activity it has continuously been updated to keep pace with the state-of-the-art in the changing microarray data analysis arena. GEPAS offers diverse analysis options that include well established as well as novel algorithms for normalization, gene selection, class prediction, clustering and functional profiling of the experiment. New options for time-course (or dose-response) experiments, microarray-based class prediction, new clustering methods and new tests for differential expression have been included. The new pipeliner module allows automating the execution of sequential analysis steps by means of a simple but powerful graphic interface. An extensive re-engineering of GEPAS has been carried out which includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. GEPAS is nowadays the most quoted web tool in its field and it is extensively used by researchers of many countries and its records indicate an average usage rate of 500 experiments per day. GEPAS is available at http://www.gepas.org.

%B Nucleic Acids Res %V 36 %P W308-14 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18508806 %0 Journal Article %J Nucleic Acids Res %D 2008 %T GEPAS, a web-based tool for microarray data analysis and interpretation. %A Tárraga, Joaquín %A Medina, Ignacio %A Carbonell, José %A Huerta-Cepas, Jaime %A Minguez, Pablo %A Alloza, Eva %A Al-Shahrour, Fátima %A Vegas-Azcárate, Susana %A Goetz, Stefan %A Escobar, Pablo %A Garcia-Garcia, Francisco %A Conesa, Ana %A Montaner, David %A Dopazo, Joaquin %K Computer Graphics %K Dose-Response Relationship, Drug %K Gene Expression Profiling %K Internet %K Kinetics %K Oligonucleotide Array Sequence Analysis %K Software %X

Gene Expression Profile Analysis Suite (GEPAS) is one of the most complete and extensively used web-based packages for microarray data analysis. During its more than 5 years of activity it has continuously been updated to keep pace with the state-of-the-art in the changing microarray data analysis arena. GEPAS offers diverse analysis options that include well established as well as novel algorithms for normalization, gene selection, class prediction, clustering and functional profiling of the experiment. New options for time-course (or dose-response) experiments, microarray-based class prediction, new clustering methods and new tests for differential expression have been included. The new pipeliner module allows automating the execution of sequential analysis steps by means of a simple but powerful graphic interface. An extensive re-engineering of GEPAS has been carried out which includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. GEPAS is nowadays the most quoted web tool in its field and it is extensively used by researchers of many countries and its records indicate an average usage rate of 500 experiments per day. GEPAS is available at http://www.gepas.org.

%B Nucleic Acids Res %V 36 %P W308-14 %8 2008 Jul 01 %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/18508806?dopt=Abstract %R 10.1093/nar/gkn303 %0 Journal Article %J Nucleic Acids Res %D 2008 %T High-throughput functional annotation and data mining with the Blast2GO suite. %A Götz, Stefan %A García-Gómez, Juan Miguel %A Terol, Javier %A Williams, Tim D %A Nagaraj, Shivashankar H %A Nueda, Maria José %A Robles, Montserrat %A Talon, Manuel %A Dopazo, Joaquin %A Conesa, Ana %K Animals %K Computational Biology %K Computer Graphics %K Databases, Genetic %K Expressed Sequence Tags %K Genes %K Genomics %K Sequence Analysis, DNA %K Sequence Analysis, Protein %K Software %K Vocabulary, Controlled %X

Functional genomics technologies have been widely adopted in the biological research of both model and non-model species. An efficient functional annotation of DNA or protein sequences is a major requirement for the successful application of these approaches as functional information on gene products is often the key to the interpretation of experimental results. Therefore, there is an increasing need for bioinformatics resources which are able to cope with large amount of sequence data, produce valuable annotation results and are easily accessible to laboratories where functional genomics projects are being undertaken. We present the Blast2GO suite as an integrated and biologist-oriented solution for the high-throughput and automatic functional annotation of DNA or protein sequences based on the Gene Ontology vocabulary. The most outstanding Blast2GO features are: (i) the combination of various annotation strategies and tools controlling type and intensity of annotation, (ii) the numerous graphical features such as the interactive GO-graph visualization for gene-set function profiling or descriptive charts, (iii) the general sequence management features and (iv) high-throughput capabilities. We used the Blast2GO framework to carry out a detailed analysis of annotation behaviour through homology transfer and its impact in functional genomics research. Our aim is to offer biologists useful information to take into account when addressing the task of functionally characterizing their sequence data.

%B Nucleic Acids Res %V 36 %P 3420-35 %8 2008 Jun %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/18445632?dopt=Abstract %R 10.1093/nar/gkn176 %0 Journal Article %J Brief Bioinform %D 2008 %T Interoperability with Moby 1.0--it's better than sharing your toothbrush! %A Wilkinson, Mark D %A Senger, Martin %A Kawas, Edward %A Bruskiewich, Richard %A Gouzy, Jerome %A Noirot, Celine %A Bardou, Philippe %A Ng, Ambrose %A Haase, Dirk %A Saiz, Enrique de Andres %A Wang, Dennis %A Gibbons, Frank %A Gordon, Paul M K %A Sensen, Christoph W %A Carrasco, Jose Manuel Rodriguez %A Fernández, José M %A Shen, Lixin %A Links, Matthew %A Ng, Michael %A Opushneva, Nina %A Neerincx, Pieter B T %A Leunissen, Jack A M %A Ernst, Rebecca %A Twigger, Simon %A Usadel, Bjorn %A Good, Benjamin %A Wong, Yan %A Stein, Lincoln %A Crosby, William %A Karlsson, Johan %A Royo, Romina %A Párraga, Iván %A Ramírez, Sergio %A Gelpi, Josep Lluis %A Trelles, Oswaldo %A Pisano, David G %A Jimenez, Natalia %A Kerhornou, Arnaud %A Rosset, Roman %A Zamacola, Leire %A Tárraga, Joaquín %A Huerta-Cepas, Jaime %A Carazo, Jose María %A Dopazo, Joaquin %A Guigó, Roderic %A Navarro, Arcadi %A Orozco, Modesto %A Valencia, Alfonso %A Claros, M Gonzalo %A Pérez, Antonio J %A Aldana, Jose %A Rojano, M Mar %A Fernandez-Santa Cruz, Raul %A Navas, Ismael %A Schiltz, Gary %A Farmer, Andrew %A Gessler, Damian %A Schoof, Heiko %A Groscurth, Andreas %K Computational Biology %K Database Management Systems %K Databases, Factual %K Information Storage and Retrieval %K Internet %K Programming Languages %K Systems Integration %X

The BioMoby project was initiated in 2001 from within the model organism database community. It aimed to standardize methodologies to facilitate information exchange and access to analytical resources, using a consensus driven approach. Six years later, the BioMoby development community is pleased to announce the release of the 1.0 version of the interoperability framework, registry Application Programming Interface and supporting Perl and Java code-bases. Together, these provide interoperable access to over 1400 bioinformatics resources worldwide through the BioMoby platform, and this number continues to grow. Here we highlight and discuss the features of BioMoby that make it distinct from other Semantic Web Service and interoperability initiatives, and that have been instrumental to its deployment and use by a wide community of bioinformatics service providers. The standard, client software, and supporting code libraries are all freely available at http://www.biomoby.org/.

%B Brief Bioinform %V 9 %P 220-31 %8 2008 May %G eng %N 3 %1 https://www.ncbi.nlm.nih.gov/pubmed/18238804?dopt=Abstract %R 10.1093/bib/bbn003 %0 Journal Article %J Brief Bioinform %D 2008 %T Interoperability with Moby 1.0–it’s better than sharing your toothbrush! %A Wilkinson, M. D. %A Senger, M. %A Kawas, E. %A Bruskiewich, R. %A Gouzy, J. %A Noirot, C. %A Bardou, P. %A Ng, A. %A Haase, D. %A Saiz Ede, A. %A Wang, D. %A Gibbons, F. %A Gordon, P. M. %A Sensen, C. W. %A Carrasco, J. M. %A Fernandez, J. M. %A Shen, L. %A Links, M. %A Ng, M. %A Opushneva, N. %A Neerincx, P. B. %A Leunissen, J. A. %A Ernst, R. %A Twigger, S. %A Usadel, B. %A Good, B. %A Wong, Y. %A Stein, L. %A Crosby, W. %A Karlsson, J. %A Royo, R. %A Parraga, I. %A Ramirez, S. %A Gelpi, J. L. %A Trelles, O. %A Pisano, D. G. %A Jimenez, N. %A Kerhornou, A. %A Rosset, R. %A Zamacola, L. %A Tarraga, J. %A Huerta-Cepas, J. %A Carazo, J. M. %A Dopazo, J. %A R. Guigo %A Navarro, A. %A Orozco, M. %A Valencia, A. %A Claros, M. G. %A Perez, A. J. %A Aldana, J. %A Rojano, M. M. %A Fernandez-Santa Cruz, R. %A Navas, I. %A Schiltz, G. %A Farmer, A. %A Gessler, D. %A Schoof, H. %A Groscurth, A. %K Computational Biology/*methods *Database Management Systems *Databases %K Factual Information Storage and Retrieval/*methods *Internet *Programming Languages Systems Integration %X

The BioMoby project was initiated in 2001 from within the model organism database community. It aimed to standardize methodologies to facilitate information exchange and access to analytical resources, using a consensus driven approach. Six years later, the BioMoby development community is pleased to announce the release of the 1.0 version of the interoperability framework, registry Application Programming Interface and supporting Perl and Java code-bases. Together, these provide interoperable access to over 1400 bioinformatics resources worldwide through the BioMoby platform, and this number continues to grow. Here we highlight and discuss the features of BioMoby that make it distinct from other Semantic Web Service and interoperability initiatives, and that have been instrumental to its deployment and use by a wide community of bioinformatics service providers. The standard, client software, and supporting code libraries are all freely available at http://www.biomoby.org/.

%B Brief Bioinform %V 9 %P 220-31 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18238804 %0 Journal Article %J Nucleic Acids Res %D 2008 %T Joint annotation of coding and non-coding single nucleotide polymorphisms and mutations in the SNPeffect and PupaSuite databases %A Reumers, J. %A L. Conde %A Medina, Ignacio %A Maurer-Stroh, S. %A Van Durme, J. %A Dopazo, J. %A Rousseau, F. %A Schymkowitz, J. %K Amino Acid Substitution Animals *Databases %K Genetic Genetic Diseases %K Inborn/genetics HSP70 Heat-Shock Proteins/metabolism Humans Internet Mice MicroRNAs/metabolism *Mutation *Polymorphism %K Single Nucleotide Proteins/chemistry/genetics RNA Splice Sites Rats Transcription Factors/metabolism %X

Single nucleotide polymorphisms (SNPs) are, together with copy number variation, the primary source of variation in the human genome. SNPs are associated with altered response to drug treatment, susceptibility to disease and other phenotypic variation. Furthermore, during genetic screens for disease-associated mutations in groups of patients and control individuals, the distinction between disease causing mutation and polymorphism is often unclear. Annotation of the functional and structural implications of single nucleotide changes thus provides valuable information to interpret and guide experiments. The SNPeffect and PupaSuite databases are now synchronized to deliver annotations for both non-coding and coding SNP, as well as annotations for the SwissProt set of human disease mutations. In addition, SNPeffect now contains predictions of Tango2: an improved aggregation detector, and Waltz: a novel predictor of amyloid-forming sequences, as well as improved predictors for regions that are recognized by the Hsp70 family of chaperones. The new PupaSuite version incorporates predictions for SNPs in silencers and miRNAs including their targets, as well as additional methods for predicting SNPs in TFBSs and splice sites. Also predictions for mouse and rat genomes have been added. In addition, a PupaSuite web service has been developed to enable data access, programmatically. The combined database holds annotations for 4,965,073 regulatory as well as 133,505 coding human SNPs and 14,935 disease mutations, and phenotypic descriptions of 43,797 human proteins and is accessible via http://snpeffect.vib.be and http://pupasuite.bioinfo.cipf.es/.

%B Nucleic Acids Res %V 36 %P D825-9 %G eng %U http://nar.oxfordjournals.org/cgi/content/full/36/suppl_1/D825 %0 Journal Article %J Nucleic Acids Res %D 2008 %T Joint annotation of coding and non-coding single nucleotide polymorphisms and mutations in the SNPeffect and PupaSuite databases. %A Reumers, Joke %A Conde, Lucia %A Medina, Ignacio %A Maurer-Stroh, Sebastian %A Van Durme, Joost %A Dopazo, Joaquin %A Rousseau, Frederic %A Schymkowitz, Joost %K Amino Acid Substitution %K Animals %K Databases, Genetic %K Genetic Diseases, Inborn %K HSP70 Heat-Shock Proteins %K Humans %K Internet %K Mice %K MicroRNAs %K mutation %K Polymorphism, Single Nucleotide %K Proteins %K Rats %K RNA Splice Sites %K Transcription Factors %X

Single nucleotide polymorphisms (SNPs) are, together with copy number variation, the primary source of variation in the human genome. SNPs are associated with altered response to drug treatment, susceptibility to disease and other phenotypic variation. Furthermore, during genetic screens for disease-associated mutations in groups of patients and control individuals, the distinction between disease causing mutation and polymorphism is often unclear. Annotation of the functional and structural implications of single nucleotide changes thus provides valuable information to interpret and guide experiments. The SNPeffect and PupaSuite databases are now synchronized to deliver annotations for both non-coding and coding SNP, as well as annotations for the SwissProt set of human disease mutations. In addition, SNPeffect now contains predictions of Tango2: an improved aggregation detector, and Waltz: a novel predictor of amyloid-forming sequences, as well as improved predictors for regions that are recognized by the Hsp70 family of chaperones. The new PupaSuite version incorporates predictions for SNPs in silencers and miRNAs including their targets, as well as additional methods for predicting SNPs in TFBSs and splice sites. Also predictions for mouse and rat genomes have been added. In addition, a PupaSuite web service has been developed to enable data access, programmatically. The combined database holds annotations for 4,965,073 regulatory as well as 133,505 coding human SNPs and 14,935 disease mutations, and phenotypic descriptions of 43,797 human proteins and is accessible via http://snpeffect.vib.be and http://pupasuite.bioinfo.cipf.es/.

%B Nucleic Acids Res %V 36 %P D825-9 %8 2008 Jan %G eng %N Database issue %1 https://www.ncbi.nlm.nih.gov/pubmed/18086700?dopt=Abstract %R 10.1093/nar/gkm979 %0 Journal Article %J BMC Genomics %D 2008 %T Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology %A Botton, A. %A Galla, G. %A A. Conesa %A Bachem, C. %A Ramina, A. %A Barcaccia, G. %X

BACKGROUND: After 10-year-use of AFLP (Amplified Fragment Length Polymorphism) technology for DNA fingerprinting and mRNA profiling, large repertories of genome- and transcriptome-derived sequences are available in public databases for model, crop and tree species. AFLP marker systems have been and are being extensively exploited for genome scanning and gene mapping, as well as cDNA-AFLP for transcriptome profiling and differentially expressed gene cloning. The evaluation, annotation and classification of genomic markers and expressed transcripts would be of great utility for both functional genomics and systems biology research in plants. This may be achieved by means of the Gene Ontology (GO), consisting in three structured vocabularies (i.e. ontologies) describing genes, transcripts and proteins of any organism in terms of their associated cellular component, biological process and molecular function in a species-independent manner. In this paper, the functional annotation of about 8,000 AFLP-derived ESTs retrieved in the NCBI databases was carried out by using GO terminology. RESULTS: Descriptive statistics on the type, size and nature of gene sequences obtained by means of AFLP technology were calculated. The gene products associated with mRNA transcripts were then classified according to the three main GO vocabularies. A comparison of the functional content of cDNA-AFLP records was also performed by splitting the sequence dataset into monocots and dicots and by comparing them to all annotated ESTs of Arabidopsis and rice, respectively. On the whole, the statistical parameters adopted for the in silico AFLP-derived transcriptome-anchored sequence analysis proved to be critical for obtaining reliable GO results. Such an exhaustive annotation may offer a suitable platform for functional genomics, particularly useful in non-model species. 
CONCLUSION: Reliable GO annotations of AFLP-derived sequences can be gathered through the optimization of the experimental steps and the statistical parameters adopted. The Blast2GO software was shown to represent a comprehensive bioinformatics solution for an annotation-based functional analysis. According to the whole set of GO annotations, the AFLP technology generates thorough information for angiosperm gene products and shares common features across angiosperm species and families. The utility of this technology for structural and functional genomics in plants can be implemented by serial annotation analyses of genome-anchored fragments and organ/tissue-specific repertories of transcriptome-derived fragments.

%B BMC Genomics %V 9 %P 347 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18652646 %0 Journal Article %J Oncogene %D 2008 %T Molecular profiling related to poor prognosis in thyroid carcinoma. Combining gene expression data and biological information %A Montero-Conde, C. %A Martin-Campos, J. M. %A Lerma, E. %A Gimenez, G. %A Martinez-Guitarte, J. L. %A Combalia, N. %A Montaner, D. %A Matias-Guiu, X. %A Dopazo, J. %A de Leiva, A. %A M. Robledo %A Mauricio, D. %K Adenoma/genetics/metabolism/pathology Adolescent Adult Aged Carcinoma/genetics/metabolism/pathology Carcinoma %K Biological/*genetics/metabolism %K Neoplasm/genetics/metabolism Reverse Transcriptase Polymerase Chain Reaction Signal Transduction Thyroid Neoplasms/classification/*genetics/metabolism Tumor Markers %K Neoplastic Humans Male Middle Aged *Oligonucleotide Array Sequence Analysis Prognosis RNA %K Papillary/genetics/metabolism/pathology Cell Differentiation Female *Gene Expression Profiling *Gene Expression Regulation %X

Undifferentiated and poorly differentiated thyroid tumors are responsible for more than half of thyroid cancer patient deaths in spite of their low incidence. Conventional treatments do not obtain substantial benefits, and the lack of alternative approaches limits patient survival. Additionally, the absence of prognostic markers for well-differentiated tumors complicates patient-specific treatments and favors the progression of recurrent forms. In order to recognize the molecular basis involved in tumor dedifferentiation and identify potential markers for thyroid cancer prognosis prediction, we analysed the expression profile of 44 thyroid primary tumors with different degrees of dedifferentiation and aggressiveness using cDNA microarrays. Transcriptome comparison of dedifferentiated and well-differentiated thyroid tumors identified 1031 genes with >2-fold difference in absolute values and false discovery rate of <0.15. According to known molecular interaction and reaction networks, the products of these genes were mainly clustered in the MAPkinase signaling pathway, the TGF-beta signaling pathway, focal adhesion and cell motility, activation of actin polymerization and cell cycle. An exhaustive search in several databases allowed us to identify various members of the matrix metalloproteinase, melanoma antigen A and collagen gene families within the upregulated gene set. We also identified a prognosis classifier comprising just 30 transcripts with an overall accuracy of 95%. These findings may clarify the molecular mechanisms involved in thyroid tumor dedifferentiation and provide a potential prognosis predictor as well as targets for new therapies.

%B Oncogene %V 27 %P 1554-61 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17873908 %0 Journal Article %J Oncogene %D 2008 %T Molecular profiling related to poor prognosis in thyroid carcinoma. Combining gene expression data and biological information. %A Montero-Conde, C %A Martín-Campos, J M %A Lerma, E %A Gimenez, G %A Martínez-Guitarte, J L %A Combalía, N %A Montaner, D %A Matías-Guiu, X %A Dopazo, J %A de Leiva, A %A Robledo, M %A Mauricio, D %K Adenoma %K Adolescent %K Adult %K Aged %K Biomarkers, Tumor %K Carcinoma %K Carcinoma, Papillary %K Cell Differentiation %K Female %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Humans %K Male %K Middle Aged %K Oligonucleotide Array Sequence Analysis %K Prognosis %K Reverse Transcriptase Polymerase Chain Reaction %K RNA, Neoplasm %K Signal Transduction %K Thyroid Neoplasms %X

Undifferentiated and poorly differentiated thyroid tumors are responsible for more than half of thyroid cancer patient deaths in spite of their low incidence. Conventional treatments do not obtain substantial benefits, and the lack of alternative approaches limits patient survival. Additionally, the absence of prognostic markers for well-differentiated tumors complicates patient-specific treatments and favors the progression of recurrent forms. In order to recognize the molecular basis involved in tumor dedifferentiation and identify potential markers for thyroid cancer prognosis prediction, we analysed the expression profile of 44 thyroid primary tumors with different degrees of dedifferentiation and aggressiveness using cDNA microarrays. Transcriptome comparison of dedifferentiated and well-differentiated thyroid tumors identified 1031 genes with >2-fold difference in absolute values and false discovery rate of <0.15. According to known molecular interaction and reaction networks, the products of these genes were mainly clustered in the MAPkinase signaling pathway, the TGF-beta signaling pathway, focal adhesion and cell motility, activation of actin polymerization and cell cycle. An exhaustive search in several databases allowed us to identify various members of the matrix metalloproteinase, melanoma antigen A and collagen gene families within the upregulated gene set. We also identified a prognosis classifier comprising just 30 transcripts with an overall accuracy of 95%. These findings may clarify the molecular mechanisms involved in thyroid tumor dedifferentiation and provide a potential prognosis predictor as well as targets for new therapies.

%B Oncogene %V 27 %P 1554-61 %8 2008 Mar 06 %G eng %N 11 %1 https://www.ncbi.nlm.nih.gov/pubmed/17873908?dopt=Abstract %R 10.1038/sj.onc.1210792 %0 Journal Article %J Nucleic Acids Res %D 2008 %T PhylomeDB: a database for genome-wide collections of gene phylogenies. %A Huerta-Cepas, Jaime %A Bueno, Anibal %A Dopazo, Joaquin %A Gabaldón, Toni %K Base Sequence %K Escherichia coli %K Genes %K Genomics %K History, Ancient %K Humans %K Phylogeny %K Proteins %K Saccharomyces cerevisiae %K Sequence Alignment %X

The complete collection of evolutionary histories of all genes in a genome, also known as phylome, constitutes a valuable source of information. The reconstruction of phylomes has been previously prevented by large demands of time and computer power, but is now feasible thanks to recent developments in computers and algorithms. To provide a publicly available repository of complete phylomes that allows researchers to access and store large-scale phylogenomic analyses, we have developed PhylomeDB. PhylomeDB is a database of complete phylomes derived for different genomes within a specific taxonomic range. All phylomes in the database are built using a high-quality phylogenetic pipeline that includes evolutionary model testing and alignment trimming phases. For each genome, PhylomeDB provides the alignments, phylogenetic trees and tree-based orthology predictions for every single encoded protein. The current version of PhylomeDB includes the phylomes of Human, the yeast Saccharomyces cerevisiae and the bacterium Escherichia coli, comprising a total of 32 289 seed sequences with their corresponding alignments and 172 324 phylogenetic trees. PhylomeDB can be publicly accessed at http://phylomedb.bioinfo.cipf.es.

%B Nucleic Acids Res %V 36 %P D491-6 %8 2008 Jan %G eng %N Database issue %1 https://www.ncbi.nlm.nih.gov/pubmed/17962297?dopt=Abstract %R 10.1093/nar/gkm899 %0 Journal Article %J Nucleic Acids Res %D 2008 %T PhylomeDB: a database for genome-wide collections of gene phylogenies %A Huerta-Cepas, J. %A Bueno, A. %A Dopazo, J. %A Gabaldón, T. %K Ancient Humans *Phylogeny Proteins/classification/genetics Saccharomyces cerevisiae/classification/genetics Sequence Alignment %K Base Sequence Escherichia coli/classification/genetics Genes *Genomics History %X The complete collection of evolutionary histories of all genes in a genome, also known as phylome, constitutes a valuable source of information. The reconstruction of phylomes has been previously prevented by large demands of time and computer power, but is now feasible thanks to recent developments in computers and algorithms. To provide a publicly available repository of complete phylomes that allows researchers to access and store large-scale phylogenomic analyses, we have developed PhylomeDB. PhylomeDB is a database of complete phylomes derived for different genomes within a specific taxonomic range. All phylomes in the database are built using a high-quality phylogenetic pipeline that includes evolutionary model testing and alignment trimming phases. For each genome, PhylomeDB provides the alignments, phylogenetic trees and tree-based orthology predictions for every single encoded protein. The current version of PhylomeDB includes the phylomes of Human, the yeast Saccharomyces cerevisiae and the bacterium Escherichia coli, comprising a total of 32 289 seed sequences with their corresponding alignments and 172 324 phylogenetic trees. PhylomeDB can be publicly accessed at http://phylomedb.bioinfo.cipf.es.
%B Nucleic Acids Res %V 36 %P D491-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17962297 %0 Journal Article %J BMC Bioinformatics %D 2008 %T Prediction of enzyme function by combining sequence similarity and protein interactions %A Espadaler, J. %A Eswar, N. %A Querol, E. %A Aviles, F. X. %A Sali, A. %A M. A. Marti-Renom %A Oliva, B. %K Amino Acid *Software Structure-Activity Relationship Substrate Specificity/genetics %K Amino Acid Sequence/physiology Databases %K Automated Predictive Value of Tests Protein Interaction Mapping Proteins/analysis/metabolism Sequence Alignment Sequence Analysis %K Protein *Sequence Homology %K Protein Enzymes/analysis/*metabolism Fuzzy Logic Pattern Recognition %X BACKGROUND: A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners. RESULTS: The method has been tested against the PSI-BLAST program using a set of 3,890 protein sequences from which interaction data was available. For protein sequences that align with at least 40% sequence identity to a known enzyme, the specificity of our method in predicting the first three EC digits increased from 80% to 90% at 80% coverage when compared to PSI-BLAST. CONCLUSION: Our method can also be used in proteins for which homologous sequences with known interacting partners can be detected. Thus, our method could increase 10% the specificity of genome-wide enzyme predictions based on sequence matching by PSI-BLAST alone. %B BMC Bioinformatics %V 9 %P 249 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18505562 %0 Journal Article %J Bioinformatics %D 2008 %T RNA structure alignment by a unit-vector approach %A E. Capriotti %A M. A. 
Marti-Renom %K Algorithms Base Sequence Computer Simulation *Models %K Chemical *Models %K Molecular Molecular Sequence Data Nucleic Acid Conformation RNA/*chemistry/*ultrastructure Sequence Alignment/*methods Sequence Analysis %K RNA/*methods *Software %X MOTIVATION: The recent discovery of tiny RNA molecules such as microRNAs and small interfering RNA is transforming the view of RNA as a simple information transfer molecule. Similar to proteins, the native three-dimensional structure of RNA determines its biological activity. Therefore, classifying the current structural space is paramount for functionally annotating RNA molecules. The increasing numbers of RNA structures deposited in the PDB require more accurate, automatic and benchmarked methods for RNA structure comparison. In this article, we introduce a new algorithm for RNA structure alignment based on a unit-vector approach. The algorithm has been implemented in the SARA program, which results in RNA structure pairwise alignments and their statistical significance. RESULTS: The SARA program has been implemented to be of general applicability even when no secondary structure can be calculated from the RNA structures. A benchmark against the ARTS program using a set of 1275 non-redundant pairwise structure alignments results in approximately 6% extra alignments with at least 50% structurally superposed nucleotides and base pairs. A first attempt to perform RNA automatic functional annotation based on structure alignments indicates that SARA can correctly assign the deepest SCOR classification to >60% of the query structures. AVAILABILITY: The SARA program is freely available through a World Wide Web server http://sgu.bioinfo.cipf.es/services/SARA/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. 
%B Bioinformatics %V 24 %P i112-8 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18689811 %0 Book Section %B Encyclopedia of Life Science %D 2008 %T Selective Constraints and Human Disease Genes: Evolutionary and Bioinformatic Approaches %A H. Dopazo %B Encyclopedia of Life Science %I John Wiley & Sons, Ltd. %C UK %G eng %R 10.1002/9780470015902.a0020762 %0 Book Section %B Handbook of Human Molecular Evolution %D 2008 %T Selective Constraints on Human Disease Mutations and Polymorphisms %A H. Dopazo %B Handbook of Human Molecular Evolution %I Hildegard Kehrer-Sawatzki & David N. Cooper. John Wiley & Sons, Ltd %C UK %G eng %U http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0470517468,descCd-description.html %0 Journal Article %J Nat Genet %D 2008 %T SNP and haplotype mapping for genetic analysis in the rat. %A Saar, Kathrin %A Beck, Alfred %A Bihoreau, Marie-Thérèse %A Birney, Ewan %A Brocklebank, Denise %A Chen, Yuan %A Cuppen, Edwin %A Demonchy, Stephanie %A Dopazo, Joaquin %A Flicek, Paul %A Foglio, Mario %A Fujiyama, Asao %A Gut, Ivo G %A Gauguier, Dominique %A Guigó, Roderic %A Guryev, Victor %A Heinig, Matthias %A Hummel, Oliver %A Jahn, Niels %A Klages, Sven %A Kren, Vladimir %A Kube, Michael %A Kuhl, Heiner %A Kuramoto, Takashi %A Kuroki, Yoko %A Lechner, Doris %A Lee, Young-Ae %A Lopez-Bigas, Nuria %A Lathrop, G Mark %A Mashimo, Tomoji %A Medina, Ignacio %A Mott, Richard %A Patone, Giannino %A Perrier-Cornet, Jeanne-Antide %A Platzer, Matthias %A Pravenec, Michal %A Reinhardt, Richard %A Sakaki, Yoshiyuki %A Schilhabel, Markus %A Schulz, Herbert %A Serikawa, Tadao %A Shikhagaie, Medya %A Tatsumoto, Shouji %A Taudien, Stefan %A Toyoda, Atsushi %A Voigt, Birger %A Zelenika, Diana %A Zimdahl, Heike %A Hubner, Norbert %K Animals %K Chromosome Mapping %K Databases, Genetic %K Genome %K Haplotypes %K Linkage Disequilibrium %K Phylogeny %K Polymorphism, Single Nucleotide %K Quantitative Trait Loci %K Rats 
%K Rats, Inbred Strains %K Recombination, Genetic %X

The laboratory rat is one of the most extensively studied model organisms. Inbred laboratory rat strains originated from limited Rattus norvegicus founder populations, and the inherited genetic variation provides an excellent resource for the correlation of genotype to phenotype. Here, we report a survey of genetic variation based on almost 3 million newly identified SNPs. We obtained accurate and complete genotypes for a subset of 20,238 SNPs across 167 distinct inbred rat strains, two rat recombinant inbred panels and an F2 intercross. Using 81% of these SNPs, we constructed high-density genetic maps, creating a large dataset of fully characterized SNPs for disease gene mapping. Our data characterize the population structure and illustrate the degree of linkage disequilibrium. We provide a detailed SNP map and demonstrate its utility for mapping of quantitative trait loci. This community resource is openly available and augments the genetic tools for this workhorse of physiological studies.

%B Nat Genet %V 40 %P 560-6 %8 2008 May %G eng %N 5 %1 https://www.ncbi.nlm.nih.gov/pubmed/18443594?dopt=Abstract %R 10.1038/ng.124 %0 Journal Article %J Nat Genet %D 2008 %T SNP and haplotype mapping for genetic analysis in the rat %A K. Saar %A A. Beck %A M. T. Bihoreau %A E. Birney %A D. Brocklebank %A Y. Chen %A E. Cuppen %A S. Demonchy %A Dopazo, J. %A P. Flicek %A M. Foglio %A A. Fujiyama %A I. G. Gut %A D. Gauguier %A R. Guigo %A V. Guryev %A M. Heinig %A O. Hummel %A N. Jahn %A S. Klages %A V. Kren %A M. Kube %A H. Kuhl %A Kuramoto, T. %A Kuroki, Y. %A Lechner, D. %A Lee, Y. A. %A Lopez-Bigas, N. %A Lathrop, G. M. %A Mashimo, T. %A Medina, Ignacio %A Mott, R. %A Patone, G. %A Perrier-Cornet, J. A. %A Platzer, M. %A Pravenec, M. %A Reinhardt, R. %A Sakaki, Y. %A Schilhabel, M. %A Schulz, H. %A Serikawa, T. %A Shikhagaie, M. %A Tatsumoto, S. %A Taudien, S. %A Toyoda, A. %A Voigt, B. %A Zelenika, D. %A Zimdahl, H. %A Hubner, N. %K Animals Chromosome Mapping *Databases %K Genetic %K Genetic Genome *Haplotypes Linkage Disequilibrium Phylogeny *Polymorphism %K Inbred Strains/*genetics Recombination %K Single Nucleotide *Quantitative Trait Loci Rats/*genetics Rats %X

The laboratory rat is one of the most extensively studied model organisms. Inbred laboratory rat strains originated from limited Rattus norvegicus founder populations, and the inherited genetic variation provides an excellent resource for the correlation of genotype to phenotype. Here, we report a survey of genetic variation based on almost 3 million newly identified SNPs. We obtained accurate and complete genotypes for a subset of 20,238 SNPs across 167 distinct inbred rat strains, two rat recombinant inbred panels and an F2 intercross. Using 81% of these SNPs, we constructed high-density genetic maps, creating a large dataset of fully characterized SNPs for disease gene mapping. Our data characterize the population structure and illustrate the degree of linkage disequilibrium. We provide a detailed SNP map and demonstrate its utility for mapping of quantitative trait loci. This community resource is openly available and augments the genetic tools for this workhorse of physiological studies.

%B Nat Genet %V 40 %P 560-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18443594 %0 Journal Article %J Mol Vis %D 2008 %T Time course profiling of the retinal transcriptome after optic nerve transection and optic nerve crush %A Agudo, M. %A Perez-Marin, M. C. %A Lonngren, U. %A Sobrado, P. %A A. Conesa %A Canovas, I. %A Salinas-Navarro, M. %A Miralles-Imperial, J. %A Hallbook, F. %A Vidal-Sanz, M. %K Animals Cell Death Cluster Analysis Female *Gene Expression Profiling Gene Expression Regulation *Nerve Crush Optic Nerve/*metabolism/*pathology Optic Nerve Injuries/*genetics Rats Rats %K Sprague-Dawley Reproducibility of Results Retina/*metabolism/*pathology Time Factors %X PURPOSE: A time-course analysis of gene regulation in the adult rat retina after intraorbital nerve crush (IONC) and intraorbital nerve transection (IONT). METHODS: RNA was extracted from adult rat retinas undergoing either IONT or IONC at increasing times post-lesion. Affymetrix RAE230.2 arrays were hybridized and analyzed. Statistically regulated genes were annotated and functionally clustered. Arrays were validated by means of quantitative reverse transcription polymerase chain reaction (qRT-PCR) on ten regulated genes at two times post-lesion. Western blotting and immunohistofluorescence for four pro-apoptotic proteins were performed on naive and injured retinas. Finally, custom signaling maps for IONT- and IONC-induced death response were generated (MetaCore, Genego Inc.). RESULTS: Here we show that over time, 3,219 sequences were regulated after IONT and 1,996 after IONC. Out of the total of regulated sequences, 1,078 were commonly regulated by both injuries. Interestingly, while IONT mainly triggers a gene upregulation sustained over time, IONC causes a transitory downregulation. 
Functional clustering identified the regulation of high interest biologic processes, most importantly cell death wherein apoptosis was the most significant cluster. Ten death-related genes upregulated by both injuries were used for array validation by means of qRT-PCR. In addition, western blotting and immunohistofluorescence of total and active Caspase 3 (Casp3), tumor necrosis factor receptor type 1 associated death domain (TRADD), tumor necrosis factor receptor superfamily member 1a (TNFR1a), and c-fos were performed to confirm their protein regulation and expression pattern in naive and injured retinas. These analyses demonstrated that for these genes, protein regulation followed transcriptional regulation and that these pro-apoptotic proteins were expressed by retinal ganglion cells (RGCs). MetaCore-based death-signaling maps show that several apoptotic cascades were regulated in the retina following optic nerve injury and highlight the similarities and differences between IONT and IONC in cell death profiling. CONCLUSIONS: This comprehensive time course retinal transcriptome study comparing IONT and IONC lesions provides a unique valuable tool to understand the molecular mechanisms underlying optic nerve injury and to design neuroprotective protocols. %B Mol Vis %V 14 %P 1050-63 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18552980 %0 Journal Article %J Gastroenterology %D 2008 %T Transcriptional profiling of mRNA expression in the mouse distal colon %A Hoogerwerf, W. A. %A Sinha, M. %A A. Conesa %A Luxon, B. A. %A Shahinian, V. B. %A Cornelissen, G. %A Halberg, F. %A Bostwick, J. %A Timm, J. %A Cassone, V. M. 
%K Animals Blotting %K Genetic %K Inbred C57BL Microarray Analysis Proteins/*genetics/metabolism RNA %K Messenger/biosynthesis/*genetics Reverse Transcriptase Polymerase Chain Reaction *Transcription %K Western Cell Proliferation Circadian Rhythm/*genetics Colon/cytology/*metabolism Male Mice Mice %X BACKGROUND & AIMS: Intestinal epithelial cells and the myenteric plexus of the mouse gastrointestinal tract contain a circadian clock-based intrinsic time-keeping system. Because disruption of the biological clock has been associated with increased susceptibility to colon cancer and gastrointestinal symptoms, we aimed to identify rhythmically expressed genes in the mouse distal colon. METHODS: Microarray analysis was used to identify genes that were rhythmically expressed over a 24-hour light/dark cycle. The transcripts were then classified according to expression pattern, function, and association with physiologic and pathophysiologic processes of the colon. RESULTS: A circadian gene expression pattern was detected in approximately 3.7% of distal colonic genes. A large percentage of these genes were involved in cell signaling, differentiation, and proliferation and cell death. Of all the rhythmically expressed genes in the mouse colon, approximately 7% (64/906) have been associated with colorectal cancer formation (eg, B-cell leukemia/lymphoma-2 [Bcl2]) and 1.8% (18/906) with various colonic functions such as motility and secretion (eg, vasoactive intestinal polypeptide, cystic fibrosis transmembrane conductance regulator). CONCLUSIONS: A subset of genes in the murine colon follows a rhythmic expression pattern. These findings may have significant implications for colonic physiology and pathophysiology. 
%B Gastroenterology %V 135 %P 2019-29 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18848557 %0 Journal Article %J Food Chem Toxicol %D 2008 %T Transcriptome analysis provides new insights into liver changes induced in the rat upon dietary administration of the food additives butylated hydroxytoluene, curcumin, propyl gallate and thiabendazole %A Stierum, R. %A A. Conesa %A Heijne, W. %A Ommen, B. %A Junker, K. %A Scott, M. P. %A Price, R. J. %A Meredith, C. %A Lake, B. G. %A Groten, J. %K Animals Aryl Hydrocarbon Hydroxylases/metabolism Body Weight/drug effects Butylated Hydroxytoluene/toxicity Curcumin/toxicity Cytochrome P-450 CYP1A2/metabolism Cytochrome P-450 CYP2B1/metabolism DNA %K Complementary/biosynthesis/genetics Data Interpretation %K Sprague-Dawley Reverse Transcriptase Polymerase Chain Reaction Steroid Hydroxylases/metabolism Thiabendazole/toxicity %K Statistical *Diet Food Additives/*toxicity Gene Expression/drug effects *Gene Expression Profiling Glutathione Transferase/metabolism Liver/*drug effects Male Organ Size/drug effects Oxidation-Reduction Palmitoyl Coenzyme A/metabolism Propyl Gallate/toxi %X Transcriptomics was performed to gain insight into mechanisms of food additives butylated hydroxytoluene (BHT), curcumin (CC), propyl gallate (PG), and thiabendazole (TB), additives for which interactions in the liver can not be excluded. Additives were administered in diets for 28 days to Sprague-Dawley rats and cDNA microarray experiments were performed on hepatic RNA. BHT induced changes in the expression of 10 genes, including phase I (CYP2B1/2; CYP3A9; CYP2C6) and phase II metabolism (GST mu2). The CYP2B1/2 and GST expression findings were confirmed by real time RT-PCR, western blotting, and increased GST activity towards DCNB. CC altered the expression of 12 genes. Three out of these were related to peroxisomes (phytanoyl-CoA dioxygenase, enoyl-CoA hydratase; CYP4A3). 
Increased cyanide insensitive palmitoyl-CoA oxidation was observed, suggesting that CC is a weak peroxisome proliferator. TB changed the expression of 12 genes, including CYP1A2. In line, CYP1A2 protein expression was increased. The expression level of five genes, associated with p53 was found to change upon TB treatment, including p53 itself, GADD45alpha, DN-7, protein kinase C beta and serum albumin. These array experiments led to the novel finding that TB is capable of inducing p53 at the protein level, at least at the highest dose levels employed above the current NOAEL. The expression of eight genes changed upon PG administration. This study shows the value of gene expression profiling in food toxicology in terms of generating novel hypotheses on the mechanisms of action of food additives in relation to pathology. %B Food Chem Toxicol %V 46 %P 2616-28 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18539377 %0 Journal Article %J Hum Mutat %D 2008 %T Use of estimated evolutionary strength at the codon level improves the prediction of disease-related protein mutations in humans %A E. Capriotti %A Arbiza, L. %A Casadio, R. %A Dopazo, J. %A H. Dopazo %A M. A. Marti-Renom %K Algorithms Codon/genetics Computational Biology/*methods *DNA Mutational Analysis Databases %K Human Humans Iduronic Acid/analogs & derivatives/metabolism *Point Mutation Polymorphism %K Molecular *Genetic Predisposition to Disease Genetic Variation Genome %K Protein *Evolution %K Single Nucleotide Proteins/chemistry/*genetics Tumor Suppressor Protein p53/genetics %X Predicting the functional impact of protein variation is one of the most challenging problems in bioinformatics. A rapidly growing number of genome-scale studies provide large amounts of experimental data, allowing the application of rigorous statistical approaches for predicting whether a given single point mutation has an impact on human health. 
Up until now, existing methods have limited their source data to either protein or gene information. Novel in this work, we take advantage of both and focus on protein evolutionary information by using estimated selective pressures at the codon level. Here we introduce a new method (SeqProfCod) to predict the likelihood that a given protein variant is associated with human disease or not. Our method relies on a support vector machine (SVM) classifier trained using three sources of information: protein sequence, multiple protein sequence alignments, and the estimation of selective pressure at the codon level. SeqProfCod has been benchmarked with a large dataset of 8,987 single point mutations from 1,434 human proteins from SWISS-PROT. It achieves 82% overall accuracy and a correlation coefficient of 0.59, indicating that the estimation of the selective pressure helps in predicting the functional impact of single-point mutations. Moreover, this study demonstrates the synergic effect of combining two sources of information for predicting the functional effects of protein variants: protein sequence/profile-based information and the evolutionary estimation of the selective pressures at the codon level. The results of large-scale application of SeqProfCod over all annotated point mutations in SWISS-PROT (available for download at http://sgu.bioinfo.cipf.es/services/Omidios/; last accessed: 24 August 2007), could be used to support clinical studies. %B Hum Mutat %V 29 %P 198-204 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17935148 %0 Journal Article %J Hum Mutat %D 2008 %T Use of estimated evolutionary strength at the codon level improves the prediction of disease-related protein mutations in humans. 
%A Capriotti, Emidio %A Arbiza, Leonardo %A Casadio, Rita %A Dopazo, Joaquin %A Dopazo, Hernán %A Marti-Renom, Marc A %K Algorithms %K Codon %K Computational Biology %K Databases, Protein %K DNA Mutational Analysis %K Evolution, Molecular %K Genetic Predisposition to Disease %K Genetic Variation %K Genome, Human %K Humans %K Iduronic Acid %K Point Mutation %K Polymorphism, Single Nucleotide %K Proteins %K Tumor Suppressor Protein p53 %X

Predicting the functional impact of protein variation is one of the most challenging problems in bioinformatics. A rapidly growing number of genome-scale studies provide large amounts of experimental data, allowing the application of rigorous statistical approaches for predicting whether a given single point mutation has an impact on human health. Up until now, existing methods have limited their source data to either protein or gene information. Novel in this work, we take advantage of both and focus on protein evolutionary information by using estimated selective pressures at the codon level. Here we introduce a new method (SeqProfCod) to predict the likelihood that a given protein variant is associated with human disease or not. Our method relies on a support vector machine (SVM) classifier trained using three sources of information: protein sequence, multiple protein sequence alignments, and the estimation of selective pressure at the codon level. SeqProfCod has been benchmarked with a large dataset of 8,987 single point mutations from 1,434 human proteins from SWISS-PROT. It achieves 82% overall accuracy and a correlation coefficient of 0.59, indicating that the estimation of the selective pressure helps in predicting the functional impact of single-point mutations. Moreover, this study demonstrates the synergic effect of combining two sources of information for predicting the functional effects of protein variants: protein sequence/profile-based information and the evolutionary estimation of the selective pressures at the codon level. The results of large-scale application of SeqProfCod over all annotated point mutations in SWISS-PROT (available for download at http://sgu.bioinfo.cipf.es/services/Omidios/; last accessed: 24 August 2007), could be used to support clinical studies.

%B Hum Mutat %V 29 %P 198-204 %8 2008 Jan %G eng %N 1 %1 https://www.ncbi.nlm.nih.gov/pubmed/17935148?dopt=Abstract %R 10.1002/humu.20628 %0 Journal Article %J BMC Genomics %D 2007 %T Analysis of 13000 unique Citrus clusters associated with fruit quality, production and salinity tolerance %A Terol, J. %A A. Conesa %A Colmenero, J. M. %A Cercos, M. %A Tadeo, F. %A Agusti, J. %A Alos, E. %A Andres, F. %A Soler, G. %A Brumos, J. %A Iglesias, D. J. %A Gotz, S. %A Legaz, F. %A Argout, X. %A Courtois, B. %A Ollitrault, P. %A Dossat, C. %A Wincker, P. %A Morillon, R. %A Talon, M. %K Acclimatization/*genetics Amino Acid Motifs Citrus/*genetics Cluster Analysis Expressed Sequence Tags Fruit/genetics Gene Duplication *Gene Expression Regulation %K Plant Gene Library Genes %K Plant Genomics Molecular Sequence Data Multigene Family Phylogeny *Salts/adverse effects %X BACKGROUND: Improvement of Citrus, the most economically important fruit crop in the world, is extremely slow and inherently costly because of the long-term nature of tree breeding and an unusual combination of reproductive characteristics. Aside from disease resistance, major commercial traits in Citrus are improved fruit quality, higher yield and tolerance to environmental stresses, especially salinity. RESULTS: A normalized full length and 9 standard cDNA libraries were generated, representing particular treatments and tissues from selected varieties (Citrus clementina and C. sinensis) and rootstocks (C. reshni, and C. sinensis x Poncirus trifoliata) differing in fruit quality, resistance to abscission, and tolerance to salinity. The goal of this work was to provide a large expressed sequence tag (EST) collection enriched with transcripts related to these well appreciated agronomical traits. Towards this end, more than 54000 ESTs derived from these libraries were analyzed and annotated. 
Assembly of 52626 useful sequences generated 15664 putative transcription units distributed in 7120 contigs, and 8544 singletons. BLAST annotation produced significant hits for more than 80% of the hypothetical transcription units and suggested that 647 of these might be Citrus specific unigenes. The unigene set, composed of 13000 putative different transcripts, including more than 5000 novel Citrus genes, was assigned with putative functions based on similarity, GO annotations and protein domains. CONCLUSION: Comparative genomics with Arabidopsis revealed the presence of putative conserved orthologs and single copy genes in Citrus and also the occurrence of both gene duplication events and increased number of genes for specific pathways. In addition, phylogenetic analysis performed on the ammonium transporter family and glycosyl transferase family 20 suggested the existence of Citrus paralogs. Analysis of the Citrus gene space showed that the most important metabolic pathways known to affect fruit quality were represented in the unigene set. Overall, the similarity analyses indicated that the sequences of the genes belonging to these varieties and rootstocks were essentially identical, suggesting that the differential behaviour of these species cannot be attributed to major sequence divergences. This Citrus EST assembly contributes both crucial information to discover genes of agronomical interest and tools for genetic and genomic analyses, such as the development of new markers and microarrays. %B BMC Genomics %V 8 %P 31 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17254327 %0 Journal Article %J BMC Bioinformatics %D 2007 %T The AnnoLite and AnnoLyze programs for comparative annotation of protein structures %A M. A. Marti-Renom %A Rossi, A. %A Fatima Al-Shahrour %A Davis, F. P. %A Pieper, U. %A Dopazo, J. %A Sali, A. 
%K *Algorithms Amino Acid Sequence Confidence Intervals Data Interpretation %K Amino Acid *Software Structure-Activity Relationship %K Protein Information Storage and Retrieval/methods Molecular Sequence Data Proteins/*chemistry/classification/*metabolism Sensitivity and Specificity Sequence Alignment/*methods Sequence Analysis %K Protein/*methods Sequence Homology %K Statistical *Databases %X BACKGROUND: Advances in structural biology, including structural genomics, have resulted in a rapid increase in the number of experimentally determined protein structures. However, about half of the structures deposited by the structural genomics consortia have little or no information about their biological function. Therefore, there is a need for tools for automatically and comprehensively annotating the function of protein structures. We aim to provide such tools by applying comparative protein structure annotation that relies on detectable relationships between protein structures to transfer functional annotations. Here we introduce two programs, AnnoLite and AnnoLyze, which use the structural alignments deposited in the DBAli database. DESCRIPTION: AnnoLite predicts the SCOP, CATH, EC, InterPro, PfamA, and GO terms with an average sensitivity of 90% and average precision of 80%. AnnoLyze predicts ligand binding site and domain interaction patches with an average sensitivity of 70% and average precision of 30%, correctly localizing binding sites for small molecules in 95% of its predictions. CONCLUSION: The AnnoLite and AnnoLyze programs for comparative annotation of protein structures can reliably and automatically annotate new protein structures. The programs are fully accessible via the Internet as part of the DBAli suite of tools at http://salilab.org/DBAli/. 
%B BMC Bioinformatics %V 8 Suppl 4 %P S4 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17570147 %0 Journal Article %J Cancer Res %D 2007 %T Association study of 69 genes in the ret pathway identifies low-penetrance loci in sporadic medullary thyroid carcinoma %A Ruiz-Llorente, S. %A Montero-Conde, C. %A Milne, R. L. %A Moya, C. M. %A Cebrian, A. %A Leton, R. %A Cascon, A. %A Mercadillo, F. %A Landa, I. %A Borrego, S. %A Perez de Nanclares, G. %A Alvarez-Escola, C. %A Diaz-Perez, J. A. %A Carracedo, A. %A Urioste, M. %A Gonzalez-Neira, A. %A Benitez, J. %A Santisteban, P. %A Dopazo, J. %A Ponder, B. A. %A M. Robledo %K 80 and over Carcinoma %K Adolescent Adult Aged Aged %K Genetic %K Genetic Proto-Oncogene Proteins c-ret/*genetics/metabolism Signal Transduction Thyroid Neoplasms/*genetics/metabolism Transcription %K Medullary/*genetics/metabolism Case-Control Studies Cyclin-Dependent Kinase Inhibitor p15/biosynthesis/genetics Female Genetic Predisposition to Disease Germ-Line Mutation Haplotypes Humans Male Middle Aged Penetrance Polymorphism %K Single Nucleotide Promoter Regions %X To date, few association studies have been done to better understand the genetic basis for the development of sporadic medullary thyroid carcinoma (sMTC). To identify additional low-penetrance genes, we have done a two-stage case-control study in two European populations using high-throughput genotyping. We selected 417 single nucleotide polymorphisms (SNP) belonging to 69 genes either related to RET signaling pathway/functions or involved in key processes for cancer development. TagSNPs and functional variants were included where possible. These SNPs were initially studied in the largest known series of sMTC cases (n = 266) and controls (n = 422), all of Spanish origin. In stage II, an independent British series of 155 sMTC patients and 531 controls was included to validate the previous results. 
Associations were assessed by an exhaustive analysis of individual SNPs but also considering gene- and linkage disequilibrium-based haplotypes. This strategy allowed us to identify seven low-penetrance genes, six of them (STAT1, AURKA, BCL2, CDKN2B, CDK6, and COMT) consistently associated with sMTC risk in the two case-control series and a seventh (HRAS) with individual SNPs and haplotypes associated with sMTC in the Spanish data set. The potential role of CDKN2B was confirmed by a functional assay showing a role of a SNP (rs7044859) in the promoter region in altering the binding of the transcription factor HNF1. These results highlight the utility of association studies using homogeneous series of cases for better understanding complex diseases. %B Cancer Res %V 67 %P 9561-7 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17909067 %0 Journal Article %J PLoS Comput Biol %D 2007 %T Characterization of protein hubs by inferring interacting motifs from protein interactions %A Aragues, R. %A Sali, A. %A Bonet, J. %A M. A. Marti-Renom %A Oliva, B. %K Amino Acid Motifs Amino Acid Sequence Binding Sites Computer Simulation *Models %K Chemical *Models %K Molecular Molecular Sequence Data Protein Binding Protein Interaction Mapping/*methods Proteins/*chemistry Sequence Analysis %K Protein/*methods %X The characterization of protein interactions is essential for understanding biological systems. While genome-scale methods are available for identifying interacting proteins, they do not pinpoint the interacting motifs (e.g., a domain, sequence segments, a binding site, or a set of residues). Here, we develop and apply a method for delineating the interacting motifs of hub proteins (i.e., highly connected proteins). The method relies on the observation that proteins with common interaction partners tend to interact with these partners through a common interacting motif. 
The sole input for the method are binary protein interactions; neither sequence nor structure information is needed. The approach is evaluated by comparing the inferred interacting motifs with domain families defined for 368 proteins in the Structural Classification of Proteins (SCOP). The positive predictive value of the method for detecting proteins with common SCOP families is 75% at sensitivity of 10%. Most of the inferred interacting motifs were significantly associated with sequence patterns, which could be responsible for the common interactions. We find that yeast hubs with multiple interacting motifs are more likely to be essential than hubs with one or two interacting motifs, thus rationalizing the previously observed correlation between essentiality and the number of interacting partners of a protein. We also find that yeast hubs with multiple interacting motifs evolve slower than the average protein, contrary to the hubs with one or two interacting motifs. The proposed method will help us discover unknown interacting motifs and provide biological insights about protein hubs and their roles in interaction networks. %B PLoS Comput Biol %V 3 %P 1761-71 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17941705 %0 Book Section %B Fundamentals of data mining in genomics and proteomics %D 2007 %T Clustering - Class discovery in the post-genomic era %A Dopazo, J. %B Fundamentals of data mining in genomics and proteomics %I Springer-Verlag, W. Dubitzky, M. Granzow and D.P. Berrar %C New York, USA %G eng %0 Journal Article %J Nucleic Acids Res %D 2007 %T DBAli tools: mining the protein structure space. 
%A Marti-Renom, Marc A %A Pieper, Ursula %A Madhusudhan, M S %A Rossi, Andrea %A Eswar, Narayanan %A Davis, Fred P %A Al-Shahrour, Fátima %A Dopazo, Joaquin %A Sali, Andrej %K Algorithms %K Amino Acid Sequence %K Computational Biology %K Data Interpretation, Statistical %K Databases, Protein %K Internet %K Molecular Sequence Data %K Protein Conformation %K Proteins %K Pseudomonas aeruginosa %K Sequence Alignment %K Sequence Analysis, Protein %K Sequence Homology, Amino Acid %K Software %K Structure-Activity Relationship %X

The DBAli tools use a comprehensive set of structural alignments in the DBAli database to leverage the structural information deposited in the Protein Data Bank (PDB). These tools include (i) the DBAlit program that allows users to input the 3D coordinates of a protein structure for comparison by MAMMOTH against all chains in the PDB; (ii) the AnnoLite and AnnoLyze programs that annotate a target structure based on its stored relationships to other structures; (iii) the ModClus program that clusters structures by sequence and structure similarities; (iv) the ModDom program that identifies domains as recurrent structural fragments and (v) an implementation of the COMPARER method in the SALIGN command in MODELLER that creates a multiple structure alignment for a set of related protein structures. Thus, the DBAli tools, which are freely accessible via the World Wide Web at http://salilab.org/DBAli/, allow users to mine the protein structure space by establishing relationships between protein structures and their functions.

%B Nucleic Acids Res %V 35 %P W393-7 %8 2007 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/17478513?dopt=Abstract %R 10.1093/nar/gkm236 %0 Journal Article %J Nucleic Acids Res %D 2007 %T DBAli tools: mining the protein structure space %A M. A. Marti-Renom %A Pieper, U. %A Madhusudhan, M. S. %A Rossi, A. %A Eswar, N. %A Davis, F. P. %A Fatima Al-Shahrour %A Dopazo, J. %A Sali, A. %K *Algorithms Amino Acid Sequence Computational Biology/*methods Data Interpretation %K Amino Acid *Software Structure-Activity Relationship %K Protein Internet Molecular Sequence Data Protein Conformation Proteins/*chemistry/classification/*metabolism Pseudomonas aeruginosa/*metabolism Sequence Alignment/*methods Sequence Analysis %K Protein/*methods Sequence Homology %K Statistical *Databases %X The DBAli tools use a comprehensive set of structural alignments in the DBAli database to leverage the structural information deposited in the Protein Data Bank (PDB). These tools include (i) the DBAlit program that allows users to input the 3D coordinates of a protein structure for comparison by MAMMOTH against all chains in the PDB; (ii) the AnnoLite and AnnoLyze programs that annotate a target structure based on its stored relationships to other structures; (iii) the ModClus program that clusters structures by sequence and structure similarities; (iv) the ModDom program that identifies domains as recurrent structural fragments and (v) an implementation of the COMPARER method in the SALIGN command in MODELLER that creates a multiple structure alignment for a set of related protein structures. Thus, the DBAli tools, which are freely accessible via the World Wide Web at http://salilab.org/DBAli/, allow users to mine the protein structure space by establishing relationships between protein structures and their functions. 
%B Nucleic Acids Res %V 35 %P W393-7 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17478513 %0 Journal Article %J Bioinformatics %D 2007 %T Discovering gene expression patterns in time course microarray experiments by ANOVA-SCA %A Nueda, M. J. %A A. Conesa %A Westerhuis, J. A. %A Hoefsloot, H. C. %A Smilde, A. K. %A Talon, M. %A Ferrer, A. %K Algorithms *Analysis of Variance Computational Biology/*methods Computer Simulation Data Interpretation %K Genetic %K Genetic Models %K Statistical Gene Expression Profiling/*methods Models %K Statistical Oligonucleotide Array Sequence Analysis/*methods Principal Component Analysis Time Factors Transcription %X MOTIVATION: Designed microarray experiments are used to investigate the effects that controlled experimental factors have on gene expression and learn about the transcriptional responses associated with external variables. In these datasets, signals of interest coexist with varying sources of unwanted noise in a framework of (co)relation among the measured variables and with the different levels of the studied factors. Discovering experimentally relevant transcriptional changes requires methodologies that take all these elements into account. RESULTS: In this work, we develop the application of the Analysis of variance-simultaneous component analysis (ANOVA-SCA; Smilde et al., Bioinformatics, 2005) to the analysis of multiple series time course microarray data as an example of multifactorial gene expression profiling experiments. We denoted this implementation as ASCA-genes. We show how the combination of ANOVA-modeling and a dimension reduction technique is effective in extracting targeted signals from data, bypassing structural noise. The methodology is valuable for identifying main and secondary responses associated with the experimental factors and spotting relevant experimental conditions. 
We additionally propose a novel approach for gene selection in the context of the relation of individual transcriptional patterns to global gene expression signals. We demonstrate the methodology on both real and synthetic datasets. AVAILABILITY: ASCA-genes has been implemented in the statistical language R and is available at http://www.ivia.es/centrodegenomica/bioinformatics.htm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. %B Bioinformatics %V 23 %P 1792-800 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17519250 %0 Journal Article %J BMC Genomics %D 2007 %T Evidence for systems-level molecular mechanisms of tumorigenesis %A Hernandez, P. %A Huerta-Cepas, J. %A Montaner, D. %A Fatima Al-Shahrour %A Valls, J. %A Gomez, L. %A Capella, G. %A Dopazo, J. %A Pujana, M. A. %K *Cell Transformation %K Biological Models %K Genetic Models %K Messenger/metabolism Signal Transduction Systems Biology %K Neoplastic *Gene Expression Profiling *Gene Expression Regulation %K Neoplastic Humans Male Models %K Statistical Neoplasm Proteins/*physiology Neoplasms/etiology/*genetics Prostatic Neoplasms/genetics Protein Interaction Mapping RNA %X BACKGROUND: Cancer arises from the consecutive acquisition of genetic alterations. Increasing evidence suggests that as a consequence of these alterations, molecular interactions are reprogrammed in the context of highly connected and regulated cellular networks. Coordinated reprogramming would allow the cell to acquire the capabilities for malignant growth. RESULTS: Here, we determine the coordinated function of cancer gene products (i.e., proteins encoded by differentially expressed genes in tumors relative to healthy tissue counterparts, hereafter referred to as "CGPs") defined as their topological properties and organization in the interactome network. 
We show that CGPs are central to information exchange and propagation and that they are specifically organized to promote tumorigenesis. Centrality is identified by both local (degree) and global (betweenness and closeness) measures, and systematically appears in down-regulated CGPs. Up-regulated CGPs do not consistently exhibit centrality, but both types of cancer products determine the overall integrity of the network structure. In addition to centrality, down-regulated CGPs show topological association that correlates with common biological processes and pathways involved in tumorigenesis. CONCLUSION: Given the current limited coverage of the human interactome, this study proposes that tumorigenesis takes place in a specific and organized way at the molecular systems-level and suggests a model that comprises the precise down-regulation of groups of topologically-associated proteins involved in particular functions, orchestrated with the up-regulation of specific proteins. %B BMC Genomics %V 8 %P 185 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17584915 %0 Journal Article %J BMC Genomics %D 2007 %T Evidence for systems-level molecular mechanisms of tumorigenesis. %A Hernández, Pilar %A Huerta-Cepas, Jaime %A Montaner, David %A Al-Shahrour, Fátima %A Valls, Joan %A Gómez, Laia %A Capellà, Gabriel %A Dopazo, Joaquin %A Pujana, Miguel Angel %K Cell Transformation, Neoplastic %K Gene Expression Profiling %K Gene Expression Regulation, Neoplastic %K Humans %K Male %K Models, Biological %K Models, Genetic %K Models, Statistical %K Neoplasm Proteins %K Neoplasms %K Prostatic Neoplasms %K Protein Interaction Mapping %K RNA, Messenger %K Signal Transduction %K Systems biology %X

BACKGROUND: Cancer arises from the consecutive acquisition of genetic alterations. Increasing evidence suggests that as a consequence of these alterations, molecular interactions are reprogrammed in the context of highly connected and regulated cellular networks. Coordinated reprogramming would allow the cell to acquire the capabilities for malignant growth.

RESULTS: Here, we determine the coordinated function of cancer gene products (i.e., proteins encoded by differentially expressed genes in tumors relative to healthy tissue counterparts, hereafter referred to as "CGPs") defined as their topological properties and organization in the interactome network. We show that CGPs are central to information exchange and propagation and that they are specifically organized to promote tumorigenesis. Centrality is identified by both local (degree) and global (betweenness and closeness) measures, and systematically appears in down-regulated CGPs. Up-regulated CGPs do not consistently exhibit centrality, but both types of cancer products determine the overall integrity of the network structure. In addition to centrality, down-regulated CGPs show topological association that correlates with common biological processes and pathways involved in tumorigenesis.

CONCLUSION: Given the current limited coverage of the human interactome, this study proposes that tumorigenesis takes place in a specific and organized way at the molecular systems-level and suggests a model that comprises the precise down-regulation of groups of topologically-associated proteins involved in particular functions, orchestrated with the up-regulation of specific proteins.

%B BMC Genomics %V 8 %P 185 %8 2007 Jun 20 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/17584915?dopt=Abstract %R 10.1186/1471-2164-8-185 %0 Book Section %B Microarray Technology Through Applications %D 2007 %T f single nucleotide polymorphism arrays: Design, tools and applications %A M. Robledo %A González-Neira, A %A Dopazo, J. %B Microarray Technology Through Applications %I Taylor & Francis, F. Falciani %C New York, USA %G eng %0 Journal Article %J Nucleic Acids Res %D 2007 %T FatiGO +: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments %A Fatima Al-Shahrour %A Minguez, P. %A Tarraga, J. %A Medina, Ignacio %A Alloza, E. %A Montaner, D. %A Dopazo, J. %K babelomics %K functional enrichment analysis %X

The ultimate goal of any genome-scale experiment is to provide a functional interpretation of the data, relating the available information with the hypotheses that originated the experiment. Thus, functional profiling methods have become essential in diverse scenarios such as microarray experiments, proteomics, etc. We present FatiGO+, a web-based tool for the functional profiling of genome-scale experiments, especially oriented to the interpretation of microarray experiments. In addition to different functional annotations (gene ontology, KEGG pathways, Interpro motifs, Swissprot keywords and text-mining based bioentities related to diseases and chemical compounds), FatiGO+ includes, as a novelty, regulatory and structural information. The regulatory information used includes predictions of targets for distinct regulatory elements (obtained from the Transfac and CisRed databases). Additionally, FatiGO+ uses predictions of target motifs of miRNA to infer which of these can be activated or deactivated in the sample of genes studied. Finally, properties of gene products related to their relative location and connections in the interactome have also been used. Also, enrichment of any of these functional terms can be directly analysed on chromosomal coordinates. FatiGO+ can be found at: http://www.fatigoplus.org and within the Babelomics environment http://www.babelomics.org.

%B Nucleic Acids Res %V 35 %P W91-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17478504 %0 Journal Article %J Nucleic Acids Res %D 2007 %T FatiGO +: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments. %A Al-Shahrour, Fátima %A Minguez, Pablo %A Tárraga, Joaquín %A Medina, Ignacio %A Alloza, Eva %A Montaner, David %A Dopazo, Joaquin %K Amino Acid Motifs %K Animals %K Binding Sites %K Computational Biology %K Gene Expression Profiling %K Genes %K Genomics %K Humans %K Internet %K Oligonucleotide Array Sequence Analysis %K Programming Languages %K Software %K Systems Integration %K Transcription Factors %X

The ultimate goal of any genome-scale experiment is to provide a functional interpretation of the data, relating the available information with the hypotheses that originated the experiment. Thus, functional profiling methods have become essential in diverse scenarios such as microarray experiments, proteomics, etc. We present FatiGO+, a web-based tool for the functional profiling of genome-scale experiments, especially oriented to the interpretation of microarray experiments. In addition to different functional annotations (gene ontology, KEGG pathways, Interpro motifs, Swissprot keywords and text-mining based bioentities related to diseases and chemical compounds), FatiGO+ includes, as a novelty, regulatory and structural information. The regulatory information used includes predictions of targets for distinct regulatory elements (obtained from the Transfac and CisRed databases). Additionally, FatiGO+ uses predictions of target motifs of miRNA to infer which of these can be activated or deactivated in the sample of genes studied. Finally, properties of gene products related to their relative location and connections in the interactome have also been used. Also, enrichment of any of these functional terms can be directly analysed on chromosomal coordinates. FatiGO+ can be found at: http://www.fatigoplus.org and within the Babelomics environment http://www.babelomics.org.

%B Nucleic Acids Res %V 35 %P W91-6 %8 2007 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/17478504?dopt=Abstract %R 10.1093/nar/gkm260 %0 Journal Article %J PLoS Comput Biol %D 2007 %T From endosymbiont to host-controlled organelle: the hijacking of mitochondrial protein synthesis and metabolism %A Gabaldón, T. %A M. A. Huynen %K Computer Simulation DNA Mutational Analysis/methods Evolution *Evolution %K Genetic Organelles/physiology Protein Biosynthesis/*genetics Symbiosis/*genetics %K Molecular Fungal Proteins/*physiology Genetic Variation/genetics Humans Mitochondria/*physiology Mitochondrial Proteins/*physiology *Models %X Mitochondria are eukaryotic organelles that originated from the endosymbiosis of an alpha-proteobacterium. To gain insight into the evolution of the mitochondrial proteome as it proceeded through the transition from a free-living cell to a specialized organelle, we compared a reconstructed ancestral proteome of the mitochondrion with the proteomes of alpha-proteobacteria as well as with the mitochondrial proteomes in yeast and man. Overall, there has been a large turnover of the mitochondrial proteome during the evolution of mitochondria. Early in the evolution of the mitochondrion, proteins involved in cell envelope synthesis have virtually disappeared, whereas proteins involved in replication, transcription, cell division, transport, regulation, and signal transduction have been replaced by eukaryotic proteins. More than half of what remains from the mitochondrial ancestor in modern mitochondria corresponds to translation, including post-translational modifications, and to metabolic pathways that are directly, or indirectly, involved in energy conversion. Altogether, the results indicate that the eukaryotic host has hijacked the proto-mitochondrion, taking control of its protein synthesis and metabolism. 
%B PLoS Comput Biol %V 3 %P e219 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17983265 %0 Journal Article %J BMC Bioinformatics %D 2007 %T From genes to functional classes in the study of biological systems. %A Al-Shahrour, Fátima %A Arbiza, Leonardo %A Dopazo, Hernán %A Huerta-Cepas, Jaime %A Minguez, Pablo %A Montaner, David %A Dopazo, Joaquin %K Algorithms %K Chromosome Mapping %K Computer Simulation %K Gene Expression Profiling %K Models, Biological %K Multigene Family %K Signal Transduction %K Software %K Systems biology %K User-Computer Interface %X

BACKGROUND: With the popularization of high-throughput techniques, the need for procedures that help in the biological interpretation of results has increased enormously. Recently, new procedures inspired by systems biology criteria have started to be developed.

RESULTS: Here we present FatiScan, a web-based program which implements a threshold-independent test for the functional interpretation of large-scale experiments that does not depend on the pre-selection of genes based on the multiple application of independent tests to each gene. The test implemented aims to directly test the behaviour of blocks of functionally related genes, instead of focusing on single genes. In addition, the test does not depend on the type of the data used for obtaining significance values, and consequently different types of biologically informative terms (gene ontology, pathways, functional motifs, transcription factor binding sites or regulatory sites from CisRed) can be applied to different classes of genome-scale studies. We exemplify its application in microarray gene expression, evolution and interactomics.

CONCLUSION: Methods for gene set enrichment which, in addition, are independent from the original data and experimental design constitute a promising alternative for the functional profiling of genome-scale experiments. A web server that performs the test described and other similar ones can be found at: http://www.babelomics.org.

%B BMC Bioinformatics %V 8 %P 114 %8 2007 Apr 03 %G eng %1 https://www.ncbi.nlm.nih.gov/pubmed/17407596?dopt=Abstract %R 10.1186/1471-2105-8-114 %0 Journal Article %J BMC Bioinformatics %D 2007 %T From genes to functional classes in the study of biological systems %A Fatima Al-Shahrour %A Arbiza, L. %A H. Dopazo %A Huerta-Cepas, J. %A Minguez, P. %A Montaner, D. %A Dopazo, J. %K Algorithms Chromosome Mapping/*methods Computer Simulation Gene Expression Profiling/methods *Models %K babelomics %K Biological Multigene Family/*physiology Signal Transduction/*physiology *Software Systems Biology/*methods *User-Computer Interface %X

BACKGROUND: With the popularization of high-throughput techniques, the need for procedures that help in the biological interpretation of results has increased enormously. Recently, new procedures inspired by systems biology criteria have started to be developed. RESULTS: Here we present FatiScan, a web-based program which implements a threshold-independent test for the functional interpretation of large-scale experiments that does not depend on the pre-selection of genes based on the multiple application of independent tests to each gene. The test implemented aims to directly test the behaviour of blocks of functionally related genes, instead of focusing on single genes. In addition, the test does not depend on the type of the data used for obtaining significance values, and consequently different types of biologically informative terms (gene ontology, pathways, functional motifs, transcription factor binding sites or regulatory sites from CisRed) can be applied to different classes of genome-scale studies. We exemplify its application in microarray gene expression, evolution and interactomics. CONCLUSION: Methods for gene set enrichment which, in addition, are independent from the original data and experimental design constitute a promising alternative for the functional profiling of genome-scale experiments. A web server that performs the test described and other similar ones can be found at: http://www.babelomics.org.

%B BMC Bioinformatics %V 8 %P 114 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17407596 %0 Book Section %B Microarray Technology Through Applications %D 2007 %T Functional annotation of microarray experiments %A Dopazo, J. %A Fatima Al-Shahrour %B Microarray Technology Through Applications %I Taylor & Francis, F. Falciani %C New York, USA %G eng %0 Journal Article %J Bioinformation %D 2007 %T Functional profiling and gene expression analysis of chromosomal copy number alterations %A L. Conde %A Montaner, D. %A Burguet-Castell, J. %A Tarraga, J. %A Fatima Al-Shahrour %A Dopazo, J. %K babelomics %X

Contrary to the traditional view in which only one or a few key genes were supposed to be the causative factors of diseases, we discuss the importance of considering groups of functionally related genes in the study of pathologies characterised by chromosomal copy number alterations. Recent observations have reported the existence of regions in higher eukaryotic chromosomes (including humans) containing genes of related function that show a high degree of coregulation. Copy number alterations will consequently affect clusters of functionally related genes, which will, in many cases, be the final causative agents of the diseased phenotype. Therefore, we propose that the functional profiling of the regions affected by copy number alterations must be an important aspect to take into account in the understanding of this type of pathology. To illustrate this, we present an integrated study of DNA copy number variations and gene expression, along with the functional profiling of chromosomal regions, in a case of multiple myeloma.

%B Bioinformation %V 1 %P 432-5 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17597935 %0 Journal Article %J Bioinformation %D 2007 %T Functional profiling and gene expression analysis of chromosomal copy number alterations. %A Conde, Lucia %A Montaner, David %A Burguet-Castell, Jordi %A Tárraga, Joaquín %A Al-Shahrour, Fátima %A Dopazo, Joaquin %X

Contrary to the traditional view in which only one or a few key genes were supposed to be the causative factors of diseases, we discuss the importance of considering groups of functionally related genes in the study of pathologies characterised by chromosomal copy number alterations. Recent observations have reported the existence of regions in higher eukaryotic chromosomes (including humans) containing genes of related function that show a high degree of coregulation. Copy number alterations will consequently affect clusters of functionally related genes, which will, in many cases, be the final causative agents of the diseased phenotype. Therefore, we propose that the functional profiling of the regions affected by copy number alterations must be an important aspect to take into account in the understanding of this type of pathology. To illustrate this, we present an integrated study of DNA copy number variations and gene expression, along with the functional profiling of chromosomal regions, in a case of multiple myeloma.

%B Bioinformation %V 1 %P 432-5 %8 2007 Apr 10 %G eng %N 10 %1 https://www.ncbi.nlm.nih.gov/pubmed/17597935?dopt=Abstract %R 10.6026/97320630001432 %0 Journal Article %J Bioinformatics %D 2007 %T Functional profiling of microarray experiments using text-mining derived bioentities. %A Minguez, Pablo %A Al-Shahrour, Fátima %A Montaner, David %A Dopazo, Joaquin %K Artificial Intelligence %K Databases, Protein %K Gene Expression Profiling %K Information Storage and Retrieval %K Natural Language Processing %K Proteins %K Research Design %K Systems Integration %X

MOTIVATION: The increasing use of microarray technologies brought about a parallel demand for methods for the functional interpretation of the results. Beyond the conventional functional annotations for genes, such as gene ontology, pathways, etc., other sources of information are still to be exploited. Text-mining methods allow extracting informative terms (bioentities) with different functional, chemical, clinical, etc. meanings that can be associated with genes. We show how to use these associations within an appropriate statistical framework and how to apply them through easy-to-use, web-based environments to the functional interpretation of microarray experiments. Functional enrichment and gene set enrichment tests using bioentities are presented.

%B Bioinformatics %V 23 %P 3098-9 %8 2007 Nov 15 %G eng %N 22 %1 https://www.ncbi.nlm.nih.gov/pubmed/17855415?dopt=Abstract %R 10.1093/bioinformatics/btm445 %0 Journal Article %J Bioinformatics %D 2007 %T Functional profiling of microarray experiments using text-mining derived bioentities %A Minguez, P. %A Fatima Al-Shahrour %A Montaner, D. %A Dopazo, J. %K Artificial Intelligence *Databases %K babelomics %K Protein Gene Expression Profiling/*methods Information Storage and Retrieval/*methods *Natural Language Processing Proteins/*classification/*metabolism Research/*methods Systems Integration %X

MOTIVATION: The increasing use of microarray technologies brought about a parallel demand for methods for the functional interpretation of the results. Beyond the conventional functional annotations for genes, such as gene ontology, pathways, etc., other sources of information are still to be exploited. Text-mining methods allow extracting informative terms (bioentities) with different functional, chemical, clinical, etc. meanings that can be associated with genes. We show how to use these associations within an appropriate statistical framework and how to apply them through easy-to-use, web-based environments to the functional interpretation of microarray experiments. Functional enrichment and gene set enrichment tests using bioentities are presented.

%B Bioinformatics %V 23 %P 3098-9 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17855415 %0 Journal Article %J Genome Biol %D 2007 %T The human phylome %A Huerta-Cepas, J. %A H. Dopazo %A Dopazo, J. %A Gabaldón, T. %K Animals *Evolution Evolution %K DNA %K Molecular Gene Duplication *Genome Humans *Phylogeny Proteins/genetics Sequence Analysis %X BACKGROUND: Phylogenomics analyses serve to establish evolutionary relationships among organisms and their genes. A phylome, the complete collection of all gene phylogenies in a genome, constitutes a valuable source of information, but its use in large genomes still constitutes a technical challenge. The use of phylomes also requires the development of new methods that help us to interpret them. RESULTS: We reconstruct here the human phylome, which includes the evolutionary relationships of all human proteins and their homologs among 39 fully sequenced eukaryotes. Phylogenetic techniques used include alignment trimming, branch length optimization, evolutionary model testing and maximum likelihood and Bayesian methods. Although differences with alternative topologies are minor, most of the trees support the Coelomata and Unikont hypotheses as well as the grouping of primates with Laurasiatheria to the exclusion of rodents. We assess the extent of gene duplication events and their relationship with the functional roles of the protein families involved. We find support for at least one, and probably two, rounds of whole genome duplications before vertebrate radiation. Using a novel algorithm that is independent from a species phylogeny, we derive orthology and paralogy relationships of human proteins among eukaryotic genomes. CONCLUSION: Topological variations among phylogenies for different genes are to be expected, highlighting the danger of gene-sampling effects in phylogenomic analyses. 
Several links can be established between the functions of gene families duplicated at certain phylogenetic splits and major evolutionary transitions in those lineages. The pipeline implemented here can be easily adapted for use in other organisms. %B Genome Biol %V 8 %P R109 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17567924 %0 Journal Article %J BMC Genomics %D 2007 %T Identification of conserved domains in the promoter regions of nitric oxide synthase 2: implications for the species-specific transcription and evolutionary differences %A Rico, D. %A Vaquerizas, J. M. %A H. Dopazo %A Bosca, L. %K Animals Base Sequence Conserved Sequence Enhancer Elements %K Genetic *Evolution %K Genetic Response Elements Species Specificity %K Molecular Humans Inflammation/metabolism Interferon-gamma/metabolism Mice NF-kappa B/metabolism Nitric Oxide Synthase Type II/*genetics *Promoter Regions %X BACKGROUND: The majority of the genes involved in the inflammatory response are highly conserved in mammals. These genes are not significantly expressed under normal conditions and are mainly regulated at the transcription and post-transcriptional level. Transcription from the promoters of these genes is very dependent on NF-kappaB activation, which integrates the response to diverse extracellular stresses. However, in spite of the high conservation of the pattern of promoter regulation in kappaB-regulated genes, there is inter-species diversity in some genes. One example is nitric oxide synthase 2 (NOS-2), which exhibits a species-specific pattern of expression in response to infection or pro-inflammatory challenge. RESULTS: We have conducted a comparative genomic analysis of NOS-2 with different bioinformatic approaches. This analysis shows that in the NOS-2 gene promoter the position and the evolutionary divergence of some conserved regions are different in rodents and non-rodent mammals, and in particular in primates. 
Two previously undescribed distal regions in rodents that are similar to the unique upstream region responsible for the NF-kappaB activation of NOS-2 in humans are fragmented and translocated to different locations in the rodent promoters. The rodent sequences moreover lack the functional kappaB sites and IFN-gamma response sites present in the homologous human, rhesus monkey and chimpanzee regions. The absence of kappaB binding in these regions was confirmed by electrophoretic mobility shift assays. CONCLUSION: The data presented reveal divergence between rodents and other mammals in the location and functionality of conserved regions of the NOS-2 promoter containing NF-kappaB and IFN-gamma response elements. %B BMC Genomics %V 8 %P 271 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17686182 %0 Journal Article %J Nucleic Acids Res %D 2007 %T ISACGH: a web-based environment for the analysis of Array CGH and gene expression which includes functional profiling %A L. Conde %A Montaner, D. %A Burguet-Castell, J. %A Tarraga, J. %A Medina, Ignacio %A Fatima Al-Shahrour %A Dopazo, J. %K Animals Cluster Analysis Computational Biology/*methods Computer Graphics Gene Expression Profiling/*methods Humans Internet Models %K Genetic *Nucleic Acid Hybridization Oligonucleotide Array Sequence Analysis/*methods Programming Languages *Software Systems Integration User-Computer Interface %X We present the ISACGH, a web-based system that allows for the combination of genomic data with gene expression values and provides different options for functional profiling of the regions found. Several visualization options offer a convenient representation of the results. Different efficient methods for accurate estimation of genomic copy number from array-CGH hybridization data have been included in the program. 
Moreover, the connection to the gene expression analysis package GEPAS allows the use of different facilities for data pre-processing and analysis. A DAS server allows exporting the results to the Ensembl viewer where contextual genomic information can be obtained. The program is freely available at: http://isacgh.bioinfo.cipf.es or within http://www.gepas.org. %B Nucleic Acids Res %V 35 %P W81-5 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17468499 %0 Journal Article %J Nucleic Acids Res %D 2007 %T ISACGH: a web-based environment for the analysis of Array CGH and gene expression which includes functional profiling. %A Conde, Lucia %A Montaner, David %A Burguet-Castell, Jordi %A Tárraga, Joaquín %A Medina, Ignacio %A Al-Shahrour, Fátima %A Dopazo, Joaquin %K Animals %K Cluster Analysis %K Computational Biology %K Computer Graphics %K Gene Expression Profiling %K Humans %K Internet %K Models, Genetic %K Nucleic Acid Hybridization %K Oligonucleotide Array Sequence Analysis %K Programming Languages %K Software %K Systems Integration %K User-Computer Interface %X

We present the ISACGH, a web-based system that allows for the combination of genomic data with gene expression values and provides different options for functional profiling of the regions found. Several visualization options offer a convenient representation of the results. Different efficient methods for accurate estimation of genomic copy number from array-CGH hybridization data have been included in the program. Moreover, the connection to the gene expression analysis package GEPAS allows the use of different facilities for data pre-processing and analysis. A DAS server allows exporting the results to the Ensembl viewer where contextual genomic information can be obtained. The program is freely available at: http://isacgh.bioinfo.cipf.es or within http://www.gepas.org.

%B Nucleic Acids Res %V 35 %P W81-5 %8 2007 Jul %G eng %N Web Server issue %1 https://www.ncbi.nlm.nih.gov/pubmed/17468499?dopt=Abstract %R 10.1093/nar/gkm257 %0 Book Section %B Microarray Technology Through Applications %D 2007 %T Microarray Technology in Agricultural Research %A A. Conesa %A J. Forment %A J. Gadea %A van Dijk, J. %K babelomics %B Microarray Technology Through Applications %I F. Falciani. Publisher: Taylor and Francis Group %P 173-209 %G eng %0 Book Section %B Progress in Industrial Mathematics at ECMI 2006 %D 2007 %T New Trends in the Analysis of Functional Genomic Data %A Montaner, D. %A Fatima Al-Shahrour %A Dopazo, J. %B Progress in Industrial Mathematics at ECMI 2006 %I Springer %C Berlin %V 12 %P 576-580 %G eng %U http://www.springerlink.com/content/m62p07r8111004vr/ %R 10.1007/978-3-540-71992-2_94 %0 Journal Article %J Nucleic Acids Res %D 2007 %T PeroxisomeDB: a database for the peroxisomal proteome, functional genomics and disease %A Schluter, A. %A Fourcade, S. %A Domenech-Estevez, E. %A Gabaldón, T. %A Huerta-Cepas, J. %A Berthommier, G. %A Ripp, R. %A Wanders, R. J. %A Poch, O. %A Pujol, A. %K Animals *Databases %K Protein Genomics Humans Internet Mice Peroxisomal Disorders/*genetics Peroxisomes/*metabolism Protein Sorting Signals Proteome/chemistry/*genetics/*physiology Rats Saccharomyces cerevisiae Proteins/genetics/physiology Software User-Computer Interface %X Peroxisomes are essential organelles of eukaryotic origin, ubiquitously distributed in cells and organisms, playing key roles in lipid and antioxidant metabolism. Loss or malfunction of peroxisomes causes more than 20 fatal inherited conditions. We have created a peroxisomal database (http://www.peroxisomeDB.org) that includes the complete peroxisomal proteome of Homo sapiens and Saccharomyces cerevisiae, by gathering, updating and integrating the available genetic and functional information on peroxisomal genes. 
PeroxisomeDB is structured in interrelated sections ’Genes’, ’Functions’, ’Metabolic pathways’ and ’Diseases’, that include hyperlinks to selected features of NCBI, ENSEMBL and UCSC databases. We have designed graphical depictions of the main peroxisomal metabolic routes and have included updated flow charts for diagnosis. Precomputed BLAST, PSI-BLAST, multiple sequence alignment (MUSCLE) and phylogenetic trees are provided to assist in direct multispecies comparison to study evolutionarily conserved functions and pathways. Highlights of the PeroxisomeDB include new tools developed for facilitating (i) identification of novel peroxisomal proteins, by means of identifying proteins carrying peroxisome targeting signal (PTS) motifs, and (ii) detection of peroxisomes in silico, particularly useful for screening the deluge of newly sequenced genomes. PeroxisomeDB should contribute to the systematic characterization of the peroxisomal proteome and facilitate systems biology approaches to the organelle. %B Nucleic Acids Res %V 35 %P D815-22 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17135190 %0 Journal Article %J Nucleic Acids Res %D 2007 %T Phylemon: a suite of web tools for molecular evolution, phylogenetics and phylogenomics %A Tarraga, J. %A Medina, Ignacio %A Arbiza, L. %A Huerta-Cepas, J. %A Gabaldón, T. %A Dopazo, J. %A H. Dopazo %K Animals Computational Biology/*methods Databases %K DNA Sequence Analysis %K Genetic Evolution %K Molecular Genetic Techniques Humans *Internet Models %K Protein Software User-Computer Interface %K Statistical *Phylogeny Programming Languages Sequence Alignment Sequence Analysis %X Phylemon is an online platform for phylogenetic and evolutionary analyses of molecular sequence data. It has been developed as a web server that integrates a suite of different tools selected among the most popular stand-alone programs in phylogenetic and evolutionary analysis. 
It has been conceived as a natural response to the increasing demand for data analysis from many experimental scientists wishing to add a molecular evolution and phylogenetics insight into their research. Tools included in Phylemon cover a wide yet selected range of programs: from the most basic for multiple sequence alignment to elaborate statistical methods of phylogenetic reconstruction including methods for evolutionary rates analyses and molecular adaptation. Phylemon has several features that differentiate it from other resources: (i) it offers an integrated environment that enables the direct concatenation of evolutionary analyses, the storage of results and handles required data format conversions, (ii) once an outfile is produced, Phylemon suggests the next possible analyses, thus guiding the user and facilitating the integration of multi-step analyses, and (iii) users can define and save complete pipelines for specific phylogenetic analysis to be automatically used on many genes in subsequent sessions or multiple genes in a single session (phylogenomics). The Phylemon web server is available at http://phylemon.bioinfo.cipf.es. %B Nucleic Acids Res %V 35 %P W38-42 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17452346 %0 Journal Article %J Bioinformatics %D 2007 %T Prophet, a web-based tool for class prediction using microarray data %A Medina, Ignacio %A Montaner, D. %A Tarraga, J. %A Dopazo, J. %K babelomics %K gepas %K predictors %X

Sample classification and class prediction are the aims of many gene expression studies. We present a web-based application, Prophet, which builds prediction rules and allows using them for further sample classification. Prophet automatically chooses the best classifier, along with the optimal selection of genes, using a strategy that renders unbiased cross-validated errors. Prophet is linked to different microarray data analysis modules, and includes a unique feature: the possibility of performing the functional interpretation of the molecular signature found. Availability: Prophet can be found at the URL http://prophet.bioinfo.cipf.es/ or within the GEPAS package at http://www.gepas.org/ Supplementary information: http://gepas.bioinfo.cipf.es/tutorial/prophet.html.

%B Bioinformatics %V 23 %P 390-1 %G eng %U http://bioinformatics.oxfordjournals.org/cgi/content/full/23/3/390?view=long&pmid=17138587 %0 Journal Article %J FEBS Lett %D 2007 %T Protein translocation into peroxisomes by ring-shaped import receptors %A Stanley, W. A. %A Fodor, K. %A M. A. Marti-Renom %A Schliebs, W. %A Wilmanns, M. %K Amino Acid Sequence Binding Sites Humans Molecular Sequence Data Peroxisomes/*metabolism Protein Structure %K Cytoplasmic and Nuclear/*chemistry %K Tertiary Protein Transport Receptors %X Folded and functional proteins destined for translocation from the cytosol into the peroxisomal matrix are recognized by two different peroxisomal import receptors, Pex5p and Pex7p. Both cargo-loaded receptors dock on the same translocon components, followed by cargo release and receptor recycling, as part of the complete translocation process. Recent structural and functional evidence on the Pex5p receptor has provided insight on the molecular requirements of specific cargo recognition, while the remaining processes still remain largely elusive. Comparison of experimental structures of Pex5p and a structural model of Pex7p reveals that both receptors are built by ring-like arrangements with cargo binding sites central to the respective structures. Although molecular insight into the complete peroxisomal translocon still remains to be determined, emerging data allow the deduction of common molecular principles that may hold for other translocation systems as well. %B FEBS Lett %V 581 %P 4795-802 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17884042 %0 Book Section %B Ancestral Sequence Reconstruction %D 2007 %T Reconstruction of ancestral proteomes %A Gabaldón, T. %A M. A. Huynen %B Ancestral Sequence Reconstruction %I D. 
Liberles %C Oxford %G eng %U http://www.us.oup.com/us/catalog/general/subject/LifeSciences/EvolutionaryBiology/?view=usa&ci=9780199299188 %0 Journal Article %J Eukaryot Cell %D 2007 %T Spatial differentiation in the vegetative mycelium of Aspergillus niger %A Levin, A. M. %A de Vries, R. P. %A A. Conesa %A de Bekker, C. %A Talon, M. %A Menke, H. H. %A van Peij, N. N. %A Wosten, H. A. %K Aspergillus niger/*metabolism Cell Wall/metabolism Fungal Proteins/metabolism *Gene Expression Regulation %K Biological Mycelium/*metabolism Oligonucleotide Array Sequence Analysis RNA %K Fungal Genes %K Fungal Genome %K Fungal Glucans/chemistry Maltose/chemistry Models %K Fungal Time Factors Trans-Activators/metabolism Xylose/chemistry %X Fungal mycelia are exposed to heterogenic substrates. The substrate in the central part of the colony has been (partly) degraded, whereas it is still unexplored at the periphery of the mycelium. We here assessed whether substrate heterogeneity is a main determinant of spatial gene expression in colonies of Aspergillus niger. This question was addressed by analyzing whole-genome gene expression in five concentric zones of 7-day-old maltose- and xylose-grown colonies. Expression profiles at the periphery and the center were clearly different. More than 25% of the active genes showed twofold differences in expression between the inner and outermost zones of the colony. Moreover, 9% of the genes were expressed in only one of the five concentric zones, showing that a considerable part of the genome is active in a restricted part of the colony only. 
Statistical analysis of expression profiles of colonies that had either been or not been transferred to fresh xylose-containing medium showed that differential expression in a colony is due to the heterogeneity of the medium (e.g., genes involved in secretion, genes encoding proteases, and genes involved in xylose metabolism) as well as to medium-independent mechanisms (e.g., genes involved in nitrate metabolism and genes involved in cell wall synthesis and modification). Thus, we conclude that the mycelia of 7-day-old colonies of A. niger are highly differentiated. This conclusion is also indicated by the fact that distinct zones of the colony grow and secrete proteins, even after transfer to fresh medium. %B Eukaryot Cell %V 6 %P 2311-22 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17951513 %0 Journal Article %J Philos Trans R Soc Lond B Biol Sci %D 2007 %T Structural analyses of a hypothetical minimal metabolism %A Gabaldón, T. %A Peretó, J. %A Montero, F. %A Gil, R. %A Latorre, A. %A Moya, A. %K *Cell Physiological Phenomena Cells/*metabolism Cluster Analysis *Computer Simulation *Metabolic Networks and Pathways *Models %K Biological Models %K Statistical %X By integrating data from comparative genomics and large-scale deletion studies, we previously proposed a minimal gene set comprising 206 protein-coding genes. To evaluate the consistency of the metabolism encoded by such a minimal genome, we have carried out a series of computational analyses. Firstly, the topology of the minimal metabolism was compared with that of the reconstructed networks from natural bacterial genomes. Secondly, the robustness of the metabolic network was evaluated by simulated mutagenesis and, finally, the stoichiometric consistency was assessed by automatically deriving the steady-state solutions from the reaction set. 
The results indicated that the proposed minimal metabolism presents stoichiometric consistency and that it is organized as a complex power-law network with topological parameters falling within the expected range for a natural metabolism of its size. The robustness analyses revealed that most random mutations do not alter the topology of the network significantly, but do cause significant damage by preventing the synthesis of several compounds or compromising the stoichiometric consistency of the metabolism. The implications that these results have on the origins of metabolic complexity and the theoretical design of an artificial minimal cell are discussed. %B Philos Trans R Soc Lond B Biol Sci %V 362 %P 1751-62 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17510022 %0 Journal Article %J Virology %D 2007 %T Transcriptional response of Citrus aurantifolia to infection by Citrus tristeza virus %A Gandia, M. %A A. Conesa %A Ancillo, G. %A J. Gadea %A J. Forment %A Pallas, V. %A Flores, R. %A Duran-Vila, N. %A Moreno, P. %A Guerri, J. %K Citrus/*genetics/physiology/virology Closterovirus/genetics/*physiology Genes %K Genetic %K Plant Oligonucleotide Array Sequence Analysis Reverse Transcriptase Polymerase Chain Reaction *Transcription %X Changes in gene expression of Mexican lime plants in response to infection with a severe (T305) or a mild (T385) isolate of Citrus tristeza virus (CTV) were analyzed using a cDNA microarray containing 12,672 probes to 6875 different citrus genes. Statistically significant (P<0.01) expression changes of 334 genes were detected in response to infection with isolate T305, whereas infection with T385 induced no significant change. Induced genes included 145 without significant similarity with known sequences and 189 that were classified in seven functional categories. Genes related with response to stress and defense were the main category and included 28% of the genes induced. 
Selected transcription changes detected by microarray analysis were confirmed by quantitative real-time RT-PCR. Changes detected in the transcriptome upon infecting lime with T305 may be associated either with symptom expression, with a strain-specific defense mechanism, or with a general response to stress. %B Virology %V 367 %P 298-306 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17617431 %0 Journal Article %J Proteins %D 2006 %T Accuracy of sequence alignment and fold assessment using reduced amino acid alphabets %A Melo, F. %A M. A. Marti-Renom %K Amino Acid Sequence Amino Acids/*chemistry/classification/*metabolism Consensus Sequence Molecular Sequence Data Oxidation-Reduction *Protein Folding Proteins/*chemistry/*metabolism Sequence Alignment/*methods Structural Homology %K Protein %X Reduced or simplified amino acid alphabets group the 20 naturally occurring amino acids into a smaller number of representative protein residues. To date, several reduced amino acid alphabets have been proposed, which have been derived and optimized by a variety of methods. The resulting reduced amino acid alphabets have been applied to pattern recognition, generation of consensus sequences from multiple alignments, protein folding, and protein structure prediction. In this work, amino acid substitution matrices and statistical potentials were derived based on several reduced amino acid alphabets and their performance assessed in a large benchmark for the tasks of sequence alignment and fold assessment of protein structure models, using as a reference frame the standard alphabet of 20 amino acids. The results showed that a large reduction in the total number of residue types does not necessarily translate into a significant loss of discriminative power for sequence alignment and fold assessment. 
Therefore, some definitions of a few residue types are able to encode most of the relevant sequence/structure information that is present in the 20 standard amino acids. Based on these results, we suggest that the use of reduced amino acid alphabets may allow increasing the accuracy of current substitution matrices and statistical potentials for the prediction of protein structure of remote homologs. %B Proteins %V 63 %P 986-95 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16506243 %0 Journal Article %J Nucleic Acids Res %D 2006 %T BABELOMICS: a systems biology perspective in the functional annotation of genome-scale experiments %A Fatima Al-Shahrour %A Minguez, P. %A Tarraga, J. %A Montaner, D. %A Alloza, E. %A Vaquerizas, J. M. %A L. Conde %A Blaschke, C. %A Vera, J. %A Dopazo, J. %K babelomics %K functional profiling %X

We present a new version of Babelomics, a complete suite of web tools for functional analysis of genome-scale experiments, with new and improved tools. New functionally relevant terms have been included such as CisRed motifs or bioentities obtained by text-mining procedures. An improved indexing has considerably speeded up several of the modules. An improved version of the FatiScan method for studying the coordinate behaviour of groups of functionally related genes is presented, along with a similar tool, the Gene Set Enrichment Analysis. Babelomics is now more oriented to test systems biology inspired hypotheses. Babelomics can be found at http://www.babelomics.org.

%B Nucleic Acids Res %V 34 %P W472-6 %G eng %U http://nar.oxfordjournals.org/content/34/suppl_2/W472.long %0 Journal Article %J Clin Transl Oncol %D 2006 %T Bioinformatics and cancer: an essential alliance %A Dopazo, J. %X

Modern cancer research has been revolutionized by the introduction of new high-throughput methodologies such as DNA microarrays. Keeping pace with these technologies, bioinformatics offers new solutions for data analysis and, more importantly, makes it possible to formulate a new class of hypotheses inspired by systems biology and oriented to blocks of functionally related genes. Although software implementations of these new methodologies are recent, some options are already available. Bioinformatic solutions for other high-throughput techniques, such as array-CGH or large-scale genotyping, are also reviewed.

%B Clin Transl Oncol %V 8 %P 409-15 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16790393 %0 Journal Article %J Stud Health Technol Inform %D 2006 %T Blast2GO goes grid: developing a grid-enabled prototype for functional genomics analysis %A Aparicio, G. %A Gotz, S. %A A. Conesa %A Segrelles, D. %A Blanquer, I. %A Garcia, J. M. %A Hernandez, V. %A Robles, M. %A Talon, M. %K babelomics %X

The vast amount and complexity of data generated in genomic research implies that new dedicated and powerful computational tools need to be developed to meet their analysis requirements. Blast2GO (B2G) is a bioinformatics tool for Gene Ontology-based DNA or protein sequence annotation and function-based data mining. The application has been developed with the aim of offering an easy-to-use tool for functional genomics research. Typical B2G users are middle-sized genomics labs carrying out sequencing, EST and microarray projects, handling datasets of up to several thousand sequences. In the current version of B2G, the power and analytical potential of both annotation and function data mining are somewhat restricted by the computational power behind each particular installation. In order to offer an enhanced computational capacity within this bioinformatics application, a Grid component is being developed. A prototype has been conceived for the particular problem of speeding up the Blast searches to obtain fast results for large datasets. Many efforts have been reported in the literature concerning the speeding up of Blast searches, but few of them deal with the use of large heterogeneous production Grid infrastructures. These are the infrastructures that could reach the largest number of resources and the best load balancing for data access. The Grid Service under development will analyse requests based on the number of sequences, splitting them according to the available resources. Lower-level computation will be performed through MPIBLAST. The software architecture is based on the WSRF standard.

%B Stud Health Technol Inform %V 120 %P 194-204 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16823138 %0 Journal Article %J Curr Protoc Bioinformatics %D 2006 %T Comparative protein structure modeling using Modeller %A Eswar, N. %A Webb, B. %A M. A. Marti-Renom %A Madhusudhan, M. S. %A Eramian, D. %A Shen, M. Y. %A Pieper, U. %A Sali, A. %K Algorithms Amino Acid Sequence Computer Simulation Crystallography/*methods *Models %K Chemical *Models %K Molecular Molecular Sequence Data Protein Conformation Protein Folding Proteins/*chemistry/*ultrastructure Sequence Analysis %K Protein/*methods *Software %X Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. 
%B Curr Protoc Bioinformatics %V Chapter 5 %P Unit 5.6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18428767 %0 Journal Article %J Protein Sci %D 2006 %T A composite score for predicting errors in protein structure models %A Eramian, D. %A Shen, M. Y. %A Devos, D. %A Melo, F. %A Sali, A. %A M. A. Marti-Renom %K *Models %K Molecular Models %K Theoretical Proteins/*chemistry %X Reliable prediction of model accuracy is an important unsolved problem in protein structure modeling. To address this problem, we studied 24 individual assessment scores, including physics-based energy functions, statistical potentials, and machine learning-based scoring functions. Individual scores were also used to construct approximately 85,000 composite scoring functions using support vector machine (SVM) regression. The scores were tested for their abilities to identify the most native-like models from a set of 6000 comparative models of 20 representative protein structures. Each of the 20 targets was modeled using a template of <30% sequence identity, corresponding to challenging comparative modeling cases. The best SVM score outperformed all individual scores by decreasing the average RMSD difference between the model identified as the best of the set and the model with the lowest RMSD (DeltaRMSD) from 0.63 Å to 0.45 Å, while having a higher Pearson correlation coefficient to RMSD (r=0.87) than any other tested score. The most accurate score is based on a combination of the DOPE non-hydrogen atom statistical potential; surface, contact, and combined statistical potentials from MODPIPE; and two PSIPRED/DSSP scores. It was implemented in the SVMod program, which can now be applied to select the final model in various modeling problems, including fold assignment, target-template alignment, and loop modeling. 
%B Protein Sci %V 15 %P 1653-66 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16751606 %0 Journal Article %J Am J Physiol Cell Physiol %D 2006 %T Computational approaches for the prediction of protein function in the mitochondrion %A Gabaldón, T. %K *Computational Biology *Computer Simulation Humans Mitochondria/*metabolism Mitochondrial Proteins/genetics/*metabolism Mutation %X Understanding a complex biological system, such as the mitochondrion, requires the identification of the complete repertoire of proteins targeted to the organelle, the characterization of these, and finally, the elucidation of the functional and physical interactions that occur within the mitochondrion. In the last decade, significant developments have contributed to increase our understanding of the mitochondrion, and among these, computational research has played a significant role. Not only general bioinformatics tools have been applied in the context of the mitochondrion, but also some computational techniques have been specifically developed to address problems that arose from within the mitochondrial research field. In this review the contribution of bioinformatics to mitochondrial biology is addressed through a survey of current computational methods that can be applied to predict which proteins will be localized to the mitochondrion and to unravel their functional interactions. %B Am J Physiol Cell Physiol %V 291 %P C1121-8 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16870830 %0 Journal Article %J Environ Sci Technol %D 2006 %T Development of the GENIPOL European flounder (Platichthys flesus) microarray and determination of temporal transcriptional responses to cadmium at low dose %A Williams, T. D. %A Diab, A. M. %A George, S. G. %A Godfrey, R. E. %A Sabine, V. %A A. Conesa %A Minchin, S. D. %A Watts, P. C. %A Chipman, J. K. 
%K Animals Cadmium Chloride/administration & dosage/*pharmacology Dose-Response Relationship %K Developmental/drug effects Liver/drug effects/growth & development/metabolism Oligonucleotide Array Sequence Analysis/*methods Reverse Transcriptase Polymerase Chain Reaction Transcription %K Drug Environmental Monitoring/methods Flounder/*genetics/growth & development Gene Expression Profiling Gene Expression Regulation %K Genetic/*drug effects %X We have constructed a high density, 13 270-clone cDNA array for the sentinel fish species European flounder (Platichthys flesus), combining clones from suppressive subtractive hybridization and a liver cDNA library; DNA sequences of 5211 clones were determined. Fish were treated by single intraperitoneal injection with 50 micrograms cadmium chloride per kilogram body weight, a dose relevant to environmental exposures, and hepatic gene expression changes were determined at 1, 2, 4, 8, and 16 days postinjection in comparison to saline-treated controls. Gene expression responses were confirmed by real-time reverse transcription polymerase chain reaction (RT-PCR). Blast2GO gene ontology analysis highlighted a general induction of the unfolded protein response, response to oxidative stress, protein synthesis, transport, and degradation pathways, while apoptosis, cell cycle, cytoskeleton, and cytokine genes were also affected. Transcript levels of cytochrome P450 1A (CYP1A) were repressed and vitellogenin altered; real-time PCR showed induction of metallothionein. We thus describe the establishment of a useful resource for ecotoxicogenomics and the determination of the temporal molecular responses to cadmium, a prototypical heavy metal pollutant. %B Environ Sci Technol %V 40 %P 6479-88 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17120584 %0 Journal Article %J Genome Biol %D 2006 %T Discovery and hypothesis generation through bioinformatics %A Dopazo, J. %A Aloy, P. 
%K *Computational Biology Genome %K Genetic Phylogeny %K Human *Genomics Humans *Models %X A report on the 4th European Conference on Computational Biology and the 6th Spanish Annual Meeting on Bioinformatics, Madrid, Spain, 28 September-1 October 2005. %B Genome Biol %V 7 %P 307 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16522224 %0 Journal Article %J Cancer Res %D 2006 %T ERCC4 associated with breast cancer risk: a two-stage case-control study using high-throughput genotyping %A Milne, R. L. %A Ribas, G. %A Gonzalez-Neira, A. %A Fagerholm, R. %A Salas, A. %A Gonzalez, E. %A Dopazo, J. %A Nevanlinna, H. %A M. Robledo %A Benitez, J. %K 80 and over Breast Neoplasms/epidemiology/*genetics/pathology Case-Control Studies DNA-Binding Proteins/genetics/*physiology Female Finland/epidemiology Genes %K Adult Aged Aged %K Recessive Genetic Predisposition to Disease Genotype Humans Introns/genetics Linkage Disequilibrium Middle Aged Neoplasm Proteins/genetics/*physiology Neoplasm Staging *Polymorphism %K Single Nucleotide Risk Spain/epidemiology %X The failure of linkage studies to identify further high-penetrance susceptibility genes for breast cancer points to a polygenic model, with more common variants having modest effects on risk, as the most likely candidate. We have carried out a two-stage case-control study in two European populations to identify low-penetrance genes for breast cancer using high-throughput genotyping. Single-nucleotide polymorphisms (SNPs) were selected across preselected cancer-related genes, choosing tagSNPs and functional variants where possible. In stage 1, genotype frequencies for 640 SNPs in 111 genes were compared between 864 breast cancer cases and 845 controls from the Spanish population. In stage 2, candidate SNPs identified in stage 1 (nominal P < 0.01) were tested in a Finnish series of 884 cases and 1,104 controls. 
Of the 10 candidate SNPs in seven genes identified in stage 1, one (rs744154) on intron 1 of ERCC4, a gene belonging to the nucleotide excision repair pathway, was associated with recessive protection from breast cancer after adjustment for multiple testing in stage 2 (odds ratio, 0.57; Bonferroni-adjusted P = 0.04). After considering potential functional SNPs in the region of high linkage disequilibrium that extends across the entire gene and upstream into the promoter region, we concluded that rs744154 itself could be causal. Although intronic, it is located on the first intron, in a region that is highly conserved across species, and could therefore be functionally important. This study suggests that common intronic variation in ERCC4 is associated with protection from breast cancer. %B Cancer Res %V 66 %P 9420-7 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17018596 %0 Journal Article %J BMC Genomics %D 2006 %T Exploring the reasons for the large density of triplex-forming oligonucleotide target sequences in the human regulatory regions %A Goni, J. R. %A Vaquerizas, J. M. %A Dopazo, J. %A Orozco, M. %K Animals Base Sequence Computational Biology DNA/chemistry/*genetics/*metabolism Genome %K Genetic/genetics Regulatory Sequences %K Human/genetics Humans Mice Nucleic Acid Conformation Nucleotides/genetics Oligonucleotides/chemistry/*genetics/*metabolism Promoter Regions %K Nucleic Acid/*genetics Transcription Factors/metabolism %X BACKGROUND: DNA duplex sequences that can be targets for triplex formation are highly over-represented in the human genome, especially in regulatory regions. RESULTS: Here we used bioinformatics tools to study several properties of triplex target sequences in an attempt to determine those that make these sequences so special in the genome. 
CONCLUSION: Our results strongly suggest that the unique physical properties of these sequences make them particularly suitable as "separators" between protein-recognition sites in the promoter region. %B BMC Genomics %V 7 %P 63 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16566817 %0 Journal Article %J OMICS %D 2006 %T Functional interpretation of microarray experiments %A Dopazo, J. %K babelomics %K Diabetes Mellitus %K microarray data analysis %X

Over the past few years, due to the popularisation of high-throughput methodologies such as DNA microarrays, the possibility of obtaining experimental data has increased significantly. Nevertheless, the interpretation of the results, which involves translating these data into useful biological knowledge, still remains a challenge. The methods and strategies used for this interpretation are in continuous evolution and new proposals are constantly arising. Initially, a two-step approach was used in which genes of interest were first selected, based on thresholds that consider only experimental values, and then, in a second, independent step, the enrichment of these genes in biologically relevant terms was analysed. For different reasons, these methods are relatively poor in terms of performance, and a new generation of procedures, which draw inspiration from systems biology criteria, is currently under development. Such procedures aim to directly test the behaviour of blocks of functionally related genes, instead of focusing on single genes.

%B OMICS %V 10 %P 398-410 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17069516 %0 Journal Article %J Genome Inform %D 2006 %T A function-centric approach to the biological interpretation of microarray time-series %A Minguez, P. %A Fatima Al-Shahrour %A Dopazo, J. %K babelomics %X

The interpretation of microarray experiments is commonly addressed by means of a two-step approach in which the relevant genes are first selected solely on the basis of their experimental values (ignoring their coordinate behaviors) and in a second step their functional properties are studied to hypothesize about the biological roles they are fulfilling in the cell. Recently, different methods (e.g. GSEA or FatiScan) have been proposed to study the coordinate behavior of blocks of functionally-related genes. These methods study the distribution of functional information across lists of genes ranked according to their different experimental values in a static situation, such as the comparison between two classes (e.g. healthy controls versus diseased cases). Nevertheless, there is no equivalent way of studying a dynamic situation from a functional point of view. We present a method for the functional analysis of microarray series in which the experiments display autocorrelation between successive points (e.g. time series, dose-response experiments, etc.). The method allows the recovery of the dynamics of the molecular roles fulfilled by the genes along the series, which provides a novel approach to the functional interpretation of such experiments. The method finds blocks of functionally-related genes which are significantly and coordinately over-expressed at different points of the series. This method draws inspiration from systems biology given that the analysis does not focus on individual properties of genes but on collectively behaving blocks of functionally-related genes. The FatiScan algorithm used in the method proposed is available at: http://fatiscan.bioinfo.cipf.es, or within the Babelomics suite: http://www.babelomics.org. Additional material is available at: http://bioinfo.cipf.es/data/plasmodium.

%B Genome Inform %V 17 %P 57-66 %G eng %0 Journal Article %J Haematologica %D 2006 %T Identification of overexpressed genes in frequently gained/amplified chromosome regions in multiple myeloma %A Largo, C. %A Alvarez, S. %A Saez, B. %A Blesa, D. %A Martin-Subero, J. I. %A Gonzalez-Garcia, I. %A Brieva, J. A. %A Dopazo, J. %A Siebert, R. %A Calasanz, M. J. %A Cigudosa, J. C. %K B-Cell %K Caspases Cell Line %K Human *Gene Amplification Gene Dosage Gene Expression Profiling *Gene Expression Regulation %K Marginal Zone/genetics Multiple Myeloma/*genetics Neoplasm Proteins/genetics Proto-Oncogene Proteins c-bcl-2/genetics %K Neoplasm Humans Immunoglobulin Heavy Chains/genetics Lymphoma %K Neoplastic Gene Rearrangement *Genes %K Tumor *Chromosomes %X BACKGROUND AND OBJECTIVES: Multiple myeloma (MM) is a malignancy characterized by clonal expansion of plasma cells. In 50% of the cases, the neoplastic transformation begins with a chromosomal translocation that juxtaposes the IGH gene locus to an oncogene. Gene copy number changes are also frequent in MM but less characterized than in other neoplasias. We aimed to characterize genes that are amplified and overexpressed in human myeloma cell lines (HMCL) to provide putative molecular targets for MM therapy. DESIGN AND METHODS: Nine HMCL were characterized by fluorescent in situ hybridization, comparative genomic hybridization (CGH) and cDNA microarrays for gene expression profiling and copy number changes. RESULTS: After defining the IGH-translocations present in the cell lines, we conducted expression-profiling analysis. Supervised analysis identified 166 genes with significantly different expression among the cell lines harboring MMSET/FGFR3 (4p16), MAF (16q) and CCND1 (11q13) rearrangements. Array-CGH was then performed. Five chromosomes recurrently affected by gains/amplifications in primary samples and cell lines were analyzed in detail. 
Sixty amplified and overexpressed genes were found and 25 (42%) of them were only overexpressed when amplified; moreover, six showed a significant association between overexpression and gain/amplification. We also found co-amplification and overexpression for genes located within the same amplicons, such as MALT1 and BCL2. INTERPRETATION AND CONCLUSIONS: Parallel analysis of gene copy numbers and expression levels by cDNA microarray in MM allowed efficient identification of genes whose expression levels are elevated because of increased copy number. This is the first time that MALT1 and BCL2 have been shown to be overexpressed and amplified in MM. %B Haematologica %V 91 %P 184-91 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16461302 %0 Book Section %B Invitación a la Biología %D 2006 %T La clasificación de los organismos %A H. Dopazo %B Invitación a la Biología %I Curtis, Barnes, Schnek & Flores. 2da, Editorial Medica Panamericana %C Buenos Aires %G eng %& 22 %0 Journal Article %J Protein Sci %D 2006 %T Localization of binding sites in protein structures by optimization of a composite scoring function %A Rossi, A. %A M. A. Marti-Renom %A Sali, A. %K Amino Acid Sequence Binding Sites Biomechanics Hydrophobicity Ligands *Monte Carlo Method Protein Conformation Proteins/*chemistry Static Electricity %X The rise in the number of functionally uncharacterized protein structures is increasing the demand for structure-based methods for functional annotation. Here, we describe a method for predicting the location of a binding site of a given type on a target protein structure. The method begins by constructing a scoring function, followed by a Monte Carlo optimization, to find a good scoring patch on the protein surface. 
The scoring function is a weighted linear combination of the z-scores of various properties of protein structure and sequence, including amino acid residue conservation, compactness, protrusion, convexity, rigidity, hydrophobicity, and charge density; the weights are calculated from a set of previously identified instances of the binding-site type on known protein structures. The scoring function can easily incorporate different types of information useful in localization, thus increasing the applicability and accuracy of the approach. To test the method, 1008 known protein structures were split into 20 different groups according to the type of the bound ligand. For nonsugar ligands, such as various nucleotides, binding sites were correctly identified in 55%-73% of the cases. The method is completely automated (http://salilab.org/patcher) and can be applied on a large scale in a structural genomics setting. %B Protein Sci %V 15 %P 2366-80 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16963645 %0 Journal Article %J Bioinformatics %D 2006 %T maSigPro: a method to identify significantly differential expression profiles in time-course microarray experiments %A A. Conesa %A Nueda, M. J. %A Ferrer, A. %A Talon, M. %K *Algorithms Computer Simulation Gene Expression/*physiology Gene Expression Profiling/*methods *Models %K Genetic Models %K Statistical Oligonucleotide Array Sequence Analysis/*methods *Software Time Factors %X MOTIVATION: Multi-series time-course microarray experiments are useful approaches for exploring biological processes. In this type of experiments, the researcher is frequently interested in studying gene expression changes along time and in evaluating trend differences between the various experimental groups. The large amount of data, multiplicity of experimental conditions and the dynamic nature of the experiments poses great challenges to data analysis. 
RESULTS: In this work, we propose a statistical procedure to identify genes that show different gene expression profiles across analytical groups in time-course experiments. The method is a two-step regression approach where the experimental groups are identified by dummy variables. The procedure first adjusts a global regression model with all the defined variables to identify differentially expressed genes, and second, a variable selection strategy is applied to study differences between groups and to find statistically significant different profiles. The methodology is illustrated on both a real and a simulated microarray dataset. %B Bioinformatics %V 22 %P 1096-102 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16481333 %0 Journal Article %J Nucleic Acids Res %D 2006 %T MODBASE: a database of annotated comparative protein structure models and associated resources %A Pieper, U. %A Eswar, N. %A Davis, F. P. %A Braberg, H. %A Madhusudhan, M. S. %A Rossi, A. %A M. A. Marti-Renom %A Karchin, R. %A Webb, B. M. %A Eramian, D. %A Shen, M. Y. %A Kelly, L. %A Melo, F. %A Sali, A. %K Binding Sites *Databases %K Molecular Polymorphism %K Protein Humans Internet Ligands *Models %K Protein Systems Integration User-Computer Interface %K Single Nucleotide Protein Structure %K Tertiary Proteins/*chemistry/genetics/metabolism Software *Structural Homology %X MODBASE (http://salilab.org/modbase) is a database of annotated comparative protein structure models for all available protein sequences that can be matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on MODELLER for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE is updated regularly to reflect the growth in protein sequence and structure databases, and improvements in the software for calculating the models. 
MODBASE currently contains 3 094 524 reliable models for domains in 1 094 750 out of 1 817 889 unique protein sequences in the UniProt database (July 5, 2005); only models based on statistically significant alignments and models assessed to have the correct fold despite insignificant alignments are included. MODBASE also allows users to generate comparative models for proteins of interest with the automated modeling server MODWEB (http://salilab.org/modweb). Our other resources integrated with MODBASE include comprehensive databases of multiple protein structure alignments (DBAli, http://salilab.org/dbali), structurally defined ligand binding sites and structurally defined binary domain interfaces (PIBASE, http://salilab.org/pibase) as well as predictions of ligand binding sites, interactions between yeast proteins, and functional consequences of human nsSNPs (LS-SNP, http://salilab.org/LS-SNP). %B Nucleic Acids Res %V 34 %P D291-5 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16381869 %0 Journal Article %J Nucleic Acids Res %D 2006 %T Next station in microarray data analysis: GEPAS %A Montaner, D. %A Tarraga, J. %A Huerta-Cepas, J. %A Burguet, J. %A Vaquerizas, J. M. %A L. Conde %A Minguez, P. %A Vera, J. %A Mukherjee, S. %A Valls, J. %A Pujana, M. A. %A Alloza, E. %A Herrero, J. %A Fatima Al-Shahrour %A Dopazo, J. %K gepas %K microarray data analysis %X

The Gene Expression Profile Analysis Suite (GEPAS) has been running for more than four years. During this time it has evolved to keep pace with the new interests and trends in the still-changing world of microarray data analysis. GEPAS has been designed to provide an intuitive yet powerful web-based interface that offers diverse analysis options from the early step of preprocessing (normalization of Affymetrix and two-colour microarray experiments and other preprocessing options), to the final step of the functional annotation of the experiment (using Gene Ontology, pathways, PubMed abstracts etc.), and includes different possibilities for clustering, gene selection, class prediction and array-comparative genomic hybridization management. GEPAS is extensively used by researchers in many countries and its records indicate an average usage rate of 400 experiments per day. The web-based pipeline for microarray gene expression data, GEPAS, is available at http://www.gepas.org.

%B Nucleic Acids Res %V 34 %P W486-91 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16845056 %0 Journal Article %J Methods Mol Biol %D 2006 %T Ontology-driven approaches to analyzing data in functional genomics %A F. Azuaje %A Fatima Al-Shahrour %A Dopazo, J. %K babelomics %K Cluster Analysis %K Cluster Analysis Computational Biology/*methods *Data Interpretation %K Computational Biology %K Statistical Gene Expression Profiling %K Statistical Gene Expression Profiling *Genomics Humans %X

Ontologies are fundamental knowledge representations that provide not only standards for annotating and indexing biological information, but also the basis for implementing functional classification and interpretation models. This chapter discusses the application of gene ontology (GO) for predictive tasks in functional genomics. It focuses on the problem of analyzing functional patterns associated with gene products. This chapter is divided into two main parts. The first part overviews GO and its applications for the development of functional classification models. The second part presents two methods for the characterization of genomic information using GO. It discusses methods for measuring functional similarity of gene products, and a tool for supporting gene expression clustering analysis and validation.

%B Methods Mol Biol %V 316 %P 67-86 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16671401 %0 Journal Article %J Biol Direct %D 2006 %T Origin and evolution of the peroxisomal proteome %A Gabaldón, T. %A B. Snel %A van Zimmeren, F. %A Hemrika, W. %A Tabak, H. %A M. A. Huynen %X BACKGROUND: Peroxisomes are ubiquitous eukaryotic organelles involved in various oxidative reactions. Their enzymatic content varies between species, but the presence of common protein import and organelle biogenesis systems support a single evolutionary origin. The precise scenario for this origin remains however to be established. The ability of peroxisomes to divide and import proteins post-translationally, just like mitochondria and chloroplasts, supports an endosymbiotic origin. However, this view has been challenged by recent discoveries that mutant, peroxisome-less cells restore peroxisomes upon introduction of the wild-type gene, and that peroxisomes are formed from the Endoplasmic Reticulum. The lack of a peroxisomal genome precludes the use of classical analyses, as those performed with mitochondria or chloroplasts, to settle the debate. We therefore conducted large-scale phylogenetic analyses of the yeast and rat peroxisomal proteomes. RESULTS : Our results show that most peroxisomal proteins (39-58%) are of eukaryotic origin, comprising all proteins involved in organelle biogenesis or maintenance. A significant fraction (13-18%), consisting mainly of enzymes, has an alpha-proteobacterial origin and appears to be the result of the recruitment of proteins originally targeted to mitochondria. Consistent with the findings that peroxisomes are formed in the Endoplasmic Reticulum, we find that the most universally conserved Peroxisome biogenesis and maintenance proteins are homologous to proteins from the Endoplasmic Reticulum Assisted Decay pathway. 
CONCLUSION: Altogether our results indicate that the peroxisome does not have an endosymbiotic origin and that its proteins were recruited from pools existing within the primitive eukaryote. Moreover the reconstruction of primitive peroxisomal proteomes suggests that ontogenetically as well as phylogenetically, peroxisomes stem from the Endoplasmic Reticulum. REVIEWERS: This article was reviewed by Arcady Mushegian, Gaspar Jekely and John Logsdon. %B Biol Direct %V 1 %P 8 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16556314 %0 Journal Article %J PLoS Comput Biol %D 2006 %T Positive selection, relaxation, and acceleration in the evolution of the human and chimp genome %A Arbiza, L. %A Dopazo, J. %A H. Dopazo %K Adaptation %K Biological/genetics Animals *Evolution %K Molecular Genome/*genetics Humans Pan troglodytes/*genetics *Selection (Genetics) %X For years evolutionary biologists have been interested in searching for the genetic bases underlying humanness. Recent efforts at a large or a complete genomic scale have been conducted to search for positively selected genes in human and in chimp. However, recently developed methods allowing for a more sensitive and controlled approach in the detection of positive selection can be employed. Here, using 13,198 genes, we have deduced the sets of genes involved in rate acceleration, positive selection, and relaxation of selective constraints in human, in chimp, and in their ancestral lineage since the divergence from murids. Significant deviations from the strict molecular clock were observed in 469 human and in 651 chimp genes. The more stringent branch-site test of positive selection detected 108 human and 577 chimp positively selected genes. 
An important proportion of the positively selected genes did not show a significant acceleration in rates, and similarly, many of the accelerated genes did not show significant signals of positive selection. Functional differentiation of genes under rate acceleration, positive selection, and relaxation was not statistically significant between human and chimp with the exception of terms related to G-protein coupled receptors and sensory perception. Both of these were over-represented under relaxation in human in relation to chimp. Comparing differences between derived and ancestral lineages, a more conspicuous change in trends seems to have favored positive selection in the human lineage. Since most of the positively selected genes are different under the same functional categories between these species, we suggest that the individual roles of the alternative positively selected genes may be an important factor underlying biological differences between these species. %B PLoS Comput Biol %V 2 %P e38 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16683019 %0 Journal Article %J Nucleic Acids Res %D 2006 %T PupaSuite: finding functional single nucleotide polymorphisms for large-scale genotyping purposes %A L. Conde %A Vaquerizas, J. M. %A H. Dopazo %A Arbiza, L. %A Reumers, J. %A Rousseau, F. %A Schymkowitz, J. %A Dopazo, J. %K Algorithms Computer Graphics Databases %K Molecular Genotype Haplotypes Internet Linkage Disequilibrium *Polymorphism %K Nucleic Acid Evolution %K Single Nucleotide *Software User-Computer Interface %X

We have developed a web tool, PupaSuite, for the selection of single nucleotide polymorphisms (SNPs) with potential phenotypic effect, specifically oriented to help in the design of large-scale genotyping projects. PupaSuite uses a collection of data on SNPs from heterogeneous sources and a large number of pre-calculated predictions to offer a flexible and intuitive interface for selecting an optimal set of SNPs. It improves the functionality of PupaSNP and PupasView programs and implements new facilities such as the analysis of user’s data to derive haplotypes with functional information. A new estimator of putative effect of polymorphisms has been included that uses evolutionary information. Also SNPeffect database predictions have been included. The PupaSuite web interface is accessible through http://pupasuite.bioinfo.cipf.es and through http://www.pupasnp.org.

%B Nucleic Acids Res %V 34 %P W621-5 %G eng %U http://nar.oxfordjournals.org/cgi/content/full/34/suppl_2/W621 %0 Journal Article %J J Mol Biol %D 2006 %T Refinement of protein structures by iterative comparative modeling and CryoEM density fitting %A Topf, M. %A Baker, M. L. %A M. A. Marti-Renom %A Chiu, W. %A Sali, A. %K Amino Acid Sequence Cryoelectron Microscopy *Models %K Molecular Molecular Sequence Data Plant Viruses/chemistry *Protein Conformation Software Viral Proteins/*chemistry/genetics %X We developed a method for structure characterization of assembly components by iterative comparative protein structure modeling and fitting into cryo-electron microscopy (cryoEM) density maps. Specifically, we calculate a comparative model of a given component by considering many alternative alignments between the target sequence and a related template structure while optimizing the fit of a model into the corresponding density map. The method relies on the previously developed Moulder protocol that iterates over alignment, model building, and model assessment. The protocol was benchmarked using 20 varied target-template pairs of known structures with less than 30% sequence identity and corresponding simulated density maps at resolutions from 5A to 25A. Relative to the models based on the best existing sequence profile alignment methods, the percentage of C(alpha) atoms that are within 5A of the corresponding C(alpha) atoms in the superposed native structure increases on average from 52% to 66%, which is half-way between the starting models and the models from the best possible alignments (82%). The test also reveals that despite the improvements in the accuracy of the fitness function, this function is still the bottleneck in reducing the remaining errors. To demonstrate the usefulness of the protocol, we applied it to the upper domain of the P8 capsid protein of rice dwarf virus that has been studied by cryoEM at 6.8A. 
The C(alpha) root-mean-square deviation of the model based on the remotely related template, bluetongue virus VP7, improved from 8.7A to 6.0A, while the best possible model has a C(alpha) RMSD value of 5.3A. Moreover, the resulting model fits better into the cryoEM density map than the initial template structure. The method is being implemented in our program MODELLER for protein structure modeling by satisfaction of spatial restraints and will be applicable to the rapidly increasing number of cryoEM density maps of macromolecular assemblies. %B J Mol Biol %V 357 %P 1655-68 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16490207 %0 Book Section %B Discovery of biomolecular mechanisms with theoretical data analyses %D 2006 %T Reliable and specific protein function prediction by combining homology with genomic(s) context %A M. A. Huynen %A B. Snel %A Gabaldón T %B Discovery of biomolecular mechanisms with theoretical data analyses %I F. Eisenhaber, Landes Bioscience %G eng %U http://www.landesbioscience.com/iu/output.php?id=479 %0 Journal Article %J J Mol Biol %D 2006 %T Selective pressures at a codon-level predict deleterious mutations in human disease genes %A Arbiza, L. %A Duchi, S. %A Montaner, D. %A Burguet, J. %A Pantoja-Uceda, D. %A Pineda-Lucena, A. %A Dopazo, J. %A H. Dopazo %K Amino Acid Sequence Amino Acid Substitution Codon/*genetics Databases %K Genetic Evolution %K Genetic Models %K Human Humans Models %K Inborn/*genetics Genome %K Molecular Genes %K Molecular Molecular Sequence Data *Mutation Neoplasms/genetics Proteins/genetics *Selection (Genetics) Tumor Suppressor Protein p53/chemistry/genetics %K p53 Genetic Diseases %X Deleterious mutations affecting biological function of proteins are constantly being rejected by purifying selection from the gene pool. 
The non-synonymous/synonymous substitution rate ratio (omega) is a measure of selective pressure on amino acid replacement mutations for protein-coding genes. Different methods have been developed in order to predict non-synonymous changes affecting gene function. However, none has considered the estimation of selective constraints acting on protein residues. Here, we have used codon-based maximum likelihood models in order to estimate the selective pressures on the individual amino acid residues of a well-known model protein: p53. We demonstrate that the number of residues under strong purifying selection in p53 is much higher than those that are strictly conserved during the evolution of the species. In agreement with theoretical expectations, residues that have been noted to be of structural relevance, or in direct association with DNA, were among those showing the highest signals of purifying selection. Conversely, those changing according to a neutral, or nearly neutral mode of evolution, were observed to be irrelevant for protein function. Finally, using more than 40 human disease genes, we demonstrate that residues evolving under strong selective pressures (omega<0.1) are significantly associated (p<0.01) with human disease. We hypothesize that non-synonymous change on amino acids showing omega<0.1 will most likely affect protein function. The application of this evolutionary prediction at a genomic scale will provide an a priori hypothesis of the phenotypic effect of non-synonymous coding single nucleotide polymorphisms (SNPs) in the human genome. %B J Mol Biol %V 358 %P 1390-404 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16584746 %0 Journal Article %J Protein Eng Des Sel %D 2006 %T Variable gap penalty for protein sequence-structure alignment %A Madhusudhan, M. S. %A M. A. Marti-Renom %A Sanchez, R. %A Sali, A. 
%K Algorithms Amino Acid Sequence Models %K Amino Acid *Software %K Molecular Molecular Sequence Data Proteins/*chemistry Sequence Alignment/*methods Sequence Analysis %K Protein/*methods *Sequence Homology %X The penalty for inserting gaps into an alignment between two protein sequences is a major determinant of the alignment accuracy. Here, we present an algorithm for finding a globally optimal alignment by dynamic programming that can use a variable gap penalty (VGP) function of any form. We also describe a specific function that depends on the structural context of an insertion or deletion. It penalizes gaps that are introduced within regions of regular secondary structure, buried regions, straight segments and also between two spatially distant residues. The parameters of the penalty function were optimized on a set of 240 sequence pairs of known structure, spanning the sequence identity range of 20-40%. We then tested the algorithm on another set of 238 sequence pairs of known structures. The use of the VGP function increases the number of correctly aligned residues from 81.0 to 84.5% in comparison with the optimized affine gap penalty function; this difference is statistically significant according to Student’s t-test. We estimate that the new algorithm allows us to produce comparative models with an additional approximately 7 million accurately modeled residues in the approximately 1.1 million proteins that are detectably related to a known structure. %B Protein Eng Des Sel %V 19 %P 129-33 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16423846 %0 Journal Article %J Nature %D 2005 %T An anaerobic mitochondrion that produces hydrogen %A Boxma, B. %A de Graaf, R. M. %A van der Staay, G. W. %A van Alen, T. A. %A Ricard, G. %A Gabaldón, T. %A van Hoek, A. H. %A Moon-van der Staay, S. Y. %A Koopman, W. J. %A van Hellemond, J. J. %A Tielens, A. G. %A Friedrich, T. %A Veenhuis, M. %A M. A. 
Huynen %A Hackstein, J. H. %K *Anaerobiosis Animals Ciliophora/*cytology/genetics/*metabolism/ultrastructure Cockroaches/parasitology DNA %K Mitochondrial/genetics Electron Transport Electron Transport Complex I/antagonists & inhibitors/metabolism Genome Glucose/metabolism Hydrogen/*metabolism Mitochondria/enzymology/genetics/*metabolism/ultrastructure Molecular Sequence Data Open Reading Fra %X Hydrogenosomes are organelles that produce ATP and hydrogen, and are found in various unrelated eukaryotes, such as anaerobic flagellates, chytridiomycete fungi and ciliates. Although all of these organelles generate hydrogen, the hydrogenosomes from these organisms are structurally and metabolically quite different, just like mitochondria where large differences also exist. These differences have led to a continuing debate about the evolutionary origin of hydrogenosomes. Here we show that the hydrogenosomes of the anaerobic ciliate Nyctotherus ovalis, which thrives in the hindgut of cockroaches, have retained a rudimentary genome encoding components of a mitochondrial electron transport chain. Phylogenetic analyses reveal that those proteins cluster with their homologues from aerobic ciliates. In addition, several nucleus-encoded components of the mitochondrial proteome, such as pyruvate dehydrogenase and complex II, were identified. The N. ovalis hydrogenosome is sensitive to inhibitors of mitochondrial complex I and produces succinate as a major metabolic end product–biochemical traits typical of anaerobic mitochondria. The production of hydrogen, together with the presence of a genome encoding respiratory chain components, and biochemical features characteristic of anaerobic mitochondria, identify the N. ovalis organelle as a missing link between mitochondria and hydrogenosomes. 
%B Nature %V 434 %P 74-9 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15744302 %0 Journal Article %J Nucleic Acids Res %D 2005 %T BABELOMICS: a suite of web tools for functional annotation and analysis of groups of genes in high-throughput experiments %A Fatima Al-Shahrour %A Minguez, P. %A Vaquerizas, J. M. %A L. Conde %A Dopazo, J. %K babelomics %K functional profiling %X

We present Babelomics, a complete suite of web tools for the functional analysis of groups of genes in high-throughput experiments, which includes the use of information on Gene Ontology terms, InterPro motifs, KEGG pathways, Swiss-Prot keywords, analysis of predicted transcription factor binding sites, chromosomal positions and presence in tissues with determined histological characteristics, through five integrated modules: FatiGO (fast assignment and transference of information), FatiWise, transcription factor association test, GenomeGO and tissues mining tool, respectively. Additionally, another module, FatiScan, provides a new procedure that integrates biological information in combination with experimental results in order to find groups of genes with modest but coordinate significant differential behaviour. FatiScan is highly sensitive and is capable of finding significant asymmetries in the distribution of genes of common function across a list of ordered genes even if these asymmetries are not extreme. The strong multiple-testing nature of the contrasts made by the tools is taken into account. All the tools are integrated in the gene expression analysis package GEPAS. Babelomics is the natural evolution of our tool FatiGO (which analysed almost 22,000 experiments during the last year) to include more sources of information and new modes of using it. Babelomics can be found at http://www.babelomics.org.

%B Nucleic Acids Res %V 33 %P W460-4 %G eng %U http://nar.oxfordjournals.org/content/33/suppl_2/W460.long %0 Journal Article %J Bioinformatics %D 2005 %T Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research %A A. Conesa %A Gotz, S. %A Garcia-Gomez, J. M. %A Terol, J. %A Talon, M. %A Robles, M. %K babelomics %X

SUMMARY: We present here Blast2GO (B2G), a research tool designed with the main purpose of enabling Gene Ontology (GO) based data mining on sequence data for which no GO annotation is yet available. B2G joins in one application GO annotation based on similarity searches with statistical analysis and highlighted visualization on directed acyclic graphs. This tool offers a suitable platform for functional genomics research in non-model species. B2G is an intuitive and interactive desktop application that allows monitoring and comprehension of the whole annotation and analysis process. AVAILABILITY: Blast2GO is freely available via Java Web Start at http://www.blast2go.de. SUPPLEMENTARY MATERIAL: http://www.blast2go.de -> Evaluation.

%B Bioinformatics %V 21 %P 3674-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16081474 %0 Journal Article %J FEBS Lett %D 2005 %T Combining data from genomes, Y2H and 3D structure indicates that BolA is a reductase interacting with a glutaredoxin %A M. A. Huynen %A Spronk, C. A. %A Gabaldón, T. %A B. Snel %K *Genome Glutaredoxins Models %K Molecular Oxidoreductases/chemistry/*metabolism Phylogeny Protein Conformation %X Genomes, functional genomics data and 3D structure reflect different aspects of protein function. Here, we combine these data to predict that BolA, a widely distributed protein family with unknown function, is a reductase that interacts with a glutaredoxin. Comparisons at the 3D structure level as well as at the sequence profile level indicate homology between BolA and OsmC, an enzyme that reduces organic peroxides. Complementary to this, comparative analyses of genomes and genomics data provide strong evidence of an interaction between BolA and the mono-thiol glutaredoxin family. The interaction between BolA and a mono-thiol glutaredoxin is of particular interest because BolA does not, in contrast to its homolog OsmC, have evolutionarily conserved cysteines to provide it with reducing equivalents. We propose that BolA uses the mono-thiol glutaredoxin as the source for these. %B FEBS Lett %V 579 %P 591-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15670813 %0 Journal Article %J Nat Struct Mol Biol %D 2005 %T The C-type lectin fold as an evolutionary solution for massive sequence variation %A McMahon, S. A. %A Miller, J. L. %A Lawton, J. A. %A Kerkow, D. E. %A Hodes, A. %A M. A. Marti-Renom %A Doulatov, S. %A Narayanan, E. %A Sali, A. %A Miller, J. F. %A Ghosh, P. 
%K Amino Acid Sequence Bacterial Outer Membrane Proteins/*chemistry Bacteriophages/*metabolism Bordetella/*virology Evolution %K Bordetella/*chemistry %K C-Type/*chemistry Molecular Sequence Data Protein Conformation Protein Folding Viral Proteins/*chemistry/*genetics Virulence Factors %K Molecular Genetic Variation Genome %K Viral Lectins %X Only few instances are known of protein folds that tolerate massive sequence variation for the sake of binding diversity. The most extensively characterized is the immunoglobulin fold. We now add to this the C-type lectin (CLec) fold, as found in the major tropism determinant (Mtd), a retroelement-encoded receptor-binding protein of Bordetella bacteriophage. Variation in Mtd, with its approximately 10(13) possible sequences, enables phage adaptation to Bordetella spp. Mtd is an intertwined, pyramid-shaped trimer, with variable residues organized by its CLec fold into discrete receptor-binding sites. The CLec fold provides a highly static scaffold for combinatorial display of variable residues, probably reflecting a different evolutionary solution for balancing diversity against stability from that in the immunoglobulin fold. Mtd variants are biased toward the receptor pertactin, and there is evidence that the CLec fold is used broadly for sequence variation by related retroelements. %B Nat Struct Mol Biol %V 12 %P 886-92 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16170324 %0 Book Section %D 2005 %T Data analysis and visualisation in genomics and proteomics %A F. Azuaje %A Dopazo, J. %K babelomics %I Wiley, F. Azuaje and J. Dopazo %G eng %0 Book Section %B Data analysis and visualisation in genomics and proteomics %D 2005 %T Data and Predictive Model Integration: an Overview of Key Concepts, Problems and Solutions %A F. Azuaje %A Dopazo, J. %A Wang, H %B Data analysis and visualisation in genomics and proteomics %I Wiley, F. Azuaje and J. 
Dopazo %G eng %0 Journal Article %J Proc Natl Acad Sci U S A %D 2005 %T Detecting remotely related proteins by their interactions and sequence similarity %A Espadaler, J. %A Aragues, R. %A Eswar, N. %A M. A. Marti-Renom %A Querol, E. %A Aviles, F. X. %A Sali, A. %A Oliva, B. %K Amino Acid %K Computational Biology Databases %K Molecular Protein Conformation Protein Folding Proteins/*genetics/*metabolism Proteomics/*methods *Sequence Homology %K Protein *Evolution %X The function of an uncharacterized protein is usually inferred either from its homology to, or its interactions with, characterized proteins. Here, we use both sequence similarity and protein interactions to identify relationships between remotely related protein sequences. We rely on the fact that homologous sequences share similar interactions, and, therefore, the set of interacting partners of the partners of a given protein is enriched by its homologs. The approach was benchmarked by assigning the fold and functional family to test sequences of known structure. Specifically, we relied on 1,434 proteins with known folds, as defined in the Structural Classification of Proteins (SCOP) database, and with known interacting partners, as defined in the Database of Interacting Proteins (DIP). For this subset, the specificity of fold assignment was increased from 54% for position-specific iterative BLAST to 75% for our approach, with a concomitant increase in sensitivity of a few percentage points. Similarly, the specificity of family assignment at the e-value threshold of 10^-8 was increased from 70% to 87%. The proposed method would be a useful tool for large-scale automated discovery of remote relationships between protein sequences, given its unique reliance on sequence similarity and protein-protein interactions.
%B Proc Natl Acad Sci U S A %V 102 %P 7151-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15883372 %0 Journal Article %J Plant Mol Biol %D 2005 %T Development of a citrus genome-wide EST collection and cDNA microarray as resources for genomic studies %A J. Forment %A J. Gadea %A Huerta, L. %A Abizanda, L. %A Agusti, J. %A Alamar, S. %A Alos, E. %A Andres, F. %A Arribas, R. %A Beltran, J. P. %A Berbel, A. %A Blazquez, M. A. %A Brumos, J. %A Canas, L. A. %A Cercos, M. %A Colmenero-Flores, J. M. %A A. Conesa %A Estables, B. %A Gandia, M. %A Garcia-Martinez, J. L. %A Gimeno, J. %A Gisbert, A. %A Gomez, G. %A Gonzalez-Candelas, L. %A Granell, A. %A Guerri, J. %A Lafuente, M. T. %A Madueno, F. %A Marcos, J. F. %A Marques, M. C. %A Martinez, F. %A Martinez-Godoy, M. A. %A Miralles, S. %A Moreno, P. %A Navarro, L. %A Pallas, V. %A Perez-Amador, M. A. %A Perez-Valle, J. %A Pons, C. %A Rodrigo, I. %A Rodriguez, P. L. %A Royo, C. %A Serrano, R. %A Soler, G. %A Tadeo, F. %A Talon, M. %A Terol, J. %A Trenor, M. %A Vaello, L. %A Vicente, O. %A Vidal, Ch %A Zacarias, L. %A Conejero, V. %K Citrus/*genetics DNA %K Complementary/chemistry/genetics *Expressed Sequence Tags Gene Expression Profiling Gene Library *Genome %K DNA %K Plant Genomics/*methods Molecular Sequence Data Oligonucleotide Array Sequence Analysis/*methods RNA %K Plant/genetics/metabolism Reproducibility of Results Sequence Analysis %X A functional genomics project has been initiated to approach the molecular characterization of the main biological and agronomical traits of citrus. As a key part of this project, a citrus EST collection has been generated from 25 cDNA libraries covering different tissues, developmental stages and stress conditions. The collection includes a total of 22,635 high-quality ESTs, grouped in 11,836 putative unigenes, which represent at least one third of the estimated number of genes in the citrus genome. 
Functional annotation of unigenes which have Arabidopsis orthologues (68% of all unigenes) revealed gene representation in every major functional category, suggesting that a genome-wide EST collection was obtained. A Citrus clementina Hort. ex Tan. cv. Clemenules genomic library, that will contribute to further characterization of relevant genes, has also been constructed. To initiate the analysis of citrus transcriptome, we have developed a cDNA microarray containing 12,672 probes corresponding to 6875 putative unigenes of the collection. Technical characterization of the microarray showed high intra- and inter-array reproducibility, as well as a good range of sensitivity. We have also validated gene expression data achieved with this microarray through an independent technique such as RNA gel blot analysis. %B Plant Mol Biol %V 57 %P 375-91 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15830128 %0 Journal Article %J Bioinformatics %D 2005 %T Discovering molecular functions significantly related to phenotypes by combining gene expression data and biological information %A Fatima Al-Shahrour %A Diaz-Uriarte, R. %A Dopazo, J. %K babelomics %K Biological Neoplasm Proteins/genetics/*metabolism Phenotype Software Structure-Activity Relationship Systems Integration Tumor Markers %K Biological/genetics/*metabolism %K Breast Neoplasms/genetics/*metabolism Computer Simulation *Database Management Systems *Databases %K Protein Documentation/methods Gene Expression Profiling/*methods Humans *Models %X

MOTIVATION: The analysis of genome-scale data from different high throughput techniques can be used to obtain lists of genes ordered according to their different behaviours under distinct experimental conditions corresponding to different phenotypes (e.g. differential gene expression between diseased samples and controls, different response to a drug, etc.). The order in which the genes appear in the list is a consequence of the biological roles that the genes play within the cell, which account, at molecular scale, for the macroscopic differences observed between the phenotypes studied. Typically, two steps are followed for understanding the biological processes that differentiate phenotypes at molecular level: first, genes with significant differential expression are selected on the basis of their experimental values and subsequently, the functional properties of these genes are analysed. Instead, we present a simple procedure which combines experimental measurements with available biological information in a way that genes are simultaneously tested in groups related by common functional properties. The method proposed constitutes a very sensitive tool for selecting genes with significant differential behaviour in the experimental conditions tested. RESULTS: We propose the use of a method to scan ordered lists of genes. The method allows the understanding of the biological processes operating at molecular level behind the macroscopic experiment from which the list was generated. This procedure can be useful in situations where it is not possible to obtain statistically significant differences based on the experimental measurements (e.g. low prevalence diseases, etc.). Two examples demonstrate its application in two microarray experiments and the type of information that can be extracted.

%B Bioinformatics %V 21 %P 2988-93 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15840702 %0 Journal Article %J Evol Bioinform Online %D 2005 %T Evolution of proteins and proteomes: a phylogenetics approach %A Gabaldón, T. %X The study of evolutionary relationships among protein sequences was one of the first applications of bioinformatics. Since then, and accompanying the wealth of biological data produced by genome sequencing and other high-throughput techniques, the use of bioinformatics in general and phylogenetics in particular has been gaining ground in the study of protein and proteome evolution. Nowadays, the use of phylogenetics is instrumental not only to infer the evolutionary relationships among species and their genome sequences, but also to reconstruct ancestral states of proteins and proteomes and hence trace the paths followed by evolution. Here I survey recent progress in the elucidation of mechanisms of protein and proteome evolution in which phylogenetics has played a determinant role. %B Evol Bioinform Online %V 1 %P 51-61 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=19325853 %0 Journal Article %J Genome Biol %D 2005 %T Genome-scale evidence of the nematode-arthropod clade %A H. Dopazo %A Dopazo, J. %K Animals Arthropods/*classification/genetics Caenorhabditis elegans/classification/genetics Evolution %K Molecular *Genome Genomics Nematoda/*classification/genetics *Phylogeny %X BACKGROUND: The issue of whether coelomates form a single clade, the Coelomata, or whether all animals that moult an exoskeleton (such as the coelomate arthropods and the pseudocoelomate nematodes) form a distinct clade, the Ecdysozoa, is the most puzzling issue in animal systematics and a major open-ended subject in evolutionary biology. Previous single-gene and genome-scale analyses designed to resolve the issue have produced contradictory results. 
Here we present the first genome-scale phylogenetic evidence that strongly supports the Ecdysozoa hypothesis. RESULTS: Through the most extensive phylogenetic analysis carried out to date, the complete genomes of 11 eukaryotic species have been analyzed in order to find homologous sequences derived from 18 human chromosomes. Phylogenetic analysis of datasets showing an increased adjustment to equal evolutionary rates between nematode and arthropod sequences produced a gradual change from support for Coelomata to support for Ecdysozoa. Transition between topologies occurred when fast-evolving sequences of Caenorhabditis elegans were removed. When chordate, nematode and arthropod sequences were constrained to fit equal evolutionary rates, the Ecdysozoa topology was statistically accepted whereas Coelomata was rejected. CONCLUSIONS: The reliability of a monophyletic group clustering arthropods and nematodes was unequivocally accepted in datasets where traces of the long-branch attraction effect were removed. This is the first phylogenomic evidence to strongly support the ’moulting clade’ hypothesis. %B Genome Biol %V 6 %P R41 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15892869 %0 Journal Article %J Nucleic Acids Res %D 2005 %T GEPAS, an experiment-oriented pipeline for the analysis of microarray gene expression data %A Vaquerizas, J. M. %A L. Conde %A Yankilevich, P. %A Cabezon, A. %A Minguez, P. %A Diaz-Uriarte, R. %A Fatima Al-Shahrour %A Herrero, J. %A Dopazo, J. %K gepas %K microarray data analysis %X

The Gene Expression Profile Analysis Suite, GEPAS, has been running for more than three years. With >76,000 experiments analysed during the last year and a daily average of almost 300 analyses, GEPAS can be considered a well-established and widely used platform for gene expression microarray data analysis. GEPAS is oriented to the analysis of whole series of experiments. Its design and development have been driven by the demands of the biomedical community, probably the most active collective in the field of microarray users. Although clustering methods have obviously been implemented in GEPAS, our interest has focused more on methods for finding genes differentially expressed among distinct classes of experiments or correlated to diverse clinical outcomes, as well as on building predictors. There is also a great interest in CGH-arrays which fostered the development of the corresponding tool in GEPAS: InSilicoCGH. Much effort has been invested in GEPAS for developing and implementing efficient methods for functional annotation of experiments in the proper statistical framework. Thus, the popular FatiGO has expanded to a suite of programs for functional annotation of experiments, including information on transcription factor binding sites, chromosomal location and tissues. The web-based pipeline for microarray gene expression data, GEPAS, is available at http://www.gepas.org.

%B Nucleic Acids Res %V 33 %P W616-20 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15980548 %0 Journal Article %J Nucleic Acids Res %D 2005 %T HCAD, closing the gap between breakpoints and genes %A Hoffmann, R. %A Dopazo, J. %A Cigudosa, J. C. %A Valencia, A. %K *Chromosome Breakage Chromosome Disorders/diagnosis/*genetics *Databases %K Genetic Genes *Genetic Predisposition to Disease Humans PubMed Systems Integration %X Recurrent chromosome aberrations are an important resource when associating human pathologies to specific genes. However, for technical reasons a large number of chromosome breakpoints are defined only at the level of cytobands and many of the genes involved remain unidentified. We developed a web-based information system that mines the scientific literature and generates textual and comprehensive information on all human breakpoints. We show that the statistical analysis of this textual information and its combination with genomic data can identify genes directly involved in DNA rearrangements. The Human Chromosome Aberration Database (HCAD) is publicly accessible at http://www.pdg.cnb.uam.es/UniPub/HCAD/. %B Nucleic Acids Res %V 33 %P D511-3 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15608250 %0 Journal Article %J Bioinformatics %D 2005 %T Highly specific and accurate selection of siRNAs for high-throughput functional assays %A J. Santoyo %A Vaquerizas, J. M. %A Dopazo, J. %K *Algorithms Base Sequence *Gene Silencing Molecular Sequence Data RNA %K RNA/*methods *Software *User-Computer Interface %K Small Interfering/*genetics Sequence Alignment/*methods Sequence Analysis %X MOTIVATION: Small interfering RNA (siRNA) is widely used in functional genomics to silence genes by decreasing their expression to study the resulting phenotypes. 
The possibility of performing large-scale functional assays by gene silencing accentuates the need for software capable of the high-throughput design of highly specific siRNAs. The main objective sought was the design of a large number of siRNAs with appropriate thermodynamic properties and, especially, high specificity. Since all the available procedures require, to some extent, manual processing of the results to guarantee specific results, specificity constitutes, to date, the major obstacle to the complete automation of all the steps necessary for the selection of optimal candidate siRNAs. RESULT: Here, we present a program that for the first time completely automates the search for siRNAs. In SiDE, the most complete set of rules for the selection of siRNA candidates (including G+C content, nucleotides at determined positions, thermodynamic properties, propensity to form internal hairpins, etc.) is implemented and, moreover, specificity is achieved by a conceptually new method. After selecting possible siRNA candidates with the optimal functional properties, putative unspecific matches, which can cause cross-hybridization, are checked in databases containing a unique entry for each gene. These truly non-redundant databases are constructed from the genome annotations (Ensembl). Also, intron/exon boundaries, presence of polymorphisms (single nucleotide polymorphisms), specificity for either gene or transcript, and other features can be selected to be considered in the design of siRNAs. AVAILABILITY: The program is available as a web server at http://side.bioinfo.cnio.es. The program was written under the GPL license. CONTACT: jdopazo@cnio.es.
%B Bioinformatics %V 21 %P 1376-82 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15591357 %0 Book Section %B Data analysis and visualisation in genomics and proteomics %D 2005 %T Integrative Data Analysis and Visualization: Introduction to Critical Problems, Goals and Challenges %A F. Azuaje %A Dopazo, J. %B Data analysis and visualisation in genomics and proteomics %I Wiley, F. Azuaje and J. Dopazo %P 3-9 %G eng %0 Journal Article %J Bioinformatics %D 2005 %T Lineage-specific gene loss following mitochondrial endosymbiosis and its potential for function prediction in eukaryotes %A Gabaldón, T. %A M. A. Huynen %K Animals Chromosome Mapping/*methods DNA %K Mitochondrial/*genetics *Evolution %K Molecular *Gene Deletion Genetic Variation/genetics Humans Linkage Disequilibrium/*genetics Mitochondrial Proteins/*genetics Sequence Homology %K Nucleic Acid Species Specificity Symbiosis/*genetics %X MOTIVATION: The endosymbiotic origin of mitochondria has resulted in a massive horizontal transfer of genetic material from an alpha-proteobacterium to the early eukaryotes. Using large-scale phylogenetic analysis we have previously identified 630 orthologous groups of proteins derived from this event. Here we show that this proto-mitochondrial protein set has undergone extensive lineage-specific gene loss in the eukaryotes, with an average of three losses per orthologous group in a phylogeny of nine species. This gene loss has resulted in a high variability of the alphaproteobacterial-derived gene content of present-day eukaryotic genomes that might reflect functional adaptation to different environments. Proteins functioning in the same biochemical pathway tend to have a similar history of gene loss events, and we use this property to predict functional interactions among proteins in our set. 
%B Bioinformatics %V 21 Suppl 2 %P ii144-50 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16204094 %0 Journal Article %J Genes Chromosomes Cancer %D 2005 %T A novel candidate region linked to development of both pheochromocytoma and head/neck paraganglioma %A Cascon, A. %A Ruiz-Llorente, S. %A Rodriguez-Perales, S. %A Honrado, E. %A Martinez-Ramirez, A. %A Leton, R. %A Montero-Conde, C. %A Benitez, J. %A Dopazo, J. %A Cigudosa, J. C. %A M. Robledo %K 80 and over Child Chromosomes %K Adolescent Adrenal Gland Neoplasms/*genetics Adult Aged Aged %K Biological/*genetics %K Human %K Pair 1/genetics Chromosomes %K Pair 11/genetics Chromosomes %K Pair 3/genetics Chromosomes %K Pair 8/genetics Female Gene Deletion Head and Neck Neoplasms/*genetics Humans Male Middle Aged Nucleic Acid Hybridization Paraganglioma/*genetics Pheochromocytoma/*genetics Tumor Markers %X Although the histologic distinction between pheochromocytomas and head and neck paragangliomas is clear, little is known about the genetic differences between them. To date, various sets of genes have been found to be involved in inherited susceptibility to developing both tumor types, but the genes involved in sporadic pathogenesis are still unknown. To define new candidate regions, we performed CGH analysis on 29 pheochromocytomas and on 24 paragangliomas mainly of head and neck origin (20 of 24), which allowed us to differentiate between the two tumor types. Loss of 3q was significantly more frequent in pheochromocytomas, and loss of 1q appeared only in paragangliomas. We also found gain of 11q13 to be a significantly frequent alteration in malignant cases of both types. In addition, recurrent loss of 8p22-23 was found in 62% of pheochromocytomas (including all malignant cases) versus in 33% of paragangliomas, suggesting that this region contains candidate genes involved in the pathogenesis of this abnormality. 
Using FISH analysis on tissue microarrays, we confirmed genomic deletion of this region in 55% of pheochromocytomas compared to 12% of paragangliomas. Loss of 8p22-23 appears to be an important event in the sporadic development of these tumors, and additional molecular studies are necessary to identify candidate genes in this chromosomal region. %B Genes Chromosomes Cancer %V 42 %P 260-8 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15609347 %0 Book Section %B Data analysis and visualisation in genomics and proteomics %D 2005 %T Ontologies and functional genomics %A Fatima Al-Shahrour %A Dopazo, J. %B Data analysis and visualisation in genomics and proteomics %I Wiley, F. Azuaje and J. Dopazo %P 99-102 %G eng %0 Journal Article %J Breast Cancer Res Treat %D 2005 %T Phenotypic characterization of BRCA1 and BRCA2 tumors based in a tissue microarray study with 37 immunohistochemical markers %A Palacios, J. %A Honrado, E. %A Osorio, A. %A Cazorla, A. %A Sarrio, D. %A Barroso, A. %A Rodriguez, S. %A Cigudosa, J. C. %A Diez, O. %A Alonso, C. %A Lerma, E. %A Dopazo, J. %A Rivas, C. %A Benitez, J. %K Adult Apoptosis Breast Neoplasms/*genetics/*pathology Cell Cycle Proteins Cluster Analysis Female *Genes %K Biological/genetics/metabolism %K BRCA1 *Genes %K BRCA2 Humans Immunohistochemistry In Situ Hybridization %K Fluorescence Phenotype Spain *Tissue Array Analysis *Tumor Markers %X Familial breast cancers that are associated with BRCA1 or BRCA2 germline mutations differ in both their morphological and immunohistochemical characteristics. To further characterize the molecular difference between genotypes, the authors evaluated the expression of 37 immunohistochemical markers in a tissue microarray (TMA) containing cores from 20 BRCA1, 14 BRCA2, and 59 sporadic age-matched breast carcinomas. 
Markers analyzed included, among others, common markers in breast cancer, such as hormone receptors, p53 and HER2, along with 15 molecules involved in cell cycle regulation, such as cyclins, cyclin dependent kinases (CDK) and CDK inhibitors (CDKI), apoptosis markers, such as BCL2 and active caspase 3, and two basal/myoepithelial markers (CK 5/6 and P-cadherin). In addition, we analyzed the amplification of CCND1, CCNE, HER2 and MYC by FISH. Unsupervised cluster data analysis of both hereditary and sporadic cases using the complete set of immunohistochemical markers demonstrated that most BRCA1-associated carcinomas grouped in a branch of ER-, HER2-negative tumors that expressed basal cell markers and/or p53 and had higher expression of activated caspase 3. The cell cycle proteins associated with these tumors were E2F6, cyclins A, B1 and E, SKP2 and Topo IIalpha. In contrast, most BRCA2-associated carcinomas grouped in a branch composed of ER/PR/BCL2-positive tumors with a higher expression of the cell cycle proteins cyclin D1, cyclin D3, p27, p16, p21, CDK4, CDK2 and CDK1. In conclusion, our study of hereditary breast cancer tumors, analyzing 37 immunohistochemical markers, defines the molecular differences between BRCA1 and BRCA2 tumors with respect to hormonal receptors, cell cycle, apoptosis and basal cell markers. %B Breast Cancer Res Treat %V 90 %P 5-14 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15770521 %0 Journal Article %J Clin Cancer Res %D 2005 %T A predictor based on the somatic genomic changes of the BRCA1/BRCA2 breast cancer tumors identifies the non-BRCA1/BRCA2 tumors with BRCA1 promoter hypermethylation %A Alvarez, S. %A Diaz-Uriarte, R. %A Osorio, A. %A Barroso, A. %A Melchor, L. %A Paz, M. F. %A Honrado, E. %A Rodriguez, R. %A Urioste, M. %A Valle, L. %A Diez, O. %A Cigudosa, J. C. %A Dopazo, J. %A Esteller, M. %A Benitez, J. 
%K BRCA1 Protein/*genetics BRCA2 Protein/*genetics Breast Neoplasms/*genetics/pathology Chromosomes %K Genetic/*genetics %K Human %K Human Humans Male Mutation Nucleic Acid Hybridization/methods Promoter Regions %K Pair 12/genetics Chromosomes %K Pair 15/genetics Chromosomes %K Pair 18/genetics Chromosomes %K Pair 2/genetics Chromosomes %K Pair 8/genetics *DNA Methylation Female Genome %X The genetic changes underlying the development and progression of familial breast cancer are poorly understood. To identify a somatic genetic signature of tumor progression for each familial group, BRCA1, BRCA2, and non-BRCA1/BRCA2 (BRCAX) tumors, by high-resolution comparative genomic hybridization, we have analyzed 77 tumors previously characterized for BRCA1 and BRCA2 germ line mutations. Based on a combination of the somatic genetic changes observed at the six most different chromosomal regions and the status of the estrogen receptor, we developed, using random forests, a molecular classifier that assigns to a given tumor a probability of belonging either to the BRCA1 or to the BRCA2 class. Because 76.5% (26 of 34) of the BRCAX cases were assigned by our predictor to the BRCA1 class with a probability of >50%, we analyzed the BRCA1 promoter region for aberrant methylation in all the BRCAX cases. We found that 15 of the 34 BRCAX analyzed tumors had hypermethylation of the BRCA1 gene. When we considered the predictor, we observed that all the cases with this epigenetic event were assigned to the BRCA1 class with a probability of >50%. Interestingly, 84.6% of the cases (11 of 13) assigned to the BRCA1 class with a probability >80% had an aberrant methylation of the BRCA1 promoter. This fact suggests that somatic BRCA1 inactivation could modify the profile of tumor progression in most of the BRCAX cases. 
%B Clin Cancer Res %V 11 %P 1146-53 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15709182 %0 Journal Article %J Nucleic Acids Res %D 2005 %T PupasView: a visual tool for selecting suitable SNPs, with putative pathological effect in genes, for genotyping purposes %A L. Conde %A Vaquerizas, J. M. %A Ferrer-Costa, C. %A de la Cruz, X. %A Orozco, M. %A Dopazo, J. %K Computer Graphics Genes *Genetic Predisposition to Disease Genotype Internet Phenotype *Polymorphism %K Single Nucleotide *Software User-Computer Interface %X We have developed a web tool, PupasView, for the selection of single nucleotide polymorphisms (SNPs) with potential phenotypic effect. PupasView constitutes an interactive environment in which functional information and population frequency data can be used as sequential filters over linkage disequilibrium parameters to obtain a final list of SNPs optimal for genotyping purposes. PupasView is the first resource that integrates phenotypic effects caused by SNPs at both the translational and the transcriptional level. PupasView retrieves SNPs that could affect conserved regions that the cellular machinery uses for the correct processing of genes (intron/exon boundaries or exonic splicing enhancers), predicted transcription factor binding sites and changes in amino acids in the proteins for which a putative pathological effect is calculated. The program uses the mapping of SNPs in the genome provided by Ensembl. PupasView will be of much help in studies of multifactorial disorders, where the use of functional SNPs will increase the sensitivity of the identification of the genes responsible for the disease. The PupasView web interface is accessible through http://pupasview.ochoa.fib.es and through http://www.pupasnp.org. 
%B Nucleic Acids Res %V 33 %P W501-5 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15980522 %0 Book Section %B Adaptation to life in high salt concentrations in Archaea, Bacteria and Eukarya %D 2005 %T Salinibacter ruber: genomics and biogeography %A Antón, J %A Peña, A %A Valens, M %A Santos, F %A Glöckner, F.O %A Bauer, M %A Dopazo, J. %A Herrero, J. %A Rosselló-Mora, R %A Amann, R %B Adaptation to life in high salt concentrations in Archaea, Bacteria and Eukarya %I Nina Gunde-Cimerman, Ana Plemenitas, and Aharon Oren. Kluwer Academic Publishers %C Dordrecht, Netherlands %V 9 %P 257-266 %G eng %0 Journal Article %J J Mol Biol %D 2005 %T Tracing the evolution of a large protein complex in the eukaryotes, NADH:ubiquinone oxidoreductase (Complex I) %A Gabaldón, T. %A Rainey, D. %A M. A. Huynen %K Amino Acid Sequence Animals Computational Biology Electron Transport Complex I/*chemistry/*genetics/metabolism Eukaryotic Cells/*enzymology *Evolution %K Molecular Humans Molecular Sequence Data Photosynthesis Phylogeny Plastids/enzymology Protein Binding Protein Subunits/chemistry/genetics/metabolism Sequence Alignment Structural Homology %K Protein %X The increasing availability of sequenced genomes enables the reconstruction of the evolutionary history of large protein complexes. Here, we trace the evolution of NADH:ubiquinone oxidoreductase (Complex I), which has increased in size, by so-called supernumerary subunits, from 14 subunits in the bacteria to 30 in the plants and algae, 37 in the fungi and 46 in the mammals. 
Using a combination of pair-wise and profile-based sequence comparisons at the levels of proteins and the DNA of the sequenced eukaryotic genomes, combined with phylogenetic analyses to establish orthology relationships, we were able to (1) trace the origin of six of the supernumerary subunits to the alpha-proteobacterial ancestor of the mitochondria, (2) detect previously unidentified homology relations between subunits from fungi and mammals, (3) detect previously unidentified subunits in the genomes of several species and (4) document several cases of gene duplications among supernumerary subunits in the eukaryotes. One of these, a duplication of N7BM (B17.2), is particularly interesting as it has been lost from genomes that have also lost Complex I proteins, making it a candidate for a Complex I interacting protein. A parsimonious reconstruction of eukaryotic Complex I evolution shows an initial increase in size that predates the separation of plants, fungi and metazoa, followed by a gradual adding and incidental losses of subunits in the various evolutionary lineages. This evolutionary scenario is in contrast to that for Complex I in the prokaryotes, for which the combination of several separate, and previously independently functioning modules into a single complex has been proposed. %B J Mol Biol %V 348 %P 857-70 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15843018 %0 Journal Article %J FEBS Lett %D 2005 %T Variation and evolution of biomolecular systems: searching for functional relevance %A M. A. Huynen %A Gabaldón, T. %A B. Snel %K *Evolution %K Molecular Genetic Variation Multiprotein Complexes/*genetics Phylogeny Protein Binding/genetics %X The availability of genome sequences and functional genomics data from multiple species enables us to compare the composition of biomolecular systems like biochemical pathways and protein complexes between species. 
Here, we review small- and large-scale, "genomics-based" approaches to biomolecular systems variation. In general, caution is required when comparing the results of bioinformatics analyses of genomes or of functional genomics data between species. Limitations to the sensitivity of sequence analysis tools and the noisy nature of genomics data tend to lead to systematic overestimates of the amount of variation. Nevertheless, the results from detailed manual analyses, and of large-scale analyses that filter out systematic biases, point to a large amount of variation in the composition of biomolecular systems. Such observations challenge our understanding of the function of the systems and their individual components and can potentially facilitate the identification and functional characterization of sub-systems within a system. Mapping the inter-species variation of complex biomolecular systems on a phylogenetic species tree allows one to reconstruct their evolution. %B FEBS Lett %V 579 %P 1839-45 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15763561 %0 Journal Article %J Protein Sci %D 2004 %T Alignment of protein sequences by their profiles %A M. A. Marti-Renom %A Madhusudhan, M. S. %A Sali, A. %K *Algorithms Amino Acid Sequence Computational Biology Databases %K Protein Markov Chains Molecular Sequence Data *Protein Folding Protein Structure %K Tertiary Proteins/*chemistry *Sequence Alignment Sequence Homology *Software %X The accuracy of an alignment between two protein sequences can be improved by including other detectably related sequences in the comparison. We optimize and benchmark such an approach that relies on aligning two multiple sequence alignments, each one including one of the two protein sequences. Thirteen different protocols for creating and comparing profiles corresponding to the multiple sequence alignments are implemented in the SALIGN command of MODELLER. 
A test set of 200 pairwise, structure-based alignments with sequence identities below 40% is used to benchmark the 13 protocols as well as a number of previously described sequence alignment methods, including heuristic pairwise sequence alignment by BLAST, pairwise sequence alignment by global dynamic programming with an affine gap penalty function by the ALIGN command of MODELLER, sequence-profile alignment by PSI-BLAST, Hidden Markov Model methods implemented in SAM and LOBSTER, pairwise sequence alignment relying on predicted local structure by SEA, and multiple sequence alignment by CLUSTALW and COMPASS. The alignment accuracies of the best new protocols were significantly better than those of the other tested methods. For example, the fraction of the correctly aligned residues relative to the structure-based alignment by the best protocol is 56%, which can be compared with the accuracies of 26%, 42%, 43%, 48%, 50%, 49%, 43%, and 43% for the other methods, respectively. The new method is currently applied to large-scale comparative protein structure modeling of all known sequences. %B Protein Sci %V 13 %P 1071-87 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15044736 %0 Journal Article %J Bioinformatics %D 2004 %T DNMAD: web-based diagnosis and normalization for microarray data %A Vaquerizas, J. M. %A Dopazo, J. %A Diaz-Uriarte, R. %K Algorithms Database Management Systems Gene Expression Profiling/*methods/standards Information Storage and Retrieval/*methods *Internet Oligonucleotide Array Sequence Analysis/*methods/standards Sequence Alignment/methods Sequence Analysis %K DNA/*methods *Software *User-Computer Interface %X SUMMARY: We present a web server for Diagnosis and Normalization of MicroArray Data (DNMAD). 
DNMAD includes several common data transformations such as spatial and global robust local regression or multiple slide normalization, and allows for detecting several kinds of errors that result from the manipulation and the image analysis of the arrays. This tool offers a user-friendly interface, and is completely integrated within the Gene Expression Pattern Analysis Suite (GEPAS). AVAILABILITY: The tool is accessible on-line at http://dnmad.bioinfo.cnio.es. %B Bioinformatics %V 20 %P 3656-8 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15247094 %0 Journal Article %J Clinical cancer research : an official journal of the American Association for Cancer Research %D 2004 %T Expression profiling of T-cell lymphomas differentiates peripheral and lymphoblastic lymphomas and defines survival related genes. %A Martinez-Delgado, Beatriz %A Meléndez, Barbara %A Cuadros, Marta %A Alvarez, Javier %A Castrillo, Jose Maria %A Ruiz De La Parte, Ana %A Mollejo, Manuela %A Bellas, Carmen %A Diaz, Ramon %A Lombardía, Luis %A Fatima Al-Shahrour %A Domínguez, Orlando %A Cascon, Alberto %A Robledo, Mercedes %A Rivas, Carmen %A Benitez, Javier %X

PURPOSE: T-cell lymphomas constitute heterogeneous and aggressive tumors in which pathogenic alterations remain largely unknown. Expression profiling has proven to be a useful tool for molecular classification of tumors. EXPERIMENTAL DESIGN: Using DNA microarrays (CNIO-OncoChip) containing 6386 cancer-related genes, we established the expression profiles of T-cell lymphomas and compared them to normal lymphocytes and lymph nodes. RESULTS: We found significant differences between the peripheral and lymphoblastic T-cell lymphomas, which include a deregulation of the nuclear factor-kappaB signaling pathway. We also identified differentially expressed genes between peripheral T-cell lymphoma tumors and normal T lymphocytes or reactive lymph nodes, which could represent candidate tumor markers of these lymphomas. Additionally, a close relationship was found between genes associated with survival and those that differentiate among the stages of disease and responses to therapy. CONCLUSIONS: Our results reflect the value of gene expression profiling for gaining insight into the molecular alterations involved in the pathogenesis of T-cell lymphomas.

%B Clinical cancer research : an official journal of the American Association for Cancer Research %V 10 %P 4971-82 %8 2004 Aug 1 %G eng %U http://clincancerres.aacrjournals.org/content/10/15/4971.long %0 Journal Article %J Bioinformatics %D 2004 %T FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes %A Fatima Al-Shahrour %A Diaz-Uriarte, R. %A Dopazo, J. %K *Algorithms Artificial Intelligence Databases %K babelomics %K DNA/*methods *Software %K Genetic Gene Expression Profiling/*methods *Hypermedia Information Storage and Retrieval/*methods *Internet *Phylogeny Sequence Alignment/methods Sequence Analysis %X

We present a simple but powerful procedure to extract Gene Ontology (GO) terms that are significantly over- or under-represented in sets of genes within the context of a genome-scale experiment (DNA microarray, proteomics, etc.). Said procedure has been implemented as a web application, FatiGO, allowing for easy and interactive querying. FatiGO, which takes the multiple-testing nature of statistical contrast into account, currently includes GO associations for diverse organisms (human, mouse, fly, worm and yeast) and the TrEMBL/Swissprot GOAnnotations@EBI correspondences from the European Bioinformatics Institute.

%B Bioinformatics %V 20 %P 578-80 %G eng %U http://bioinformatics.oxfordjournals.org/content/20/4/578.abstract %0 Journal Article %J Genes Chromosomes Cancer %D 2004 %T Gene expression analysis of chromosomal regions with gain or loss of genetic material detected by comparative genomic hybridization %A Melendez, B. %A Diaz-Uriarte, R. %A Cuadros, M. %A Martinez-Ramirez, A. %A Fernandez-Piqueras, J. %A Dopazo, A. %A Cigudosa, J. C. %A Rivas, C. %A Dopazo, J. %A Martinez-Delgado, B. %A Benitez, J. %K Chromosomes %K Fluorescence Lymphoma %K Human %K Pair 13/*genetics Chromosomes %K Pair 19/*genetics Chromosomes %K Pair 6/*genetics Expressed Sequence Tags *Gene Dosage Gene Expression Profiling Humans In Situ Hybridization %K T-Cell/*genetics Nucleic Acid Hybridization Oligonucleotide Array Sequence Analysis %X Comparative genomic hybridization (CGH) has been widely used to detect copy number alterations in cancer and to identify regions containing candidate tumor-responsible genes; however, gene expression changes have been described only in highly amplified regions (amplicons). To study the overall impact of slight copy number changes on gene expression, we analyzed 16 T-cell lymphomas by using CGH and a custom-designed cDNA microarray containing 7,657 genes and expressed sequence tags related to tumorigenesis. We evaluated mean gene expression and variability within CGH-altered regions and explored the relationship between the effects of the gene and its position within these regions. Minimally overlapping CGH candidate areas (6q25, 13q21-q22, and 19q13.1) revealed a weak relationship between altered genomic content and gene expression. However, some candidate genes showed modified expression within these regions in the majority of tumors; these candidate genes were evaluated and confirmed in another independent series of 23 T-cell lymphomas by use of the same cDNA microarray and by FISH on a tissue microarray. 
When all the CGH regions detected for each tumor were considered, we found a significant increase or decrease in the mean expression of the genes contained in gained or lost regions, respectively. In addition, we found that the expression of a gene was dependent not only on its position within an altered region but also on its own mechanism of regulation: genes in the same altered region responded very differently to the gain or loss of genetic material. Supplementary material for this article can be found on the Genes, Chromosomes, and Cancer website at http://www.interscience.wiley.com/jpages/1045-2257/suppmat/index.html. %B Genes Chromosomes Cancer %V 41 %P 353-65 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15382261 %0 Book Section %B IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology %D 2004 %T Gene expression Correlation and Gene Ontology-Based Similarity: An Assessment of Quantitative Relationship %A Wang, H %A F. Azuaje %A Bodenreider, O %A Dopazo, J. %B IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology %P 25-31 %G eng %0 Journal Article %J Nucleic Acids Res %D 2004 %T MODBASE, a database of annotated comparative protein structure models, and associated resources %A Pieper, U. %A Eswar, N. %A Braberg, H. %A Madhusudhan, M. S. %A Davis, F. P. %A Stuart, A. C. %A Mirkovic, N. %A Rossi, A. %A M. A. Marti-Renom %A Fiser, A. %A Webb, B. %A Greenblatt, D. %A Huang, C. C. %A Ferrin, T. E. %A Sali, A. 
%K Amino Acid Sequence Animals Binding Sites *Computational Biology *Databases %K Molecular Molecular Sequence Data Polymorphism %K Protein Genomics Humans Internet Ligands Models %K Single Nucleotide Protein Binding Protein Conformation Proteins/*chemistry/genetics Sequence Alignment Software User-Computer Interface %X MODBASE (http://salilab.org/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on the MODELLER package for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE uses the MySQL relational database management system for flexible querying and CHIMERA for viewing the sequences and structures (http://www.cgl.ucsf.edu/chimera/). MODBASE is updated regularly to reflect the growth in protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different data sets. The largest data set contains 1,26,629 models for domains in 659,495 out of 1,182,126 unique protein sequences in the complete Swiss-Prot/TrEMBL database (August 25, 2003); only models based on alignments with significant similarity scores and models assessed to have the correct fold despite insignificant alignments are included. Another model data set supports target selection and structure-based annotation by the New York Structural Genomics Research Consortium; e.g. the 53 new structures produced by the consortium allowed us to characterize structurally 24,113 sequences. MODBASE also contains binding site predictions for small ligands and a set of predicted interactions between pairs of modeled sequences from the same genome. 
Our other resources associated with MODBASE include a comprehensive database of multiple protein structure alignments (DBALI, http://salilab.org/dbali) as well as web servers for automated comparative modeling with MODPIPE (MODWEB, http://salilab.org/modweb), modeling of loops in protein structures (MODLOOP, http://salilab.org/modloop) and predicting functional consequences of single nucleotide polymorphisms (SNPWEB, http://salilab.org/snpweb). %B Nucleic Acids Res %V 32 %P D217-22 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=14681398 %0 Journal Article %J Nucleic Acids Res %D 2004 %T New challenges in gene expression data analysis and the extended GEPAS %A Herrero, J. %A Vaquerizas, J. M. %A Fatima Al-Shahrour %A L. Conde %A A. Mateos %A Diaz-Uriarte, J. S. %A Dopazo, J. %K gepas %K microarray data analysis %X

Since the first papers published in the late nineties, which included, for the first time, a comprehensive analysis of microarray data, the number of questions that have been addressed through this technique has both increased and diversified. Initially, interest focussed on genes coexpressing across sets of experimental conditions, implying, essentially, the use of clustering techniques. Recently, however, interest has focussed more on finding genes differentially expressed among distinct classes of experiments, or correlated to diverse clinical outcomes, as well as on building predictors. In addition to this, the availability of accurate genomic data and the recent implementation of CGH arrays have made mapping expression and genomic data on the chromosomes possible. There is also a clear demand for methods that allow the automatic transfer of biological information to the results of microarray experiments. Different initiatives, such as the Gene Ontology (GO) consortium, pathways databases, protein functional motifs, etc., provide curated annotations for genes. Whereas many resources on the web focus mainly on clustering methods, GEPAS has evolved to cope with the aforementioned new challenges that have recently arisen in the field of microarray data analysis. The web-based pipeline for microarray gene expression data, GEPAS, is available at http://gepas.bioinfo.cnio.es.

%B Nucleic Acids Res %V 32 %P W485-91 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15215434 %0 Journal Article %J EMBO Rep %D 2004 %T Perceptions about postdocs %A Vella, F. %A Mietchen, D. %A Gabaldón, T. %K Europe *Fellowships and Scholarships *Research Personnel %B EMBO Rep %V 5 %P 1104 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15577920 %0 Journal Article %J Bioinformatics %D 2004 %T Phylogenomics and the number of characters required for obtaining an accurate phylogeny of eukaryote model species %A H. Dopazo %A J. Santoyo %A Dopazo, J. %X

MOTIVATION: Through the most extensive phylogenomic analysis carried out to date, complete genomes of 11 eukaryotic species have been examined in order to find the homologs of more than 25,000 amino acid sequences. These sequences correspond to the exons of more than 3000 genes and were used as presence/absence characters to test one of the most controversial hypotheses concerning animal evolution, namely the Ecdysozoa hypothesis. Distance, maximum parsimony and Bayesian methods of phylogenetic reconstruction were used to test the hypothesis. RESULTS: The Ecdysozoa grouping, which places arthropods and nematodes in a single clade, was unequivocally rejected in all the consensus trees. The Coelomata clade, grouping arthropods and chordates, was supported with the highest statistical confidence in all the reconstructions. The study of the dependence of the genome tree's accuracy on the number of exons used demonstrated that an unexpectedly large number of characters is necessary to obtain robust phylogenies. Previous studies supporting Ecdysozoa could not guarantee an accurate phylogeny because the number of characters used was clearly below the minimum required.

%B Bioinformatics %V 20 Suppl 1 %P i116-21 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15262789 %0 Journal Article %J Cell Mol Life Sci %D 2004 %T Prediction of protein function and pathways in the genome era %A Gabaldón, T. %A M. A. Huynen %K ATP-Binding Cassette Transporters/genetics/metabolism Amino Acid Sequence Animals Artificial Gene Fusion Base Sequence Chaperonins/genetics/metabolism Chromosomes/genetics/metabolism Evolution %K Molecular *Genome Genomics Humans Molecular Sequence Data Phylogeny *Proteins/classification/genetics/metabolism RNA %K Ribosomal/metabolism Sequence Alignment %X The growing number of completely sequenced genomes adds new dimensions to the use of sequence analysis to predict protein function. Compared with the classical knowledge transfer from one protein to a similar sequence (homology-based function prediction), knowledge about the corresponding genes in other genomes (orthology-based function prediction) provides more specific information about the protein’s function, while the analysis of the sequence in its genomic context (context-based function prediction) provides information about its functional context. Whereas homology-based methods predict the molecular function of a protein, genomic context methods predict the biological process in which it plays a role. These complementary approaches can be combined to elucidate complete functional networks and biochemical pathways from the genome sequence of an organism. Here we review recent advances in the field of genomic-context based methods of protein function prediction. Techniques are highlighted with examples, including an analysis that combines information from genomic-context with homology to predict a role of the RNase L inhibitor in the maturation of ribosomal RNA. 
%B Cell Mol Life Sci %V 61 %P 930-44 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15095013 %0 Journal Article %J Nucleic Acids Res %D 2004 %T PupaSNP Finder: a web tool for finding SNPs with putative effect at transcriptional level %A L. Conde %A Vaquerizas, J. M. %A J. Santoyo %A Fatima Al-Shahrour %A Ruiz-Llorente, S. %A M. Robledo %A Dopazo, J. %K Amino Acid Substitution Binding Sites Humans Internet Phenotype *Polymorphism %K Genetic %K Single Nucleotide RNA Splicing *Software Transcription Factors/metabolism *Transcription %X We have developed a web tool, PupaSNP Finder (PupaSNP for short), for high-throughput searching for single nucleotide polymorphisms (SNPs) with potential phenotypic effect. PupaSNP takes as its input lists of genes (or generates them from chromosomal coordinates) and retrieves SNPs that could affect the conserved regions that the cellular machinery uses for the correct processing of genes (intron/exon boundaries or exonic splicing enhancers), predicted transcription factor binding sites (TFBS) and changes in amino acids in the proteins. The program uses the mapping of SNPs in the genome provided by Ensembl. Additionally, user-defined SNPs (not yet mapped in the genome) can be easily provided to the program. Also, additional functional information from Gene Ontology, OMIM and homologies in other model organisms is provided. In contrast to other programs already available, which focus only on SNPs with possible effect in the protein, PupaSNP includes SNPs with possible transcriptional effect. PupaSNP will be of significant help in studies of multifactorial disorders, where the use of functional SNPs will increase the sensitivity of identification of the genes responsible for the disease. The PupaSNP web interface is accessible through http://pupasnp.bioinfo.cnio.es. 
%B Nucleic Acids Res %V 32 %P W242-8 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15215388 %0 Journal Article %J Biochim Biophys Acta %D 2004 %T Shaping the mitochondrial proteome %A Gabaldón, T. %A M. A. Huynen %K Animals Biological Transport Energy Metabolism Eukaryotic Cells/physiology *Evolution Humans Mitochondria/*physiology Phylogeny Proteome/*physiology %X Mitochondria are eukaryotic organelles that originated from a single bacterial endosymbiosis some 2 billion years ago. The transition from the ancestral endosymbiont to the modern mitochondrion has been accompanied by major changes in its protein content, the so-called proteome. These changes included complete loss of some bacterial pathways, amelioration of others and gain of completely new complexes of eukaryotic origin such as the ATP/ADP translocase and most of the mitochondrial protein import machinery. This renewal of proteins has been so extensive that only 14-16% of modern mitochondrial proteome has an origin that can be traced back to the bacterial endosymbiont. The rest consists of proteins of diverse origin that were eventually recruited to function in the organelle. This shaping of the proteome content reflects the transformation of mitochondria into a highly specialized organelle that, besides ATP production, comprises a variety of functions within the eukaryotic metabolism. Here we review recent advances in the fields of comparative genomics and proteomics that are throwing light on the origin and evolution of the mitochondrial proteome. %B Biochim Biophys Acta %V 1659 %P 212-20 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15576054 %0 Journal Article %J Cancer Res %D 2004 %T Structure-based assessment of missense mutations in human BRCA1: implications for breast and ovarian cancer predisposition %A Mirkovic, N. %A M. A. Marti-Renom %A Weber, B. L. %A Sali, A. %A Monteiro, A. N. 
%K BRCA1 Genetic Predisposition to Disease Humans *Mutation %K BRCA1 Protein/*chemistry/genetics Breast Neoplasms/*genetics Female *Genes %K Missense Ovarian Neoplasms/*genetics Pedigree Protein Conformation Structure-Activity Relationship Transcriptional Activation %X The BRCA1 gene from individuals at risk of breast and ovarian cancers can be screened for the presence of mutations. However, the cancer association of most alleles carrying missense mutations is unknown, thus creating significant problems for genetic counseling. To increase our ability to identify cancer-associated mutations in BRCA1, we set out to use the principles of protein three-dimensional structure as well as the correlation between the cancer-associated mutations and those that abolish transcriptional activation. Thirty-one of 37 missense mutations of known impact on the transcriptional activation function of BRCA1 are readily rationalized in structural terms. Loss-of-function mutations involve nonconservative changes in the core of the BRCA1 C-terminus (BRCT) fold or are localized in a groove that presumably forms a binding site involved in the transcriptional activation by BRCA1; mutations that do not abolish transcriptional activation are either conservative changes in the core or are on the surface outside of the putative binding site. Next, structure-based rules for predicting functional consequences of a given missense mutation were applied to 57 germ-line BRCA1 variants of unknown cancer association. Such a structure-based approach may be helpful in an integrated effort to identify mutations that predispose individuals to cancer. %B Cancer Res %V 64 %P 3790-7 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15172985 %0 Journal Article %J Comp Funct Genomics %D 2003 %T An approach to inferring transcriptional regulation among genes from large-scale expression data %A Herrero, J. %A Diaz-Uriarte, R. %A Dopazo, J. 
%X The use of DNA microarrays opens up the possibility of measuring the expression levels of thousands of genes simultaneously under different conditions. Time-course experiments allow researchers to study the dynamics of gene interactions. The inference of genetic networks from such measures can give important insights for the understanding of a variety of biological problems. Most of the existing methods for genetic network reconstruction require many experimental data points, or can only be applied to the reconstruction of small subnetworks. Here we present a method that reduces the dimensionality of the dataset and then extracts the significant dynamic correlations among genes. The method requires a number of points achievable in common time-course experiments. %B Comp Funct Genomics %V 4 %P 148-54 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18629097 %0 Journal Article %J Genome Res %D 2003 %T Comparing bacterial genomes through conservation profiles %A Martin, M. J. %A Herrero, J. %A A. Mateos %A Dopazo, J. %K Bacterial Genotype Models %K Bacterial/genetics Cluster Analysis Conserved Sequence/*genetics DNA %K Bacterial/genetics Escherichia coli/classification/*genetics Evolution %K Bacterial/genetics Gene Order/genetics Genes %K Bacterial/genetics/physiology *Genome %K Chromosome Mapping/methods Chromosomes %K Genetic Phenotype Phylogeny Sequence Homology %K Molecular Gene Expression Profiling/methods Gene Expression Regulation %K Nucleic Acid Species Specificity Terminology as Topic %X We constructed two-dimensional representations of profiles of gene conservation across different genomes using the genome of Escherichia coli as a model. These profiles permit both the visualization at the genome level of different traits in the organism studied and, at the same time, reveal features related to the genomes analyzed (such as defective genomes or genomes that lack a particular system). 
Conserved genes are not uniformly distributed along the E. coli genome but tend to cluster together. The study of gene distribution patterns across genomes is important for the understanding of how sets of genes seem to be dependent on each other, probably having some functional link. This provides additional evidence that can be used for the elucidation of the function of unannotated genes. Clustering these patterns produces families of genes which can be arranged in a hierarchy of closeness. In this way, functions can be defined at different levels of generality depending on the level of the hierarchy that is studied. The combined study of conservation and phenotypic traits opens up the possibility of defining phenotype/genotype associations, and ultimately inferring the gene or genes responsible for a particular trait. %B Genome Res %V 13 %P 991-8 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12695324 %0 Journal Article %J Nucleic Acids Res %D 2003 %T EVA: Evaluation of protein structure prediction servers %A Koh, I. Y. %A Eyrich, V. A. %A M. A. Marti-Renom %A Przybylski, D. %A Madhusudhan, M. S. %A Eswar, N. %A Grana, O. %A Pazos, F. %A Valencia, A. %A Sali, A. %A Rost, B. %K Automation Databases %K Protein %K Protein Internet *Protein Conformation Protein Folding Protein Structure %K Protein Structural Homology %K Secondary Proteins/chemistry Reproducibility of Results *Sequence Analysis %X EVA (http://cubic.bioc.columbia.edu/eva/) is a web server for evaluation of the accuracy of automated protein structure prediction methods. The evaluation is updated automatically each week, to cope with the large number of existing prediction servers and the constant changes in the prediction methods. EVA currently assesses servers for secondary structure prediction, contact prediction, comparative protein structure modelling and threading/fold recognition. 
Every day, sequences of newly available protein structures in the Protein Data Bank (PDB) are sent to the servers and their predictions are collected. The predictions are then compared to the experimental structures once a week; the results are published on the EVA web pages. Over time, EVA has accumulated prediction results for a large number of proteins, ranging from hundreds to thousands, depending on the prediction method. This large sample assures that methods are compared reliably. As a result, EVA provides useful information to developers as well as users of prediction methods. %B Nucleic Acids Res %V 31 %P 3311-5 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12824315 %0 Journal Article %J J Biol Chem %D 2003 %T Examining the role of glutamic acid 183 in chloroperoxidase catalysis %A Yi, X. %A A. Conesa %A Punt, P. J. %A Hager, L. P. %K Aspergillus niger/metabolism Catalase/metabolism Catalysis Chloride Peroxidase/*chemistry/*metabolism Chlorine/metabolism Chromatography %K Ion Exchange Circular Dichroism Crystallography %K Polyacrylamide Gel Fungi/enzymology Glutamic Acid/*chemistry Histidine/chemistry/metabolism Hydrogen-Ion Concentration Immunoblotting Isoelectric Focusing Mutation Oxidoreductases/metabolism Plasmids/metabolism %K X-Ray Electrophoresis %X Site-directed mutagenesis has been used to investigate the role of glutamic acid 183 in chloroperoxidase catalysis. Based on the x-ray crystallographic structure of chloroperoxidase, Glu-183 is postulated to function on distal side of the heme prosthetic group as an acid-base catalyst in facilitating the reaction between the peroxidase and hydrogen peroxide with the formation of Compound I. In contrast, the other members of the heme peroxidase family use a histidine residue in this role. Plasmids have now been constructed in which the codon for Glu-183 is replaced with a histidine codon. 
The mutant recombinant gene has been expressed in Aspergillus niger. An analysis of the produced mutant gene shows that the substitution of Glu-183 with a His residue is detrimental to the chlorination and dismutation activity of chloroperoxidase. The activity is reduced by 85 and 50% of wild type activity, respectively. However, quite unexpectedly, the epoxidation activity of the mutant enzyme is significantly enhanced approximately 2.5-fold. These results show that Glu-183 is important but not essential for the chlorination activity of chloroperoxidase. It is possible that the increased epoxidation of the mutant enzyme is based on an increase in the hydrophobicity of the active site. %B J Biol Chem %V 278 %P 13855-9 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12576477 %0 Journal Article %J Bioinformatics %D 2003 %T Gene expression data preprocessing %A Herrero, J. %A Diaz-Uriarte, R. %A Dopazo, J. %K *Database Management Systems Gene Expression Profiling/*methods Information Storage and Retrieval/methods Internet Oligonucleotide Array Sequence Analysis/*methods Sequence Alignment/*methods Sequence Analysis %K DNA/*methods *Software *User-Computer Interface %X We present an interactive web tool for preprocessing microarray gene expression data. It analyses the data, suggests the most appropriate transformations and proceeds with them after user agreement. The normal preprocessing steps include scale transformations, management of missing values, replicate handling, flat pattern filtering and pattern standardization and they are required before performing any pattern analysis. The processed data set can be sent to other pattern analysis tools. 
%B Bioinformatics %V 19 %P 655-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12651726 %0 Journal Article %J Nucleic Acids Res %D 2003 %T GEPAS: A web-based resource for microarray gene expression data analysis %A Herrero, J. %A Fatima Al-Shahrour %A Diaz-Uriarte, R. %A A. Mateos %A Vaquerizas, J. M. %A J. Santoyo %A Dopazo, J. %K gepas %K microarray data analysis %X

We present a web-based pipeline for microarray gene expression profile analysis, GEPAS, which stands for Gene Expression Profile Analysis Suite (http://gepas.bioinfo.cnio.es). GEPAS is composed of different interconnected modules which include tools for data pre-processing, two-conditions comparison, unsupervised and supervised clustering (which include some of the most popular methods as well as home made algorithms) and several tests for differential gene expression among different classes, continuous variables or survival analysis. A multiple purpose tool for data mining, based on Gene Ontology, is also linked to the tools, which constitutes a very convenient way of analysing clustering results. On-line tutorials are available from our main web server (http://bioinfo.cnio.es).

%B Nucleic Acids Res %V 31 %P 3461-7 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12824345 %0 Journal Article %J Bull Math Biol %D 2003 %T A model for the emergence of adaptive subsystems %A H. Dopazo %A Gordon, M. B. %A Perazzo, R. %A Risau-Gusman, S. %K *Adaptation %K Biological Algorithms Alleles Animals Evolution Genotype Humans *Learning *Models %K Genetic Models %K Statistical Neural Networks (Computer) Phenotype Synapses/genetics %X We investigate the interaction of learning and evolution in a changing environment. A stable learning capability is regarded as an emergent adaptive system evolved by natural selection of genetic variants. We consider the evolution of an asexual population. Each genotype can have ’fixed’ and ’flexible’ alleles. The former express themselves as synaptic connections that remain unchanged during ontogeny and the latter as synapses that can be adjusted through a learning algorithm. Evolution is modelled using genetic algorithms and the changing environment is represented by two optimal synaptic patterns that alternate a fixed number of times during the ’life’ of the individuals. The amplitude of the change is related to the Hamming distance between the two optimal patterns and the rate of change to the frequency with which both exchange roles. This model is an extension of that of Hinton and Nowlan in which the fitness is given by a probabilistic measure of the Hamming distance to the optimum. We find that two types of evolutionary pathways are possible depending upon how difficult (costly) it is to cope with the changes of the environment. In one case the population loses the learning ability, and the individuals inherit fixed synapses that are optimal in only one of the environmental states. In the other case a flexible subsystem emerges that allows the individuals to adapt to the changes of the environment. 
The model helps us to understand how an adaptive subsystem can emerge as the result of the tradeoff between the exploitation of a congenital structure and the exploration of the adaptive capabilities practised by learning. %B Bull Math Biol %V 65 %P 27-56 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12597115 %0 Journal Article %J Bioinformatics %D 2003 %T ModView, visualization of multiple protein sequences and structures %A Ilyin, V. A. %A Pieper, U. %A Stuart, A. C. %A M. A. Marti-Renom %A McMahan, L. %A Sali, A. %K *Database Management Systems Documentation/methods Imaging %K Protein/*methods *User-Computer Interface %K Three-Dimensional/methods Protein Conformation Proteins/*chemistry/genetics Sequence Alignment/*methods Sequence Analysis %X SUMMARY: We describe ModView, a web application for visualization of multiple protein sequences and structures. ModView integrates a multiple structure viewer, a multiple sequence alignment editor, and a database querying engine. It is possible to interactively manipulate hundreds of proteins, to visualize conservative and variable residues, active and binding sites, fragments, and domains in protein families, as well as to display large macromolecular complexes such as ribosomes or viruses. As a Netscape plug-in, ModView can be included in HTML pages along with text and figures, which makes it useful for teaching and presentations. ModView is also suitable as a graphical interface to various databases because it can be controlled through JavaScript commands and called from CGI scripts. AVAILABILITY: ModView is available at http://guitar.rockefeller.edu/modview. %B Bioinformatics %V 19 %P 165-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12499313 %0 Journal Article %J Science %D 2003 %T Reconstruction of the proto-mitochondrial metabolism %A Gabaldón, T. %A M. A. 
Huynen %K Aerobiosis Algorithms Alphaproteobacteria/chemistry/genetics/*metabolism Amino Acids/metabolism Animals Bacterial Proteins/chemistry/*metabolism Genome Genome %K Bacterial Glycerol/metabolism Humans Lipid Metabolism Mitochondria/chemistry/genetics/*metabolism Phylogeny *Proteome Symbiosis Yeasts/metabolism %B Science %V 301 %P 609 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12893934 %0 Journal Article %J Nucleic Acids Res %D 2003 %T Tools for comparative protein structure modeling and analysis %A Eswar, N. %A John, B. %A Mirkovic, N. %A Fiser, A. %A Ilyin, V. A. %A Pieper, U. %A Stuart, A. C. %A M. A. Marti-Renom %A Madhusudhan, M. S. %A Yerkovich, B. %A Sali, A. %K Amino Acid *Software *Structural Homology %K Internet Models %K Molecular Protein Folding Proteins/chemistry Reproducibility of Results Sequence Alignment Sequence Homology %K Protein Systems Integration %X The following resources for comparative protein structure modeling and analysis are described (http://salilab.org): MODELLER, a program for comparative modeling by satisfaction of spatial restraints; MODWEB, a web server for automated comparative modeling that relies on PSI-BLAST, IMPALA and MODELLER; MODLOOP, a web server for automated loop modeling that relies on MODELLER; MOULDER, a CPU intensive protocol of MODWEB for building comparative models based on distant known structures; MODBASE, a comprehensive database of annotated comparative models for all sequences detectably related to a known structure; MODVIEW, a Netscape plugin for Linux that integrates viewing of multiple sequences and structures; and SNPWEB, a web server for structure-based prediction of the functional impact of a single amino acid substitution. 
%B Nucleic Acids Res %V 31 %P 3375-80 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12824331 %0 Book Section %B Microarray data analysis III %D 2003 %T Use of GO Terms to Understand the Biological Significance of Microarray Differential Gene Expression Data %A Díaz-Uriarte, R %A Fatima Al-Shahrour %A Dopazo, J. %B Microarray data analysis III %I Kluwer Academic, K. F. Johnson and S. M. Lin %P 233-247 %G eng %0 Book Section %B Neural Networks for Signal Processing XIII %D 2003 %T Using Gene Ontology on genome-scale studies to find significant associations of biologically relevant terms to group of genes %A Fatima Al-Shahrour %A Herrero, J. %A A. Mateos %A J. Santoyo %A Díaz-Uriarte, R %A Dopazo, J. %K babelomics %B Neural Networks for Signal Processing XIII %I IEEE Press %C New York, USA %P 43-52 %G eng %0 Journal Article %J J Biotechnol %D 2002 %T Bioinformatics methods for the analysis of expression arrays: data clustering and information extraction %A J. Tamames %A Clark, D. %A Herrero, J. %A Dopazo, J. %A Blaschke, C. %A Fernandez, J. M. %A Oliveros, J. C. %A Valencia, A. %K Abstracting and Indexing as Topic/methods *Cluster Analysis *Database Management Systems Databases %K Computer-Assisted/methods Information Storage and Retrieval/*methods Internet Medline National Library of Medicine (U.S.) Oligonucleotide Array Sequence Analysis/*methods United States %K Genetic Gene Expression Gene Expression Profiling/*methods Image Processing %X Expression arrays facilitate the monitoring of changes in the expression patterns of large collections of genes. The analysis of expression array data has become a computationally-intensive task that requires the development of bioinformatics technology for a number of key stages in the process, such as image analysis, database storage, gene clustering and information extraction. 
Here, we review the current trends in each of these areas, with particular emphasis on the development of the related technology being carried out within our groups. %B J Biotechnol %V 98 %P 269-83 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12141992 %0 Journal Article %J Appl Environ Microbiol %D 2002 %T Calnexin overexpression increases manganese peroxidase production in Aspergillus niger %A A. Conesa %A Jeenes, D. %A Archer, D. B. %A van den Hondel, C. A. %A Punt, P. J. %K Aspergillus niger/*enzymology/genetics Calcium-Binding Proteins/*metabolism Calnexin Culture Media *Fungal Proteins HSP70 Heat-Shock Proteins/metabolism Heme/metabolism Peroxidases/*biosynthesis/genetics Phanerochaete/enzymology/genetics Transformation %K Genetic %X Heme-containing peroxidases from white rot basidiomycetes, in contrast to most proteins of fungal origin, are poorly produced in industrial filamentous fungal strains. Factors limiting peroxidase production are believed to operate at the posttranslational level. In particular, insufficient availability of the prosthetic group which is required for peroxidase biosynthesis has been proposed to be an important bottleneck. In this work, we analyzed the role of two components of the secretion pathway, the chaperones calnexin and binding protein (BiP), in the production of a fungal peroxidase. Expression of the Phanerochaete chrysosporium manganese peroxidase (MnP) in Aspergillus niger resulted in an increase in the expression level of the clxA and bipA genes. In a heme-supplemented medium, where MnP was shown to be overproduced to higher levels, induction of clxA and bipA was also higher. Overexpression of these two chaperones in an MnP-producing strain was analyzed for its effect on MnP production. Whereas bipA overexpression seriously reduced MnP production, overexpression of calnexin resulted in a four- to fivefold increase in the extracellular MnP levels. 
However, when additional heme was provided in the culture medium, calnexin overexpression had no synergistic effect on MnP production. The possible function of these two chaperones in MnP maturation and production is discussed. %B Appl Environ Microbiol %V 68 %P 846-51 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11823227 %0 Journal Article %J J Proteome Res %D 2002 %T Combining hierarchical clustering and self-organizing maps for exploratory analysis of gene expression patterns %A Herrero, J. %A Dopazo, J. %K Cluster Analysis Computational Biology/methods *Gene Expression Genes %K Fungal/genetics *Genome Oligonucleotide Array Sequence Analysis/*methods Statistics as Topic/*methods Time Factors %X Self-organizing maps (SOM) constitute an alternative to classical clustering methods because of its linear run times and superior performance to deal with noisy data. Nevertheless, the clustering obtained with SOM is dependent on the relative sizes of the clusters. Here, we show how the combination of SOM with hierarchical clustering methods constitutes an excellent tool for exploratory analysis of massive data like DNA microarray expression patterns. %B J Proteome Res %V 1 %P 467-70 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12645919 %0 Journal Article %J Trends Biotechnol %D 2002 %T Filamentous fungi as cell factories for heterologous protein production %A Punt, P. J. %A van Biezen, N. %A A. Conesa %A Albers, A. %A Mangnus, J. %A van den Hondel, C. %K Fermentation/genetics/physiology Fungi/*genetics/*metabolism Humans Interleukin-6/analysis/*biosynthesis/genetics Peroxidases/analysis/*biosynthesis/genetics Protein Conformation Recombinant Proteins/analysis/*biosynthesis/genetics %X Filamentous fungi have been used as sources of metabolites and enzymes for centuries. 
For about two decades, molecular genetic tools have enabled us to use these organisms to express extra copies of both endogenous and exogenous genes. This review of current practice reveals that molecular tools have enabled several new developments. But it has been process development that has driven the final breakthrough to achieving commercially relevant quantities of protein. Recent research into gene expression in filamentous fungi has explored their wealth of genetic diversity with a view to exploiting them as expression hosts and as a source of new genes. Inevitably, the progress in the ’genomics’ technology will further develop high-throughput technologies for these organisms. %B Trends Biotechnol %V 20 %P 200-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11943375 %0 Journal Article %J J Biotechnol %D 2002 %T Fungal peroxidases: molecular aspects and applications %A A. Conesa %A Punt, P. J. %A van den Hondel, C. A. %K Amino Acid Sequence Binding Sites Biotechnology Catalysis Fungi/*enzymology Molecular Sequence Data Peroxidases/chemistry/*genetics/metabolism Recombinant Proteins Sequence Homology Substrate Specificity %X Peroxidases are oxidoreductases that utilize hydrogen peroxide to catalyze oxidative reactions. A large number of peroxidases have been identified in fungal species and are being characterized at the molecular level. In this manuscript we review the current knowledge on the molecular aspects of this type of enzymes. We present an overview of the research efforts undertaken in deciphering the structural basis of the catalytic properties of fungal peroxidases and discuss molecular genetics and protein homology aspects of this enzyme class. Finally, we summarize the potential biotechnological applications of these enzymes and evaluate recent advances on their expression in heterologous systems for production purposes. 
%B J Biotechnol %V 93 %P 143-58 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11738721 %0 Journal Article %J Am J Pathol %D 2002 %T Identification of genes involved in resistance to interferon-alpha in cutaneous T-cell lymphoma %A Tracey, L. %A Villuendas, R. %A Ortiz, P. %A Dopazo, A. %A Spiteri, I. %A Lombardia, L. %A Rodriguez-Peralto, J. L. %A Fernandez-Herrera, J. %A Hernandez, A. %A Fraga, J. %A Dominguez, O. %A Herrero, J. %A Alonso, M. A. %A Dopazo, J. %A Piris, M. A. %K Antineoplastic Agents/*pharmacology/therapeutic use Carrier Proteins/biosynthesis/genetics DNA-Binding Proteins/biosynthesis/genetics Drug Resistance %K Biological Oligonucleotide Array Sequence Analysis RNA %K Cultured %K Cutaneous/diagnosis/drug therapy/*genetics/metabolism *Membrane Glycoproteins Models %K Interleukin-1 Reproducibility of Results STAT1 Transcription Factor STAT3 Transcription Factor Trans-Activators/biosynthesis/genetics Tumor Cells %K Neoplasm Gene Expression Profiling *Gene Expression Regulation %K Neoplasm/biosynthesis *Receptors %K Neoplastic Humans Interferon-alpha/*pharmacology/therapeutic use Kinetics Lymphoma %K T-Cell %X Interferon-alpha therapy has been shown to be active in the treatment of mycosis fungoides although the individual response to this therapy is unpredictable and dependent on essentially unknown factors. In an effort to better understand the molecular mechanisms of interferon-alpha resistance we have developed an interferon-alpha resistant variant from a sensitive cutaneous T-cell lymphoma cell line. We have performed expression analysis to detect genes differentially expressed between both variants using a cDNA microarray including 6386 cancer-implicated genes. The experiments showed that resistance to interferon-alpha is consistently associated with changes in the expression of a set of 39 genes, involved in signal transduction, apoptosis, transcription regulation, and cell growth. 
Additional studies performed confirm that STAT1 and STAT3 expression and interferon-alpha induction and activation are not altered between both variants. The gene MAL, highly overexpressed by resistant cells, was also found to be expressed by tumoral cells in a series of cutaneous T-cell lymphoma patients treated with interferon-alpha and/or photochemotherapy. MAL expression was associated with longer time to complete remission. Time-course experiments of the sensitive and resistant cells showed a differential expression of a subset of genes involved in interferon-response (1 to 4 hours), cell growth and apoptosis (24 to 48 hours), and signal transduction. %B Am J Pathol %V 161 %P 1825-37 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12414529 %0 Book %D 2002 %T Methods of Microarray Data Analysis II: Supervised Neural Networks for Clustering Conditions in DNA Array Data After Reducing Noise by Clustering Gene Expression Profiles %A Mateos, Alvaro %A Herrero, Javier %A Tamames, Javier %A Dopazo, Joaquin %E Lin, Simon M. %E Johnson, Kimberly F. %I Kluwer Academic Publishers %C Boston %P 91-103 %G eng %U http://www.springerlink.com/index/10.1007/b112982 http://link.springer.com/10.1007/0-306-47598-7_7 http://www.springerlink.com/index/pdf/10.1007/0-306-47598-7_7 %R 10.1007/b112982 10.1007/0-306-47598-7_7 %0 Book Section %B Microarray data analysis II %D 2002 %T Microarray Data Processing And Analysis %A Dopazo, J. %B Microarray data analysis II %I Kluwer Academic %P 43-63 %G eng %0 Journal Article %J Structure %D 2002 %T Reliability of assessment of protein structure prediction methods %A M. A. Marti-Renom %A Madhusudhan, M. S. %A Fiser, A. %A Rost, B. %A Sali, A. %K *Computer Simulation Humans *Models %K Molecular *Protein Conformation Proteins/*chemistry Reproducibility of Results %X

The reliability of ranking of protein structure modeling methods is assessed. The assessment is based on the parametric Student’s t test and the nonparametric Wilcoxon signed rank test of statistical significance of the difference between paired samples. The approach is applied to the ranking of the comparative modeling methods tested at the fourth meeting on Critical Assessment of Techniques for Protein Structure Prediction (CASP). It is shown that the 14 CASP4 test sequences may not be sufficient to reliably distinguish between the top eight methods, given the model quality differences and their standard deviations. We suggest that CASP needs to be supplemented by an assessment of protein structure prediction methods that is automated, continuous in time, based on several criteria applied to a large number of models, and with quantitative statistical reliability assigned to each characterization.

%B Structure %V 10 %P 435-40 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12005441 %0 Book Section %B Microarray data analysis II %D 2002 %T Supervised Neural Networks For Clustering Conditions In DNA Array Data After Reducing Noise By Clustering Gene Expression Profiles %A A. Mateos %A Herrero, J. %A J. Tamames %A Dopazo, J. %B Microarray data analysis II %I Kluwer Academic %P 91-103 %G eng %0 Journal Article %J Genome Res %D 2002 %T Systematic learning of gene functional classes from DNA array expression data by using multilayer perceptrons %A A. Mateos %A Dopazo, J. %A Jansen, R. %A Tu, Y. %A Gerstein, M. %A Stolovitzky, G. %K Algorithms Artificial Intelligence Citric Acid Cycle/genetics Cluster Analysis Computational Biology/methods Gene Expression Profiling/*methods/statistics & numerical data Genes/*physiology Genetic Heterogeneity Neural Networks (Computer) Oligonucleotide %X Recent advances in microarray technology have opened new ways for functional annotation of previously uncharacterised genes on a genomic scale. This has been demonstrated by unsupervised clustering of co-expressed genes and, more importantly, by supervised learning algorithms. Using prior knowledge, these algorithms can assign functional annotations based on more complex expression signatures found in existing functional classes. Previously, support vector machines (SVMs) and other machine-learning methods have been applied to a limited number of functional classes for this purpose. Here we present, for the first time, the comprehensive application of supervised neural networks (SNNs) for functional annotation. Our study is novel in that we report systematic results for 100 classes in the Munich Information Center for Protein Sequences (MIPS) functional catalog. We found that only 10% of these are learnable (based on the rate of false negatives). 
A closer analysis reveals that false positives (and negatives) in a machine-learning context are not necessarily "false" in a biological sense. We show that the high degree of interconnections among functional classes confounds the signatures that ought to be learned for a unique class. We term this the "Borges effect" and introduce two new numerical indices for its quantification. Our analysis indicates that classification systems with a lower Borges effect are better suited for machine learning. Furthermore, we introduce a learning procedure for combining false positives with the original class. We show that in a few iterations this process converges to a gene set that is learnable with considerably low rates of false positives and negatives and contains genes that are biologically related to the original class, allowing for a coarse reconstruction of the interactions between associated biological pathways. We exemplify this methodology using the well-studied tricarboxylic acid cycle. %B Genome Res %V 12 %P 1703-15 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12421757 %0 Conference Paper %B Neural Networks for Signal Processing XII. 2002 IEEE Signal Processing Society Workshop; Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing %D 2002 %T Unsupervised reduction of the dimensionality followed by supervised learning with a perceptron improves the classification of conditions in DNA microarray gene expression data %A Conde, L. %A Mateos, A. %A Herrero, J. %A Dopazo, J. %B Neural Networks for Signal Processing XII.
2002 IEEE Signal Processing Society Workshop; Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing %I IEEE %C Martigny, Switzerland %G eng %U http://ieeexplore.ieee.org/document/1030019/ http://xplorestaging.ieee.org/ielx5/8007/22134/01030019.pdf?arnumber=1030019 %R 10.1109/NNSP.2002.1030019 %0 Journal Article %J J Immunol %D 2002 %T Use of single point mutations in domain I of beta 2-glycoprotein I to determine fine antigenic specificity of antiphospholipid autoantibodies %A Iverson, G. M. %A Reddel, S. %A Victoria, E. J. %A Cockerill, K. A. %A Wang, Y. X. %A M. A. Marti-Renom %A Sali, A. %A Marquis, D. M. %A Krilis, S. A. %A Linnik, M. D. %K Amino Acid Substitution/genetics Antibodies %K Antibody/genetics Binding %K Antiphospholipid/blood/*metabolism Antibodies %K Competitive/genetics/immunology Enzyme-Linked Immunosorbent Assay/methods Epitopes/analysis/*immunology/metabolism Glycine/genetics Glycoproteins/biosynthesis/*genetics/*immunology/isolation & purification/metabolism Humans Models %K Molecular Peptide Fragments/genetics/immunology/isolation & purification/metabolism *Point Mutation Protein Structure %K Monoclonal/blood/metabolism Antiphospholipid Syndrome/immunology Arginine/genetics *Binding Sites %K Tertiary/genetics Recombinant Proteins/biosynthesis/immunology/isolation & purification/metabolism Static Electricity beta 2-Glycoprotein I %X Autoantibodies against beta(2)-glycoprotein I (beta(2)GPI) appear to be a critical feature of the antiphospholipid syndrome (APS). As determined using domain deletion mutants, human autoantibodies bind to the first of five domains present in beta(2)GPI. In this study the fine detail of the domain I epitope has been examined using 10 selected mutants of whole beta(2)GPI containing single point mutations in the first domain. The binding to beta(2)GPI was significantly affected by a number of single point mutations in domain I, particularly by mutations in the region of aa 40-43. 
Molecular modeling predicted these mutations to affect the surface shape and electrostatic charge of a facet of domain I. Mutation K19E also had an effect, albeit one less severe and involving fewer patients. Similar results were obtained in two different laboratories using affinity-purified anti-beta(2)GPI in a competitive inhibition ELISA and with whole serum in a direct binding ELISA. This study confirms that anti-beta(2)GPI autoantibodies bind to domain I, and that the charged surface patch defined by residues 40-43 contributes to a dominant target epitope. %B J Immunol %V 169 %P 7097-103 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=12471146 %0 Book Section %B ICANN 2002, LNCS 2415 %D 2002 %T Using perceptrons for supervised classification of DNA microarray samples: obtaining the optimal level of information and finding differentially expressed genes %A A. Mateos %A Herrero, J. %A Dopazo, J. %B ICANN 2002, LNCS 2415 %I J.R. Dorronsoro %P 577-582 %G eng %0 Journal Article %J Microb Drug Resist %D 2001 %T Annotated draft genomic sequence from a Streptococcus pneumoniae type 19F clinical isolate %A Dopazo, J. %A Mendoza, A. %A Herrero, J. %A Caldara, F. %A Humbert, Y. %A Friedli, L. %A Guerrier, M. %A Grand-Schenk, E. %A Gandin, C. %A de Francesco, M. %A Polissi, A. %A Buell, G. %A Feger, G. %A Garcia, E. %A Peitsch, M. %A Garcia-Bustos, J. F. %K Bacterial Molecular Sequence Data Pneumococcal Infections/*microbiology Prokaryotic Cells RNA %K Bacterial/chemistry/genetics Genes %K Bacterial/genetics *Genome %K DNA %K Transfer/metabolism Streptococcus pneumoniae/*genetics %X The public availability of numerous microbial genomes is enabling the analysis of bacterial biology in great detail and with an unprecedented, organism-wide and taxon-wide, broad scope. Streptococcus pneumoniae is one of the most important bacterial pathogens throughout the world. 
We present here sequences and functional annotations for 2.1-Mbp of pneumococcal DNA, covering more than 90% of the total estimated size of the genome. The sequenced strain is a clinical isolate resistant to macrolides and tetracycline. It carries a type 19F capsular locus, but multilocus sequence typing for several conserved genetic loci suggests that the strain sequenced belongs to a pneumococcal lineage that most often expresses a serotype 15 capsular polysaccharide. A total of 2,046 putative open reading frames (ORFs) longer than 100 amino acids were identified (average of 1,009 bp per ORF), including all described two-component systems and aminoacyl tRNA synthetases. Comparisons to other complete, or nearly complete, bacterial genomes were made and are presented in a graphical form for all the predicted proteins. %B Microb Drug Resist %V 7 %P 99-125 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11442348 %0 Journal Article %J J Comput Aided Mol Des %D 2001 %T Classification of protein disulphide-bridge topologies %A Mas, J. M. %A Aloy, P. %A M. A. Marti-Renom %A Oliva, B. %A de Llorens, R. %A Aviles, F. X. %A Querol, E. %K Algorithms Computer Simulation Databases as Topic Disulfides/*chemistry Models %K Molecular Protein Structure %K Secondary Protein Structure %K Tertiary Proteins/*chemistry/*classification Software %X The preferential occurrence of certain disulphide-bridge topologies in proteins has prompted us to design a method and a program, KNOT-MATCH, for their classification. The program has been applied to a database of proteins with less than 65% homology and more than two disulphide bridges. We have investigated whether there are topological preferences that can be used to group proteins and if these can be applied to gain insight into the structural or functional relationships among them. 
The classification has been performed by Density Search and Hierarchical Clustering Techniques, yielding thirteen main protein classes from the superimposition and clustering process. It is noteworthy that besides the disulphide bridges, regular secondary structures and loops frequently become correctly aligned. Although the lack of significant sequence similarity among some clustered proteins precludes the easy establishment of evolutionary relationships, the program permits us to find out important structural or functional residues upon the superimposition of two protein structures apparently unrelated. The derived classification can be very useful for finding relationships among proteins which would escape detection by current sequence or topology-based analytical algorithms. %B J Comput Aided Mol Des %V 15 %P 477-87 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11394740 %0 Journal Article %J FEBS Lett %D 2001 %T C-terminal propeptide of the Caldariomyces fumago chloroperoxidase: an intramolecular chaperone? %A A. Conesa %A Weelink, G. %A van den Hondel, C. A. %A Punt, P. J. %K Amino Acid Sequence Ascomycota/*enzymology/genetics Aspergillus niger/genetics Base Sequence Chloride Peroxidase/biosynthesis/*chemistry/genetics DNA Primers/genetics Enzyme Precursors/biosynthesis/chemistry/genetics Gene Expression Molecular Chaperones/b %X The Caldariomyces fumago chloroperoxidase (CPO) is synthesised as a 372-aa precursor which undergoes two proteolytic processing events: removal of a 21-aa N-terminal signal peptide and of a 52-aa C-terminal propeptide. The Aspergillus niger expression system developed for CPO was used to get insight into the function of this C-terminal propeptide. A. niger transformants expressing a CPO protein from which the C-terminal propeptide was deleted failed in producing any extracellular CPO activity, although the CPO polypeptide was synthesised. 
Expression of the full-length gene in an A. niger strain lacking the KEX2-like protease PclA also resulted in the production of CPO cross-reactive material into the culture medium, but no CPO activity. Based on these results, a function of the C-terminal propeptide in CPO maturation is indicated. %B FEBS Lett %V 503 %P 117-20 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11513866 %0 Journal Article %J Bioinformatics %D 2001 %T DBAli: a database of protein structure alignments %A M. A. Marti-Renom %A Ilyin, V. A. %A Sali, A. %K Computational Biology *Databases %K Protein Proteins/*chemistry/*genetics Sequence Alignment/*statistics & numerical data Software Software Design %X SUMMARY: The DBAli database includes approximately 35000 alignments of pairs of protein structures from SCOP (Lo Conte et al., Nucleic Acids Res., 28, 257-259, 2000) and CE (Shindyalov and Bourne, Protein Eng., 11, 739-747, 1998). DBAli is linked to several resources, including Compare3D (Shindyalov and Bourne, http://www.sdsc.edu/pb/software.htm, 1999) and ModView (Ilyin and Sali, http://guitar.rockefeller.edu/ModView/, 2001) for visualizing sequence alignments and structure superpositions. A flexible search of DBAli by protein sequence and structure properties allows construction of subsets of alignments suitable for a number of applications, such as benchmarking of sequence-sequence and sequence-structure alignment methods under a variety of conditions. AVAILABILITY: http://guitar.rockefeller.edu/DBAli/ %B Bioinformatics %V 17 %P 746-7 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11524379 %0 Journal Article %J Bioinformatics %D 2001 %T EVA: continuous automatic evaluation of protein structure prediction servers %A Eyrich, V. A. %A M. A. Marti-Renom %A Przybylski, D. %A Madhusudhan, M. S. %A Fiser, A. %A Pazos, F. %A Valencia, A. %A Sali, A. %A Rost, B. 
%K Automation Internet *Protein Conformation Proteins/*analysis *Software %X Evaluation of protein structure prediction methods is difficult and time-consuming. Here, we describe EVA, a web server for assessing protein structure prediction methods, in an automated, continuous and large-scale fashion. Currently, EVA evaluates the performance of a variety of prediction methods available through the internet. Every week, the sequences of the latest experimentally determined protein structures are sent to prediction servers, results are collected, performance is evaluated, and a summary is published on the web. EVA has so far collected data for more than 3000 protein chains. These results may provide valuable insight to both developers and users of prediction methods. AVAILABILITY: http://cubic.bioc.columbia.edu/eva. CONTACT: eva@cubic.bioc.columbia.edu %B Bioinformatics %V 17 %P 1242-3 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11751240 %0 Journal Article %J J Biol Chem %D 2001 %T Expression of the Caldariomyces fumago chloroperoxidase in Aspergillus niger and characterization of the recombinant enzyme %A A. Conesa %A van De Velde, F. %A van Rantwijk, F. %A Sheldon, R. A. %A van den Hondel, C. A. %A Punt, P. J. %K Aspergillus niger/enzymology/genetics Catalysis Chloride Peroxidase/biosynthesis/*genetics Fungal Proteins/biosynthesis/*genetics Recombinant Proteins/biosynthesis/genetics Substrate Specificity %X The Caldariomyces fumago chloroperoxidase was successfully expressed in Aspergillus niger. The recombinant enzyme was produced in the culture medium as an active protein and could be purified by a three-step purification procedure. The catalytic behavior of recombinant chloroperoxidase (rCPO) was studied and compared with that of native CPO. The specific chlorination activity (47 units/nmol) of rCPO and its pH optimum (pH 2.75) were very similar to those of native CPO. 
rCPO catalyzes the oxidation of various substrates in comparable yields and selectivities to native CPO. Indole was oxidized to 2-oxindole with 99% selectivity and thioanisole to the corresponding R-sulfoxide (enantiomeric excess >98%). Incorporation of (18)O from labeled H(2)18O(2) into the oxidized products was 100% in both cases. %B J Biol Chem %V 276 %P 17635-40 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11278701 %0 Journal Article %J Bioinformatics %D 2001 %T A hierarchical unsupervised growing neural network for clustering gene expression patterns %A Herrero, J. %A Valencia, A. %A Dopazo, J. %K *Algorithms Automatic Data Processing *Gene Expression Profiling *Neural Networks (Computer) *Oligonucleotide Array Sequence Analysis %X MOTIVATION: We describe a new approach to the analysis of gene expression data coming from DNA array experiments, using an unsupervised neural network. DNA array technologies allow monitoring thousands of genes rapidly and efficiently. One of the interests of these studies is the search for correlated gene expression patterns, and this is usually achieved by clustering them. The Self-Organising Tree Algorithm, (SOTA) (Dopazo,J. and Carazo,J.M. (1997) J. Mol. Evol., 44, 226-233), is a neural network that grows adopting the topology of a binary tree. The result of the algorithm is a hierarchical cluster obtained with the accuracy and robustness of a neural network. RESULTS: SOTA clustering confers several advantages over classical hierarchical clustering methods. SOTA is a divisive method: the clustering process is performed from top to bottom, i.e. the highest hierarchical levels are resolved before going to the details of the lowest levels. The growing can be stopped at the desired hierarchical level. Moreover, a criterion to stop the growing of the tree, based on the approximate distribution of probability obtained by randomisation of the original data set, is provided. 
By means of this criterion, a statistical support for the definition of clusters is proposed. In addition, obtaining average gene expression patterns is a built-in feature of the algorithm. Different neurons defining the different hierarchical levels represent the averages of the gene expression patterns contained in the clusters. Since SOTA runtimes are approximately linear with the number of items to be classified, it is especially suitable for dealing with huge amounts of data. The method proposed is very general and applies to any data providing that they can be coded as a series of numbers and that a computable measure of similarity between data items can be used. AVAILABILITY: A server running the program can be found at: http://bioinfo.cnio.es/sotarray. %B Bioinformatics %V 17 %P 126-36 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11238068 %0 Journal Article %J Vet Res %D 2001 %T Identification of optimal regions for phylogenetic studies on VP1 gene of foot-and-mouth disease virus: analysis of types A and O Argentinean viruses %A Nunez, J. I. %A Martin, M. J. %A Piccone, M. E. %A Carrillo, E. %A Palma, E. L. %A Dopazo, J. %A Sobrino, F. %K Amino Acid Sequence Animals Aphthovirus/classification/*genetics Base Sequence Capsid/chemistry/*genetics Capsid Proteins DNA %K Complementary/chemistry Molecular Sequence Data *Phylogeny Polymerase Chain Reaction RNA %K Viral/chemistry/genetics Serotyping Viral Proteins/analysis/*genetics %X An analysis of the informative content of sequence stretches on the foot-and-mouth disease virus (FMDV) VP1 gene was applied to two important viral serotypes: A and O. Several sequence regions were identified to allow the reconstruction of phylogenetic trees equivalent to those derived from the whole VP1 gene. The optimal informative regions for sequence windows of 150 to 250 nt were predicted between positions 250 and 550 of the gene. 
The sequences spanning the 250 nt of the 3’ end (positions 400 to 650), extensively used for FMDV phylogenetic analyses, showed a lower informative content. In spite of this, the use of sequences from this region allowed the derivation of phylogenetic trees for type A and type O FMDVs which showed topologies similar to those previously reported for the whole VP1 gene. When the sequences determined for viruses isolated in Argentina, between 1990 and 1993, were included in these analyses, the results obtained revealed features of the circulation of type A and type O viruses in the field, in the months that preceded the eradication of the disease in this country. Type A viruses were closely related to an Argentinean vaccine strain, and defined an independent cluster within this serotype. Among the type O viruses analysed, two groups were distinguished; one was closely related to the South American vaccine strains, while the other was grouped with viruses of the O3 subtype. In addition, a detailed phylogeny for type A FMDV is presented. %B Vet Res %V 32 %P 31-45 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11254175 %0 Journal Article %J J Immunol Methods %D 2001 %T Methods and approaches in the analysis of gene expression data %A Dopazo, J. %A Zanders, E. %A Dragoni, I. %A Amphlett, G. %A Falciani, F. %X

The application of high-density DNA array technology to monitor gene transcription has been responsible for a real paradigm shift in biology. The majority of research groups now have the ability to measure the expression of a significant proportion of the human genome in a single experiment, resulting in an unprecedented volume of data being made available to the scientific community. As a consequence of this, the storage, analysis and interpretation of this information present a major challenge. In the field of immunology the analysis of gene expression profiles has opened new areas of investigation. The study of cellular responses has revealed that cells respond to an activation signal with waves of co-ordinated gene expression profiles and that the components of these responses are the key to understanding the specific mechanisms which lead to phenotypic differentiation. The discovery of 'cell type specific' gene expression signatures has also helped the interpretation of the mechanisms leading to disease progression. Here we review the principles behind the most commonly used data analysis methods and discuss the approaches that have been employed in immunological research.

%B J Immunol Methods %V 250 %P 93-112 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11251224 %0 Journal Article %J Bull Math Biol %D 2001 %T A model for the interaction of learning and evolution %A H. Dopazo %A Gordon, M. B. %A Perazzo, R. %A Risau-Gusman, S. %K Algorithms Alleles Animals *Evolution Genotype Humans *Learning *Neural Networks (Computer) Numerical Analysis %K Computer-Assisted Phenotype Synapses/genetics %X We present a simple model in order to discuss the interaction of the genetic and behavioral systems throughout evolution. This considers a set of adaptive perceptrons in which some of their synapses can be updated through a learning process. This framework provides an extension of the well-known Hinton and Nowlan model by blending together some learning capability and other (rigid) genetic effects that contribute to the fitness. We find a halting effect in the evolutionary dynamics, in which the transcription of environmental data into genetic information is hindered by learning, instead of stimulated as is usually understood by the so-called Baldwin effect. The present results are discussed and compared with those reported in the literature. An interpretation is provided of the halting effect. %B Bull Math Biol %V 63 %P 117-34 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11146879 %0 Journal Article %J J Mol Evol %D 2001 %T Phylogenetic analysis of viroid and viroid-like satellite RNAs from plants: a reassessment %A Elena, S. F. %A Dopazo, J. %A de la Pena, M. %A Flores, R. %A Diener, T. O. %A Moya, A. 
%K Evolution %K Molecular *Phylogeny Plant Viruses/*genetics RNA %K Satellite/*genetics RNA %K Viral/genetics Viroids/*genetics %X The proposed monophyletic origin of a group of subviral plant pathogens (viroids and viroid-like satellite RNAs), as well as the phylogenetic relationships and the resulting taxonomy of these entities, has been recently questioned. The criticism comes from the (apparent) lack of sequence similarity among these RNAs necessary to reliably infer a phylogeny. Here we show that, despite their low overall sequence similarity, a sequence alignment manually adjusted to take into account all the local similarities and the insertions/deletions and duplications/rearrangements described in the literature for viroids and viroid-like satellite RNA, along with the use of an appropriate estimator of genetic distances, constitutes a data set suitable for a phylogenetic reconstruction. When the likelihood-mapping method was applied to this data set, the tree-likeness obtained was higher than that corresponding to a sequence alignment that does not take into consideration the local similarities. In addition, bootstrap analysis also supports the major groups previously proposed and the reconstruction is consistent with the biological properties of these RNAs. %B J Mol Evol %V 53 %P 155-9 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11479686 %0 Journal Article %J Fungal Genet Biol %D 2001 %T The secretion pathway in filamentous fungi: a biotechnological view %A A. Conesa %A Punt, P. J. %A van Luijk, N. %A van den Hondel, C. A. %K Animals Biotechnology/*methods Fungal Proteins/*genetics/*metabolism Fungi/*genetics/*metabolism Humans Recombinant Proteins/metabolism %X The high capacity of the secretion machinery of filamentous fungi has been widely exploited for the production of homologous and heterologous proteins; however, our knowledge of the fungal secretion pathway is still at an early stage. 
Most of the knowledge comes from models developed in yeast and higher eukaryotes, which have served as reference for the studies on fungal species. In this review we compile the data accumulated in recent years on the molecular basis of fungal secretion, emphasizing the relevance of these data for the biotechnological use of the fungal cell and indicating how this information has been applied in attempts to create improved production strains. We also present recent emerging approaches that promise to provide answers to fundamental questions on the molecular genetics of the fungal secretory pathway. %B Fungal Genet Biol %V 33 %P 155-71 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11495573