02826nas a2200517 4500008004100000022001400041245009100055210006900146260001600215490000700231520126600238653001901504653001001523653001101533653001901544653002201563653002501585653001801610653001101628653000901639653003601648653001001684653002801694100002001722700002201742700002801764700002901792700002001821700002001841700002801861700002301889700002001912700002101932700002101953700002601974700002102000700002602021700002402047700001902071700002002090700001702110700002502127700001902152700002502171856011202196 2024 eng d a2073-442500aCharacterization of the Common Genetic Variation in the Spanish Population of Navarre.0 aCharacterization of the Common Genetic Variation in the Spanish c2024 May 040 v153 a

Large-scale genomic studies have significantly increased our knowledge of genetic variability across populations. Regional genetic profiling is essential for distinguishing common benign variants from disease-causing ones. To this end, we conducted a comprehensive characterization of exonic variants in the population of Navarre (Spain), utilizing whole genome sequencing data from 358 unrelated individuals of Spanish origin. Our analysis revealed 61,410 biallelic single nucleotide variants (SNVs) within the Navarrese cohort, with 35% classified as common (MAF > 1%). By comparing allele frequency data from the 1000 Genomes Project (excluding the Iberian cohort of Spain, IBS), the Genome Aggregation Database, and a Spanish cohort (including IBS individuals and data from the Medical Genome Project), we identified 1069 SNVs common in Navarre but rare (MAF ≤ 1%) in all other populations. We further corroborated this observation with a second regional cohort of 239 unrelated exomes, which confirmed 676 of the 1069 SNVs as common in Navarre. In conclusion, this study highlights the importance of population-specific characterization of genetic variation for improving allele frequency filtering when analyzing sequencing data to identify disease-causing variants.
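The selection rule described in the abstract above (common in the local cohort, rare in every reference population) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name, the dictionary layout, and the choice to treat a variant missing from a reference panel as having frequency 0 are all assumptions made for the example. Only the 1% threshold comes from the abstract.

```python
from typing import Dict, List

# Threshold from the abstract: common means MAF > 1%, rare means MAF <= 1%.
COMMON_MAF = 0.01


def population_specific_common(
    local: Dict[str, float],
    references: List[Dict[str, float]],
    threshold: float = COMMON_MAF,
) -> List[str]:
    """Return IDs of variants common locally and rare in every reference.

    `local` and each entry of `references` map a variant ID to its minor
    allele frequency (hypothetical layout chosen for this sketch).
    """
    selected = []
    for variant, freq in local.items():
        if freq <= threshold:
            continue  # not common in the local cohort
        # Assumption: a variant absent from a reference panel has MAF 0 there.
        if all(ref.get(variant, 0.0) <= threshold for ref in references):
            selected.append(variant)
    return sorted(selected)
```

For example, with `local = {"v1": 0.05, "v2": 0.005}` and a single reference `{"v1": 0.002}`, only `v1` is returned, because `v2` is not common locally.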

10aCohort Studies10aExome10aFemale10aGene Frequency10aGenetic Variation10aGenetics, Population10aGenome, Human10aHumans10aMale10aPolymorphism, Single Nucleotide10aSpain10aWhole Genome Sequencing1 aMaillo, Alberto1 aHuergo, Estefania1 aApellániz-Ruiz, María1 aUrrutia-Lafuente, Edurne1 aMiranda, María1 aSalgado, Josefa1 aPasalodos-Sanchez, Sara1 aDelgado-Mora, Luna1 aTeijido, Óscar1 aGoicoechea, Ibai1 aCarmona, Rosario1 aPerez-Florido, Javier1 aAquino, Virginia1 aLópez-López, Daniel1 aPeña-Chilet, Maria1 aBeltran, Sergi1 aDopazo, Joaquin1 aLasa, Iñigo1 aBeloqui, Juan, José1 aAlonso, Ángel1 aGomez-Cabrero, David uhttps://www.clinbioinfosspa.es/content/characterization-common-genetic-variation-spanish-population-navarre01708nas a2200181 4500008004100000245013000041210006900171260000900240300001200249490000700261520105400268100001801322700003401340700003401374700002601408700003301434856005901467 2023 eng d00aCase report: Analysis of phage therapy failure in a patient with a Pseudomonas aeruginosa prosthetic vascular graft infection0 aCase report Analysis of phage therapy failure in a patient with c2023 a11996570 v103 a

We report the clinical case of a patient with a multidrug-resistant prosthetic vascular graft infection that was treated with a cocktail of phages (PT07, 14/01, and PNM) in combination with ceftazidime-avibactam (CZA). After the application of the phage treatment and in the absence of antimicrobial therapy, a new bloodstream infection (BSI) with a septic residual limb metastasis occurred, this time involving a wild-type strain susceptible to β-lactams and quinolones. Clinical strains were analyzed by microbiology and whole genome sequencing techniques. With respect to phage administration, the clinical isolates obtained before phage therapy (HE2011471) and after phage therapy (HE2105886) showed a clonal relationship but with important genomic changes that could be involved in resistance to this therapy. Finally, phenotypic studies showed a decrease in the minimum inhibitory concentration (MIC) of β-lactams and quinolones, as well as an increase in biofilm production and in phage-resistant mutants, in the clinical isolate obtained after phage therapy.

1 aBlasco, Lucia1 aLópez-Hernández, Inmaculada1 aRodríguez-Fernández, Miguel1 aPerez-Florido, Javier1 aCasimiro-Soriguer, Carlos, S uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10235614/00729nam a2200205 4500008004100000020002200041022001400063245010000077210006900177260003800246300001200284490001000296100002000306700002500326700003000351700003600381700002000417700002000437856006600457 2023 eng d a978-3-031-42696-4 a0302-974300aCell-Level Pathway Scoring Comparison with a Biologically Constrained Variational Autoencoder0 aCellLevel Pathway Scoring Comparison with a Biologically Constra aChambSpringer Nature Switzerland a62 - 770 v141371 aGundogdu, Pelin1 aPayá-Milans, Miriam1 aAlamo-Alvarez, Inmaculada1 aNepomuceno-Chamorro, Isabel, A.1 aDopazo, Joaquin1 aLoucera, Carlos uhttps://link.springer.com/chapter/10.1007/978-3-031-42697-1_502250nas a2200253 4500008004100000022001400041245013700055210006900192260001600261490000700277520134400284100002701628700002001655700002401675700002601699700001801725700001801743700002101761700001901782700002801801700002001829700002601849856012101875 2023 eng d a1999-492300aA Comprehensive Analysis of 21 Actionable Pharmacogenes in the Spanish Population: From Genetic Characterisation to Clinical Impact.0 aComprehensive Analysis of 21 Actionable Pharmacogenes in the Spa c2023 Apr 190 v153 a

The implementation of pharmacogenetics (PGx) is one of the main milestones of precision medicine today, aiming to achieve safer and more effective therapies. Nevertheless, the implementation of PGx diagnostics is extremely slow and unequal worldwide, in part due to a lack of ethnic PGx information. We analysed genetic data from 3006 Spanish individuals obtained by different high-throughput (HT) techniques. Allele frequencies were determined in our population for the main 21 actionable PGx genes associated with therapeutic changes. We found that 98% of the Spanish population harbours at least one allele associated with a therapeutic change and, thus, there would be a need for a therapeutic change in a mean of 3.31 of the 64 associated drugs. We also identified 326 putative deleterious variants not previously related to PGx in 18 of the 21 main PGx genes evaluated, and a total of 7122 putative deleterious variants for the 1045 PGx genes described. Additionally, we performed a comparison of the main HT diagnostic techniques, revealing that, after whole genome sequencing, genotyping with the PGx HT array is the most suitable solution for PGx diagnostics. Finally, all this information was integrated in the Collaborative Spanish Variant Server to be available to and updated by the scientific community.

1 aNúñez-Torres, Rocío1 aPita, Guillermo1 aPeña-Chilet, Maria1 aLópez-López, Daniel1 aZamora, Jorge1 aRoldán, Gema1 aHerráez, Belén1 aAlvarez, Nuria1 aAlonso, María, Rosario1 aDopazo, Joaquin1 aGonzález-Neira, Anna uhttps://www.clinbioinfosspa.es/content/comprehensive-analysis-21-actionable-pharmacogenes-spanish-population-genetic01712nas a2200169 4500008004100000022001400041245008500055210006900140260001600209490000700225520110800232100001801340700002001358700002401378700002001402856012001422 2023 eng d a1422-006700aCrosstalk between Metabolite Production and Signaling Activity in Breast Cancer.0 aCrosstalk between Metabolite Production and Signaling Activity i c2023 Apr 180 v243 a

The reprogramming of metabolism is a recognized cancer hallmark. It is well known that different signaling pathways regulate and orchestrate this reprogramming, which contributes to cancer initiation and development. However, evidence has recently accumulated suggesting that several metabolites could play a relevant role in regulating signaling pathways. To assess the potential role of metabolites in the regulation of signaling pathways, both metabolic and signaling pathway activities of breast invasive carcinoma (BRCA) have been modeled using mechanistic models. Gaussian Processes, powerful machine learning methods, were used in combination with SHapley Additive exPlanations (SHAP), a recent methodology that conveys causality, to obtain potential causal relationships between the production of metabolites and the regulation of signaling pathways. A total of 317 metabolites were found to have a strong impact on signaling circuits. The results presented here point to a crosstalk between signaling and metabolic pathways that is more complex than previously thought.

1 aCubuk, Cankut1 aLoucera, Carlos1 aPeña-Chilet, Maria1 aDopazo, Joaquin uhttps://www.clinbioinfosspa.es/content/crosstalk-between-metabolite-production-and-signaling-activity-breast-cancer02668nas a2200301 4500008004100000022001400041245008600055210006900141260001600210300000700226490000700233520168700240100002601927700001801953700002901971700002302000700002102023700002102044700002602065700002202091700002002113700002702133700002602160700002402186700002002210710002902230856010702259 2023 eng d a1479-736400aA crowdsourcing database for the copy-number variation of the Spanish population.0 acrowdsourcing database for the copynumber variation of the Spani c2023 Mar 09 a200 v173 a

BACKGROUND: Despite being a very common type of genetic variation, the distribution of copy-number variations (CNVs) in the population is still poorly understood. The knowledge of the genetic variability, especially at the level of the local population, is a critical factor for distinguishing pathogenic from non-pathogenic variation in the discovery of new disease variants.

RESULTS: Here, we present the SPAnish Copy Number Alterations Collaborative Server (SPACNACS), which currently contains copy number variation profiles obtained from more than 400 genomes and exomes of unrelated Spanish individuals. By means of a collaborative crowdsourcing effort, whole genome and whole exome sequencing data, produced by local genomic projects and for other purposes, are continuously collected. Once both the Spanish ancestry and the lack of kinship with other individuals in SPACNACS have been checked, CNVs are inferred from these sequences and used to populate the database. A web interface allows querying the database with different filters that include ICD10 upper categories. This allows discarding samples from the disease under study and obtaining pseudo-control CNV profiles from the local population. We also show here additional studies on the local impact of CNVs in some phenotypes and on pharmacogenomic variants. SPACNACS can be accessed at: http://csvs.clinbioinfosspa.es/spacnacs/.

CONCLUSION: SPACNACS facilitates disease gene discovery by providing detailed information on the local variability of the population and exemplifies how genomic data produced for other purposes can be reused to build a local reference database.

1 aLópez-López, Daniel1 aRoldán, Gema1 aFernandez-Rueda, Jose, L1 aBostelmann, Gerrit1 aCarmona, Rosario1 aAquino, Virginia1 aPerez-Florido, Javier1 aOrtuno, Francisco1 aPita, Guillermo1 aNúñez-Torres, Rocío1 aGonzález-Neira, Anna1 aPeña-Chilet, Maria1 aDopazo, Joaquin1 aCSVS Crowdsourcing Group uhttps://www.clinbioinfosspa.es/content/crowdsourcing-database-copy-number-variation-spanish-population02679nas a2200337 4500008004100000022001400041245011500055210006900170260001600239520155800255100001601813700001901829700002001848700001901868700003101887700002201918700002501940700001701965700001701982700001901999700002102018700002702039700002002066700002202086700002202108700001902130700002002149700002102169710002002190856013102210 2022 eng d a1399-000400aCIBERER: Spanish National Network for Research on Rare Diseases: a highly productive collaborative initiative.0 aCIBERER Spanish National Network for Research on Rare Diseases a c2022 Jan 203 a

CIBER (Center for Biomedical Network Research; Centro de Investigación Biomédica En Red) is a public national consortium created in 2006 under the umbrella of the Spanish National Institute of Health Carlos III (ISCIII). This innovative research structure comprises 11 different specific areas dedicated to the main public health priorities in the National Health System. CIBERER, the thematic area of CIBER focused on Rare Diseases, currently consists of 75 research groups belonging to universities, research centers and hospitals across the entire country. CIBERER's mission is to be a center prioritizing and favoring collaboration and cooperation between biomedical and clinical research groups, with special emphasis on the genetic, molecular, biochemical and cellular aspects of rare disease research. This research is the basis for providing new tools for the diagnosis and therapy of low-prevalence diseases, in line with the International Rare Diseases Research Consortium (IRDiRC) objectives, thus favoring translational research between the scientific environment of the laboratory and the clinical setting of health centers. In this paper, we review CIBERER's 15-year journey and summarize the main results obtained in terms of internationalization, scientific production, contributions towards the discovery of new therapies and novel genes associated with diseases, cooperation with patients' associations and many other topics related to rare disease research.

1 aLuque, Juan1 aMendes, Ingrid1 aGómez, Beatriz1 aMorte, Beatriz1 ade Heredia, Miguel, López1 aHerreras, Enrique1 aCorrochano, Virginia1 aBueren, Juan1 aGallano, Pia1 aArtuch, Rafael1 aFillat, Cristina1 aPérez-Jurado, Luis, A1 aMontoliu, Lluis1 aCarracedo, Ángel1 aMillán, José, M1 aWebb, Susan, M1 aPalau, Francesc1 aLapunzina, Pablo1 aCIBERER Network uhttps://www.clinbioinfosspa.es/content/ciberer-spanish-national-network-research-rare-diseases-highly-productive-collaborative02338nas a2200373 4500008004100000022001400041245009100055210006900146260001600215300000800231490000700239520108200246653002401328653002601352653002301378653001101401100003001412700002901442700002801471700002801499700003701527700002401564700003101588700001801619700002701637700002501664700003101689700002401720700002001744700002601764700003201790700002501822856011701847 2021 eng d a1471-210500aA comprehensive database for integrated analysis of omics data in autoimmune diseases.0 acomprehensive database for integrated analysis of omics data in c2021 Jun 24 a3430 v223 a

BACKGROUND: Autoimmune diseases are heterogeneous pathologies with difficult diagnosis and few therapeutic options. In the last decade, several omics studies have provided significant insights into the molecular mechanisms of these diseases. Nevertheless, data from different cohorts and pathologies are stored independently in public repositories and a unified resource is imperative to assist researchers in this field.

RESULTS: Here, we present Autoimmune Diseases Explorer ( https://adex.genyo.es ), a database that integrates 82 curated transcriptomics and methylation studies covering 5609 samples for some of the most common autoimmune diseases. The database provides, in an easy-to-use environment, advanced data analysis and statistical methods for exploring omics datasets, including meta-analysis, differential expression or pathway analysis.

CONCLUSIONS: This is the first omics database focused on autoimmune diseases. This resource incorporates homogeneously processed data to facilitate integrative analyses among studies.

10aAutoimmune Diseases10aComputational Biology10aDatabases, Factual10aHumans1 aMartorell-Marugán, Jordi1 aLópez-Domínguez, Raúl1 aGarcía-Moreno, Adrián1 aToro-Domínguez, Daniel1 aVillatoro-García, Juan, Antonio1 aBarturen, Guillermo1 aMartín-Gómez, Adoración1 aTroule, Kevin1 aGómez-López, Gonzalo1 aAl-Shahrour, Fátima1 aGonzález-Rumayor, Víctor1 aPeña-Chilet, Maria1 aDopazo, Joaquin1 aSaez-Rodriguez, Julio1 aAlarcón-Riquelme, Marta, E1 aCarmona-Sáez, Pedro uhttps://www.clinbioinfosspa.es/content/comprehensive-database-integrated-analysis-omics-data-autoimmune-diseases07193nas a2202077 4500008004100000022001400041245010000055210006900155260001200224300001100236490000700247520130900254653002101563653002601584653002201610653001301632653001401645653001601659653002301675653003101698653003201729653001101761653002301772653002201795653002101817653001601838653003601854653001801890653003201908653001501940653002401955653001301979653002601992653001902018100002302037700001902060700002202079700002102101700001802122700002702140700001902167700002602186700002602212700001802238700001902256700001702275700002202292700002102314700001902335700002902354700001802383700001602401700002902417700001802446700002302464700002202487700002002509700002002529700002702549700002102576700001702597700001802614700002402632700002102656700001602677700002102693700001902714700002302733700002302756700002002779700001702799700001902816700002402835700002102859700002402880700001702904700002302921700002402944700001902968700002002987700001903007700001903026700002903045700002303074700002003097700001703117700001803134700002403152700003303176700002003209700002003229700001603249700001503265700001803280700001903298700002603317700002703343700002003370700002403390700001803414700002403432700002203456700002203478700001903500700002003519700002103539700001803560700001503578700002203593700002303615700001703638700001903655700002403674700001703698700001803715700001703733700002303750700001803773700001803791700002303809700002103832
700001703853700002203870700002003892700001803912700001803930700002303948700002103971700002503992700001804017700002004035700001704055700001904072700001704091700002104108700002504129700002704154700002404181700001604205700002104221700002504242700001704267700002004284700001804304700002504322700002404347700002304371700002104394700001904415700002204434700001804456700001604474700002204490700002004512700001804532700002504550700001904575700002204594700002404616700002304640700002004663700002304683700002204706700002304728700002304751700002604774700002004800700002204820700002004842700002304862700002104885700001804906700002404924710003504948856013204983 2021 eng d a1744-429200aCOVID19 Disease Map, a computational knowledge repository of virus-host interaction mechanisms.0 aCOVID19 Disease Map a computational knowledge repository of viru c2021 10 ae103870 v173 a

We need to effectively combine the knowledge from surging literature with complex datasets to propose mechanistic models of SARS-CoV-2 infection, improving data interpretation and predicting key targets of intervention. Here, we describe a large-scale community effort to build an open access, interoperable and computable repository of COVID-19 molecular mechanisms. The COVID-19 Disease Map (C19DMap) is a graphical, interactive representation of disease-relevant molecular mechanisms linking many knowledge sources. Notably, it is a computational resource for graph-based analyses and disease modelling. To this end, we established a framework of tools, platforms and guidelines necessary for a multifaceted community of biocurators, domain experts, bioinformaticians and computational biologists. The diagrams of the C19DMap, curated from the literature, are integrated with relevant interaction and text mining databases. We demonstrate the application of network analysis and modelling approaches by concrete examples to highlight new testable hypotheses. This framework helps to find signatures of SARS-CoV-2 predisposition, treatment response or prioritisation of drug candidates. Such an approach may help deal with new waves of COVID-19 or similar pandemics in the long-term perspective.

10aAntiviral Agents10aComputational Biology10aComputer Graphics10aCOVID-1910aCytokines10aData Mining10aDatabases, Factual10aGene Expression Regulation10aHost Microbial Interactions10aHumans10aImmunity, Cellular10aImmunity, Humoral10aImmunity, Innate10aLymphocytes10aMetabolic Networks and Pathways10aMyeloid Cells10aProtein Interaction Mapping10aSARS-CoV-210aSignal Transduction10aSoftware10aTranscription Factors10aViral Proteins1 aOstaszewski, Marek1 aNiarakis, Anna1 aMazein, Alexander1 aKuperstein, Inna1 aPhair, Robert1 aOrta-Resendiz, Aurelio1 aSingh, Vidisha1 aAghamiri, Sara, Sadat1 aAcencio, Marcio, Luis1 aGlaab, Enrico1 aRuepp, Andreas1 aFobo, Gisela1 aMontrone, Corinna1 aBrauner, Barbara1 aFrishman, Goar1 aGómez, Luis, Cristóbal1 aSomers, Julia1 aHoch, Matti1 aGupta, Shailendra, Kumar1 aScheel, Julia1 aBorlinghaus, Hanna1 aCzauderna, Tobias1 aSchreiber, Falk1 aMontagud, Arnau1 ade Leon, Miguel, Ponce1 aFunahashi, Akira1 aHiki, Yusuke1 aHiroi, Noriko1 aYamada, Takahiro, G1 aDräger, Andreas1 aRenz, Alina1 aNaveez, Muhammad1 aBocskei, Zsolt1 aMessina, Francesco1 aBörnigen, Daniela1 aFergusson, Liam1 aConti, Marta1 aRameil, Marius1 aNakonecnij, Vanessa1 aVanhoefer, Jakob1 aSchmiester, Leonard1 aWang, Muying1 aAckerman, Emily, E1 aShoemaker, Jason, E1 aZucker, Jeremy1 aOxford, Kristie1 aTeuton, Jeremy1 aKocakaya, Ebru1 aSummak, Gökçe, Yağmur1 aHanspers, Kristina1 aKutmon, Martina1 aCoort, Susan1 aEijssen, Lars1 aEhrhart, Friederike1 aRex, Devasahayam, Arokia Bal1 aSlenter, Denise1 aMartens, Marvin1 aPham, Nhung1 aHaw, Robin1 aJassal, Bijay1 aMatthews, Lisa1 aOrlic-Milacic, Marija1 aRibeiro, Andrea, Senff1 aRothfels, Karen1 aShamovsky, Veronica1 aStephan, Ralf1 aSevilla, Cristoffer1 aVarusai, Thawfeek1 aRavel, Jean-Marie1 aFraser, Rupsha1 aOrtseifen, Vera1 aMarchesi, Silvia1 aGawron, Piotr1 aSmula, Ewa1 aHeirendt, Laurent1 aSatagopam, Venkata1 aWu, Guanming1 aRiutta, Anders1 aGolebiewski, Martin1 aOwen, Stuart1 aGoble, Carole1 aHu, Xiaoming1 aOverall, Rupert, W1 
aMaier, Dieter1 aBauch, Angela1 aGyori, Benjamin, M1 aBachman, John, A1 aVega, Carlos1 aGrouès, Valentin1 aVazquez, Miguel1 aPorras, Pablo1 aLicata, Luana1 aIannuccelli, Marta1 aSacco, Francesca1 aNesterova, Anastasia1 aYuryev, Anton1 ade Waard, Anita1 aTurei, Denes1 aLuna, Augustin1 aBabur, Ozgun1 aSoliman, Sylvain1 aValdeolivas, Alberto1 aEsteban-Medina, Marina1 aPeña-Chilet, Maria1 aRian, Kinza1 aHelikar, Tomáš1 aPuniya, Bhanwar, Lal1 aModos, Dezso1 aTreveil, Agatha1 aOlbei, Marton1 aDe Meulder, Bertrand1 aBallereau, Stephane1 aDugourd, Aurélien1 aNaldi, Aurélien1 aNoël, Vincent1 aCalzone, Laurence1 aSander, Chris1 aDemir, Emek1 aKorcsmaros, Tamas1 aFreeman, Tom, C1 aAugé, Franck1 aBeckmann, Jacques, S1 aHasenauer, Jan1 aWolkenhauer, Olaf1 aWilighagen, Egon, L1 aPico, Alexander, R1 aEvelo, Chris, T1 aGillespie, Marc, E1 aStein, Lincoln, D1 aHermjakob, Henning1 aD'Eustachio, Peter1 aSaez-Rodriguez, Julio1 aDopazo, Joaquin1 aValencia, Alfonso1 aKitano, Hiroaki1 aBarillot, Emmanuel1 aAuffray, Charles1 aBalling, Rudi1 aSchneider, Reinhard1 aCOVID-19 Disease Map Community uhttps://www.clinbioinfosspa.es/content/covid19-disease-map-computational-knowledge-repository-virus-host-interaction-mechanisms03508nas a2200709 4500008004100000022001400041245008200055210006900137260001500206300001600221490000700237520139500244653001201639653002301651653001801674653002301692653001001715653001901725653002201744653002501766653001801791653001301809653001101822653001301833653002301846653001301869653001001882100002401892700001801916700002601934700002501960700002101985700002102006700002602027700002002053700002902073700002302102700002902125700002602154700002002180700002702200700002702227700001802254700001902272700003002291700001802321700003402339700001902373700002902392700002702421700001902448700002502467700001702492700002802509700002802537700001902565700002202584700001902606700002002625710004302645856011002688 2021 eng d a1362-496200aCSVS, a crowdsourcing database of the Spanish 
population genetic variability.0 aCSVS a crowdsourcing database of the Spanish population genetic c2021 01 08 aD1130-D11370 v493 a

Knowledge of the genetic variability of the local population is of utmost importance in personalized medicine and has proven to be a critical factor for the discovery of new disease variants. Here, we present the Collaborative Spanish Variability Server (CSVS), which currently contains more than 2000 genomes and exomes of unrelated Spanish individuals. This database has been generated in a collaborative crowdsourcing effort collecting sequencing data produced by local genomic projects and for other purposes. Sequences have been grouped by ICD10 upper categories. A web interface allows querying the database while excluding one or more ICD10 categories. In this way, aggregated counts of allele frequencies of the pseudo-control Spanish population can be obtained for diseases belonging to the excluded category. Interestingly, in addition to pseudo-control studies, some population studies can be carried out, for example on the prevalence of pharmacogenomic variants. In addition, this genomic data has been used to define the first Spanish Genome Reference Panel (SGRP1.0) for imputation. This is the first local repository of variability entirely produced by a crowdsourcing effort and constitutes an example for future initiatives to characterize local variability worldwide. CSVS is also part of the GA4GH Beacon network. CSVS can be accessed at: http://csvs.babelomics.org/.
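The pseudo-control idea described in the abstract above (aggregate allele counts over all samples except those in the ICD10 categories under study) can be sketched as follows. This is an illustrative assumption about how such a query could be computed, not the CSVS implementation or its API: the function name, the per-category count layout, and the category labels in the usage note are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical layout for one variant: for each ICD10 upper category,
# (alternate allele count, total number of alleles genotyped).
CountsByCategory = Dict[str, Tuple[int, int]]


def pseudo_control_frequency(
    counts: CountsByCategory, excluded: Iterable[str]
) -> float:
    """Allele frequency over all categories except the excluded ones."""
    excluded_set = set(excluded)
    alt_total = 0
    allele_total = 0
    for category, (alt_count, allele_number) in counts.items():
        if category in excluded_set:
            continue  # drop samples from the disease category under study
        alt_total += alt_count
        allele_total += allele_number
    if allele_total == 0:
        raise ValueError("no samples left after excluding categories")
    return alt_total / allele_total
```

With counts `{"I": (4, 200), "II": (1, 100), "healthy": (5, 700)}` and `excluded=["I"]`, the remaining samples contribute 6 alternate alleles out of 800, so the function aggregates only over the categories that serve as pseudo-controls.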

10aAlleles10aChromosome Mapping10aCrowdsourcing10aDatabases, Genetic10aExome10aGene Frequency10aGenetic Variation10aGenetics, Population10aGenome, Human10aGenomics10aHumans10aInternet10aPrecision Medicine10aSoftware10aSpain1 aPeña-Chilet, Maria1 aRoldán, Gema1 aPerez-Florido, Javier1 aOrtuno, Francisco, M1 aCarmona, Rosario1 aAquino, Virginia1 aLópez-López, Daniel1 aLoucera, Carlos1 aFernandez-Rueda, Jose, L1 aGallego, Asunción1 aGarcia-Garcia, Francisco1 aGonzález-Neira, Anna1 aPita, Guillermo1 aNúñez-Torres, Rocío1 aSantoyo-López, Javier1 aAyuso, Carmen1 aMinguez, Pablo1 aAvila-Fernandez, Almudena1 aCorton, Marta1 aMoreno-Pelayo, Miguel, Ángel1 aMorin, Matías1 aGallego-Martinez, Alvaro1 aLopez-Escamez, Jose, A1 aBorrego, Salud1 aAntiňolo, Guillermo1 aAmigo, Jorge1 aSalgado-Garrido, Josefa1 aPasalodos-Sanchez, Sara1 aMorte, Beatriz1 aCarracedo, Ángel1 aAlonso, Ángel1 aDopazo, Joaquin1 aSpanish Exome Crowdsourcing Consortium uhttps://www.clinbioinfosspa.es/content/csvs-crowdsourcing-database-spanish-population-genetic-variability02937nas a2200613 4500008004100000022001400041245012600055210006900181260001500250300001500265490000700280520120500287653001801492653001101510653001301521653001101534653002101545653000901566653001401575653002001589653001301609653001501622653001801637100001301655700002401668700001201692700001701704700001601721700001701737700001601754700001601770700001201786700001401798700001601812700001501828700002101843700002101864700002201885700001501907700002301922700002101945700002101966700001601987700001602003700002102019700002502040700001902065700001702084700001402101700001802115700002602133710003102159856013302190 2020 eng d a2405-472000aCommunity Assessment of the Predictability of Cancer Protein and Phosphoprotein Levels from Genomics and Transcriptomics.0 aCommunity Assessment of the Predictability of Cancer Protein and c2020 08 26 a186-195.e90 v113 a

Cancer is driven by genomic alterations, but the processes causing this disease are largely performed by proteins. However, proteins are harder and more expensive to measure than genes and transcripts. To catalyze the development of methods to infer protein levels from other omics measurements, we leveraged crowdsourcing via the NCI-CPTAC DREAM proteogenomic challenge. We asked for methods to predict protein and phosphorylation levels from genomic and transcriptomic data in cancer patients. The best performance was achieved by an ensemble of models, including as predictors transcript level of the corresponding genes, interaction between genes, conservation across tumor types, and phosphosite proximity for phosphorylation prediction. Proteins from metabolic pathways and complexes were the best and worst predicted, respectively. The performance of even the best-performing model was modest, suggesting that many proteins are strongly regulated through translational control and degradation. Our results set a reference for the limitations of computational inference in proteogenomics. A record of this paper's transparent peer review process is included in the Supplemental Information.

10aCrowdsourcing10aFemale10aGenomics10aHumans10aMachine Learning10aMale10aNeoplasms10aPhosphoproteins10aProteins10aProteomics10aTranscriptome1 aYang, Mi1 aPetralia, Francesca1 aLi, Zhi1 aLi, Hongyang1 aMa, Weiping1 aSong, Xiaoyu1 aKim, Sunkyu1 aLee, Heewon1 aYu, Han1 aLee, Bora1 aBae, Seohui1 aHeo, Eunji1 aKaczmarczyk, Jan1 aStępniak, Piotr1 aWarchoł, Michał1 aYu, Thomas1 aCalinawan, Anna, P1 aBoutros, Paul, C1 aPayne, Samuel, H1 aReva, Boris1 aBoja, Emily1 aRodriguez, Henry1 aStolovitzky, Gustavo1 aGuan, Yuanfang1 aKang, Jaewoo1 aWang, Pei1 aFenyö, David1 aSaez-Rodriguez, Julio1 aNCI-CPTAC-DREAM Consortium uhttps://www.clinbioinfosspa.es/content/community-assessment-predictability-cancer-protein-and-phosphoprotein-levels-genomics-and01798nas a2200577 4500008004100000022001400041245011100055210006900166260001500235300000800250490000600258653002000264653002600284653002700310653001300337653002300350653003200373653003100405653001100436653003000447653002300477653001400500653002100514653001500535100002300550700002200573700002300595700002100618700001900639700002300658700002300681700002500704700002000729700001900749700002000768700002100788700001600809700002200825700002200847700002300869700002000892700002700912700002300939700002200962700002100984700002001005700002101025700001801046700002401064856013201088 2020 eng d a2052-446300aCOVID-19 Disease Map, building a computational repository of SARS-CoV-2 virus-host interaction mechanisms.0 aCOVID19 Disease Map building a computational repository of SARSC c2020 05 05 a1360 v710aBetacoronavirus10aComputational Biology10aCoronavirus Infections10aCOVID-1910aDatabases, Factual10aHost Microbial Interactions10aHost-Pathogen Interactions10aHumans10aInternational Cooperation10aModels, Biological10aPandemics10aPneumonia, Viral10aSARS-CoV-21 aOstaszewski, Marek1 aMazein, Alexander1 aGillespie, Marc, E1 aKuperstein, Inna1 aNiarakis, Anna1 aHermjakob, Henning1 aPico, Alexander, R1 aWillighagen, Egon, L1 aEvelo, Chris, T1 aHasenauer, Jan1 
aSchreiber, Falk1 aDräger, Andreas1 aDemir, Emek1 aWolkenhauer, Olaf1 aFurlong, Laura, I1 aBarillot, Emmanuel1 aDopazo, Joaquin1 aOrta-Resendiz, Aurelio1 aMessina, Francesco1 aValencia, Alfonso1 aFunahashi, Akira1 aKitano, Hiroaki1 aAuffray, Charles1 aBalling, Rudi1 aSchneider, Reinhard uhttps://www.clinbioinfosspa.es/content/covid-19-disease-map-building-computational-repository-sars-cov-2-virus-host-interaction03269nas a2200685 4500008004100000022001400041245011800055210006900173260001500242300000900257490000700266520115400273653001901427653005101446653001701497653002201514653002101536653002601557653002201583653002001605653003001625653001901655653001301674653001101687653003101698653001301729653001401742653002101756653003501777653004101812653002201853100002301875700001701898700001901915700001801934700002301952700001901975700001501994700001702009700001602026700002002042700001602062700002402078700001902102700001802121700002402139700001802163700002502181700001702206700001902223700002302242700002702265700002002292700002502312700002002337700002102357700002602378710005702404856012202461 2019 eng d a2041-172300aCommunity assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen.0 aCommunity assessment to advance computational prediction of canc c2019 06 17 a26740 v103 a

The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca's large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. A total of 160 teams participated, providing comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationales for synergy predictions are identified, including ADAM17 inhibitor antagonism when combined with PIK3CB/D inhibition, in contrast to synergy when combined with other PI3K-pathway inhibitors in PIK3CA mutant cells.

10aADAM17 Protein10aAntineoplastic Combined Chemotherapy Protocols10aBenchmarking10aBiomarkers, Tumor10aCell Line, Tumor10aComputational Biology10aDatasets as Topic10aDrug Antagonism10aDrug Resistance, Neoplasm10aDrug Synergism10aGenomics10aHumans10aMolecular Targeted Therapy10amutation10aNeoplasms10apharmacogenetics10aPhosphatidylinositol 3-Kinases10aPhosphoinositide-3 Kinase Inhibitors10aTreatment Outcome1 aMenden, Michael, P1 aWang, Dennis1 aMason, Mike, J1 aSzalai, Bence1 aBulusu, Krishna, C1 aGuan, Yuanfang1 aYu, Thomas1 aKang, Jaewoo1 aJeon, Minji1 aWolfinger, Russ1 aNguyen, Tin1 aZaslavskiy, Mikhail1 aJang, In, Sock1 aGhazoui, Zara1 aAhsen, Mehmet, Eren1 aVogel, Robert1 aNeto, Elias, Chaibub1 aNorman, Thea1 aK Y Tang, Eric1 aGarnett, Mathew, J1 aDi Veroli, Giovanni, Y1 aFawell, Stephen1 aStolovitzky, Gustavo1 aGuinney, Justin1 aDry, Jonathan, R1 aSaez-Rodriguez, Julio1 aAstraZeneca-Sanger Drug Combination DREAM Consortium uhttps://www.clinbioinfosspa.es/content/community-assessment-advance-computational-prediction-cancer-drug-combinations01819nas a2200265 4500008004100000022001400041245007700055210006900132260001600201300001400217490000700231520098400238653001501222653001101237653002301248653002401271653002001295653001801315100001901333700002201352700001801374700003101392700002001423856011001443 2019 eng d a1477-405400aA comparison of mechanistic signaling pathway activity analysis methods.0 acomparison of mechanistic signaling pathway activity analysis me c2019 Sep 27 a1655-16680 v203 a

Understanding the aspects of cell functionality that account for disease mechanisms or drug modes of action is a main challenge for precision medicine. Classical gene-based approaches ignore the modular nature of most human traits, whereas conventional pathway enrichment approaches produce only illustrative results of limited practical utility. Recently, a family of new methods has emerged that change the focus from the whole pathways to the definition of elementary subpathways within them that have any mechanistic significance and to the study of their activities. Thus, mechanistic pathway activity (MPA) methods constitute a new paradigm that allows recoding poorly informative genomic measurements into cell activity quantitative values and relate them to phenotypes. Here we provide a review on the MPA methods available and explain their contribution to systems medicine approaches for addressing challenges in the diagnostic and treatment of complex diseases.

10aAlgorithms10aHumans10aPostmortem Changes10aSignal Transduction10aSystems biology10aTranscriptome1 aAmadoz, Alicia1 aHidalgo, Marta, R1 aCubuk, Cankut1 aCarbonell-Caballero, José1 aDopazo, Joaquin uhttps://www.clinbioinfosspa.es/content/comparison-mechanistic-signaling-pathway-activity-analysis-methods01484nas a2200421 4500008004100000245011900041210006900160260001600229490000600245110005300251700001800304700001800322700002300340700002500363700001600388700001900404700001500423700001800438700001900456700002400475700001900499700002200518700001600540700003100556700001800587700001800605700001800623700001900641700002200660700002300682700002700705700002700732700001900759700002400778700002500802700002600827856020900853 2018 eng d00aA crowdsourced analysis to identify ab initio molecular signatures predictive of susceptibility to viral infection0 acrowdsourced analysis to identify ab initio molecular signatures cJan-12-20180 v91 aThe Respiratory Viral DREAM Challenge Consortium1 aFourati, Slim1 aTalla, Aarthi1 aMahmoudian, Mehrad1 aBurkhart, Joshua, G.1 aKlén, Riku1 aHenao, Ricardo1 aYu, Thomas1 aAydın, Zafer1 aYeung, Ka, Yee1 aAhsen, Mehmet, Eren1 aAlmugbel, Reem1 aJahandideh, Samad1 aLiang, Xiao1 aNordling, Torbjörn, E. M.1 aShiga, Motoki1 aStanescu, Ana1 aVogel, Robert1 aPandey, Gaurav1 aChiu, Christopher1 aMcClain, Micah, T.1 aWoods, Christopher, W.1 aGinsburg, Geoffrey, S.1 aElo, Laura, L.1 aTsalik, Ephraim, L.1 aMangravite, Lara, M.1 aSieberts, Solveig, K. 
uhttp://www.nature.com/articles/s41467-018-06735-8http://www.nature.com/articles/s41467-018-06735-8.pdfhttp://www.nature.com/articles/s41467-018-06735-8.pdfhttp://www.nature.com/articles/s41467-018-06735-802634nas a2200289 4500008004100000022001400041245015800055210006900213260001500282300000900297520167800306653001901984653001202003653000902015653000902024653002302033653001202056653002102068100002302089700001802112700003002130700002302160700002002183700002402203700002902227856008802256 2016 eng d a1607-888800aChronic subordination stress selectively downregulates the insulin signaling pathway in liver and skeletal muscle but not in adipose tissue of male mice.0 aChronic subordination stress selectively downregulates the insul c2016 Mar 7 a1-113 aChronic stress has been associated with obesity, glucose intolerance, and insulin resistance. We developed a model of chronic psychosocial stress (CPS) in which subordinate mice are vulnerable to obesity and the metabolic-like syndrome while dominant mice exhibit a healthy metabolic phenotype. Here we tested the hypothesis that the metabolic difference between subordinate and dominant mice is associated with changes in functional pathways relevant for insulin sensitivity, glucose and lipid homeostasis. Male mice were exposed to CPS for four weeks and fed either a standard diet or a high-fat diet (HFD). We first measured, by real-time PCR candidate genes, in the liver, skeletal muscle, and the perigonadal white adipose tissue (pWAT). Subsequently, we used a probabilistic analysis approach to analyze different ways in which signals can be transmitted across the pathways in each tissue. Results showed that subordinate mice displayed a drastic downregulation of the insulin pathway in liver and muscle, indicative of insulin resistance, already on standard diet. Conversely, pWAT showed molecular changes suggestive of facilitated fat deposition in an otherwise insulin-sensitive tissue. 
The molecular changes in subordinate mice fed a standard diet were greater compared to HFD-fed controls. Finally, dominant mice maintained a substantially normal metabolic and molecular phenotype even when fed a HFD. Overall, our data demonstrate that subordination stress is a potent stimulus for the downregulation of the insulin signaling pathway in liver and muscle and a major risk factor for the development of obesity, insulin resistance, and type 2 diabetes mellitus.10aAdipose tissue10ainsulin10aIRS110aIRS210ametabolic syndrome10aobesity10apathway analysis1 aSanghez, Valentina1 aCubuk, Cankut1 aSebastián-Leon, Patricia1 aCarobbio, Stefania1 aDopazo, Joaquin1 aVidal-Puig, Antonio1 aBartolomucci, Alessandro uhttp://www.tandfonline.com/doi/abs/10.3109/10253890.2016.1151491?journalCode=ists2003376nas a2200793 4500008004100000022001400041245011500055210006900170260001600239520112300255653001101378653000801389653002001397100001901417700002601436700001201462700001801474700002201492700002701514700002201541700002201563700001901585700002901604700002301633700002001656700002001676700002301696700002501719700002201744700002201766700002101788700004001809700001201849700001701861700001201878700001601890700001901906700001801925700002101943700001601964700002001980700002402000700002002024700001802044700001302062700001702075700001802092700002102110700002402131700002002155700002102175700001702196700002102213700002502234700002102259700001902280700001902299700002102318700001602339700001402355700001702369700001502386700001302401700003102414700002102445700002102466700002002487856007502507 2015 eng d a1548-710500aCombining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection.0 aCombining tumor genome simulation with crowdsourcing to benchmar c2015 May 183 aThe detection of somatic mutations from cancer genome sequences is key to understanding the genetic basis of disease progression, patient survival and response to therapy. 
Benchmarking is needed for tool assessment and improvement but is complicated by a lack of gold standards, by extensive resource requirements and by difficulties in sharing personal genomic information. To resolve these issues, we launched the ICGC-TCGA DREAM Somatic Mutation Calling Challenge, a crowdsourced benchmark of somatic mutation detection algorithms. Here we report the BAMSurgeon tool for simulating cancer genomes and the results of 248 analyses of three in silico tumors created with it. Different algorithms exhibit characteristic error profiles, and, intriguingly, false positives show a trinucleotide profile very similar to one found in human tumors. Although the three simulated tumors differ in sequence contamination (deviation from normal cell sequence) and in subclonality, an ensemble of pipelines outperforms the best individual pipeline in all cases. BAMSurgeon is available at https://github.com/adamewing/bamsurgeon/.10acancer10aNGS10avariant calling1 aEwing, Adam, D1 aHoulahan, Kathleen, E1 aHu, Yin1 aEllrott, Kyle1 aCaloian, Cristian1 aYamaguchi, Takafumi, N1 aBare, Christopher1 aP’ng, Christine1 aWaggott, Daryl1 aSabelnykova, Veronica, Y1 aKellen, Michael, R1 aNorman, Thea, C1 aHaussler, David1 aFriend, Stephen, H1 aStolovitzky, Gustavo1 aMargolin, Adam, A1 aStuart, Joshua, M1 aBoutros, Paul, C1 aparticipants, ICGC-TCGA, DREAM Soma1 aXi, Liu1 aDewal, Ninad1 aFan, Yu1 aWang, Wenyi1 aWheeler, David1 aWilm, Andreas1 aTing, Grace, Hui1 aLi, Chenhao1 aBertrand, Denis1 aNagarajan, Niranjan1 aChen, Qing-Rong1 aHsu, Chih-Hao1 aHu, Ying1 aYan, Chunhua1 aKibbe, Warren1 aMeerzaman, Daoud1 aCibulskis, Kristian1 aRosenberg, Mara1 aBergelson, Louis1 aKiezun, Adam1 aRadenbaugh, Amie1 aSertier, Anne-Sophie1 aFerrari, Anthony1 aTonton, Laurie1 aBhutani, Kunal1 aHansen, Nancy, F1 aWang, Difei1 aSong, Lei1 aLai, Zhongwu1 aLiao, Yang1 aShi, Wei1 aCarbonell-Caballero, José1 aDopazo, Joaquín1 aLau, Cheryl, C K1 aGuinney, Justin 
uhttp://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3407.html02945nas a2200217 4500008004100000022001400041245012100055210006900176260001500245520217400260100003602434700003402470700002102504700002702525700002002552700002202572700002102594700002502615700001602640856007102656 2015 eng d a1878-589100aComparative gene expression study of the vestibular organ of the Igf1 deficient mouse using whole-transcript arrays.0 aComparative gene expression study of the vestibular organ of the c2015 Sep 13 aThe auditory and vestibular organs form the inner ear and have a common developmental origin. Insulin like growth factor 1 (IGF-1) has a central role in the development of the cochlea and maintenance of hearing. Its deficiency causes sensorineural hearing loss in man and mice. During chicken early development, IGF-1 modulates neurogenesis of the cochleovestibular ganglion but no further studies have been conducted to explore the potential role of IGF-1 in the vestibular system. In this study we have compared the whole transcriptome of the vestibular organ from wild type and Igf1(-/-) mice at different developmental and postnatal times. RNA was prepared from E18.5, P15 and P90 vestibular organs of Igf1(-/-) and Igf1(+/+) mice and the transcriptome analysed in triplicates using Affymetrix® Mouse Gene 1.1 ST Array Plates. These plates are whole-transcript arrays that include probes to measure both messenger (mRNA) and long intergenic non-coding RNA transcripts (lincRNA), with a coverage of over 28 thousand coding transcripts and over 7 thousands non-coding transcripts. Given the complexity of the data we used two different methods VSN-RMA and mmBGX to analyse and compare the data. This is to better evaluate the number of false positives and to quantify uncertainty of low signals. We identified a number of differentially expressed genes that we described using functional analysis and validated using RT-qPCR. 
The morphology of the vestibular organ did not show differences between genotypes and no evident alterations were observed in the vestibular sensory areas of the null mice. However, well-defined cellular alterations were found in the vestibular neurons with respect their number and size. Although these mice did not show a dramatic vestibular phenotype, we conducted a functional analysis on differentially expressed genes between genotypes and across time. This was with the aim to identify new pathways that are involved in the development of the vestibular organ as well as pathways that maybe affected by the lack of IGF-1 and be associated to the morphological changes of the vestibular neurons that we observed in the Igf1(-/-) mice.1 ade la Rosa, Lourdes, Rodríguez1 aSánchez-Calderón, Hortensia1 aContreras, Julio1 aMurillo-Cuesta, Silvia1 aFalagan, Sandra1 aAvendaño, Carlos1 aDopazo, Joaquín1 aVarela-Nieto, Isabel1 aMilo, Marta uhttp://www.sciencedirect.com/science/article/pii/S037859551500183501828nas a2200253 4500008004100000022001400041245007200055210006900127260001700196300001300213490000700226520106800233653000801301653000801309653002301317100002101340700002301361700002001384700002301404700002201427700002001449700003001469856007501499 2015 eng d a1557-996400aConcurrent and Accurate Short Read Mapping on Multicore Processors.0 aConcurrent and Accurate Short Read Mapping on Multicore Processo c2015 Sep-Oct a995-10070 v123 aWe introduce a parallel aligner with a work-flow organization for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, [Formula: see text] ([Formula: see text] is an open-source application. The software is available at http://www.opencb.org, exploits a suffix array to rapidly map a large fraction of the RNA fragments (reads), as well as leverages the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. 
The aligner is enhanced with a careful strategy to detect splice junctions based on an adaptive division of RNA reads into small segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing crucial information for the successful alignment of the complete reads. The experimental results on a platform with Intel multicore technology report the parallel performance of [Formula: see text], on RNA reads of 100-400 nucleotides, which excels in execution time/sensitivity to state-of-the-art aligners such as TopHat 2+Bowtie 2, MapSplice, and STAR.10aHPC10aNGS10ashort read mapping1 aMartinez, Hector1 aTárraga, Joaquín1 aMedina, Ignacio1 aBarrachina, Sergio1 aCastillo, Maribel1 aDopazo, Joaquin1 aQuintana-Orti, Enrique, S uhttp://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=701000502750nas a2200217 4500008004100000022001400041245011000055210006900165260000900234300001100243490000600254520206500260100002702325700002002352700002402372700001602396700002102412700001902433700002802452856005202480 2014 eng d a1932-620300aCombined genetic and high-throughput strategies for molecular diagnosis of inherited retinal dystrophies.0 aCombined genetic and highthroughput strategies for molecular dia c2014 ae884100 v93 aMost diagnostic laboratories are confronted with the increasing demand for molecular diagnosis from patients and families and the ever-increasing genetic heterogeneity of visual disorders. Concerning Retinal Dystrophies (RD), almost 200 causative genes have been reported to date, and most families carry private mutations. We aimed to approach RD genetic diagnosis using all the available genetic information to prioritize candidates for mutational screening, and then restrict the number of cases to be analyzed by massive sequencing. We constructed and optimized a comprehensive cosegregation RD-chip based on SNP genotyping and haplotype analysis. 
The RD-chip allows to genotype 768 selected SNPs (closely linked to 100 RD causative genes) in a single cost-, time-effective step. Full diagnosis was attained in 17/36 Spanish pedigrees, yielding 12 new and 12 previously reported mutations in 9 RD genes. The most frequently mutated genes were USH2A and CRB1. Notably, RD3-up to now only associated to Leber Congenital Amaurosis- was identified as causative of Retinitis Pigmentosa. The main assets of the RD-chip are: i) the robustness of the genetic information that underscores the most probable candidates, ii) the invaluable clues in cases of shared haplotypes, which are indicative of a common founder effect, and iii) the detection of extended haplotypes over closely mapping genes, which substantiates cosegregation, although the assumptions in which the genetic analysis is based could exceptionally lead astray. The combination of the genetic approach with whole exome sequencing (WES) greatly increases the diagnosis efficiency, and revealed novel mutations in USH2A and GUCY2D. 
Overall, the RD-chip diagnosis efficiency ranges from 16% in dominant, to 80% in consanguineous recessive pedigrees, with an average of 47%, well within the upper range of massive sequencing approaches, highlighting the validity of this time- and cost-effective approach whilst high-throughput methodologies become amenable for routine diagnosis in medium sized labs.1 ade Castro-Miró, Marta1 aPomares, Esther1 aLorés-Motta, Laura1 aTonda, Raul1 aDopazo, Joaquín1 aMarfany, Gemma1 aGonzàlez-Duarte, Roser uhttp://dx.plos.org/10.1371/journal.pone.008841001916nas a2200253 4500008004100000022001400041245013800055210006900193260001600262300001400278490000700292520118400299653000801483653001201491653000901503100001101512700001601523700000501539700001501544700000501559700001601564700001101580856007101591 2014 eng d a1546-169600aA comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium.0 acomprehensive assessment of RNAseq accuracy reproducibility and c2014 Aug 24 a903–9140 v323 aWe present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the US Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for all examined platforms, including qPCR. 
Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.10aNGS10aRNA-seq10aSEQC1 aSu, Z.1 aLabaj, P.P.1 a1 aDopazo, J.1 a1 aMason, C.E.1 aShi, L uhttp://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.2957.html02423nas a2200433 4500008004100000022001400041245008500055210006900140260001500209300001400224490000700238520117700245653001501422653001601437653003101453100002001484700002201504700001901526700001901545700001701564700002201581700001901603700002601622700001801648700002301666700002201689700002101711700002001732700001801752700002001770700003001790700002301820700002501843700002001868700002001888700001301908700002001921856004801941 2014 eng d a1538-744500aA Comprehensive DNA Methylation Profile of Epithelial-to-Mesenchymal Transition.0 aComprehensive DNA Methylation Profile of EpithelialtoMesenchymal c2014 Aug 8 a5608–190 v743 aEpithelial-to-mesenchymal transition (EMT) is a plastic process in which fully differentiated epithelial cells are converted into poorly differentiated, migratory and invasive mesenchymal cells and it has been related to the metastasis potential of tumors. This is a reversible process and cells can also eventually undergo mesenchymal-to-epithelial transition (MET). The existence of a dynamic EMT process suggests the involvement of epigenetic shifts in the phenotype. Herein, we obtained the DNA methylomes at single-base resolution of MDCK cells undergoing epithelial-to-mesenchymal transition (EMT) and translated the identified differentially methylated regions (DMRs) to human breast cancer cells undergoing a gain of migratory and invasive capabilities associated with the EMT phenotype. 
We noticed dynamic and reversible changes of DNA methylation, both on promoter sequences and gene-bodies in association with transcription regulation of EMT-related genes. Most importantly, the identified DNA methylation markers of EMT were present in primary mammary tumors in association with the epithelial or the mesenchymal phenotype of the studied breast cancer samples.10aMethyl-Seq10aMethylomics10aNext Generation Sequencing1 aCarmona, Javier1 aDavalos, Veronica1 aVidal, Enrique1 aGomez, Antonio1 aHeyn, Holger1 aHashimoto, Yutaka1 aVizoso, Miguel1 aMartinez-Cardus, Anna1 aSayols, Sergi1 aFerreira, Humberto1 aSanchez-Mut, Jose1 aMoran, Sebastian1 aMargeli, Mireia1 aCastella, Eva1 aBerdasco, Maria1 aStefansson, Olafur, Andri1 aEyfjord, Jorunn, E1 aGonzalez-Suarez, Eva1 aDopazo, Joaquin1 aOrozco, Modesto1 aGut, Ivo1 aEsteller, Manel uhttp://www.ncbi.nlm.nih.gov/pubmed/2510642702299nas a2200253 4500008004100000022001400041245013400055210006900189260001600258520136900274100003001643700002601673700002901699700002201728700001801750700003401768700001901802700002001821700001801841700002101859700002101880700001701901856012701918 2013 eng d a1949-255300aCapturing the biological impact of CDKN2A and MC1R genes as an early predisposing event in melanoma and non melanoma skin cancer.0 aCapturing the biological impact of CDKN2A and MC1R genes as an e c2013 Dec 163 aGermline mutations in CDKN2A and/or red hair color variants in MC1R genes are associated with an increased susceptibility to develop cutaneous melanoma or non melanoma skin cancer. We studied the impact of the CDKN2A germinal mutation p.G101W and MC1R variants on gene expression and transcription profiles associated with skin cancer. To this end we set-up primary skin cell co-cultures from siblings of melanoma prone-families that were later analyzed using the expression array approach. 
As a result, we found that 1535 transcripts were deregulated in CDKN2A mutated cells, with over-expression of immunity-related genes (HLA-DPB1, CLEC2B, IFI44, IFI44L, IFI27, IFIT1, IFIT2, SP110 and IFNK) and down-regulation of genes playing a role in the Notch signaling pathway. 3570 transcripts were deregulated in MC1R variant carriers. In particular, genes related to oxidative stress and DNA damage pathways were up-regulated as well as genes associated with neurodegenerative diseases such as Parkinson’s, Alzheimer and Huntington. Finally, we observed that the expression signatures indentified in phenotypically normal cells carrying CDKN2A mutations or MC1R variants are maintained in skin cancer tumors (melanoma and squamous cell carcinoma). These results indicate that transcriptome deregulation represents an early event critical for skin cancer development.1 aPuig-Butille, Joan, Anton1 aEscamez, Maria, José1 aGarcia-Garcia, Francisco1 aTell-Marti, Gemma1 aFabra, Angels1 aMartínez-Santamaría, Lucía1 aBadenas, Celia1 aAguilera, Paula1 aPevida, Marta1 aDopazo, Joaquín1 aDel Rio, Marcela1 aPuig, Susana uhttp://www.impactjournals.com/oncotarget/index.php?journal=oncotarget&page=article&op=view&path%5B%5D=1444&path%5B%5D=182402076nas a2200241 4500008004100000022001400041245014000055210006900195260001300264300001200277490000700289520129000296100001701586700002301603700002401626700002401650700002401674700001901698700001901717700002001736700002001756856005801776 2012 eng d a1362-496200aCellBase, a comprehensive collection of RESTful web services for retrieving relevant biological information from heterogeneous sources.0 aCellBase a comprehensive collection of RESTful web services for c2012 Jul aW609-140 v403 aDuring the past years, the advances in high-throughput technologies have produced an unprecedented growth in the number and size of repositories and databases storing relevant biological data. 
Today, there is more biological information than ever but, unfortunately, the current status of many of these repositories is far from being optimal. Some of the most common problems are that the information is spread out in many small databases; frequently there are different standards among repositories and some databases are no longer supported or they contain too specific and unconnected information. In addition, data size is increasingly becoming an obstacle when accessing or storing biological data. All these issues make very difficult to extract and integrate information from different sources, to analyze experiments or to access and query this information in a programmatic way. CellBase provides a solution to the growing necessity of integration by easing the access to biological data. CellBase implements a set of RESTful web services that query a centralized database containing the most relevant biological data sources. The database is hosted in our servers and is regularly updated. CellBase documentation can be found at http://docs.bioinfo.cipf.es/projects/cellbase.1 aBleda, Marta1 aTárraga, Joaquín1 aDe Maria, Alejandro1 aSalavert, Francisco1 aGarcía-Alonso, Luz1 aCelma, Matilde1 aMartin, Ainoha1 aDopazo, Joaquin1 aMedina, Ignacio uhttp://nar.oxfordjournals.org/content/40/W1/W609.long02778nas a2200385 4500008004100000245011100041210006900152260001300221300001000234490000700244520152200251100002301773700002601796700001901822700002401841700002701865700002901892700002001921700002001941700002801961700001901989700002302008700002402031700002002055700001602075700002302091700001902114700002002133700002002153700002002173700002002193700002502213700002302238856013102261 2010 eng d00aChanges in the pattern of DNA methylation associate with twin discordance in systemic lupus erythematosus.0 aChanges in the pattern of DNA methylation associate with twin di c2010 Feb a170-90 v203 a

Monozygotic (MZ) twins are partially concordant for most complex diseases, including autoimmune disorders. Whereas phenotypic concordance can be used to study heritability, discordance suggests the role of non-genetic factors. In autoimmune diseases, environmentally driven epigenetic changes are thought to contribute to their etiology. Here we report the first high-throughput and candidate sequence analyses of DNA methylation to investigate discordance for autoimmune disease in twins. We used a cohort of MZ twins discordant for three diseases whose clinical signs often overlap: systemic lupus erythematosus (SLE), rheumatoid arthritis, and dermatomyositis. Only MZ twins discordant for SLE featured widespread changes in the DNA methylation status of a significant number of genes. Gene ontology analysis revealed enrichment in categories associated with immune function. Individual analysis confirmed the existence of DNA methylation and expression changes in genes relevant to SLE pathogenesis. These changes occurred in parallel with a global decrease in the 5-methylcytosine content that was concomitantly accompanied with changes in DNA methylation and expression levels of ribosomal RNA genes, although no changes in repetitive sequences were found. Our findings not only identify potentially relevant DNA methylation markers for the clinical characterization of SLE patients but also support the notion that epigenetic changes may be critical in the clinical manifestations of autoimmune disease.

1 aJavierre, Biola, M1 aFernandez, Agustin, F1 aRichter, Julia1 aAl-Shahrour, Fatima1 aMartin-Subero, Ignacio1 aRodriguez-Ubreva, Javier1 aBerdasco, Maria1 aFraga, Mario, F1 aO’Hanlon, Terrance, P1 aRider, Lisa, G1 aJacinto, Filipe, V1 aLopez-Longo, Javier1 aDopazo, Joaquin1 aForn, Marta1 aPeinado, Miguel, A1 aCarreño, Luis1 aSawalha, Amr, H1 aHarley, John, B1 aSiebert, Reiner1 aEsteller, Manel1 aMiller, Frederick, W1 aBallestar, Esteban uhttps://www.clinbioinfosspa.es/content/changes-pattern-dna-methylation-associate-twin-discordance-systemic-lupus-erythematosus02659nas a2200253 4500008004100000245010800041210006900149300001000218490000700228520165100235653006101886653019201947100001402139700001302153700001302166700001802179700001702197700001502214700002002229700001602249700001502265700001902280856010602299 2008 eng d00aCLEAR-test: combining inference for differential expression and variability in microarray data analysis0 aCLEARtest combining inference for differential expression and va a33-450 v413 a

A common goal of microarray experiments is to detect genes that are differentially expressed under distinct experimental conditions. Several statistical tests have been proposed to determine whether the observed changes in gene expression are significant. The t-test assigns a score to each gene on the basis of changes in its expression relative to its estimated variability, in such a way that genes with a higher score (in absolute values) are more likely to be significant. Most variants of the t-test use the complete set of genes to influence the variance estimate for each single gene. However, no inference is made in terms of the variability itself. Here, we highlight the problem of low observed variances in the t-test, when genes with relatively small changes are declared differentially expressed. Alternatively, the z-test could be used although, unlike the t-test, it can declare differentially expressed genes with high observed variances. To overcome this, we propose to combine the z-test, which focuses on large changes, with a chi(2) test to evaluate variability. We call this procedure CLEAR-test and we provide a combined p-value that offers a compromise between both aspects. Analysis of three publicly available microarray datasets reveals the greater performance of the CLEAR-test relative to the t-test and alternative methods. Finally, empirical and simulated data analyses demonstrate the greater reproducibility and statistical power of the CLEAR-test and z-test with respect to current alternative methods. In addition, the CLEAR-test improves the z-test by capturing reproducible genes with high variability.

10a*Algorithms Artificial Intelligence *Data Interpretation10aStatistical Gene Expression Profiling/*methods Gene Expression Regulation/*physiology Oligonucleotide Array Sequence Analysis/*methods Proteome/*metabolism Signal Transduction/*physiology1 aValls, J.1 aGrau, M.1 aSole, X.1 aHernandez, P.1 aMontaner, D.1 aDopazo, J.1 aPeinado, M., A.1 aCapella, G.1 aMoreno, V.1 aPujana, M., A. uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1759700900426nas a2200109 4500008004100000245006200041210006100103260004700164490000800211100001800219856007900237 2008 eng d00aComparative genomics-based prediction of protein function0 aComparative genomicsbased prediction of protein function bM. Starkey and R. Elaswarapu, Humana press0 v4391 aGabaldón, T. uhttp://www.springerprotocols.com/Abstract/doi/10.1007/978-1-59745-188-8_2603014nas a2200229 4500008004100000245012600041210006900167300001200236490000700248520187400255653014702129653025902276100002302535700001602558700001502574700002002589700001802609700002002627700001702647700001402664856010602678 2008 eng d00aControlled ovarian stimulation induces a functional genomic delay of the endometrium with potential clinical implications0 aControlled ovarian stimulation induces a functional genomic dela a4500-100 v933 a

CONTEXT: Controlled ovarian stimulation induces morphological, biochemical, and functional genomic modifications of the human endometrium during the window of implantation. OBJECTIVE: Our objective was to compare the gene expression profile of the human endometrium in natural vs. controlled ovarian stimulation cycles throughout the early-mid secretory transition using microarray technology. METHOD: Microarray data from 49 endometrial biopsies obtained from LH+1 to LH+9 (n=25) in natural cycles and from human chorionic gonadotropin (hCG) +1 to hCG+9 in controlled ovarian stimulation cycles (n=24) were analyzed using different methods, such as clustering, profiling of biological processes, and selection of differentially expressed genes, as implemented in Gene Expression Pattern Analysis Suite and Babelomics programs. RESULTS: Endometria from natural cycles followed different genomic patterns compared with controlled ovarian stimulation cycles in the transition from the pre-receptive (days LH/hCG+1 until LH/hCG+5) to the receptive phase (day LH+7/hCG+7). Specifically, we have demonstrated the existence of a 2-d delay in the activation/repression of two clusters composed by 218 and 133 genes, respectively, on day hCG+7 vs. LH+7. Many of these delayed genes belong to the class window of implantation genes affecting basic biological processes in the receptive endometrium. CONCLUSIONS: These results demonstrate that gene expression profiling of the endometrium is different between natural and controlled ovarian stimulation cycles in the receptive phase. Identification of these differentially regulated genes can be used to understand the different developmental profiles of receptive endometrium during controlled ovarian stimulation and to search for the best controlled ovarian stimulation treatment in terms of minimal endometrial impact.

10aAlgorithms Chorionic Gonadotropin/genetics Endometrium/cytology/pathology/*physiology/physiopathology Female Gene Expression Regulation Genome10aHuman Glutathione Peroxidase/genetics Humans Insulin-Like Growth Factor Binding Proteins/genetics Luteal Phase/physiology Luteinizing Hormone/genetics Menstrual Cycle Oligonucleotide Array Sequence Analysis Ovulation Induction/*methods RNA/genetics/isola1 aHorcajadas, J., A.1 aMinguez, P.1 aDopazo, J.1 aEsteban, F., J.1 aDominguez, F.1 aGiudice, L., C.1 aPellicer, A.1 aSimon, C. uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1869787000550nas a2200157 4500008004100000245007400041210006900115260002300184300001200207100001800219700001200237700001600249700001600265700001300281856009800294 2008 eng d00aThe core of a minimal gene set: insights from natural reduced genomes0 acore of a minimal gene set insights from natural reduced genomes aUSAbThe MIT Press a347-3661 aGabaldón, T.1 aGil, R.1 aPeretó, J.1 aLatorre, A.1 aMoya, A. uhttps://www.clinbioinfosspa.es/content/core-minimal-gene-set-insights-natural-reduced-genomes02608nas a2200217 4500008004100000245009500041210006900136300001200205490000600217520172500223653008401948653002102032653012902053653002102182100001602203700001302219700001402232700002402246700001402270856010602284 2007 eng d00aCharacterization of protein hubs by inferring interacting motifs from protein interactions0 aCharacterization of protein hubs by inferring interacting motifs a1761-710 v33 aThe characterization of protein interactions is essential for understanding biological systems. While genome-scale methods are available for identifying interacting proteins, they do not pinpoint the interacting motifs (e.g., a domain, sequence segments, a binding site, or a set of residues). Here, we develop and apply a method for delineating the interacting motifs of hub proteins (i.e., highly connected proteins). 
The method relies on the observation that proteins with common interaction partners tend to interact with these partners through a common interacting motif. The sole input for the method are binary protein interactions; neither sequence nor structure information is needed. The approach is evaluated by comparing the inferred interacting motifs with domain families defined for 368 proteins in the Structural Classification of Proteins (SCOP). The positive predictive value of the method for detecting proteins with common SCOP families is 75% at sensitivity of 10%. Most of the inferred interacting motifs were significantly associated with sequence patterns, which could be responsible for the common interactions. We find that yeast hubs with multiple interacting motifs are more likely to be essential than hubs with one or two interacting motifs, thus rationalizing the previously observed correlation between essentiality and the number of interacting partners of a protein. We also find that yeast hubs with multiple interacting motifs evolve slower than the average protein, contrary to the hubs with one or two interacting motifs. The proposed method will help us discover unknown interacting motifs and provide biological insights about protein hubs and their roles in interaction networks.10aAmino Acid Motifs Amino Acid Sequence Binding Sites Computer Simulation *Models10aChemical *Models10aMolecular Molecular Sequence Data Protein Binding Protein Interaction Mapping/*methods Proteins/*chemistry Sequence Analysis10aProtein/*methods1 aAragues, R.1 aSali, A.1 aBonet, J.1 aMarti-Renom, M., A.1 aOliva, B. uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1794170500428nas a2200097 4500008004100000245005700041210005400098260007600152100001500228856008700243 2007 eng d00aClustering - Class discovery in the post-genomic era0 aClustering Class discovery in the postgenomic era aNew York, USAbSpringer-Verlag, W. Dubitzky, M. Granzow and D.P. 
Berrar1 aDopazo, J. uhttps://www.clinbioinfosspa.es/content/clustering-class-discovery-post-genomic-era02005nas a2200253 4500008004100000245005800041210005800099300001300157490001400170520105600184653008801240653002101328653012901349653003101478100001401509700001301523700002401536700002401560700001601584700001701600700001501617700001301632856010601645 2006 eng d00aComparative protein structure modeling using Modeller0 aComparative protein structure modeling using Modeller aUnit 5 60 vChapter 53 aFunctional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.10aAlgorithms Amino Acid Sequence Computer Simulation Crystallography/*methods *Models10aChemical *Models10aMolecular Molecular Sequence Data Protein Conformation Protein Folding Proteins/*chemistry/*ultrastructure Sequence Analysis10aProtein/*methods *Software1 aEswar, N.1 aWebb, B.1 aMarti-Renom, M., A.1 aMadhusudhan, M., S.1 aEramian, D.1 aShen, M., Y.1 aPieper, U.1 aSali, A. 
uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1842876702131nas a2200217 4500008004100000245007200041210006900113300001200182490000700194520144000201653001201641653002101653653003601674100001601710700001701726700001401743700001301757700001301770700002401783856010601807 2006 eng d00aA composite score for predicting errors in protein structure models0 acomposite score for predicting errors in protein structure model a1653-660 v153 aReliable prediction of model accuracy is an important unsolved problem in protein structure modeling. To address this problem, we studied 24 individual assessment scores, including physics-based energy functions, statistical potentials, and machine learning-based scoring functions. Individual scores were also used to construct approximately 85,000 composite scoring functions using support vector machine (SVM) regression. The scores were tested for their abilities to identify the most native-like models from a set of 6000 comparative models of 20 representative protein structures. Each of the 20 targets was modeled using a template of <30% sequence identity, corresponding to challenging comparative modeling cases. The best SVM score outperformed all individual scores by decreasing the average RMSD difference between the model identified as the best of the set and the model with the lowest RMSD (DeltaRMSD) from 0.63 A to 0.45 A, while having a higher Pearson correlation coefficient to RMSD (r=0.87) than any other tested score. The most accurate score is based on a combination of the DOPE non-hydrogen atom statistical potential; surface, contact, and combined statistical potentials from MODPIPE; and two PSIPRED/DSSP scores. 
It was implemented in the SVMod program, which can now be applied to select the final model in various modeling problems, including fold assignment, target-template alignment, and loop modeling.10a*Models10aMolecular Models10aTheoretical Proteins/*chemistry1 aEramian, D.1 aShen, M., Y.1 aDevos, D.1 aMelo, F.1 aSali, A.1 aMarti-Renom, M., A. uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1675160601615nas a2200133 4500008004100000245008900041210006900130300001200199490000800211520100500219653013301224100001801357856010601375 2006 eng d00aComputational approaches for the prediction of protein function in the mitochondrion0 aComputational approaches for the prediction of protein function aC1121-80 v2913 aUnderstanding a complex biological system, such as the mitochondrion, requires the identification of the complete repertoire of proteins targeted to the organelle, the characterization of these, and finally, the elucidation of the functional and physical interactions that occur within the mitochondrion. In the last decade, significant developments have contributed to increase our understanding of the mitochondrion, and among these, computational research has played a significant role. Not only general bioinformatics tools have been applied in the context of the mitochondrion, but also some computational techniques have been specifically developed to address problems that arose from within the mitochondrial research field. In this review the contribution of bioinformatics to mitochondrial biology is addressed through a survey of current computational methods that can be applied to predict which proteins will be localized to the mitochondrion and to unravel their functional interactions.10a*Computational Biology *Computer Simulation Humans Mitochondria/*metabolism Mitochondrial Proteins/genetics/*metabolism Mutation1 aGabaldón, T. 
uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1687083001615nas a2200181 4500008004100000245012100041210006900162300001000231490000800241520089300249653003301142653008301175100001901258700001901277700001801296700001301314856010601327 2005 eng d00aCombining data from genomes, Y2H and 3D structure indicates that BolA is a reductase interacting with a glutaredoxin0 aCombining data from genomes Y2H and 3D structure indicates that a591-60 v5793 aGenomes, functional genomics data and 3D structure reflect different aspects of protein function. Here, we combine these data to predict that BolA, a widely distributed protein family with unknown function, is a reductase that interacts with a glutaredoxin. Comparisons at the 3D structure level as well as at the sequence profile level indicate homology between BolA and OsmC, an enzyme that reduces organic peroxides. Complementary to this, comparative analyses of genomes and genomics data provide strong evidence of an interaction between BolA and the mono-thiol glutaredoxin family. The interaction between BolA and a mono-thiol glutaredoxin is of particular interest because BolA does not, in contrast to its homolog OsmC, have evolutionarily conserved cysteines to provide it with reducing equivalents. We propose that BolA uses the mono-thiol glutaredoxin as the source for these.10a*Genome Glutaredoxins Models10aMolecular Oxidoreductases/chemistry/*metabolism Phylogeny Protein Conformation1 aHuynen, M., A.1 aSpronk, C., A.1 aGabaldón, T.1 aSnel, B. 
uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1567081302167nas a2200301 4500008004100000245008600041210006900127300001100196490000700207520100200214653012701216653002601343653013701369653003901506653001801545100002001563700001901583700001901602700001901621700001401640700002401654700001701678700001801695700001301713700001901726700001401745856010601759 2005 eng d00aThe C-type lectin fold as an evolutionary solution for massive sequence variation0 aCtype lectin fold as an evolutionary solution for massive sequen a886-920 v123 aOnly few instances are known of protein folds that tolerate massive sequence variation for the sake of binding diversity. The most extensively characterized is the immunoglobulin fold. We now add to this the C-type lectin (CLec) fold, as found in the major tropism determinant (Mtd), a retroelement-encoded receptor-binding protein of Bordetella bacteriophage. Variation in Mtd, with its approximately 10(13) possible sequences, enables phage adaptation to Bordetella spp. Mtd is an intertwined, pyramid-shaped trimer, with variable residues organized by its CLec fold into discrete receptor-binding sites. The CLec fold provides a highly static scaffold for combinatorial display of variable residues, probably reflecting a different evolutionary solution for balancing diversity against stability from that in the immunoglobulin fold. 
Mtd variants are biased toward the receptor pertactin, and there is evidence that the CLec fold is used broadly for sequence variation by related retroelements.10aAmino Acid Sequence Bacterial Outer Membrane Proteins/*chemistry Bacteriophages/*metabolism Bordetella/*virology Evolution10aBordetella/*chemistry10aC-Type/*chemistry Molecular Sequence Data Protein Conformation Protein Folding Viral Proteins/*chemistry/*genetics Virulence Factors10aMolecular Genetic Variation Genome10aViral Lectins1 aMcMahon, S., A.1 aMiller, J., L.1 aLawton, J., A.1 aKerkow, D., E.1 aHodes, A.1 aMarti-Renom, M., A.1 aDoulatov, S.1 aNarayanan, E.1 aSali, A.1 aMiller, J., F.1 aGhosh, P. uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1617032402356nas a2200265 4500008004100000245006200041210006200103300001000165490000700175520124200182653003001424653007301454653007501527653004901602653004201651653004301693653005001736653007501786653005801861100001901919700001601938700001501954700001501969856010601984 2003 eng d00aComparing bacterial genomes through conservation profiles0 aComparing bacterial genomes through conservation profiles a991-80 v133 aWe constructed two-dimensional representations of profiles of gene conservation across different genomes using the genome of Escherichia coli as a model. These profiles permit both the visualization at the genome level of different traits in the organism studied and, at the same time, reveal features related to the genomes analyzed (such as defective genomes or genomes that lack a particular system). Conserved genes are not uniformly distributed along the E. coli genome but tend to cluster together. The study of gene distribution patterns across genomes is important for the understanding of how sets of genes seem to be dependent on each other, probably having some functional link. This provides additional evidence that can be used for the elucidation of the function of unannotated genes. 
Clustering these patterns produces families of genes which can be arranged in a hierarchy of closeness. In this way, functions can be defined at different levels of generality depending on the level of the hierarchy that is studied. The combined study of conservation and phenotypic traits opens up the possibility of defining phenotype/genotype associations, and ultimately inferring the gene or genes responsible for a particular trait.10aBacterial Genotype Models10aBacterial/genetics Cluster Analysis Conserved Sequence/*genetics DNA10aBacterial/genetics Escherichia coli/classification/*genetics Evolution10aBacterial/genetics Gene Order/genetics Genes10aBacterial/genetics/physiology *Genome10aChromosome Mapping/methods Chromosomes10aGenetic Phenotype Phylogeny Sequence Homology10aMolecular Gene Expression Profiling/methods Gene Expression Regulation10aNucleic Acid Species Specificity Terminology as Topic1 aMartin, M., J.1 aHerrero, J.1 aMateos, A.1 aDopazo, J. uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1269532402286nas a2200193 4500008004100000245009100041210006900132300001100201490000700212520140500219653025701624653001201881100001501893700001501908700001901923700002701942700001701969856010601986 2002 eng d00aCalnexin overexpression increases manganese peroxidase production in Aspergillus niger0 aCalnexin overexpression increases manganese peroxidase productio a846-510 v683 aHeme-containing peroxidases from white rot basidiomycetes, in contrast to most proteins of fungal origin, are poorly produced in industrial filamentous fungal strains. Factors limiting peroxidase production are believed to operate at the posttranslational level. In particular, insufficient availability of the prosthetic group which is required for peroxidase biosynthesis has been proposed to be an important bottleneck. 
In this work, we analyzed the role of two components of the secretion pathway, the chaperones calnexin and binding protein (BiP), in the production of a fungal peroxidase. Expression of the Phanerochaete chrysosporium manganese peroxidase (MnP) in Aspergillus niger resulted in an increase in the expression level of the clxA and bipA genes. In a heme-supplemented medium, where MnP was shown to be overproduced to higher levels, induction of clxA and bipA was also higher. Overexpression of these two chaperones in an MnP-producing strain was analyzed for its effect on MnP production. Whereas bipA overexpression seriously reduced MnP production, overexpression of calnexin resulted in a four- to fivefold increase in the extracellular MnP levels. However, when additional heme was provided in the culture medium, calnexin overexpression had no synergistic effect on MnP production. The possible function of these two chaperones in MnP maturation and production is discussed.10aAspergillus niger/*enzymology/genetics Calcium-Binding Proteins/*metabolism Calnexin Culture Media *Fungal Proteins HSP70 Heat-Shock Proteins/metabolism Heme/metabolism Peroxidases/*biosynthesis/genetics Phanerochaete/enzymology/genetics Transformation10aGenetic1 aConesa, A.1 aJeenes, D.1 aArcher, D., B.1 avan den Hondel, C., A.1 aPunt, P., J. uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1182322701191nas a2200157 4500008004100000245011600041210006900157300001100226490000600237520046000243653007400703653011900777100001600896700001500912856010600927 2002 eng d00aCombining hierarchical clustering and self-organizing maps for exploratory analysis of gene expression patterns0 aCombining hierarchical clustering and selforganizing maps for ex a467-700 v13 aSelf-organizing maps (SOM) constitute an alternative to classical clustering methods because of its linear run times and superior performance to deal with noisy data. 
Nevertheless, the clustering obtained with SOM is dependent on the relative sizes of the clusters. Here, we show how the combination of SOM with hierarchical clustering methods constitutes an excellent tool for exploratory analysis of massive data like DNA microarray expression patterns.10aCluster Analysis Computational Biology/methods *Gene Expression Genes10aFungal/genetics *Genome Oligonucleotide Array Sequence Analysis/*methods Statistics as Topic/*methods Time Factors1 aHerrero, J.1 aDopazo, J. uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1264591902134nas a2200241 4500008004100000245005900041210005800100300001100158490000700169520128500176653008301461653003201544653003201576653005801608100001601666700001301682700002401695700001401719700001901733700001901752700001501771856010601786 2001 eng d00aClassification of protein disulphide-bridge topologies0 aClassification of protein disulphidebridge topologies a477-870 v153 aThe preferential occurrence of certain disulphide-bridge topologies in proteins has prompted us to design a method and a program, KNOT-MATCH, for their classification. The program has been applied to a database of proteins with less than 65% homology and more than two disulphide bridges. We have investigated whether there are topological preferences that can be used to group proteins and if these can be applied to gain insight into the structural or functional relationships among them. The classification has been performed by Density Search and Hierarchical Clustering Techniques, yielding thirteen main protein classes from the superimposition and clustering process. It is noteworthy that besides the disulphide bridges, regular secondary structures and loops frequently become correctly aligned. 
Although the lack of significant sequence similarity among some clustered proteins precludes the easy establishment of evolutionary relationships, the program permits us to find out important structural or functional residues upon the superimposition of two protein structures apparently unrelated. The derived classification can be very useful for finding relationships among proteins which would escape detection by current sequence or topology-based analytical algorithms.10aAlgorithms Computer Simulation Databases as Topic Disulfides/*chemistry Models10aMolecular Protein Structure10aSecondary Protein Structure10aTertiary Proteins/*chemistry/*classification Software1 aMas, J., M.1 aAloy, P.1 aMarti-Renom, M., A.1 aOliva, B.1 ade Llorens, R.1 aAviles, F., X.1 aQuerol, E. uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=1139474001688nas a2200169 4500008004100000245010100041210006900142300001100211490000800222520084700230653026001077100001501337700001601352700002701368700001701395856010601412 2001 eng d00aC-terminal propeptide of the Caldariomyces fumago chloroperoxidase: an intramolecular chaperone?0 aCterminal propeptide of the Caldariomyces fumago chloroperoxidas a117-200 v5033 aThe Caldariomyces fumago chloroperoxidase (CPO) is synthesised as a 372-aa precursor which undergoes two proteolytic processing events: removal of a 21-aa N-terminal signal peptide and of a 52-aa C-terminal propeptide. The Aspergillus niger expression system developed for CPO was used to get insight into the function of this C-terminal propeptide. A. niger transformants expressing a CPO protein from which the C-terminal propeptide was deleted failed in producing any extracellular CPO activity, although the CPO polypeptide was synthesised. Expression of the full-length gene in an A. niger strain lacking the KEX2-like protease PclA also resulted in the production of CPO cross-reactive material into the culture medium, but no CPO activity. 
Based on these results, a function of the C-terminal propeptide in CPO maturation is indicated.10aAmino Acid Sequence Ascomycota/*enzymology/genetics Aspergillus niger/genetics Base Sequence Chloride Peroxidase/biosynthesis/*chemistry/genetics DNA Primers/genetics Enzyme Precursors/biosynthesis/chemistry/genetics Gene Expression Molecular Chaperones/b1 aConesa, A.1 aWeelink, G.1 avan den Hondel, C., A.1 aPunt, P., J. uhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11513866