%0 Journal Article
%J Front Genet
%D 2023
%T Editorial: Critical assessment of massive data analysis (CAMDA) annual conference 2021.
%A Łabaj, Paweł P
%A Dopazo, Joaquin
%A Xiao, Wenzhong
%A Kreil, David P
%B Front Genet
%V 14
%P 1154398
%8 2023
%G eng
%R 10.3389/fgene.2023.1154398

%0 Journal Article
%J Nature Genetics
%D 2021
%T The NCI Genomic Data Commons
%A Heath, Allison P.
%A Ferretti, Vincent
%A Agrawal, Stuti
%A An, Maksim
%A Angelakos, James C.
%A Arya, Renuka
%A Bajari, Rosita
%A Baqar, Bilal
%A Barnowski, Justin H. B.
%A Burt, Jeffrey
%A Catton, Ann
%A Chan, Brandon F.
%A Chu, Fay
%A Cullion, Kim
%A Davidsen, Tanja
%A Do, Phuong-My
%A Dompierre, Christian
%A Ferguson, Martin L.
%A Fitzsimons, Michael S.
%A Ford, Michael
%A Fukuma, Miyuki
%A Gaheen, Sharon
%A Ganji, Gajanan L.
%A Garcia, Tzintzuni I.
%A George, Sameera S.
%A Gerhard, Daniela S.
%A Gerthoffert, Francois
%A Gomez, Fauzi
%A Han, Kang
%A Hernandez, Kyle M.
%A Issac, Biju
%A Jackson, Richard
%A Jensen, Mark A.
%A Joshi, Sid
%A Kadam, Ajinkya
%A Khurana, Aishmit
%A Kim, Kyle M. J.
%A Kraft, Victoria E.
%A Li, Shenglai
%A Lichtenberg, Tara M.
%A Lodato, Janice
%A Lolla, Laxmi
%A Martinov, Plamen
%A Mazzone, Jeffrey A.
%A Miller, Daniel P.
%A Miller, Ian
%A Miller, Joshua S.
%A Miyauchi, Koji
%A Murphy, Mark W.
%A Nullet, Thomas
%A Ogwara, Rowland O.
%A Ortuño, Francisco M.
%A Pedrosa, Jesús
%A Pham, Phuong L.
%A Popov, Maxim Y.
%A Porter, James J.
%A Powell, Raymond
%A Rademacher, Karl
%A Reid, Colin P.
%A Rich, Samantha
%A Rogel, Bessie
%A Sahni, Himanso
%A Savage, Jeremiah H.
%A Schmitt, Kyle A.
%A Simmons, Trevar J.
%A Sislow, Joseph
%A Spring, Jonathan
%A Stein, Lincoln
%A Sullivan, Sean
%A Tang, Yajing
%A Thiagarajan, Mathangi
%A Troyer, Heather D.
%A Wang, Chang
%A Wang, Zhining
%A West, Bedford L.
%A Wilmer, Alex
%A Wilson, Shane
%A Wu, Kaman
%A Wysocki, William P.
%A Xiang, Linda
%A Yamada, Joseph T.
%A Yang, Liming
%A Yu, Christine
%A Yung, Christina K.
%A Zenklusen, Jean Claude
%A Zhang, Junjun
%A Zhang, Zhenyu
%A Zhao, Yuanheng
%A Zubair, Ariz
%A Staudt, Louis M.
%A Grossman, Robert L.
%B Nature Genetics
%8 Oct-02-2022
%G eng
%U http://www.nature.com/articles/s41588-021-00791-5
%! Nat Genet
%R 10.1038/s41588-021-00791-5

%0 Journal Article
%J Nature methods
%D 2015
%T Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection.
%A Ewing, Adam D
%A Houlahan, Kathleen E
%A Hu, Yin
%A Ellrott, Kyle
%A Caloian, Cristian
%A Yamaguchi, Takafumi N
%A Bare, J Christopher
%A P’ng, Christine
%A Waggott, Daryl
%A Sabelnykova, Veronica Y
%A Kellen, Michael R
%A Norman, Thea C
%A Haussler, David
%A Friend, Stephen H
%A Stolovitzky, Gustavo
%A Margolin, Adam A
%A Stuart, Joshua M
%A Boutros, Paul C
%E ICGC-TCGA DREAM Somatic Mutation Calling Challenge participants
%E Liu Xi
%E Ninad Dewal
%E Yu Fan
%E Wenyi Wang
%E David Wheeler
%E Andreas Wilm
%E Grace Hui Ting
%E Chenhao Li
%E Denis Bertrand
%E Niranjan Nagarajan
%E Qing-Rong Chen
%E Chih-Hao Hsu
%E Ying Hu
%E Chunhua Yan
%E Warren Kibbe
%E Daoud Meerzaman
%E Kristian Cibulskis
%E Mara Rosenberg
%E Louis Bergelson
%E Adam Kiezun
%E Amie Radenbaugh
%E Anne-Sophie Sertier
%E Anthony Ferrari
%E Laurie Tonton
%E Kunal Bhutani
%E Nancy F Hansen
%E Difei Wang
%E Lei Song
%E Zhongwu Lai
%E Liao, Yang
%E Shi, Wei
%E Carbonell-Caballero, José
%E Joaquín Dopazo
%E Cheryl C K Lau
%E Justin Guinney
%K cancer
%K NGS
%K variant calling
%X The detection of somatic mutations from cancer genome sequences is key to understanding the genetic basis of disease progression, patient survival and response to therapy. Benchmarking is needed for tool assessment and improvement but is complicated by a lack of gold standards, by extensive resource requirements and by difficulties in sharing personal genomic information.
To resolve these issues, we launched the ICGC-TCGA DREAM Somatic Mutation Calling Challenge, a crowdsourced benchmark of somatic mutation detection algorithms. Here we report the BAMSurgeon tool for simulating cancer genomes and the results of 248 analyses of three in silico tumors created with it. Different algorithms exhibit characteristic error profiles, and, intriguingly, false positives show a trinucleotide profile very similar to one found in human tumors. Although the three simulated tumors differ in sequence contamination (deviation from normal cell sequence) and in subclonality, an ensemble of pipelines outperforms the best individual pipeline in all cases. BAMSurgeon is available at https://github.com/adamewing/bamsurgeon/.
%B Nature methods
%8 2015 May 18
%G eng
%U http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3407.html
%R 10.1038/nmeth.3407

%0 Journal Article
%J Nature biotechnology
%D 2015
%T Prediction of human population responses to toxic compounds by a collaborative competition.
%A Eduati, Federica
%A Mangravite, Lara M
%A Wang, Tao
%A Tang, Hao
%A Bare, J Christopher
%A Huang, Ruili
%A Norman, Thea
%A Kellen, Mike
%A Menden, Michael P
%A Yang, Jichen
%A Zhan, Xiaowei
%A Zhong, Rui
%A Xiao, Guanghua
%A Xia, Menghang
%A Abdo, Nour
%A Kosyk, Oksana
%X The ability to computationally predict the effects of toxic compounds on humans could help address the deficiencies of current chemical safety testing. Here, we report the results from a community-based DREAM challenge to predict toxicities of environmental compounds with potential adverse health effects for human populations. We measured the cytotoxicity of 156 compounds in 884 lymphoblastoid cell lines for which genotype and transcriptional data are available as part of the Tox21 1000 Genomes Project.
The challenge participants developed algorithms to predict interindividual variability of toxic response from genomic profiles and population-level cytotoxicity data from structural attributes of the compounds. 179 submitted predictions were evaluated against an experimental data set to which participants were blinded. Individual cytotoxicity predictions were better than random, with modest correlations (Pearson’s r < 0.28), consistent with complex trait genomic prediction. In contrast, predictions of population-level response to different compounds were higher (r < 0.66). The results highlight the possibility of predicting health risks associated with unknown compounds, although risk estimation accuracy remains suboptimal.
%B Nature biotechnology
%8 2015 Aug 10
%G eng
%U http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3299.html
%R 10.1038/nbt.3299

%0 Journal Article
%J Nature communications
%D 2014
%T Assessing technical performance in differential gene expression experiments with external spike-in RNA control ratio mixtures.
%A Munro, Sarah A
%A Lund, Steven P
%A Pine, P Scott
%A Binder, Hans
%A Clevert, Djork-Arné
%A Ana Conesa
%A Dopazo, Joaquin
%A Fasold, Mario
%A Hochreiter, Sepp
%A Hong, Huixiao
%A Jafari, Nadereh
%A Kreil, David P
%A Labaj, Paweł P
%A Li, Sheng
%A Liao, Yang
%A Lin, Simon M
%A Meehan, Joseph
%A Mason, Christopher E
%A Santoyo-López, Javier
%A Setterquist, Robert A
%A Shi, Leming
%A Shi, Wei
%A Smyth, Gordon K
%A Stralis-Pavese, Nancy
%A Su, Zhenqiang
%A Tong, Weida
%A Wang, Charles
%A Wang, Jian
%A Xu, Joshua
%A Ye, Zhan
%A Yang, Yong
%A Yu, Ying
%A Salit, Marc
%K RNA-seq
%X There is a critical need for standard approaches to assess, report and compare the technical performance of genome-scale differential gene expression experiments. Here we assess technical performance with a proposed standard ‘dashboard’ of metrics derived from analysis of external spike-in RNA control ratio mixtures.
These control ratio mixtures with defined abundance ratios enable assessment of diagnostic performance of differentially expressed transcript lists, limit of detection of ratio (LODR) estimates and expression ratio variability and measurement bias. The performance metrics suite is applicable to analysis of a typical experiment, and here we also apply these metrics to evaluate technical performance among laboratories. An interlaboratory study using identical samples shared among 12 laboratories with three different measurement processes demonstrates generally consistent diagnostic power across 11 laboratories. Ratio measurement variability and bias are also comparable among laboratories for the same measurement process. We observe different biases for measurement processes using different mRNA-enrichment protocols.
%B Nature communications
%V 5
%P 5125
%8 2014
%G eng
%U http://www.nature.com/ncomms/2014/140925/ncomms6125/full/ncomms6125.html
%R 10.1038/ncomms6125

%0 Journal Article
%J Nature biotechnology
%D 2010
%T The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models.
%A Shi, Leming
%A Campbell, Gregory
%A Jones, Wendell D
%A Campagne, Fabien
%A Wen, Zhining
%A Walker, Stephen J
%A Su, Zhenqiang
%A Chu, Tzu-Ming
%A Goodsaid, Federico M
%A Pusztai, Lajos
%A Shaughnessy, John D
%A Oberthuer, André
%A Thomas, Russell S
%A Paules, Richard S
%A Fielden, Mark
%A Barlogie, Bart
%A Chen, Weijie
%A Du, Pan
%A Fischer, Matthias
%A Furlanello, Cesare
%A Gallas, Brandon D
%A Ge, Xijin
%A Megherbi, Dalila B
%A Symmans, W Fraser
%A Wang, May D
%A Zhang, John
%A Bitter, Hans
%A Brors, Benedikt
%A Bushel, Pierre R
%A Bylesjo, Max
%A Chen, Minjun
%A Cheng, Jie
%A Cheng, Jing
%A Chou, Jeff
%A Davison, Timothy S
%A Delorenzi, Mauro
%A Deng, Youping
%A Devanarayan, Viswanath
%A Dix, David J
%A Dopazo, Joaquin
%A Dorff, Kevin C
%A Elloumi, Fathi
%A Fan, Jianqing
%A Fan, Shicai
%A Fan, Xiaohui
%A Fang, Hong
%A Gonzaludo, Nina
%A Hess, Kenneth R
%A Hong, Huixiao
%A Huan, Jun
%A Irizarry, Rafael A
%A Judson, Richard
%A Juraeva, Dilafruz
%A Lababidi, Samir
%A Lambert, Christophe G
%A Li, Li
%A Li, Yanen
%A Li, Zhen
%A Lin, Simon M
%A Liu, Guozhen
%A Lobenhofer, Edward K
%A Luo, Jun
%A Luo, Wen
%A McCall, Matthew N
%A Nikolsky, Yuri
%A Pennello, Gene A
%A Perkins, Roger G
%A Philip, Reena
%A Popovici, Vlad
%A Price, Nathan D
%A Qian, Feng
%A Scherer, Andreas
%A Shi, Tieliu
%A Shi, Weiwei
%A Sung, Jaeyun
%A Thierry-Mieg, Danielle
%A Thierry-Mieg, Jean
%A Thodima, Venkata
%A Trygg, Johan
%A Vishnuvajjala, Lakshmi
%A Wang, Sue Jane
%A Wu, Jianping
%A Wu, Yichao
%A Xie, Qian
%A Yousef, Waleed A
%A Zhang, Liang
%A Zhang, Xuegong
%A Zhong, Sheng
%A Zhou, Yiming
%A Zhu, Sheng
%A Arasappan, Dhivya
%A Bao, Wenjun
%A Lucas, Anne Bergstrom
%A Berthold, Frank
%A Brennan, Richard J
%A Buness, Andreas
%A Catalano, Jennifer G
%A Chang, Chang
%A Chen, Rong
%A Cheng, Yiyu
%A Cui, Jian
%A Czika, Wendy
%A Demichelis, Francesca
%A Deng, Xutao
%A Dosymbekov, Damir
%A Eils, Roland
%A Feng, Yang
%A Fostel, Jennifer
%A Fulmer-Smentek, Stephanie
%A Fuscoe, James C
%A Gatto, Laurent
%A Ge, Weigong
%A Goldstein, Darlene R
%A Guo, Li
%A Halbert, Donald N
%A Han, Jing
%A Harris, Stephen C
%A Hatzis, Christos
%A Herman, Damir
%A Huang, Jianping
%A Jensen, Roderick V
%A Jiang, Rui
%A Johnson, Charles D
%A Jurman, Giuseppe
%A Kahlert, Yvonne
%A Khuder, Sadik A
%A Kohl, Matthias
%A Li, Jianying
%A Li, Li
%A Li, Menglong
%A Li, Quan-Zhen
%A Li, Shao
%A Li, Zhiguang
%A Liu, Jie
%A Liu, Ying
%A Liu, Zhichao
%A Meng, Lu
%A Madera, Manuel
%A Martinez-Murillo, Francisco
%A Medina, Ignacio
%A Meehan, Joseph
%A Miclaus, Kelci
%A Moffitt, Richard A
%A Montaner, David
%A Mukherjee, Piali
%A Mulligan, George J
%A Neville, Padraic
%A Nikolskaya, Tatiana
%A Ning, Baitang
%A Page, Grier P
%A Parker, Joel
%A Parry, R Mitchell
%A Peng, Xuejun
%A Peterson, Ron L
%A Phan, John H
%A Quanz, Brian
%A Ren, Yi
%A Riccadonna, Samantha
%A Roter, Alan H
%A Samuelson, Frank W
%A Schumacher, Martin M
%A Shambaugh, Joseph D
%A Shi, Qiang
%A Shippy, Richard
%A Si, Shengzhu
%A Smalter, Aaron
%A Sotiriou, Christos
%A Soukup, Mat
%A Staedtler, Frank
%A Steiner, Guido
%A Stokes, Todd H
%A Sun, Qinglan
%A Tan, Pei-Yi
%A Tang, Rong
%A Tezak, Zivana
%A Thorn, Brett
%A Tsyganova, Marina
%A Turpaz, Yaron
%A Vega, Silvia C
%A Visintainer, Roberto
%A von Frese, Juergen
%A Wang, Charles
%A Wang, Eric
%A Wang, Junwei
%A Wang, Wei
%A Westermann, Frank
%A Willey, James C
%A Woods, Matthew
%A Wu, Shujian
%A Xiao, Nianqing
%A Xu, Joshua
%A Xu, Lei
%A Yang, Lun
%A Zeng, Xiao
%A Zhang, Jialu
%A Zhang, Li
%A Zhang, Min
%A Zhao, Chen
%A Puri, Raj K
%A Scherf, Uwe
%A Tong, Weida
%A Wolfinger, Russell D
%X Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
%B Nature biotechnology
%V 28
%P 827-38
%8 2010 Aug
%G eng
%U http://www.nature.com/nbt/journal/v28/n8/full/nbt.1665.html