A collection of software packages and publications for ChIP-seq and other NGS analysis

(Most of this content was adapted from the SEQanswers forum (thanks to SEQanswers members sci_guy and ECO, who organized the list of tools) and the Bioinformatics NGS virtual issue.)

 

Integrated solutions
# Galaxy - Galaxy = interactive and reproducible genomics. A job web portal. Paper link
# PIQA - Pipeline for Illumina G1 Genome Analyzer Data Quality Assessment. Paper link
# SHORE - SHORE, for Short Read, is a mapping and analysis pipeline for short DNA sequences produced on an Illumina Genome Analyzer. A suite created by the 1001 Genomes project. Source for POSIX. Paper link
# ShortRead - A Bioconductor package for input, quality assessment, and exploration of high throughput sequence data. Paper link

 

ChIP-Seq and other counting related NGS analysis
# BS-Seq - The source code and data used by paper "Shotgun Bisulphite Sequencing of the Arabidopsis Genome Reveals DNA Methylation Patterning". POSIX. Paper link
# chipseq - A Bioconductor package for analyzing ChIP-seq data
# ChIPmeta - Hierarchical hidden Markov model with application to joint analysis of ChIP-chip and ChIP-seq data. Paper link
# ChIPSeq - Program used by paper “Genome-Wide Mapping of in Vivo Protein-DNA Interactions”. Paper link
# ChIPDiff - An HMM approach to genome-wide identification of differential histone modification sites from ChIP-seq data. Paper link
# CisGenome - An integrated software system for analyzing ChIP-chip and ChIP-seq data. Paper link
# CNV-Seq - CNV-seq, a new method to detect copy number variation using high-throughput sequencing. Perl/R. Paper link
# FindPeaks - Performs analysis of ChIP-Seq experiments. It uses a naive algorithm for identifying regions of high coverage, which represent Chromatin Immunoprecipitation enrichment of sequence fragments, indicating the location of a bound protein of interest. Java/OS independent. Latest versions available as part of the Vancouver Short Read Analysis Package. Paper link
# F-seq - A feature density estimator for high-throughput sequence tags. Paper link
# MACS - Model-based Analysis for ChIP-Seq. MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. Paper link
# PeakSeq - PeakSeq: Systematic Scoring of ChIP-Seq Experiments Relative to Controls. A two-pass approach for scoring ChIP-Seq data relative to controls. The first pass identifies putative binding sites and compensates for variation in the mappability of sequences across the genome. The second pass filters out sites that are not significantly enriched compared to the normalized input DNA and computes a precise enrichment and significance. C/Perl. Paper link
# QuEST - Quantitative Enrichment of Sequence Tags. From the 2008 publication Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data. (C++). Paper link
# SICER - A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Paper link
# SISSRs - Site Identification from Short Sequence Reads. BED file input. Perl. Paper link
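
The MACS entry in this list mentions a dynamic Poisson distribution for capturing local biases. A minimal sketch of that idea, assuming a candidate region's tag count is tested against the largest of several background rates estimated from control data. The function names, window sizes, and counts here are hypothetical illustrations, not the MACS code or its interface:

```python
import math


def poisson_sf(k, lam):
    """Upper tail P(X >= k) for X ~ Poisson(lam), summed term by term."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if k <= 0:
        return 1.0
    # P(X = k), computed in log space to avoid overflow in k!
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while True:
        total += term
        i += 1
        term *= lam / i  # P(X = i) from P(X = i - 1)
        # Past the mode the terms shrink; stop once they no longer matter.
        if i > lam and term <= total * 1e-17:
            break
    return min(total, 1.0)


def local_lambda(control_counts, genome_rate, window_len):
    """Largest expected tag count for a window of window_len bp.

    control_counts maps a control window size in bp (hypothetical
    sizes such as 1000 or 10000) to the number of control tags seen
    in a window of that size centred on the candidate region.
    genome_rate is the genome-wide tag rate per bp.
    """
    rates = [genome_rate * window_len]
    for size, count in control_counts.items():
        rates.append(count * window_len / size)
    return max(rates)


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    lam = local_lambda({1000: 10, 10000: 20}, genome_rate=0.001, window_len=200)
    print("local lambda:", lam)
    print("p-value for 12 tags:", poisson_sf(12, lam))
```

Taking the maximum over the rates makes the test more conservative in regions where the control itself is locally enriched, which is the kind of local bias the entry describes.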

 

Align/Assemble to a reference
# Bowtie - Ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of 25 million reads per hour on a typical workstation with 2 gigabytes of memory. Uses a Burrows-Wheeler-Transformed (BWT) index. Linux, Windows, and Mac OS X. Paper link
# ELAND - Efficient Large-Scale Alignment of Nucleotide Databases. Whole genome alignments to a reference genome allowing up to 2 errors per match. Written by Illumina author Anthony J. Cox for the Solexa 1G machine.
# Exonerate - Various forms of pairwise alignment (including Smith-Waterman-Gotoh) of DNA/protein against a reference. C for POSIX. Paper link
# GMAP - GMAP (Genomic Mapping and Alignment Program) for mRNA and EST Sequences. C/Perl for Unix. Paper link
# MAQ - Mapping and Assembly with Qualities (renamed from MAPASS2). Particularly designed for Illumina with preliminary functions to handle ABI SOLiD data. Features extensive supporting tools for DIP/SNP detection, etc. C++ source. Paper link
# MUMmer - MUMmer is a modular system for the rapid whole genome alignment of finished or draft sequence. Released as a package providing an efficient suffix tree library, seed-and-extend alignment, SNP detection, repeat detection, and visualization tools. POSIX OS required. Paper link
# PASS - It supports Illumina, SOLiD and Roche-FLX data formats and allows the user to modulate very finely the sensitivity of the alignments. Spaced seed initial filter, then NW dynamic algorithm to a SW-like local alignment. Win/Linux. Paper link
# ProbeMatch - Rapid alignment of oligonucleotides to genome allowing both gaps and mismatches. Paper link
# Pyro-Align - Multiple Sequence Alignment System for Pyrosequencing Reads. Paper link (free text)
# RMAP - Assembles 20 - 64 bp Illumina reads to a FASTA reference genome. POSIX OS required. Paper link
# SeqMap - Supports up to 5 or more bp mismatches/INDELs. Highly tunable. Builds available for most OS's. Paper link
# SHRiMP - Assembles to a reference sequence. Developed with Applied Biosystems' colourspace genomic representation in mind. POSIX. Paper link
# Slider - An application for the Illumina Sequence Analyzer output that uses the probability files instead of the sequence files as an input for alignment to a reference sequence or a set of reference sequences. Paper link
# SOAP - SOAP (Short Oligonucleotide Alignment Program). A program for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. The updated version uses a BWT. Can call SNPs and INDELs. C++, POSIX. Paper link.
# SSAHA - SSAHA (Sequence Search and Alignment by Hashing Algorithm) is a tool for rapidly finding near exact matches in DNA or protein databases using a hash table. C++ for Linux/Alpha. Paper link
# SOCS - Aligns SOLiD data. SOCS is built on an iterative variation of the Rabin-Karp string search algorithm, which uses hashing to reduce the set of possible matches, drastically increasing search speed. Paper link
# SWIFT - The SWIFT suite is a software collection for fast index-based sequence comparison. It contains: SWIFT, a fast local alignment search guaranteeing to find epsilon-matches between two sequences; and SWIFT BALSAM, a very fast program to find semiglobal non-gapped alignments based on k-mer seeds. Paper link
# Vmatch - A versatile software tool for efficiently solving large scale sequence matching tasks. Vmatch subsumes the software tool REPuter, but is much more general, with a very flexible user interface, and improved space and time requirements. Essentially a large string matching toolbox. POSIX.
# ZOOM - ZOOM (Zillions Of Oligos Mapped) is designed to map millions of short reads produced by next-generation sequencing technology back to the reference genomes, and to carry out post-analysis. ZOOM is developed to be highly accurate, flexible, and user-friendly with speed being a critical priority. Commercial. Supports Illumina and SOLiD data. Paper link
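
The SOCS entry in this list refers to the Rabin-Karp string search algorithm, which uses hashing to reduce the set of possible matches. As a rough illustration, here is a minimal sketch of plain exact-match Rabin-Karp. It is not the iterative variation that SOCS uses, and the function name, base, and modulus are arbitrary choices for this example:

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return every start index where pattern occurs exactly in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading character in an m-character window.
    high = pow(base, m - 1, mod)

    # Hash the pattern and the first window of the text.
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod

    hits = []
    for start in range(n - m + 1):
        # Compare characters only when the hashes agree, so most
        # non-matching windows are rejected in constant time.
        if t_hash == p_hash and text[start:start + m] == pattern:
            hits.append(start)
        if start < n - m:
            # Roll the hash: drop text[start], append text[start + m].
            t_hash = ((t_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % mod
    return hits


if __name__ == "__main__":
    # Toy reference and read, made up for illustration.
    print(rabin_karp("ACGTACGTGACG", "ACG"))
```

The explicit string comparison after a hash hit guards against collisions, so the function never reports a false match. Tolerating mismatches, as a short-read aligner needs to, would require additional logic that this sketch omits.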


Genome Annotation/Genome Browser/Alignment Viewer/Assembly Database
# EagleView - An information-rich genome assembly viewer. EagleView can display a dozen different types of information including base quality and flowgram signal. Paper link
# LookSeq - LookSeq is a web-based application for alignment visualization, browsing and analysis of genome sequence data. LookSeq supports multiple sequencing technologies, alignment sources, and viewing modes; low or high-depth read pileups; and easy visualization of putative single nucleotide and structural variation. Paper link
# MapView - MapView: visualization of short reads alignment on a desktop computer. Linux. Paper link
# rtracklayer - A Bioconductor package providing R interface to genome browsers and their annotation tracks. Paper link
# SAM - Sequence Assembly Manager. Whole Genome Assembly (WGA) Management and Visualization Tool. It provides a generic platform for manipulating, analyzing and viewing WGA data, regardless of input type. MySQL backend and Perl-CGI web-based frontend/Linux. Paper link
# XMatchView - A visual tool for analyzing cross_match alignments. Developed by Rene Warren and Steven Jones at Canada's Michael Smith Genome Sciences Centre. Python/Win or Linux.

Last modified on 08/28/2009

 

 

© Northwestern University Biomedical Informatics Center