Pipeline Copy Number Variations

This page describes the process used to detect and analyze copy number variations (CNVs) in germline or somatic samples using a panel of normals (PoN). It is specifically adapted to Illumina paired-end sequencing of both exomes and whole genomes.

This pipeline is based on the GATK Best Practices and their accompanying tutorials.

General Pipeline Structure

The pipeline is written in WDL (Workflow Description Language), and the WDL code is designed to be run with Cromwell. To simplify execution, the pipeline is wrapped in a bash script that automates each step.
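
As an illustration of what such a wrapper has to assemble, the sketch below builds the argument list for a local Cromwell run. It is written in Python rather than bash and is not the pipeline's actual wrapper; the function name and the file names in the usage note are hypothetical. The only Cromwell features it relies on are the `run` subcommand and its `--inputs` and `--options` flags.

```python
from typing import List, Optional


def build_cromwell_command(
    cromwell_jar: str,
    workflow_wdl: str,
    inputs_json: str,
    options_json: Optional[str] = None,
) -> List[str]:
    """Return the argument list for a local `cromwell run` invocation.

    All paths are supplied by the caller; nothing here is taken from the
    real wrapper script (hypothetical helper for illustration only).
    """
    cmd = ["java", "-jar", cromwell_jar, "run", workflow_wdl, "--inputs", inputs_json]
    if options_json is not None:
        # Workflow options (output directories, caching, etc.) are optional.
        cmd += ["--options", options_json]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, for example with `build_cromwell_command("cromwell.jar", "cnv.wdl", "cnv_inputs.json")`, where all three file names are placeholders.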

The complete pipeline is represented in the following diagram:

Diagram available in: cnv_workflow_diagram.odp

Requirements and recommendations

  • This pipeline was developed specifically for Illumina paired-end sequencing, although it could work with other technologies.

  • Calling CNVs requires a panel of normals (PoN). It is recommended that the PoN be built from at least 30 independent samples sequenced under the same conditions as the test sample.

  • It is recommended to use BAMs with duplicate reads marked; for PCR-free libraries, duplicates may be left unmarked.

  • It is recommended not to remove reads from the BAMs (duplicates, reads with low mapping quality, etc.).

  • It is not advisable to call CNVs in regions with low mappability, segmental duplications or high GC content.
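
Two of the recommendations above lend themselves to a quick pre-flight check before launching the pipeline. The following is a minimal sketch and not part of the pipeline itself: the function names are hypothetical, and the 0.9 GC cutoff is an illustrative assumption, since this page does not define what counts as "high" GC content. Only the minimum of 30 PoN samples comes from the list above.

```python
from typing import Iterable, List

MIN_PON_SAMPLES = 30  # recommendation stated in the list above


def pon_warnings(sample_ids: Iterable[str], minimum: int = MIN_PON_SAMPLES) -> List[str]:
    """Return human-readable warnings about the PoN sample list (hypothetical helper)."""
    ids = list(sample_ids)
    unique = set(ids)
    warnings = []
    if len(unique) < len(ids):
        warnings.append("duplicate sample IDs: PoN samples should be independent")
    if len(unique) < minimum:
        warnings.append(
            f"PoN has {len(unique)} unique samples; at least {minimum} are recommended"
        )
    return warnings


def gc_fraction(sequence: str) -> float:
    """Fraction of G/C among called bases (A, C, G, T); N and other symbols are ignored."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def is_high_gc(sequence: str, max_gc: float = 0.9) -> bool:
    """Flag a region whose GC fraction exceeds max_gc (assumed cutoff, not from this page)."""
    return gc_fraction(sequence) > max_gc
```

A run with fewer than 30 unique sample IDs, or with repeated IDs, returns one warning per problem, and intervals flagged by `is_high_gc` can be excluded from the interval list before calling.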