Introduction to SNP-based variant calling
==========================================

This pipeline contains a wrapper for `vivaxGEN NGS-Pipeline `_, a SNP-based
variant calling pipeline, which has been set up to process *P. vivax* sequence
data. The pipeline produces a standard VCF file that can be used for
downstream analysis.

Currently, the pipeline is set up to run the multi-step mode of vivaxGEN
NGS-Pipeline from a single command line, using a GATK-based workflow to
generate the VCF file. A FreeBayes-based workflow is also available, but it
requires modifying the settings and is not covered in this documentation.


Quick Start
------------

#. Running the pipeline is as easy as executing the following command:

   .. code-block:: console

      $ ngs-pl run-discovery-variant-caller -o output_dir path_to_fastq/*.fastq.gz

#. When the command finishes, examine the contents of the ``output_dir``
   directory.

   The layout of the output directory is:

   .. code-block:: console

      $ tree output_dir
      output_dir/
          metafile/
              manifest.tsv
          analysis/
              SAMPLE-1/
                  gvcf/
                      SAMPLE-1-PvP01_all_v2.g.vcf.gz
                      SAMPLE-1-PvP01_all_v2.g.vcf.gz.tbi
                  maps/
                      mapped-final.bam
                      mapped-final.bam.bai
                  reads/
                      raw-0_R1.fastq.gz
                      raw-0_R2.fastq.gz
                  logs/
                  ...
              SAMPLE-2/
                  ...
          joint/
              concatenated.vcf.gz
              vcfs/
              reports/
              ...
          completed_samples/
              SAMPLE-1/
                  ...
              SAMPLE-2/
                  ...
          failed_samples/
              ...

The primary output file of interest is ``concatenated.vcf.gz``, which contains
the SNPs and the genotype calls for each sample.
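
As a quick sanity check on ``concatenated.vcf.gz``, the sketch below counts
the variant records and lists the sample names using only standard shell
tools. It is a minimal illustration and not part of the pipeline: the helper
names ``count_records`` and ``list_samples`` are hypothetical, and the default
path assumes that ``joint/`` sits directly under ``output_dir``. It relies
only on the general VCF layout, in which header lines start with ``#`` and
sample names occupy column 10 onward of the ``#CHROM`` line.

.. code-block:: shell

   # Assumed location of the joint VCF; adjust to your own output directory.
   VCF="${VCF:-output_dir/joint/concatenated.vcf.gz}"

   # Count variant records, i.e. every line that is not a header line.
   count_records() {
       gzip -dc "$1" | grep -vc '^#'
   }

   # Print one sample name per line, read from the #CHROM header line
   # (columns 1-9 are fixed VCF fields, samples start at column 10).
   list_samples() {
       gzip -dc "$1" | awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
   }

   # Only report when the file is actually present.
   if [ -f "$VCF" ]; then
       echo "records: $(count_records "$VCF")"
       list_samples "$VCF"
   fi

If ``bcftools`` is installed, ``bcftools query -l`` lists the samples in a
VCF file and is likely the more convenient choice for routine use.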