See readthedocs for the full documentation of the pipeline.
quicksand (quick analysis of sedimentary ancient DNA) is an open-source Nextflow pipeline designed for rapid and accurate taxonomic classification of mammalian mitochondrial DNA (mtDNA) in aDNA samples. quicksand combines fast alignment-free classification using KrakenUniq with downstream mapping (BWA), post-classification filtering, and ancient DNA authentication. quicksand is optimized for speed and portability and requires either Singularity or Docker.
To run the pipeline, please install:
- Nextflow v22.10 or later
- Singularity or Docker
Note: To run Nextflow with Singularity, your kernel needs to support user namespaces (see here or here).
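The requirements above can be checked with a small shell sketch before the first run. The helper names `version_ge` and `have_container_runtime` are hypothetical and not part of quicksand; the version comparison assumes a `sort` that supports `-V` (GNU coreutils or a recent BSD/macOS).

```shell
# Hypothetical helper (not part of quicksand):
# succeeds if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: succeeds if Singularity or Docker is on PATH.
have_container_runtime() {
    command -v singularity >/dev/null 2>&1 || command -v docker >/dev/null 2>&1
}
```

For instance, `version_ge "$(nextflow -v | awk '{print $3}')" 22.10` would apply the comparison to the installed Nextflow, assuming `nextflow -v` prints a line of the form `nextflow version X.Y.Z.BUILD`.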
The input for quicksand is a directory with user-supplied files in BAM or FASTQ format. Adapter-trimming, overlap-merging and sequence demultiplexing need to be performed by the user prior to running quicksand. Provide the directory with the --split flag.
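As a quick sanity check of the input directory, the following sketch lists the files that look like BAM or FASTQ input. The function name `list_inputs` is hypothetical, and the set of file extensions is an assumption; the documentation is the authority on which extensions quicksand actually accepts.

```shell
# Hypothetical helper (not part of quicksand):
# list files in directory $1 that look like BAM or FASTQ input.
# The extension list below is an assumption, not taken from the pipeline.
list_inputs() {
    if [ -z "$1" ] || [ ! -d "$1" ]; then
        echo "usage: list_inputs DIR" >&2
        return 2
    fi
    find "$1" -maxdepth 1 -type f \
        \( -name '*.bam' -o -name '*.fastq' -o -name '*.fq' \
           -o -name '*.fastq.gz' -o -name '*.fq.gz' \) | sort
}
```

Running `list_inputs split/` after downloading the test file should then print the path of the BAM file placed in `split/`.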
As a test file, download the Hohlenstein-Stadel mtDNA (please see the README for more information):
wget -P split \
http://ftp.eva.mpg.de/neandertal/Hohlenstein-Stadel/BAM/mtDNA/HST.raw_data.ALL.bam
The required KrakenUniq database, the reference genomes for mapping and the BED files for low-complexity filtering are available on the MPI EVA FTP servers. Custom versions of the reference material can be created with the quicksand-build pipeline.
For the quickstart of quicksand, create a fresh database containing only the Hominidae mtDNA reference genomes (runtime: ~3-5 minutes):
nextflow run mpieva/quicksand-build -r v3.0 \
--include Hominidae \
--outdir refseq \
-profile singularity
To download the full reference database (~60GB), use this command:
latest=$(curl http://ftp.eva.mpg.de/quicksand/LATEST)
wget -r -np -nc -nH --cut-dirs=3 --reject="*index.html*" -q --show-progress -P refseq http://ftp.eva.mpg.de/quicksand/build/$latest
quicksand is executed directly from GitHub. With the databases created and the test data downloaded, run the pipeline as follows:
# set this if you encounter a heap-space error to increase the memory that is used by nextflow
export NXF_OPTS="-Xms10g -Xmx15g" # increase or decrease the numbers as required
nextflow run mpieva/quicksand -r v2.4 \
--db refseq/kraken/Mito_db_kmer22/ \
--genomes refseq/genomes/ \
--bedfiles refseq/masked/ \
--split split/ \
-profile singularity
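Before launching, it can help to confirm that the three reference paths passed to the command above exist. The sketch below checks them relative to a base directory; the function name `check_refs` is hypothetical, and the sub-paths are simply those used in the command above.

```shell
# Hypothetical helper (not part of quicksand):
# verify that the reference directories used in the run command exist
# below the base directory $1 (defaults to "refseq").
check_refs() {
    local base="${1:-refseq}" missing=0 sub
    for sub in kraken/Mito_db_kmer22 genomes masked; do
        if [ ! -d "$base/$sub" ]; then
            echo "missing: $base/$sub" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

`check_refs refseq && echo "references look complete"` prints the message only when all three directories are present; otherwise the missing paths are reported on standard error.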
Please see the documentation for a comprehensive description of the output!
This pipeline uses code inspired by the nf-core initiative, reused here under the MIT license.
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.