iqbal-lab-org/gramtools at v1.8.0

gramtools

TL;DR Genome inference using prior information encoded as a reference graph.

Gramtools builds a population reference genome (PRG) from a set of variants. Given sequence data from an individual, the graph is annotated with coverage and genotyped, producing a VCF and a jVCF of all the variation in the graph.

A personalised reference genome for the sample is also inferred and new variation can be discovered against it (see usage). You can then build a new PRG from the initial and the new variants, and genotype this augmented PRG.

Install

Container

The easiest way to run gramtools is via a container (hosted on quay.io).

To run with Docker:

tag="latest" # or, a specific released version
docker run "quay.io/iqballab/gramtools:${tag}"

To run with Singularity:

tag="latest" # or, a specific released version
URI="docker://quay.io/iqballab/gramtools:${tag}"
singularity exec "$URI" gramtools
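
The commands above start the container without access to files on the host. As a rough sketch, the Docker invocation can be wrapped so that the current directory is visible inside the container. The `--rm`, `-v` and `-w` options are standard Docker flags, but the `/data` mount point and the assumption that the image accepts `gramtools ...` as its command (as the Singularity example suggests) are illustrative and not taken from the gramtools documentation. The helper only prints the command so it can be inspected before running it.

```shell
#!/usr/bin/env bash
# Sketch: compose a `docker run` command line that bind-mounts the current
# directory, and print it for inspection instead of executing it.
# Assumptions: the /data mount point, and that the image accepts
# `gramtools ...` as its command.
tag="latest"   # or, a specific released version
image="quay.io/iqballab/gramtools:${tag}"

gramtools_docker_cmd() {
    # Fixed part: container options, image name, and the gramtools executable.
    printf 'docker run --rm -v %s:/data -w /data %s gramtools' "$PWD" "$image"
    # Variable part: every argument is passed on to gramtools.
    printf ' %s' "$@"
    printf '\n'
}

gramtools_docker_cmd --help
```

Paths given to gramtools are then relative to the mounted directory. If the working directory contains spaces, quote it when copying the printed command.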

Local

Latest release

VERSION="1.7.0"
wget -O - "https://github.com/iqbal-lab-org/gramtools/releases/download/v${VERSION}/gramtools-${VERSION}.tar.gz" | tar xfz -
pip install "./gramtools-${VERSION}"

The latest release includes a precompiled binary for Linux. The installer uses it if it runs on your machine; otherwise, the binary is compiled during installation.

We recommend installing inside a virtual environment:

python -m venv gram_ve && source gram_ve/bin/activate
pip install pip==20.0.2
pip install "./gramtools-${VERSION}"

Latest source

pip install git+https://github.com/iqbal-lab-org/gramtools

This will always compile the binary.

Requirements

  • Python >= 3.6
  • pip >= 20.0.2

If the binary needs to be compiled:

  • CMake >= 3.1.2
  • C++17 compatible compiler: g++ >=8 (tested), clang >=7 (untested)

For gramtools discover to function, you additionally need at runtime:

  • R
  • Perl
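
The requirements above can be checked before installing. The following is a minimal sketch, not part of gramtools: the helper names `version_ge` and `check` are made up for illustration, the comparison relies on GNU `sort -V`, and it assumes the tools are invoked as `python3`, `cmake`, `g++`, `R` and `perl` on your system.

```shell
#!/usr/bin/env bash
# Sketch: report whether the listed tools are installed and recent enough.

# True if version $1 >= version $2 (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check NAME MINIMUM VERSION-COMMAND...
# Runs the version command, takes the first dotted number it prints,
# and compares that number with the minimum.
check() {
    local name="$1" min="$2"; shift 2
    local found
    found="$("$@" 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)*' | head -n 1)"
    if [ -z "$found" ]; then
        echo "$name: not found (need >= $min)"
    elif version_ge "$found" "$min"; then
        echo "$name: $found (ok)"
    else
        echo "$name: $found (need >= $min)"
    fi
}

check python 3.6    python3 --version
check pip    20.0.2 python3 -m pip --version
# Only needed if the binary has to be compiled:
check cmake  3.1.2  cmake --version
check g++    8      g++ -dumpfullversion -dumpversion

# Only needed at runtime by gramtools discover (presence check only):
for tool in R perl; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: found"
    else
        echo "$tool: not found (needed by gramtools discover)"
    fi
done
```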

Usage

Gramtools

Usage: 
    gramtools [-h] [--debug] [--force] subcommand
    
    Subcommands:
        gramtools build -o GRAM_DIR --ref REFERENCE
                       (--vcf VCF [VCF ...] | --prg PRG)
                       [--kmer_size KMER_SIZE]

        gramtools genotype -i GRAM_DIR -o GENO_DIR
                          --reads READS [READS ...] --sample_id SAMPLE_ID
                          [--ploidy {haploid,diploid}]
                          [--max_threads MAX_THREADS] [--seed SEED]

        gramtools discover -i GENO_DIR -o DISCO_DIR
                          [--reads READS [READS ...]]

        gramtools simulate --prg PRG
                           [--max_num_paths MAX_NUM_PATHS]
                           [--sample_id SAMPLE_ID] [--output_dir OUTPUT_DIR]

Subcommands explained

  1. build - given a reference and either one or more VCFs or a prg file, construct the graph.

    • --kmer_size: k-mer size used to index the graph in preparation for genotype. A higher k makes genotype faster, but the build output consumes more disk space.
  2. genotype - map reads to a graph generated in build and genotype the graph. Produces genotype calls (VCF) and a personalised reference genome (fasta).

    • --reads: one or more reads files in fasta, fastq, sam, bam or cram format
    • --sample_id: displayed in VCF & personalised reference outputs
  3. discover - discovers new variation against the personalised reference genome from genotype using one or more variant callers (currently: cortex).

  4. simulate - samples paths through a prg, producing a fasta of the paths and a genotyped JSON of the variant bubbles each path went through.

    • --prg: a prg file as output by build
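
The first three subcommands are typically chained, each reading the output directory of the previous one. The following sketch uses only flags from the usage text above. All file and directory names are hypothetical, and the `run` wrapper with its `DRY_RUN` switch is an illustrative helper, not a gramtools feature; by default it prints each command instead of executing it.

```shell
#!/usr/bin/env bash
# Sketch of a build -> genotype -> discover run.
# All file and directory names below are hypothetical placeholders.
set -eu

REF="ref.fa"          # reference genome
VCF="variants.vcf"    # initial set of variants
READS="reads.fastq"   # sequence data from one individual
SAMPLE="sample1"      # shown in the VCF and personalised reference outputs

# Print each command; execute it only when DRY_RUN is set to something
# other than 1 (the default is 1, i.e. print only).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" != "1" ]; then
        "$@"
    fi
}

# 1. Construct the graph from the reference and the variants.
run gramtools build -o gram_dir --ref "$REF" --vcf "$VCF"

# 2. Map the reads to the graph and genotype it.
run gramtools genotype -i gram_dir -o geno_dir \
    --reads "$READS" --sample_id "$SAMPLE" --ploidy haploid

# 3. Discover new variation against the personalised reference.
run gramtools discover -i geno_dir -o disco_dir --reads "$READS"
```

Run it with `DRY_RUN=0` to execute the commands. To genotype an augmented PRG, the VCF produced by discover can be passed together with the initial VCF to a second build, since `--vcf` accepts several files; where that VCF is written inside the discover output directory is not covered here.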

Documentation

Examples, documentation, and planned future enhancements can be found in the wiki.

For the C++ source code, doxygen formatted documentation can be generated by running doxygen doc/Doxyfile.in from inside the gramtools directory.

The documentation gets generated in doc/html/index.html and provides a useful reference for all files, classes, functions and data structures in gramtools.

Contributing

Please refer to the developers wiki page.

License

MIT
