BGREAT2

Paired Read mapping on de Bruijn graphs

Description:

Bgreat2 maps reads or read pairs on de Bruijn graphs efficiently. The de Bruijn graph should be represented as a set of unitigs; we advise using Bcalm to build it (https://github.com/GATB/bcalm). A mapping can be reported either as a path in the graph (a list of nodes) or as the actual sequence of that path in the graph. The latter output has been shown to correct a set of reads (see https://travis-ci.org/Malfoy/BCOOL).

Bgreat2 versus Bgreat:

Bgreat2 now indexes anchors from the unitigs and does not need a third-party tool to align on large unitigs. To do so, Bgreat2 indexes all anchors by default. If too much memory is used, reduce the proportion of indexed k-mers with the -i option; for example, -i 10 indexes one k-mer out of ten. Bgreat can be used to correct reads from a DBG (correction mode) or to find where the reads appear in the graph. Bgreat can now map read pairs directly. Bgreat can handle zipped files.

Usage:

-u read file

Unpaired mapping.

-x read file

Paired mapping; the reads should be interleaved.

-k k value

Value of k used to construct the graph

-a anchors length

Size of the anchors used to start the mapping. Useful when k is much larger than 31.

-i indexed anchors fraction

By default Bgreat indexes all anchors from the graph. With -i 10 it indexes one out of ten anchors, which reduces memory usage.

-g unitig file

Unitig file in FASTA format. This file can be obtained by running bcalm (https://github.com/GATB/bcalm) on a reads file.

-m number of mismatches allowed

Maximal Hamming distance between a read and its corresponding graph sequence for a mapping to be valid. The default value is 5.

-t number of threads

-f output file name

-z to compress the output file

-q input reads are FASTQ

Note that Bgreat ignores quality information.

-c to output corrected reads

Bgreat will output the sequence of the path in the graph that corresponds to the read. Intuitively, the read is "corrected" according to the graph sequence. This mode is used by the Bcool corrector (https://github.com/Malfoy/BCOOL).

-O to keep read ordering

The advanced options are experimental, still under development, and should not be used.
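To make the -m criterion concrete, here is a minimal Python sketch of a Hamming-distance check between a read and the graph sequence it is compared to. The function names are hypothetical and this is not Bgreat's actual code; it only illustrates the rule stated for -m, under the assumption that both sequences have the same length.

```python
def hamming_distance(read, graph_sequence):
    """Count the positions at which two equal-length sequences differ."""
    if len(read) != len(graph_sequence):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(a != b for a, b in zip(read, graph_sequence))


def is_valid_mapping(read, graph_sequence, max_mismatches=5):
    """Illustrative check: accept when the distance does not exceed the -m value."""
    return hamming_distance(read, graph_sequence) <= max_mismatches


# Two mismatches (positions 3 and 7): accepted with -m 2, rejected with -m 1.
print(hamming_distance("ACGTACGT", "ACGAACGA"))
print(is_valid_mapping("ACGTACGT", "ACGAACGA", max_mismatches=2))
print(is_valid_mapping("ACGTACGT", "ACGAACGA", max_mismatches=1))
```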

Path mode:

In the default mode, the numbers output correspond to the path of unitigs that a read (or pair of reads) maps on.

>read1

3;4;-6;

This means that read1 mapped on unitig 3, then on unitig 4, then on the reverse complement of unitig 6.
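For downstream scripts, a path line like the one above can be split into unitig numbers and orientations. The following Python sketch is only an illustration: parse_path is a hypothetical helper, and it handles just the single-read format shown in this example.

```python
def parse_path(line):
    """Parse a path line such as '3;4;-6;' into (unitig_number, is_reverse) pairs."""
    steps = []
    for token in line.strip().split(";"):
        if not token:  # skip the empty field left by the trailing ';'
            continue
        number = int(token)
        # A negative number denotes the reverse complement of the unitig.
        steps.append((abs(number), number < 0))
    return steps


print(parse_path("3;4;-6;"))
# [(3, False), (4, False), (6, True)]
```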

To get the corresponding sequences, the tool numbersToSequences does the conversion (warning: large files may be produced this way, because large unitigs are repeated across paths).

Usage: ./numbersToSequences unitigs.fa paths 31 > superReads.fa
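The idea behind this conversion can be sketched in Python as follows. This is not the numbersToSequences implementation: the helper names are hypothetical, the unitigs are given as a dictionary whose numbering is an assumption, and consecutive unitigs are assumed to overlap by k-1 nucleotides, as is usual for a compacted de Bruijn graph with k > 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def path_to_sequence(path, unitigs, k):
    """Spell the sequence of a path of signed unitig numbers.

    unitigs maps a unitig number to its sequence (hypothetical numbering).
    Consecutive unitigs are assumed to share k-1 nucleotides (k > 1).
    """
    overlap = k - 1
    sequence = ""
    for number in path:
        unitig = unitigs[abs(number)]
        if number < 0:  # negative number: use the reverse complement
            unitig = reverse_complement(unitig)
        if not sequence:
            sequence = unitig
        else:
            if sequence[-overlap:] != unitig[:overlap]:
                raise ValueError("unitig %d does not overlap the path" % number)
            sequence += unitig[overlap:]  # append only the new nucleotides
    return sequence


# Toy graph with k = 4, so consecutive unitigs share 3 nucleotides.
toy_unitigs = {1: "ACGTA", 2: "GTACC", 3: "CAGGT"}
print(path_to_sequence([1, 2, -3], toy_unitigs, k=4))
# ACGTACCTG
```

Because each path spells out its unitigs in full, a long unitig shared by many reads is written once per read, which is why the output can grow large.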

In correction mode:

In this mode the corrected reads are output directly. If the -O option is used, the corrected reads are written in the same order as the input reads.

Example command lines:

Map an unpaired reads file on a low-k DBG and write the paths to the output file "output_paths"

./bgreat -u reads.fa -g dbg27.fa -k 27 -f output_paths

Zipped input files also work

./bgreat -u reads.fa.gz -g dbg27.fa -k 27 -f output_paths

Map an unpaired reads file on a low-k DBG and write the paths to the output file "output_paths", allowing at most 2 mismatches

./bgreat -u reads.fa -g dbg27.fa -k 27 -f output_paths -m 2

Map an unpaired reads file in FASTQ format on a low-k DBG and write the paths to the output file "output_paths"

./bgreat -u reads.fa -q -g dbg27.fa -k 27 -f output_paths

Map an unpaired reads file on a low-k DBG and write the paths to the output file "output_paths", using 8 cores

./bgreat -u reads.fa -g dbg27.fa -k 27 -f output_paths -t 8

Map a paired reads file (interleaved format) on a low-k DBG and write the paths to the output file "output_paths"

./bgreat -x paired_reads.fa -g dbg27.fa -k 27 -f output_paths

Map a paired reads file (interleaved format) on a high-k DBG and write the paths to the output file "output_paths", with an anchor size of 31 (a good value for NGS reads)

./bgreat -x paired_reads.fa -g dbg91.fa -k 91 -f output_paths -a 31

Correct an unpaired reads file on a low-k DBG and write the result to the output file "reads_cor.fa"

./bgreat -u reads.fa -g dbg27.fa -k 27 -f reads_cor.fa -c

Correct an unpaired reads file on a low-k DBG and write the result to the compressed output file "reads_cor.fa.gz"

./bgreat -u reads.fa -g dbg27.fa -k 27 -f reads_cor.fa.gz -c

Create superReads from a paired reads file on a low-k DBG and write them to the output file "superReads.fa"

./bgreat -x paired_reads.fa -g dbg27.fa -k 27 -f superReads.fa -c
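The paired examples above expect an interleaved reads file. If the two mates are stored in separate FASTA files, a file of that shape can be prepared with a short script such as the Python sketch below. It assumes that "interleaved" means each mate 1 record is immediately followed by its mate 2 record (the usual convention, not spelled out here), and the helper names are hypothetical.

```python
def fasta_records(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def interleave(mate1_lines, mate2_lines):
    """Return the lines of a FASTA file alternating mate 1 and mate 2 records."""
    first = list(fasta_records(mate1_lines))
    second = list(fasta_records(mate2_lines))
    if len(first) != len(second):
        raise ValueError("both inputs must contain the same number of reads")
    out = []
    for (h1, s1), (h2, s2) in zip(first, second):
        out += [h1, s1, h2, s2]
    return out


mates1 = [">pair1/1", "ACGT", ">pair2/1", "GGCC"]
mates2 = [">pair1/2", "TTAA", ">pair2/2", "CATG"]
print("\n".join(interleave(mates1, mates2)))
```

With real files, the two inputs can be open file objects, and the returned lines can be written to a file such as paired_reads.fa before running the -x commands.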
