List of ideas to improve assemblies

This is a collection of ideas to consider once the DSL2 conversion (#56) is finished. The list is subject to change. Ideas and discussion are welcome.

Preprocessing (check out nf-core/mag, any other examples out there?)

Filtlong to filter ONT reads by quality (e.g. Q > 7)
Bowtie2 to remove Illumina PhiX reads
NanoLyse (alternatively Minimap2) to remove Lambda reads from ONT data
Add an option to down-sample reads, because this can sometimes improve the assembly
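
To make the down-sampling idea concrete, here is a minimal sketch of one way to draw a fixed-size random subset of reads in a single pass (reservoir sampling). This illustrates the concept only and is not how the pipeline does or would implement it; the function name `downsample` and its parameters are hypothetical, and paired-end reads would need to be sampled as pairs.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def downsample(records: Iterable[T], n: int, seed: int = 0) -> List[T]:
    """Return up to n records chosen uniformly at random (reservoir sampling).

    `records` can be any iterable of reads, e.g. FASTQ records already parsed
    into tuples. Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    reservoir: List[T] = []
    for i, rec in enumerate(records):
        if i < n:
            # Fill the reservoir with the first n records.
            reservoir.append(rec)
        else:
            # Replace an existing entry with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = rec
    return reservoir
```

Because the input is streamed, the read file never has to be held in memory as a whole; only the `n` selected records are kept.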

Assemblers:

MEGAHIT (A5-miseq, see "Add A5-miseq support" #23, ...) as an alternative short-read assembler
Trycycler for better hybrid and long-read assemblies than Unicycler
Flye (Tulip, Redbean, Raven) to have more long-read assemblers at hand
Pilon to polish Nanopore-derived contigs with Illumina reads (for long-read assemblers)

Assembly QC:

BUSCO to check completeness and contamination of assemblies (and possibly bins)
MaxBin2 (or any other binner) to separate the assembly into bins (cleanup if contaminated). In contrast to other binners, MaxBin2 outputs "Completeness, Genome size, GC content" for each bin it finds, which comes in very handy when judging whether there is real contamination.

Structural:

Use only the most polished assembly for Prokka & QUAST (currently assemblies before polishing are used!)
By default, run all (or at least many) assemblers that are appropriate for a data set, including polishing (Medaka & Pilon). That allows easy comparison of the performance of different assemblers (e.g. with QUAST and BUSCO) and makes it easier to choose the best assembly.

Defaults

In my opinion, --skip_kraken2 should either be removed (i.e. --krakendb alone determines whether Kraken2 is run) or --krakendb should get a simple default value (small, fast, but helpful), e.g. "https://genome-idx.s3.amazonaws.com/kraken/16S_Greengenes13.5_20200326.tgz". This is a very small 16S database, but it should be sufficient to detect serious bacterial contamination.
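
The two options can be sketched as a small decision function. This is only an illustration of the proposed behaviour, not existing pipeline code: `resolve_krakendb` and its `use_default` switch are hypothetical names, and the URL is the one quoted above.

```python
from typing import Optional

# Small 16S database suggested above as a possible default (assumption:
# the URL is taken verbatim from this issue, not from the pipeline config).
DEFAULT_KRAKENDB = (
    "https://genome-idx.s3.amazonaws.com/kraken/16S_Greengenes13.5_20200326.tgz"
)


def resolve_krakendb(krakendb: Optional[str], use_default: bool) -> Optional[str]:
    """Return the database Kraken2 should use, or None to skip Kraken2.

    use_default=False models option 1: no --krakendb means Kraken2 is skipped.
    use_default=True models option 2: fall back to the small default database.
    """
    if krakendb:
        # An explicitly supplied database always wins.
        return krakendb
    return DEFAULT_KRAKENDB if use_default else None
```

Either way, a separate --skip_kraken2 flag is no longer needed to decide whether Kraken2 runs under option 1, since the presence of --krakendb carries that information.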
