diff --git a/CHANGELOG.md b/CHANGELOG.md
index fce9b9d6..67fdb977 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -11,6 +11,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### `Fixed`
 
+- Citations for bwameme [#563](https://github.com/nf-core/raredisease/pull/563)
+
 ## 2.1.0 - Obelix [2024-05-29]
 
 ### `Added`
diff --git a/CITATIONS.md b/CITATIONS.md
index 36b3cd7b..1db771ac 100644
--- a/CITATIONS.md
+++ b/CITATIONS.md
@@ -22,6 +22,10 @@
 
   > Vasimuddin Md, Misra S, Li H, Aluru S. Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems. In: 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE; 2019:314-324. doi:10.1109/IPDPS.2019.00041
 
+- [BWA-MEME](https://academic.oup.com/bioinformatics/article/38/9/2404/6543607)
+
+  > Jung Y, Han D. BWA-MEME: BWA-MEM emulated with a machine learning approach. Bioinformatics. 2022;38(9):2404-2413. doi:10.1093/bioinformatics/btac137
+
 - [CADD1](https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-021-00835-9), [2](https://academic.oup.com/nar/article/47/D1/D886/5146191)
 
   > Rentzsch P, Schubach M, Shendure J, Kircher M. CADD-Splice—improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med. 2021;13(1):31. doi:10.1186/s13073-021-00835-9
diff --git a/docs/output.md b/docs/output.md
index 3241f5d4..376acce9 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -17,6 +17,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
   - [Mapping](#mapping)
     - [Bwa-mem2](#bwa-mem2)
     - [BWA](#bwa)
+    - [BWA-MEME](#bwa-meme)
     - [Sentieon bwa mem](#sentieon-bwa-mem)
   - [Duplicate marking](#duplicate-marking)
     - [Picard's MarkDuplicates](#picards-markduplicates)
@@ -88,6 +89,10 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
 
 [BWA](https://github.com/lh3/bwa) used to map the reads to a reference genome. The aligned reads are coordinate sorted with samtools sort. These files are treated as intermediates and are not placed in the output folder by default. It is not the default aligner, but it can be chosen by setting `--aligner` option to bwa.
 
+##### BWA-MEME
+
+[BWA-MEME](https://github.com/kaist-ina/BWA-MEME) is used to map the reads to a reference genome. The aligned reads are coordinate sorted with samtools sort. These files are treated as intermediates and are not placed in the output folder by default. It is not the default aligner, but it can be chosen by setting the `--aligner` option to bwameme.
+
 ##### Sentieon bwa mem
 
 [Sentieon's bwa mem](https://support.sentieon.com/manual/DNAseq_usage/dnaseq/#map-reads-to-reference) is the software accelerated version of the bwa-mem algorithm. It is used to efficiently perform the alignment using BWA. Aligned reads are then coordinate sorted using Sentieon's [sort](https://support.sentieon.com/manual/usages/general/#util-syntax) utility. These files are treated as intermediates and are not placed in the output folder by default. It is not the default aligner, but it can be chosen by setting `--aligner` option to "sentieon".
@@ -96,7 +101,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
 
 ##### Picard's MarkDuplicates
 
-[Picard MarkDuplicates](https://broadinstitute.github.io/picard/command-line-overview.html#MarkDuplicates) is used for marking PCR duplicates that can occur during library amplification. This is essential as the presence of such duplicates results in false inflated coverages, which in turn can lead to overly-confident genotyping calls during variant calling. Only reads aligned by Bwa-mem2 and bwa are processed by this tool. By default, alignment files are published in bam format. If you would like to store cram files instead, set `--save_mapped_as_cram` to true.
+[Picard MarkDuplicates](https://broadinstitute.github.io/picard/command-line-overview.html#MarkDuplicates) is used for marking PCR duplicates that can occur during library amplification. This is essential as the presence of such duplicates results in falsely inflated coverage, which in turn can lead to overly confident genotyping calls during variant calling. Only reads aligned by Bwa-mem2, bwameme and bwa are processed by this tool. By default, alignment files are published in bam format. If you would like to store cram files instead, set `--save_mapped_as_cram` to true.
 Output files from Alignment
diff --git a/docs/usage.md b/docs/usage.md
index 5558ee64..996d0e6a 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -139,7 +139,7 @@ Note that the pipeline is modular in architecture. It offers you the flexibility
 
 nf-core/raredisease consists of several tools used for various purposes. For convenience, we have grouped those tools under the following categories:
 
-1. Alignment (bwamem2/bwa/Sentieon BWA mem)
+1. Alignment (bwamem2/bwa/bwameme/Sentieon BWA mem)
 2. QC stats from the alignment files
 3. Repeat expansions (ExpansionsHunter & Stranger)
 4. Variant calling - SNV (DeepVariant/Sentieon DNAscope)
@@ -162,14 +162,15 @@ The mandatory and optional parameters for each category are tabulated below.
 | aligner1 | fasta_fai4 |
 | fasta2 | bwamem24 |
 | platform | bwa4 |
-| mito_name/mt_fasta3 | known_dbsnp5 |
+| mito_name/mt_fasta3 | bwameme4 |
+| | known_dbsnp5 |
 | | known_dbsnp_tbi5 |
 | | min_trimmed_length6 |
 
-1Default value is bwamem2. Other alternatives are bwa and sentieon (requires valid Sentieon license ).
+1Default value is bwamem2. Other alternatives are bwa, bwameme and sentieon (requires a valid Sentieon license).
2Analysis set reference genome in fasta format, first 25 contigs need to be chromosome 1-22, X, Y and the mitochondria.
3If mito_name is provided, mt_fasta can be generated by the pipeline.
-4fasta_fai, bwa and bwamem2, if not provided by the user, will be generated by the pipeline when necessary.
+4fasta_fai, bwa, bwamem2 and bwameme, if not provided by the user, will be generated by the pipeline when necessary.
5Used only by Sentieon.
6Default value is 40. Used only by fastp.
diff --git a/subworkflows/local/utils_nfcore_raredisease_pipeline/main.nf b/subworkflows/local/utils_nfcore_raredisease_pipeline/main.nf
index 36c0cbaa..54ee0a08 100644
--- a/subworkflows/local/utils_nfcore_raredisease_pipeline/main.nf
+++ b/subworkflows/local/utils_nfcore_raredisease_pipeline/main.nf
@@ -217,6 +217,7 @@ def toolCitationText() {
     align_text = [
         params.aligner.equals("bwa") ? "BWA (Li, 2013)," :"",
         params.aligner.equals("bwamem2") ? "BWA-MEM2 (Vasimuddin et al., 2019)," : "",
+        params.aligner.equals("bwameme") ? "BWA-MEME (Jung & Han, 2022)," : "",
         params.aligner.equals("sentieon") ? "Sentieon DNASeq (Kendig et al., 2019)," : "",
         params.aligner.equals("sentieon") ? "Sentieon Tools (Freed et al., 2017)," : ""
     ]
@@ -325,6 +326,7 @@ def toolBibliographyText() {
     align_text = [
         params.aligner.equals("bwa") ? "<li>Li, H. (2013). Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM (arXiv:1303.3997). arXiv. http://arxiv.org/abs/1303.3997</li>" :"",
         params.aligner.equals("bwamem2") ? "<li>Vasimuddin, Md., Misra, S., Li, H., & Aluru, S. (2019). Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems. 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 314–324. https://doi.org/10.1109/IPDPS.2019.00041</li>" : "",
+        params.aligner.equals("bwameme") ? "<li>Jung, Y., & Han, D. (2022). BWA-MEME: BWA-MEM emulated with a machine learning approach. Bioinformatics, 38(9), 2404-2413. https://doi.org/10.1093/bioinformatics/btac137</li>" : "",
         params.aligner.equals("sentieon") ? "<li>Kendig, K. I., Baheti, S., Bockol, M. A., Drucker, T. M., Hart, S. N., Heldenbrand, J. R., Hernaez, M., Hudson, M. E., Kalmbach, M. T., Klee, E. W., Mattson, N. R., Ross, C. A., Taschuk, M., Wieben, E. D., Wiepert, M., Wildman, D. E., & Mainzer, L. S. (2019). Sentieon DNASeq Variant Calling Workflow Demonstrates Strong Computational Performance and Accuracy. Frontiers in Genetics, 10, 736. https://doi.org/10.3389/fgene.2019.00736</li>" : "",
         params.aligner.equals("sentieon") ? "<li>Freed, D., Aldana, R., Weber, J. A., & Edwards, J. S. (2017). The Sentieon Genomics Tools—A fast and accurate solution to variant calling from next-generation sequence data (p. 115717). bioRxiv. https://doi.org/10.1101/115717</li>" : ""
     ]