Standardised taxon table and mOTU database docs improvement by jfy133 · Pull Request #271 · nf-core/taxprofiler · GitHub

Standardised taxon table and mOTU database docs improvement #271

Merged: 2 commits, Mar 21, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### `Fixed`

- [#271](https://github.com/nf-core/taxprofiler/pull/271/files) Improved standardised table generation documentation and mOTUs manual database download tutorial (♥ to @prototaxites for reporting, fix by @jfy133)
- [#269](https://github.com/nf-core/taxprofiler/pull/269/files) Reduced output files in AWS full test output due to very large files
- [#270](https://github.com/nf-core/taxprofiler/pull/270/files) Fixed warning for host removal index parameter, and improved index checks (♥ to @prototaxites for reporting, fix by @jfy133)

8 changes: 6 additions & 2 deletions docs/usage.md
@@ -294,7 +294,9 @@ nf-core/taxprofiler supports generation of Krona interactive pie chart plots for

##### Multi-Table Generation

In addition to per-sample profiles, the pipeline also supports generation of 'native' multi-sample taxonomic profiles (i.e., those generated by the taxonomic profiling tools themselves or additional utility scripts provided by the tool authors).
The main multi-sample table from nf-core/taxprofiler is produced by [Taxpasta](https://taxpasta.readthedocs.io/en/latest/), a dedicated standalone tool originally developed for the pipeline. When providing `--run_profile_standardisation`, every classifier/profiler and database combination will get a standardised, multi-sample taxon table in the [`taxpasta/`](https://nf-co.re/taxprofiler/output) directory. These tables all share the same structure, to facilitate comparison between the results of the classifiers/profilers.

In addition to per-sample profiles and standardised Taxpasta output, the pipeline also supports generation of 'native' multi-sample taxonomic profiles (i.e., those generated by the taxonomic profiling tools themselves or additional utility scripts provided by the tool authors), when providing `--run_profile_standardisation` to your pipeline.

These are generated on a per-database level, i.e., you will get one multi-sample taxon table for each database you provide for each tool. Each table is placed in the same directory as the directories containing the per-sample profiles.

@@ -307,7 +309,7 @@ The following tools will produce multi-sample taxon tables:
- **MetaPhlAn3** (via MetaPhlAn's `merge_metaphlan_tables.py` script)
- **mOTUs** (via the `motus merge` command)

Note that the multi-sample tables from these folders are not inter-operable with each other as they can have different formats.
Note that the multi-sample tables from the 'native' tools in each folder are [not inter-operable](https://taxpasta.readthedocs.io/en/latest/tutorials/getting-started/) with each other, as they can have different formats and can contain additional and different data. In this case we recommend using the standardised and merged output from Taxpasta, as described above.
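
As a rough illustration of why a shared layout makes comparison easier, the sketch below loads two standardised tables and lists the taxa reported by both profilers. The layout (a `TAXON` column followed by one column per sample) is an assumption taken from the simplified scheme in the pipeline's schema help text; real Taxpasta files may name their columns differently. The inline data and the helper names `load_table` and `shared_taxa` are hypothetical.

```python
import csv
import io

# Hypothetical standardised tables for two profilers (tab-separated).
# The TAXON/SAMPLE_* layout is an assumption based on the schema help text.
KRAKEN2_TSV = "TAXON\tSAMPLE_A\tSAMPLE_B\ntaxon_a\t32\t123\ntaxon_b\t1\t5\n"
MOTUS_TSV = "TAXON\tSAMPLE_A\tSAMPLE_B\ntaxon_a\t30\t110\ntaxon_c\t4\t0\n"


def load_table(text):
    """Return {taxon: {sample: count}} parsed from a standardised TSV string."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    table = {}
    for row in reader:
        taxon = row.pop("TAXON")
        table[taxon] = {sample: int(value) for sample, value in row.items()}
    return table


def shared_taxa(first, second):
    """Taxa present in both tables, sorted for stable output."""
    return sorted(set(first) & set(second))


kraken2 = load_table(KRAKEN2_TSV)
motus = load_table(MOTUS_TSV)
print(shared_taxa(kraken2, motus))
```

Because both tables use the same column layout, one loader serves every profiler; the 'native' merged tables would each need their own parser.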

### Updating the pipeline

@@ -792,6 +794,8 @@ More information on the MetaPhlAn3 database can be found [here](https://github.c

mOTUs does not provide the ability to construct custom databases. Therefore we recommend using the prebuilt database of marker genes provided by the developers.

> ⚠️ **Do not change the directory name of the resulting database if moving it to a central location.** The database name `db_mOTU/` is hardcoded in the mOTUs tool.
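
One way to respect this constraint when relocating the database is to move the whole directory into a new parent rather than renaming it. The following is a minimal sketch under that assumption; the helper `move_motus_db` and all paths are hypothetical, and the demonstration uses throwaway directories rather than a real download.

```python
import shutil
import tempfile
from pathlib import Path

DB_DIR_NAME = "db_mOTU"  # directory name taken from the warning above


def move_motus_db(source, central_dir):
    """Move a downloaded database into central_dir without renaming it."""
    source = Path(source)
    if source.name != DB_DIR_NAME:
        raise ValueError(f"expected a directory named {DB_DIR_NAME!r}, got {source.name!r}")
    central_dir = Path(central_dir)
    central_dir.mkdir(parents=True, exist_ok=True)
    target = central_dir / DB_DIR_NAME
    if target.exists():
        # Refuse to nest or overwrite an existing database directory.
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(source), str(target))
    return target


# Demonstration with temporary directories (hypothetical paths).
workspace = Path(tempfile.mkdtemp())
downloaded = workspace / "downloads" / DB_DIR_NAME
downloaded.mkdir(parents=True)
moved = move_motus_db(downloaded, workspace / "shared_databases")
print(moved.name)
```

Only the parent directory changes; the final path component stays `db_mOTU`.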

To download the prebuilt database you need to have `mOTUs` installed on your machine.

```bash
# ... (rest of this block is collapsed in the diff view)
```
11 changes: 7 additions & 4 deletions nextflow_schema.json
@@ -467,15 +467,18 @@
},
"motus_use_relative_abundance": {
"type": "boolean",
"description": "Turn on printing relative abundance instead of counts."
"description": "Turn on printing relative abundance instead of counts.",
"fa_icon": "fas fa-percent"
},
"motus_save_mgc_read_counts": {
"type": "boolean",
"description": "Turn on saving the mgc reads count."
"description": "Turn on saving the mgc reads count.",
"fa_icon": "fas fa-save"
},
"motus_remove_ncbi_ids": {
"type": "boolean",
"description": "Turn on removing NCBI taxonomic IDs."
"description": "Turn on removing NCBI taxonomic IDs.",
"fa_icon": "fas fa-address-card"
}
},
"fa_icon": "fas fa-align-center"
@@ -490,7 +493,7 @@
"type": "boolean",
"fa_icon": "fas fa-toggle-on",
"description": "Turn on standardisation of taxon tables across profilers",
"help_text": "Turns on standardisation of output OTU tables across all tools; each into a TSV format following the following scheme:\n\n|TAXON | SAMPLE_A | SAMPLE_B |\n|-------------|----------------|-----------------|\n| taxon_a | 32 | 123 |\n| taxon_b | 1 | 5 |\n\nThis currently only is generated for mOTUs."
"help_text": "Turns on standardisation of output OTU tables across all tools.\n\nThis happens in two forms: firstly, if available, by a given classifier's/profiler's 'native' profile merger and standardisation (for Bracken, Kaiju, Kraken, Centrifuge, MetaPhlAn3, mOTUs), and secondly for _all_ classifiers/profilers in the pipeline using [`taxpasta`](https://taxpasta.readthedocs.io).\n\nIn the latter case, taxpasta generates a standardised output as follows:\n\n|TAXON | SAMPLE_A | SAMPLE_B |\n|-------------|----------------|-----------------|\n| taxon_a | 32 | 123 |\n| taxon_b | 1 | 5 |\n\nwhereas the 'native' tools have varying output formats. See the pipeline [output](https://nf-co.re/taxprofiler) documentation for more information."
},
"standardisation_motus_generatebiom": {
"type": "boolean",
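
The `fa_icon` additions in the `nextflow_schema.json` hunks above could also be checked mechanically. The sketch below scans a schema group for parameters that lack a given key. The fragment is a hypothetical, cut-down stand-in for the real file, the `properties` wrapper is an assumption about how the group is nested, and `params_missing_key` is an illustrative helper, not part of nf-core tooling.

```python
import json

# Hypothetical fragment shaped like the schema entries in the diff above.
SCHEMA_FRAGMENT = json.loads("""
{
  "properties": {
    "motus_use_relative_abundance": {
      "type": "boolean",
      "description": "Turn on printing relative abundance instead of counts.",
      "fa_icon": "fas fa-percent"
    },
    "motus_save_mgc_read_counts": {
      "type": "boolean",
      "description": "Turn on saving the mgc reads count."
    }
  }
}
""")


def params_missing_key(group, key="fa_icon"):
    """Return the sorted names of parameters in a schema group that lack `key`."""
    properties = group.get("properties", {})
    return sorted(name for name, spec in properties.items() if key not in spec)


print(params_missing_key(SCHEMA_FRAGMENT))
```

Run against the pre-change entries, a check like this would flag the three mOTUs parameters that this pull request gives icons.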