8000 PRELOADEDKRAKENUNIQ: Update module. by muniheart · Pull Request #614 · nf-core/taxprofiler · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

PRELOADEDKRAKENUNIQ: Update module. #614

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 22 commits into
base: dev
Choose a base branch
from
Open

Conversation

muniheart
Copy link
Contributor

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed 8000 a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/taxprofiler branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Update the module krakenuniq/preloadedkrakenuniq. This will allow the specification of ram_chunk_size per database. 8397, 8483

@jfy133
Copy link
Member
jfy133 commented May 16, 2025

Pushed a change to hopefully fix the error...

@jfy133
Copy link
Member
jfy133 commented May 17, 2025

@nf-core-bot fix linting

@muniheart
Copy link
Contributor Author

The value, --krakenuniq_ram_chunk_size 16G, is too large for requested memory, 15G. I have reduced to 8G.

@muniheart
Copy link
Contributor Author

I don't understand why task Run pipeline with test data (latest-everything | test_krakenuniq | docker) is reported as ✅ while log indicates error and non-zero exit status.

ERROR ~ Validation of pipeline parameters failed!

 -- Check '.nextflow.log' file for details
The following invalid input values have been detected:

* --databases (https://raw.githubusercontent.com/nf-core/test-datasets/taxprofiler/database_krakenuniq.csv): Validation of file failed:
	-> Entry 1: Error for field 'db_path' (https://github.com/nf-core/test-datasets/raw/taxprofiler/data/database/krakenuniq/testdb-krakenuniq.tar.gz): the file or directory 'https://github.com/nf-core/test-datasets/raw/taxprofiler/data/database/krakenuniq/testdb-krakenuniq.tar.gz' does not exist (db_path should be either a file path or a directory.)

 -- Check script 'subworkflows/nf-core/utils_nfschema_plugin/main.nf' at line: 39 or see '.nextflow.log' file for more details
Error: Process completed with exit code 1.

I have run the workflow locally to successful completion with,

nextflow run muniheart/taxprofiler-dev -r dev -profile test_krakenuniq,singularity --krakenuniq_ram_chunk_size 5GB --outdir results/2025-05-18_12h48

When run without option --krakenuniq_ram_chunk_size 5GB, however, the script, .command.sh, contained option --preload-size 16G, and each slurm job was OOM killed. Nonetheless, each task reported exitcode 0. The configuration,

process {
withName: KRAKENUNIQ_PRELOADEDKRAKENUNIQ {
ext.args = '--preload-size 2G'
}

is not having its intended effect.

@jfy133
Copy link
Member
jfy133 commented May 19, 2025

I don't understand why task Run pipeline with test data (latest-everything | test_krakenuniq | docker) is reported as ✅ while log indicates error and non-zero exit status.

ERROR ~ Validation of pipeline parameters failed!

 -- Check '.nextflow.log' file for details
The following invalid input values have been detected:

* --databases (https://raw.githubusercontent.com/nf-core/test-datasets/taxprofiler/database_krakenuniq.csv): Validation of file failed:
	-> Entry 1: Error for field 'db_path' (https://github.com/nf-core/test-datasets/raw/taxprofiler/data/database/krakenuniq/testdb-krakenuniq.tar.gz): the file or directory 'https://github.com/nf-core/test-datasets/raw/taxprofiler/data/database/krakenuniq/testdb-krakenuniq.tar.gz' does not exist (db_path should be either a file path or a directory.)

 -- Check script 'subworkflows/nf-core/utils_nfschema_plugin/main.nf' at line: 39 or see '.nextflow.log' file for more details
Error: Process completed with exit code 1.

I have run the workflow locally to successful completion with,

nextflow run muniheart/taxprofiler-dev -r dev -profile test_krakenuniq,singularity --krakenuniq_ram_chunk_size 5GB --outdir results/2025-05-18_12h48

When run without option --krakenuniq_ram_chunk_size 5GB, however, the script, .command.sh, contained option --preload-size 16G, and each slurm job was OOM killed. Nonetheless, each task reported exitcode 0. The configuration,

process {
withName: KRAKENUNIQ_PRELOADEDKRAKENUNIQ {
ext.args = '--preload-size 2G'
}

is not having its intended effect.

Hrm that's odd. I would need to experiment on Thursday

@muniheart muniheart added bug Something isn't working enhancement Improvement for existing functionality labels May 20, 2025
Copy link
Member
@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just made a slight tweak of the default behaviour to no chunking, and I confirmed that after moving the 2G specification in test_krakenuniq to a parameter call not ext.args, then I got the correct order of behaviour:

  1. database ram size
  2. --<param> ram size
  3. Config ram size
  4. No ram size, just preload full

If You're OK with my minor changes in the last couple of commits, I think we are good to go for a merge. Please confirm if you're happy and I can hit the button if you don't have merge permissions.

@jfy133
Copy link
Member
jfy133 commented May 22, 2025

@nf-core-bot fix linting

@muniheart
Copy link
Contributor Author

I just made a slight tweak of the default behaviour to no chunking, and I confirmed that after moving the 2G specification in test_krakenuniq to a parameter call not ext.args, then I got the correct order of behaviour:

1. database ram size

2. `--<param>` ram size

3. Config ram size

4. No ram size, just preload full

If You're OK with my minor changes in the last couple of commits, I think we are good to go for a merge. Please confirm if you're happy and I can hit the button if you don't have merge permissions.

I'm okay with your changes, @jfy133, though they don't address the underlying problem, which is that template update included conf/modules.conf into nextflow.config a 2nd time, preventing the overriding of --preload-size value through a custom configuration file. Commit fixed the tests, but I'm concerned it will be undone by a future update of the template.

@muniheart
Copy link
Contributor Author
muniheart commented May 22, 2025

The memory requested in test_krakenuniq.config, 15.GB is insufficient for a chunk size of 16GB and may not be sufficient for preload step. That will need to be adjusted to accommodate configured database. Alternatively, the test databases file can be modified to include --preload-size with a value compatible with memory request.

The test database is 30MB compressed so preload should work.

@muniheart
Copy link
Contributor Author
muniheart commented May 22, 2025

Sorry, I had missed this

    krakenuniq_ram_chunk_size              = '2G'

It makes sense now. I am still concerned by the duplicate includeConfig mentioned above.

@jfy133
Copy link
Member
jfy133 commented May 23, 2025

I just made a slight tweak of the default behaviour to no chunking, and I confirmed that after moving the 2G specification in test_krakenuniq to a parameter call not ext.args, then I got the correct order of behaviour:

1. database ram size

2. `--<param>` ram size

3. Config ram size

4. No ram size, just preload full

If You're OK with my minor changes in the last couple of commits, I think we are good to go for a merge. Please confirm if you're happy and I can hit the button if you don't have merge permissions.

I'm okay with your changes, @jfy133, though they don't address the underlying problem, which is that template update included conf/modules.conf into nextflow.config a 2nd time, preventing the overriding of --preload-size value through a custom configuration file. Commit fixed the tests, but I'm concerned it will be undone by a future update of the template.

Ah I see, sorry I misunderstood.

The duplicated includeConfig was indeed a mistake during a template merge. The module loading should be at the end of nextflow.config: https://github.com/nf-core/tools/blob/e6497bcdf8660b7029bd739f890c2d73e87dd867/nf_core/pipeline-template/nextflow.config#L336-L338

That said, to my knowledge the --<params>, should take the highest priority over everything. Was that not what you observed in testing?

Sorry, I had missed this

    krakenuniq_ram_chunk_size              = '2G'

It makes sense now. I am still concerned by the duplicate includeConfig mentioned above.

Yes exactly, sorry I wasn't clear in my description. I moved it to ensure it does follow the right precedence so we can override it.

That said I think with the exception of test configs, I think whatever is supplied via a custom config should take highest priority as presumably that would be what the user wants to use if our normal recommendations of passing the value do not work for them.

I unfortunately can't look at this today, if you have time could you move the includeConfig modules.config back to the correct place end of the template (again, sorry for that duplication error, indeed our mistake!) , and test again?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement Improvement for existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0