Add splitting of peptides with python. #29
Conversation
General best practice would be to put this in a
OK, will do! :)
Looks good to me, but how about adding an optional parameter to set the minimum peptide list size for the splitting procedure?
main.nf (Outdated)
@@ -38,6 +38,8 @@ def helpMessage() {
      --max_peptide_length [int]        Specifies the maximum peptide length. Default: MHC class I: 11 aa, MHC class II: 16 aa
      --min_peptide_length [int]        Specifies the minimum peptide length. Default: MHC class I: 8 aa, MHC class II: 15 aa
      --tools [str]                     Specifies a list of tool(s) to use. Available are: 'syfpeithi', 'mhcflurry', 'mhcnuggets-class-1', 'mhcnuggets-class-2'. Can be combined in a list separated by comma.
+     --split_peptides_maxchunks [int]  Used in combination with '--peptides' or '--proteins': maximum number of peptide chunks that will be created for parallelization. Default: 100
How about --peptides_split_maxchunks and --peptides_split_chunksize?
split_peptides_minsize is just the min. chunk size (apart from one file that contains the rest); it will become bigger if necessary to avoid more than peptides_split_maxchunks chunks. --peptides_split_minchunksize, or would that be too long?
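For illustration, a minimal sketch of the rule described in this comment, in Python. The function and argument names (effective_chunk_size, n_peptides, min_size, max_chunks) are placeholders, since the pipeline's actual parameter names were still under discussion at this point:

```python
import math

# Sketch of the chunk-size rule discussed above: start from the user-defined
# minimum chunk size and grow it as needed so that no more than max_chunks
# chunks are ever produced. All names here are illustrative.
def effective_chunk_size(n_peptides: int, min_size: int, max_chunks: int) -> int:
    # Smallest chunk size that keeps the chunk count at or below max_chunks.
    size_for_max_chunks = math.ceil(n_peptides / max_chunks)
    # Never go below the requested minimum chunk size.
    return max(min_size, size_for_max_chunks)

# e.g. 1,000,000 peptides with min_size=5000 and max_chunks=100:
# the chunk size grows to 10,000 so that only 100 chunks are created.
print(effective_chunk_size(1_000_000, 5_000, 100))  # 10000
```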
I'm wondering if we even need both. Wouldn't it be enough to let users define the max number of chunks?
You mean the actual number of chunks that is created? That would be a third parameter, but yes, it might be a good solution to keep it simple. Or do you really not want to change the *-minchunksize parameter? Then we could let the user define *-maxchunks, I just understood that you wanted to change that as well ...
I mean that it might be sufficient to have a maxchunks parameter, since that indirectly defines the size of the chunks anyway.
#28
Ok, I did it in Python in the end. With this, a maximum of 100 chunks will be created, and only files with more than 5000 peptides will be split. That's what I would assume might be useful for others as well. Let me know if you think this could be optimised.
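A minimal Python sketch of that behaviour (the defaults of 100 chunks and 5000 peptides come from this comment; the one-peptide-per-line file layout and the function and variable names are assumptions, not the PR's actual code):

```python
import math
from pathlib import Path

# Sketch of the splitting step: files with at most min_size peptides are left
# untouched; larger files are split into chunks whose size grows beyond
# min_size if needed to stay within max_chunks chunks.
def split_peptide_file(path: str, max_chunks: int = 100, min_size: int = 5000) -> list:
    peptides = Path(path).read_text().splitlines()  # assumes one peptide per line

    # Only files with more than min_size peptides are split at all.
    if len(peptides) <= min_size:
        return [Path(path)]

    # Grow the chunk size above min_size if necessary to respect max_chunks.
    chunk_size = max(min_size, math.ceil(len(peptides) / max_chunks))

    out_files = []
    for i in range(0, len(peptides), chunk_size):
        chunk_path = Path(f"{path}.chunk_{i // chunk_size}")
        chunk_path.write_text("\n".join(peptides[i:i + chunk_size]) + "\n")
        out_files.append(chunk_path)
    return out_files  # the last chunk simply holds the remainder
```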
Many thanks for contributing to nf-core/epitopeprediction!
Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested on pull requests (PRs).
PR checklist
- [ ] Ensure the test suite passes (nextflow run . -profile test,docker).
- [ ] Make sure your code lints (nf-core lint .).
- [ ] Documentation in docs is updated
- [ ] CHANGELOG.md is updated
- [ ] README.md is updated

Learn more about contributing: https://github.com/nf-core/epitopeprediction/tree/master/.github/CONTRIBUTING.md