Add splitting of peptides with python. by skrakau · Pull Request #29 · nf-core/epitopeprediction · GitHub

Add splitting of peptides with python. #29


Merged: 5 commits merged into nf-core:dev on Apr 14, 2020

Conversation

@skrakau (Member) commented Mar 9, 2020

#28

Ok, I did it in Python in the end. With this, a maximum of 100 chunks will be created, and only files with more than 5000 peptides will be split. That's what I would assume might be useful for others as well. Let me know if you think this could be optimised.
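
For context, here is a minimal sketch of how such a splitting step could look in Python. The function name, CLI arguments, and the header-plus-one-peptide-per-line input format are assumptions made for illustration; this is not necessarily what the PR's split_peptides.py does.

```python
#!/usr/bin/env python
# Hedged sketch of a peptide-splitting step: only files with more than
# `min_size` peptides are split, and at most `max_chunks` chunk files
# are written. All names and the input format are assumptions for
# illustration, not the PR's actual implementation.
import argparse
import math


def split_peptides(infile, prefix, min_size=5000, max_chunks=100):
    with open(infile) as f:
        header, *peptides = f.read().splitlines()

    if len(peptides) <= min_size:
        # Small input: write a single "chunk" containing everything.
        chunks = [peptides]
    else:
        # Grow the chunk size beyond min_size if needed so that
        # no more than max_chunks files are created.
        chunk_size = max(min_size, math.ceil(len(peptides) / max_chunks))
        chunks = [peptides[i:i + chunk_size]
                  for i in range(0, len(peptides), chunk_size)]

    for idx, chunk in enumerate(chunks):
        with open(f"{prefix}_chunk_{idx}.tsv", "w") as out:
            out.write("\n".join([header] + chunk) + "\n")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("input")
    parser.add_argument("--prefix", default="peptides")
    parser.add_argument("--min_size", type=int, default=5000)
    parser.add_argument("--max_chunks", type=int, default=100)
    args = parser.parse_args()
    split_peptides(args.input, args.prefix, args.min_size, args.max_chunks)
```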

Many thanks for contributing to nf-core/epitopeprediction!

Please fill in the appropriate checklist below (delete whatever is not relevant). These are the most common things requested on pull requests (PRs).

PR checklist

  • This comment contains a description of changes (with reason)
  • If you've fixed a bug or added code that should be tested, add tests!
  • If necessary, also make a PR on the nf-core/epitopeprediction branch on the nf-core/test-datasets repo
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Make sure your code lints (nf-core lint .).
  • Documentation in docs is updated
  • CHANGELOG.md is updated
  • README.md is updated

Learn more about contributing: https://github.com/nf-core/epitopeprediction/tree/master/.github/CONTRIBUTING.md

@apeltzer (Member) commented

General best practice would be to put this in a foo.py, place the script in the /bin folder of the pipeline, and call it accordingly in the script section here; that keeps the main.nf cleaner :-)

@skrakau (Member, Author) commented Mar 16, 2020

OK, will do! :)

@christopher-mohr (Collaborator) left a comment


Looks good to me, but how about adding an optional parameter to set the minimum peptide list size for the splitting procedure?

@christopher-mohr added the enhancement (New feature or request) label on Mar 24, 2020
@christopher-mohr linked an issue on Mar 24, 2020 that may be closed by this pull request
main.nf Outdated
@@ -38,6 +38,8 @@ def helpMessage() {
--max_peptide_length [int] Specifies the maximum peptide length Default: MHC class I: 11 aa, MHC class II: 16 aa
--min_peptide_length [int] Specifies the minimum peptide length Default: MHC class I: 8 aa, MHC class II: 15 aa
--tools [str] Specifies a list of tool(s) to use. Available are: 'syfpeithi', 'mhcflurry', 'mhcnuggets-class-1', 'mhcnuggets-class-2'. Can be combined in a list separated by comma.
--split_peptides_maxchunks [int] Used in combination with '--peptides' or '--proteins': maximum number of peptide chunks that will be created for parallelization. Default: 100
@christopher-mohr (Collaborator) commented Mar 30, 2020


How about --peptides_split_maxchunks and --peptides_split_chunksize?

@skrakau (Member, Author):


split_peptides_minsize is just the minimum chunk size (apart from one file that contains the remainder); it will become bigger if necessary to avoid more than peptides_split_maxchunks chunks.

--peptides_split_minchunksize or would that be too long?
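
To illustrate that interplay, here is a hedged sketch with hypothetical numbers (not taken from the PR): the effective chunk size starts at the configured minimum and grows so that the chunk count stays within the maximum.

```python
import math

# Hypothetical numbers, only to illustrate the described behaviour.
n_peptides = 1_000_000
peptides_split_minchunksize = 5000
peptides_split_maxchunks = 100

# Chunk size is at least the minimum, but grows so that no more than
# peptides_split_maxchunks chunks are produced.
chunk_size = max(peptides_split_minchunksize,
                 math.ceil(n_peptides / peptides_split_maxchunks))
n_chunks = math.ceil(n_peptides / chunk_size)
print(chunk_size, n_chunks)  # -> 10000 100
```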

@christopher-mohr (Collaborator):


I'm wondering if we even need both. Wouldn't it be enough to let users define the max number of chunks?

@skrakau (Member, Author):


You mean the actual number of chunks that is created? That would be a third parameter, but yes, it might be a good solution to keep it simple. Or do you really not want to change the *-minchunksize parameter? Then we could let the user define *-maxchunks; I just understood that you wanted to change that as well ...

@christopher-mohr (Collaborator):


I mean that it might be sufficient to have a maxchunks parameter, since that indirectly defines the size of the chunks anyway.

@skrakau (Member, Author) commented Apr 9, 2020:
  • changed --split_peptides_maxchunks -> --peptides_split_maxchunks
  • changed --split_peptides_minsize -> --peptides_split_minchunksize
  • improved help within split_peptides.py ("Max." -> "Maximum" etc.)
  • adjusted whitespace in the epitopeprediction helpMessage()

@christopher-mohr merged commit ad06e45 into nf-core:dev on Apr 14, 2020
Labels: enhancement (New feature or request)
Projects: None yet
Development: successfully merging this pull request may close the issue "Add splitting for peptides"
3 participants