-
Notifications
You must be signed in to change notification settings - Fork 69
Handle empty input files when saving embedding plots #193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* master: Filter genes without expression before reduce data in save_embeddings.py Add progress messages to merge_benchmarks.py Check for empty files when merging benchmarks Added missing import statement in metrics.py Isolated label fix (#187)
Why delete the existing embedding output files? What should there be if there is an empty h5ad file? |
I assumed an empty |
I might mean we were not careful in how we save our data if there is sth to delete. Did you come across this case? I would maybe throw an error if this happens.. what do you think? Typical Snakemake procedure would be to overwrite though. |
Ah, I remember why I did this now. The Does anyone know what we do with the metrics if the |
In the metrics case we just output a file that has no value for all metric outputs (I think it was this, and not NAs). That is then captured in the metrics aggregation script to say the method wasn't run. @mumichae might be able to confirm this. I think the safest approach here would be to throw an error if the file exists. Then we can check manually and just run the snakemake command again if needed. |
In case the adata is empty, we would want empty output. If we had a non-empty output on the previous run, but for some reason the current run produces an empty adata, we wouldn't want to keep the previous embedding, so I'd overwrite any file that is output from the script but not captured by snakemake. But seeing as we provide all output filenames to the snakemake rule, snakemake should remove these files if they exist before a rerun. So all that you'll need to do is fix your script to output empty files. |
So to clarify, when we have failed runs, they result in empty .h5ad files. That means an output file will be created. Snakemake doesn't execute a rule if there is no input for it. If you require snakemake to create an output file that it can't create, it'll throw a missing input error. |
When input AnnData is empty in save_embeddings.py
Ok, I changed how empty output files are created. I think this is more what we want but if not I'm probably a bit confused about what to do. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say: yes... would you agree @mumichae ?
Yes, this looks fine |
This was a bit trickier than I thought because I think the pipeline will complain if there is no output. What I have done is:
.h5ad
file is empty, if yes:I think it works but I have only tested it very quickly so probably good if @danielStrobl can try running it before we merge.