Reformat the data conversion stages to fit better with pipeline-oriented processing · Issue #30 · askap-craco/CELEBI
🚀 Feature Request
The data conversion stages (i.e. the conversion from `vcraft` files to `codif`, and then from `codif` to `difx`) are structured around the assumption of a single user actively "doing" the processing. This is not ideal for the pipeline-oriented approach the processing has taken on since implementing Nextflow, and a more modular approach would be better.

🔈 Motivation
Nextflow handles process parallelisation and submission to slurm queues, both of which are currently handled internally by the conversion processes (specifically within `vcraft2obs.py` and `askap2difx.py`). This improved efficiency in the old active-user approach, but is now the largest bottleneck in the localisation of FRBs. It also makes the resource consumption and actual time required for these stages more opaque than they need to be (e.g. Nextflow's reports know nothing about what the actual slurm jobs are doing, since all Nextflow sees is some jobs being submitted to a queue).

🛰 Alternatives
Each of the conversion stages could be separated into their own processes, and their data handling possibly made more granular so that Nextflow can handle the parallelisation and minimise waiting time.
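As a rough illustration of what this separation might look like, here is a minimal Nextflow DSL2 sketch, not the actual implementation: the process names, the `params.vcraft_glob` parameter, and the `convert_*` commands are placeholders standing in for the logic currently inside `vcraft2obs.py` and `askap2difx.py`.

```nextflow
// Sketch only: process bodies are placeholders, not the real conversion commands.
nextflow.enable.dsl = 2

// One task per vcraft file, so Nextflow (not the script) fans the work out to slurm.
process vcraft_to_codif {
    executor 'slurm'        // Nextflow submits and tracks each slurm job itself

    input:
    path vcraft_file

    output:
    path "*.codif"

    script:
    """
    convert_vcraft_to_codif ${vcraft_file}   # placeholder for the real converter
    """
}

// Second stage consumes the codif files and produces the difx inputs.
process codif_to_difx {
    executor 'slurm'

    input:
    path codif_files

    output:
    path "difx_out"

    script:
    """
    convert_codif_to_difx ${codif_files} difx_out   # placeholder command
    """
}

workflow {
    vcraft_ch = Channel.fromPath(params.vcraft_glob)   // e.g. one file per antenna/card
    codif_ch  = vcraft_to_codif(vcraft_ch)
    codif_to_difx(codif_ch.collect())
}
```

With the stages expressed as separate processes, every slurm submission is a task Nextflow launched itself, so its execution reports would capture the per-job resource usage and wall time that are currently opaque.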