Reformat the data conversion stages to fit better with pipeline-oriented processing · Issue #30 · askap-craco/CELEBI · GitHub
Open
@d-r-scott

Description


🚀 Feature Request

The data conversion stages (i.e. the conversion from vcraft files to CODIF, and then from CODIF to DiFX) are structured around the assumption of a single user actively "doing" the processing. This is not ideal for the pipeline-oriented approach the processing has taken on since implementing Nextflow, and a more modular approach would be better.

🔈 Motivation

Nextflow handles process parallelisation and submission to slurm queues, both of which are currently handled internally by the conversion scripts (specifically within vcraft2obs.py and askap2difx.py). This improved efficiency under the old active-user approach, but is now the largest bottleneck in the localisation of FRBs. It also makes the resource consumption and actual wall time of these stages more opaque than they could be (e.g. Nextflow's reports know nothing about what the actual slurm jobs are doing, since all Nextflow sees is some jobs being submitted to a queue).

🛰 Alternatives

Each of the conversion stages could be separated into its own Nextflow process, and the data handling possibly made more granular, so that Nextflow itself handles the parallelisation and minimises waiting time.
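A minimal sketch of what this separation might look like as Nextflow (DSL2) processes. The process names, script bodies, and `params.vcraft_glob` parameter are purely illustrative assumptions, not the actual CELEBI implementation:

```nextflow
// Hypothetical restructuring: one Nextflow process per conversion
// stage, one task per input file. Names and commands are
// placeholders, not CELEBI's real scripts.

process vcraft_to_codif {
    input:
        path vcraft_file
    output:
        path "*.codif"
    script:
    """
    # placeholder for the vcraft -> CODIF conversion of one file
    convert_vcraft ${vcraft_file}
    """
}

process codif_to_difx {
    input:
        path codif_file
    output:
        path "*.difx"
    script:
    """
    # placeholder for the CODIF -> DiFX conversion of one file
    convert_codif ${codif_file}
    """
}

workflow {
    // Nextflow fans out one task per file and submits them via
    // the configured executor (e.g. slurm), so the Python scripts
    // no longer need to manage job submission themselves, and
    // Nextflow's reports see the real per-task resource usage.
    vcraft_ch = Channel.fromPath(params.vcraft_glob)
    codif_to_difx(vcraft_to_codif(vcraft_ch))
}
```

With this structure, parallelism and queue submission are configured declaratively (e.g. `executor 'slurm'` in nextflow.config) rather than inside vcraft2obs.py and askap2difx.py.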

Labels: enhancement (New feature or request)