Adds first iteration of dataconverter section #655
fe5ff0d to a1ed455
Just added some comments for now. We need to define early on what an application definition is, before we start saying we validate against it.
Could possibly also use one sentence about ELNs (to add metadata missing from proprietary files) and how the ELN Generator helps with that.
Language polishing and resolving open comments remain; afterwards this should be merged.
Reasonable changes; this can be merged to see the draft in full.
BEFORE MERGING, the pipeline needs a fix!
paper/paper.md
Outdated
# Statement of need
Achieving FAIR (Findable, Accessible, Interoperable, and Reproducible) data principles in experimental physics and materials science requires consistent implementation of standardized data formats. While NeXus provides comprehensive data specifications for structured scientific data storage, pynxtools simplifies the implementation process for developers and researchers by providing guided workflows and automated validation to ensure complete compliance. Existing tools [@Koennecke:2024; @Jemian:2025] provide solutions with individual capabilities, but none offers a comprehensive end-to-end workflow for proper NeXus adoption. pynxtools addresses this critical gap by providing an accessible framework that enforces complete NeXus application definition compliance through automated validation, detailed error reporting for missing required data points, and clear implementation pathways via configuration files and extensible plugins. This approach transforms NeXus from a complex specification into a practical solution, enabling researchers to achieve true data interoperability without deep technical expertise in the underlying standards.
Achieving FAIR (Findable, Accessible, Interoperable, and Reproducible) data principles in experimental physics and materials science requires consistent implementation of standardized data formats. While NeXus provides comprehensive data specifications for structured storeage of scientific data, pynxtools simplifies the implementation process for developers and researchers by providing guided workflows and automated validation to ensure complete compliance. Existing tools [@Koennecke:2024; @Jemian:2025] provide solutions with individual capabilities, but none offers a comprehensive end-to-end workflow for proper NeXus adoption. pynxtools addresses this critical gap by providing an accessible framework that enforces complete NeXus application definition compliance through automated validation, detailed error reporting for missing required data points, and clear implementation pathways via configuration files and extensible plugins. This approach transforms NeXus from a complex specification into a practical solution, enabling researchers to achieve true data interoperability without deep technical expertise in the underlying standards.
"storage" instead of "storeage"
paper/paper.md
Outdated
The __dataconverter__ provides a command line interface (CLI) to produce NeXus files where users can use one of the built-in readers for generic functionality or technique-specific reader plugins distributed as separate Python packages.

For developers, the __dataconverter__ provides an abstract __reader__ class to create plugins that process experiment-specific file formats and try to fill the NeXus specification. A __Template__ object is passed to the __reader__ by the __dataconverter__ that acts like a form that has to be filled by the __reader__. __Template__ is a subclass of a regular Python __dict__ class. It keeps a similar interface to a __dict__ to help developers write code they are familiar with. It transparently ensures structural compliance with the selected NeXus application definition. __Template__ categorizes all data elements according to three requirement levels: required, recommended, and optional.
Last sentence: could say that these are the same requirement levels that NeXus itself defines (as written, it sounds like something we invented just for the Template class).
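To make the "form to be filled" idea in the hunk above concrete, here is a minimal sketch of a dict subclass that tracks the three requirement levels. It is not the pynxtools `Template` implementation: the class `SketchTemplate`, its `missing` method, the `fill` function, and the example paths and their assigned levels are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List

LEVELS = ("required", "recommended", "optional")


class SketchTemplate(dict):
    """Hypothetical form-like template; NOT the actual pynxtools class."""

    def __init__(self, schema: Dict[str, str]) -> None:
        # schema maps each allowed path to one of LEVELS (assumed layout).
        super().__init__()
        for level in schema.values():
            if level not in LEVELS:
                raise ValueError(f"unknown requirement level: {level!r}")
        self._schema = dict(schema)

    def __setitem__(self, path: str, value: Any) -> None:
        # Reject paths that the (assumed) application definition does not declare.
        if path not in self._schema:
            raise KeyError(f"{path} is not part of the schema")
        super().__setitem__(path, value)

    def missing(self, level: str = "required") -> List[str]:
        """Return the declared paths of the given level that are still unfilled."""
        return sorted(
            path
            for path, lvl in self._schema.items()
            if lvl == level and self.get(path) is None
        )


def fill(template: SketchTemplate) -> SketchTemplate:
    """Play the role of a reader: write parsed values into the form."""
    template["/ENTRY[entry]/title"] = "demo measurement"
    return template


# Illustrative paths and levels only; they are not taken from a real definition.
schema = {
    "/ENTRY[entry]/title": "required",
    "/ENTRY[entry]/start_time": "required",
    "/ENTRY[entry]/experiment_description": "recommended",
    "/ENTRY[entry]/notes": "optional",
}
template = fill(SketchTemplate(schema))
print(template.missing("required"))  # ['/ENTRY[entry]/start_time']
```

One caveat of this sketch: `dict.update` bypasses the overridden `__setitem__`, so a real implementation would need to guard that path as well.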
Pipeline to be fixed by #657, needs a rebase here afterwards
43c9d63 to b2f8f0e
I kept it as simple as possible for now. I can get into more details. But I feel like we should reduce the length of our parts anyways.