10000 add a separate error for names with wrong type and remove such keys by lukaspie · Pull Request #638 · FAIRmat-NFDI/pynxtools · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

add a 8000 separate error for names with wrong type and remove such keys #638

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
May 23, 2025

Conversation

lukaspie
Copy link
Collaborator
@lukaspie lukaspie commented May 14, 2025

The idea is to catch the following situation: a field/group is given that has a name that another group/field at the same level has (and which is therefore reserved).

Approach:

  1. Throw and log an error on such cases
  2. this error should be separate to the existing checks for concept names
  3. if this error is thrown, it should result in that field not to be written at all (i.e., it shall be removed from the template)

@lukaspie lukaspie mentioned this pull request May 14, 2025
4 tasks
@lukaspie
Copy link
Collaborator Author
lukaspie commented May 14, 2025

One problem that we have here (but also in #636) is if there is a link in the template. Then the key is e.g. /ENTRY[entry]/SAMPLE[sample]/temperature_env/temperature_sensor, which means that the expected type based on the current logic is a field. But this actually links a whole group.

@lukaspie lukaspie force-pushed the 599-undescriptive-error-message-cleanup branch from 9c26dbe to 241570a Compare May 19, 2025 07:01
@lukaspie lukaspie force-pushed the 599-undescriptive-error-message-cleanup branch from 241570a to 0900383 Compare May 19, 2025 12:33
@lukaspie lukaspie force-pushed the 599-undescriptive-error-message-cleanup branch from 0900383 to 1a88532 Compare May 19, 2025 12:43
@rettigl
Copy link
Collaborator
rettigl commented May 19, 2025

If I add

  "/ENTRY/entry_identifier": {
    "identifier":"@attrs:metadata/loader/scan_path"
  },

I get:

ERROR: The type ('group') of the given concept 'entry_identifier' conflicts with another existing concept of the same name, which is of type 'field').
WARNING: The attribute /ENTRY[entry]/entry_identifier/identifier will not be written.

The validation does not pick up that I try to write a group with a name that is defined as field.
If I add a concept, I get:

ERROR: The type ('group') of the given concept 'COLLECTION[entry_identifier]' conflicts with another existing concept of the same name, which is of type 'field').
WARNING: The attribute /ENTRY[entry]/COLLECTION[entry_identifier]/identifier will not be written.
WARNING: Given field name 'COLLECTION' conflicts with the non-variadic name 'entry_identifier (opt)'

If I have the conflicting field also in the dict, I get:

ERROR: The type ('group') of the given concept 'COLLECTION[entry_identifier]' conflicts with another existing concept of the same name, which is of type 'field').
WARNING: The attribute /ENTRY[entry]/COLLECTION[entry_identifier]/identifier will not be written.
WARNING: Given field name 'COLLECTION' conflicts with the non-variadic name 'entry_identifier (opt)'

But the conversion works.

@rettigl
Copy link
Collaborator
rettigl commented May 19, 2025

Comparing the behavior of #636 and this PR, I would prefer the functionality here. But the reported warnings/consistency can still be improved (i.e. what exactly is being removed, element types etc.).

@rettigl
Copy link
Collaborator
rettigl commented May 19, 2025

Link checking does not seem to work properly yet. If I add a link

    "RESOLUTION[angular_resolution]": {
      "resolution": "@link:/entry/instrument/electronanalyzer/angular_resolution",
      "resolution/@units": "deg",
      "physical_quantity": "angle",
      "type": "estimated"
    },

I don't get a warning, even though I linked a group to a defined field name.
If I add the same structure manually

    "RESOLUTION[angular_resolution]": {
      "resolution": {
        "RESOLUTION[angular_resolution]": {
          "resolution": 1.0
        }
      },
      "resolution/@units": "deg",
      "physical_quantity": "angle",
      "type": "estimated"
    },

, i get:

ERROR: The type ('group') of the given concept 'resolution' conflicts with another existing concept of the same name, which is of type 'field').
WARNING: The attribute /ENTRY[entry]/INSTRUMENT[instrument]/RESOLUTION[angular_resolution]/resolution/RESOLUTION[angular_resolution]/resolution will not be written.
WARNING: Unit /ENTRY[entry]/INSTRUMENT[instrument]/RESOLUTION[angular_resolution]/resolution/@units in dataset without its field /ENTRY[entry]/INSTRUMENT[instrument]/RESOLUTION[angular_resolution]/resolution.
WARNING: The attribute /ENTRY[entry]/INSTRUMENT[instrument]/RESOLUTION[angular_resolution]/resolution/@units will not be written.

@lukaspie
Copy link
Collaborator Author

Link checking does not seem to work properly yet. If I add a link

    "RESOLUTION[angular_resolution]": {
      "resolution": "@link:/entry/instrument/electronanalyzer/angular_resolution",
      "resolution/@units": "deg",
      "physical_quantity": "angle",
      "type": "estimated"
    },

I don't get a warning, even though I linked a group to a defined field name. If I add the same structure manually

    "RESOLUTION[angular_resolution]": {
      "resolution": {
        "RESOLUTION[angular_resolution]": {
          "resolution": 1.0
        }
      },
      "resolution/@units": "deg",
      "physical_quantity": "angle",
      "type": "estimated"
    },

, i get:

ERROR: The type ('group') of the given concept 'resolution' conflicts with another existing concept of the same name, which is of type 'field').
WARNING: The attribute /ENTRY[entry]/INSTRUMENT[instrument]/RESOLUTION[angular_resolution]/resolution/RESOLUTION[angular_resolution]/resolution will not be written.
WARNING: Unit /ENTRY[entry]/INSTRUMENT[instrument]/RESOLUTION[angular_resolution]/resolution/@units in dataset without its field /ENTRY[entry]/INSTRUMENT[instrument]/RESOLUTION[angular_resolution]/resolution.
WARNING: The attribute /ENTRY[entry]/INSTRUMENT[instrument]/RESOLUTION[angular_resolution]/resolution/@units will not be written.

Yeah, this is expected behavior as we don't really follow links currently.

The only thing that I implemented w.r.t. links in this PR is that it picks up that if there is a link, the type you expect to write can be either group or field. Like, if we have ENTRY[entry]/SAMPLE[sample]/temperature_env/temperature_sensor, from the key structure we expect a field for temperature_sensor. But if we have "link" in the value, we also allow defining groups like this (and don't throw the newly implemented error).

To actually follow the link and apply all the same checks as for regular fields is something I would implement in a separate PR.

@rettigl
Copy link
Collaborator
rettigl commented May 19, 2025

The only thing that I implemented w.r.t. links in this PR is that it picks up that if there is a link, the type you expect to write can be either group or field.

I don't understand that. If we want to have a semantically valid file, we should not allow to link a group to a name which is defined as a field (as in my example).

@lukaspie lukaspie linked an issue May 20, 2025 that may be closed by this pull request
Copy link
Collaborator
@rettigl rettigl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly lgtm and seems to do what it should, apart from a few comments

@rettigl
Copy link
Collaborator
rettigl commented May 20, 2025

Not sure why the tests fail, though

Co-authored-by: Laurenz Rettig <53396064+rettigl@users.noreply.github.com>
@lukaspie
Copy link
Collaborator Author

Not sure why the tests fail, though

I think at least for pynxtools-mpes/igor this is related to #641 and FAIRmat-NFDI/pynxtools-mpes#52.

lukaspie and others added 2 commits May 21, 2025 10:58
Co-authored-by: Laurenz Rettig <53396064+rettigl@users.noreply.github.com>
@rettigl
Copy link
Collaborator
rettigl commented May 21, 2025

Not sure why the tests fail, though

I think at least for pynxtools-mpes/igor this is related to #641 and FAIRmat-NFDI/pynxtools-mpes#52.

Indeed this seems to work with merged #641: FAIRmat-NFDI/pynxtools-igor#8
I still don't understand why this only occurs now in this PR.

@RubelMozumder
Copy link
Collaborator

Test from pynxtools-spm is failling because of the field and group having the same name conflict in NXsample and NXspm is actively using NXsample_component group to collect data for substrate and experiment_sample.

6D40
@lukaspie
Copy link
Collaborator Author

Test from pynxtools-spm is failling because of the field and group having the same name conflict in NXsample and NXspm is actively using NXsample_component group to collect data for substrate and experiment_sample.

Yes, that's true. I think if you want to use a NXsample_component group, you cannot name it sample_component as this is blocked for use in fields. Can you rename it?

@lukaspie
Copy link
Collaborator Author

@RubelMozumder we now see a bunch of errors and warnings for pynxtools-spm and pynxtools-xrd because we levelled up some of the previous warnings to errors, see helpers.py. IMO, these should be errors, e.g. if a required field is missing. So I suggest we merge here and then you fix spm and xrd separately.

@rettigl
Copy link
Collaborator
rettigl commented May 23, 2025

The logic for attributes with missing fields in the nexus tree still seems to be broken to some extend. If I add (in NXelectron_detector):

        "raw_data": {
          "energy/@type": "kinetic"
        }

I get:

ERROR: The data entry corresponding to /ENTRY[entry]/INSTRUMENT[instrument]/ELECTRONANALYZER[electronanalyzer]/ELECTRON_DETECTOR[detector]/raw_data/raw is required and hasn't been supplied by the reader.
WARNING: Missing attribute: "/ENTRY[entry]/INSTRUMENT[instrument]/ELECTRONANALYZER[electronanalyzer]/ELECTRON_DETECTOR[detector]/raw_data/@signal"
WARNING: There were attributes set for the field /ENTRY[entry]/INSTRUMENT[instrument]/ELECTRONANALYZER[electronanalyzer]/ELECTRON_DETECTOR[detector]/raw_data/energy, but the field does not exist.
WARNING: There were attributes set for the field /ENTRY[entry]/INSTRUMENT[instrument]/ELECTRONANALYZER[electronanalyzer]/ELECTRON_DETECTOR[detector]/raw_data/energy, but the field does not exist.
WARNING: The attribute /ENTRY[entry]/INSTRUMENT[instrument]/ELECTRONANALYZER[electronanalyzer]/ELECTRON_DETECTOR[detector]/raw_data/energy/@type will not be written.

The missing attribute error comes twice, which is not expected.

@lukaspie
Copy link
Collaborator Author

The logic for attributes with missing fields in the nexus tree still seems to be broken to some extend. If I add (in NXelectron_detector):

        "raw_data": {
          "energy/@type": "kinetic"
        }

I get:

ERROR: The data entry corresponding to /ENTRY[entry]/INSTRUMENT[instrument]/ELECTRONANALYZER[electronanalyzer]/ELECTRON_DETECTOR[detector]/raw_data/raw is required and hasn't been supplied by the reader.
WARNING: Missing attribute: "/ENTRY[entry]/INSTRUMENT[instrument]/ELECTRONANALYZER[electronanalyzer]/ELECTRON_DETECTOR[detector]/raw_data/@signal"
WARNING: There were attributes set for the field /ENTRY[entry]/INSTRUMENT[instrument]/ELECTRONANALYZER[electronanalyzer]/ELECTRON_DETECTOR[detector]/raw_data/energy, but the field does not exist.
WARNING: There were attributes set for the field /ENTRY[entry]/INSTRUMENT[instrument]/ELECTRONANALYZER[electronanalyzer]/ELECTRON_DETECTOR[detector]/raw_data/energy, but the field does not exist.
WARNING: The attribute /ENTRY[entry]/INSTRUMENT[instrument]/ELECTRONANALYZER[electronanalyzer]/ELECTRON_DETECTOR[detector]/raw_data/energy/@type will not be written.

The missing attribute error comes twice, which is not expected.

Yeah, this occured now because we only remove all collected keys in the end now. For this example, it was special because it is in NXdata and since we do not have signal/axes defined here, we never proceeded to the attributes for axisnames defined in the appdef. So we never removed it from the non-visited list. I have just implemented it such that in the check_attributes_of_nonexisting_field function, we also remove such keys from the non-visited list, so we do not need to check them again. Should fix your problem.

@lukaspie lukaspie merged commit 8bf3aab into master May 23, 2025
13 of 17 checks passed
@lukaspie lukaspie deleted the 599-undescriptive-error-message-cleanup branch May 23, 2025 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Undescriptive error message
3 participants
0