8000 Mark namefitted group as undocumented by lukaspie · Pull Request #570 · FAIRmat-NFDI/pynxtools · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Mark namefitted group as undocumented #570

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

lukaspie
Copy link
Collaborator
@lukaspie lukaspie commented Mar 3, 2025

@rettigl with this PR, we get an "is written without documentation" warning if a variadic concept name is used that is not part of the application definition.

I decided to reuse the undocumented warning here because this can be turned off with a flag in the CLI call of dataconverter. So if somebody is sure that they want to use this namefit anyway, we can just allow that by passing the `ignore-undocumented´ flag.

Fixes #563

@lukaspie lukaspie requested a review from rettigl March 3, 2025 10:25
@lukaspie lukaspie linked an issue Mar 3, 2025 that may be closed by this pull request
Copy link
Collaborator
@rettigl rettigl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems to work, however it forces now the NAME[instance] notation also for variadic Fields. In fact, I support this, as previously namefitting was sometimes not correct.

One improvement I suggest is to add a separate clause for NXcollections. These can as I understand it appear anywhere:

"This is the NXcollection class which can appear anywhere underneath NXentry"

@rettigl
Copy link
Collaborator
rettigl commented Mar 3, 2025

I used NXroot before to write nexus files w/o schema. These give warnings now, see igor-reader tests. Maybe this is fine.

@rettigl
Copy link
Collaborator
rettigl commented Mar 3, 2025

Mpes test failures are all with PROCESS_MPES, which should be removed anyways, so this will be resolved later.

@rettigl
Copy link
Collaborator
rettigl commented Mar 3, 2025

I used NXroot before to write nexus files w/o schema. These give warnings now, see igor-reader tests. Maybe this is fine.

Ah no, NXroot contains NXentry. The issue is that we need ENTRY[scan...]/DATA[data]/DATA[data] now.

@lukaspie lukaspie linked an issue Mar 4, 2025 that may be closed by this pull request
@rettigl
Copy link
Collaborator
rettigl commented Mar 5, 2025

@lukaspie Maybe we merge this into #554, fix the tests accordingly, and once the new definitions are out, merge this into master? Or do you think it makes sense fixing reader tests now before the new release comes?

@rettigl rettigl changed the base branch from fix_validation_and_add_more_tests to master March 14, 2025 08:54
@rettigl
Copy link
Collaborator
rettigl commented Mar 14, 2025

@lukaspie I reset this to merge to master to run tests.
I added a test for a failing namefitting, which namefits any illegal fixed name to a variadic group with required children. Maybe you can have a look how to solve this.

@lukaspie
Copy link
Collaborator Author

@rettigl I implemented what we discussed today, i.e. you need to give the concept name if you want to use a variadic concept. Note that we still need to do namefitting because we have the additional constraint of the nameType which the new field needs to fit to (this is not super important here, but will be with the new definitions). I think the new version will make namefitting and testing much cleaner.

Validation tests are passing now, but the plugin tests are still failing. Some of that is related to the addittional checks we are running now (like the type of the @vector attribute in MPES), but there seem to be some things that are additionally wrong:

  • For pynxtools-mpes, WARNING: Missing attribute: "/ENTRY[entry]/data/@signal". Not sure where that comes from.

Note that for pynxtools-mpes, I get Field /ENTRY[entry]/PROCESS_MPES[process]/CALIBRATION[ky_calibration]/calibrated_axis written without documentation even though NXcalibration/calibrated_axis is a non-variadic concept. This should actually be /ENTRY[entry]/PROCESS_MPES[process]/kN_CALIBRATION[ky_calibration]/calibrated_axis, then it works.

Will continue next week.

@rettigl
Copy link
Collaborator
rettigl commented Mar 14, 2025

Excellent, thanks. I already briefly looked into the mpes errors. The calibration error comes because in the referenced definitions, it is calibrated_AXIS for some reason, instead of calibrated_exis in NXcalibration. Not sure where that came from. The other issue I think is also with the config file, I will start an appdate branch.

@rettigl
Copy link
Collaborator
rettigl commented Mar 14, 2025

Excellent, thanks. I already briefly looked into the mpes errors. The calibration error comes because in the referenced definitions, it is calibrated_AXIS for some reason, instead of calibrated_exis in NXcalibration. Not sure where that came from. The other issue I think is also with the config file, I will start an appdate branch.

Checking of group attributes was not properly handled, because the "@" was not removed in the keys dict. I changed the logic now that the @ should be in the dict for all attributes.

What is still not working now is the proper handling of @url, because it is considered a concept rather than a name. This should be hopefully fixed by the new nameType code, I assume?

@lukaspie
Copy link
Collaborator Author
lukaspie commented Mar 17, 2025

@rettigl why was you last commit (25e0013) neccessary? I think we can just do:

best_match = None

for node in nodes:
    if not node.variadic:
        if instance_name == node.name:
            return node
    else:
        if concept_name and concept_name == node.name:
            if instance_name == node.name:
                return node

            score = get_nx_namefit(instance_name, node.name)
            if score > -1:
                best_match = node

return best_match

With the early-return, we will prefer direct name match over concept match anyway.

@rettigl
Copy link
Collaborator
rettigl commented Mar 17, 2025

@rettigl why was you last commit (25e0013) neccessary? I think we can just do:

best_match = None

for node in nodes:
    if not node.variadic:
        if instance_name == node.name:
            return node
    else:
        if concept_name and concept_name == node.name:
            if instance_name == node.name:
                return node

            score = get_nx_namefit(instance_name, node.name)
            if score > -1:
                best_match = node

return best_match

With the early-return, we will prefer direct name match over concept match anyway.

This might also work. With the code before there were randomly different behaviors, as the elements in the list appear randomly ordered. So we need to parse all emements before we can make a decision. the code above should also work, I guess. The missing documentation for data/energy/@type here: https://github.com/FAIRmat-NFDI/pynxtools/actions/runs/13865392531/job/38803289191#step:9:230 was due to the wrong mapping of AXISNAME[energy] to AXISNAME rather than energy.

@lukaspie
Copy link
Collaborator Author

@rettigl why was you last commit (25e0013) neccessary? I think we can just do:

best_match = None

for node in nodes:
    if not node.variadic:
        if instance_name == node.name:
            return node
    else:
        if concept_name and concept_name == node.name:
            if instance_name == node.name:
                return node

            score = get_nx_namefit(instance_name, node.name)
            if score > -1:
                best_match = node

return best_match

With the early-return, we will prefer direct name match over concept match anyway.

This might also work. With the code before there were randomly different behaviors, as the elements in the list appear randomly ordered. So we need to parse all emements before we can make a decision. the code above should also work, I guess. The missing documentation for data/energy/@type here: FAIRmat-NFDI/pynxtools/actions/runs/13865392531/job/38803289191#step:9:230 was due to the wrong mapping of AXISNAME[energy] to AXISNAME rather than energy.

Ok, the new version also seems to work (indeed, my old code was not iterating all nodes before making a decisions, but this is fixed now).

From what I can tell, all the remanining issues with the plugin tests are either solved by using the nameType properly or they are real issues that must be fixed within the plugins. My suggestion is that we merge here (either directly to master or through #585), with the plugin tests failing. Then I can rebase #578 and we can make the plugin updates using the new definitions.

Copy link
Collaborator
@rettigl rettigl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm. Merge here, and close the others

@lukaspie lukaspie merged commit b83c55e into master Mar 17, 2025
6 of 17 checks passed
@lukaspie lukaspie deleted the 563-bug-converter-does-not-warn-if-variadic-concept-name-is-invalid branch March 17, 2025 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: converter does not warn if variadic concept name is invalid Improve search for fields when they are not directly in the appdef
4 participants
0