Move away from arbitrary heuristic concept domain assignment model to an extensible cascading ruleset · Issue #734 · OHDSI/CommonDataModel

Open
ekorchmar opened this issue Mar 11, 2025 · 0 comments

There is no single methodology for domain classification. The Book of OHDSI suggests that domain assignment should be heuristic, and even refers back to the vocabularies code base:

> Domain assignments are an OMOP-specific feature done during vocabulary ingestion using a heuristic laid out in [Pallas](https://github.com/ohDSI/vocabulary-v5.0).

This creates issues at the design level, enabling circular reasoning and arbitrary decisions. For example, should scar tissue formations be classified as Conditions or as Observations? The only source of truth for answering this is the heuristic laid out in the code, which implementers can change at will.

Of course, domain assignment is a very difficult task. Most of the time, when a diagnosis is made, it is unambiguously a Condition, and when a substance is injected, it is likely a Drug. But there will always be ambiguity around symptoms and syndromes vs. diseases, biological substances vs. pharmaceutical drugs, measurements vs. observations, and many other edge cases. Nevertheless, having at least a somewhat formalized Domain definition page on the Vocabulary wiki would lay important groundwork, and there are precedent systems that enable pinpoint documentation of precise edge cases.

As a reference example, SNOMED has a system of short template pages that define how new concepts in particular domains should be modelled; we could have a similar hierarchy of templates for domain assignment in edge cases. There is nothing wrong with subjective reasoning (we are operating with made-up high-level abstractions over physical processes and ephemeral emergent properties of biological systems), nor with justifying decisions by precedent, but as the circle of Vocabulary authors grows, a documentation system is required for consistency and predictability.

And thanks to SNOMED’s multiaxial hierarchy, these conventions would also be fairly routine to implement for most of the domains, as there can be an arbitrary number of rule definitions in the OMOP SNOMED representation. Other vocabularies are rarely as cleanly structured, but they also often:
A: map to SNOMED as Standard
B: divide codes semantically into ranges, which can serve as a sorting mechanism for domains.
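
To make the idea of a cascading ruleset more concrete, here is a minimal Python sketch of one way it could work over a multiaxial hierarchy. Everything in it is an assumption for illustration: the `Rule` class, the `assign_domain` function, the toy `PARENTS` hierarchy, and the rules themselves are hypothetical and are not taken from the Vocabulary code base or from SNOMED. In particular, the scar rule only shows how an edge-case override would be expressed; it is not a recommendation on how scars should be classified. The cascade chosen here is "the rule attached to the nearest ancestor wins", with an explicit priority as a tie-breaker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    ancestor: str      # rule applies to this concept and all of its descendants
    domain: str        # domain assigned when the rule wins
    priority: int = 0  # tie-breaker when two rules match at the same distance


def ancestor_distances(concept, parents):
    """Walk up the is-a hierarchy breadth-first; return {ancestor: shortest distance}."""
    dist = {concept: 0}
    frontier = [concept]
    while frontier:
        next_frontier = []
        for node in frontier:
            for parent in parents.get(node, []):
                if parent not in dist:
                    dist[parent] = dist[node] + 1
                    next_frontier.append(parent)
        frontier = next_frontier
    return dist


def assign_domain(concept, parents, rules, default=None):
    """Pick the domain from the rule on the nearest ancestor; higher priority breaks ties."""
    dist = ancestor_distances(concept, parents)
    matching = [r for r in rules if r.ancestor in dist]
    if not matching:
        return default
    best = min(matching, key=lambda r: (dist[r.ancestor], -r.priority))
    return best.domain


# Toy hierarchy (child -> parents). Names are illustrative, not real concept codes.
PARENTS = {
    "Disease": ["Clinical finding"],
    "Symptom": ["Clinical finding"],
    "Pneumonia": ["Disease"],
    "Scar": ["Disease"],
    "Scar of skin": ["Scar"],
}

# Hypothetical ruleset: a broad default, a refinement, and a documented edge case.
RULES = [
    Rule("Clinical finding", "Observation"),
    Rule("Disease", "Condition"),
    Rule("Scar", "Observation"),  # illustrative override only
]

for name in ["Pneumonia", "Symptom", "Scar of skin", "Unknown concept"]:
    print(name, "->", assign_domain(name, PARENTS, RULES))
```

Under this scheme, documenting a new edge case means adding one rule at the relevant node rather than editing a heuristic, and each rule could link to its own template page with the justification.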
