Move away from arbitrary heuristic concept domain assignment model to an extensible cascading ruleset · Issue #734 · OHDSI/CommonDataModel · GitHub
There is no single methodology for domain classification. The Book of OHDSI suggests that domain assignment should be heuristic, and even refers back to the vocabularies code base:
> Domain assignments are an OMOP-specific feature done during vocabulary ingestion using a heuristic laid out in [Pallas](https://github.com/ohDSI/vocabulary-v5.0).
This creates issues at the design level, enabling circular reasoning and arbitrary decisions. For example, should scar tissue formations be classified as Conditions or as Observations? The only source of truth for answering this is the heuristic laid out in the code, which implementers can change at will.
Of course, domain assignment is a very difficult task. Sure, most times a diagnosis is made, it is unambiguously a Condition, and when a substance is injected, it is likely a Drug. But there will always be ambiguity around symptoms and syndromes vs. diseases, biological substances vs. pharmaceutical drugs, measurements vs. observations, and many other edge cases. Nevertheless, having an at least somewhat formalized Domain definition page on the Vocabulary wiki would lay down important groundwork, and there are precedent systems that enable pinpoint documentation of precise edge cases.
As a reference example, SNOMED has a system of short template pages defining how new concepts in particular domains should be modelled; we could have the same hierarchy of templates for domain assignment in edge cases. There is nothing wrong with subjective reasoning (we are operating with made-up high-level abstractions over physical processes and ephemeral emergent properties of biological systems), nor with justifying decisions by precedent, but as the circle of Vocabulary authors grows, a documentation system is required for consistency and predictability.
And thanks to SNOMED’s multiaxial hierarchy, these conventions would also be easy to implement routinely for most domains, as there can be an arbitrary number of rule definitions in the OMOP SNOMED representation. Other vocabularies are rarely as cleanly structured, but they often either:
A: map to SNOMED as Standard, or
B: divide codes semantically into ranges, which can serve as a sorting mechanism for domains.
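
To make the idea concrete, here is a minimal Python sketch of what a cascading ruleset could look like. Everything in it is an assumption made for illustration: the toy hierarchy, the concept identifiers (not real SNOMED codes), the code ranges, the priorities, and the function names `assign_domain` and `assign_by_range` are all hypothetical, and the Observation choice for scar tissue is a placeholder rather than a proposed answer to the question above. It is not how the vocabulary code base works today.

```python
from dataclasses import dataclass
from typing import Optional

# Toy multiaxial hierarchy (child -> parents). Identifiers are made up
# for illustration and are NOT real SNOMED concepts or codes.
PARENTS = {
    "clinical_finding": [],
    "disease": ["clinical_finding"],
    "scar_tissue": ["clinical_finding"],
    "keloid_scar": ["scar_tissue", "disease"],
    "procedure": [],
}


@dataclass(frozen=True)
class AncestorRule:
    ancestor: str   # rule applies to this concept and all of its descendants
    domain: str     # domain assigned when the rule wins
    priority: int   # higher priority overrides lower (the "cascade")
    rationale: str  # would link to the documented convention page


# Hypothetical rules: broad defaults first, documented edge cases on top.
RULES = [
    AncestorRule("clinical_finding", "Condition", 10, "default for findings"),
    AncestorRule("scar_tissue", "Observation", 20, "placeholder edge-case rule"),
    AncestorRule("procedure", "Procedure", 10, "default for procedures"),
]


def ancestors(concept: str) -> set:
    """Return the concept itself plus all of its transitive ancestors."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def assign_domain(concept: str) -> Optional[AncestorRule]:
    """Pick the highest-priority rule whose anchor is an ancestor of the concept."""
    lineage = ancestors(concept)
    matching = [rule for rule in RULES if rule.ancestor in lineage]
    return max(matching, key=lambda rule: rule.priority, default=None)


# Option B: a vocabulary whose codes are split semantically into ranges.
# Ranges are invented; fixed-width codes are compared lexicographically.
RANGE_RULES = [
    ("A00", "M99", "Condition"),
    ("P00", "P99", "Procedure"),
]


def assign_by_range(code: str) -> Optional[str]:
    """Return the domain of the first range containing the code, if any."""
    for start, end, domain in RANGE_RULES:
        if start <= code <= end:
            return domain
    return None
```

With these placeholder rules, `assign_domain("keloid_scar")` returns the `scar_tissue` rule, because it outranks the broader `clinical_finding` default even though the concept also descends from `disease`. Returning the whole rule rather than just the domain keeps the rationale attached to each decision, which is the documentation trail argued for above.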