GitHub - cukarthik/ehr-ops

EHR Ops Dashboard

This repository is to guide you through EHR Ops Dashboard. EHR Ops Dashboard is a tableau dashboard provided by AllofUs analytics team to reflect the data quality across over HPOs.

⚠ NO PHI entry in this repo.

High Level Summary

Definition of metrics and related queries
How to read the dashboard
What's to expect in new coming features

Definition of Metrics and Related Queries

1. Data Transfer Rate A

Definition: The number of participants with EHR data per HPO divided by the number of all participants eligible for EHR data transfer
Metrics: Numerator is generated by eligible_participants_ehr.sql. Denominator is generated by all_eligible_participants.sql

2. Data Transfer Rate B

Definition: The number of participants with EHR data per HPO divided by the number of all participants eligible for EHR data transfer who have completed an in-person visit (physical measurements OR biospecimen)
Metrics: Numerator is generated by eligible_participants_ehr.sql. Denominator is generated by in_person_participants.sql

3. General Conformance

GC-1
Every row in the condition_occurrence, visit_occurrence, procedure_occurrence, drug_exposure, measurement and observation tables should have a well defined concept_id - GC-1 Standard - Every row should have a standard concept_id with correct domain and it's not null or 0. - The metric is generated by gc_1_standard.sql. - GC-1 Source - Every row should have a source concept_id with correct domain and it's not null or 0. - The metric is generated by gc_1_source.sql.
GC-2
Every foreign keys should be valid. For example, the foreign keys checked in condition_occurrence_table includes condition_concept_id, visit_occurrence_id, person_id, condition_type_concept_id, condition_source_concept_id, condition_status_concept_id
GC-3
- GC-3 Standard
  - Every row should have a concept_id which is not null or 0.
  - The metric is generated by gc_1_standard.sql.
- GC-3 Source
  - Every row should have a source concept_id which is not null or 0.
  - The metric is generated by gc_1_source.sql.

4. Date Conformance

DC-1
- There should not be any end dates before start dates in any of the clinical data tables.
- The metric is generated by dc_1.sql
DC-2
- No data points should exist beyond 30 days of death date, if applicable.
- The metric is generated by dc_2.sql
DC-3
- No dates should be prior to 1900 (for observation) or 1980 (for other clinical data tables).
- The metric is generated by dc_3.sql
DC-4
- Date and datetime fields should match.
- The metric is generated by dc_4.sql

5. General DQ Metrics

Unit Concept Failure
- In measurement table, rows with numerical value_source_value or value_as_number should have standard concept_id and have the domain "Unit"
- The metric is generated by unit_route_failure.sql
Route Concept Failure
- In drug_exposure table, concept_ids should both be standard and have the domain "Route"
- The metric is generated by unit_route_failure.sql
Visit ID Failure
- In condition and procedure tables, rows should not have null/0 visit_occurrence_id and these visit_occrrence_id should also exist in visit_occurrence table
- The metric is generated by visit_id_failure.sql
Duplicates
- In each table, there should not be duplicated rows.
- The metric is generated by duplicates.sql

6. COVID Mapping

In measurement table, rows related to COVID-19 results (i.e. rows with concept_id being desendant of 756055) should have neither null or 0 in value_as_concept_id and the value_source_value field should be mapped to standard concepts we provide on our website.
https://sites.google.com/view/ehrupload/data-quality-metrics/covid-19-data-mapping?authuser=0
The metric is generated by covid_mapping.sql

7. Physical Measurements

All sites should be submitting data on participants' physical measurements. Specifically, we check against body height, body weight, and BMI.
The metric is generated by physical_meas.sql

How to Read the Dashboard

Filter Session
- You could filter on the awardee or organization level to focus on the sites that you are interested in.
- Click the filter and then apply the selections.
Aggregate Data Transfer Rate Session
- This session shows aggregate number of awardees/organizations with data transfer rate falling in different threshold.
- The calculation of Data Transfer Rate A and B are in the tooltips when you hover over the aggreagate numbers.
- If you click on the number, the dashboard will be filtered to show the sites which are in that number category only.
Data Transfer Rate Session
- This session is a detailed dashboard with all organizations' metrics for Data Transfer Rate and most recent submission time.
- You could hover over the green bar to see the break-down of participants with different status (eligible participants, eligible participants with EHR data and eligible participants with in-person visit)
- You could also hover over the blue dot to see details of the metrics.
- The calculation and definition are in the tooltips.
Data Conformance Session
- This session is showing the general conformance and date conformance metrics.
- If you hover over the cell, the success rate break-down by each table will show up. This is for your reference to check your table in further detail.
Definition Session
- This session is giving you the definition of each jargon and metric.

What's to expect 9116 in new coming features?

Here are the metrics still under design and implementation:

Trending chart of metrics
Other measurements and drug integration

SQL Compilation

Use compile.py to render, or compile, query templates into runnable SQL.

Requirements

Python 3.7+

Note: The examples below assume a modern shell environment.

Example 1: Compiling a single template file

Assuming this is the contents of query_1.sql:

SELECT
 field_1
,field_2
FROM `{{project_id}}.{{dataset_id}}.table_1`

executing

python compile.py query_1.sql --project_id 'drc-prod' --dataset_id 'ehr-ops'

stores the compiled file at .compiled/query_1.sql

SELECT
 field_1
,field_2
FROM `drc-prod.ehr-ops.table_1`

Example 2: Compiling multiple files

Assuming these template files exist:

query_1.sql
metrics/query_2.sql

executing the command

python compile.py **/*.sql --project_id 'drc-prod' --dataset_id 'ehr-ops'

compiles them and saves the results at:

.compiled/query_1.sql
.compiled/metrics/query_2.sql

Workbook Version Control

The XML-formatted workbook is stored in the file EHR Ops Dashboard.twb. By default, the XML may contain sensitive session-specific information (usernames, project names, etc.) when saving via Tableau Desktop. This information must be stripped out before committing to the repo. Follow the following instructions to setup a pre-commit workflow that will accomplish this.

Setup

Install project packages.

pip install - r requirements.txt
Create a file named .cipher and add key-value pairs for variables that need to be stripped out.
- Follow the example in .cipher.example.
- Currently these 3 variables are required to be set:
  - pdr_project
  - curation_dataset
  - rdr_dataset
Initialize a pre-commit hook that flags the workbook for sensitive information.

sh ./init_pre_commit.sh
- Utilizes the pre-commit library.
- Follows configuration in .pre-commit-config.yaml.
- Executes the script strip_workbook.py before each commit.

Commit Workflow

*Starting at the point after workbook is modified in Tableau Desktop

(optional) Execute check_workbook.py script to check if workbook contains sensitive information.

python check_workbook.py
(optional) If outputted message indicates sensitive information, execute strip_workbook.py script to clean.

python strip_workbook.py
Stage changes in working directory.
Commit the changes. An automated pre-commit process will run and do the following:
1. Runs the strip_workbook.py script before committing to remove sensitive information.
2. Runs the restore_workbook.py script after committing to restore variable values in workbook.
  - Note: a minor side effect of this process is that the restored workbook is automatically re-staged. To unstage, manually execute git reset HEAD "EHR Ops Dashboard.twb"

Restoring Workbook

Take an already stripped workbook.
Execute the restore_workbook.py script to reset variables stored in .cipher.
Open the workbook in Tableau Desktop and navigate the interface to reset data source connection parameters.
Proceed with development as normal.

Pull Requests and Merging

If a pull request indicates a merge conflict:

In local repo, merge the destination branch into the source branch.
If conflicts are within the EHR Ops Dashboard.twb file, accept all current changes via a conflict resolution tool (essentially overwriting the workbook).
Commit and push to conclude conflict resolution.

Developer Guide

Please reference this guide for development setup.

Name		Name	Last commit message	Last commit date
Latest commit History 109 Commits
.ci_scripts		.ci_scripts
.circleci		.circleci
.configs		.configs
.git-hooks		.git-hooks
airflow/dags		airflow/dags
app_engine_std		app_engine_std
cloud_functions		cloud_functions
dashboard_metrics		dashboard_metrics
flowchart		flowchart
query_execute		query_execute
tests		tests
tools		tools
zendesk_automation		zendesk_automation
.cipher.example		.cipher.example
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
check_workbook.py		check_workbook.py
compile.py		compile.py
init_pre_commit.sh		init_pre_commit.sh
readme.md		readme.md
requirements.in		requirements.in
requirements.txt		requirements.txt
restore_workbook.py		restore_workbook.py
strip_workbook.py		strip_workbook.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

EHR Ops Dashboard

High Level Summary

Definition of Metrics and Related Queries

How to Read the Dashboard

What's to expect 9116 in new coming features?

SQL Compilation

Requirements

Example 1: Compiling a single template file

Example 2: Compiling multiple files

Workbook Version Control

Setup

Commit Workflow

Restoring Workbook

Pull Requests and Merging

Developer Guide

About

Releases

Packages

Languages

cukarthik/ehr-ops

Folders and files

Latest commit

History

Repository files navigation

EHR Ops Dashboard

High Level Summary

Definition of Metrics and Related Queries

How to Read the Dashboard

What's to expect 9116 in new coming features?

SQL Compilation

Requirements

Example 1: Compiling a single template file

Example 2: Compiling multiple files

Workbook Version Control

Setup

Commit Workflow

Restoring Workbook

Pull Requests and Merging

Developer Guide

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages