From 61051ae00d836895ca1530d182116d937b614bb8 Mon Sep 17 00:00:00 2001
From: Claus Herther
Date: Fri, 14 Apr 2023 18:17:08 -0700
Subject: [PATCH 01/10] Create pull_request_template.md

---
 .github/pull_request_template.md | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
 create mode 100644 .github/pull_request_template.md

diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
new file mode 100644
index 0000000..676e848
--- /dev/null
+++ b/.github/pull_request_template.md
@@ -0,0 +1,17 @@
+## Issue this PR Addresses/Closes
+
+Closes #(Issue Number)
+
+If you don't have an issue #, please first open an issue on the repo before submitting a PR to discuss the changes you'd like to make.
+
+## Summary of Changes
+
+(Succint summary of the changes introduced by this PR)
+
+## Why Do We Need These Changes
+
+(Short description why this PR is necessary)
+
+
+## Reviewers
+@clausherther
From 4ae828cf06d14c15812c82dc858f3b079ec1464c Mon Sep 17 00:00:00 2001
From: Claus Herther
Date: Tue, 2 May 2023 18:07:55 -0700
Subject: [PATCH 02/10] Add Datacoves sponsor (#256)

* Add Datacoves sponsor
* Update image links
* Add image link
* Add sponsor table
* Update table
---
 README.md | 55 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 48 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index e25fe54..948e3e0 100644
--- a/README.md
+++ b/README.md
@@ -21,17 +21,54 @@
 Development of `dbt-expectations` (and `dbt-date`) is funded by our amazing [sponsors](https://github.com/sponsors/calogica), including our **featured** sponsors:
-### aggua (www.aggua.io)
+
+
+
+
+
+
-
+
-### Elementary (www.elementary-data.com)
+
+
-
+
-### re_data (www.getre.io)
+
-
+
+

www.aggua.io

+

+ + + +

+
+

datacoves.com

+

+ + + + Datacoves + + +

+
+

www.elementary-data.com

+

+ + + +

+
+

www.getre.io

+

+ + + +

+
 ## Install
@@ -138,7 +175,7 @@ For example, use `America/New_York` for East Coast Time.
 
 ### Multi-column
 
-- [expect_column_pair_values_A_to_be_greater_than_B](#expect_column_pair_values_A_to_be_greater_than_B)
+- [expect_column_pair_values_A_to_be_greater_than_B](#expect_column_pair_values_a_to_be_greater_than_b)
 - [expect_column_pair_values_to_be_equal](#expect_column_pair_values_to_be_equal)
 - [expect_column_pair_values_to_be_in_set](#expect_column_pair_values_to_be_in_set)
 - [expect_compound_columns_to_be_unique](#expect_compound_columns_to_be_unique)
@@ -602,6 +639,7 @@ tests:
 Expect column entries to be strings that match a given regular expression. Valid matches can be found anywhere in the string, for example "[at]+" will identify the following strings as expected: "cat", "hat", "aa", "a", and "t", and the following strings as unexpected: "fish", "dog".
 
 Optional (keyword) arguments:
+
 - `is_raw` indicates the `regex` pattern is a "raw" string and should be escaped. The default is `False`.
 - `flags` is a string of one or more characters that are passed to the regex engine as flags (or parameters). Allowed flags are adapter-specific. A common flag is `i`, for case-insensitive matching. The default is no flags.
 
@@ -621,6 +659,7 @@ tests:
 Expect column entries to be strings that do NOT match a given regular expression. The regex must not match any portion of the provided string. For example, "[at]+" would identify the following strings as expected: "fish”, "dog”, and the following as unexpected: "cat”, "hat”.
 
 Optional (keyword) arguments:
+
 - `is_raw` indicates the `regex` pattern is a "raw" string and should be escaped. The default is `False`.
 - `flags` is a string of one or more characters that are passed to the regex engine as flags (or parameters). Allowed flags are adapter-specific. A common flag is `i`, for case-insensitive matching. The default is no flags.
 
@@ -640,6 +679,7 @@ tests:
 Expect the column entries to be strings that can be matched to either any of or all of a list of regular expressions. Matches can be anywhere in the string.
 
 Optional (keyword) arguments:
+
 - `is_raw` indicates the `regex` pattern is a "raw" string and should be escaped. The default is `False`.
 - `flags` is a string of one or more characters that are passed to the regex engine as flags (or parameters). Allowed flags are adapter-specific. A common flag is `i`, for case-insensitive matching. The default is no flags.
 
@@ -660,6 +700,7 @@ tests:
 Expect the column entries to be strings that do not match any of a list of regular expressions. Matches can be anywhere in the string.
 
 Optional (keyword) arguments:
+
 - `is_raw` indicates the `regex` pattern is a "raw" string and should be escaped. The default is `False`.
 - `flags` is a string of one or more characters that are passed to the regex engine as flags (or parameters). Allowed flags are adapter-specific. A common flag is `i`, for case-insensitive matching. The default is no flags.
 
From b28ee9bded806543ab2db3d41742f01978eacc09 Mon Sep 17 00:00:00 2001
From: Claus Herther
Date: Tue, 2 May 2023 18:50:11 -0700
Subject: [PATCH 03/10] Add dbt version param to CI

---
 .circleci/config.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.circleci/config.yml b/.circleci/config.yml
index b909cc5..ce72ae9 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -13,6 +13,7 @@ jobs:
       DBT_PROFILES_DIR: ./integration_tests/ci
       DBT_PROJECT_DIR: ./integration_tests
       BIGQUERY_SERVICE_KEY_PATH: "/home/circleci/bigquery-service-key.json"
+      DBT_VERSION: 1.4.1
 
     steps:
       - checkout
@@ -22,8 +23,7 @@ jobs:
             python3 -m venv venv
             . venv/bin/activate
             pip install -U pip setuptools wheel
-            pip install dbt-core dbt-postgres dbt-bigquery dbt-snowflake
-            pip install sqlfluff sqlfluff-templater-dbt
+            pip install dbt-core==$DBT_VERSION dbt-postgres==$DBT_VERSION dbt-bigquery==$DBT_VERSION dbt-snowflake==$DBT_VERSION
 
       - run:
           name: Install dbt dependencies
From 6700bbc09aed5613b15ac98524b2c618b4b153ba Mon Sep 17 00:00:00 2001
From: Vince Faller
Date: Wed, 24 May 2023 12:51:38 -0700
Subject: [PATCH 04/10] #257 Add count to expect_compound_columns_to_be_unique (#261)

* #257 Add count to expect_compound_columns_to_be_unique
* move comma to end of line
---
 .../multi-column/expect_compound_columns_to_be_unique.sql | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/macros/schema_tests/multi-column/expect_compound_columns_to_be_unique.sql b/macros/schema_tests/multi-column/expect_compound_columns_to_be_unique.sql
index 9943fef..16c2ec1 100644
--- a/macros/schema_tests/multi-column/expect_compound_columns_to_be_unique.sql
+++ b/macros/schema_tests/multi-column/expect_compound_columns_to_be_unique.sql
@@ -37,8 +37,9 @@ with validation_errors as (
 
     select
         {% for column in columns -%}
-        {{ column }}{% if not loop.last %},{% endif %}
+        {{ column }},
         {%- endfor %}
+        count(*) as {{adapter.quote("n_records")}}
     from {{ model }}
     where
         1=1
From 62d292ef896575ae2e782dc371182361b1356838 Mon Sep 17 00:00:00 2001
From: Claus Herther
Date: Wed, 14 Jun 2023 11:52:52 -0700
Subject: [PATCH 05/10] Update README.md

---
 README.md | 21 ---------------------
 1 file changed, 21 deletions(-)

diff --git a/README.md b/README.md
index 948e3e0..c1faaa6 100644
--- a/README.md
+++ b/README.md
@@ -47,27 +47,6 @@ Development of `dbt-expectations` (and `dbt-date`) is funded by our amazing [spo
-
-
-

www.elementary-data.com

-

- - - -

- - - -

www.getre.io

-

- - - -

-
-
-
-
 ## Install
From 8331e5f4a582f7fdf7891bbf1f3ab357c82b66cb Mon Sep 17 00:00:00 2001
From: Maxim Kupfer
Date: Wed, 14 Jun 2023 16:11:19 -0700
Subject: [PATCH 06/10] Update README.md (#268)

Clarify that percentage tolerance should be expressed as a decimal value as it could be misinterpreted to be a nominal value and cause silent errors.
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c1faaa6..09dfc8a 100644
--- a/README.md
+++ b/README.md
@@ -262,7 +262,7 @@ tests:
       compare_row_condition: some_flag=false
 ```
 
-**Note**: You can also express a **tolerance** factor, either as an absolute tolerable difference, `tolerance`, or as a tolerable % difference `tolerance_percent`.
+**Note**: You can also express a **tolerance** factor, either as an absolute tolerable difference, `tolerance`, or as a tolerable % difference `tolerance_percent` expressed as a decimal (i.e 0.05 for 5%).
 
 ### [expect_table_column_count_to_be_between](macros/schema_tests/table_shape/expect_table_column_count_to_be_between.sql)
 
From 0a53b469302e708f40697d4601735cd6bc43f8f0 Mon Sep 17 00:00:00 2001
From: Claus Herther
Date: Mon, 17 Jul 2023 11:48:44 -0700
Subject: [PATCH 07/10] Update README.md

---
 README.md | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/README.md b/README.md
index 09dfc8a..11df108 100644
--- a/README.md
+++ b/README.md
@@ -24,15 +24,6 @@ Development of `dbt-expectations` (and `dbt-date`) is funded by our amazing [spo
-
-
-

www.aggua.io

-

- - - -

-

datacoves.com

From 276777f50877e98a448e842fbf90bd71bdd978f4 Mon Sep 17 00:00:00 2001
From: Claus Herther
Date: Sun, 6 Aug 2023 15:47:14 -0700
Subject: [PATCH 08/10] Add duckdb support (#271)

* Upgrade to dbt 0.8.0
* Add CI support for duckdb
* Add duckdb support to utils
* Update integration test
* Update dbt version for CI
* Update time format
* Fix integration test
---
 .circleci/config.yml | 10 ++++++++--
 integration_tests/ci/profiles.yml | 4 ++++
 .../models/schema_tests/timeseries_hourly.sql | 10 +++++++---
 macros/math/rand.sql | 6 ++++++
 macros/regex/regexp_instr.sql | 10 ++++++++--
 macros/utils/datatypes.sql | 4 ++++
 packages.yml | 2 +-
 7 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/.circleci/config.yml b/.circleci/config.yml
index ce72ae9..a2b6ee9 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -13,7 +13,7 @@ jobs:
       DBT_PROFILES_DIR: ./integration_tests/ci
       DBT_PROJECT_DIR: ./integration_tests
       BIGQUERY_SERVICE_KEY_PATH: "/home/circleci/bigquery-service-key.json"
-      DBT_VERSION: 1.4.1
+      DBT_VERSION: 1.6.0
 
     steps:
       - checkout
@@ -23,7 +23,7 @@ jobs:
             python3 -m venv venv
             . venv/bin/activate
             pip install -U pip setuptools wheel
-            pip install dbt-core==$DBT_VERSION dbt-postgres==$DBT_VERSION dbt-bigquery==$DBT_VERSION dbt-snowflake==$DBT_VERSION
+            pip install dbt-core==$DBT_VERSION dbt-postgres==$DBT_VERSION dbt-bigquery==$DBT_VERSION dbt-snowflake==$DBT_VERSION dbt-duckdb==$DBT_VERSION
 
       - run:
           name: Install dbt dependencies
@@ -76,6 +76,12 @@ jobs:
             . venv/bin/activate
             dbt build -t snowflake --project-dir $DBT_PROJECT_DIR
 
+      - run:
+          name: "Run Tests - DuckDB"
+          command: |
+            . venv/bin/activate
+            dbt build -t duckdb --project-dir $DBT_PROJECT_DIR
+
       - store_artifacts:
           path: ./logs
 
diff --git a/integration_tests/ci/profiles.yml b/integration_tests/ci/profiles.yml
index 3b7a9be..d8238b7 100644
--- a/integration_tests/ci/profiles.yml
+++ b/integration_tests/ci/profiles.yml
@@ -33,4 +33,8 @@ integration_tests:
       schema: "{{ env_var('SNOWFLAKE_TEST_SCHEMA') }}"
       threads: 10
 
+    duckdb:
+      type: duckdb
+      path: ":memory:"
+
   target: postgres
diff --git a/integration_tests/models/schema_tests/timeseries_hourly.sql b/integration_tests/models/schema_tests/timeseries_hourly.sql
index 6f9ec76..44c5001 100644
--- a/integration_tests/models/schema_tests/timeseries_hourly.sql
+++ b/integration_tests/models/schema_tests/timeseries_hourly.sql
@@ -1,5 +1,9 @@
-{{ dbt_date.date_spine('hour',
-    start_date=dbt_date.n_days_ago(10),
-    end_date=dbt_date.tomorrow()
+{% set end_date = modules.datetime.datetime.today().replace(hour=0, minute=0, second=0, microsecond=0) %}
+{% set start_date = (end_date - modules.datetime.timedelta(days=10)) %}
+
+{{ dbt_date.get_base_dates(
+    start_date=start_date,
+    end_date=end_date,
+    datepart="hour"
     )
 }}
diff --git a/macros/math/rand.sql b/macros/math/rand.sql
index ff041e8..f36c889 100644
--- a/macros/math/rand.sql
+++ b/macros/math/rand.sql
@@ -31,3 +31,9 @@
     random()
 
 {%- endmacro -%}
+
+{% macro duckdb__rand() -%}
+
+    random()
+
+{%- endmacro -%}
diff --git a/macros/regex/regexp_instr.sql b/macros/regex/regexp_instr.sql
index efa39d5..fa48a0e 100644
--- a/macros/regex/regexp_instr.sql
+++ b/macros/regex/regexp_instr.sql
@@ -47,6 +47,12 @@ coalesce(array_length((select regexp_matches({{ source_value }}, '{{ regexp }}',
 regexp_instr({{ source_value }}, '{{ regexp }}', {{ position }}, {{ occurrence }}, 0, '{{ flags }}')
 {% endmacro %}
 
+{% macro duckdb__regexp_instr(source_value, regexp, position, occurrence, is_raw, flags) %}
+{% if flags %}{{ dbt_expectations._validate_flags(flags, 'ciep') }}{% endif %}
+regexp_matches({{ source_value }}, '{{ regexp }}', '{{ flags }}')
+{% endmacro %}
+
+
 {% macro _validate_flags(flags, alphabet) %}
 {% for flag in flags %}
     {% if flag not in alphabet %}
@@ -74,7 +80,7 @@ regexp_instr({{ source_value }}, '{{ regexp }}', {{ position }}, {{ occurrence }
 {% if not is_match %}
     {# Using raise_compiler_error causes disabled tests with invalid flags to fail compilation #}
     {{ exceptions.warn(
-        "flags " ~ flags ~ " isn't a valid re2 flag pattern"
+        "flags " ~ flags ~ " isn't a valid re2 flag pattern"
     ) }}
 {% endif %}
-{% endmacro %}
\ No newline at end of file
+{% endmacro %}
diff --git a/macros/utils/datatypes.sql b/macros/utils/datatypes.sql
index 93fe83b..fceb492 100644
--- a/macros/utils/datatypes.sql
+++ b/macros/utils/datatypes.sql
@@ -33,3 +33,7 @@
 {% macro postgres__type_datetime() -%}
     timestamp without time zone
 {%- endmacro %}
+
+{% macro duckdb__type_datetime() -%}
+    timestamp
+{%- endmacro %}
diff --git a/packages.yml b/packages.yml
index 9f0b6d0..c0567b1 100644
--- a/packages.yml
+++ b/packages.yml
@@ -1,3 +1,3 @@
 packages:
   - package: calogica/dbt_date
-    version: [">=0.7.0", "<0.8.0"]
+    version: [">=0.8.0", "<0.9.0"]
From b17306ef936d182035907ef4de245cf03b8467c2 Mon Sep 17 00:00:00 2001
From: Claus Herther
Date: Sun, 6 Aug 2023 15:48:38 -0700
Subject: [PATCH 09/10] Update README.md

---
 README.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/README.md b/README.md
index 11df108..16f0621 100644
--- a/README.md
+++ b/README.md
@@ -55,6 +55,13 @@ packages:
     # for the latest version tag
 ```
 
+This package supports:
+
+* Postgres
+* Snowflake
+* BigQuery
+* DuckDB
+
 For latest release, see [https://github.com/calogica/dbt-expectations/releases](https://github.com/calogica/dbt-expectations/releases)
 
 ### Dependencies
From a41f5d5012140486173eb6efbf73b0bf1172e78a Mon Sep 17 00:00:00 2001
From: Claus Herther
Date: Sun, 6 Aug 2023 15:50:29 -0700
Subject: [PATCH 10/10] Update CHANGELOG.md

---
 CHANGELOG.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 76f2c3a..7bc2dd5 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,12 @@
+# dbt-expectations v0.9.0
+
+## New Features
+* Add count to expect_compound_columns_to_be_unique by @VDFaller in https://github.com/calogica/dbt-expectations/pull/261
+* Add duckdb support by @clausherther in https://github.com/calogica/dbt-expectations/pull/271
+
+## Docs
+* Update README.md by @mbkupfer in https://github.com/calogica/dbt-expectations/pull/268
+
 # dbt-expectations v0.8.5
 
 ## New Features