checking wal_level before quering for control checkpoint metrics #20490

aldrickdev · 2025-06-11T22:31:34Z

What does this PR do?

Makes sure that the wal_level is logical before trying to query for the control checkpoint metrics introduced in this PR.

Motivation

The motivation was to address the issues brought up in this case.

A user was getting the error "Error querying pg_control_checkpoint: wal_level must be set to 'logical'" when the Agent tried to collect the control checkpoint metrics.
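
For context, wal_level is a PostgreSQL server setting that can be read with SHOW wal_level. Below is a minimal sketch of reading it through a DB-API style cursor. The helper name fetch_wal_level is hypothetical, and this is not necessarily how the check populates self.wal_level.

```python
def fetch_wal_level(cursor):
    """Return the server's wal_level setting (for example 'replica' or 'logical').

    `cursor` is assumed to be any DB-API 2.0 style cursor connected to PostgreSQL.
    Returns None if the server returns no row.
    """
    cursor.execute("SHOW wal_level;")
    row = cursor.fetchone()
    return row[0] if row else None
```

The returned string could then be compared against 'logical' before the control checkpoint queries are scheduled.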

Review checklist (to be filled by reviewers)

Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

jasonmp85

I like the first branch of an if to be the "normal" case and the else to be the exception, so if we could do that, it would be good…

Other than that LGTM

postgres/datadog_checks/postgres/postgres.py

Review from jasonmp85 is dismissed. Related teams and files:

database-monitoring-agent
- postgres/datadog_checks/postgres/postgres.py

codecov · 2025-06-12T00:48:09Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.34%. Comparing base (d2dd062) to head (f57ec0b).
Report is 4 commits behind head on master.

Additional details and impacted files

Flag	Coverage Δ
activemq	`?`
cassandra	`?`
confluent_platform	`?`
hive	`?`
hivemq	`?`
hudi	`?`
ignite	`?`
jboss_wildfly	`?`
kafka	`?`
postgres	`93.12% <100.00%> (+3.51%)`	⬆️
presto	`?`
solr	`?`
tomcat	`?`
weblogic	`?`

bonnefoa · 2025-06-12T07:49:42Z

postgres/datadog_checks/postgres/postgres.py

+        if self.wal_level == 'logical':
+            self.log.debug("wal_level is logical, adding control checkpoint metrics")
+
+            if self.version >= V10:
+                queries.append(QUERY_PG_CONTROL_CHECKPOINT)
+
+            else:
+                queries.append(QUERY_PG_CONTROL_CHECKPOINT_LT_10)


As mentioned, this is an Aurora-specific issue. Calling pg_current_wal_lsn() on an Aurora instance fails with:

SELECT pg_current_wal_lsn();
ERROR: wal_level must be set to 'logical'
HINT: WAL control functions cannot be executed when wal_level < logical.

Normal PG can run this query regardless of the WAL level, so the check should also include self.is_aurora is False.
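
One way to read this suggestion is sketched below: skip the control checkpoint queries only when the instance is Aurora and wal_level is not logical, and otherwise pick the query by server version as in the diff above. The function name, its parameters, and the string placeholders for the two query constants are hypothetical, and this is not necessarily the logic that was eventually merged.

```python
# Placeholders standing in for the query definitions referenced in the diff.
QUERY_PG_CONTROL_CHECKPOINT = "QUERY_PG_CONTROL_CHECKPOINT"
QUERY_PG_CONTROL_CHECKPOINT_LT_10 = "QUERY_PG_CONTROL_CHECKPOINT_LT_10"


def select_checkpoint_queries(major_version, wal_level, is_aurora):
    """Return the control checkpoint queries that are safe to run.

    Assumption: only Aurora rejects the WAL control functions when
    wal_level is below 'logical'; regular PostgreSQL runs them at any level.
    """
    if is_aurora and wal_level != "logical":
        # Exceptional case: the query would fail, so collect nothing.
        return []
    if major_version >= 10:
        return [QUERY_PG_CONTROL_CHECKPOINT]
    return [QUERY_PG_CONTROL_CHECKPOINT_LT_10]
```

With this shape, a non-Aurora server at wal_level replica still gets the metrics, while an Aurora instance at the same level is skipped instead of raising the error above.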

…point metrics (#20500) * Remove extra setting of internal resource tag (#20485) * Initial LiteLLM integration (#20480) * litellm poc * fixtures * litellm implementation * Delete ibm_db2/simple_ibm_db2_check.py * e2e test skip * Rename 1.added to 20480.added * Update manifest.json * Update manifest.json * license * sort metadata * license header * fix manifest * remove unusable metric * labeler * Delete celery/.junit/test-e2e-py3.12.xml * Delete silk/.junit/test-unit-py3.12.xml * Delete strimzi/.junit/test-e2e-py3.12-0.34.xml * Update common.py * Update test_unit.py * Add Agent Integrations to Falco codeowners (#20486) * Falco Integration (#20449) * Falco Integration * E2E * undo manifest changes * add docker-compose.yaml * Lint and fix docker-compose * Falco service check and lint * Add metadata.csv * fix metadata.csv * Changelog fix and add towncrier header * Add version to Agent metadata * Remove falco.version from manifest * Added wal_level check * Added test * updated changelog * Formatting * Updated some logic * checking wal_level before quering for control checkpoint metrics (#20490) * Added wal_level check * Added test * updated changelog * Formatting * Updated some logic * Release new integrations for 7.68.x (#20491) (#20492) * Release new integrations for 7.68.x (#20491) * [Release] Bumped eset_protect version to 1.0.0 * [Release] Bumped kuma version to 1.0.0 * [Release] Bumped litellm version to 1.0.0 * [Release] Bumped microsoft_dns version to 1.0.0 * [Release] Bumped watchguard_firebox version to 1.0.0 * [Release] Update metadata * Remove the in-toto new file to backport to master * Add supported OS classifiers to Falco (#20495) * Updated check and test to account for aurora environments * Added changelog and formatted code * [SQLServer] - Add AO failover monitor template (#20488) * [SQLServer] - Add AO failover monitor template * manifest * Formatting --------- Co-authored-by: Eric Weaver <eweaver755@gmail.com> Co-authored-by: Steven Yuen 
<steven.yuen@datadoghq.com> Co-authored-by: Kyle Neale <kyle.neale@datadoghq.com> Co-authored-by: Juanpe Araque <juanpedro.araque@datadoghq.com> Co-authored-by: Ilia Kurenkov <ilia.kurenkov@datadoghq.com> Co-authored-by: Joel Marcotte <91903666+joelmarcotte@users.noreply.github.com> (cherry picked from commit 08389f8)

aldrickdev added 3 commits June 11, 2025 16:15

Added wal_level check

48c5299

Added test

dd8507f

updated changelog

f84c4d7

aldrickdev requested review from a team as code owners June 11, 2025 22:31

aldrickdev added the qa/skip-qa Automatically skip this PR for the next QA label Jun 11, 2025

temporal-github-worker-1 bot added agent/review-requested ecosystems/review-requested product/review-requested labels Jun 11, 2025

datadog-agent-integrations-bot bot added integration/postgres team/agent-integrations team/database-monitoring-agent labels Jun 11, 2025

aldrickdev changed the title ~~Sdbm 1804 postgres error querying pg control checkpoint wal level must be set to logical~~ checking wal_level before quering for control checkpoint metrics Jun 11, 2025

Formatting

89598ca

jasonmp85 previously approved these changes Jun 11, 2025

postgres/datadog_checks/postgres/postgres.py Outdated

Updated some logic

f57ec0b

jasonmp85 approved these changes Jun 11, 2025

aldrickdev enabled auto-merge June 12, 2025 00:46

aldrickdev added this pull request to the merge queue Jun 12, 2025

Merged via the queue into master with commit 5cc1053 Jun 12, 2025
23 checks passed

aldrickdev deleted the SDBM-1804-postgres-Error-querying-pg_control_checkpoint-wal_level-must-be-set-to-logical branch June 12, 2025 00:48

bonnefoa reviewed Jun 12, 2025

aldrickdev restored the SDBM-1804-postgres-Error-querying-pg_control_checkpoint-wal_level-must-be-set-to-logical branch June 12, 2025 12:44

aldrickdev mentioned this pull request Jun 12, 2025

checking wal_level and if is aurora before querying for control checkpoint metrics #20500

Merged

3 tasks

aldrickdev added a commit that referenced this pull request Jun 12, 2025

checking wal_level before quering for control checkpoint metrics (#20490)

a91cf69

* Added wal_level check
* Added test
* updated changelog
* Formatting
* Updated some logic

datadog-agent-integrations-bot bot mentioned this pull request Jun 25, 2025

[Backport 7.68.x] checking wal_level and if is aurora before querying for control checkpoint metrics #20590

Merged

3 tasks

ngraef mentioned this pull request Jun 26, 2025

[postgres] errors for unsupported features in Aurora #20356

Open
