Description
What did you do?
I added an `alert_relabel_configs` section to my Prometheus configuration. Afterwards I noticed that `prometheus_notifications_dropped_total` was increasing. I did not expect this metric to increase, because the dropping is intentionally configured.
Reading through `notifier.sendAll()`, it looks like the boolean result driving this metric is based on `numSuccess > 0`. I believe there is a scenario where the call to `relabelAlerts` returns a slice of length 0 for every Alertmanager, we `continue` each time, and `numSuccess == 0` when we hit the bottom of the function. The function then returns false, which incorrectly increments the dropped-alerts counter by the number of alerts before relabeling.
The way we caught this: we want to fire an alert internally to Prometheus but not send it to any of our Alertmanager instances.
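For illustration, such an internal-only alert carries a `notify` label that the `drop` rule in the configuration below matches. A minimal sketch of such a rule (the rule name and expression are hypothetical; only the label matches our real setup):

```yaml
groups:
  - name: internal-only-example
    rules:
      - alert: NodeConditionExample   # hypothetical name
        expr: vector(1)               # hypothetical expression
        labels:
          notify: node-condition-k8s  # matched by the drop relabel rule
```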
I would expect `prometheus_notifications_dropped_total` to exclude intentionally dropped alerts so that unintentional drops can be surfaced. Right now the metric includes both intentional and unintentional drops, which makes it very difficult to tell whether there is a problem sending alerts to Alertmanager.
What did you expect to see?
`prometheus_notifications_dropped_total` staying at 0.
What did you see instead? Under which circumstances?
`prometheus_notifications_dropped_total` increasing by the number of relabel `drop` matches.
System information
No response
Prometheus version
v3.1.0 linux/amd64
Prometheus configuration file
```yaml
alertmanagers:
  - alert_relabel_configs:
      - action: drop
        regex: node-condition-k8s
        source_labels:
          - notify
```
Alertmanager version
Alertmanager configuration file
Logs