collectd 6
collectd 6 is the next major version of collectd. It contains significant and backwards incompatible changes to the data model.
Metric labels: collectd 6 identifies metric instances using "labels", i.e. an arbitrary number of key-value pairs. collectd 5 provided two "instance" fields, plugin instance and type instance, which worked well enough in many situations. Plugin authors frequently ran into limitations with this model, though, for example when more than two dimensions were required or when the dimensions did not align with the "plugin" and "type" framework.
The new data model allows us to provide more labels and to name the labels. The df plugin is a good example: in collectd 5 users needed to decide whether metrics should be identified by device or by mount point. The new data model allows us to report both. Additionally, the plugin now reports the filesystem type and mode (read-only vs. read/write).
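To make the label-based identity concrete, here is a minimal Python sketch that renders a metric name plus a label set in the `name{key="value",...}` notation used by the tables in this document. The helper `format_metric` and the sample label values are illustrative assumptions, not part of collectd's API.

```python
def format_metric(name, labels):
    """Render a metric identity as name{key="value",...} with sorted keys."""
    if not labels:
        return name
    body = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{name}{{{body}}}"


# Hypothetical df-style metric: device and mount point are reported together,
# alongside the filesystem type and mode.
df_labels = {
    "system.device": "/dev/sda1",
    "system.filesystem.mountpoint": "/",
    "system.filesystem.type": "ext4",
    "system.filesystem.mode": "rw",
    "system.filesystem.state": "used",
}

identity = format_metric("system.filesystem.usage", df_labels)
print(identity)
```

Sorting the keys makes two label sets with the same content render identically, which is convenient when the rendered string is used as a lookup key.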
Resource attributes: collectd 6 also uses "labels" to identify the resource being monitored. This replaces the "host" field in collectd 5's "value list". By using a more flexible schema, we can now better identify resources that don't really have a hostname, e.g. ModBus temperature sensors or "serverless" cloud services. This approach also allows users to provide additional information about the monitored resource, such as a version number or build timestamp.
Relative CPU utilization, activated with the `ReportUtilization` option, is now reported as a fraction, i.e. a number between zero and one. This fraction is global, over all CPUs. In other words, the states of all CPUs of a system sum up to one. With collectd 5, relative utilization was reported as a percentage, i.e. a number between 0 and 100, and the states of each CPU individually summed up to 100.
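The difference between the two scales can be illustrated with a small conversion sketch. It assumes that every CPU contributes equally to the system total over the same interval; collectd itself derives the values from raw counters, so the function `v5_percent_to_v6_fraction` and the sample numbers are purely hypothetical.

```python
def v5_percent_to_v6_fraction(percent, num_cpus):
    """Convert a per-CPU percentage (0..100) into a share of the
    whole system's CPU capacity (0..1)."""
    if num_cpus <= 0:
        raise ValueError("num_cpus must be positive")
    return percent / 100.0 / num_cpus


# Hypothetical two-CPU system with v5-style percentages; each CPU sums to 100.
v5 = {
    0: {"user": 25.0, "system": 25.0, "idle": 50.0},
    1: {"user": 50.0, "system": 0.0, "idle": 50.0},
}

v6 = {
    (cpu, state): v5_percent_to_v6_fraction(percent, len(v5))
    for cpu, states in v5.items()
    for state, percent in states.items()
}

# Under the v6 convention all states of all CPUs together sum to one.
total = sum(v6.values())
```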
Absolute CPU usage, activated with the `ReportUsage` option, is reported as the number of microseconds the CPU spent in each state. This differs from the OpenTelemetry Semantic Convention, which asks for this metric to be reported in seconds. However, collectd doesn't support floating point counter metrics yet.
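Consumers that expect seconds can convert on their side. The sketch below shows the unit conversion and a rate computed from two counter readings; both helper names are assumptions made for illustration.

```python
def cpu_time_seconds(microseconds):
    """Convert an integer microsecond counter value to (fractional) seconds."""
    return microseconds / 1_000_000


def cpu_seconds_per_second(prev_us, curr_us, interval_s):
    """Rate of CPU time between two counter readings taken interval_s apart."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return cpu_time_seconds(curr_us - prev_us) / interval_s


# Hypothetical readings taken ten seconds apart.
seconds = cpu_time_seconds(2_500_000)
rate = cpu_seconds_per_second(1_000_000, 6_000_000, 10.0)
```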
v5 metric | v6 metric |
---|---|
cpu-${cpu_num}/cpu-${state} | system.cpu.time{system.cpu.logical_number="${cpu_num}",system.cpu.state="${state}"} |
cpu-${cpu_num}/percent-${state} | system.cpu.utilization{system.cpu.logical_number="${cpu_num}",system.cpu.state="${state}"} |
cpu/count | system.cpu.logical.count |
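The mapping in the table above can be sketched as a small translation function, for example when migrating dashboards or alert rules. It assumes v5 identifiers are given without the host prefix, exactly as written in the table; `translate_cpu_identifier` is a hypothetical helper, not a collectd function.

```python
import re

# Matches "cpu-<num>/cpu-<state>" and "cpu-<num>/percent-<state>".
_V5_CPU = re.compile(r"^cpu-(?P<cpu>\d+)/(?P<type>cpu|percent)-(?P<state>\w+)$")

_V6_NAME = {
    "cpu": "system.cpu.time",
    "percent": "system.cpu.utilization",
}


def translate_cpu_identifier(v5_id):
    """Return (v6 metric name, labels) for a v5 CPU metric identifier."""
    if v5_id == "cpu/count":
        return "system.cpu.logical.count", {}
    match = _V5_CPU.match(v5_id)
    if match is None:
        raise ValueError(f"not a recognized v5 cpu identifier: {v5_id!r}")
    labels = {
        "system.cpu.logical_number": match["cpu"],
        "system.cpu.state": match["state"],
    }
    return _V6_NAME[match["type"]], labels


name, labels = translate_cpu_identifier("cpu-0/cpu-idle")
```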
v5 metric | v6 metric |
---|---|
df-${mount_point_or_device}/df_complex-${state} | system.filesystem.usage |
df-${mount_point_or_device}/percent_bytes-${state} | system.filesystem.utilization |
df-${mount_point_or_device}/df_inodes-${state} | system.filesystem.inodes.usage |
df-${mount_point_or_device}/percent_inodes-${state} | system.filesystem.inodes.utilization |
Labels:
Label name | Description |
---|---|
system.device | Name of the block device, e.g. "/dev/sda1". |
system.filesystem.mountpoint | Mount point (a path). |
system.filesystem.state | One of "used", "free", or "reserved". |
system.filesystem.mode | One of "ro" or "rw". |
system.filesystem.type | The file system type, e.g. "ext4". |
v5 metric | v6 metric |
---|---|
disk-${device}/disk_octets | system.disk.io{system.device="${device}",disk.io.direction="read|write"} |
disk-${device}/disk_ops | system.disk.operations{system.device="${device}",disk.io.direction="read|write"} |
disk-${device}/disk_time | system.disk.operation_time{system.device="${device}",disk.io.direction="read|write"} |
disk-${device}/disk_merged | system.disk.merged{system.device="${device}",disk.io.direction="read|write"} |
disk-${device}/disk_io_time | system.disk.io_time{system.device="${device}"} and system.disk.weighted_io_time{system.device="${device}"} |
disk-${device}/pending_operations | system.disk.pending_operations{system.device="${device}"} |
disk-${device}/percent-utilization | system.disk.utilization{system.device="${device}"} |
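In collectd 5, types such as `disk_octets` carry a read value and a write value in one value list, whereas the table above shows the v6 metrics distinguishing the two with the `disk.io.direction` label. The following sketch splits one v5-style pair into two labelled v6-style samples; the tuple layout and the helper `split_disk_value_list` are assumptions for illustration only.

```python
# v5 type name -> v6 metric name, taken from the table above.
V6_DISK_NAMES = {
    "disk_octets": "system.disk.io",
    "disk_ops": "system.disk.operations",
    "disk_time": "system.disk.operation_time",
    "disk_merged": "system.disk.merged",
}


def split_disk_value_list(device, v5_type, read, write):
    """Turn one v5 (read, write) pair into two (name, labels, value) samples."""
    name = V6_DISK_NAMES[v5_type]
    return [
        (name, {"system.device": device, "disk.io.direction": "read"}, read),
        (name, {"system.device": device, "disk.io.direction": "write"}, write),
    ]


# Hypothetical counter values for device "sda".
samples = split_disk_value_list("sda", "disk_octets", read=1024, write=4096)
```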
v5 metric | v6 metric |
---|---|
interface-${device}/if_octets | system.network.io{system.device="${device}",network.io.direction="receive|transmit"} |
interface-${device}/if_packets | system.network.packets{system.device="${device}",network.io.direction="receive|transmit"} |
interface-${device}/if_errors | system.network.errors{system.device="${device}",network.io.direction="receive|transmit"} |
interface-${device}/if_dropped | system.network.dropped{system.device="${device}",network.io.direction="receive|transmit"} |
- "utilization" (relative metric) is reported as a fraction of one, instead of a percentage.
- Config options have been renamed:
  - `ValuesAbsolute` → `ReportUsage`
  - `ValuesPercentage` → `ReportUtilization`
- The `slab` and `available` states have been removed from the Linux implementation, since they overlapped with other states, causing some memory to be counted twice.
- The `shared` state has been added to the Linux implementation. This is in line with the OpenTelemetry Semantic Convention.
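The option renames listed above can be applied to an existing configuration mechanically. The sketch below rewrites the two option names at the start of a line and leaves everything else untouched. It assumes the caller passes only the text of the affected plugin block (a memory plugin block is used here because the state names above suggest it), and `rename_options` is a hypothetical helper.

```python
import re

RENAMED = {
    "ValuesAbsolute": "ReportUsage",
    "ValuesPercentage": "ReportUtilization",
}

# Option name at the start of a line, preceded only by indentation.
_OPTION = re.compile(r"^([ \t]*)(ValuesAbsolute|ValuesPercentage)\b", re.MULTILINE)


def rename_options(config_text):
    """Replace the old option names with the new ones, keeping indentation."""
    return _OPTION.sub(lambda m: m.group(1) + RENAMED[m.group(2)], config_text)


# Hypothetical v5-style block.
old = "<Plugin memory>\n  ValuesAbsolute true\n  ValuesPercentage false\n</Plugin>\n"
new = rename_options(old)
print(new)
```

Only the option names change; whether the values still mean what the user intends should be checked by hand, since utilization is now a fraction rather than a percentage.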
v5 metric | v6 metric |
---|---|
memory/memory-${state} | system.memory.usage{system.memory.state="${state}"} |
memory/percent-${state} | system.memory.utilization{system.memory.state="${state}"} |