Description
- Version of collectd: 5.12.0
- Operating system / distribution: Fedora 37
- Kernel version (if applicable): 6.3.8-100.fc37.x86_64
Expected behavior
I expect that, when enabled, the smart plugin does not increase the num_err_log_entries counter of a Seagate FireCuda 530 NVMe drive.
Actual behavior
With the smart plugin enabled, the num_err_log_entries value increments by one every minute.
Steps to reproduce
I have a few disks in my server, including a Samsung SSD 980 PRO 2T. collectd is configured with the smart plugin, which worked perfectly fine with the Samsung NVMe drive.
But today I added a new Seagate FireCuda 530. After a reboot, I was curious to see whether the smart plugin would pick it up. It did, but I noticed that num_err_log_entries was increasing.
I found https://www.osso.nl/blog/kioxia-nvme-num-err-log-entries-0xc004-smartctl/ which describes a similar problem, but there smartctl was used directly, and that smartctl bug was fixed a long time ago. This pointed me toward collectd and its smart plugin, so I started testing.
Without a <Plugin "smart"> block (so auto-detection, I assume), or with the drive explicitly listed, num_err_log_entries keeps increasing. The only way to stop it from increasing is to exclude the drive from monitoring:
<Plugin "smart">
Disk "sda"
Disk "sdb"
Disk "sdc"
Disk "sdd"
Disk "sde"
# Disk "nvme0n1"
Disk "nvme1n1"
IgnoreSelected false
</Plugin>
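For anyone trying to reproduce this, a rough way to check whether the counter still moves after a configuration change is to sample it once per collectd interval. The following is only a sketch: it assumes nvme-cli is installed, that `nvme smart-log` prints a line of the form `num_err_log_entries : N` (spacing may differ between versions), and the helper names `parse_err_log_entries`, `read_counter`, and `watch` are made up for illustration.

```python
import re
import subprocess
import time

# Assumption: the smart-log text contains a line such as
# "num_err_log_entries : 52"; spacing and digit grouping may vary.
COUNTER_RE = re.compile(r"^\s*num_err_log_entries\s*:\s*([\d,]+)", re.MULTILINE)


def parse_err_log_entries(smart_log_text):
    """Return the num_err_log_entries value from smart-log text, or None."""
    match = COUNTER_RE.search(smart_log_text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def read_counter(device):
    """Run `nvme smart-log <device>` (usually needs root) and parse the counter."""
    result = subprocess.run(
        ["nvme", "smart-log", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_err_log_entries(result.stdout)


def watch(device, interval=60, samples=5):
    """Print the counter before and after each interval to show whether it grows."""
    previous = read_counter(device)
    for _ in range(samples):
        time.sleep(interval)
        current = read_counter(device)
        print(f"{device}: {previous} -> {current}")
        previous = current
```

Calling `watch("/dev/nvme0n1")` with the drive listed in the plugin configuration, and again with it commented out, should show whether the increments line up with collectd's polling.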
The error reported is:
# nvme error-log /dev/nvme0n1
Error Log Entries for device:nvme0n1 entries:63
.................
Entry[ 0]
.................
error_count : 52
sqid : 0
cmdid : 0x9010
status_field : 0x2002(Invalid Field in Command: A reserved coded value or an unsupported value in a defined field)
phase_tag : 0
parm_err_loc : 0x4
lba : 0
nsid : 0xffffffff
vs : 0
trtype : The transport type is not indicated or the error is not transport related.
cs : 0
trtype_spec_info: 0
.................
Entry[ 1]
.................
error_count : 51
sqid : 0
cmdid : 0xa014
status_field : 0x2002(Invalid Field in Command: A reserved coded value or an unsupported value in a defined field)
phase_tag : 0
parm_err_loc : 0x4
lba : 0
nsid : 0xffffffff
vs : 0
trtype : The transport type is not indicated or the error is not transport related.
cs : 0
trtype_spec_info: 0
.................
Entry[ 2]