8000 [BUG] Persistent data storage in Python controller bindings is not power loss safe. · Issue #36654 · project-chip/connectedhomeip · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
[BUG] Persistent data storage in Python controller bindings is not power loss safe. #36654
Open
@kepstin

Description

@kepstin

Reproduction steps

A Home Assistant user reported an issue where following a power loss, all of their connected Matter devices had disappeared.

From the user's provided logs: matter-logs.txt
It looks like the json file was corrupted (with partially written data), and upon reading the corrupted file the Matter SDK decided to reset and start from a blank storage.

The code responsible for writing the file is here:

def Commit(self):
''' Commits the cached JSON configuration to file (if one was provided in the constructor).
Otherwise, this is a no-op.
'''
if (self._path is None):
return
if (self._file is None):
try:
self._file = open(self._path, 'w')
except Exception as ex:
LOGGER.error(
f"Could not open {self._path} for writing configuration. Error: {ex}")
return
self._file.seek(0)
json.dump(self._jsonData, self._file, ensure_ascii=True, indent=4)
self._file.truncate()
self._file.flush()

The logic here does an in-place overwrite of the file. The problem this causes is that if power is cut at any point after the write is started, but before the operating system finishing flushing writes to disk, then the user may be left with a corrupt partially written file which contains a mix of data - parts of the new file and parts of the old file.

I'd recommend the following logic instead:

  • (optional) Make a backup of of the storage file by copying it to a different filename, e.g. with a ".backup" suffix, leaving the existing file in place. (flush and fsync the newly written file to ensure the backup is updated before the original file gets replaced.)
  • Write the new data to a temporary file in the same directory (e.g. using tempfile.mkstemp or tempfile.NamedTemporaryFile with delete=False).
  • After writing the data, flush and fsync to ensure the file contents are persisted on the disk. This is needed to ensure ordering - make sure that the file contents will be fully written before the file gets renamed.
  • Rename the temporary file to the final filename, atomically replacing the existing file.
  • Run fsync on the renamed file to ensure the rename is committed to storage.

Bug prevalence

Rarely - requires unlucky timing of power loss

GitHub hash of the SDK that was being used

648d7bf (chip-wheels 2024.9.0)

Platform

python

Platform Version(s)

No response

Anything else?

I originally reported this issue on Home Assistant's python-matter-server: home-assistant-libs/python-matter-server#977 but it turns out that the responsible code was actually in the Matter SDK, not provided by Home Assistant.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    Status

    Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      0