Releases · longhorn/longhorn

@PhanLe1010

DON'T UPGRADE from/to any RC/Preview/Sprint releases because the operation is not supported.
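
The warning above can be enforced in automation by refusing any tag that looks like a pre-release. The following is a minimal sketch, assuming release tags follow the v1.9.0-rc1 pattern seen in the issue titles on this page. The function names is_prerelease and check_upgrade are hypothetical, and the sketch treats any suffix after the patch number as a pre-release, which is an assumption rather than a documented Longhorn rule.

```python
import re

# Assumed tag shape: v<major>.<minor>.<patch> with an optional "-<suffix>".
# Only the "-rc1" form is confirmed by this page; other suffixes are guesses.
_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.]+))?$")


def is_prerelease(tag: str) -> bool:
    """Return True when the tag carries any suffix after the patch number."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unrecognized release tag: {tag!r}")
    return match.group(4) is not None


def check_upgrade(current: str, target: str) -> None:
    """Raise if either end of the upgrade is a pre-release."""
    for tag in (current, target):
        if is_prerelease(tag):
            raise RuntimeError(f"{tag} is a pre-release; upgrade not supported")
```

Calling check_upgrade("v1.8.1", "v1.9.0") returns silently, while passing a tag such as "v1.9.0-rc1" on either side raises.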

Resolved Issues in this release

Highlight

[FEATURE] Support creation of PVC from CSI snapshot without copying data for v2 data engine 7794 - @PhanLe1010
[FEATURE] V1 and V2 volume offline replica rebuilding 8443 - @mantissahz @roger-ryao
[FEATURE] Delta Replica Rebuilding using Delta Snapshot: Control and Data Planes 10037 - @shuo-wu
[FEATURE] v2 volume supports UBLK frontend 9456 - @PhanLe1010 @chriscchien
[UI][FEATURE] V1 and V2 volume offline replica rebuilding 10581 - @houhoucoop @roger-ryao
[TASK] Migrate v1beta1 CR to v1beta2 10250 - @COLDTURNIP @roger-ryao
[FEATURE] Storage network with V2 data engine 6450 - @c3y1huang @roger-ryao
[FEATURE] Recurring system backup 6534 - @yangchiu @c3y1huang
[UI][FEATURE] Recurring system backup 10262 - @yangchiu @houhoucoop
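
Each entry in these lists follows the same shape: one or more bracketed tags, the issue title, the issue number, a dash, and zero or more assignee handles. A rough sketch of a parser for that shape is shown here. The Entry class and parse_entry function are hypothetical names, and the pattern is inferred from the entries on this page rather than from any published format.

```python
import re
from dataclasses import dataclass

_ENTRY = re.compile(
    r"^(?P<tags>(?:\[[^\]]+\])+)\s*"   # one or more leading [TAG] groups
    r"(?P<title>.*?)\s+"               # issue title (shortest match)
    r"(?P<issue>\d+)\s+-\s*"           # issue number, then the dash separator
    r"(?P<people>(?:@[\w-]+\s*)*)$"    # zero or more @handles
)


@dataclass
class Entry:
    tags: list[str]
    title: str
    issue: int
    assignees: list[str]


def parse_entry(line: str) -> Entry:
    """Split one release-note line into tags, title, issue number, assignees."""
    match = _ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"not a release-note entry: {line!r}")
    return Entry(
        tags=re.findall(r"\[([^\]]+)\]", match["tags"]),
        title=match["title"],
        issue=int(match["issue"]),
        assignees=match["people"].split(),
    )
```

Entries that end in a bare dash (no assignee recorded) parse with an empty assignee list. Truncated entries ending in "..." are rejected with ValueError.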

Feature

[FEATURE] Cleanup orphaned volume runtime resources if the resources already deleted 6764 - @COLDTURNIP @chriscchien
[FEATURE] Running replicas field in volume table 10817 - @xelab04 @roger-ryao
[FEATURE] Longhorn UI supports orphaned instance CRs management 10760 - @yangchiu @houhoucoop
[FEATURE] Allow auto deleting snapshot when a backup is created from that snapshot. 9213 - @yangchiu @mantissahz
[FEATURE] Add missing metrics of number of volumes/replicas by node/cluster 7599 - @c3y1huang @roger-ryao

Improvement

[IMPROVEMENT] Add Prometheus metrics for Replica and Engine CRs 10722 - @hookak @chriscchien
[IMPROVEMENT] Export longhorn engine rebuild status as prometheus metrics 10550 - @hookak @chriscchien
[IMPROVEMENT] Disable Snapshot Checksum Calculation for Single-Replica V1 Volume 10518 - @derekbit @chriscchien
[IMPROVEMENT] Remove unnecessary lasso dependency 10856 - @derekbit @chriscchien
[IMPROVEMENT] add extraObject in charts 10835 - @DrummyFloyd @chriscchien
[IMPROVEMENT] Disable the v2 snapshot hashing while it is being deleted 10563 - @shuo-wu @roger-ryao
[IMPROVEMENT] v2 checksum calculation and update should follow the v1 flow 10480 - @shuo-wu @roger-ryao
[IMPROVEMENT] add strict field validation to the update option in upgrade path 10644 - @ChanYiLin @chriscchien
[IMPROVEMENT] Show snapshot size during in-progress backup 9783 - @yangchiu @houhoucoop
[IMPROVEMENT] Don't synchronize all filesystem before snapshotting a v2 volume 9023 - @yangchiu @DamiaSan
[IMPROVEMENT] spdk_tgt can cancel lvol checksum calculation while there is high priority task 10421 - @yangchiu @DamiaSan
[IMPROVEMENT] Lvol is not force-removed if Blob is busy 10474 - @yangchiu @DamiaSan
[IMPROVEMENT] Remove deprecated fields from CRDs 6684 - @derekbit @roger-ryao
[IMPROVEMENT] Longhorn CLI fails to recognize Raspbian OS 10676 - @bachmanity1 @roger-ryao
[IMPROVEMENT] Reduce auto balancing logging noise for detached volumes 10691 - @dihmandrake @roger-ryao
[IMPROVEMENT] Remove the upper bound of v2-data-engine-guaranteed-instance-manager-cpu 10662 - @derekbit @roger-ryao
[IMPROVEMENT] Clean up BackupTarget condition message handling 8224 - @chriscchien @houhoucoop
[IMPROVEMENT] Longhorn CLI supports SLES micro 9256 - @yangchiu @DamiaSan
[IMPROVEMENT] Allow volumeBindingMode to be set from helm values 10592 - @ruant @roger-ryao
[DOC] Prepare a knowledge base for backing image trouble shooting during upgrade 10590 - @ChanYiLin @chriscchien
[IMPROVEMENT] Missing Prometheus Metrics for Engine v2 Volumes 10472 - @hookak @roger-ryao
[IMPROVEMENT] Create Volume UI improvement, Automatically Filter Backing Image Based on v1 or v2 Selection 10086 - @houhoucoop @roger-ryao
[UI][IMPROVEMENT] Improve the Warning Message When Failed to Remove Block-Type Disks 10580 - @houhoucoop @roger-ryao
[IMPROVEMENT] Pass full backup mode option to CSI volume snapshot type backup 9785 - @ChanYiLin @roger-ryao
[UI][IMPROVEMENT] Clean up BackupTarget condition message handling 10579 - @houhoucoop
[IMPROVEMENT] Improve the Warning Message When Failed to Remove Block-Type Disks 10522 - @yangchiu @ChanYiLin
[IMPROVEMENT] Move SettingNameV2DataEngineHugepageLimit to danger zone settings 7746 - @derekbit @chriscchien
[IMPROVEMENT] Include the /proc/mounts file and multipath.config in the support-bundle 6754 - @c3y1huang @roger-ryao
[IMPROVEMENT] Use code-generator/kube_codegen.sh to generate K8s stubs and CRDs 7944 - @derekbit @chriscchien
[IMPROVEMENT] CRD & API code generator decouple from Go conventional source path 10556 - @COLDTURNIP
[IMPROVEMENT] Support configurable upgrade-responder URL 10437 - @derekbit @roger-ryao
[IMPROVEMENT] Settings change validation should go back to using Volume state to determine "are all volumes detached" 10233 - @yangchiu @james-munson
[IMPROVEMENT] Code conventions in the Longhorn project with golangci-lint 8955 -
[IMPROVEMENT] Improve the UX of updating danger zone settings 8070 - @yangchiu @mantissahz

Bug

[BUG] backup target settings causes longhorn-manager crash loop during upgrade to v1.9.0 10864 - @COLDTURNIP
[BUG][v1.9.0-rc1] v2 volumes don't reuse failed replicas as expected after a node goes down 10828 - @yangchiu @shuo-wu
[BUG][v1.9.0-rc1] Unexpected orphaned data are created after v2 instance managers deleted 10829 - @yangchiu
[BUG] v2 volume with backing image fails to recover to healthy state after deleting instance manager during replica rebuilding 10521 - @yangchiu
[BUG] V2 Backing image not ready after upgrade from v1.8.1 to v1.9.x 10805 - @COLDTURNIP @chriscchien
[BUG] Test case test_snapshot_prune_and_coalesce_simultaneously_with_backing_image fails 10808 - @yangchiu @c3y1huang
[BUG][UI] Snapshots of v2 volume with backing image aren't shown on the Snapshots and Backups graph 10526 - @derekbit @chriscchien
[BUG] Deleted orphan data still renders on the page until page refresh 10803 - @COLDTURNIP @chriscchien @houhoucoop
[BUG] v2 Engine loops in detaching and attaching state after rebuilding 10396 - @shuo-wu @roger-ryao
[BUG] DR volume does not sync with latest backup when activation 10824 - @c3y1huang @chriscchien
[BUG][v1.9.0-rc1] While running component resilience robot test suite, a v2 instance manager gets stuck in Terminating state, preventing block disk from being schedulable on this node 10810 - @DamiaSan
[BUG][v1.9.0-rc1] Block disks become temporarily unavailable after the upgrading from v1.8.1 to v1.9.0-rc1 10821 - @mantissahz
[BUG] Naming collision when creating the name of the new backing image manager 10616 - @yangchiu @ChanYiLin
[BUG] v2 volume replica status error after snapshot deletion with Immediate Data Integrity Check Enable...

@shuo-wu

DON'T UPGRADE from/to any RC/Preview/Sprint releases because the operation is not supported.

Resolved Issues in this release

Highlight

[FEATURE] Delta Replica Rebuilding using Delta Snapshot: Control and Data Planes 10037 - @shuo-wu
[FEATURE] V1 and V2 volume offline replica rebuilding 8443 - @mantissahz @roger-ryao
[FEATURE] v2 volume supports UBLK frontend 9456 - @PhanLe1010 @chriscchien
[UI][FEATURE] V1 and V2 volume offline replica rebuilding 10581 - @houhoucoop @roger-ryao
[FEATURE][v2] Support creation of PVC from CSI snapshot without copying data for v2 data engine 7794 - @PhanLe1010
[TASK] Migrate v1beta1 CR to v1beta2 10250 - @COLDTURNIP @roger-ryao
[FEATURE] Storage network with V2 data engine 6450 - @c3y1huang @roger-ryao
[FEATURE] Recurring system backup 6534 - @yangchiu @c3y1huang
[UI][FEATURE] Recurring system backup 10262 - @yangchiu @houhoucoop

Feature

[FEATURE] Longhorn UI supports orphaned instance CRs management 10760 - @yangchiu @houhoucoop
[FEATURE] Allow auto deleting snapshot when a backup is created from that snapshot. 9213 - @yangchiu @mantissahz
[FEATURE] Add missing metrics of number of volumes/replicas by node/cluster 7599 - @c3y1huang @roger-ryao

Improvement

[IMPROVEMENT] Cleanup orphaned volume runtime resources if the resources already deleted 6764 - @COLDTURNIP @chriscchien
[IMPROVEMENT] Disable the v2 snapshot hashing while it is being deleted 10563 - @shuo-wu @roger-ryao
[IMPROVEMENT] Add Prometheus metrics for Replica and Engine CRs 10722 - @hookak @chriscchien
[IMPROVEMENT] v2 checksum calculation and update should follow the v1 flow 10480 - @shuo-wu @roger-ryao
[IMPROVEMENT] add strict field validation to the update option in upgrade path 10644 - @ChanYiLin @chriscchien
[IMPROVEMENT] Export longhorn engine rebuild status as prometheus metrics 10550 - @hookak @chriscchien
[IMPROVEMENT] Show snapshot size during in-progress backup 9783 - @yangchiu @houhoucoop
[IMPROVEMENT] Don't synchronize all filesystem before snapshotting a v2 volume 9023 - @yangchiu @DamiaSan
[IMPROVEMENT] spdk_tgt can cancel lvol checksum calculation while there is high priority task 10421 - @yangchiu @DamiaSan
[IMPROVEMENT] V2 volume snapshot supports labels 9808 - @DamiaSan
[IMPROVEMENT] Lvol is not force-removed if Blob is busy 10474 - @yangchiu @DamiaSan
[IMPROVEMENT] Remove deprecated fields from CRDs 6684 - @derekbit @roger-ryao
[IMPROVEMENT] Longhorn CLI fails to recognize Raspbian OS 10676 - @bachmanity1 @roger-ryao
[IMPROVEMENT] Reduce auto balancing logging noise for detached volumes 10691 - @dihmandrake @roger-ryao
[IMPROVEMENT] Remove the upper bound of v2-data-engine-guaranteed-instance-manager-cpu 10662 - @derekbit @roger-ryao
[IMPROVEMENT] Clean up BackupTarget condition message handling 8224 - @chriscchien @houhoucoop
[IMPROVEMENT] Longhorn CLI supports SLES micro 9256 - @yangchiu @DamiaSan
[IMPROVEMENT] Allow volumeBindingMode to be set from helm values 10592 - @ruant @roger-ryao
[DOC] Prepare a knowledge base for backing image trouble shooting during upgrade 10590 - @ChanYiLin @chriscchien
[IMPROVEMENT] Missing Prometheus Metrics for Engine v2 Volumes 10472 - @hookak @roger-ryao
[IMPROVEMENT] Create Volume UI improvement, Automatically Filter Backing Image Based on v1 or v2 Selection 10086 - @houhoucoop @roger-ryao
[UI][IMPROVEMENT] Improve the Warning Message When Failed to Remove Block-Type Disks 10580 - @houhoucoop @roger-ryao
[IMPROVEMENT] Pass full backup mode option to CSI volume snapshot type backup 9785 - @ChanYiLin @roger-ryao
[UI][IMPROVEMENT] Clean up BackupTarget condition message handling 10579 - @houhoucoop
[IMPROVEMENT] Improve the Warning Message When Failed to Remove Block-Type Disks 10522 - @yangchiu @ChanYiLin
[IMPROVEMENT] Move SettingNameV2DataEngineHugepageLimit to danger zone settings 7746 - @derekbit @chriscchien
[IMPROVEMENT] Disable Snapshot Checksum Calculation for Single-Replica V1 Volume 10518 - @derekbit @chriscchien
[IMPROVEMENT] Include the /proc/mounts file and multipath.config in the support-bundle 6754 - @c3y1huang @roger-ryao
[IMPROVEMENT] Use code-generator/kube_codegen.sh to generate K8s stubs and CRDs 7944 - @derekbit @chriscchien
[IMPROVEMENT] CRD & API code generator decouple from Go conventional source path 10556 - @COLDTURNIP
[IMPROVEMENT] Support configurable upgrade-responder URL 10437 - @derekbit @roger-ryao
[IMPROVEMENT] Settings change validation should go back to using Volume state to determine "are all volumes detached" 10233 - @yangchiu @james-munson
[UI][IMPROVEMENT] Improving error transparency for volume attachment failure 10257 - @houhoucoop
[IMPROVEMENT] Code conventions in the Longhorn project with golangci-lint 8955 -
[IMPROVEMENT] Improve the UX of updating danger zone settings 8070 - @yangchiu @mantissahz

Bug

[BUG][v1.9.0-rc1] Unable to delete v2 backing image 10811 -
[BUG][v1.9.0-rc1] While running component resilience robot test suite, a v2 instance manager gets stuck in Terminating state, preventing block disk from being schedulable on this node 10810 - @COLDTURNIP
[BUG] Test case test_snapshot_prune_and_coalesce_simultaneously_with_backing_image fails 10808 - @c3y1huang
[BUG] V2 Backing image not ready after upgrade from v1.8.1 to v1.9.x 10805 - @COLDTURNIP
[BUG] v2 volume replica status error after snapshot deletion with Immediate Data Integrity Check Enabled 10798 - @shuo-wu
[BUG] Enabling V2 Data Engine setting, v2 instance manager doesn't start after certain negative factor operations 10791 - @COLDTURNIP @yangchiu
[BUG] Volume migration negative test cases fail on v2 volumes 10800 -
[BUG] spdk emits Device or resource busy while registering lvol checksum calculation 10140 - @shuo-wu @roger-ryao
[BUG] Naming collision when creating the name of the new backing image manager 10616 - @yangchiu @ChanYiLin
[BUG] v2 Engine loops in detaching and attaching state after rebuilding 10396 - @shuo-wu @roger-ryao
[BUG] v2 instance managers keep crashing on master-head arm64 environment 10768 - @yangchiu @PhanLe1010
[BUG] Wrong image name in longhorn-images.txt 10774 - @c3y1huang
[BUG] Replica auto balance disk in pressure fails on v2 volumes 10551 - @DamiaSan
[BUG] Failed to terminate namespace longhorn-system if there is a support bundle ReadyForDownload 10731 - @yangchiu @c3y1huang
[BUG] After node down and force delete the terminating deployment pod, volume can not attach success 10689 - @c3y1huang @chriscchien
[BUG] Deleting a replica of one v2 volume will also deg...

@derekbit

Longhorn v1.8.1 Release Notes

Longhorn 1.8.1 introduces several improvements and bug fixes that are intended to improve system quality, resilience, stability and security.

The Longhorn team appreciates your contributions and welcomes feedback on this release.

Note

For more information about release-related terminology, see Releases.

Installation

Important

Ensure that your cluster is running Kubernetes v1.25 or later before installing Longhorn v1.8.1.

You can install Longhorn using a variety of tools, including Rancher, Kubectl, and Helm. For more information about installation methods and requirements, see Quick Installation in the Longhorn documentation.

Upgrade

Important

Ensure that your cluster is running Kubernetes v1.25 or later before upgrading from Longhorn v1.7.x or v1.8.x (< v1.8.1) to v1.8.1.

Longhorn only allows upgrades from supported versions. For more information about upgrade paths and procedures, see Upgrade in the Longhorn documentation.
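
The two upgrade conditions stated above (Kubernetes v1.25 or later, and a source version of v1.7.x or a v1.8.x older than v1.8.1) can be checked before starting. This is a minimal sketch under the assumption that both versions are plain dotted numbers with an optional leading "v". Distribution suffixes are not handled, and the function names are hypothetical rather than part of any Longhorn tool.

```python
MIN_KUBERNETES = (1, 25)   # minimum Kubernetes version stated above
TARGET = (1, 8, 1)         # Longhorn v1.8.1


def parse_version(text: str) -> tuple[int, ...]:
    """Turn 'v1.25.4' or '1.25' into a tuple of ints such as (1, 25, 4)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def can_upgrade_to_1_8_1(kubernetes: str, longhorn: str) -> bool:
    """Apply both conditions from the upgrade note to a cluster's versions."""
    kube = parse_version(kubernetes)
    current = parse_version(longhorn)
    kube_ok = kube[:2] >= MIN_KUBERNETES
    # Allowed sources: any v1.7.x, or a v1.8.x older than v1.8.1.
    from_1_7 = current[:2] == (1, 7)
    from_older_1_8 = current[:2] == (1, 8) and current < TARGET
    return kube_ok and (from_1_7 or from_older_1_8)
```

The check is only a convenience. The Upgrade page in the Longhorn documentation remains the reference for supported paths.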

Post-Release Known Issues

For information about issues identified after this release, see Release-Known-Issues.

Resolved Issues

Improvement

[BACKPORT][v1.8.1][IMPROVEMENT] Support configurable upgrade-responder URL 10439 - @derekbit @roger-ryao
[BACKPORT][v1.8.1][IMPROVEMENT] Several warning for unknown reason 10420 - @roger-ryao
[BACKPORT][v1.8.1][IMPROVEMENT] Settings change validation should go back to using Volume state to determine "are all volumes detached" 10376 - @yangchiu @james-munson

Bug

[BACKPORT][v1.8.1][BUG] csi keeps creating backup if the backup target is unavailable 10510 - @mantissahz @roger-ryao
[BACKPORT][v1.8.1][BUG] integer divide by zero in replica scheduler 10506 - @c3y1huang @chriscchien
[BACKPORT][v1.8.1][BUG] Leading or trailing spaces in Longhorn UI break search 10508 - @houhoucoop @roger-ryao
[BACKPORT][v1.8.1][BUG] When replica rebuilding completed, the progress could be 99 instead of 100 10485 - @shuo-wu @chriscchien
[BACKPORT][v1.8.1][BUG] list_backupVolume API could randomly returns failed to find a node that is ready and has the default engine image error 10478 - @yangchiu @mantissahz
[BACKPORT][v1.8.1][BUG] nil pointer when the backing image copy is delete from the spec but also gets evicted at the same time 10466 - @yangchiu @ChanYiLin
[BACKPORT][v1.8.1][BUG] 2 uninstall pods could be created after uninstall job was created, one failed with deleting-confirmation-flag is set to false error, while the other completed successfully 10484 -
[BACKPORT][v1.8.1][BUG][UI] Backup store setting doesn't apply to the cloned volume 10468 - @yangchiu @mantissahz
[BACKPORT][v1.8.1][BUG] v2 volume workload FailedMount with message Staging target path /var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/xxx/globalmount is no longer valid 10477 -
[BACKPORT][v1.8.1][BUG][UI] Bulk backup creation with a detached volume returns error 405 and error messages show in browser console 10462 - @mantissahz
[BACKPORT][v1.8.1][BUG] V2 volume fails to cleanup error replica and rebuild new one - test_data_locality_basic 10364 - @shuo-wu @chriscchien
[BACKPORT][v1.8.1][BUG] Data lost caused by Longhorn CSI plugin doing a wrong filesystem format action in a rare race condition 10418 - @yangchiu @PhanLe1010
[BACKPORT][v1.8.1][BUG] v2 Engine loops in detaching and attaching state after rebuilding 10397 - @shuo-wu
[BACKPORT][v1.8.1][BUG] A V2 volume checksum will change after replica rebuilding if the volume created with backing image 10341 - @shuo-wu @chriscchien
[BACKPORT][v1.8.1][BUG] Bug in snapshot count enforcement cause volume faulted and stuck in detaching/attaching loop 10309 - @PhanLe1010 @roger-ryao
[BACKPORT][v1.8.1][BUG] Test case test_csi_mount_volume_online_expansion is failing due to unable to expand PVC 10414 - @yangchiu @c3y1huang
[BACKPORT][v1.8.1][BUG] V2 BackingImage failed after node reboot 10343 - @ChanYiLin @chriscchien
[BACKPORT][v1.8.1][BUG] Workload pod will not be able to move to new node when backup operation is taking a long time 10172 - @PhanLe1010 @chriscchien
[BACKPORT][v1.8.1][BUG] WebUI Volumes Disappear and Reappear 10332 - @PhanLe1010 @chriscchien @houhoucoop
[BACKPORT][v1.8.1][BUG] "Error get size" from "metrics_collector.(*BackupCollector).Collect" on every metric scrape 10361 - @derekbit @chriscchien
[BACKPORT][v1.8.1][BUG] [UI] 'Create' button on the System Backup page is disabled after reloading page 10354 - @chriscchien @houhoucoop
[BACKPORT][v1.8.1][BUG] Proxy gRPC API ReplicaList returns different output formats for v1 and v2 volumes 10353 - @shuo-wu @roger-ryao
[BACKPORT][v1.8.1][BUG] constant attaching/reattaching of volumes after upgrading to 1.8 10315 - @james-munson
[BACKPORT][v1.8.1][BUG] Backup Execution Timeout setting issue in Helm chart 10325 - @james-munson @chriscchien
[BACKPORT][v1.8.1][BUG] v2 engine stuck in detaching-attaching loop if the previous replica is not cleaned up correct 10363 - @shuo-wu @chriscchien
[BACKPORT][v1.8.1][BUG] Longhorn CSI plugin 1.8.0 crashes consistently when trying to create a snapshot 10319 - @PhanLe1010 @chriscchien
[BACKPORT][v1.8.1][BUG] Engine stuck in "stopped" state, prevent volume attach 10329 - @ChanYiLin @chriscchien
[BACKPORT][v1.8.1][BUG] After upgrading to v1.8.0 the version number lost on the web-ui 10337 - @derekbit
[BACKPORT][v1.8.1][BUG] insufficient storage;precheck new replica failed after a temporary shutdown of a node 10234 - @PhanLe1010

Misc

[TASK] Fix CVE issues for v1.8.1 10318 - @c3y1huang

Contributors

@jangseon-ryu

Longhorn v1.7.3 Release Notes

Longhorn 1.7.3 introduces several improvements and bug fixes that are intended to improve system quality, resilience, stability and security.

The Longhorn team appreciates your contributions and welcomes feedback on this release.

Note

For more information about release-related terminology, see Releases.

Installation

Important

Ensure that your cluster is running Kubernetes v1.21 or later before installing Longhorn v1.7.3.

You can install Longhorn using a variety of tools, including Rancher, Kubectl, and Helm. For more information about installation methods and requirements, see Quick Installation in the Longhorn documentation.

Upgrade

Important

Ensure that your cluster is running Kubernetes v1.21 or later before upgrading from Longhorn v1.6.x or v1.7.x (< v1.7.3) to v1.7.3.

Longhorn only allows upgrades from supported versions. For more information about upgrade paths and procedures, see Upgrade in the Longhorn documentation.

Deprecation & Incompatibilities

The functionality of the environment check script overlaps with that of the Longhorn CLI, which is available starting with v1.7.0. Because of this, the script is deprecated in v1.7.0 and is scheduled for removal in v1.8.0.

For information about important changes, including feature incompatibility, deprecation, and removal, see Important Notes in the Longhorn documentation.

Post-Release Known Issues

For information about issues identified after this release, see Release-Known-Issues.

Resolved Issues

Feature

[BACKPORT][v1.7.3][FEATURE] Add periodic HugePages (2Mi) configuration check to ensure v2 data engine compatibility 10029 - @jangseon-ryu @yangchiu
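
To illustrate what a HugePages (2Mi) configuration check looks at, the sketch below reads the Hugepagesize and HugePages_Total fields that Linux exposes in /proc/meminfo. It is not Longhorn's implementation. The function name hugepages_2mi_ok and the required_pages parameter are hypothetical, and the number of pages the v2 data engine actually needs is not stated on this page.

```python
def hugepages_2mi_ok(meminfo_text: str, required_pages: int) -> bool:
    """Check a /proc/meminfo dump for enough 2 MiB huge pages."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, value = line.partition(":")
        parts = value.split()
        if sep and parts:
            fields[key.strip()] = parts
    # Hugepagesize is reported in kB, so 2 MiB pages appear as 2048.
    page_kb = int(fields.get("Hugepagesize", ["0"])[0])
    total = int(fields.get("HugePages_Total", ["0"])[0])
    return page_kb == 2048 and total >= required_pages
```

On a Linux node, the text can come from reading /proc/meminfo directly. A periodic check would call the function on a timer and report when it returns False.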

Improvement

[BACKPORT][v1.7.3][IMPROVEMENT] Settings change validation should go back to using Volume state to determine "are all volumes detached" 10375 - @yangchiu @james-munson
[BACKPORT][v1.7.3][IMPROVEMENT] Add support for JSON log format configuration in Longhorn components (UI, driver) 10082 - @chriscchien
[BACKPORT][v1.7.3][IMPROVEMENT] Logging the reason why the instance manager pod is going to be deleted. 9887 - @derekbit @chriscchien
[BACKPORT][v1.7.3][IMPROVEMENT] Why is it not possible to change the replica count in v2 longhorn volume? 9806 - @chriscchien
[BACKPORT][v1.7.3][IMPROVEMENT] Check NFS versions in /etc/nfsmount.conf instead 9831 - @COLDTURNIP @roger-ryao
[BACKPORT][v1.7.3][IMPROVEMENT] Add dmsetup and dmcrypt utilities check in cli 9935 - @COLDTURNIP @roger-ryao
[BACKPORT][v1.7.3][UI][IMPROVEMENT] Improve the volume size information on UI 9964 - @houhoucoop @roger-ryao
[BACKPORT][v1.7.3][IMPROVEMENT] Prevent Volume Resize Stuck 9914 - @c3y1huang @roger-ryao
[BACKPORT][v1.7.3][IMPROVEMENT][UI] Add backupBackingImage table in backup page with tabs 9970 - @houhoucoop
[BACKPORT][v1.7.3][IMPROVEMENT] Reject strict-local + RWX volume creation 9930 - @COLDTURNIP @yangchiu
[BACKPORT][v1.7.3][IMPROVEMENT] Configure the log level of other system and user managed components via longhorn manager setting 9617 - @yangchiu @james-munson
[BACKPORT][v1.7.3][IMPROVEMENT] Change confusing error message to warning level 9917 - @yangchiu @derekbit
[BACKPORT][v1.7.3][IMPROVEMENT] Building longhorn-manager takes long time 9693 - @derekbit @chriscchien
[BACKPORT][v1.7.3][IMPROVEMENT] Talos support for environment check in longhorn manager 9723 - @yangchiu @c3y1huang

Bug

[BACKPORT][v1.7.3][BUG] Data lost caused by Longhorn CSI plugin doing a wrong filesystem format action in a rare race condition 10417 - @yangchiu @PhanLe1010 @chriscchien
[BACKPORT][v1.7.3][BUG] kubectl drain node is blocked by unexpected orphan engine processes 10427 - @yangchiu @PhanLe1010
[BACKPORT][v1.7.3][BUG] Test case test_csi_mount_volume_online_expansion is failing due to unable to expand PVC 10413 - @yangchiu @c3y1huang
[BACKPORT][v1.7.3][BUG] Workload pod will not be able to move to new node when backup operation is taking a long time 10173 - @yangchiu
[BUG][v1.7.x] Excessive memory consumption caused by RWX volumes / ganesha.nfsd 8523 - @james-munson @chriscchien
[BACKPORT][v1.7.3][BUG] WebUI Volumes Disappear and Reappear 10331 - @PhanLe1010 @chriscchien @houhoucoop
[BACKPORT][v1.7.3][BUG] "Error get size" from "metrics_collector.(*BackupCollector).Collect" on every metric scrape 10362 - @derekbit @chriscchien
[BACKPORT][v1.7.3][BUG] Engine stuck in "stopped" state, prevent volume attach 9954 - @ChanYiLin @roger-ryao
[BACKPORT][v1.7.3][BUG] Backup Execution Timeout setting issue in Helm chart 10326 - @james-munson @chriscchien
[BACKPORT][v1.7.3][BUG] Instability after power failure 10185 - @yangchiu @james-munson
[BACKPORT][v1.7.3][BUG] CSI plugin pod keeps crashing until the backup volume appears when creating a backup via the CSI snapshotter 10024 - @mantissahz @chriscchien
[BACKPORT][v1.7.3][BUG] insufficient storage;precheck new replica failed after a temporary shutdown of a node 10223 - @PhanLe1010 @roger-ryao
[BACKPORT][v1.7.3][BUG] longhorn-manager seems to crash rpm-DB on the host by continuously calling rpm -q ... 10022 - @COLDTURNIP @roger-ryao
[BACKPORT][v1.7.3][BUG] Backup progress should not add block failed to upload to successful count 9793 - @derekbit @chriscchien
[BACKPORT][v1.7.3][BUG][v1.8.x] Can not create backup, backup become in error state immediately 10180 - @PhanLe1010 @chriscchien
[BACKPORT][v1.7.3][BUG] Storage doesn't reschedule in v1.7.2 10109 - @PhanLe1010
[BACKPORT][v1.7.3][BUG] Old backups are not cleaned up after timeout 9731 - @mantissahz @roger-ryao
[BACKPORT][v1.7.3][BUG] UnknowOS Message in Longhorn Node Condition on RHEL 9833 - @yangchiu @mantissahz @roger-ryao
[BACKPORT][v1.7.3][BUG] volume FailedMount - Input/output error 10005 - @PhanLe1010 @roger-ryao
[BACKPORT][v1.7.3][BUG] Unable to delete backing image backup through UI 10068 - @chriscchien @houhoucoop @roger-ryao
[BACKPORT][v1.7.3][BUG] Error notification appears on the volume backup details page 10071 - @houhoucoop @roger-ryao
[BACKPORT][v1.7.3][BUG] Missing fromBackup Parameter in API Request When Restoring Multiple Files from Backup List 10051 - @a110605 @roger-ryao
[BACKPORT][v1.7.3][BUG] Webhook servers initialization blocks longhorn-manager from running 10055 - @c3y1huang @chriscchien
[BACKPORT][v1.7.3][BUG] CLI check preflight glosses over absence of NFS installation. 9893 - @COLDTURNIP @roger-ryao
[BACKPORT][v1.7.3][BUG] Detached Volume Stuck in Attached State During Node Eviction 9810 - @yangchiu @c3y1huang
[BACKPORT][v1.7.3][BUG] Test case test_node_eviction_multiple_volume failed to reschedule replicas after volume detached 9866 - @yangchiu @c3y1huang
[BACKPORT][v1.7.3][BUG] DR volume fails to reattach and faulted after node stop and start during incremental restore 9803 - @c3y1huang @roger-ryao
[BACKPORT][v1.7.3][BUG] Share manager is permanently stuck in stopping/error if we shutdown the node of share manager pod. This makes RWX PVC cannot attach to any new node 9856 -
[BACKPORT][v1.7.3][BUG] Fail to resize RWX PVC at filesystem resizing step 9738 - @james-munson
[BACKPORT][v1.7.3][BUG] Failed to inspect the backup bac...

@yangchiu

DON'T UPGRADE from/to any RC/Preview/Sprint releases because the operation is not supported.

Resolved Issues in this release

Feature

[BACKPORT][v1.7.3][FEATURE] Add periodic HugePages (2Mi) configuration check to ensure v2 data engine compatibility 10029 - @yangchiu

Improvement

[BACKPORT][v1.7.3][IMPROVEMENT] Settings change validation should go back to using Volume state to determine "are all volumes detached" 10375 - @yangchiu @james-munson
[BACKPORT][v1.7.3][IMPROVEMENT] Add support for JSON log format configuration in Longhorn components (UI, driver) 10082 - @chriscchien
[BACKPORT][v1.7.3][IMPROVEMENT] Logging the reason why the instance manager pod is going to be deleted. 9887 - @derekbit @chriscchien
[BACKPORT][v1.7.3][IMPROVEMENT] Why is it not possible to change the replica count in v2 longhorn volume? 9806 - @chriscchien
[BACKPORT][v1.7.3][IMPROVEMENT] Check NFS versions in /etc/nfsmount.conf instead 9831 - @COLDTURNIP @roger-ryao
[BACKPORT][v1.7.3][IMPROVEMENT] Add dmsetup and dmcrypt utilities check in cli 9935 - @COLDTURNIP @roger-ryao
[BACKPORT][v1.7.3][UI][IMPROVEMENT] Improve the volume size information on UI 9964 - @houhoucoop @roger-ryao
[BACKPORT][v1.7.3][IMPROVEMENT] Prevent Volume Resize Stuck 9914 - @c3y1huang @roger-ryao
[BACKPORT][v1.7.3][IMPROVEMENT][UI] Add backupBackingImage table in backup page with tabs 9970 - @houhoucoop
[BACKPORT][v1.7.3][IMPROVEMENT] Reject strict-local + RWX volume creation 9930 - @COLDTURNIP @yangchiu
[BACKPORT][v1.7.3][IMPROVEMENT] Configure the log level of other system and user managed components via longhorn manager setting 9617 - @yangchiu @james-munson
[BACKPORT][v1.7.3][IMPROVEMENT] Change confusing error message to warning level 9917 - @yangchiu @derekbit
[BACKPORT][v1.7.3][IMPROVEMENT] Building longhorn-manager takes long time 9693 - @derekbit @chriscchien
[BACKPORT][v1.7.3][IMPROVEMENT] Talos support for environment check in longhorn manager 9723 - @yangchiu @c3y1huang

Bug

[BACKPORT][v1.7.3][BUG] Data lost caused by Longhorn CSI plugin doing a wrong filesystem format action in a rare race condition 10417 - @yangchiu @PhanLe1010 @chriscchien
[BACKPORT][v1.7.3][BUG] kubectl drain node is blocked by unexpected orphan engine processes 10427 - @yangchiu @PhanLe1010
[BACKPORT][v1.7.3][BUG] Test case test_csi_mount_volume_online_expansion is failing due to unable to expand PVC 10413 - @yangchiu @c3y1huang
[BACKPORT][v1.7.3][BUG] Workload pod will not be able to move to new node when backup operation is taking a long time 10173 - @yangchiu
[BUG][v1.7.x] Excessive memory consumption caused by RWX volumes / ganesha.nfsd 8523 - @james-munson @chriscchien
[BACKPORT][v1.7.3][BUG] WebUI Volumes Disappear and Reappear 10331 - @PhanLe1010 @chriscchien @houhoucoop
[BACKPORT][v1.7.3][BUG] "Error get size" from "metrics_collector.(*BackupCollector).Collect" on every metric scrape 10362 - @derekbit @chriscchien
[BACKPORT][v1.7.3][BUG] Engine stuck in "stopped" state, prevent volume attach 9954 - @ChanYiLin @roger-ryao
[BACKPORT][v1.7.3][BUG] Backup Execution Timeout setting issue in Helm chart 10326 - @james-munson @chriscchien
[BACKPORT][v1.7.3][BUG] Instability after power failure 10185 - @yangchiu @james-munson
[BACKPORT][v1.7.3][BUG] CSI plugin pod keeps crashing until the backup volume appears when creating a backup via the CSI snapshotter 10024 - @mantissahz @chriscchien
[BACKPORT][v1.7.3][BUG] insufficient storage;precheck new replica failed after a temporary shutdown of a node 10223 - @PhanLe1010 @roger-ryao
[BACKPORT][v1.7.3][BUG] longhorn-manager seems to crash rpm-DB on the host by continuously calling rpm -q ... 10022 - @COLDTURNIP @roger-ryao
[BACKPORT][v1.7.3][BUG] Backup progress should not add block failed to upload to successful count 9793 - @derekbit @chriscchien
[BACKPORT][v1.7.3][BUG][v1.8.x] Can not create backup, backup become in error state immediately 10180 - @PhanLe1010 @chriscchien
[BACKPORT][v1.7.3][BUG] Storage doesn't reschedule in v1.7.2 10109 - @PhanLe1010
[BACKPORT][v1.7.3][BUG] Old backups are not cleaned up after timeout 9731 - @mantissahz @roger-ryao
[BACKPORT][v1.7.3][BUG] UnknowOS Message in Longhorn Node Condition on RHEL 9833 - @yangchiu @mantissahz @roger-ryao
[BACKPORT][v1.7.3][BUG] volume FailedMount - Input/output error 10005 - @PhanLe1010 @roger-ryao
[BACKPORT][v1.7.3][BUG] Unable to delete backing image backup through UI 10068 - @chriscchien @houhoucoop @roger-ryao
[BACKPORT][v1.7.3][BUG] Error notification appears on the volume backup details page 10071 - @houhoucoop @roger-ryao
[BACKPORT][v1.7.3][BUG] Missing fromBackup Parameter in API Request When Restoring Multiple Files from Backup List 10051 - @a110605 @roger-ryao
[BACKPORT][v1.7.3][BUG] Webhook servers initialization blocks longhorn-manager from running 10055 - @c3y1huang @chriscchien
[BACKPORT][v1.7.3][BUG] CLI check preflight glosses over absence of NFS installation. 9893 - @COLDTURNIP @roger-ryao
[BACKPORT][v1.7.3][BUG] Detached Volume Stuck in Attached State During Node Eviction 9810 - @yangchiu @c3y1huang
[BACKPORT][v1.7.3][BUG] Test case test_node_eviction_multiple_volume failed to reschedule replicas after volume detached 9866 - @yangchiu @c3y1huang
[BACKPORT][v1.7.3][BUG] DR volume fails to reattach and faulted after node stop and start during incremental restore 9803 - @c3y1huang @roger-ryao
[BACKPORT][v1.7.3][BUG] Share manager is permanently stuck in stopping/error if we shutdown the node of share manager pod. This makes RWX PVC cannot attach to any new node 9856 -
[BACKPORT][v1.7.3][BUG] Fail to resize RWX PVC at filesystem resizing step 9738 - @james-munson
[BACKPORT][v1.7.3][BUG] Failed to inspect the backup backing image information if NFS backup target URL with options 9703 - @yangchiu @mantissahz
[BACKPORT][v1.7.3][BUG] Pre-upgrade pod should event the reason for any failures. 9643 - @yangchiu @james-munson

Misc

[TASK] Fix CVE issues for v1.7.3 9897 - @c3y1huang
[BACKPORT][v1.7.3][TASK] Install the latest grpc_health_probe at build time 9715 - @yangchiu @c3y1huang

Contributors

@mantissahz

Longhorn v1.8.0 Release Notes

This latest version of Longhorn introduces several features, enhancements, and bug fixes that are intended to improve system quality and the overall user experience. Highlights include new V2 Data Engine features, multiple backupstores, automatic RWX volume expansion, installation and upgrades via Helm Controller, and V2 Data Engine Support for Talos Linux.

The Longhorn team appreciates your contributions and anticipates receiving feedback regarding this release.

For more information about release-related terminology, see Releases.

Warning

An incorrect Longhorn image tag (v1.8.x-head) was used in the deployment manifest and the Helm chart. The correct tag for Longhorn v1.8.0 images is v1.8.0. For more information, see Issue #10336.

If you installed or upgraded Longhorn using the deployment manifest or the Helm chart from the main Longhorn repository, perform the following actions to resolve the issue:

New installations: Replace v1.8.x-head with v1.8.0 in the deployment manifest or the Helm chart before deploying Longhorn.
Upgrades: Replace v1.8.x-head with v1.8.0 in the deployment manifest or Helm chart. Next, upgrade the Longhorn system and update the engine image for volumes that use v1.8.x-head.

This issue does not affect installations and upgrades performed using the Longhorn Helm repository. For more details, refer to the Install with Helm section of the official documentation.
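
The tag replacement described above can be sketched in Python as follows. This is a minimal illustration, not an official Longhorn tool: the function name `fix_image_tags` and the sample manifest lines are assumptions made for the example, and the sketch does a plain text substitution rather than parsing YAML.

```python
# Minimal sketch (assumption: the manifest or values file is handled as plain text).
WRONG_TAG = "v1.8.x-head"  # incorrect tag named in the warning above
RIGHT_TAG = "v1.8.0"       # correct tag for Longhorn v1.8.0 images


def fix_image_tags(manifest_text):
    """Return the corrected text and the number of tags that were replaced."""
    replaced = manifest_text.count(WRONG_TAG)
    return manifest_text.replace(WRONG_TAG, RIGHT_TAG), replaced


if __name__ == "__main__":
    # Hypothetical excerpt of a deployment manifest, used only for illustration.
    sample = (
        "image: longhornio/longhorn-manager:v1.8.x-head\n"
        "image: longhornio/longhorn-engine:v1.8.x-head\n"
    )
    fixed, count = fix_image_tags(sample)
    print(f"replaced {count} tag(s)")
    print(fixed, end="")
```

For upgrades, correcting the file is only the first step. The system upgrade and the engine image update for affected volumes still need to be performed as described above.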

Important

The CSI external-snapshotter was upgraded to v8.2.0. Ensure that all clusters are running Kubernetes v1.25 or later before upgrading to Longhorn v1.8.0 or a later version.
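
A pre-upgrade check for this requirement could look like the sketch below. It is an illustration only: the helper names are hypothetical, and the version string is assumed to come from the cluster (for example, from the server version reported by your Kubernetes tooling) in the usual `vMAJOR.MINOR.PATCH` form.

```python
import re


def parse_k8s_version(version):
    """Extract (major, minor) from strings such as 'v1.25.3' or '1.30.2+k3s1'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(cluster_version, minimum="v1.25"):
    """True if the cluster version is at least the stated minimum."""
    # Tuples compare numerically, so v1.9 is correctly treated as older than v1.25.
    return parse_k8s_version(cluster_version) >= parse_k8s_version(minimum)


if __name__ == "__main__":
    for version in ("v1.24.17", "v1.25.0", "v1.30.2+k3s1"):
        status = "ok" if meets_minimum(version) else "too old for Longhorn v1.8.0"
        print(f"{version}: {status}")
```

Comparing parsed integers instead of raw strings avoids the common mistake of ranking `v1.9` above `v1.25`.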

Deprecation & Incompatibilities

The default block size for block-type disks in earlier Longhorn releases was 4096 bytes. However, 512 bytes is more commonly used and aligns with the V1 Data Engine's configuration. Additionally, the 4096-byte block size is incompatible with backing images generated by the V1 Data Engine. To address these concerns, the default block size was changed to 512 bytes.

If you have existing V2 volumes, perform the following steps:
1. Back up the V2 volumes.
2. Remove the V2 volumes.
3. Delete the block-type disk with a 4096-byte block size from node.spec.disks.
4. Erase the old data on the block-type disk using tools such as dd.
5. Add the disk again to node.spec.disks with the updated configuration.
6. Restore the V2 volumes.
For more information, see #10053.
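
Step 4 above mentions dd without giving a command. The sketch below only builds and prints a dd invocation so that it can be reviewed before anyone runs it. The helper name, the device path `/dev/nvme1n1`, and the optional size limit are assumptions for illustration. How much of the disk must be erased is not stated above, so by default no `count` is passed and dd would overwrite the whole device.

```python
import shlex


def build_wipe_command(device, megabytes=None):
    """Build (but do not run) a dd command that overwrites a block device with zeros."""
    if not device.startswith("/dev/"):
        raise ValueError(f"expected a block device path under /dev/, got {device!r}")
    command = ["dd", "if=/dev/zero", f"of={device}", "bs=1M"]
    if megabytes is not None:
        if megabytes <= 0:
            raise ValueError("megabytes must be a positive integer")
        # Limit the wipe to the first N MiB; this limit is an assumption of the sketch.
        command.append(f"count={megabytes}")
    return command


if __name__ == "__main__":
    # Hypothetical device path. Double-check it before running the printed command,
    # because dd destroys the data on the target device.
    print(shlex.join(build_wipe_command("/dev/nvme1n1")))
```

Printing the command instead of executing it keeps the destructive step under manual control, which matters because the disk is only erased after the V2 volumes are backed up and removed.
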
A V2 volume data corruption issue that affects earlier Longhorn releases has been resolved in v1.8.0. The issue involves potential continual changes to the checksum of files in a V2 volume with multiple replicas. This occurs because SPDK allocates clusters without initialization, leading to data inconsistencies across replicas. The varying data read from the volume can result in data corruption and broken backups. For more information, see #10035.

Primary Highlights

New V2 Data Engine Features

Although the V2 Data Engine is still considered an experimental feature in this release, the core functions have been significantly enhanced.

Configurable CPU cores: Global and node-specific configuration options provide greater control and flexibility for optimizing performance and resource allocation.
Disaster recovery (DR) volumes: Designed to store data in a backup cluster in the event of a failure in the main cluster. DR volumes enhance the resiliency of Longhorn volumes by ensuring data can be quickly restored in case of cluster outages.
Auto-salvage volumes: Automatically repair volumes in the event of a failure.
Live migration: Allow for the migration of volumes from one node to another without interrupting the services using those volumes.
Volume encryption: Ensure data stored in volumes is protected through encryption in transit and at rest.
Delta replica rebuilding using snapshot checksum.
Backing image update and download.

Multiple Backupstores and Default Backup Target

Starting with v1.8.0, Longhorn allows you to use multiple backupstores for storing backups of Longhorn volumes. Longhorn v1.8.0 also creates a default backup target (default) during installation and upgrades. The default backup target is used for system backups and volumes that were created without an assigned backup target name.
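
The fallback rule described above can be modeled in a few lines of Python. This is a sketch of the stated behavior only, not Longhorn's implementation: the function name `resolve_backup_target` and the example target name `s3-offsite` are hypothetical.

```python
from typing import Optional

DEFAULT_BACKUP_TARGET = "default"  # name of the backup target created by Longhorn v1.8.0


def resolve_backup_target(assigned_name: Optional[str]) -> str:
    """Return the backup target a volume would use under the rule described above."""
    name = (assigned_name or "").strip()
    # Volumes created without an assigned backup target name use the default target.
    return name if name else DEFAULT_BACKUP_TARGET


if __name__ == "__main__":
    print(resolve_backup_target(None))          # volume without an assigned target
    print(resolve_backup_target("s3-offsite"))  # volume with an explicitly assigned target
```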

Demo | Documentation | GitHub Issue

Automatic RWX Volume Expansion

Longhorn v1.8.0 supports fully automatic online expansion of RWX volumes without the need to scale down the workload or apply manual commands. To use this feature, ensure that the v1.8.0 versions of Longhorn Manager, Share Manager, and the CSI plugin are all running.

Documentation | GitHub Issue

Helm Controller

You can now install and upgrade Longhorn on clusters running RKE2 or K3s using the Helm Controller that is built into those distributions. The Helm Controller manages Helm charts using a HelmChart Custom Resource Definition (CRD), which contains most of the options that would normally be passed to the Helm command-line tool.
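
As an illustration of what such a resource might contain, the sketch below builds a HelmChart object as a Python dictionary and prints it as JSON, which kubectl also accepts. The values are assumptions for the example rather than the documented Longhorn procedure: the `kube-system` namespace for the resource, the `longhorn-system` target namespace, and chart version `1.8.0` should be checked against the Longhorn and K3s/RKE2 documentation before use.

```python
import json


def longhorn_helmchart(version="1.8.0", target_namespace="longhorn-system"):
    """Build a HelmChart resource for the Helm Controller (illustrative values)."""
    return {
        "apiVersion": "helm.cattle.io/v1",
        "kind": "HelmChart",
        # Assumption: the HelmChart resource itself is created in kube-system.
        "metadata": {"name": "longhorn", "namespace": "kube-system"},
        "spec": {
            "repo": "https://charts.longhorn.io",
            "chart": "longhorn",
            "version": version,
            "targetNamespace": target_namespace,
            "createNamespace": True,
        },
    }


if __name__ == "__main__":
    # The printed JSON can be saved to a file and reviewed before it is applied.
    print(json.dumps(longhorn_helmchart(), indent=2))
```

Upgrading through the Helm Controller then amounts to changing `spec.version` in the same resource, since the controller reconciles the release from the HelmChart object.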

Documentation | GitHub Issue

V2 Data Engine Support for Talos Linux

Longhorn v1.8.0 supports usage of V2 volumes in Talos Linux clusters. To use this feature, ensure that all nodes meet the V2 Data Engine prerequisites.

Documentation | GitHub Issue

Installation

Important

Ensure that your cluster is running Kubernetes v1.25 or later before installing Longhorn v1.8.0.

You can install Longhorn using a variety of tools, including Rancher, Kubectl, and Helm. For more information about installation methods and requirements, see Quick Installation in the Longhorn documentation.

Upgrade

Important

Ensure that your cluster is running Kubernetes v1.25 or later before upgrading from Longhorn v1.7.x to v1.8.0.

Longhorn only allows upgrades from supported versions. For more information about upgrade paths and procedures, see Upgrade in the Longhorn documentation.

Post-Release Known Issues

For information about issues identified after this release, see Release-Known-Issues.

Highlight

[FEATURE] Support encrypted v2 volumes 7355 - @mantissahz @chriscchien
[FEATURE] Support v2 volume delta replica rebuilding based on snapshot checksum: data plane and control plane 9488 - @shuo-wu @roger-ryao
[FEATURE] Multiple backup stores support 5411 - @mantissahz @roger-ryao
[FEATURE] Support v2 volume delta replica rebuilding based on snapshot checksum: SPDK part 5573 - @DamiaSan @roger-ryao
[IMPROVEMENT] v2 volume supports data locality 9371 - @derekbit @roger-ryao
[FEATURE] SPDK volume supports live migration 6361 - @PhanLe1010 @chriscchien @roger-ryao
[FEATURE] Support backing image for v2 volume 6341 - @yangchiu @ChanYiLin @chriscchien
[FEATURE] Make CPU core configurable for v2 data engine 8835 - @yangchiu @derekbit
[FEATURE] Support/verify installing Longhorn Helm chart using helm-controller which is built in k3s and rke2 9506 - @yangchiu @james-munson
[FEATURE] v2 volume supports auto salvage 8430 - @c3y1huang @chriscchien
[FEATURE] v2 volumes supports DR volume 6613 - @c3y1huang @roger-ryao
[FEATURE] Talos support for v2 data engine 7791 - @yangchiu @c3y1huang

Feature

[FEATURE] Support Data Engine Option For Backing Image in CSI flow 10084 - @ChanYiLin @roger-ryao
[FEATURE] Missing option to modify backup-task retain value from UI while editing recurring job [8810](#8810...

@mantissahz

DON'T UPGRADE from/to any RC/Preview/Sprint releases because the operation is not supported.

Resolved Issues in this release

Highlight

[FEATURE] Support encrypted v2 volumes 7355 - @mantissahz @chriscchien
[FEATURE] Support v2 volume delta replica rebuilding based on snapshot checksum: data plane and control plane 9488 - @shuo-wu @roger-ryao
[FEATURE] Multiple backup stores support 5411 - @mantissahz @roger-ryao
[FEATURE] Support v2 volume delta replica rebuilding based on snapshot checksum: SPDK part 5573 - @DamiaSan @roger-ryao
[IMPROVEMENT] v2 volume supports data locality 9371 - @derekbit @roger-ryao
[FEATURE] SPDK volume supports live migration 6361 - @PhanLe1010 @chriscchien @roger-ryao
[FEATURE] Support backing image for v2 volume 6341 - @yangchiu @ChanYiLin @chriscchien
[FEATURE] Make CPU core configurable for v2 data engine 8835 - @yangchiu @derekbit
[FEATURE] Support/verify installing Longhorn Helm chart using helm-controller which is built in k3s and rke2 9506 - @yangchiu @james-munson
[FEATURE] v2 volume supports auto salvage 8430 - @c3y1huang @chriscchien
[FEATURE] v2 volumes supports DR volume 6613 - @c3y1huang @roger-ryao
[FEATURE] Talos support for v2 data engine 7791 - @yangchiu @c3y1huang

Feature

[FEATURE] Support Data Engine Option For Backing Image in CSI flow 10084 - @ChanYiLin @roger-ryao
[FEATURE] Missing option to modify backup-task retain value from UI while editing recurring job 8810 - @houhoucoop @roger-ryao
[UI][FEATURE] Multiple backup stores support 8647 - @a110605 @roger-ryao
[UI][FEATURE] Support V2 Backing Image in UI 9880 - @yangchiu @chriscchien @houhoucoop
[FEATURE] Add periodic HugePages (2Mi) configuration check to ensure v2 data engine compatibility 9983 - @yangchiu @jangseon-ryu
[FEATURE][UI] Fill in the secret and secret namespace when restoring the BackupBackingImage 9490 - @chriscchien @houhoucoop
[FEATURE] Add kernel module check for data engine v2 9915 - @jangseon-ryu @roger-ryao
[FEATURE]: Automatic online filesystem resize for RWX volumes. 9736 - @yangchiu @james-munson
[FEATURE] Add recurring job label to backup metrics or empty value 9429 - @ChanYiLin @roger-ryao
[FEATURE] Longhorn CLI supports darwin platform 9532 - @derekbit @chriscchien

Improvement

[IMPROVEMENT] Bump nfs-ganesha in longhorn-share-manager from v6.3 to v6.5 10194 - @derekbit @chriscchien
[IMPROVEMENT] V2 Data engine avoids changing the data of a lvol when decoupling it from 9922 - @DamiaSan @roger-ryao
[IMPROVEMENT] Change the AIO disk block size to 512 bytes for v2 data engine 10053 - @derekbit @chriscchien
[IMPROVEMENT] No need to truncate block-type disk's StorageAvailable value 10121 - @derekbit @chriscchien
[IMPROVEMENT] Separate the default backup target settings from default settings 10089 - @mantissahz @roger-ryao
[IMPROVEMENT] Add support for JSON log format configuration in Longhorn components (UI, driver) 10064 - @IshinMV @chriscchien
[IMPROVEMENT] No need to start tgtd in an instance-manager pod for v2 data engine 9941 - @derekbit @roger-ryao
[IMPROVEMENT] Configure the log level of other system and user managed components via longhorn manager setting 6702 - @james-munson @roger-ryao
[IMPROVEMENT] Longhorn CLI should install cryptsetup 9315 - @mantissahz @roger-ryao
[IMPROVEMENT] Collect and display disk space usage for the backing images 8757 - @ChanYiLin @roger-ryao
[IMPROVEMENT] Check NFS versions in /etc/nfsmount.conf instead 9830 - @COLDTURNIP @roger-ryao
[IMPROVEMENT] Change misleading error message to warning level 9916 - @yangchiu @derekbit
[IMPROVEMENT] No need to redirect spdk_tgt log to /var/log/spdk_tgt.log 9926 - @derekbit @chriscchien
[UI][IMPROVEMENT] Make backup wait until there is no backup being delete and Add the progress time 8750 - @a110605
[IMPROVEMENT] Allow specify data engine version for the default storageclass during Helm installation 9584 - @shuo-wu @chriscchien
[IMPROVEMENT] Building longhorn-manager takes long time 8744 - @derekbit @chriscchien
[IMPROVEMENT] Reject strict-local + RWX volume creation 6735 - @COLDTURNIP @chriscchien
[IMPROVEMENT] Restored volume or other operations should be allowed on the old and running instance-manager pods 9383 - @derekbit @chriscchien
[IMPROVEMENT] Make backup deletion async and force backup creation wait until there is no backup being delete 8746 - @ChanYiLin @roger-ryao
[IMPROVEMENT][UI] Attach table search keyword to URL on Backup pages 9974 - @chriscchien @houhoucoop
[IMPROVEMENT] Burst ISCSI Connection Errors, and IM Pod Restarting to make LH Volume disconnection 9851 - @ChanYiLin @chriscchien
[IMPROVEMENT] Why is it not possible to change the replica count in v2 longhorn volume? 9805 - @hookak @chriscchien
[IMPROVEMENT][UI] Add backupBackingImage table in backup page with tabs 8956 - @chriscchien @houhoucoop
[IMPROVEMENT] Decrease the reconnect delay of v2 volume nvme initiator 9818 - @derekbit @chriscchien
[UI][IMPROVEMENT] Improve the volume size information on UI 8843 - @yangchiu @houhoucoop
[IMPROVEMENT] Add dmsetup and dmcrypt utilities check in cli 8217 - @COLDTURNIP @roger-ryao
[UI IMPROVEMENT] Collect and display disk space usage for the backing images 9325 - @chriscchien @houhoucoop
[IMPROVEMENT] Check kernel module dm_crypt on host machines 9153 - @mantissahz @roger-ryao
[IMPROVEMENT] Logging the reason why the instance manager pod is going to be deleted. 9886 - @derekbit @chriscchien
[IMPROVEMENT] Remove isCLIAPIVersionOne related codes 7191 - @derekbit @chriscchien
[IMPROVEMENT] Prevent Volume Resize Stuck 6633 - @c3y1huang @roger-ryao
[IMPROVEMENT] Expose EngineImage CLIAPIVersion and ControllerAPIVersion 6531 - @chriscchien @houhoucoop
[IMPROVEMENT] Update longhorn-share-manager's nfs-ganesha to v6.2 9614 - @derekbit @chriscchien
[UI][IMPROVEMENT] Display ReUploadedDataSize and NewlyUploadedDataSize of each backup on Backup page 8975 - @yangchiu @houhoucoop
[IMPROVEMENT] Remove the default CSI component images in longhorn-manager 9580 - @yangchiu @c3y1huang
[IMPROVEMENT] the if-not-present volume backup policy should also check if the latest backup is up-to-date 6027 - @c3y1huang @chriscchien
[IMPROVEMENT] longhorn/go-common-libs should support Darwin 9487 - @Yu-Jack @chriscchien
[IMPROVEMENT] Enable Prometheus metrics of the CSI sidecar components 7938 - @yangchiu @ChanYiLin
[IMPROVEMENT] Talos support for environment check in longhorn manager 9558 - @yangchiu @c3y1huang
[IMPROVEMENT] Refactor condition icon and text in edit node and disk modal 9238 - @a110605
[IMPROVEMENT] Read-only volu...
