-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Add multi-threading compression #1336
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
and automatic determination
no longer tied to Legacy format: receive buffer sizes as parameters.
- no crc - no stdin support - no dictionary support - relatively wasteful resource management
currently not active : behave as if blocks are still independent
- traces display line number - errors display traces (debuglevel >= 3)
Note: effectively limited to using the dictionary once for now, as opposed to once per block when blocks are independent (no impact when blocks are linked: dictionary is supposed to be used once anyway) Also : - clarifies that default lz4frame block size is 64 KB - refactor LZ4 Frame spec, dictionary paragraph - updated manual
now lz4 build defaults to single-thread mode
though it still builds single-threaded lz4
featuring time, speed, and cpu usage
mostly for C++98 compatibility (which benefit is dubious, this policy could be revisited)
and make it build the MT version of lz4 with an option that is enabled by default but can be disabled on demand. Also : - added make target mesonbuild, for easy local build - made huge tests friendlier to MT variant, for faster testing time
and some minor printf limitations on Windows.
also: provide MT-friendly time measurements for Linux/posix
c585380
to
043f625
Compare
also: display time/cpu statistics after legacy compression
t-mat
reviewed
Dec 30, 2023
} | ||
} else { | ||
/* some other error */ | ||
goto failed; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not happen. But we can think about the following scenario :
glpi()
returnsERROR_INSUFFICIENT_BUFFER
at the first time.buffer
is allocated,while()
loop continues.glpi()
fails at the second time. (for some reason)- Here, potentially
buffer
will be leaked.
and a potential double fclose()
In scenarios where there are many files, instead of recreating and freeing thread pool for every file do it once, and share it across all files. The impact is small, but can be measured: in a scenario compressing a lot of files of 5 MB at level 1 with 6 threads, compression time went down from 3 sec to 2.8 sec.
lz4 skips multithreading for small files. In a scenario invoking lz4 repetitively on a lot of small files, processing time goes down from 5.5 sec to 4.5 sec.
bennetthardwick
pushed a commit
to foxglove/mcap
that referenced
this pull request
Jun 12, 2025
### Changelog This PR adds two new write options: * `compression_level: u32` – compression level to use for zstd/lz4, or zero to use the default compression level. * `compression_threads: u32` - number of threads to use for compression, or zero to disable multithreaded compression. ### Docs None? Let me know if I need to update docs somewhere. ### Description Prior to this PR, it wasn't possible to set the compression level; the default lz4 and zstd compression levels were used. Now, it's possible to set the lz4 and zstd compression levels. Prior to this PR, non-wasm targets were forced to use multithreaded zstd compression ([reference](https://github.com/foxglove/mcap/blob/b404e42bd9ca6e9193ae994f7f3d752248ea8b06/rust/src/write.rs#L1368)). Now, it's possible to disable multithreaded compression, or to use a different number of compression threads. lz4 supports multithreaded compression as of v1.10.0 (see [[1]](https://github.com/lz4/lz4/releases/tag/v1.10.0) and [[2]](lz4/lz4#1336)), but this feature doesn't seem to be accessible from the [lz4](https://crates.io/crates/lz4) / [lz4-sys](https://crates.io/crates/lz4-sys) crates. For now, the `compression_level` option is ignored when using lz4 compression. Let me know if we should print a warning or fail. I made these write options rather than parameterizing the [`Compression`](https://github.com/foxglove/mcap/blob/b404e42bd9ca6e9193ae994f7f3d752248ea8b06/rust/src/lib.rs#L158) enum variants, because these particular options are orthogonal to the compression algorithm, and because they're only applicable to writers.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This patch adds multi-threading support for compression to the
lz4
CLI.Features :
Here are a few results using this patch :
This is a pretty big patch, and I would expect various issues to crop up during testings.
Therefore it's going to be important to test this feature thoroughly before considering building a new release around it.
Open discussions on #1334
Known current limitations :
pthread
(i.e. mostlyposix
environments)