Note: kvikio has, or will have, support for the functionality contained in this package. Please use its implementation (rapidsai/kvikio#646) instead, as I will personally be using it and encouraging others to do so as well.
Two zarr-python v3 compatible stores using kvikio (https://docs.rapids.ai/api/kvikio/stable/quickstart/) for remote and local data, plus (at least) one codec.
uv pip install cuda-zarr[cuda12]
Nvidia's documentation on how level and checksum are used in Zstd (the only codec exported here) is quite sparse, but testing seems to show that levels -7 to 22 all work. This codec only seems to work when used either for a full round trip (data is both written and read with it) or for reading only. If you write data with it, it seems you cannot read that data back in on the CPU.
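The level range noted above can be guarded with a small check before a codec is constructed. This is only a sketch: `validate_zstd_level` and the two constants are hypothetical names that are not part of cuda_zarr, and the bounds come from the informal testing described above rather than from Nvidia's documentation.

```python
# Bounds observed in informal testing; not an official guarantee.
ZSTD_MIN_LEVEL = -7
ZSTD_MAX_LEVEL = 22


def validate_zstd_level(level: int) -> int:
    """Return `level` unchanged if it lies in the tested range, else raise."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(level, bool) or not isinstance(level, int):
        raise TypeError(f"level must be an int, got {type(level).__name__}")
    if not ZSTD_MIN_LEVEL <= level <= ZSTD_MAX_LEVEL:
        raise ValueError(
            f"level {level} is outside the tested range "
            f"[{ZSTD_MIN_LEVEL}, {ZSTD_MAX_LEVEL}]"
        )
    return level
```

Failing early like this avoids finding out about an unsupported level only once data is being written on the GPU.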
import zarr
from cuda_zarr import ZstdGPU, CuFileStore, RemoteCuFileStore

# Register the GPU Zstd codec and switch zarr to GPU buffers.
zarr.registry.register_codec("zstd", ZstdGPU)
zarr.config.set({
    "codecs.zstd": f"{ZstdGPU.__module__}.{ZstdGPU.__name__}",
    "buffer": "zarr.core.buffer.gpu.Buffer",
    "ndbuffer": "zarr.core.buffer.gpu.NDBuffer",
})

store = CuFileStore('/path/to/store')
remote_store = RemoteCuFileStore.from_url("http://my_remote_data_server.com/path/to/the/store.zarr")
...
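The `codecs.zstd` entry in the config above is just the dotted import path of the codec class, built from `__module__` and `__name__`. A minimal sketch of that construction, runnable without a GPU, follows. `dotted_path` and `gpu_zarr_config` are hypothetical helper names and not part of cuda_zarr or zarr, and `fractions.Fraction` is used purely as a stand-in for `ZstdGPU`.

```python
from fractions import Fraction  # stand-in class; replace with ZstdGPU in real use


def dotted_path(cls: type) -> str:
    """Build the 'module.ClassName' string used as a config value."""
    return f"{cls.__module__}.{cls.__name__}"


def gpu_zarr_config(codec_cls: type) -> dict:
    """Assemble the same mapping that is passed to zarr.config.set above."""
    return {
        "codecs.zstd": dotted_path(codec_cls),
        "buffer": "zarr.core.buffer.gpu.Buffer",
        "ndbuffer": "zarr.core.buffer.gpu.NDBuffer",
    }


config = gpu_zarr_config(Fraction)
print(config["codecs.zstd"])  # fractions.Fraction
```

With the real codec, `zarr.config.set(gpu_zarr_config(ZstdGPU))` would be equivalent to the inline dictionary shown earlier.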
RemoteCuFileStore is not covered by unit tests with s3 (although http is tested). Also, RemoteCuFileStore only supports get, not set, via kvikio; in the set case it goes through normal CPU-based fsspec I/O.