[Datasets] Raise descriptive error if `iter_torch_batches` can't convert data · Issue #32953 · ray-project/ray · GitHub
Open
@bveeramani

Description

If your dataset contains data that can't be converted to Torch tensors:

import ray


ds = ray.data.from_items([{"bytes": b"spam"}, {"bytes": b"ham"}])
next(iter(ds.iter_torch_batches()))

Then Torch raises an error:

TypeError: can't convert np.ndarray of type numpy.bytes_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.

It'd be nice if we provided a more descriptive error that describes the column name and provides a workaround:

ValueError: Column 'bytes' of type `numpy.bytes_` can't be converted to Torch tensors. To fix the issue, drop the column or specify a `collate_fn`.
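
A minimal sketch of how such a check could look, for illustration only. This is not Ray's actual implementation: the helper name `check_torch_convertible` and the `SUPPORTED_DTYPE_NAMES` constant are hypothetical, and the supported set is copied from the Torch error message above, so it may differ between Torch versions. The idea is to inspect each column's NumPy dtype before conversion and raise an error that names the offending column.

```python
import numpy as np

# Dtype names listed in the Torch TypeError quoted in this issue.
# Assumption: the set may differ for other Torch versions.
SUPPORTED_DTYPE_NAMES = frozenset({
    "float64", "float32", "float16", "complex64", "complex128",
    "int64", "int32", "int16", "int8", "uint8", "bool",
})


def check_torch_convertible(batch):
    """Raise a descriptive ValueError if a column can't become a Torch tensor.

    `batch` is assumed to be a dict mapping column names to array-likes.
    This helper is hypothetical and not part of Ray's API.
    """
    for name, column in batch.items():
        dtype = np.asarray(column).dtype
        if dtype.name not in SUPPORTED_DTYPE_NAMES:
            raise ValueError(
                f"Column {name!r} of type `numpy.{dtype.type.__name__}` can't be "
                "converted to Torch tensors. To fix the issue, drop the column "
                "or specify a `collate_fn`."
            )


# Usage: a bytes column triggers the descriptive error.
try:
    check_torch_convertible({"bytes": np.array([b"spam", b"ham"])})
except ValueError as error:
    print(error)
```

Checking against an explicit allowlist, rather than catching Torch's `TypeError` after the fact, lets the error name the column that caused the failure.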

Use case

Quality-of-life (QOL) improvement.

Metadata

Assignees

No one assigned

    Labels

    P2: Important issue, but not time-critical
    data: Ray Data-related issues
    enhancement: Request for new feature and/or capability
    pending-cleanup: This issue is pending cleanup. It will be removed in 2 weeks after being assigned.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests