Description
If your dataset contains data that can't be converted to Torch tensors:
import ray
ds = ray.data.from_items([{"bytes": b"spam"}, {"bytes": b"ham"}])
next(iter(ds.iter_torch_batches()))
Then Torch raises an error:
TypeError: can't convert np.ndarray of type numpy.bytes_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
It'd be nice if we raised a more descriptive error that names the offending column and suggests a workaround:
ValueError: Column 'bytes' of type `numpy.bytes_` can't be converted to Torch tensors. To fix the issue, drop the column or specify a `collate_fn`.
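A rough sketch of what such a check might look like, using only NumPy. The helper name `check_batch_convertible` and the `TORCH_SUPPORTED_DTYPES` set are hypothetical and not part of Ray. The dtype list is copied from the Torch error message above and may differ across Torch versions.

```python
import numpy as np

# Assumption: dtype names taken from the Torch error message quoted in this issue.
TORCH_SUPPORTED_DTYPES = {
    "float64", "float32", "float16", "complex64", "complex128",
    "int64", "int32", "int16", "int8", "uint8", "bool",
}


def check_batch_convertible(batch):
    """Hypothetical helper: raise a descriptive error for unsupported columns.

    `batch` maps column names to NumPy arrays. The batch is returned
    unchanged if every column has a supported dtype.
    """
    for name, column in batch.items():
        arr = np.asarray(column)
        if arr.dtype.name not in TORCH_SUPPORTED_DTYPES:
            raise ValueError(
                f"Column {name!r} of type `numpy.{arr.dtype.type.__name__}` "
                "can't be converted to Torch tensors. To fix the issue, "
                "drop the column or specify a `collate_fn`."
            )
    return batch


# Example batch mirroring the reproduction above.
batch = {"bytes": np.array([b"spam", b"ham"])}
try:
    check_batch_convertible(batch)
except ValueError as err:
    print(err)
```

Until something like this lands, the two workarounds named in the message should already be available: `ds.drop_columns(["bytes"])` before iterating, or passing a custom `collate_fn` to `iter_torch_batches`.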
Use case
Quality of life (QOL): a clearer error makes it easier to find the offending column and fix it.