Description
What happened + What you expected to happen
Running the reproduction script below fails with a confusing error message that does not tell the user how to fix the problem:
    block_iterator, stats, executor = ds._plan.execute_to_iterator()
  File "/Users/ekl/Library/Python/3.9/lib/python/site-packages/ray/data/exceptions.py", line 86, in handle_trace
    raise e.with_traceback(None) from SystemException()
ray.exceptions.RayTaskError(TypeError): ray::MapBatches(f)() (pid=56603, ip=127.0.0.1)
  File "/Users/ekl/Library/Python/3.9/lib/python/site-packages/ray/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/Users/ekl/Library/Python/3.9/lib/python/site-packages/ray/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/socket.py", line 273, in __getstate__
    raise TypeError(f"cannot pickle {self.__class__.__name__!r} object")
TypeError: cannot pickle 'socket' object
The expected error message is something more like this:
Checking Serializability of <ray.data._internal.execution.operators.map_transformer.MapTransformer object at 0x16fffa040>
================================================================================
!!! FAIL serialization: cannot pickle 'socket' object
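For illustration, here is a minimal standard-library sketch of a check that could point at the offending value in a batch. It is not how Ray implements its serializability check. The helper name `find_unpicklable` and the `batch[...]` path format are hypothetical, and it uses plain `pickle` rather than Ray's cloudpickle.

```python
import pickle
import socket


def find_unpicklable(obj, path="batch"):
    """Return (path, error message) pairs for values that fail to pickle.

    Hypothetical helper for illustration only; it is not part of Ray.
    """
    try:
        pickle.dumps(obj)
        return []
    except Exception as exc:
        failures = []
        # Descend into common containers to locate the offending member.
        if isinstance(obj, dict):
            for key, value in obj.items():
                failures += find_unpicklable(value, f"{path}[{key!r}]")
        elif isinstance(obj, (list, tuple)):
            for i, value in enumerate(obj):
                failures += find_unpicklable(value, f"{path}[{i}]")
        # If no child explains the failure, report this object itself.
        return failures or [(path, str(exc))]


# Same shape as the batch returned by f() in the reproduction script.
sock = socket.socket()
try:
    failures = find_unpicklable({"x": [sock]})
finally:
    sock.close()

for where, why in failures:
    print(f"!!! FAIL serialization at {where}: {why}")
```

A message that names the failing column and element would tell the user that the UDF's return value is the problem, rather than the UDF itself.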
Versions / Dependencies
Ray 2.32
Reproduction script
import ray
import socket

def f(x):
    return {"x": [socket.socket()]}

ds = ray.data.from_items([1, 2, 3])
ds = ds.map_batches(f)
ds.show()
Issue Severity
Medium: It is a significant difficulty but I can work around it.