Open
Description
We should remove the internal Dask API dependencies from the Dask-on-Ray scheduler, implementing our own Rayification pass of the Dask graph.
Why
The Dask-on-Ray scheduler currently uses the local Dask scheduler to "Rayify" the Dask task graph, i.e. to turn the Dask task functions into Ray remote functions and properly unpack and repack task dependencies. While this does result in a shorter scheduler implementation, it also comes with a few disadvantages:
- The scheduler is relying on Dask internal APIs that may break at any time.
- The semantics of the custom callback hooks break, given that they execute at the time of Rayification, not at the time of actual task execution.
- It limits our ability to add other nice Ray-specific hooks to the Rayification process.
- Given that Ray task submission is so lightweight, all of the extra machinery in the local Dask scheduler is unnecessary overhead for the Rayification pass.
We should create our own version of the local Dask scheduler that's lightweight, adds additional Rayification hooks for callbacks, and purely relies on the Dask scheduler API and the Dask graph spec, not internal Dask APIs.