-
Notifications
You must be signed in to change notification settings - Fork 6.5k
[data] gather dask tests into single test files #54163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR decouples general data tests from the optional Dask dependency by removing Dask-related assertions from existing test files and gathering all Dask integration tests into a new, dedicated test file.
- Remove Dask usage from generic transform and optimizer tests
- Introduce
test_ecosystem_dask.py
with all Dask-specific end-to-end tests - Update imports and test organization accordingly
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
test_transform_pyarrow.py | Removed a Dask-based assertion to eliminate Dask dependency from generic PyArrow tests. |
test_execution_optimizer.py | Deleted the Dask end-to-end test to keep optimizer tests independent of Dask. |
test_ecosystem_dask.py | Added new file containing all Dask integration tests and necessary helper imports. |
@@ -28,8 +29,37 @@ def test_from_dask(ray_start_regular_shared): | |||
assert df.equals(dfds) | |||
|
|||
|
|||
def test_from_dask_e2e(ray_start_regular_shared): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use pytest.importorskip('dask')
at the top of this test to skip it when Dask is not installed, preventing failures in environments without the optional dependency.
def test_from_dask_e2e(ray_start_regular_shared): | |
def test_from_dask_e2e(ray_start_regular_shared): | |
pytest.importorskip('dask') |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
....
dask.config.set({"dataframe.convert-string": False}) | ||
|
||
from ray.util.dask import ray_dask_get |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Consider using a context manager (with dask.config.set(...):
) or resetting the config after the test to avoid leaking this setting into other tests.
dask.config.set({"dataframe.convert-string": False}) | |
from ray.util.dask import ray_dask_get | |
with dask.config.set({"dataframe.convert-string": False}): | |
from ray.util.dask import ray_dask_get |
Copilot uses AI. Check for mistakes.
this is required for accepting #52589 |
so that general data tests no longer depend on dask, as dask dependency is optional. Signed-off-by: Lonnie Liu <lonnie@anyscale.com>
3ee7e80
to
c048531
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stamp
so that general data tests no longer depend on dask, as dask dependency is optional.