[Data] Infer the data schema in Ray Datasets · Issue #35230 · ray-project/ray · GitHub
[Data] Infer the data schema in Ray Datasets #35230
Open
@zhe-thoughts

Description

Right now, with "strict mode" enabled by default, users need to make sure they use the right schema when passing functions to map or map_batches.

Ideally, we should infer the schema (I guess we already do that in ray.data.read_xxx calls). Another reference is Dask DataFrame: https://docs.dask.org/en/stable/generated/dask.dataframe.DataFrame.apply.html
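
One way to picture the inference being asked for is to run the user's function on a small sample batch and read the column types off the result. The sketch below is only an illustration under stated assumptions: it uses plain Python dicts of lists as a stand-in for a batch, and `Batch`, `infer_output_schema`, and `add_area` are hypothetical names, not part of Ray Data or Dask.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for a columnar batch: column name -> list of values.
Batch = Dict[str, List[Any]]


def infer_output_schema(fn: Callable[[Batch], Batch], sample: Batch) -> Dict[str, str]:
    """Run `fn` on a small sample batch and report each output column's type."""
    out = fn(sample)
    if not isinstance(out, dict):
        raise TypeError(f"expected a dict of columns, got {type(out).__name__}")
    schema = {}
    for name, values in out.items():
        # Collect the distinct Python type names seen in this column.
        kinds = sorted({type(v).__name__ for v in values})
        # Fall back to "object" for empty or mixed-type columns.
        schema[name] = kinds[0] if len(kinds) == 1 else "object"
    return schema


def add_area(batch: Batch) -> Batch:
    # Example user function: derive a new column from two existing ones.
    batch = dict(batch)
    batch["area"] = [w * h for w, h in zip(batch["width"], batch["height"])]
    return batch


sample = {"width": [1, 2], "height": [0.5, 1.5]}
schema = infer_output_schema(add_area, sample)
print(schema)
```

The trade-off with this approach is that the function runs once on sample data before the real job, so it only works well when the function is cheap and free of side effects on a small input.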

Use case

This will simplify users' mental model of batch processing with Ray Data.

Metadata

Assignees

Labels

P2: Important issue, but not time-critical
data: Ray Data-related issues
enhancement: Request for new feature and/or capability
pending-cleanup: This issue is pending cleanup. It will be removed in 2 weeks after being assigned.
