8000 [Dev] Make the various mappings in `MultiFileReaderData` typesafe by Tishj · Pull Request #16596 · duckdb/duckdb · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

[Dev] Make the various mappings in MultiFileReaderData typesafe #16596

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Mar 13, 2025

Conversation

Tishj
Copy link
Contributor
@Tishj Tishj commented Mar 11, 2025

This PR does not change anything functionality-wise

In the MultiFileReaderData we have many mappings, used in various contexts, and many of these mappings are tied to each other - if one changes, the other should also change.

All of these were using idx_t, which made it very confusing and error-prone to use and understand what type of index is required and produced.

This PR introduces the following structs as typesafe replacements:

  • MultiFileLocalColumnId
  • MultiFileLocalIndex
  • MultiFileGlobalIndex
  • MultiFileConstantMapIndex
  • MultiFileLocalColumnIds
  • MultiFileColumnMapping
  • MultiFileFilterMap
  • MultiFileConstantMap
  • MultiFileCastMap

column_ids, column_mapping, filter_map, constant_map and cast_map are indexed with and produce the typed indexes
MultiFileFilterEntry has been changed to produce MultiFileConstantMapIndex or MultiFileLocalIndex depending on the way it was set.

@Tishj
Copy link
Contributor Author
Tishj commented Mar 11, 2025

In the future perhaps we want to look into making these actual types, so the type system will actually help in disallowing indexes to not be used in the wrong contexts, but for now I think the semantic changes should help with development / reasoning about the code.

This PR now also does this proposed next step of making it actually typesafe

For example the logic in ParquetReader::ScanInternal is a lot clearer now, how given the filter's (global) filter_idx we look up the MultiFileFilterEntry, which gives us the local idx, which we use to find the column_id from the column_ids and also the (global) result_idx (which we can now see is redundant because that should be the same as filter_idx).

@duckdb-draftbot duckdb-draftbot marked this pull request as draft March 12, 2025 14:10
…tant index or a local index, and retrieved in the same way
@Tishj Tishj changed the title [Dev] Add some typedefs to distinguish between the various indexes used in the MultiFileReaderData [Dev] Make the various mappings in MultiFileReaderData typesafe Mar 12, 2025
@Tishj Tishj marked this pull request as ready for review March 12, 2025 14:45
Copy link
Collaborator
@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good to me (after CI passes).

@Tishj Tishj marked this pull request as draft March 12, 2025 15:46
@Tishj Tishj marked this pull request as ready for review March 12, 2025 15:46
@Mytherin Mytherin merged commit 81a9e52 into duckdb:main Mar 13, 2025
49 of 50 checks passed
@Mytherin
Copy link
Collaborator

Thanks!

krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 15, 2025
[Dev] Make the various mappings in `MultiFileReaderData` typesafe (duckdb/duckdb#16596)
Issue template: direct UI issues to the UI repository (duckdb/duckdb#16619)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 15, 2025
[Dev] Make the various mappings in `MultiFileReaderData` typesafe (duckdb/duckdb#16596)
Issue template: direct UI issues to the UI repository (duckdb/duckdb#16619)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 16, 2025
[Dev] Make the various mappings in `MultiFileReaderData` typesafe (duckdb/duckdb#16596)
Issue template: direct UI issues to the UI repository (duckdb/duckdb#16619)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 17, 2025
[Dev] Make the various mappings in `MultiFileReaderData` typesafe (duckdb/duckdb#16596)
Issue template: direct UI issues to the UI repository (duckdb/duckdb#16619)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 18, 2025
[Dev] Make the various mappings in `MultiFileReaderData` typesafe (duckdb/duckdb#16596)
Issue template: direct UI issues to the UI repository (duckdb/duckdb#16619)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 18, 2025
[Dev] Make the various mappings in `MultiFileReaderData` typesafe (duckdb/duckdb#16596)
Issue template: direct UI issues to the UI repository (duckdb/duckdb#16619)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0