Add "thousands" option to CSV Reader by pdet · Pull Request #17220 · duckdb/duckdb · GitHub

Add "thousands" option to CSV Reader #17220

Merged: 8 commits, Apr 30, 2025

pdet (Contributor) commented Apr 23, 2025

This PR adds a new option called thousands to the CSV reader. It accepts a single character used to identify thousands separators in decimal, float, or double values.

By default, thousands is set to NULL. If specified, it must be different from the decimal_separator option.
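The constraints on the option can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not DuckDB's actual validation code; the helper name `validate_thousands` and its error messages are hypothetical.

```python
from typing import Optional


def validate_thousands(thousands: Optional[str],
                       decimal_separator: str = ".") -> Optional[str]:
    """Hypothetical helper: check a thousands option against the stated rules."""
    if thousands is None:
        # Default: no thousands-separator handling at all.
        return None
    if len(thousands) != 1:
        raise ValueError("thousands must be a single character")
    if thousands == decimal_separator:
        raise ValueError("thousands must differ from decimal_separator")
    return thousands


print(validate_thousands(None))      # None
print(validate_thousands(","))       # ,
print(validate_thousands(".", ","))  # .
```

Passing the same character for both options, for example `validate_thousands(".", ".")`, raises a `ValueError` in this sketch.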

When dealing with misplaced thousands separators, we follow the same approach as pandas: the separators are simply removed from the string, without checking whether they were placed correctly.
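A minimal Python sketch of this "strip and ignore" behavior follows. It is an assumption-laden illustration rather than the CSV reader's implementation: the function name `parse_number` is hypothetical, and it relies on Python's `float` for the final conversion.

```python
def parse_number(text: str, thousands: str = ",",
                 decimal_separator: str = ".") -> float:
    """Hypothetical sketch: drop every thousands separator, then parse."""
    # Remove all occurrences, without validating their positions.
    cleaned = text.replace(thousands, "")
    # Normalize the decimal separator so that float() can parse the value.
    if decimal_separator != ".":
        cleaned = cleaned.replace(decimal_separator, ".")
    return float(cleaned)


print(parse_number("1,234,567.89"))  # 1234567.89
print(parse_number("1,2,3.5"))       # 123.5 (misplaced separators are ignored)
print(parse_number("1.234,5", thousands=".", decimal_separator=","))  # 1234.5
```

The second call shows the consequence of this choice: a value such as `1,2,3.5` is accepted and read as `123.5` rather than rejected.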

The option has also been added to the Python client.

Fix: #17093
Fix: #15678
Fix: #6578
Fix: #16632
Fix: #6154

@duckdb-draftbot duckdb-draftbot marked this pull request as draft April 23, 2025 12:46
@pdet pdet marked this pull request as ready for review April 23, 2025 12:49
@duckdb-draftbot duckdb-draftbot marked this pull request as draft April 23, 2025 13:25
@pdet pdet marked this pull request as ready for review April 23, 2025 13:30
@duckdb-draftbot duckdb-draftbot marked this pull request as draft April 24, 2025 11:13
@pdet pdet marked this pull request as ready for review April 24, 2025 11:15
@Mytherin Mytherin merged commit cc5fc34 into duckdb:main Apr 30, 2025
49 checks passed
Mytherin (Collaborator) commented:

Thanks!

krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 18, 2025
Add "thousands" option to CSV Reader (duckdb/duckdb#17220)
Fix [InvokeCI / NotifyExternalRepository] GitHub Actions has encountered an internal error when running your job. (duckdb/duckdb#17218)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 19, 2025
Add "thousands" option to CSV Reader (duckdb/duckdb#17220)
Fix [InvokeCI / NotifyExternalRepository] GitHub Actions has encountered an internal error when running your job. (duckdb/duckdb#17218)
@pdet pdet deleted the thousands_parameter branch May 28, 2025 10:59
Successfully merging this pull request may close these issues.

read_csv() fails to recognize DOUBLE columns with a comma (thousands separator) and dot (decimal separator)