DuckDB 1.3.0 "Ossivalis"

@Mytherin

This release of DuckDB is named "Ossivalis" after Bucephala ossivalis, an ancestor of the goldeneye duck that lived millions of years ago.

Please also refer to the announcement blog post: https://duckdb.org/2025/05/21/announcing-duckdb-130

What's Changed

V1.2 histrionicus by @Mytherin in #16070
V1.2 histrionicus by @Mytherin in #16072
unittests: clear test directory after every test by @Mytherin in #16053
Benchmark runner: catch and log errors + add support for retry load N syntax by @Mytherin in #16054
Throw an error when unsupported commands are used in concurrentloop by @Mytherin in #16009
Remove extension definitions to prevent re-compilation of the entire system on commit by @Mytherin in #15955
Display schema information of currently selected database only by @ashwaniYDV in #15815
Issue #14366: Average Intervals by @hawkfish in #15864
Internal #2176: Temporal AVG by @hawkfish in #15661
discussions #15981: remove confusing comment in "duckdb/tools/shell/shell.cpp" by @komainu8 in #15984
Fix #15466 Transform LIMIT or OFFSET first based on order specified in prepared statement by @ashwaniYDV in #15484
Bitpacking mode info by @arjenpdevries in #15623
Sniff Timestamp_TZ from CSV Files by @pdet in #15730
[no-op] Add documentation for filesystem read behavior by @dentiny in #15937
Accept "Auto" as date/timestamp format by @pdet in #15808
Parquet Reader Cleanup: Move ColumnReaders to separate files by @Mytherin in #16092
Parquet Reader: Move decoding logic into separate Decoder classes by @Mytherin in #16100
BundleStaticLibs to be also triggered by InvokeCI by @carlopi in #16107
Parquet Reader: Split DeltaLengthByteArray decoder from DeltaByteArray, and read the strings in a streaming manner by @Mytherin in #16105
Parquet Dictionary reader: set NULL values as the last value in the dictionary by @Mytherin in #16106
Parquet Reader: Share ResizeableBuffers across decoders, and unify Plain/PlainReference by @Mytherin in #16113
Using GitHub ARM runners for Linux CLI builds by @hannes in #16119
Parquet Reader: Implement dedicated Skip method by @Mytherin in #16117
Use ColumnSegment::FilterSelection and SelectionVector for filtering in Parquet scans by @Mytherin in #16126
[Dev] Fix output (long lines > 333 characters) getting truncated in shell by @Tishj in #16128
Adaptive table filter: initialize filter order based on heuristics by @Mytherin in #16127
Feature #16044: TimeZone Offset Seconds by @hawkfish in #16048
ATTACH OR REPLACE database to allow swapping of new data. by @xevix in #15355
[Dev] Remove upsert_conflict_in_different_chunk.test by @Tishj in #15980
[Dev] Fix issue related to unpacked columns and the NOT operator by @Tishj in #15534
[Julia] Add support for named params in prepared statements by @tqml in #15621
Use Adaptive Filters in the Parquet reader by @Mytherin in #16133
Parquet reader: push table filters directly into dictionaries by @Mytherin in #16136
Parquet reader: Plain templates - make CHECKED a template parameter, and use memcpy/bulk skip when reading/skipping without defines by @Mytherin in #16141
Parquet reader: only set invalid entry in the dictionary when the column has defines by @Mytherin in #16144
Add uniq_ptr_cast for interpreted benchmark. by @Tmonster in #16151
Hopefully fixing ci runs by @hannes in #16150
Removed the last CI job that used the Ubuntu 18 setup by @hannes in #16155
Parquet Reader: Split CreateReader into two separate stages - ParseSchema and CreateReader by @Mytherin in #16161
Have CSV Parallel tests on CI again by @pdet in #16164
[Python][Dev] Bump the minimum pybind11 version from 2.6 to 2.9 by @Tishj in #16159
Add StackTraces to FatalExceptions by @NiclasHaderer in #16158
Rework invoke by @carlopi in #16108
Adds pre-optimization hooks for DuckDB by @NiclasHaderer in #16115
Unify behavior of range/generate_series with PostgreSQL by @kryonix in #15935
[CI] Avoid Linux CLI jobs to fail-fast by @carlopi in #16173
Parquet: Add dedicated Select method that can be used to push selection vectors into the read by @Mytherin in #16174
Unvendor ICU by @m-kuhn in #16176
Parquet reader: batch check if buffer is available in RLEBpDecoder by @Mytherin in #16185
Parquet Reader: for DeltaLengthByteArray encoding, directly refer to strings from the block without copying by @Mytherin in #16186
unified names for duckdb-extensions by @hmeriann in #16179
Only delete test directory when --test-temp-dir is not specified by @Mytherin in #16192
Fix #16163: COLUMNS should not treat identifiers as strings by @Mytherin in #16193
Parquet reader: Avoid applying bloom filters if we are casting columns by @Mytherin in #16194
Pretty print sniffer values by @pdet in #16182
V1.2 histrionicus by @Mytherin in #16191
Bump Julia by @Mytherin in #16199
Ensure that dependent targets are present after find_package. by @BillyONeal in #16197
Concurrency groups for R and Wasm by @hmeriann in #16201
Parquet Writer Cleanup: Move ColumnWriters to separate files by @Mytherin in #16202
[fix] Use bigobj when building with MSVC by @m-kuhn in #16200
Improve performance of UNNEST/UNPIVOT by using selection vectors to unnest multiple lists at once by @Mytherin in #16210
Add the TRY expression by @Tishj in #15939
[Python][Dev] Replace the default connection when it's closed by @Tishj in #16160
Use steady clock for profiler by @dentiny in #16198
Add parallel memset when building hash join table by @hehezhou in #16172
Avoid unnecessarily projecting the input columns of the UNPIVOT operator in the UNNEST by @Mytherin in #16221
Left join push down optimization by @Damon07 in #15881
Do In-Filter pushdown in PyArrow by @pdet in #16224
Use _win32 with MSVC by @cfis in #16235
Fix Python 3 executable name on Windows by @cfis in #16236
Fix -std=c++11 by @cfis in #16237
Issue #8265: AsOf Nested Loop by @hawkfish in #16218
Include extension_util.hpp in libduckdb by @mlafeldt in #16255
Generalize rowid into the concept of virtual columns, and make filename a virtual column in the Parquet/CSV/JSON readers by @Mytherin in #16248
Modify histogram test to more fuzzily check boundaries since the test can be inconsistent on different platforms by @Mytherin in #16261
[Dev] Fix issue in TRY expression with dictionary_expression Vector verification by @Tishj in #16262
[Python Dev] Add the correct variant of the -std=c++11 flag based on the compiler (MSVC or not) by @Tishj in #16267
Fix extension install mode null by @samansmink in #16268
A little cleanup. by @JasonPunyon in #16028
Improve Parquet writer performance by @lnkuiper in #16243
Merge v1.2-histrionicus into main by @Mytherin in #16284
Many reclaim storage fixes by @taniabogatsch in #15825
Arena allocator for MinMaxN and skip NULLs when creating enum by @lnkuiper in #16246
Add pragma to truncate duckdb log storage by @samansmink in #16274
Some more Parquet writer performance improvements by @lnkuiper in #16287
Do duckdb_extract_statements to be able to execute pivot in ADBC by @pdet in #16162
[Dev] Improve/Add handling of escapes in VARCHAR -> list/struct/map and align behavior by @Tishj in #15944
make ValidityMask::RowIsValidUnsafe really unsafe by @xuke-hat in #16302
Multi File Reader Rework: Add MultiFileReaderFunction that is used to wrap a single-file reader, and use it for the Parquet reader by @Mytherin in #16299
[Python Dev] Add support for fully qualified references in .table(...) method by @Tishj in #16291
[Dev] MultiFileReader - Add to the column_indexes for file_row_number by @Tishj in #16311
Parquet reader performance by @lnkuiper in #16315
Bump Julia FixedPointDecimals dependency version by @mbarbar in #16323
Merge V1.2 histrionicus into main by @Mytherin in #16324
Add new recursive semantics (USING KEY) by @cryoEncryp in #12430
fix: add StringStats::SetMaxStringLength by @rustyconover in #16326
Fix decorrelation of WITH USING KEY by @kryonix in #16330
Issue #16250: Window Range Performance by @hawkfish in #16320
Verify UTF-8 in DeltaLengthByteArrayDecoder and speed it up by @lnkuiper in #16328
Add missing include by @Mytherin in #16342
[chore] No ccache for OSX Python by @carlopi in #16348
Linux CLI: override platform for ARM manylinux by @carlopi in #16347
docs: tweak explanation of median for even cardinality inputs by @NickCrews in #13726
[parquet] Fix implicit idx_t to int64_t conversion flagged by clang-tidy by @carlopi in #16368
Improve error message on failure to install local extension by @carlopi in #16371
MAIN_BRANCH_VERSIONING: main branch to get descriptors like v1.3.0-dev1234 instead of v1.2.1-dev1234 by @carlopi in #16366
Parallel HT Zeroing: Set entries_per_task so that there are 4x more tasks than threads by @gropaul in #16301
Internal #2176: SUMMARIZE Temporal Types by @hawkfish in #16095
[SwiftRelease CI] fetch tags before checking there is already a tag with the same name by @hmeriann in #16376
Push Select into ArrayColumnData to avoid scanning arrays that are not required for the query by @Mytherin in #16356
Revert "Linux CLI: override platform for ARM manylinux" by @carlopi in #16374
Rework CSV Reader: use the new MultiFileReaderFunction interface by @Mytherin in #16349
Add connection and transaction identifiers by @samansmink in #16296
Add parquet 'unknown' logical type by @hannes in #16378
Internal #4287: INTERVAL Times DOUBLE by @hawkfish in #16386
pb/compressed vector serialization by @peterboncz in #16066
Fix issue #16377 by @kryonix in #16391
Read support for Parquet Float16 by @hannes in #16395
MAIN_BRANCH_VERSIONING: Adopt also for Python build and amalgamation by @carlopi in #16400
Fuzzer Fix: Fix Avg for NULL cast to TIMESTAMP by @Tmonster in #16394
[FriendlySQL] Expand functionality of the Unpacked COLUMNS expression by @Tishj in #16290
Python Client: Faster Python Object Conversion by @Mytherin in #16431
Fixup #16400 by correctly passing down SETUPTOOLS_SCM_PRETEND_VERSION by @carlopi in #16435
Issue #16250: Window Range Performance by @hawkfish in #16438
Merge v1.2-histrionicus into main by @Mytherin in #16439
MAIN_BRANCH_VERSIONING: Add also prefix_version by @carlopi in #16441
[no-op] Remove unused function GetValueRefUnsafe by @dentiny in #16440
Filter Combiner Clean-up: move filter pushdown to separate functions, remove old commented out code by @Mytherin in #16443
[Python] Add the SQLExpression method to the Expression API by @Tishj in #16424
[Dev] Mention the problematic type in UNNEST BinderException by @Tishj in #16429
Merge v1.2 into main again by @Mytherin in #16447
Filter Combiner: Allow rowid pushdown for IN/OR filters and pushdown for temporal types by @Mytherin in #16450
Parquet: always launch max threads if we are scanning multiple files by @Mytherin in #16457
fix documents of C functions by @yiyuanliu in #16357
Add a TableFilterState for execution of table filters by @Mytherin in #16461
Mirror discussions to the internal repository by @szarnyasg in #16464
Rework JSON Reader: use the new MultiFileReaderFunction interface by @Mytherin in #16477
Speed-up contains by using memchr on every iteration by @Mytherin in #16484
Fix error cases by @Y-- in #16494
Prevent external joins (if possible) by @lnkuiper in #16430
Merge v1.2 into main by @Mytherin in #16517
Optimize FSST decoding by @lnkuiper in #16508
Extract subsystem by name by @dentiny in #16226
Avoid throwing an exception (that is then swallowed) when computing compressed materialization over stats that are not set by @Mytherin in #16532
Checksum backward compatibility by @lnkuiper in #16505
Prefetch Parquet page header by @lnkuiper in #16507
Let GitHub render *.test files as SQL by @mlafeldt in #16534
Fix ADBC to properly quote table and schema names by @CurtHagenlocher in #16526
Pass ClientContext to catalog initialize, and postpone index binding when replaying the WAL by @Mytherin in #16536
Allow UNITTEST_ROOT_DIRECTORY to be configured through CMake by @Mytherin in #16540
Internal #4347: ISO Year Week by @hawkfish in #16567
throw() -> noexcept in skiplist by @r-barnes in #16548
Fix test/sql/aggregate/aggregates/histogram_table_function.test to pass the Linux CLI (arm64) CI by @hmeriann in #16538
feat: move GRANT from reserved to unreserved keyword by @stephaniewang526 in #16546
Python test runner: Avoid enabling profiling when executing restart command by @Flogex in #16547
Add duckdb_prepared_statements by @Tishj in #16541
[minor] Keep bit type sanity check consistent by @dentiny in #16575
Support CREATE TABLE AS ... WITH NO DATA by @hannes in #16586
Parquet FLOAT16 - fix cast by @hannes in #16580
remove invalid tokens from nanosecond example by @hamilton in #16577
CrossVersion.yml: Add v1.2.1, v1.2-histrionicus and main by @carlopi in #16576
Fix #16524: DEPENDENT_JOIN may not flatten by @flashmouse in #16537
[Julia] Add support for appending duckdb List types by @era127 in #16512
[PySpark] - Add expr function by @mariotaddeucci in #16468
regex_replace no longer swallows regex errors by @hannes in #16380
Parquet Writer Clean-up: Split CreateWriterRecursive into two methods, and use ParquetColumnData for writer as well by @Mytherin in #16592
Bump Julia to 1.2.1 by @Mytherin in #16593
Improved appender error message by @NiclasHaderer in #16599
Change static variables to be on the stack instead by @Y-- in #16597
Add support for RETURN_STATS to COPY by @Mytherin in #16595
Better error messages for the CSV Scanner by @pdet in #16585
Support Enum types in read_csv - Python by @pdet in #15710
Fix CI Tidy by @pdet in #16610
Add some minor helper functions (QueryResultIterator::IsNull and casts to MultiFileList/Reader) by @Mytherin in #16611
Add support for ALTER TABLE tbl SET PARTITIONED BY (key1, key2, ...) in the grammar by @Mytherin in #16612
Issue template: direct UI issues to the UI repository by @szarnyasg in #16619
[Dev] Make the various mappings in MultiFileReaderData typesafe by @Tishj in #16596
Bump mbedtls to 3.6.2 and re-apply patches by @hannes in #16485
Read and Write Complex Json from Arrow Types by @pdet in #16385
Add Docker support for RISC-V CI with appropriate build commands by @mocusez in #16549
Fix missing **kwargs in adbc_driver_duckdb.dbapi.connect() by @davlee1972 in #16637
[Dev] Clean up and fix the CGroup memory/cpu limit discovery logic by @Tishj in #16608
Expose Value::ToSQLString() in C API by @mt-caret in #16471
Add the missing binding for json_serialize_sql by @liznear in #16666
Do not create validity mask for non-null const vector by @xuke-hat in #16669
Fix #16665: fix parquet multi_reader bloom_probe logic error by @flashmouse in #16677
Add alias to catalog by @c-herrewijn in #16600
Decouple physical operator ownership from operators by @taniabogatsch in #16545
cmake: fix external icu by @autoantwort in #16676
Character length and date functions by @hannes in #16653
[Dev] Don't try to include third_party/mbedtls/VERSION with package_build.py by @Tishj in #16683
Add -ui to CLI help text by @akx in #16626
Fix alias of column reference lost in ReplaceProjectionBindings by @Damon07 in #16686
Merge v1.2-histrionicus into main by @Mytherin in #16687
Fix for GCC-4.8 by @Mytherin in #16690
JSON Reader: make read_position atomic so this can be read by the progress bar while processing the JSON file by @Mytherin in #16692
[Julia] support binding for vectors by @slwu89 in #16701
Make CSV Parser strict_mode=True fail on a mix of new line delimiters. by @pdet in #15959
[pypi] Fix cleanup logic for multiple branches by @hmeriann in #16634
Add support for ALTER TABLE tbl SET SORTED BY (key1 DESC, key2, ...) in the grammar by @Mytherin in #16714
RETURN_STATS: remove footer_offset, and emit written partition keys by @Mytherin in #16715
In case all rows of a CSV batch are errors, we continue processing by @pdet in #16713
add workaround for patching httpfs ext by @samansmink in #16722
Implement UUID v7 by @dentiny in #15819
Fix roundtripping of stringified nested types by @Tishj in #16304
Add Notify External Repositories Workflow by @maiadegraaf in #16730
Expose a selection vector and the Slice method to the C API by @joseph-isaacs in #16696
Add support for tracking column_size_bytes and contains_nan in RETURN_STATS by @Mytherin in #16731
Add support for WRITE_EMPTY_FILE option to COPY - which allows skipping of writing empty files by @Mytherin in #16737
Parquet Writer: Truncate string stats for large strings, instead of bailing on writing stats by @Mytherin in #16736
RLE compression - memset alignment bytes to zero when aligning the counts by @Mytherin in #16735
Write UUID stats to Parquet files and support reading uuid stats by @Mytherin in #16744
Add an initial value to list_reduce by @maiadegraaf in #16602
shell: make -bail work for more errors by @mlafeldt in #16594
Call Notify External Repositories from Invoke CI by @maiadegraaf in #16747
JSON bugfixes by @lnkuiper in #16729
Add support for dynamically providing extra info post-execution in table functions, and use this to emit the total number of files read by the MultiFileReader by @Mytherin in #16749
[Python Dev] Fix the versioning of the nightly python builds by @Tishj in #16739
shell: fix sometimes-uninitialized error by @mlafeldt in #16761
Issue #16250: Window Range Performance by @hawkfish in #16765
Avoid building Python 3.7 wheels also for Linux by @carlopi in #16769
Pyodide 0.27.2: conditionally skip tests by @carlopi in #16772
Push catalog lookups through an extensible EntryLookupInfo struct by @Mytherin in #16764
Fix two minor problems with NotifyExternalRepositories / odbc by @carlopi in #16776
update expected results reflecting the changes brought up with the "Fix roundtripping of stringified nested types" PR by @hmeriann in #16775
Merge V1.2 -> Main by @pdet in #16751
Add support for time travel syntax in the FROM clause by @Mytherin in #16774
Python docs: List all join types by @szarnyasg in #16789
[chore] NotifyExternalRepositories.yml: Fix endpoint to be pinged by @carlopi in #16793
Remove delta from extensions built on a nightly basis (vs main branch) by @carlopi in #16795
OSX.yml & Windows.yml: remove repository_dispatch, already handled by InvokeCI by @carlopi in #16796
Make extensions be linked privately into duckdb by @JAicewizard in #16726
Add additional iterations to avoid assertion failure in TemporaryMemoryManager by @lnkuiper in #16801
Change the STANDARD_MASK_SIZE calculation to use size of template type. by @sebastiaan-dev in #16807
Fix nightly table sample error by @Tmonster in #16811
Fix tidy by @pdet in #16805
support 'categories' label in function catalog by @c-herrewijn in #15654
regenerate function headers by @c-herrewijn in #16822
Internal #4490: Window Jump Reset by @hawkfish in #16816
Regression.yml: Actually checkout proper base.sha commit by @carlopi in #16824
fix: drop useless python import by @yihong0618 in #16808
NightlyTests.yml: Inline env variables into build command by @carlopi in #16817
Benchmark runner summary by @hmeriann in #16759
Add storage_version 66 for version 1.3.0 by @carlopi in #16800
Revert "fix: drop useless python import" by @Mytherin in #16834
[MultiFileReader] Rework MultiFileReader::FinalizeChunk to use Expressions by @Tishj in #16630
Merge v1.2 into main by @Mytherin in #16832
Fix NULL key handling in mark join by @xuke-hat in #16825
compressed vector serialization fixes by @peterboncz in #16648
really sorry about this by @peterboncz in #16840
Fix Python docstrings for unique by @szarnyasg in #16845
[MultiFileReader] Create "local" filters to hand to underlying readers by @Tishj in #16838
Revert "Regression.yml: Actually checkout proper base.sha commit" by @Mytherin in #16860
[ART] Immediately erase empty fixed-size buffers by @taniabogatsch in #16727
Resolve defaults and column index map by pushing a Projection (instead of executing in the insert itself) by @Mytherin in #16867
Fix issue with sorting dev versions in pypi_cleanup.py script to keep on PyPi the most recent dev versions by @hmeriann in #16873
Allow filters to be pushed through joins if there are projection maps by @lnkuiper in #16871
Expressions in create secret by @samansmink in #15801
Python - Arrow IPC support in from_arrow by @pdet in #16821
[ART] Introduce a new ARTScanner and make InitMerge and Vacuum iterative by @taniabogatsch in #16861
Do not pushdown filters which bindings only match the right side of the left join by @Damon07 in #16880
MultiFileReader Rework (part 17) - remove MultiFileReaderData - and move as much as possible out of the file readers by @Mytherin in #16882
ICU: Unify TimeZone accessing code by @Mytherin in #16887
Rework ICU age computation to convert to a timestamp and use the regular interval age computation by @Mytherin in #16889
Reduce allocations during aggregations by @lnkuiper in #16849
CI: Prevent marking issues as 'stale' if they have the 'no stale' label by @szarnyasg in #16903
Add field name to log line which fails Parquet spec by @jsbali in #16862
Internal #4490: Window Threading Cleanup by @hawkfish in #16879
Adding gzip version of shell for linux/osx install script by @hannes in #16116
Fix USING KEY reference error by @kryonix in #16906
[Nested] Enable Varargs in LIST_CONCAT by @maiadegraaf in #16870
Fix several issues with vsize=2, and move vsize=2 tests to Main.yml by @Mytherin in #16918
C API comments: Fix a/an typos by @szarnyasg in #16925
Reduce locking with FILE_SIZE_BYTES/ROW_GROUPS_PER_FILE in Parquet writer by @lnkuiper in #16928
[Python] Fix annotation of condition argument in join so it accepts Expression by @MarcoGorelli in #16933
Fix GCC 4.8 and add it back to Main workflow by @Mytherin in #16937
Merge v1.2 into main again by @Mytherin in #16939
MultiFileReader - Perform nested remapping of field indexes instead of relying on casts by @Mytherin in #16941
Internal #4552: Short Circuit CSE by @hawkfish in #16931
Add back manylinux extensions by @carlopi in #16944
Run CI on merge group by @Mytherin in #16945
Internal #4516: Interval BIGINT Variants by @hawkfish in #16904
Split query string for multi-statement queries by @Mytherin in #16955
Vector Verification: Rework to run based on env variable DUCKDB_DEBUG_VERIFY_VECTOR and move to Main.yml by @Mytherin in #16957
Move the no string inline/alternative verify workflow to Main.yml by @Mytherin in #16958
[Python] Tighten type annotations on shape and columns by @MarcoGorelli in #16948
Pass down CMAKE_POLICY_VERSION_MINIMUM and fix for local development by @carlopi in #16953
[ART] Use the ARTScanner for VerifyAllocations (make it iterative) by @taniabogatsch in #16946
Move ThreadSanitizer test from nightly test to Main, and fix locking issue by @Mytherin in #16960
Re-enable workflows to run on PRs by @Mytherin in #16961
Fix for selecting NaN values from Parquet files by @Mytherin in #16962
Move LatestStorage tests to NightlyRelease - and fix issue with overflow string blocks not being cleaned up correctly by @Mytherin in #16972
Arena-allocate physical operators by @taniabogatsch in #16911
Make file_row_number a virtual column, and support per-file virtual columns in the MultiFileReader by @Mytherin in #16979
Add a setting scheduler_process_partial that allows partial scheduling of tasks in the background threads by @Mytherin in #16973
Clean up format script, gather all files then run concurrently instead of running concurrently per directory by @Mytherin in #16988
Add support for altering struct columns (adding fields, dropping fields, renaming fields) by @Mytherin in #17003
Fix CSV fuzzer tests by @pdet in #16994
[Fix] Keep original expression for macro + lambda's with subqueries by @taniabogatsch in #17020
Detect when tables have been dropped or altered, and prevent deletes in this scenario by @Mytherin in #17018
Update links pointing to duckdb.org by @szarnyasg in #16999
Fix for joining on floating columns #16901 by @nickzoic in #16965
fix: remove unused stream struct member from ArrowScanLocalState by @rustyconover in #17023
[Dev] Use UnifiedVectorFormat instead of a flattened Vector in UpdateSegment::Update by @Tishj in #16974
Remove Arrow Extension from core extensions by @pdet in #17027
Correctly propagate ClientContext to TaskExecutor by @ywelsch in #17026
Issue #17001: AsOf memory Management by @hawkfish in #17028
[MultiFileReader] Make it possible for the multi file reader to add a DeleteFilter to the BaseFileReader by @Tishj in #17032
Add optional OVERRIDE_NEW_DELETE build parameter by @lnkuiper in #17035
Clean-up virtual columns and make MultiFileReader::InitializeReader virtual by @Mytherin in #17038
Allow a table to define their own row-id columns for delete/update, instead of assuming it is always COLUMN_IDENTIFIER_ROW_ID by @Mytherin in #17039
Handle Parquet with compressed empty DataPage v2 by @EnricoMi in #17031
Combine small row groups in Parquet writer by @lnkuiper in #17036
Merge v1.2.2 into main by @carlopi in #17037
implement function so I can send a patch to httpfs by @lnkuiper in #17048
FORCE_ASYNC_SINK_SOURCE: pass also to unittester by @carlopi in #17053
If a Max Line Size Error happens on all CSV dialect candidates, throw a max line size error. by @pdet in #16935
Expose BindExtraColumns as a public function by @Mytherin in #17060
trigger .github/workflows/NightlyBuildsCheck.yml from external repo by @hmeriann in #16949
Minor parquet crypto clean-up: allow footer key to be passed in directly, and avoid constantly re-reading the key from the config by @Mytherin in #17070
update julia to v1.2.2 by @Maxxen in #17074
MultiFileReader Rework (part 18): Replace file path with OpenFileInfo struct by @Mytherin in #17071
Fix httpfs patches: avoid git log since might contain unsanitised error word by @carlopi in #17075
Re-enable Avro on core by @Tishj in #17072
[Nested] Optimize List Type in list_value by @maiadegraaf in #17063
Grow string dictionary dynamically in Parquet writer by @lnkuiper in #17061
Add extended file info to OpenFileInfo, and use this to pass encryption keys and footer size to Parquet reader by @Mytherin in #17085
[Dev] Automatically re-execute when calling __arrow_c_stream__ on an already-consumed-result by @Tishj in #17087
fsst: Avoid to propagate alignment information in FSST_UNALIGNED_STORE by @carlopi in #17094
Fix sqlite3 api wrapper link + remove R-CMD-check + add more nightly tests by @carlopi in #17095
support large dictionary value and constant vector creation in the C API by @joseph-isaacs in #17064
Add missing lock to UpdateSegment::FetchRow, and cleanup API to require the lock by @Mytherin in #17100
Valgrind requires tpch by @carlopi in #17101
Switch to manylinux_2_28 by @hannes in #16956
Changing mbedtls encryption API by @ccfelius in #16196
Pull OpenFileExtended through the opener and virtual file system layers by @Mytherin in #17102
Fix an issue in upserts where the local append state was not correctly flushed by @Mytherin in #17109
Always parallelize read_json schema detection by @lnkuiper in #17106
Move transaction cleanup outside of the transaction lock by @taniabogatsch in #17034
Remove R_CMD_CHECK.yml, now handled by duckdb/duckdb-r repo by @carlopi in #17127
JSON Bugfixes by @lnkuiper in #17119
Refactor relassert runs, adding some variations in compiler / statically linked extensions by @carlopi in #17104
extension-upload-from-nightly.sh: Add --region by @carlopi in #17120
MultiFileReader: several fixes for virtual column handling and make virtual column handling extensible by @Mytherin in #17123
Remove misleading lock comment in data table by @taniabogatsch in #17125
[Dev] Add "registries" to vcpkg.json, add script to list the packages of the registry. by @Tishj in #17124
External File Cache by @lnkuiper in #16463
Notify nightly build status by @hmeriann in #17108
Strict UUID cast by @lnkuiper in #17138
Copy To File: avoid calling Combine for threads that have not written any rows by @Mytherin in #17142
Add file_index virtual column to the multi file reader that returns the file index of the read file by @Mytherin in #17144
MultiFileReader: simplify constant handling, and allow virtual columns returned by the multi file reader to be constant by @Mytherin in #17149
Changes to encodings to make them more flexible to replacement maps. by @pdet in #17146
Optimize large Top N queries by @lnkuiper in #17141
Only trigger TopN rewrite for relatively small limits compared to the table size. by @Tmonster in #17140
platform.hpp: Propagate DUCKDB_EXPLICIT_PLATFORM, avoid early return by @carlopi in #17137
Keeping the filters which do not remove NULL values by @Damon07 in #17045
Improve FileSync call on unix platform by @dentiny in #16893
README: Fix to building link by @szarnyasg in #17161
[InvokeCI] Add missing pipe to run instruction by @hmeriann in #17163
Internal #4667: 2025b TimeZone Data by @hawkfish in #17160
Unify function list by @c-herrewijn in #17168
[Dev] Generate the EXTENSION_SECRET_TYPES instead of hardcoding them by @Tishj in #17183
Fix grouping feature with interval type by @handstuyennn in #17181
Add filename to GZIP stream error by @marcoslot in #17166
Issue #17115: TimeTZ Approximate Quantile by @hawkfish in #17162
Issue #17046: AsOf Left Predicates by @hawkfish in #17159
[Fix] Pass delete indexes when committing updates by @taniabogatsch in #17176
Python.yml: Add back logic to perform fast-fail on Python 3.10 by @carlopi in #17107
Notify JDBC repo to run Vendor.yml workflow by @staticlibs in #17099
Issue #17049: ICU Date Cast by @hawkfish in #17067
Add bind_operator callback to TableFunction - allowing table functions to directly emit a LogicalOperator by @Mytherin in #17196
[ENCRYPTION] Make block header size adaptive by @ccfelius in #17118
Issue #16839: Disable TIMESTAMP Casts by @hawkfish in #16899
Add support for an explicit PRESERVE_ORDER flag for copy to file by @Mytherin in #17199
Add SYSTEM_PEAK_BUFFER_MANAGER_MEMORY and SYSTEM_PEAK_TEMP_DIRECTORY_SIZE to profiler by @lnkuiper in #17164
Fix [InvokeCI / NotifyExternalRepository] Unexpected value 'true' by @hmeriann in #17212
Add support for the cast_to_type function, that allows generating a cast from an expression to the type of another column by @Mytherin in #17209
Better cardinality estimates for inequality joins/grouped aggregations by @lnkuiper in #17139
Add ExternalFileCache validation as option for ExtendedOpenFileInfo by @lnkuiper in #17205
Explicitly flush the thread-local optimistic writer in PhysicalBatchInsert when finalizing by @Mytherin in #17214
Pushdown arbitrary expressions into scans by @Mytherin in #17213
Fix #17170: sort selection result in OR expression by @flashmouse in #17180
[Dev] Re-enable Iceberg, Bump Avro, fix generate_extension_functions.py for dependencies between extensions by @Tishj in #17204
Change Invalid Unicode Error to Invalid Encoding by @pdet in #17208
Direct IO for temp files by @lnkuiper in #17219
Fix [InvokeCI / NotifyExternalRepository] GitHub Actions has encountered an internal error when running your job. by @hmeriann in #17218
Add "thousands" option to CSV Reader by @pdet in #17220
add capi functions to create map and union values by @jraymakers in #17227
Only notify JDBC when all runs are successful by @staticlibs in #17233
Update Friendlier SQL link.md by @hfrifkin in #17248
Implement reading concatenated GZIP members by @lnkuiper in #17255
Return invalid BufferHandle upon loading a destroyed BlockHandle by @lnkuiper in #17249
Internal #4772: Timestamp Error Parameter by @hawkfish in #17283
BUGFIX: do not perform unused columns optimization in presence of multiple grouping sets by @Tmonster in #17259
Internal #4532: 13 Month Intervals by @hawkfish in #17303
Don't try to load extension if storage type is already registered by @Maxxen in #17241
Adapt size of hash table during aggregation using HyperLogLog by @lnkuiper in #17236
Switch to always using list identifier instead of array by @J-Meyers in #17242
Add root's query_location also to TransformInterval by @carlopi in #17271
Histogram table function test by @hmeriann in #17276
Guess Parquet footer size by @lnkuiper in #17300
Issue #16563: FLOAT to DECIMAL by @hawkfish in #17302
Feature #15873: Windowed ORDER BYs by @hawkfish in #17304
Switch from Bottom-Up to Top-Down Decorrelation Strategy by @kryonix in #17294
Generating random data for mbedtls without key by @ccfelius in #17309
Fix CI by @Mytherin in #17319
[Arrow] Implement support for consuming and producing Decimal 32 and 64. by @pdet in #17314
take the column ids from the logical get, don't require a LogicalGet … by @Tishj in #17315
Allow installing extensions with external access allowlist by @samansmink in #17316
Implement ARTMerger replacing the recursive ART merge algorithm by @taniabogatsch in #17243
Share null mask with constant null arg vector by @iceTTTT in #17234
Fix #17311: correctly check for presence of recursive keys in transformer by @Mytherin in #17320
[CSV Reader] Simplify Quote/Escape detection code, make it more robust and decouple comment and skip_rows option. by @pdet in #17284
Fix try_cast from NaN double to decimal by @lnkuiper in #17322
Add serialization for new TableColumn type by @Mytherin in #17321
Extract expressions from nested conjunction AND for index scan by @lnkuiper in #17297
Support late materialization in the Parquet reader, and handle COUNT(*) directly in the multi file reader by @Mytherin in #17325
Implement ARTOperator replacing Lookup and the recursive Insert by @taniabogatsch in #17327
Internal #4723: Inequality Condition Pushdown by @hawkfish in #17317
Properly format strings when throw JSON errors by @lnkuiper in #17331
Fix potential vulnerable cloned function by @npt-1707 in #17340
Fix potential vulnerable cloned function by @npt-1707 in #17339
Revert "Skip MinGW, currently failing on main" by @carlopi in #17342
Unify Parquet Metadata cache invalidation logic with Cached File System cache invalidation by @Mytherin in #17334
Fix issue with empty ranges by @kryonix in #17332
Internal #4797: Timestamp Range Cardinality by @hawkfish in #17330
Some nitpicking fixes by @szarnyasg in #17337
Issue #17299: Integer Rounding by @hawkfish in #17328
Parquet Reader: emit partition stats for any files that have cached metadata, and implement ListFilesExtended that adds extra info to files globbed by @Mytherin in #17344
Add support for UUID v7 to Filename Pattern - and clean it up so that it correctly supports composite patterns by @Mytherin in #17345
Add support for the HIVE_FILE_PATTERN option - that allows partitioned files to be written without writing them to a hive-style directory structure by @Mytherin in #17346
Add an OnDetach callback to the catalog that is triggered when the user detaches a catalog by @Mytherin in #17347
Pass commit ID to NotifyExternalRepositories.yml by @staticlibs in #17333
Add support for BENCHMARK_ROOT_DIRECTORY cmake option to change benchmark runner root directory, and add support for cache_file and reload options to enable better caching for non-DuckDB databases by @Mytherin in #17355
Support --directories option in format.py by @Mytherin in #17354
Handle both ENCRYPTION_KEY and STORAGE_VERSION passed as options by @carlopi in #17357
Fix internal exception from assigning invalid index to optional_idx query_id; by @Tishj in #17359
Fixup amalgamation: reqlen is only used with assert enabled by @carlopi in #17361
md5_number: return UHUGEINT by @szarnyasg in #17336
Skip emitting partition stats if "has_deletes" is set in the file info by @Mytherin in #17365
Benchmark runner: add argument, include and load_only options - and make ClickBench run the original benchmark instead of a subset by @Mytherin in #17367
Fix two off-by-one errors in row estimate of range and generate_series by @JelteF in #17373
[Nested] Fix: 16489 - Find NULLs in lists using list_position by @maiadegraaf in #17080
fix #17258: Allow to open database in readonly mode within cli by @jjballano in #17375
Join Hash Table Probing Optimization: Optional Probing Selection Vector by @gropaul in #17062
Remove bundled TPCH & TPCDS in Python wheels by @carlopi in #15923
[Compression] Introduce DICT_FSST compression method by @Tishj in #15637
Deprecate lambda arrow (->) and replace it with LAMBDA x : x + 1 by @taniabogatsch in #17235
fix not setting nested validity when map_extract returns null by @Maxxen in #17379
Function chaining: report missing column instead of missing function if function exists by @Mytherin in #17383
Improve error messages in UPDATE ... SET by @Mytherin in #17384
Add candidates suggestion when COLUMNS regex does not match any columns by @Mytherin in #17385
add step to clean up the disc space to fix No space left on device by @hmeriann in #17390
Fix issue in string -> hugeint conversion with decimals and exponents by @Mytherin in #17388
Improve error message reporting for cast failures by @Mytherin in #17382
Fix Python CI: pin virtualenv to previous version by @Mytherin in #17386
Improve error reporting for missing qualified columns by @Mytherin in #17397
Issue #17266: Lead Lag Nulls by @hawkfish in #17391
Fix #17266: the result of lag/lead when the offset is null by @ditdb in #17268
VirtualFileSystem to take an input, allowing to customize behaviour by @carlopi in #17393
[Dev] Add QualifiedName::ParseComponents, add input to the error messages by @Tishj in #17403
Provide suggestions and a link to the documentation for OOM errors by @Mytherin in #17402
[Dev] Flatten any deeper children vectors, when the top level is a FLAT vector by @Tishj in #17387
Minor fixes for the CLI by @Mytherin in #17405
Add support for CREATE OR REPLACE TYPE, CREATE TYPE IF NOT EXISTS and CREATE TEMPORARY TYPE by @Mytherin in #17404
Use an insertion order preserving map in Value::MAP by @taniabogatsch in #17389
Implement json_each/json_tree by @lnkuiper in #17406
Fix #16552: adjust join condition sequence by @flashmouse in #16943
WAL replay index fixes by @taniabogatsch in #17409
ZSTD: use a high penalty when min size is exceeded instead of disabling compression to allow force compression to work by @Mytherin in #17412
Internal #4723: PWMJ Inequality Pushdown by @hawkfish in #17400
Move all httplib code to HTTPUtil class by @Mytherin in #17420
Avoid generating default views and macros in the temporary catalog by @Mytherin in #17408
unittest: improve detection of whether or not we can run --force-restart tests by @Mytherin in #17419
Give tasks a TaskType with a name by @Mytherin in #17421
Use argparse in scripts/format.py by @adsharma in #17360
Add missing commas by @szarnyasg in #17424
Internal #4830: IEJoin Inequality Pushdown by @hawkfish in #17422
Add conn.query_progress() method by @nickzoic in #16927
Fixes filter pruning use the statistics updated by the same filter by @Damon07 in #17425
Fix JSON extension compilation on Ubuntu 22.04 by @staticlibs in #17434
Use pytest in SQLLogic Python test runner by @Flogex in #16685
On COPY TO/FROM check the format during binding. by @pdet in #17381
BUGFIX: DELIM_JOINS should reflect functionality of NULL filtering conditions in joins with DELIM_GETS by @Tmonster in #16910
Allow directly attaching of Parquet/CSV/JSON files by @Mytherin in #17415
Force errors when trying lines as early as possible by @pdet in #17427
Enable SYSTEM_PEAK_BUFFER_MEMORY and SYSTEM_PEAK_TEMP_DIR_SIZE profiling by default by @lnkuiper in #17407
[C API] Expose the client context, connection id and scalar function bind data by @taniabogatsch in #17449
[CSV Sniffer] Proper type replacement in header only files by @pdet in #17447
Recurse into MAP and LIST with the remap_struct and the MFR ColumnMapper by @Tishj in #17448
Fix: pyproject.toml does not contain a tool.setuptools_scm section by @YUKI2eN3e in #17443
[Fix] Macro binding with unknown parameters in list_has_all and some other code tidying by @taniabogatsch in #17450
Generalize HTTP interface and use the new HTTP interface in httpfs by @Mytherin in #17464
[Fix] Switch between constant and flat vector in C API by @taniabogatsch in #17465
Fix TIMETZ cast in example by @szarnyasg in #17468
Remove duplicated arrow fetch test by @emmanuel-ferdman in #17476
Multi File Reader Rework (Part 19): Make MultiFileReaderInterface virtual, and move reading methods to the BaseFileReader by @Mytherin in #17475
[Serializer] Lambda Compatibility Fix by @maiadegraaf in #17428
fix parsing bool values in JSON by @ccfelius in #17460
Emit dictionary vectors with unaligned start index by @OmidAfroozeh in #17471
Add release version by @hannes in #17479
Expose qualified table names in GetTableNames and add duckdb_get_table_names to C API by @taniabogatsch in #17472
Bump avro, httpfs, mysql, postgres and sqlite by @Mytherin in #17482
Fix GeoParquet ExpressionColumnReader schema by @Maxxen in #17481
add regression_threshold_seconds argument to regression/test_runner.py by @hmeriann in #17485
DROP of missing entry should fail in binding by @jeewonhh in #17474
HTTPFS Parameters fix by @Mytherin in #17486
HTTPUtil Fix: correctly pass in on_retry by @Mytherin in #17494
Bump spatial & vss by @Maxxen in #17492
Add support for altering structs (drop, add, rename field) inside LIST and MAP columns. by @Tishj in #17462
[Python Dev] Guard against python exceptions when interacting with the currentframe object by @Tishj in #17490
If distinct count from stats is 0, do not use it in Join Order Optimizer by @Tmonster in #17466
Make the encodings extension a core extension, and make it auto-loadable. by @pdet in #17206
Allow passing down rc-style version also via OVERRIDE_GIT_DESCRIBE by @carlopi in #17501
Allow DUCKDB_EXPLICIT_VERSION to be propagated by @carlopi in #17498
Minor nightly fixes by @Mytherin in #17500
Add FileSystem::TryRemoveFile - that only removes a file if it exists by @Mytherin in #17502
Add OperatorFinalize callback to operators - which is called after a pipeline is finished by @Mytherin in #17503
Apply dynamic filter pushdown of TopN optimizer also to existing TopN nodes by @Mytherin in #17504
Fix: Optional Probe Selection by @gropaul in #17505
FileHandle Logging by @samansmink in #16758
Fix typos by @szarnyasg in #17478
Remove spatial from OSX Relassert by @carlopi in #17509
Update more extensions by @Maxxen in #17510
Bump HTTPFS again by @Mytherin in #17511
feat: include catalog and schema names in function serialization by @rustyconover in #17512
Fix encodings by @carlopi in #17514
Fix python nightly build by @Tishj in #17515
Use Catalog::TryAutoLoad for encodings extension by @pdet in #17520
[Python Dev] Using reinterpret_steal breaks the refcount of the passed-in object by @Tishj in #17525
Fix update extensions by @carlopi in #17527
Minor fixes to exception error messages by @carlopi in #17528
[Python Dev] Fix failing tests for the Python SQLLogicTester by @Tishj in #17529
Resolve GitHub workflow set-output deprecation warnings by @kurtmckee in #17516
[CSV Reader] Detect SQLNULL types for schema merging, use schema merging in csv relations, add files_to_sniff option. by @pdet in #17467
Fix extension test by @carlopi in #17536
[Dev] Fix crash when describing a table with a virtual column by @Tishj in #17544
[HTTPUtil] Let requests made through the HTTPUtil interface accept URIs without a scheme. by @Tishj in #17545
Attach after setting database type by @Mytherin in #17546
Pass MultiFileGlobalState to InitializeReader, and pass file list to CreateMapping instead of eagerly getting the first file by @Mytherin in #17553
[Dev] Fix allowed_directories crash by @Tishj in #17548
[Fix] duplicate filters during index scans by @taniabogatsch in #17547
Generate data for tpch sf100 in steps by @Tmonster in #17539
Issue #17537: Fractional Second Padding by @hawkfish in #17556
Make MultiFileList::Copy a virtual method by @Mytherin in #17566
[Dev] Can't use USING COMPRESSION with a deprecated compression type by @Tishj in #17542
Add (de)serialization for ExtraOperatorInfo by @NiclasHaderer in #17563
Fix issue with ExternalFileCache when data is evicted by @lnkuiper in #17567
Remote Reads: allocate correct buffer size for prefetch by @Mytherin in #17557
Remove patch and bump httpfs by @carlopi in #17558
[Dev] Fix Arrow fixed size binary reading by @Tishj in #17573
Fix setup.py to correctly handle OVERRIDE_GIT_DESCRIBE by @carlopi in #17580

Full Changelog: v1.2.2...v1.3.0
