8000 Use pytest in SQLLogic Python test runner by Flogex · Pull Request #16685 · duckdb/duckdb · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Use pytest in SQLLogic Python test runner #16685

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
May 12, 2025
Merged

Conversation

Flogex
Copy link
Contributor
@Flogex Flogex commented Mar 17, 2025

This PR adds pytest as test runner for the existing SQLLogic Python tests. .test, .test_slow and .test_coverage files in a given test directory are are automatically discovered and added as test cases. The test runner also accepts a few options known from Catch2, like --start-offset, --end-offset or --order.

Usage:

  • Default test run (excludes slow tests): pytest 'tools/pythonpkg/tests/sqllogic/' --test-dir ./test
  • Running all tests: pytest 'tools/pythonpkg/tests/sqllogic/' -m all --test-dir ./test
  • Running an individual test: pytest 'tools/pythonpkg/tests/sqllogic/test_sqllogic.py::test_sqllogic[test/sql/order/test_limit_parameter.test]' --test-dir ./test
  • Running only tests with a particular marker, e.g. only slow tests: pytest 'tools/pythonpkg/tests/sqllogic/' -m slow --capture=no --verbose --test-dir ./test

Verbose output:

...
tools/pythonpkg/tests/sqllogic/test_sqllogic.py::test_sqllogic[test/extension/wrong_function_type.test] SKIPPED (Test is on SKIPPED_TESTS list)
tools/pythonpkg/tests/sqllogic/test_sqllogic.py::test_sqllogic[test/extension/test_tags.test] [19/3955] SKIPPED (/Users/me/mono/extensions/duckdb/te...)
tools/pythonpkg/tests/sqllogic/test_sqllogic.py::test_sqllogic[test/optimizer/zonemaps.test] [20/3955] PASSED
tools/pythonpkg/tests/sqllogic/test_sqllogic.py::test_sqllogic[test/optimizer/regex_optimizer.test] [21/3955] PASSED
tools/pythonpkg/tests/sqllogic/test_sqllogic.py::test_sqllogic[test/optimizer/constant_folding.test] [22/3955] PASSED
...

Non-verbose output:

tools/pythonpkg/tests/sqllogic/test_sqllogic.py sssFssssssssssssssss...........................s..............Fs...sF

This PR leaves the current test runner untouched and does not use the test runner yet in CI. For now, I simply copied over the code for the SQLLogicTestExecutor from the old test runner implementation and replaced the skip/fail methods with their pytest equivalents.

Integrate with pytest to be able to use a fully-fledged test runner.
Adds automatic discovery of .test files.
Adds a few command line options inspired by Catch2.
@Tishj
Copy link
Contributor
Tishj commented Mar 17, 2025

I think we do want to replace the existing method of calling the python sqllogictest with pytest, not have it live alongside it. To properly test this I think you want to start the nightlytest on your fork (as it won't run on PRs to the main remote)

import typing
import warnings
from .skipped_tests import SKIPPED_TESTS

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As evidenced by the failing python CI run, this should not live inside tools/pythonpkg/tests, as that is already a pytest directory, it should probably live inside tools/pythonpkg/sqllogic

Copy link
Contributor Author
@Flogex Flogex Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved the files to tools/pythonpkg/sqllogic.

This resulted in an import error at first:

ImportError while importing test module '/Users/me/duckdb/tools/pythonpkg/sqllogic/test_sqllogic.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/importlib/__init__.py:88: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tools/pythonpkg/sqllogic/test_sqllogic.py:18: in <module>
    from sqllogictest.result import (
scripts/sqllogictest/result.py:33: in <module>
    from .logger import SQLLogicTestLogger
scripts/sqllogictest/logger.py:4: in <module>
    from duckdb import tokenize, token_type
tools/pythonpkg/duckdb/__init__.py:14: in <module>
    from .duckdb import (
E   ImportError: cannot import name 'SQLExpression' from 'duckdb.duckdb' (/Users/me/duckdb/tools/pythonpkg/duckdb/duckdb.cpython-313-darwin.so)

The reason is that tools/pythonpkg gets appended to sys.path when pytest starts so that pytest can then import the sqllogic package. (I think the sqllogic directory has to be a package to be able to import skipped_tests.py in conftest.py.)

I worked around this by changing the import mode to importlib which doesn't change sys.path. Doing this in conftest.py is already to late because then the changes to the path have already happened. Therefore, I added a pytest.ini file in 2f38099 which adds the --import-mode=importlib option when pytest is started.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for the investigation and the writeup, that will help with maintaining 👍

@duckdb-draftbot duckdb-draftbot marked this pull request as draft March 18, 2025 14:27
run: |
python3 tools/pythonpkg/scripts/sqllogictest_python.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also remove sqllogictest_python.py please, add FIXME's to the test_sqllogic.py where functionality is perhaps missing that is in sqllogictest_python.py, I think the regressions should be very minimal (?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. As far as I can see, the pytest runner covers almost all the functionality of the old one. The only thing that is missing is an equivalent to --file-list which accepts a file with a list of test cases to run.

  • --file-path is replaced by passing the file path with -k or the name of the test case
  • --start-offset and --build-dir are available
  • Error reporting and skipping is handled by pytest
  • TEST_DIRECTORY_PATH is handled by pytest (including cleaning up the old ones)

SQLLogicTestExecutor hasn't been changed much thus far and works as before.

Flogex added 3 commits March 19, 2025 10:30
GitHub Actions runners set this automatically
With import mode prepend, the tools/pythonpkg directory is added to the sys.path.
This leads to import errors when importing the duckdb package because there is
also a directory named tools/pythonpkg/duckdb.
Hence, setting the import mode to 'importlib' ('append' would also work).
@Tishj
Copy link
Contributor
Tishj commented Mar 20, 2025

Made some changes to fix some issues

diff --git a/scripts/sqllogictest/result.py b/scripts/sqllogictest/result.py
index e639f0baf8..1c74e82c97 100644
--- a/scripts/sqllogictest/result.py
+++ b/scripts/sqllogictest/result.py
@@ -99,7 +99,7 @@ class TestException(Exception):
     __slots__ = ['data', 'message', 'result']
 
     def __init__(self, data: SQLLogicStatementData, message: str, result: ExecuteResult):
-        self.message = f'{str(data)} {message}'
+        self.message = message
         super().__init__(self.message)
         self.data = data
         self.result = result
diff --git a/tools/pythonpkg/sqllogic/conftest.py b/tools/pythonpkg/sqllogic/conftest.py
index 498a020132..6520cd57f1 100644
--- a/tools/pythonpkg/sqllogic/conftest.py
+++ b/tools/pythonpkg/sqllogic/conftest.py
@@ -40,10 +40,16 @@ def pytest_addoption(parser: pytest.Parser):
     parser.addoption("--rng-seed", type=int, dest="rng_seed", help="Random integer seed")
 
 
-@pytest.hookimpl(tryfirst=True)
+@pytest.hookimpl(hookwrapper=True)
 def pytest_keyboard_interrupt(excinfo: pytest.ExceptionInfo):
-    # TODO: CTRL+C does not immediately interrupt pytest. You sometimes have to press it multiple times.
-    pytestmark = pytest.mark.skip(reason="Keyboard interrupt")
+    # Ensure all tests are properly cleaned up on keyboard interrupt
+    from .test_sqllogic import test_sqllogic
+    if hasattr(test_sqllogic, 'executor') and test_sqllogic.executor:
+        if test_sqllogic.executor.database and hasattr(test_sqllogic.executor.database, 'connection'):
+            test_sqllogic.executor.database.connection.interrupt()
+        test_sqllogic.executor.cleanup()
+        test_sqllogic.executor = None
+    yield
 
 
 def pytest_configure(config: pytest.Config):
@@ -165,7 +171,7 @@ def pytest_collection_modifyitems(session: pytest.Session, config: pytest.Config
     if start_offset < 0:
         raise ValueError("--start-offset must be a non-negative integer")
     elif end_offset < start_offset:
-        raise ValueError("--end-offset must be greater than or equal to --start-offset")
+        raise ValueError(f"--end-offset ({end_offset}) must be greater than or equal to --start-offset")
 
     max_end_offset = len(items) - 1
     if end_offset > max_end_offset:
diff --git a/tools/pythonpkg/sqllogic/test_sqllogic.py b/tools/pythonpkg/sqllogic/test_sqllogic.py
index 3ad8ca58a4..55bfb8d4d7 100644
--- a/tools/pythonpkg/sqllogic/test_sqllogic.py
+++ b/tools/pythonpkg/sqllogic/test_sqllogic.py
@@ -4,6 +4,7 @@ import pathlib
 import pytest
 import sys
 from typing import Any, Generator, Optional
+import signal
 
 ##### Copied from sqllogictest_python.py #####
 
@@ -23,8 +24,21 @@ from sqllogictest.result import (
     ExecuteResult,
 )
 
+def sigquit_handler(signum, frame):
+    # Access the executor from the test_sqllogic function
+    if hasattr(test_sqllogic, 'executor') and test_sqllogic.executor:
+        if test_sqllogic.executor.database and hasattr(test_sqllogic.executor.database, 'connection'):
+            test_sqllogic.executor.database.connection.interrupt()
+        test_sqllogic.executor.cleanup()
+        test_sqllogic.executor = None
+    # Re-raise the signal to let the default handler take over
+    signal.signal(signal.SIGQUIT, signal.default_int_handler)
+    os.kill(os.getpid(), signal.SIGQUIT)
+
+
+# Register the SIGQUIT handler
+signal.signal(signal.SIGQUIT, sigquit_handler)
 
-# This is pretty much just a VM
 class SQLLogicTestExecutor(SQLLogicRunner):
     def __init__(self, test_directory: str, build_directory: Optional[str] = None):
         super().__init__(build_directory)
@@ -67,45 +81,64 @@ class SQLLogicTestExecutor(SQLLogicRunner):
         test_delete_file(path + ".wal")
 
     def execute_test(self, test: SQLLogicTest) -> ExecuteResult:
-        self.reset()
-        self.test = test
-        self.original_sqlite_test = self.test.is_sqlite_test()
+        try:
+            self.reset()
+            self.test = test
+            self.original_sqlite_test = self.test.is_sqlite_test()
 
-        # Top level keywords
-        keywords = {'__TEST_DIR__': self.get_test_directory(), '__WORKING_DIRECTORY__': os.getcwd()}
+            # Top level keywords
+            keywords = {'__TEST_DIR__': self.get_test_directory(), '__WORKING_DIRECTORY__': os.getcwd()}
 
-        def update_value(_: SQLLogicContext) -> Generator[Any, Any, Any]:
-            # Yield once to represent one iteration, do not touch the keywords
-            yield None
+            def update_value(_: SQLLogicContext) -> Generator[Any, Any, Any]:
+                # Yield once to represent one iteration, do not touch the keywords
+                yield None
 
-        self.database = SQLLogicDatabase(':memory:', None)
-        pool = self.database.connect()
-        context = SQLLogicContext(pool, self, test.statements, keywords, update_value)
-        pool.initialize_connection(context, pool.get_connection())
-        # The outer context is not a loop!
-        context.is_loop = False
+            self.database = SQLLogicDatabase(':memory:', None)
+            pool = self.database.connect()
+            context = SQLLogicContext(pool, self, test.statements, keywords, update_value)
+            pool.initialize_connection(context, pool.get_connection())
+            # The outer context is not a loop!
+            context.is_loop = False
 
-        try:
-            context.verify_statements()
-            res = context.execute()
-        except TestException as e:
-            res = e.handle_result()
-            if res.type == ExecuteResult.Type.SKIPPED:
-                pytest.skip(str(e.message))
-            else:
-                pytest.fail(str(e.message), pytrace=False)
-
-        self.database.reset()
-
-        # Clean up any databases that we created
+            try:
+                context.verify_statements()
+                res = context.execute()
+            except TestException as e:
+                res = e.handle_result()
+                if res.type == ExecuteResult.Type.SKIPPED:
+                    pytest.skip(str(e.message))
+                else:
+                    pytest.fail(str(e.message), pytrace=False)
+
+            self.database.reset()
+
+            # Clean up any databases that we created
+            for loaded_path in self.loaded_databases:
+                if not loaded_path:
+                    continue
+                # Only delete database files that were created during the tests
+                if not loaded_path.startswith(self.get_test_directory()):
+                    continue
+                os.remove(loaded_path)
+            return res
+        except KeyboardInterrupt:
+            if self.database:
+                self.database.interrupt()
+            raise
+
+    def cleanup(self):
+        if self.database:
+            if hasattr(self.database, 'connection'):
+                self.database.connection.interrupt()
+            self.database.reset()
+            self.database = None
+        # Clean up any remaining test databases
         for loaded_path in self.loaded_databases:
-            if not loaded_path:
-                continue
-            # Only delete database files that were created during the tests
-            if not loaded_path.startswith(self.get_test_directory()):
-                continue
-            os.remove(loaded_path)
-        return res
+            if loaded_path and loaded_path.startswith(self.get_test_directory()):
+                try:
+                    os.remove(loaded_path)
+                except FileNotFoundError:
+                    pass
 
 
 def test_sqllogic(test_script_path: pathlib.Path, pytestconfig: pytest.Config, tmp_path: pathlib.Path):
@@ -119,8 +152,14 @@ def test_sqllogic(test_script_path: pathlib.Path, pytestconfig: pytest.Config, t
 
     build_dir = pytestconfig.getoption("build_dir")
     executor = SQLLogicTestExecutor(str(tmp_path), build_dir)
-    result = executor.execute_test(test)
-    assert result.type == ExecuteResult.Type.SUCCESS
+    # Store executor in the function's arguments so it can be accessed by the interrupt handler
+    test_sqllogic.executor = executor
+    try:
+        result = executor.execute_test(test)
+        assert result.type == ExecuteResult.Type.SUCCESS
+    finally:
+        executor.cleanup()
+        test_sqllogic.executor = None
 
 
 if __name__ == '__main__':
  • The SQLStatementData no longer needs to be part of the message, the pytest name already contains this - also saw an issue in the display in verbose mode because the message was simply too long
  • This fixes the Ctrl+C and Ctrl+\ handling, so the tests can be interrupted now

@Flogex
Copy link
Contributor Author
Flogex 8000 commented Mar 20, 2025

Here is a run of the "Python SQLLogicTest Library" job in my fork: https://github.com/Flogex/duckdb/actions/runs/13967583581/job/39101440384?pr=7

@Flogex
Copy link
Contributor Author
Flogex commented Mar 24, 2025

@Tishj Here is the latest nightly test run: https://github.com/Flogex/duckdb/actions/runs/14022703354/job/39256726461

Three test failures

  • test/sql/copy/return_stats_truncate.test
  • test/sql/copy/return_stats.test
  • test/sql/copy/parquet/writer/skip_empty_write.test
    But those also fail in main's nightly run, hence I think these are unrelated to this PR.

@Flogex Flogex marked this pull request as ready for review March 24, 2025 09:45
@Tishj
Copy link
Contributor
Tishj commented Mar 24, 2025

Yea agreed, I think the <REGEX> handling was changed in the cpp unittester and as always the python one wasn't changed, perhaps we can just skip them explicitly for now

@duckdb-draftbot duckdb-draftbot marked this pull request as draft March 24, 2025 17:50
@Flogex
Copy link
Contributor Author
Flogex commented Mar 24, 2025

Yea agreed, I think the <REGEX> handling was changed in the cpp unittester and as always the python one wasn't changed, perhaps we can just skip them explicitly for now

I excluded the three failing tests in 133ac9a

@Flogex Flogex marked this pull request as ready for review March 24, 2025 17:53
@duckdb-draftbot duckdb-draftbot marked this pull request as draft March 25, 2025 09:44
@Flogex Flogex marked this pull request as ready for review March 25, 2025 09:45
@Flogex Flogex requested a review from Tishj March 27, 2025 21:03
Copy link
Contributor
@Tishj Tishj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, it looks good to me now 👍

@duckdb-draftbot duckdb-draftbot marked this pull request as draft April 7, 2025 09:41
@Flogex Flogex marked this pull request as ready for review April 7, 2025 09:41
@Flogex Flogex marked this pull request as draft April 22, 2025 12:19
@Flogex Flogex marked this pull request as ready for review April 22, 2025 12:19
@duckdb-draftbot duckdb-draftbot marked this pull request as draft April 22, 2025 12:43
@Flogex Flogex marked this pull request as ready for review April 23, 2025 08:21
@duckdb-draftbot duckdb-draftbot marked this pull request as draft April 23, 2025 08:46
@Flogex Flogex marked this pull request as ready for review April 23, 2025 08:47
@Mytherin Mytherin merged commit 12019ed into duckdb:main May 12, 2025
64 of 71 checks passed
@Mytherin
Copy link
Collaborator

Thanks!

@Flogex Flogex deleted the sqllogic-pytest branch May 14, 2025 22:02
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 18, 2025
On COPY TO/FROM check the format during binding. (duckdb/duckdb#17381)
Use pytest in SQLLogic Python test runner (duckdb/duckdb#16685)
Fix JSON extension compilation on Ubuntu 22.04 (duckdb/duckdb#17434)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 18, 2025
On COPY TO/FROM check the format during binding. (duckdb/duckdb#17381)
Use pytest in SQLLogic Python test runner (duckdb/duckdb#16685)
Fix JSON extension compilation on Ubuntu 22.04 (duckdb/duckdb#17434)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 19, 2025
On COPY TO/FROM check the format during binding. (duckdb/duckdb#17381)
Use pytest in SQLLogic Python test runner (duckdb/duckdb#16685)
Fix JSON extension compilation on Ubuntu 22.04 (duckdb/duckdb#17434)
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request May 19, 2025
On COPY TO/FROM check the format during binding. (duckdb/duckdb#17381)
Use pytest in SQLLogic Python test runner (duckdb/duckdb#16685)
Fix JSON extension compilation on Ubuntu 22.04 (duckdb/duckdb#17434)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0