8000 ENH: Free threading changes for FSEvents by HaoZeke · Pull Request #1109 · gorakhargosh/watchdog · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

ENH: Free threading changes for FSEvents #1109

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 22 commits into
base: master
Choose a base branch
from

Conversation

HaoZeke
Copy link
Contributor
@HaoZeke HaoZeke commented May 31, 2025

Wraps up #1105.

Unfortunately I haven't been able to test this on a Mac, so at this point it is purely speculative. Managed to test it on a local machine, and everything seems to pass without errors..

brew install uv
uv venv --python 3.13t .ftci
source .ftci/bin/activate
uv pip install pytest-timeout pytest-run-parallel flaky
uv pip install -v -e .
perl -i -ne 'print unless /--cov/' pyproject.toml
# 500 because test_watcher_deletion_while_receiving_events_2 takes a long time.. otherwise 5 is fine too (if it is skipped)
uv run --no-project python -m pytest --parallel-threads=2 --iterations=2 -v -s --timeout=500 --durations=10 -m "not thread_unsafe"

Basically follows the porting extensions advice:

@HaoZeke HaoZeke force-pushed the ftFSevent branch 4 times, most recently from 5d6fa89 to 534ab3e Compare May 31, 2025 17:43
@BoboTiG
Copy link
Collaborator
BoboTiG commented Jun 1, 2025

Sorry, I killed the wrong job when doing some clean-up. I relaunched them.

@@ -912,5 +949,8 @@ PyInit__watchdog_fsevents(void){
}
watchdog_module_add_attributes(module);
watchdog_module_init();
#ifdef Py_GIL_DISABLED
PyUnstable_Module_SetGIL(module, Py_MOD_GIL_NOT_USED);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that multi init has been asked a few days ago: #1108. Is it worth tackling #1108 to improve that current code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to add multi-phase init, but maybe in a separate PR? FWIW I don't think single phase init extension modules are going away anytime soon and it is largely orthogonal to the free threading discussion (except that it might make sense to do a one-shot upgrade).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favor of doing a one-shot upgrade, and it can be split in as many PR as needed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a number of open PRs from @AA-Turner (hope you don't mind the ping - seemed relevant to your interests!) on NumPy that do the necessary refactor to multi-phase init. Maybe take a look there?

Copy link
@AA-Turner AA-Turner Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Nathan.

I don't think single phase init extension modules are going away anytime soon

This is true, but we are starting to seriously consider doing so. Note that the mechanism used in this PR is part of the Unstable C API. We recommend wherever possible using multi-phase init, with the Py_mod_gil slot.

@BoboTiG
Copy link
Collaborator
BoboTiG commented Jun 1, 2025

Note that maybe could you disable Linux+Windows on the CI until you have a working Mac code?

@HaoZeke HaoZeke marked this pull request as ready for review June 1, 2025 14:13
@HaoZeke
Copy link
Contributor Author
HaoZeke commented Jun 1, 2025

Note that maybe could you disable Linux+Windows on the CI until you have a working Mac code?

Sorry about that, done, the CI bug seems to be very intermittent, probably a missing lock issue..

@HaoZeke HaoZeke force-pushed the ftFSevent branch 3 times, most recently from f0d62b9 to 59ab887 Compare June 1, 2025 14:24
Copy link
Collaborator
@BoboTiG BoboTiG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall 👍 💪

I see CI failures but tests passed, so I guess it is OK to merge when you are confident enough :)

setup.py Outdated
@@ -30,6 +30,7 @@
sources=[
"src/watchdog_fsevents.c",
],
include_dirs=[SRC_DIR],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this is needed.

@BoboTiG
Copy link
Collaborator
BoboTiG commented Jun 1, 2025

@CCP-Aporia would you like to have look maybe?

@CCP-Aporia
Copy link
Contributor

@CCP-Aporia would you like to have look maybe?

Unfortunately, I have not yet looked into free-threaded Python at all, so I don't know what exactly would be required there. That said, these changes look good on the surface, assuming that LOCK / UNLOCK are the recommended way forward, and that no thread-unsafe API calls were missed.

@ngoldbaum
Copy link

Hi all! I did the free-threaded port of NumPy and am going to review this PR. I work with @HaoZeke at Quansight Labs and am helping him with advice and code review but have not yet looked extensively at this code or this project.

I do have a lot of experience with working on making projects compatible with the free-threaded build. I hope you can accept my review as sufficient to be comfortable releasing free-threaded builds.

Copy link
@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach taken in this PR - using a global PyMutex lock that gets locked and unlocked whenever you need to mutate global dicts stored in the extension - is flawed. That's not giving you a similar level of protection to what the GIL provided before. In particular, the dicts might get modified while your locks aren't held.

Instead, I think what you want is to take a critical section on either thread_to_run_loop or watch_to_stream - depending on which is being mutated - for the entire duration of the function in which the dict is getting mutated.

Also it's not clear to me - are the existing tests being run using pytest-run-parallel on CI? Do they not fail because of the global state you're adding locking for? Do you have test cases that do fail now?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if the maintaners mind - but it might make sense to bring this in as a git submodule rather than directly vendoring a copy. The upstream project is updated reasonably often so IMO it's worth making it easy to review updates without laboriously checking that the vendored copy is identical to what is upstream. Making it easy to update the vendored dependency will also help with future updates to use other newer nicer C APIs that get added in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BoboTiG would you prefer a submodule or the vendored file?

@ngoldbaum
Copy link

Another thing to keep in mind with respect to using a mutex or a critical section is that a mutex can lead to deadlocks if you call into the C API. While PyMutex does have the property that on the GIL-enabled build you cannot deadlock with the GIL using a PyMutex - you can certainly cause deadlocks with other PyMutex instances or cause deadlocks by re-entrantly locking a single PyMutex.

Critical sections are reentrant and cannot deadlock with each other. Of course the trade-off of that is that a critical section may be suspended if you call into the CPython C API, so it doesn't provide as strong of protection as a mutex. On the other hand, this is a similar level of protection for locking internal state in an extension that the GIL provides in a multithreaded Python program running on the GIL-enabled interpreter.

If it's possible to refactor state so that it's not stored inside of a python object (e.g. instead of storing events directly in a dict, they're stored in e.g. a C++ hashmap, just as an example) you can use a mutex as long as you're careful to make sure you're not holding the mutex while you're calling into the CPython C API to avoid a chance of a deadlock. However, if the state is stored in a Python object, it's very difficult to reason about whether a call will or won't deadlock, since many Python operations may cause execution of arbitrary Python code (e.g. while an exception is getting raised).

All that said, as a first pass, IMO a minimal way forward here is to take critical sections on the global dicts in functions where you read from or write to them, as I suggested in my initial review comment.

Co-authored-by: ngoldbaum <ngoldbaum@users.noreply.github.com>
@BoboTiG
Copy link
Collaborator

Awesome @ngoldbaum, awesome! I am not really into the free-threading stuff, and my "expertize" is not at the C level, so your inputs with @HaoZeke & @AA-Turner are very valuable. Thanks a lot for the support!

and also thank you @CCP-Aporia for your input as well ;)

HaoZeke and others added 4 commits June 2, 2025 20:13
Co-authored-by: BoBoTiG <bobotig@noreply.github.com>
Co-authored-by: ngoldbaum <ngoldbaum@users.noreply.github.com>
@HaoZeke HaoZeke requested a review from ngoldbaum June 2, 2025 19:35
Copy link
@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I caught a few reference leaks - please go over this again for anything I might have missed. Reference counting bugs are tricky and it pays to go over C code managing references very carefully. Try to keep an eye out for early returns - you always need to manage resources before every return.

run_loop_ref = CFRunLoopGetCurrent();
value = PyCapsule_New(run_loop_ref, NULL, watchdog_pycapsule_destructor);
PyDict_SetItem(thread_to_run_loop, emitter_thread, value);
Py_INCREF(emitter_thread);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's always worked this way technically but I don't understand why there's an incref here but not the other cases.

}

G_RETURN_NULL_IF(PyErr_Occurred());

Py_INCREF(Py_None);
return Py_None;
Py_RETURN_NONE;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this has always worked this way I guess - but doesn't this leak a reference to value?

Copy link
Contributor Author
@HaoZeke HaoZeke Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Evidently not... I think its cleaned up elsewhere somehow. Placing a DECREF on value causes a segfault..

+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++
FAILED ([thread-unsafe]: uses thread-unsafe fixture(s): {'recwarn'})Process 58393 stopped
* thread #3, stop reason = EXC_BAD_ACCESS (code=1, address=0x7fcacc000100)
    frame #0: 0x0000000101541d00 libpython3.13t.dylib`_PyMem_MiFree + 32
libpython3.13t.dylib`_PyMem_MiFree:
->  0x101541d00 <+32>: movq   0x100(%rdi), %rdx
    0x101541d07 <+39>: subq   %rdi, %rsi
    0x101541d0a <+42>: shrq   $0x10, %rsi
    0x101541d0e <+46>: leaq   (%rsi,%rsi,2), %r8
Target 0: (python) stopped.
static PyObject *
watchdog_read_events(PyObject *self, PyObject *args)
{
    UNUSED(self);
    CFRunLoopRef run_loop_ref = NULL;
    PyObject *emitter_thread = NULL;
    PyObject *value = NULL;

    G_RETURN_NULL_IF_NOT(PyArg_ParseTuple(args, "O:loop", &emitter_thread));

// PyEval_InitThreads() does nothing as of Python 3.7 and is deprecated in 3.9.
// https://docs.python.org/3/c-api/init.html#c.PyEval_InitThreads
#if PY_VERSION_HEX < 0x030700f0
    PyEval_InitThreads();
#endif

    /* Allocate information and store thread state. */
    switch (PyDict_GetItemRef(thread_to_run_loop, emitter_thread, &value)){
        case -1:
            return NULL;
        case 0:
            run_loop_ref = CFRunLoopGetCurrent();
            value = PyCapsule_New(run_loop_ref, NULL, watchdog_pycapsule_destructor);
            if (PyDict_SetItem(thread_to_run_loop, emitter_thread, value) < 0) {
                Py_DECREF(value);
                return NULL;
            }
            break;
        default:
            break;
    }
    Py_INCREF(emitter_thread);

    /* No timeout, block until events. */
    Py_BEGIN_ALLOW_THREADS;
    CFRunLoopRun();
    Py_END_ALLOW_THREADS;

    /* Clean up state information. */
    if (PyDict_DelItem(thread_to_run_loop, emitter_thread) == 0)
    {
/* ADDED DECREF (cuz owned ref) */
        Py_DECREF(value);
/* END ADDITION */
        Py_DECREF(emitter_thread);
    }

    G_RETURN_NULL_IF(PyErr_Occurred());

    Py_RETURN_NONE;
}

case 0:
goto success;
default:
break;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't comment below but if the G_RETURN_NULL_IF macro includes a return statement then you also need to DECREF value if that happens.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0