-
-
Notifications
You must be signed in to change notification settings - Fork 718
ENH: Free threading changes for FSEvents #1109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
At commit 0a8b2c56331a31d7f7096faaa1c1c26467b51c15
5d6fa89
to
534ab3e
Compare
Sorry, I killed the wrong job when doing some clean-up. I relaunched them. |
@@ -912,5 +949,8 @@ PyInit__watchdog_fsevents(void){ | |||
} | |||
watchdog_module_add_attributes(module); | |||
watchdog_module_init(); | |||
#ifdef Py_GIL_DISABLED | |||
PyUnstable_Module_SetGIL(module, Py_MOD_GIL_NOT_USED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just read https://discuss.python.org/t/concerns-about-pyunstable-module-setgil/89374. Is it a temporary solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to add multi-phase init, but maybe in a separate PR? FWIW I don't think single phase init extension modules are going away anytime soon and it is largely orthogonal to the free threading discussion (except that it might make sense to do a one-shot upgrade).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am in favor of doing a one-shot upgrade, and it can be split in as many PR as needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a number of open PRs from @AA-Turner (hope you don't mind the ping - seemed relevant to your interests!) on NumPy that do the necessary refactor to multi-phase init. Maybe take a look there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Nathan.
I don't think single phase init extension modules are going away anytime soon
This is true, but we are starting to seriously consider doing so. Note that the mechanism used in this PR is part of the Unstable C API. We recommend wherever possible using multi-phase init, with the Py_mod_gil
slot.
Note that maybe could you disable Linux+Windows on the CI until you have a working Mac code? |
Sorry about that, done, the CI bug seems to be very intermittent, probably a missing lock issue.. |
f0d62b9
to
59ab887
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall 👍 💪
I see CI failures but tests passed, so I guess it is OK to merge when you are confident enough :)
setup.py
Outdated
@@ -30,6 +30,7 @@ | |||
sources=[ | |||
"src/watchdog_fsevents.c", | |||
], | |||
include_dirs=[SRC_DIR], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is needed.
@CCP-Aporia would you like to have look maybe? |
Unfortunately, I have not yet looked into free-threaded Python at all, so I don't know what exactly would be required there. That said, these changes look good on the surface, assuming that |
Hi all! I did the free-threaded port of NumPy and am going to review this PR. I work with @HaoZeke at Quansight Labs and am helping him with advice and code review but have not yet looked extensively at this code or this project. I do have a lot of experience with working on making projects compatible with the free-threaded build. I hope you can accept my review as sufficient to be comfortable releasing free-threaded builds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the approach taken in this PR - using a global PyMutex
lock that gets locked and unlocked whenever you need to mutate global dicts stored in the extension - is flawed. That's not giving you a similar level of protection to what the GIL provided before. In particular, the dicts might get modified while your locks aren't held.
Instead, I think what you want is to take a critical section on either thread_to_run_loop
or watch_to_stream
- depending on which is being mutated - for the entire duration of the function in which the dict is getting mutated.
Also it's not clear to me - are the existing tests being run using pytest-run-parallel on CI? Do they not fail because of the global state you're adding locking for? Do you have test cases that do fail now?
src/pythoncapi_compat.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if the maintaners mind - but it might make sense to bring this in as a git submodule rather than directly vendoring a copy. The upstream project is updated reasonably often so IMO it's worth making it easy to review updates without laboriously checking that the vendored copy is identical to what is upstream. Making it easy to update the vendored dependency will also help with future updates to use other newer nicer C APIs that get added in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BoboTiG would you prefer a submodule or the vendored file?
Another thing to keep in mind with respect to using a mutex or a critical section is that a mutex can lead to deadlocks if you call into the C API. While Critical sections are reentrant and cannot deadlock with each other. Of course the trade-off of that is that a critical section may be suspended if you call into the CPython C API, so it doesn't provide as strong of protection as a mutex. On the other hand, this is a similar level of protection for locking internal state in an extension that the GIL provides in a multithreaded Python program running on the GIL-enabled interpreter. If it's possible to refactor state so that it's not stored inside of a python object (e.g. instead of storing events directly in a dict, they're stored in e.g. a C++ hashmap, just as an example) you can use a mutex as long as you're careful to make sure you're not holding the mutex while you're calling into the CPython C API to avoid a chance of a deadlock. However, if the state is stored in a Python object, it's very difficult to reason about whether a call will or won't deadlock, since many Python operations may cause execution of arbitrary Python code (e.g. while an exception is getting raised). All that said, as a first pass, IMO a minimal way forward here is to take critical sections on the global dicts in functions where you read from or write to them, as I suggested in my initial review comment. |
Co-authored-by: ngoldbaum <ngoldbaum@users.noreply.github.com>
Awesome @ngoldbaum, awesome! I am not really into the free-threading stuff, and my "expertize" is not at the C level, so your inputs with @HaoZeke & @AA-Turner are very valuable. Thanks a lot for the support! and also thank you @CCP-Aporia for your input as well ;) |
Co-authored-by: BoBoTiG <bobotig@noreply.github.com>
Co-authored-by: ngoldbaum <ngoldbaum@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I caught a few reference leaks - please go over this again for anything I might have missed. Reference counting bugs are tricky and it pays to go over C code managing references very carefully. Try to keep an eye out for early returns - you always need to manage resources before every return
.
run_loop_ref = CFRunLoopGetCurrent(); | ||
value = PyCapsule_New(run_loop_ref, NULL, watchdog_pycapsule_destructor); | ||
PyDict_SetItem(thread_to_run_loop, emitter_thread, value); | ||
Py_INCREF(emitter_thread); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's always worked this way technically but I don't understand why there's an incref here but not the other cases.
} | ||
|
||
G_RETURN_NULL_IF(PyErr_Occurred()); | ||
|
||
Py_INCREF(Py_None); | ||
return Py_None; | ||
Py_RETURN_NONE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this has always worked this way I guess - but doesn't this leak a reference to value
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Evidently not... I think its cleaned up elsewhere somehow. Placing a DECREF on value causes a segfault..
+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++
FAILED ([thread-unsafe]: uses thread-unsafe fixture(s): {'recwarn'})Process 58393 stopped
* thread #3, stop reason = EXC_BAD_ACCESS (code=1, address=0x7fcacc000100)
frame #0: 0x0000000101541d00 libpython3.13t.dylib`_PyMem_MiFree + 32
libpython3.13t.dylib`_PyMem_MiFree:
-> 0x101541d00 <+32>: movq 0x100(%rdi), %rdx
0x101541d07 <+39>: subq %rdi, %rsi
0x101541d0a <+42>: shrq $0x10, %rsi
0x101541d0e <+46>: leaq (%rsi,%rsi,2), %r8
Target 0: (python) stopped.
static PyObject *
watchdog_read_events(PyObject *self, PyObject *args)
{
UNUSED(self);
CFRunLoopRef run_loop_ref = NULL;
PyObject *emitter_thread = NULL;
PyObject *value = NULL;
G_RETURN_NULL_IF_NOT(PyArg_ParseTuple(args, "O:loop", &emitter_thread));
// PyEval_InitThreads() does nothing as of Python 3.7 and is deprecated in 3.9.
// https://docs.python.org/3/c-api/init.html#c.PyEval_InitThreads
#if PY_VERSION_HEX < 0x030700f0
PyEval_InitThreads();
#endif
/* Allocate information and store thread state. */
switch (PyDict_GetItemRef(thread_to_run_loop, emitter_thread, &value)){
case -1:
return NULL;
case 0:
run_loop_ref = CFRunLoopGetCurrent();
value = PyCapsule_New(run_loop_ref, NULL, watchdog_pycapsule_destructor);
if (PyDict_SetItem(thread_to_run_loop, emitter_thread, value) < 0) {
Py_DECREF(value);
return NULL;
}
break;
default:
break;
}
Py_INCREF(emitter_thread);
/* No timeout, block until events. */
Py_BEGIN_ALLOW_THREADS;
CFRunLoopRun();
Py_END_ALLOW_THREADS;
/* Clean up state information. */
if (PyDict_DelItem(thread_to_run_loop, emitter_thread) == 0)
{
/* ADDED DECREF (cuz owned ref) */
Py_DECREF(value);
/* END ADDITION */
Py_DECREF(emitter_thread);
}
G_RETURN_NULL_IF(PyErr_Occurred());
Py_RETURN_NONE;
}
case 0: | ||
goto success; | ||
default: | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't comment below but if the G_RETURN_NULL_IF
macro includes a return
statement then you also need to DECREF
value
if that happens.
Wraps up #1105.
Unfortunately I haven't been able to test this on a Mac, so at this point it is purely speculative.Managed to test it on a local machine, and everything seems to pass without errors..Basically follows the porting extensions advice:
pythoncapi-compat
GetItem
calls to useGetItemRef
(xref C-API advice on borrowed references) and adjusted the refcount accordinglythread_to_run_loop
andwatch_to_stream