-
Notifications
You must be signed in to change notification settings - Fork 104
cpu_relax
primitive
#4226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: domain-rt4
Are you sure you want to change the base?
cpu_relax
primitive
#4226
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a huge fan of the naming in the backend (Relax
rather than Cpu_relax
makes me think of branch relaxation before it makes me think of pause/yield/etc), but lgtm generally.
backend/cfg/cfg_cse.ml
Outdated
| Op Opaque -> | ||
(* Assume arbitrary side effects from Iopaque *) | ||
| Op (Opaque | Relax) -> | ||
(* Assume arbitrary side effects from Opaque / Relax *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this right? (I'm honestly not sure either way)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not entirely sure either, I went with the most conservative option since we don't want loads/stores reordered across it. But maybe atomic load/stores will already not be reordered? cc @xclerc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed this to only kill loads
| Ppoll -> | ||
Some alloc_heap | ||
| Ppoll -> Some alloc_heap | ||
| Pcpu_relax -> if Config.poll_insertion then None else Some alloc_heap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly Pcpu_relax
should match Ppoll
in the non-poll-insertion case, but I don't understand why either of them is marked as Some alloc_heap
. What's this function used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears to only be used for checking whether a primitive stack-allocates, so the heap cases don't matter. There's also a CR @mshinwell saying to check this case, so it could probably be removed (in another pr)
Do you know why the compiler seems not to have managed to CSE the load of the immutable field |
I didn't call it |
I checked treating |
Makes
Domain.cpu_relax
a primitive instead of a runtime call.Crelax
or a sequence ofCrelax
andCpoll
, depending onConfig.poll_insertion
.Crelax
is treated likeCopaque
that returns void.caml_ml_domain_cpu_relax
, which now also checks pending actions if poll insertion is disabled.This changes the behavior of
cpu_relax
, ascaml_ml_domain_cpu_relax
was previously implemented as a pause instruction plus ainterrupt_pending
check, ignoring pending actions or external interrupts. As far as I can tell, the only way to setinterrupt_pending
also callsinterrupt_domain
, which will trigger the target domain'syoung_limit
check upon the next poll point, so polling should be sufficient.I've added tests that
cpu_relax
can be used for spin-blocking between systhreads in the initial domain and on another domain.The assembly output for the following function is the same for runtime4, runtime5, and runtime5 with poll insertion:
Runtime4
Runtime5 (without poll insertion)
Runtime5 (with poll insertion)