-
Notifications
You must be signed in to change notification settings - Fork 558
Add a parallelism abstraction library #8880
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
if constexpr (buildPackageDB) { | ||
if (gs.packageDB().enforceLayering()) { | ||
ComputePackageSCCs::run(gs); | ||
} | ||
} | ||
|
||
workers.multiplexJobWait("rewritePackagesAndFiles", [&gs = as_const(gs), &files, taskq]() { | ||
Timer timeit(gs.tracer(), "packager.rewritePackagesAndFilesWorker"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sad that we lose the ability to add timings for individual workers, but maybe it's not really valuable to know how long the workers take, and what you really care about would be accomplished with:
{
Timer timeit(gs.tracer, "work");
workers.multiplexJobWait(...);
}
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree with that. We can always put timers into the body of the worker to know how long the individual iterations take, and use the span around the whole job as a reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait so do we lose anything?
Could we not have kept the behavior by putting the Workers
timer in the body?
I'm trying to understand the change in what's not possible with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Putting it in the body means we'll see many small spans on the thread when viewing the trace, instead of one long span that covers the duration of the work. I think this is an okay tradeoff, as we can wrap one big span around the use of iterate
, and then see spans for work items on the threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, yeah I don't think I care too much about that.
6de25a0
to
20fb82f
Compare
We have a few common patterns for parallelism in Sorbet, and it would be nice to simplify their use. This PR introduces the
Parallel
class, and a first implementation of one of these patterns:iterate
. This template wraps up a common use ofWorkerPool::multiplexJobWait
where we apply a function to the contents of a vector in parallel, but with some additional guards that avoid the pool for sufficiently small inputs.Initial tracing looks good, in that I don't see the visibility checker occasionally spiking in duration on the fast path.
Motivation
I've been seeing strange overhead from using concurrency on small inputs that makes me think that we should avoid it when possible. This shows up commonly on the fast path in LSP, where single file edits are the norm.
Test plan
Existing tests.