8000 Optimize `Semaphore` by HollandDM · Pull Request #9662 · zio/zio · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Optimize Semaphore #9662

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: series/2.x
Choose a base branch
from
Open

Conversation

HollandDM
Copy link
@HollandDM HollandDM commented Mar 3, 2025

/claim #9093

This PR attempts to optimize ZIO Semaphore implementation by:

  • Switched from Promise to a dumb-ed down Waiter object, this eliminate Promise spin wait.
  • Implemented WaiterQueue, a simplified and customized version of Java's ConcurrentLinkedQueue, as the semaphore wait queue.

This PR yield some moderate boosts for ZIO's Semaphore, but it's still not catch up to the same level as Java's Semaphore yet. Further optimizations + discussions are welcomed

@CLAassistant
Copy link
CLAassistant commented Mar 3, 2025

CLA assistant check
All committers have signed the CLA.

@HollandDM
Copy link
Author

I also copied benchmark from #9093, here is the result:
jmh-result.json
There are positive gains in "zioFair" & "zioUnfair" variants, especially in high fibers cases, but for lower fibers cases the gains are low, and still not catch up with Java side

@HollandDM HollandDM marked this pull request as draft March 3, 2025 04:14
Copy link
algora-pbc bot commented Mar 3, 2025

💵 To receive payouts, sign up on Algora, link your Github account and connect with Stripe.

@HollandDM
Copy link
Author

The original benchmark is quite unfair for ZIO in a way that the Java version use try ... finally ..., which is more performance that ZIO.acquireReleaseWith. I've updated the benchmark to use the same acquireReleaseWith, and the differences reduces significantly. Maybe it is more about acquireReleaseWith overhead rather semaphore impl being bad.
I also add a case to show the benefit of ZIO (when number of fibers > number of cores). And ZIO cases outperform quite a lot in that situation.

New benchmark result here:
jmh-result.json

@HollandDM HollandDM force-pushed the update-semaphore branch 2 times, most recently from b861376 to 959f1d6 Compare March 3, 2025 14:15
* Creates a new `Semaphore`, baced by ConcurrentSemaphore with the specified
* number of permits, and fairness.
*/
def make2(permits: => Long, fair: => Boolean = false)(implicit trace: Trace): UIO[Semaphore] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just a separate makeUnfair constructor?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just follow Java signature here, can be updated if you want

@HollandDM HollandDM marked this pull request as ready for review March 4, 2025 13:31
@@ -32,7 +33,7 @@ import scala.collection.immutable.{Queue => ScalaQueue}
* If you need functionality that `Semaphore` doesnt' provide, use a
* [[TSemaphore]] and define it in a [[zio.stm.ZSTM]] transaction.
*/
sealed trait Semaphore extends Serializable {
private[zio] trait Semaphore extends Serializable {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private[zio] trait Semaphore extends Serializable {
sealed trait Semaphore extends Serializable {

in object Semaphore { private[zio] abstract class SemaphoreBase extends Semaphore


private final val unitReservation = Exit.succeed(ZeroReservation)

private final class Waiter(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a limitation of Promise that makes it unusable for this case? It seems like you are re-encoding a lot of behavior here.

Copy link
Author
@HollandDM HollandDM Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Promise has more overhead as it is design for common use cases. Waiter is a Promise that only allow 1 party to await. Because of that the implementation can be more specialized (we don't need as many compare and set loop for Waiter as for Promise).
The code inside Waiter are pretty much dead code, the real logics are handled by semaphore itself. I drafted the logic and forgot to remove it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that semaphore is low level enough to be implemented as low as possible and not based on a more generic structure like Promise


}

private[SemaphoreImpls] sealed trait WaiterQueue {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly concerned with adding a new Queue implementation if it's just used for one structure. Instead, can we consider shading jctools? It would be useful elsewhere - and likely more performant.

Copy link
Author
@HollandDM HollandDM Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This queue was not created for performance, but for the ability to reduce permit and poll waiter out atomically. I created this because I failed to think of a way to do it with 1 queue and 1 atomic var separately.
About performance, this is only as fast as ConcurrentLinkedQueue. I'd love to use jctools too if I figured out how to do it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you shade the dependency, you can bring it in.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I create this queue because I cant think of a way to reduce permit and poll in an atomic step, not because I dont want to use existing queue implementations

private final val unitReservation = Exit.succeed(ZeroReservation)

private final class Waiter(
val blockingOn: () => FiberId,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need to be a thunk.

}
}

private def await(reservation: Reservation)(implicit trace: Trace): UIO[Unit] = ZIO.suspendSucceed {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is always used within a suspension, you can name it awaitUnsafe, then remove the suspendSuceed

@HollandDM HollandDM force-pushed the update-semaphore branch 2 times, most recently from 0e116dc to 87ab9fa Compare March 5, 2025 14:39
@HollandDM
Copy link
Author

This is ready for review

@HollandDM
Copy link
Author

@kyri-petrou sorry to ping you directly, but this PR had been sitting here for a while now, so I want to know what we gonna do about this. I'm fine with either continue or close this PR

@guizmaii
Copy link
Member
guizmaii commented Jun 4, 2025

@HollandDM Can you resolve the conflict, please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0