mempool: Handle concurrent requests in recheck callback #895
Conversation
Handling ABCI calls in order at the client is mandatory; if not, then … In my view, the gRPC ABCI client does provide the desired in-order semantics. I would not add more concurrency to … Nevertheless, the idea of removing the cursor is, I believe, a nice one, because it does simplify the codebase :)
It seems that the ABCI Flush method is not relevant at all (see tendermint/tendermint#6994). While simplifying the client interface for v0.36, it was proposed for removal (see comments in tendermint/tendermint#7607), which ultimately never happened. The ABCI client interface has a comment saying that this method should be removed, as it is not implemented. Even the SDK does not implement a handler for Flush requests.
I didn't go thoroughly through the patch, but have some comments:
You're right. I removed the label.
This is what the spec says:
I would say that only the locking part is needed to ensure that no new messages will be received. Is this correct?
I believe that this is needed; otherwise we might end up with two concurrent remove calls for the same transaction on the … To avoid the above problem, I am advocating taking locks in the callbacks. However, this poses a problem because the local client hangs: the local client is synchronous, and locks are not re-entrant in Go. We should change this pattern because this client is not doing what it is supposed to. However, as the SDK is using it, I wonder how tractable such a change is.
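To make the hanging concrete, here is a minimal sketch (the `client` type and `checkTx` method are hypothetical, not the actual code) showing why taking locks inside callbacks deadlocks with a synchronous local client: Go's `sync.Mutex` is not re-entrant, so a callback running on the same goroutine cannot re-acquire the lock its caller already holds. `TryLock` makes this observable without actually hanging.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical stand-in for a synchronous (local) ABCI client
// whose methods hold a mutex while invoking user callbacks.
type client struct{ mu sync.Mutex }

// checkTx holds the lock and runs the callback on the same goroutine,
// as a synchronous local client would.
func (c *client) checkTx(cb func()) {
	c.mu.Lock()
	defer c.mu.Unlock()
	cb()
}

// reacquireWhileHeld reports whether the callback can take the client's
// lock while checkTx still holds it. A plain Lock() here would block
// forever; TryLock demonstrates the non-reentrancy safely.
func reacquireWhileHeld() bool {
	c := &client{}
	var ok bool
	c.checkTx(func() {
		ok = c.mu.TryLock() // false: the same goroutine already holds mu
	})
	return ok
}

func main() {
	fmt.Println("re-acquire while held:", reacquireWhileHeld())
}
```

With a remote (asynchronous) client the callback runs on a different goroutine after the method has returned, so the same locking pattern is safe there.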
A bad interleaving that explains why we need …
Just a few more thoughts about improving parallelism by allowing concurrent CheckTx. It seems that there is no easy way to do this. The knot of the problem is this strange pattern in the mempool, where one passes a callback to another callback. Maybe the best approach would be to have a re-entrant read/write lock for the … All in all, I am not sure this is a priority, given the speed of the current implementation: around 60µs to check a tx with the remote client.
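The "callback passed to another callback" pattern mentioned above can be sketched as follows. This is an illustrative simplification (the `abciClient` type and method names are hypothetical): the client keeps its own client-level response callback, which wraps and invokes the per-request callback supplied by the mempool, so any lock taken around the outer call is still held when the inner callback runs with a synchronous client.

```go
package main

import "fmt"

// response is a simplified stand-in for an ABCI CheckTx response.
type response struct{ code int }

// abciClient sketches a client with a client-level response callback.
type abciClient struct{ onResponse func(response) }

// checkTxAsync registers a per-request callback by wrapping it in the
// client-level one: a callback passed to another callback.
func (c *abciClient) checkTxAsync(tx string, done func(response)) {
	c.onResponse = func(r response) {
		fmt.Println("client-level callback for", tx)
		done(r) // the caller's callback runs nested inside the client's
	}
	// Simulate the ABCI response arriving; with a local client this
	// happens synchronously, on the caller's own goroutine.
	c.onResponse(response{code: 0})
}

func main() {
	c := &abciClient{}
	c.checkTxAsync("tx-a", func(r response) {
		fmt.Println("done, code =", r.code)
	})
}
```

Untangling this nesting (or replacing it with a single, flat result callback) is what would make per-transaction locking, and hence concurrent CheckTx, tractable.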
This has never been thoroughly discussed. Maybe we can include it in today's meeting? |
@otrack Here's an idea to untangle this knot. The callback …
@hvanz very interesting 👍 any plans to resume this?
Closing as it is currently not an option to break the FIFO ordering when checking transactions. |
Addresses tendermint/tendermint#5519
Also, this PR simplifies the implementation of the recheck callback by eliminating the global pointers `recheckCursor` and `recheckEnd`, which were used to traverse the list of transactions while processing recheck requests. These variables were also used in other parts of the code to decide whether there were still unprocessed recheck requests. This PR replaces them with the existing `txsMap`, which maps tx keys to `txs` entries, and with a new counter variable that keeps track of how many requests are still to be processed.

Previously, it was assumed that re-CheckTx requests were processed and handled sequentially, but this is not true for gRPC ABCI clients (see comments in tendermint/tendermint#5519).
Built on top of #894
PR checklist

- Changelog entry added in `.changelog` (we use unclog to manage our changelog)
- Updated relevant documentation (`docs/` or `spec/`) and code comments