net: add NetEventsInterface::g_msgproc_mutex by ajtowns · Pull Request #26036 · bitcoin/bitcoin · GitHub

net: add NetEventsInterface::g_msgproc_mutex #26036


Merged
merged 5 commits into bitcoin:master from the 202209-msgproc-mutex branch on Sep 20, 2022

Conversation

Contributor
@ajtowns ajtowns commented Sep 7, 2022

There are many cases where we assume message processing is single-threaded in order for our accesses to node-related memory to be safe. Add an explicit mutex that we can use to document this, which allows the compiler to catch any cases where we try to access that memory from other threads and break that assumption.

@fanquake fanquake added the P2P label Sep 7, 2022
@ajtowns
Contributor Author
ajtowns commented Sep 7, 2022

Resurrected from #24474.

The main purpose of this PR is to document the variables we assume are only accessed from one thread, and hence don't bother to grab a mutex before accessing. Since these annotations are checked automatically by clang in CI, this should help avoid possible bugs in the future.
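For context, this is roughly what those annotations look like (a minimal sketch using Bitcoin Core's thread-safety macros; the member shown is illustrative rather than the exact diff):

```cpp
// Sketch only. With clang's -Wthread-safety, any access to an annotated
// member is rejected unless the compiler can prove g_msgproc_mutex is held.
struct Peer {
    //! Only ever touched from the message processing thread.
    std::vector<CAddress> m_addrs_to_send GUARDED_BY(NetEventsInterface::g_msgproc_mutex);
};

// Callers must statically prove the msgproc mutex is held, i.e. that they
// are (transitively) called from the message processing thread.
void MaybeSendAddr(CNode& node, Peer& peer, std::chrono::microseconds current_time)
    EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex);
```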

To do this, this PR replaces the per-peer CNode::cs_sendProcessing (which was never actually used as a guard annotation for any variables, was never locked from any thread other than the message processing thread, and could only have been used to guard other members of the CNode class) with a single, global NetEventsInterface::g_msgproc_mutex that is locked once from the message processing thread and released only when the thread exits.
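In other words, the intended pattern is approximately the following (abbreviated sketch, not the exact diff): the message handler thread acquires the mutex once, and everything it calls from that loop then satisfies the static annotations for free.

```cpp
// Abbreviated sketch of CConnman::ThreadMessageHandler() under this PR.
void CConnman::ThreadMessageHandler()
{
    LOCK(NetEventsInterface::g_msgproc_mutex); // held until the thread exits

    while (!flagInterruptMsgProc) {
        std::vector<CNode*> snapshot{/* ref-counted copy of m_nodes */};
        for (CNode* pnode : snapshot) {
            if (pnode->fDisconnect) continue;
            m_msgproc->ProcessMessages(pnode, flagInterruptMsgProc);
            m_msgproc->SendMessages(pnode);
        }
        // ... wait until there is more work to do ...
    }
}
```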

It would also be possible to have a "dummy" lockable type and use that to document these variables, rather than an actual mutex. I've done an example of that sort of approach at #25174 (comment), but for this case, where the lock is only ever acquired once at startup and has no possible contention, the overhead seems so minimal as to not be worth worrying about.
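For reference, that alternative would look something like this (hypothetical names, using the macros from threadsafety.h):

```cpp
// Hypothetical sketch of a "dummy lockable" capability: it exists only for
// the annotations, and acquiring/releasing it compiles to nothing.
class LOCKABLE MsgProcCapability
{
public:
    void Acquire() EXCLUSIVE_LOCK_FUNCTION() {}
    void Release() UNLOCK_FUNCTION() {}
};

extern MsgProcCapability g_msgproc_thread;

// Members could then be annotated GUARDED_BY(g_msgproc_thread) exactly as
// with a real Mutex, without paying for an actual lock.
```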

In the long term, simplifying the locking scheme, and perhaps allowing more things to happen in parallel, are obviously still worthwhile; this PR doesn't mean to imply that these things should only be used from a single thread in the future, rather it's just documenting that they are right now.

See also #25454 (review) and #25557 (comment).

@DrahtBot
Contributor
DrahtBot commented Sep 8, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25597 (refactor: replace RecursiveMutex cs_sendProcessing with Mutex by theStack)
  • #25572 (refactor: Introduce EvictionManager and use it for the inbound eviction logic by dergoegge)
  • #24545 (BIP324: Enable v2 P2P encrypted transport by dhruv)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Member
@jonatack jonatack left a comment

Approach ACK. Reviewed; each commit debug-builds cleanly (clang 13) without thread safety errors, and I verified that removing some of the added locks in the first commit does cause thread safety errors. Ran bitcoind with the suggested runtime lock assertions.
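(The runtime lock assertions referred to are along these lines, sketched here on one of the functions from the diff; the compile-time annotation on the declaration is complemented by a runtime check at the top of the body:)

```cpp
// Sketch only: assert at runtime, in addition to the static annotation,
// that the msgproc mutex really is held when the function is entered.
void PeerManagerImpl::MaybeSendSendHeaders(CNode& node, Peer& peer)
{
    AssertLockHeld(NetEventsInterface::g_msgproc_mutex);
    // ... existing body unchanged ...
}
```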

Contributor
@vasild vasild left a comment

ACK f55747f

Are you planning to actually make the message processing multi-threaded?

@ajtowns ajtowns force-pushed the 202209-msgproc-mutex branch from f55747f to 22e7c46 Compare September 13, 2022 02:35
@ajtowns
Contributor Author
ajtowns commented Sep 13, 2022

Are you planning to actually make the message processing multi-threaded?

The original point of all this was just to be able to make g_cs_orphans private to TxOrphanage without reducing guard annotations (#21527 and vExtraTxn*).

I don't think multiple net_processing threads is likely to help much -- I think blocking mostly happens due to cs_main getting locked which would just cause multiple threads to stall if we had them, rather than some node requiring a lot of local processing that holds up some other node's local processing. Having either reader-writer locks for the mempool and cs_main or switching to a RCU model might help there, but is probably a lot of work. Either way, it seems helpful in the meantime to make it clearer how we protect things from race conditions.

@ajtowns ajtowns changed the title net: add NetEventsInterface::g_mutex_msgproc_thread mutex net: add NetEventsInterface::g_msgproc_mutex Sep 13, 2022
Contributor
@vasild vasild left a comment

ACK 22e7c46

A missing "by" in the commit message of net: drop cs_sendProcessing:

SendMessages() is now protected g_msgproc_mutex;

@jonatack
Member

Any reason #26036 (comment) wasn't addressed in the last push? It would be helpful to know if the thread safety guidelines in the developer notes should be updated.

Member
@maflcko maflcko left a comment

review ACK 22e7c46 🍡


@@ -5099,7 +5099,7 @@ void PeerManagerImpl::MaybeSendAddr(CNode& node, Peer& peer, std::chrono::micros

     // Remove addr records that the peer already knows about, and add new
     // addrs to the m_addr_known filter on the same pass.
-    auto addr_already_known = [&peer](const CAddress& addr) {
+    auto addr_already_known = [&peer](const CAddress& addr) EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex) {
Member

db5c65b: Any reason to add the namespace bloat here, but not elsewhere? It seems unlikely that there is another symbol with the same name in the global or another namespace. For that reason I'd also prefer to just make it a global in the global namespace to also de-bloat the other places.

diff --git a/src/net_processing.cpp b/src/net_processing.cpp
index 594141ca84..d9a1124b64 100644
--- a/src/net_processing.cpp
+++ b/src/net_processing.cpp
@@ -537,7 +537,7 @@ public:
 
 private:
     /** Consider evicting an outbound peer based on the amount of time they've been behind our tip */
-    void ConsiderEviction(CNode& pto, Peer& peer, std::chrono::seconds time_in_seconds) EXCLUSIVE_LOCKS_REQUIRED(cs_main, NetEventsInterface::g_msgproc_mutex);
+    void ConsiderEviction(CNode& pto, Peer& peer, std::chrono::seconds time_in_seconds) EXCLUSIVE_LOCKS_REQUIRED(cs_main, g_msgproc_mutex);
 
     /** If we have extra outbound peers, try to disconnect the one with the oldest block announcement */
     void EvictExtraOutboundPeers(std::chrono::seconds now) EXCLUSIVE_LOCKS_REQUIRED(cs_main);
@@ -601,7 +601,7 @@ private:
     void ProcessHeadersMessage(CNode& pfrom, Peer& peer,
                                std::vector<CBlockHeader>&& headers,
                                bool via_compact_block)
-        EXCLUSIVE_LOCKS_REQUIRED(!m_peer_mutex, !m_headers_presync_mutex, NetEventsInterface::g_msgproc_mutex);
+        EXCLUSIVE_LOCKS_REQUIRED(!m_peer_mutex, !m_headers_presync_mutex, g_msgproc_mutex);
     /** Various helpers for headers processing, invoked by ProcessHeadersMessage() */
     /** Return true if headers are continuous and have valid proof-of-work (DoS points assigned on failure) */
     bool CheckHeadersPoW(const std::vector<CBlockHeader>& headers, const Consensus::Params& consensusParams, Peer& peer);
@@ -610,7 +610,7 @@ private:
     /** Deal with state tracking and headers sync for peers that send the
      * occasional non-connecting header (this can happen due to BIP 130 headers
      * announcements for blocks interacting with the 2hr (MAX_FUTURE_BLOCK_TIME) rule). */
-    void HandleFewUnconnectingHeaders(CNode& pfrom, Peer& peer, const std::vector<CBlockHeader>& headers) EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex);
+    void HandleFewUnconnectingHeaders(CNode& pfrom, Peer& peer, const std::vector<CBlockHeader>& headers) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex);
     /** Return true if the headers connect to each other, false otherwise */
     bool CheckHeadersAreContinuous(const std::vector<CBlockHeader>& headers) const;
     /** Try to continue a low-work headers sync that has already begun.
@@ -633,7 +633,7 @@ private:
      */
     bool IsContinuationOfLowWorkHeadersSync(Peer& peer, CNode& pfrom,
             std::vector<CBlockHeader>& headers)
-        EXCLUSIVE_LOCKS_REQUIRED(peer.m_headers_sync_mutex, !m_headers_presync_mutex, NetEventsInterface::g_msgproc_mutex);
+        EXCLUSIVE_LOCKS_REQUIRED(peer.m_headers_sync_mutex, !m_headers_presync_mutex, g_msgproc_mutex);
     /** Check work on a headers chain to be processed, and if insufficient,
      * initiate our anti-DoS headers sync mechanism.
      *
@@ -649,7 +649,7 @@ private:
     bool TryLowWorkHeadersSync(Peer& peer, CNode& pfrom,
                                   const CBlockIndex* chain_start_header,
                                   std::vector<CBlockHeader>& headers)
-        EXCLUSIVE_LOCKS_REQUIRED(!peer.m_headers_sync_mutex, !m_peer_mutex, !m_headers_presync_mutex, NetEventsInterface::g_msgproc_mutex);
+        EXCLUSIVE_LOCKS_REQUIRED(!peer.m_headers_sync_mutex, !m_peer_mutex, !m_headers_presync_mutex, g_msgproc_mutex);
 
     /** Return true if the given header is an ancestor of
      *  m_chainman.m_best_header or our current tip */
@@ -659,7 +659,7 @@ private:
      * We don't issue a getheaders message if we have a recent one outstanding.
      * This returns true if a getheaders is actually sent, and false otherwise.
      */
-    bool MaybeSendGetHeaders(CNode& pfrom, const CBlockLocator& locator, Peer& peer) EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex);
+    bool MaybeSendGetHeaders(CNode& pfrom, const CBlockLocator& locator, Peer& peer) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex);
     /** Potentially fetch blocks from this peer upon receipt of a new headers tip */
     void HeadersDirectFetchBlocks(CNode& pfrom, const Peer& peer, const CBlockIndex* pindexLast);
     /** Update peer state based on received headers message */
@@ -683,10 +683,10 @@ private:
     void MaybeSendPing(CNode& node_to, Peer& peer, std::chrono::microseconds now);
 
     /** Send `addr` messages on a regular schedule. */
-    void MaybeSendAddr(CNode& node, Peer& peer, std::chrono::microseconds current_time) EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex);
+    void MaybeSendAddr(CNode& node, Peer& peer, std::chrono::microseconds current_time) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex);
 
     /** Send a single `sendheaders` message, after we have completed headers sync with a peer. */
-    void MaybeSendSendHeaders(CNode& node, Peer& peer) EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex);
+    void MaybeSendSendHeaders(CNode& node, Peer& peer) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex);
 
     /** Relay (gossip) an address to a few randomly chosen nodes.
      *
@@ -695,10 +695,10 @@ private:
      * @param[in] fReachable   Whether the address' network is reachable. We relay unreachable
      *                         addresses less.
      */
-    void RelayAddress(NodeId originator, const CAddress& addr, bool fReachable) EXCLUSIVE_LOCKS_REQUIRED(!m_peer_mutex, NetEventsInterface::g_msgproc_mutex);
+    void RelayAddress(NodeId originator, const CAddress& addr, bool fReachable) EXCLUSIVE_LOCKS_REQUIRED(!m_peer_mutex, g_msgproc_mutex);
 
     /** Send `feefilter` message. */
-    void MaybeSendFeefilter(CNode& node, Peer& peer, std::chrono::microseconds current_time) EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex);
+    void MaybeSendFeefilter(CNode& node, Peer& peer, std::chrono::microseconds current_time) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex);
 
     const CChainParams& m_chainparams;
     CConnman& m_connman;
@@ -1010,7 +1010,7 @@ private:
      *  @return   True if address relay is enabled with peer
      *            False if address relay is disallowed
      */
-    bool SetupAddressRelay(const CNode& node, Peer& peer) EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex);
+    bool SetupAddressRelay(const CNode& node, Peer& peer) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex);
 };
 
 const CNodeState* PeerManagerImpl::State(NodeId pnode) const EXCLUSIVE_LOCKS_REQUIRED(cs_main)
@@ -5099,7 +5099,7 @@ void PeerManagerImpl::MaybeSendAddr(CNode& node, Peer& peer, std::chrono::micros
 
     // Remove addr records that the peer already knows about, and add new
     // addrs to the m_addr_known filter on the same pass.
-    auto addr_already_known = [&peer](const CAddress& addr) EXCLUSIVE_LOCKS_REQUIRED(NetEventsInterface::g_msgproc_mutex) {
+    auto addr_already_known = [&peer](const CAddress& addr) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex) {
         bool ret = peer.m_addr_known->contains(addr.GetKey());
         if (!ret) peer.m_addr_known->insert(addr.GetKey());
         return ret;

Contributor Author

Ah, didn't cross my mind that it wasn't necessary here. m_last_block_inv_triggering_headers_sync also doesn't need the class name. I've left it in the class since that class essentially defines "message processing" so it seems more appropriate than not.

Member

g_msgproc_mutex already clarifies that this is for message processing, so still think the namespace/class can be dropped

Contributor

Then, should it be GlobalMutex? I had some plans to eradicate that one.

Contributor Author

If it were a normal global rather than a static member of the class it would need to be GlobalMutex (otherwise you get an error that CConnman::ThreadMessageHandler() needs to be annotated with EXCLUSIVE_LOCKS_REQUIRED(!g_msgproc_mutex), and probably similar in the tests). With the lock being a class member, clang assumes functions outside of the class aren't called with the lock held.

Contributor

Yes, I mean that if "the namespace/class is dropped" then it becomes a normal global and thus its type must be changed.
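A rough illustration of the three options being weighed here (hypothetical snippets, for comparison only):

```cpp
// (a) Plain global Mutex: clang's negative-capability checking then requires
//     callers such as CConnman::ThreadMessageHandler() to be annotated with
//     EXCLUSIVE_LOCKS_REQUIRED(!g_msgproc_mutex), which spreads widely.
Mutex g_msgproc_mutex_plain;

// (b) GlobalMutex: exempt from the negative-annotation requirement, at the
//     cost of weaker "not currently held" checking.
GlobalMutex g_msgproc_mutex_global;

// (c) What this PR does: a static member of NetEventsInterface, so the
//     negative requirement only applies inside that class hierarchy.
class NetEventsInterface
{
public:
    static Mutex g_msgproc_mutex;
    // ... ProcessMessages()/SendMessages() interface ...
};
```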

There are many cases where we assume message processing is
single-threaded in order for our accesses to node-related memory to be
safe. Add an explicit mutex that we can use to document this, which allows
the compiler to catch any cases where we try to access that memory from
other threads and break that assumption.
SendMessages() is now protected by g_msgproc_mutex, so this additional
per-node mutex is redundant.
@ajtowns ajtowns force-pushed the 202209-msgproc-mutex branch from 22e7c46 to 60682dc Compare September 15, 2022 04:51
@maflcko
Member
maflcko commented Sep 15, 2022

review ACK 60682dc 🤞


@ajtowns ajtowns force-pushed the 202209-msgproc-mutex branch from 60682dc to d575a67 Compare September 15, 2022 10:45
Contributor
@w0xlt w0xlt left a comment

ACK d575a67

Contributor
@vasild vasild left a comment

ACK d575a67 modulo the missing runtime checks

It would be really nice not to violate the developer notes, or at least, if we occasionally do, for it to be for a better reason than the one given in #26036 (comment).

@maflcko
Member
maflcko commented Sep 15, 2022

review ACK d575a67 📽


@dergoegge
Member

Code review ACK d575a67

There are a couple of CNodeState members that could also be annotated as guarded by g_msgproc_mutex; see the last couple of commits here: https://github.com/dergoegge/bitcoin/commits/202209-msgproc-mutex . Feel free to cherry-pick them, otherwise I can also open a follow-up. (Those CNodeState members should probably move to Peer as well.)
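(As a hypothetical illustration of that follow-up, with a made-up member name:)

```cpp
// Hypothetical example of the kind of annotation the follow-up adds: a
// CNodeState field that is only touched from the message processing thread.
struct CNodeState {
    //! Illustrative member name only, not from the actual follow-up.
    bool m_example_flag GUARDED_BY(NetEventsInterface::g_msgproc_mutex){false};
    // ... existing members ...
};
```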

@fanquake fanquake merged commit 5b6f0f3 into bitcoin:master Sep 20, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 20, 2022
@dergoegge
Member

Opened a follow-up (#26140) that includes what I mentioned in my previous comment.

@bitcoin bitcoin locked and limited conversation to collaborators Sep 20, 2023