-
Notifications
You must be signed in to change notification settings - Fork 37.4k
p2p: Avoid forwarding ADDR messages to SPV nodes #17194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It is good for SPV nodes to get addrs sent to them, but it doesn't benefit the network as much as sending addrs to a full node. Ensuring addrs go to 2 full node peers without excluding SPV nodes seems to capture the requirements well. |
Concept ACK Thanks for working on this! |
Do you mean unsolicited addrs? If so, I tend to disagree. That would make it harder to run a bandwidth efficient non-addr-relay node (e.g. an "SPV node"). I think that addr messages should only be gossiped to bip-155 nodes or network nodes (detectable via service bit). All other nodes should send a |
I think it's a bit more nuanced than that. In an ideal world, I think light clients would participate in local IP address management. Perhaps they wouldn't relay addresses themselves, but even if not, they should be learning from IP addresses that are gossiped around. In reality it seems that none of them do, and obviously gossipping IP addresses to nodes that are just going to ignore them is pointless. |
I wanted to note that one SPV client I'm aware of does (or at least did) use addr messages to find nodes. E.g., see this code: However, that wallet also uses BIP37 bloom filters, so I don't know what their strategy will be going forward with regards to using the P2P protocol. Directly relevant to this PR, it looks like they don't accept unsolicited addr messages anyway---they only accept responses to a getaddr message: https://github.com/breadwallet/breadwallet-core/blob/494aa99ed61df24d13e45f6b335cda192c1b765c/bitcoin/BRPeer.c#L282 |
So, it seems like most of us agree that we should allow light clients to request addrs, and we should not send them unsolicited addr messages assuming they forward them. Just wanted to clarify, it is exactly what this PR does. |
I updated the script to account for black holes and be more precise in general (original post is updated to reflect new numbers). |
@naumenkogs I looked through the commit and it wasn't completely obvious to me, how I know I would advocate for a comment explaining the intention of the call: "We are calling |
No. Unfortunately, right now there is no way to distinguish forwarding SPV nodes and non-forwarding SPV nodes. Fyi, service bits Now I agree that documenting and renaming the function would be useful here, will do. |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Here is a functional test that demonstrates this patch works as intended. functional testdiff --git c/test/functional/p2p_addr_relay.py i/test/functional/p2p_addr_relay.py
new file mode 100755
index 000000000..7b74888f0
--- /dev/null
+++ i/test/functional/p2p_addr_relay.py
@@ -0,0 +1,75 @@
+#!/usr/bin/env python3
+# Copyright (c) 2017-2019 The Bitcoin Core developers
+# Distributed under the MIT software license, see the accompanying
+# file COPYING or http://www.opensource.org/licenses/mit-license.php.
+"""Tests address relaying to SPV nodes (aka clients aka ~(NODE_NETWORK | NODE_NETWORK_LIMITED))
+
+* set up nodeRelay (bitcoind)
+* connect two mini nodes to nodeRelay - node1 and node2
+* fire an "addr message" from node1 to nodeRelay
+* observe if node2 will get the address relayed through nodeRelay
+
+Results should be:
+
+Without https://github.com/bitcoin/bitcoin/pull/17194:
+node2 services=NODE_WITNESS | NODE_NETWORK_LIMITED -> relayed
+node2 services=NODE_WITNESS | NODE_NETWORK -> relayed
+node2 services=NODE_WITNESS -> relayed
+
+With https://github.com/bitcoin/bitcoin/pull/17194:
+node2 services=NODE_WITNESS | NODE_NETWORK_LIMITED -> relayed
+node2 services=NODE_WITNESS | NODE_NETWORK -> relayed
+node2 services=NODE_WITNESS -> not relayed
+"""
+from test_framework.messages import CAddress, msg_addr, NODE_NETWORK, NODE_NETWORK_LIMITED, NODE_WITNESS
+from test_framework.mininode import P2PInterface, mininode_lock
+from test_framework.test_framework import BitcoinTestFramework
+from test_framework.util import (
+ assert_equal,
+ wait_until,
+)
+
+import time
+
+class P2Paddr(P2PInterface):
+ def wait_for_addr(self, timeout=5):
+ test_function = lambda: self.last_message.get("addr")
+ wait_until(test_function, timeout=timeout, lock=mininode_lock)
+
+class NodeNetworkLimitedTest(BitcoinTestFramework):
+ def set_test_params(self):
+ self.setup_clean_chain = True
+ self.num_nodes = 1
+ self.extra_args = [[]]
+
+ def setup_network(self):
+ self.add_nodes(self.num_nodes, self.extra_args)
+ self.start_nodes()
+
+ def run_test(self):
+ nodeRelay = self.nodes[0]
+
+ node1 = nodeRelay.add_p2p_connection(P2Paddr(), services=NODE_WITNESS | NODE_NETWORK)
+ #node2 = nodeRelay.add_p2p_connection(P2Paddr(), services=NODE_WITNESS | NODE_NETWORK)
+ #node2 = nodeRelay.add_p2p_connection(P2Paddr(), services=NODE_WITNESS | NODE_NETWORK_LIMITED)
+ node2 = nodeRelay.add_p2p_connection(P2Paddr(), services=NODE_WITNESS)
+
+ addr = CAddress()
+ addr.time = int(time.time())
+ addr.nServices = NODE_NETWORK | NODE_WITNESS
+ addr.ip = "123.123.123.123"
+ addr.port = 8333
+
+ msg = msg_addr()
+ msg.addrs.append(addr)
+
+ self.log.info("Checking how nodeRelay relays the address it received from node1")
+
+
8000
node1.send_message(msg)
+
+ node2.wait_for_addr()
+
+ nodeRelay.disconnect_p2ps()
+
+if __name__ == '__main__':
+ NodeNetworkLimitedTest().main()
diff --git c/src/net_processing.cpp i/src/net_processing.cpp
index 465a0c06e..795a9179f 100644
--- c/src/net_processing.cpp
+++ i/src/net_processing.cpp
@@ -97,13 +97,16 @@ void EraseOrphansFor(NodeId peer);
/** Increase a node's misbehavior score. */
void Misbehaving(NodeId nodeid, int howmuch, const std::string& message="") EXCLUSIVE_LOCKS_REQUIRED(cs_main);
/** Average delay between local address broadcasts in seconds. */
static constexpr unsigned int AVG_LOCAL_ADDRESS_BROADCAST_INTERVAL = 24 * 60 * 60;
/** Average delay between peer address broadcasts in seconds. */
-static const unsigned int AVG_ADDRESS_BROADCAST_INTERVAL = 30;
+// Relay immediately, otherwise we may end up delaying the relay for a long
+// time (without upper limit, see PoissonNextSend()) which makes the
+// p2p_addr_relay.py test fragile because it waits for the relay.
+static const unsigned int AVG_ADDRESS_BROADCAST_INTERVAL = 0;
/** Average delay between trickled inventory transmissions in seconds.
* Blocks and whitelisted receivers bypass this, outbound peers get half this delay. */
static const unsigned int INVENTORY_BROADCAST_INTERVAL = 5;
/** Maximum number of inventory items to send per transmission.
* Limits the impact of low-fee transaction floods. */
static constexpr unsigned int INVENTORY_BROADCAST_MAX = 7 * INVENTORY_BROADCAST_INTERVAL;
@@ -1377,13 +1380,20 @@ static void RelayAddress(const CAddress& addr, bool fReachable, CConnman* connma
FastRandomContext insecure_rand;
std::array<std::pair<uint64_t, CNode*>,2> best{{{0, nullptr}, {0, nullptr}}};
assert(nRelayNodes <= best.size());
auto sortfunc = [&best, &hasher, nRelayNodes](CNode* pnode) {
- if (pnode->nVersion >= CADDR_TIME_VERSION && pnode->IsAddrRelayPeer() && MayHaveUsefulAddressDB(pnode->nServices)) {
+ if (pnode->nVersion >= CADDR_TIME_VERSION && pnode->IsAddrRelayPeer()
+#if 1
+ && MayHaveUsefulAddressDB(pnode->nServices)
+#elif 1
+ // same as above
+ && !pnode->fClient
+#endif
+ ) {
uint64_t hashKey = CSipHasher(hasher).Write(pnode->GetId()).Finalize();
for (unsigned int i = 0; i < nRelayNodes; i++) {
if (hashKey > best[i].first) {
std::copy(best.begin() + i, best.begin() + nRelayNodes - 1, best.begin() + i + 1);
best[i] = std::make_pair(hashKey, pnode);
break; Since this PR is not getting traction for some time, if it looks like too risky behavioral change, what about making it optional under a newly introduced config option? |
@vasild Thanks for the test! Will integrate it into this PR. I anticipate that there's not enough attention not because it's too risky, but rather just a natural thing. Maybe a lack of motivation. |
Yes, a config knob is not very attractive either. Btw I had to make this change in the code to make the test pass quickly and deterministically: -static const unsigned int AVG_ADDRESS_BROADCAST_INTERVAL = 30;
+// Relay immediately, otherwise we may end up delaying the relay for a long
+// time (without upper limit, see PoissonNextSend()) which makes the
+// p2p_addr_relay.py test fragile because it waits for the relay.
+static const unsigned int AVG_ADDRESS_BROADCAST_INTERVAL = 0; It cannot be added to the repository in its current form, but is good enough for manual testing. Maybe I overlooked something and it can be done without requiring changes to the code? |
Concept and code review ACK 644eae6 Given evidence that non-full-nodes don't do anything with unsollicited addr relays, and the fact that GETADDR still works, I think it's fine to drop relay to them. It's probably worth announcing this on the bitcoin-dev mailinglist, to ask for comments, and suggest that if there are exceptions, we can work on a service flag for explicitly opting into addr relay. |
@jnewbery @amitiuttarwar Did you want to take a look here? |
644eae6
to
3a1629a
Compare
If the goal is to ensure that ADDR gossips don't disappear into a black hole, then I don't think this is the correct solution.
This seems like a much better solution. We lose nothing by gossiping an address to SPV peers, as long as we also gossip to two "address relay" peers. We don't know whether there are clients that rely on receiving address gossips. After this PR, SPV clients can still request addresses using GETADDR, but we only permit one GETADDR request per connection. If an SPV client wants to request fresh addresses, they'd need to disconnect-reconnect-GETADDR, which seems undesirable.
(http://www.erisian.com.au/bitcoin-core-dev/log-2019-10-17.html#l-591) This seems even better to me. What we actually care about is whether the peer is going to relay the address to more peers, not whether it's advertised that it's able to serve blocks. Whether the peer has previously sent us addresses is our best heuristic for this. |
@jnewbery We seem to have 2 options:
The only real difference is that the first option (current implementation 3a1629a) might exclude some SPV implementations which handle ADDR messages (and send ADDR messages, as required per second option to reliably work), and nobody seen these clients so far. I guess we can ask on the ML whether they exist, but the lack of response won't prove the absence :) I don't mind switching to the second option, but since the activity in this PR is low, I would do it after Concept ACK 2nd option from someone? |
If someone is interested in writing the concept ACK of the second option get in touch with me, happy to review your code provide input. Otherwise I might try to write this. |
This isn't true. We self-announce our address to all peers on a timer: bitcoin/src/net_processing.cpp Lines 4058 to 4065 in a76ccb0
(although in practice most of the time this won't actually result in our address being advertised since it won't have cycled out of It is true that we'll be ineffective at relaying new addresses that we hear about. |
🐙 This pull request conflicts with the target branch and needs rebase. Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft". |
@naumenkogs checking in on the status of the PR. Is this something you plan to continue to pursue? |
If I understand the code at bitcoin/src/net_processing.cpp Lines 2783 to 2786 in da69d99
correctly, the assumption of 120 waves of relay in the simulation is too much, because with nSince = nNow - 10 * 60 , gossip relay for a given addr will just end 10 minutes (=20 waves on average) after the self-announcement, assuming synchronized clocks. So I would expect relay to be worse than the simulations suggest.
|
Closing this because it was partially handled by #21528, but also arguments just get outdated after that PR, and the code as well. |
Background
The primary method of relaying node addresses across the network is daily self-announcement and forwarding self-announcements of other nodes.
To rate-limit those, we relay them only to 1 or 2 peers (depending on whether the address is reachable for us).
Unfortunately, in current implementation there is a chance of choosing SPV nodes, which, to the best of my knowledge, currently do not forward those announcements.
Discussion
During the meeting we reached a soft consensus that address relay participation should not be coupled to SPV/full node definition, but rather have an explicit flag (e.g. service flag).
Until that’s getting resolved, I’m suggesting a tiny patch that prevent forwarding (they still can request and learn) these announcements to SPV nodes.
Analysis
To justify the change, I decided to measure how does this relay perform in different conditions (see my script).
I measured how many nodes know about the new address relayed through this protocol (omitting batching <10 rule) within 120 “waves” of relay, each happening every 30 seconds (which is roughly 1 hour).
Let’s see what if the address is reachable to 5% of the network (I guess pretty fair assumption for Onion addresses, for example). Let's call those "exotic".
Let's also simulate having 10% of black holes, which are indistinguishable from real nodes, but don't forward anything.
If you don't believe it's that bad, please make your own simulation to validate mine :)
It will become worse and worse with the growth of SPV.
These numbers can also be used to motivate the high-level change (see discussion).
Questions
We can forward it to these two groups IN ADDITION to forwarding to (1 or 2) peers, if there is a belief that this is might affect their security. The bandwidth overhead would be obviously very low.
P.S.
This experiment also means that we're very vulnerable to just black holes when relaying exotic addresses, but I will write about it in a separate issue.Imrpoved the script to consider black holes.
P.P.S.
I guess the biggest inaccuracy in my little experiment is that in practice, those exotic semi-reachable nodes are well-interconnected, so once the address makes it to one of those — it should be relayed a bit better. Maybe, I will try to measure.Anyway, this does not disregard the problem.
Improved the script to take exotic inter-connectivity into account and updated the measurements.