8000 p2p: Avoid forwarding ADDR messages to SPV nodes by naumenkogs · Pull Request #17194 · bitcoin/bitcoin · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

p2p: Avoid forwarding ADDR messages to SPV nodes #17194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

naumenkogs
Copy link
Member
@naumenkogs naumenkogs commented Oct 18, 2019

Background

The primary method of relaying node addresses across the network is daily self-announcement and forwarding self-announcements of other nodes.
To rate-limit those, we relay them only to 1 or 2 peers (depending on whether the address is reachable for us).
Unfortunately, in current implementation there is a chance of choosing SPV nodes, which, to the best of my knowledge, currently do not forward those announcements.

Discussion

During the meeting we reached a soft consensus that address relay participation should not be coupled to SPV/full node definition, but rather have an explicit flag (e.g. service flag).
Until that’s getting resolved, I’m suggesting a tiny patch that prevent forwarding (they still can request and learn) these announcements to SPV nodes.

Analysis

To justify the change, I decided to measure how does this relay perform in different conditions (see my script).
I measured how many nodes know about the new address relayed through this protocol (omitting batching <10 rule) within 120 “waves” of relay, each happening every 30 seconds (which is roughly 1 hour).

Let’s see what if the address is reachable to 5% of the network (I guess pretty fair assumption for Onion addresses, for example). Let's call those "exotic".
Let's also simulate having 10% of black holes, which are indistinguishable from real nodes, but don't forward anything.

non-forwarding SPV nodes 5% 10% 20% 30%
All nodes knowing addr (before the change) 56% 48% 30% 22%
All nodes knowing addr (after the change) 55% 54% 58% 40%
Exotic nodes knowing addr (before the change) 80% 75% 55% 50%
Exotic nodes knowing addr (after the change) 76% 77% 75% 75%

If you don't believe it's that bad, please make your own simulation to validate mine :)

It will become worse and worse with the growth of SPV.

These numbers can also be used to motivate the high-level change (see discussion).

Questions

  1. Would we make SPVs more vulnerable to addrman injections? Like, would it be easier for an attacker to fill their addrmans with malicious sybil nodes? (I guess we are already vulnerable it assuming some capabilities of an attacker, but would it be noticeably worse? EthanHeilman)
  2. Does the same apply to the old pruned nodes? (achow101 I think you mentioned that older nodes would also be excluded here). Should we forward these announcements to them, if this makes them noticeably more vulnerable?

We can forward it to these two groups IN ADDITION to forwarding to (1 or 2) peers, if there is a belief that this is might affect their security. The bandwidth overhead would be obviously very low.

P.S.

This experiment also means that we're very vulnerable to just black holes when relaying exotic addresses, but I will write about it in a separate issue.
Imrpoved the script to consider black holes.

P.P.S.

I guess the biggest inaccuracy in my little experiment is that in practice, those exotic semi-reachable nodes are well-interconnected, so once the address makes it to one of those — it should be relayed a bit better. Maybe, I will try to measure.
Anyway, this does not disregard the problem.

Improved the script to take exotic inter-connectivity into account and updated the measurements.

@EthanHeilman
Copy link
Contributor

We can forward it to these two groups IN ADDITION to forwarding to (1 or 2) peers, if there is a belief that this is might affect their security. The bandwidth overhead would be obviously very low.

It is good for SPV nodes to get addrs sent to them, but it doesn't benefit the network as much as sending addrs to a full node. Ensuring addrs go to 2 full node peers without excluding SPV nodes seems to capture the requirements well.

@fanquake fanquake requested a review from sdaftuar October 18, 2019 22:03
@practicalswift
Copy link
Contributor

Concept ACK

Thanks for working on this!

@maflcko
Copy link
Member
maflcko commented Oct 22, 2019

It is good for SPV nodes to get addrs sent to them

Do you mean unsolicited addrs? If so, I tend to disagree. That would make it harder to run a bandwidth efficient non-addr-relay node (e.g. an "SPV node"). I think that addr messages should only be gossiped to bip-155 nodes or network nodes (detectable via service bit). All other nodes should send a getaddr or getaddrv2 p2p message to solicit addresses.

@sipa
Copy link
Member
sipa commented Oct 22, 2019

I think it's a bit more nuanced than that.

In an ideal world, I think light clients would participate in local IP address management. Perhaps they wouldn't relay addresses themselves, but even if not, they should be learning from IP addresses that are gossiped around.

In reality it seems that none of them do, and obviously gossipping IP addresses to nodes that are just going to ignore them is pointless.

@harding
Copy link
Contributor
harding commented Nov 3, 2019

I wanted to note that one SPV client I'm aware of does (or at least did) use addr messages to find nodes. E.g., see this code:

However, that wallet also uses BIP37 bloom filters, so I don't know what their strategy will be going forward with regards to using the P2P protocol. Directly relevant to this PR, it looks like they don't accept unsolicited addr messages anyway---they only accept responses to a getaddr message: https://github.com/breadwallet/breadwallet-core/blob/494aa99ed61df24d13e45f6b335cda192c1b765c/bitcoin/BRPeer.c#L282

@naumenkogs
Copy link
Member Author
naumenkogs commented Nov 5, 2019

So, it seems like most of us agree that we should allow light clients to request addrs, and we should not send them unsolicited addr messages assuming they forward them.

Just wanted to clarify, it is exactly what this PR does.

@naumenkogs
Copy link
Member Author
naumenkogs commented Dec 1, 2019

I updated the script to account for black holes and be more precise in general (original post is updated to reflect new numbers).
The result is: current behavior is actually not as bad as I thought.
I still think it's worth merging it though, especially because it will take a while for us to figure out all addrV2 stuff.
So let's save some bandwidth and make address relay be more effective for the cost of merging one line.

@EthanHeilman
Copy link
Contributor

@naumenkogs
If I understand you correctly this commit now excludes sending addrs to non-forward nodes, but would send addrs to SPV clients that do forward addrs?

I looked through the commit and it wasn't completely obvious to me, how MayHaveUsefulAddressDB excludes non-forwarding nodes?

I know MayHaveUsefulAddressDB checks the service flags and I read the service flag descriptions in the code but the flags seem obscure to me i.e. I have trouble understanding precisely what they denote. Also one imagines that a non-forwarding node may have a "useful address DB" so the name of the function call is slightly deceptive.

I would advocate for a comment explaining the intention of the call: "We are calling MayHaveUsefulAddressDB to exclude non-forwarding nodes."

@naumenkogs
Copy link
Member Author
naumenkogs commented Dec 2, 2019

@EthanHeilman

this commit now excludes sending addrs to non-forward nodes, but would send addrs to SPV clients that do forward addrs?

No. Unfortunately, right now there is no way to distinguish forwarding SPV nodes and non-forwarding SPV nodes.
Based on the implementations of SPV nodes I found, none of them forward addrs, so I suggest not forwarding to all SPV nodes for now (until we progress with an explicit opt-in addr forwarding).
Also we found that some implementations would still request addrs, and after this PR they would still be allowed to do so.

Fyi, service bits NODE_NETWORK | NODE_NETWORK_LIMITED currently literally mean "everything except SPV".

Now I agree that documenting and renaming the function would be useful here, will do.

@DrahtBot
Copy link
Contributor
DrahtBot commented Jan 17, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@vasild
Copy link
Contributor
vasild commented Mar 30, 2020

Here is a functional test that demonstrates this patch works as intended.

functional test
diff --git c/test/functional/p2p_addr_relay.py i/test/functional/p2p_addr_relay.py
new file mode 100755
index 000000000..7b74888f0
--- /dev/null
+++ i/test/functional/p2p_addr_relay.py
@@ -0,0 +1,75 @@
+#!/usr/bin/env python3
+# Copyright (c) 2017-2019 The Bitcoin Core developers
+# Distributed under the MIT software license, see the accompanying
+# file COPYING or http://www.opensource.org/licenses/mit-license.php.
+"""Tests address relaying to SPV nodes (aka clients aka ~(NODE_NETWORK | NODE_NETWORK_LIMITED))
+
+* set up nodeRelay (bitcoind)
+* connect two mini nodes to nodeRelay - node1 and node2
+* fire an "addr message" from node1 to nodeRelay
+* observe if node2 will get the address relayed through nodeRelay
+
+Results should be:
+
+Without https://github.com/bitcoin/bitcoin/pull/17194:
+node2 services=NODE_WITNESS | NODE_NETWORK_LIMITED -> relayed
+node2 services=NODE_WITNESS | NODE_NETWORK -> relayed
+node2 services=NODE_WITNESS -> relayed
+
+With https://github.com/bitcoin/bitcoin/pull/17194:
+node2 services=NODE_WITNESS | NODE_NETWORK_LIMITED -> relayed
+node2 services=NODE_WITNESS | NODE_NETWORK -> relayed
+node2 services=NODE_WITNESS -> not relayed
+"""
+from test_framework.messages import CAddress, msg_addr, NODE_NETWORK, NODE_NETWORK_LIMITED, NODE_WITNESS
+from test_framework.mininode import P2PInterface, mininode_lock
+from test_framework.test_framework import BitcoinTestFramework
+from test_framework.util import (
+    assert_equal,
+    wait_until,
+)
+
+import time
+
+class P2Paddr(P2PInterface):
+    def wait_for_addr(self, timeout=5):
+        test_function = lambda: self.last_message.get("addr")
+        wait_until(test_function, timeout=timeout, lock=mininode_lock)
+
+class NodeNetworkLimitedTest(BitcoinTestFramework):
+    def set_test_params(self):
+        self.setup_clean_chain = True
+        self.num_nodes = 1
+        self.extra_args = [[]]
+
+    def setup_network(self):
+        self.add_nodes(self.num_nodes, self.extra_args)
+        self.start_nodes()
+
+    def run_test(self):
+        nodeRelay = self.nodes[0]
+
+        node1 = nodeRelay.add_p2p_connection(P2Paddr(), services=NODE_WITNESS | NODE_NETWORK)
+        #node2 = nodeRelay.add_p2p_connection(P2Paddr(), services=NODE_WITNESS | NODE_NETWORK)
+        #node2 = nodeRelay.add_p2p_connection(P2Paddr(), services=NODE_WITNESS | NODE_NETWORK_LIMITED)
+        node2 = nodeRelay.add_p2p_connection(P2Paddr(), services=NODE_WITNESS)
+
+        addr = CAddress()
+        addr.time = int(time.time())
+        addr.nServices = NODE_NETWORK | NODE_WITNESS
+        addr.ip = "123.123.123.123"
+        addr.port = 8333
+
+        msg = msg_addr()
+        msg.addrs.append(addr)
+
+        self.log.info("Checking how nodeRelay relays the address it received from node1")
+
+
8000
        node1.send_message(msg)
+
+        node2.wait_for_addr()
+
+        nodeRelay.disconnect_p2ps()
+
+if __name__ == '__main__':
+    NodeNetworkLimitedTest().main()
diff --git c/src/net_processing.cpp i/src/net_processing.cpp
index 465a0c06e..795a9179f 100644
--- c/src/net_processing.cpp
+++ i/src/net_processing.cpp
@@ -97,13 +97,16 @@ void EraseOrphansFor(NodeId peer);
 /** Increase a node's misbehavior score. */
 void Misbehaving(NodeId nodeid, int howmuch, const std::string& message="") EXCLUSIVE_LOCKS_REQUIRED(cs_main);
 
 /** Average delay between local address broadcasts in seconds. */
 static constexpr unsigned int AVG_LOCAL_ADDRESS_BROADCAST_INTERVAL = 24 * 60 * 60;
 /** Average delay between peer address broadcasts in seconds. */
-static const unsigned int AVG_ADDRESS_BROADCAST_INTERVAL = 30;
+// Relay immediately, otherwise we may end up delaying the relay for a long
+// time (without upper limit, see PoissonNextSend()) which makes the
+// p2p_addr_relay.py test fragile because it waits for the relay.
+static const unsigned int AVG_ADDRESS_BROADCAST_INTERVAL = 0;
 /** Average delay between trickled inventory transmissions in seconds.
  *  Blocks and whitelisted receivers bypass this, outbound peers get half this delay. */
 static const unsigned int INVENTORY_BROADCAST_INTERVAL = 5;
 /** Maximum number of inventory items to send per transmission.
  *  Limits the impact of low-fee transaction floods. */
 static constexpr unsigned int INVENTORY_BROADCAST_MAX = 7 * INVENTORY_BROADCAST_INTERVAL;
@@ -1377,13 +1380,20 @@ static void RelayAddress(const CAddress& addr, bool fReachable, CConnman* connma
     FastRandomContext insecure_rand;
 
     std::array<std::pair<uint64_t, CNode*>,2> best{{{0, nullptr}, {0, nullptr}}};
     assert(nRelayNodes <= best.size());
 
     auto sortfunc = [&best, &hasher, nRelayNodes](CNode* pnode) {
-        if (pnode->nVersion >= CADDR_TIME_VERSION && pnode->IsAddrRelayPeer() && MayHaveUsefulAddressDB(pnode->nServices)) {
+        if (pnode->nVersion >= CADDR_TIME_VERSION && pnode->IsAddrRelayPeer()
+#if 1
+                && MayHaveUsefulAddressDB(pnode->nServices)
+#elif 1
+                // same as above
+                && !pnode->fClient
+#endif
+                ) {
             uint64_t hashKey = CSipHasher(hasher).Write(pnode->GetId()).Finalize();
             for (unsigned int i = 0; i < nRelayNodes; i++) {
                  if (hashKey > best[i].first) {
                      std::copy(best.begin() + i, best.begin() + nRelayNodes - 1, best.begin() + i + 1);
                      best[i] = std::make_pair(hashKey, pnode);
                      break;

Since this PR is not getting traction for some time, if it looks like too risky behavioral change, what about making it optional under a newly introduced config option?

@naumenkogs
Copy link
Member Author

@vasild Thanks for the test! Will integrate it into this PR.

I anticipate that there's not enough attention not because it's too risky, but rather just a natural thing. Maybe a lack of motivation.
I would say that adding optional would make it even less attractive to review, because of new complexity.

@vasild
Copy link
Contributor
vasild commented Mar 30, 2020

Yes, a config knob is not very attractive either.

Btw I had to make this change in the code to make the test pass quickly and deterministically:

-static const unsigned int AVG_ADDRESS_BROADCAST_INTERVAL = 30;
+// Relay immediately, otherwise we may end up delaying the relay for a long
+// time (without upper limit, see PoissonNextSend()) which makes the
+// p2p_addr_relay.py test fragile because it waits for the relay.
+static const unsigned int AVG_ADDRESS_BROADCAST_INTERVAL = 0;

It cannot be added to the repository in its current form, but is good enough for manual testing. Maybe I overlooked something and it can be done without requiring changes to the code?

@sipa
Copy link
Member
sipa commented Jul 7, 2020

Concept and code review ACK 644eae6

Given evidence that non-full-nodes don't do anything with unsollicited addr relays, and the fact that GETADDR still works, I think it's fine to drop relay to them.

It's probably worth announcing this on the bitcoin-dev mailinglist, to ask for comments, and suggest that if there are exceptions, we can work on a service flag for explicitly opting into addr relay.

@fanquake
Copy link
Member

@jnewbery @amitiuttarwar Did you want to take a look here?

@naumenkogs naumenkogs force-pushed the addr_relay_optimization branch from 644eae6 to 3a1629a Compare July 27, 2020 07:21
@jnewbery
Copy link
Contributor
jnewbery commented Jul 27, 2020

If the goal is to ensure that ADDR gossips don't disappear into a black hole, then I don't think this is the correct solution.

Ensuring addrs go to 2 full node peers without excluding SPV nodes seems to capture the requirements well.

This seems like a much better solution. We lose nothing by gossiping an address to SPV peers, as long as we also gossip to two "address relay" peers.

We don't know whether there are clients that rely on receiving address gossips. After this PR, SPV clients can still request addresses using GETADDR, but we only permit one GETADDR request per connection. If an SPV client wants to request fresh addresses, they'd need to disconnect-reconnect-GETADDR, which seems undesirable.

<aj> how about just biassing against picking peers for the 1 or 2 to relay for if those peers haven't sent us ADDR messages already, if that makes sense?

(http://www.erisian.com.au/bitcoin-core-dev/log-2019-10-17.html#l-591)

This seems even better to me. What we actually care about is whether the peer is going to relay the address to more peers, not whether it's advertised that it's able to serve blocks. Whether the peer has previously sent us addresses is our best heuristic for this.

@naumenkogs
Copy link
Member Author
naumenkogs commented Jul 28, 2020

@jnewbery
I don't have a strong opinion on the approach, indeed the goal is to make sure ADDRs don't go into black hole. Otherwise, a node with 8 outbound peers and 100 inbound SPVs will be self-announcing very poorly. (wrong, see response)

We seem to have 2 options:

  • don't send to SPV
  • make sure we choose those who sent us ADDR before

The only real difference is that the first option (current implementation 3a1629a) might exclude some SPV implementations which handle ADDR messages (and send ADDR messages, as required per second option to reliably work), and nobody seen these clients so far. I guess we can ask on the ML whether they exist, but the lack of response won't prove the absence :)

I don't mind switching to the second option, but since the activity in this PR is low, I would do it after Concept ACK 2nd option from someone?

@EthanHeilman
Copy link
Contributor

If someone is interested in writing the concept ACK of the second option get in touch with me, happy to review your code provide input. Otherwise I might try to write this.

@jnewbery
Copy link
Contributor

Otherwise, a node with 8 outbound peers and 100 inbound SPVs will be self-announcing very poorly.

This isn't true. We self-announce our address to all peers on a timer:

bitcoin/src/net_processing.cpp

Lines 4058 to 4065 in a76ccb0

// Address refresh broadcast
int64_t nNow = GetTimeMicros();
auto current_time = GetTime<std::chrono::microseconds>();
if (pto->IsAddrRelayPeer() && !::ChainstateActive().IsInitialBlockDownload() && pto->m_next_local_addr_send < current_time) {
AdvertiseLocal(pto);
pto->m_next_local_addr_send = PoissonNextSend(current_time, AVG_LOCAL_ADDRESS_BROADCAST_INTERVAL);
}

(although in practice most of the time this won't actually result in our address being advertised since it won't have cycled out of m_addr_known yet).

It is true that we'll be ineffective at relaying new addresses that we hear about.

@DrahtBot
Copy link
Contributor
DrahtBot commented Sep 3, 2020

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@adamjonas
Copy link
Member

@naumenkogs checking in on the status of the PR. Is this something you plan to continue to pursue?

@mzumsande
Copy link
Contributor

If I understand the code at

bitcoin/src/net_processing.cpp

Lines 2783 to 2786 in da69d99

if (addr.nTime > nSince && !peer->m_getaddr_sent && vAddr.size() <= 10 && addr.IsRoutable()) {
// Relay to a limited number of other nodes
RelayAddress(pfrom.GetId(), addr, fReachable);
}

correctly, the assumption of 120 waves of relay in the simulation is too much, because with nSince = nNow - 10 * 60, gossip relay for a given addr will just end 10 minutes (=20 waves on average) after the self-announcement, assuming synchronized clocks. So I would expect relay to be worse than the simulations suggest.

@naumenkogs
Copy link
Member Author

Closing this because it was partially handled by #21528, but also arguments just get outdated after that PR, and the code as well.

@naumenkogs naumenkogs closed this Sep 15, 2021
@bitcoin bitcoin locked and limited conversation to collaborators Oct 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0