-
Notifications
You must be signed in to change notification settings - Fork 37.4k
fuzz: Add tx_pool fuzz target #21142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ced2dbe
to
cf801da
Compare
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsNo conflicts as of last run. |
Concept ACK: very nice to see the mempool logic more thoroughly fuzzed! |
cf801da
to
4e197bb
Compare
4e197bb
to
faa542a
Compare
Rebased |
faa542a
to
e4e253d
Compare
Tested ACK e4e253d
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, very excited to see fuzzing in ATMP :) I have a lot of questions.
Got it running and have reviewed tx_pool.cpp so far. My biggest suggestion is to try to hit the RBF code by keeping track of outpoints in the mempool and potentially pulling input(s) from there as well.
I'm a fuzzing noob, ran into a sanitizer error on my mac so I ran it in docker instead
(Commands if anyone's interested)
docker run -it ubuntu:bionic /bin/bash
apt update
apt install -y git
apt install -y sudo
git clone https://github.com/bitcoin/bitcoin
cd bitcoin
git remote add marco https://github.com/MarcoFalke/bitcoin-core.git
git fetch marco 2102-fuzzPool
git checkout 2102-fuzzPool
sudo apt-get install build-essential libtool autotools-dev automake pkg-config bsdmainutils python3
sudo apt-get install libevent-dev libboost-system-dev libboost-filesystem-dev libboost-test-dev libboost-thread-dev
sudo apt-get install clang
./autogen.sh
CC=clang CXX=clang++ ./configure --enable-fuzz --with-sanitizers=address,fuzzer,undefined --without-gui --disable-wallet
make
FUZZ=tx_pool src/test/fuzz/fuzz
593c7af
to
bcf96cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Light ACK bcf96cd
Good to see added and improved coverage and efficiency. AFAICT after the first minute, the tx_pool
fuzzer is doing roughly 2-3 times the exec/s of the process_message_tx
and the tx_pool_standard
fuzzers.
I pushed an update for RBF. Will push some more stuff tomorrow 😴 |
ACK fa53822 Reviewed the code and ran the fuzz coverage tests before & after running for a few hours. Definitely getting better coverage with this, especially with the mempool/validation files. Before
After
|
fae8741
to
fab3bee
Compare
fab6b16
to
fae1617
Compare
@MarcoFalke as you requested in the review club yesterday, I ran the if (accepted) {
txids.push_back(tx->GetHash());
+ std::cout << "\n\n\n*********************** SUCCESS ***************************\n\n\n";
}
|
I quickly ran 15kk iterations (double the 7kk iterations of yours) and it did hit a valid tx.
However, I pushed a patch, where you should see a valid tx within ~100k iterations. All feedback has been addressed, this is now ready for re-ACKs. |
Valid tx within a few seconds with latest push.
Edit: and many valid txs. |
reACK faa9ef4 Getting a better overall coverage profile (though I have been running for a lot longer and probably have better seeds by now)
Plus got 378 accepted transactions for the |
Tested ACK faa9ef4 Very nice fuzzing harness! Thanks @MarcoFalke! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code review ACK faa9ef4, a bunch of comments but non blocking
@MarcoFalke why do you call it tx_pool, is it the cool way of saying mempool? I would've called this mempool_validation.cpp and s/tx_pool/mempool
constexpr CAmount SUPPLY_TOTAL{COINBASE_MATURITY * 50 * COIN}; | ||
|
||
CTxMemPool tx_pool_{/* estimator */ nullptr, /* check_ratio */ 1}; | ||
MockedTxPool& tx_pool = *(MockedTxPool*)&tx_pool_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Why can't you use a static_cast
here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done in follow-up
amount_view.GetCoin(outpoint, c); | ||
Assert(!c.IsSpent()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can just be:
amount_view.GetCoin(outpoint, c); | |
Assert(!c.IsSpent()); | |
Assert(amount_view.GetCoin(outpoint, c)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done in follow-up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it better to not have side-effects from asserts?
const auto& node = g_setup->m_node; | ||
|
||
std::vector<uint256> txids; | ||
for (const auto& outpoint : g_outpoints_coinbase_init) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: is the same g_outpoints_coinbase_init
used for both the tx_pool and tx_pool_standard targets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
} | ||
for (int i{0}; i <= 3; ++i) { | ||
// Add some immature and non-existent outpoints | ||
txids.push_back(g_outpoints_coinbase_init.at(i).hash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't the ones at the beginning the mature ones?
It could also be more clear to have two sets g_outpoints_coinbases_mature
and g_outpoints_coinbases_immature
.
txids.push_back(g_outpoints_coinbase_init.at(i).hash); | |
txids.push_back(g_outpoints_coinbase_init.at(COINBASE_MATURITY + i).hash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ouch 🤦 . Good catch.
auto pop = outpoints_rbf.begin(); | ||
std::advance(pop, fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, outpoints_rbf.size() - 1)); | ||
const auto outpoint = *pop; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use PickValue
for this? It could take a bool to PickAndMaybeDelete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that make PickValue
unsuitable for collections that can't erase elements. E.g. std::array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh true
@@ -48,6 +48,16 @@ void CallOneOf(FuzzedDataProvider& fuzzed_data_provider, Callables... callables) | |||
return ((i++ == call_index ? callables() : void()), ...); | |||
} | |||
|
|||
template <typename Collection> | |||
const auto& PickValue(FuzzedDataProvider& fuzzed_data_provider, const Collection& col) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not make this a member function of FuzzedDataProvider
like PickValueInArray
is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FuzzedDataProvider
is taken from upstream, so someone needs to submit this to upstream first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, is it not src/test/fuzz/FuzzedDataProvider.h?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is, but it is only synced from upstream: https://github.com/llvm/llvm-project/blob/main/compiler-rt/include/fuzzer/FuzzedDataProvider.h
private: | ||
protected: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a bit invasive to change all of these mempool members... can you just call something that automatically does RollingFeeUpdate
like tx_pool.GetMinFee()
? Or just set the members you need to protected
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tx_pool.GetMinFee
exists early because of !blockSinceLastRollingFeeBump
, so I can't use that. protected
as a replacement for private
seems fine generally, because it allows tests to mock any private member without having to mark each member individually or add a friend TestClass01
for each class that mocks the mempool. Since mempool isn't derived outside of tests, protected
shouldn't come with any risks either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, makes sense 👍
[[nodiscard]] CMutableTransaction ConsumeTransaction(FuzzedDataProvider& fuzzed_data_provider, const std::optional<std::vector<uint256>>& prevout_txids, const int max_num_in = 10, const int max_num_out = 10) noexcept; | ||
|
||
[[nodiscard]] CScriptWitness ConsumeScriptWitness(FuzzedDataProvider& fuzzed_data_provider, const size_t max_stack_elem_size = 32) noexcept; | ||
|
||
[[nodiscard]] CScript ConsumeScript(FuzzedDataProvider& fuzzed_data_provider, const size_t max_length = 4096, const bool maybe_p2wsh = false) noexcept; | ||
|
||
[[nodiscard]] uint32_t ConsumeSequence(FuzzedDataProvider& fuzzed_data_provider) noexcept; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comments for the util functions? e.g. I think it'd be helpful to document that ConsumeTransaction
will give a tx that's well-formed but not necessarily valid, and it takes prevout_txids
as input but won't necessarily use a txid from there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If prevout_txids
is passed, it should only pick from there. As none of the functions in this file have doxygen comments, so I'll skip them for now. Though, I am more than happy to review a pull adding docs :)
tx_mut.nLockTime = fuzzed_data_provider.ConsumeIntegral<uint32_t>(); | ||
const auto num_in = fuzzed_data_provider.ConsumeIntegralInRange<int>(0, max_num_in); | ||
const auto num_out = fuzzed_data_provider.ConsumeIntegralInRange<int>(0, max_num_out); | ||
for (int i = 0; i < num_in; ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some places you do braced, and in some places you do =
🤷
for (int i = 0; i < num_in; ++i) { | |
for (int i{0}; i < num_in; ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this isn't consistent. Though, I'll leave as is for now and remember to write for (int i{0}; i < ...
in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not for (auto i{0} ...
? 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int
is one character shorter 🩳
// All RBF-spendable outpoints | ||
std::set<COutPoint> outpoints_rbf; | ||
// All outpoints counting toward the total supply (subset of outpoints_rbf) | ||
std::set<COutPoint> outpoints_supply; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments are confusing 😢
outpoints_rbf
includes all spendable outpoints including ones that would be RBFs, but "RBF-spendable" sounds like they're only the ones in the mempool.
outpoints_supply
doesn't actually include all the outpoints that count towards total supply. It doesn't include the mempool inputs which is why we need to add mempool GetTotalFee
to it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
outpoints_rbf
Happy to rename if you have suggestions.
It doesn't include the mempool inputs which is why we need to add mempool GetTotalFee to it...
I don't understand what you mean with "mempool inputs". outpoints_supply
does include all outpoints from the mempool. The reason that GetTotalFee
needs to be added is that the fee in the mempool isn't assigned to an outpoint (yet). The fee is collected in the coinbase transaction in a block, which doesn't exist because this fuzz target doesn't mine any blocks with mempool txs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
outpoints_supply does include all outpoints from the mempool.
A right 🤦 I was confoozed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replied to comments
auto pop = outpoints_rbf.begin(); | ||
std::advance(pop, fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, outpoints_rbf.size() - 1)); | ||
const auto outpoint = *pop; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that make PickValue
unsuitable for collections that can't erase elements. E.g. std::array
constexpr CAmount SUPPLY_TOTAL{COINBASE_MATURITY * 50 * COIN}; | ||
|
||
CTxMemPool tx_pool_{/* estimator */ nullptr, /* check_ratio */ 1}; | ||
MockedTxPool& tx_pool = *(MockedTxPool*)&tx_pool_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done in follow-up
amount_view.GetCoin(outpoint, c); | ||
Assert(!c.IsSpent()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done in follow-up
// All RBF-spendable outpoints | ||
std::set<COutPoint> outpoints_rbf; | ||
// All outpoints counting toward the total supply (subset of outpoints_rbf) | ||
std::set<COutPoint> outpoints_supply; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
outpoints_rbf
Happy to rename if you have suggestions.
It doesn't include the mempool inputs which is why we need to add mempool GetTotalFee to it...
I don't understand what you mean with "mempool inputs". outpoints_supply
does include all outpoints from the mempool. The reason that GetTotalFee
needs to be added is that the fee in the mempool isn't assigned to an outpoint (yet). The fee is collected in the coinbase transaction in a block, which doesn't exist because this fuzz target doesn't mine any blocks with mempool txs.
const auto& node = g_setup->m_node; | ||
|
||
std::vector<uint256> txids; | ||
for (const auto& outpoint : g_outpoints_coinbase_init) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
} | ||
for (int i{0}; i <= 3; ++i) { | ||
// Add some immature and non-existent outpoints | ||
txids.push_back(g_outpoints_coinbase_init.at(i).hash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ouch 🤦 . Good catch.
tx_mut.nLockTime = fuzzed_data_provider.ConsumeIntegral<uint32_t>(); | ||
const auto num_in = fuzzed_data_provider.ConsumeIntegralInRange<int>(0, max_num_in); | ||
const auto num_out = fuzzed_data_provider.ConsumeIntegralInRange<int>(0, max_num_out); | ||
for (int i = 0; i < num_in; ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this isn't consistent. Though, I'll leave as is for now and remember to write for (int i{0}; i < ...
in the future.
@@ -48,6 +48,16 @@ void CallOneOf(FuzzedDataProvider& fuzzed_data_provider, Callables... callables) | |||
return ((i++ == call_index ? callables() : void()), ...); | |||
} | |||
|
|||
template <typename Collection> | |||
const auto& PickValue(FuzzedDataProvider& fuzzed_data_provider, const Collection& col) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FuzzedDataProvider
is taken from upstream, so someone needs to submit this to upstream first.
private: | ||
protected: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tx_pool.GetMinFee
exists early because of !blockSinceLastRollingFeeBump
, so I can't use that. protected
as a replacement for private
seems fine generally, because it allows tests to mock any private member without having to mark each member individually or add a friend TestClass01
for each class that mocks the mempool. Since mempool isn't derived outside of tests, protected
shouldn't come with any risks either.
[[nodiscard]] CMutableTransaction ConsumeTransaction(FuzzedDataProvider& fuzzed_data_provider, const std::optional<std::vector<uint256>>& prevout_txids, const int max_num_in = 10, const int max_num_out = 10) noexcept; | ||
|
||
[[nodiscard]] CScriptWitness ConsumeScriptWitness(FuzzedDataProvider& fuzzed_data_provider, const size_t max_stack_elem_size = 32) noexcept; | ||
|
||
[[nodiscard]] CScript ConsumeScript(FuzzedDataProvider& fuzzed_data_provider, const size_t max_length = 4096, const bool maybe_p2wsh = false) noexcept; | ||
|
||
[[nodiscard]] uint32_t ConsumeSequence(FuzzedDataProvider& fuzzed_data_provider) noexcept; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If prevout_txids
is passed, it should only pick from there. As none of the functions in this file have doxygen comments, so I'll skip them for now. Though, I am more than happy to review a pull adding docs :)
faa9ef4 fuzz: Add tx_pool fuzz targets (MarcoFalke) Pull request description: ACKs for top commit: AnthonyRonning: reACK faa9ef4 practicalswift: Tested ACK faa9ef4 glozow: code review ACK faa9ef4, a bunch of comments but non blocking Tree-SHA512: 8d404398faa46d8e7bf93060a2fe9afd5c0c2bd6e549ff6588d2f3dd1b912dff6c5416d5477c18edecc2e85b00db4fdf4790c3e6597a5149b0d40c9d5014d82f
No description provided.