-
Notifications
You must be signed in to change notification settings - Fork 37.5k
Refactoring CHashWriter & Get{Prevouts,Sequence,Outputs}Hash to SHA256 #18071
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
c772944
to
d4b96ba
Compare
e17c920
to
43c8f66
Compare
utACK |
43c8f66
to
09b832b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
09b832b could mention in the description that several proposals require us to "get access to the single SHA256".
Should we rename GetHash()
to GetDoubleSha256()
?
There are some changes in #17977 to these commits, which should be added here if we merge this separately. |
09b832b
to
b0ac6eb
Compare
One example is here: https://github.com/bitcoin/bitcoin/pull/17977/files#diff-46ec93803811240fe7b2b9e83eac7ba2R94 Here's another: https://github.com/bitcoin/bitcoin/pull/17977/files#diff-05b3817be9a0d72b5ad9fa1cff567facR218 |
Ah those changes are intentional for SHA256Uint256, since we use a r-value version at a callsite I figure it's better to have an explicit version that does that. In this version, I make the r-value version return a hash (forcing lifetime consumption of it's parameter) and make the reference version overwrite it's parameter and return void. It should be harder to misuse this API as the types are more clear in the places we use it. In any case, if there is consensus preferring the approach currently in #17977, we'd want to minimally see changes to the Taproot version to both:
I think those are all the differences, aside from the precomputed caches (which would be inappropriate to modify in this PR)? If there are other changes, please let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK b0ac6eb
Micro-nit: the last commit has a typo in the commit message: Prvouts
.
It's still unresolved if the |
If the intention here is to have function signatures that are different from #17977, then I'm NACKish on this, since it would create additional rebase work for the author and reviewers of that PR. |
I think that consideration can be left for a follow-up. It is a further behavior change that makes sense to review separately to make sure nothing breaks. And this change here is valuable enough without it for #17977 etc. So after it is merged the other PRs can be rebased and the |
I agree with @fjahr, it's best if this PR can make no functional changes. Am happy to make a follow-up PR that can be reviewed independently switching cheaphash to a single sha256 (or as sipa points out, even siphash). @Sjors is that OK? @jnewbery this PR has been open for half a year, and is a part of my review of the Taproot code. The below should be about the only diff required as a result of interface changes, which mostly have to be patched into 41d08f5. This code is more clear because m_*_hash is not touched once it has been set. Regardless of this PR, the function signatures should be changed to a const ref. It's not clear to me that it's defined behavior to pass an r-value as a non-const ref, which is done in the taproot code currently (hence the interface changes here -- clearly no UB). If there is consensus that the interfaces should not change, preferring the approach in #17977, I can refactor this PR to be similar. edit: I guess it's not using UB... I could have sworn somewhere that the input to SHA256Uint256 was a non const... diff --git a/src/script/interpreter.cpp b/src/script/interpreter.cpp
index 9ec960d17..003c873d9 100644
--- a/src/script/interpreter.cpp
+++ b/src/script/interpreter.cpp
@@ -1408,12 +1408,12 @@ void PrecomputedTransactionData::Init(const T& txTo, std::vector<CTxOut> spent_o
// Cache is calculated only for transactions with witness
if (txTo.HasWitness()) {
- m_prevouts_hash = GetPrevoutHash(txTo);
- hashPrevouts = SHA256Uint256(m_prevouts_hash);
- m_sequences_hash = GetSequenceHash(txTo);
- hashSequence = SHA256Uint256(m_sequences_hash);
- m_outputs_hash = GetOutputsHash(txTo);
- hashOutputs = SHA256Uint256(m_outputs_hash);
+ hashPrevouts = m_prevouts_hash = GetPrevoutsSHA256(txTo);
+ SHA256Uint256(hashPrevouts);
+ hashSequence = m_sequences_hash = GetSequencesSHA256(txTo);
+ SHA256Uint256(hashSequence);
+ hashOutputs = m_outputs_hash = GetOutputsSHA256(txTo);
+ SHA256Uint256(hashOutputs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a mild preference for this PR. Both seem correct.
code review ACK b0ac6eb
Nano nits can be ignored.
|
||
void SHA256Uint256(uint256& hash_io) | ||
{ | ||
CSHA256().Write(hash_io.begin(), 32).Finalize(hash_io.begin()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nano nit: hash_io.size()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do this a ton of places in the codebase -- might be nice for someone to add a Write/Finalize interface that is uint256 aware and fix these all at once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #19326 (which only does it for the hash.h ones and not the crypto/* ones, but certainly makes sense as a follow-up there too).
|
||
uint256 SHA256Uint256(uint256&& hash_io) | ||
{ | ||
CSHA256().Write(hash_io.begin(), 32).Finalize(hash_io.begin()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nano nit: hash_io.size()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK
That makes no sense; cheaphash uses a single sha256. This PR changes it to a double hash, so it's a functional change. Unless I'm reading it wrong. |
@Sjors yes you are reading it wrong. The type of ctx changes from double hashed writer to a single hashed writer. |
e9d710c
to
6a9ad6d
Compare
…Outputs}SHA256. Several proposals (Taproot, MuHash, CTV) require access to the single hash.
6a9ad6d
to
03b1cf8
Compare
I've rebased #18071 following the hash.h span changes (but not #19601 -- I can rebase that one if there is a preference for that approach). pre-rebase summary: Please check the rebase diff at your convenience! @sipa it would be good if you could weigh in on if this is acceptable or if #19601 is better. I think since merge of the span.h changes causes #17977 to need to rebase, it might be a decent time to rebase onto this branch without too much extra work. |
* hash_io is read and written over with the result. | ||
*/ | ||
void SHA256Uint256(uint256& hash_io); | ||
uint256 SHA256Uint256(uint256&& hash_io); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the point of having these functions instead of a single uint256 SHA256Uint256(const uint256& input)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One benefit is that both versions operate completely in-place on their arguments (otherwise we're reading from a pointer, writing to the stack, then writing to a pointer). Another benefit is that it is slightly harder to misuse this because when you are operating on a l-value there is no return value, which otherwise you might forget to assign.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That could be solved using NODISCARD instead. I don't think saving a super brief 32 bytes on the stack is even going to be be observable.
Ah, I missed the switch from BOOST_AUTO_TEST_CASE(getcheaphash)
{
CHashWriter ss(SER_DISK, CLIENT_VERSION);
ss << 1;
BOOST_CHECK_EQUAL(ss.GetCheapHash(), 0x8D07CCE5F258F741ULL);
} |
Since #19601 seems to be preferred by @sipa, @practicalswift, and @jnewbery, going to close #18071. |
…}Hash to SHA256 (Alternative to #18071) 9ab4caf Refactor Get{Prevout,Sequence,Outputs}Hash to Get{Prevouts,Sequences,Outputs}SHA256. (Jeremy Rubin) 6510d0f Add SHA256Uint256 helper functions (Jeremy Rubin) b475d7d Add single sha256 call to CHashWriter (Jeremy Rubin) Pull request description: Opened as an alternative to #18071 to be more similar to #17977. I'm fine with either, deferring to others. cc jnewbery Sjors ACKs for top commit: jnewbery: Code review ACK 9ab4caf jonatack: Tested ACK 9ab4caf fjahr: tested ACK 9ab4caf instagibbs: reACK 9ab4caf Tree-SHA512: 93a7a47697f1657f027b18407bdcce16963f6b23d12372e7ac8fd4ee96769b3e2639369f9956fee669cc881b6338641cddfeeef1516c7104cb50ef4b880bb0a7
…Outputs}Hash to SHA256 (Alternative to bitcoin#18071) 9ab4caf Refactor Get{Prevout,Sequence,Outputs}Hash to Get{Prevouts,Sequences,Outputs}SHA256. (Jeremy Rubin) 6510d0f Add SHA256Uint256 helper functions (Jeremy Rubin) b475d7d Add single sha256 call to CHashWriter (Jeremy Rubin) Pull request description: Opened as an alternative to bitcoin#18071 to be more similar to bitcoin#17977. I'm fine with either, deferring to others. cc jnewbery Sjors ACKs for top commit: jnewbery: Code review ACK 9ab4caf jonatack: Tested ACK 9ab4caf fjahr: tested ACK 9ab4caf instagibbs: reACK bitcoin@9ab4caf Tree-SHA512: 93a7a47697f1657f027b18407bdcce16963f6b23d12372e7ac8fd4ee96769b3e2639369f9956fee669cc881b6338641cddfeeef1516c7104cb50ef4b880bb0a7
These are a couple common refactoring changes in both Taproot and CTV that I think can be merged as is. Other than that the two PRs should largely be conflict free (except for caching, which is a relatively trivial fix).
These essentially just allow us to get access to the single SHA256 version of Get{Prevouts,Sequence,Outputs} hash which are used both in the Taproot and CTV spec. They use the single hash versions because they are cheaper to compute, which reduces validation overhead. These refactors are non-functional, and just expose the ability to get the single hash when needed.
The names of Get*Hash are renamed to Get*SHA256 to avoid confusion with the value returned previously.