8000 perf(types): 3x speedup MakePartSet by ValarDragon · Pull Request #3117 · cometbft/cometbft · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

perf(types): 3x speedup MakePartSet #3117

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 28, 2024
Merged

perf(types): 3x speedup MakePartSet #3117

merged 6 commits into from
May 28, 2024

Conversation

ValarDragon
Copy link
Contributor
@ValarDragon ValarDragon commented May 26, 2024

This PR adds some benchmarks, and significantly speeds up types.MakePartSet, and Partset.AddPart. (Used by the block proposer, and every consensus instance) It does so by doing two things:

  • Saving mutexes on the newly created bit array, by defaulting every value to True (rather than setting it in a loop that goes through a mutex)
  • Uses the same hash object throughout, and avoids an extra copy of every leaf. (main speedup)

I do the same hash optimization for proof.Verify, which is used in the add block part codepath for both the proposer and every full node.

New:

BenchmarkMakePartSet/nParts=1-12         	   38616	     29817 ns/op	     568 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   19888	     59866 ns/op	    1000 B/op	      22 allocs/op
BenchmarkMakePartSet/nParts=3-12         	   12979	     95691 ns/op	    1528 B/op	      33 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    8688	    128192 ns/op	    2024 B/op	      44 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    7308	    155224 ns/op	    2888 B/op	      57 allocs/op

Old:

BenchmarkMakePartSet/nParts=1-12         	   16647	    106545 ns/op	   74169 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   10000	    106361 ns/op	  148329 B/op	      23 allocs/op
BenchmarkMakePartSet/nParts=3-12         	    6992	    337644 ns/op	  222587 B/op	      35 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    3488	    480109 ns/op	  296811 B/op	      47 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    2228	    557768 ns/op	  371404 B/op	      61 allocs/op

System wide, this is definitely not our issue (looks like roughly .1ms per blockpart), but still definitely useful time to remove


PR checklist

  • Tests written/updated
  • Changelog entry added in .changelog (we use unclog to manage our changelog)
  • Updated relevant documentation (docs/ or spec/) and code comments
  • Title follows the Conventional Commits spec

@ValarDragon ValarDragon requested a review from a team as a code owner May 26, 2024 12:58
@ValarDragon ValarDragon requested a review from a team May 26, 2024 12:58
Copy link
Contributor
@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ValarDragon ❤️

Copy link
Contributor
@sergio-mena sergio-mena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing 🎉

@melekes melekes added this pull request to the merge queue May 28, 2024
Merged via the queue into main with commit f94dea5 May 28, 2024
melekes deleted the dev/speedup_make_partset branch May 28, 2024 06:07
@ValarDragon
Copy link
Contributor Author

Ok to backport to 1.x and 38.x ?

ValarDragon added a commit to osmosis-labs/cometbft that referenced this pull request May 29, 2024
<!--

Please add a reference to the issue that this PR addresses and indicate
which
files are most critical to review. If it fully addresses a particular
issue,
please include "Closes #XXX" (where "XXX" is the issue number).

If this PR is non-trivial/large/complex, please ensure that you have
either
created an issue that the team's had a chance to respond to, or had some
discussion with the team prior to submitting substantial pull requests.
The team
can be reached via GitHub Discussions or the Cosmos Network Discord
server in
the #cometbft channel. GitHub Discussions is preferred over Discord as
it
allows us to keep track of conversations topically.
https://github.com/cometbft/cometbft/discussions

If the work in this PR is not aligned with the team's current
priorities, please
be advised that it may take some time before it is merged - especially
if it has
not yet been discussed with the team.

See the project board for the team's current priorities:
https://github.com/orgs/cometbft/projects/1

-->

This PR adds some benchmarks, and significantly speeds up
types.MakePartSet, and Partset.AddPart. (Used by the block proposer, and
every consensus instance) It does so by doing two things:
- Saving mutexes on the newly created bit array, by defaulting every
value to True (rather than setting it in a loop that goes through a
mutex)
- Uses the same hash object throughout, and avoids an extra copy of
every leaf. (main speedup)

I do the same hash optimization for proof.Verify, which is used in the
add block part codepath for both the proposer and every full node.

New:
```
BenchmarkMakePartSet/nParts=1-12         	   38616	     29817 ns/op	     568 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   19888	     59866 ns/op	    1000 B/op	      22 allocs/op
BenchmarkMakePartSet/nParts=3-12         	   12979	     95691 ns/op	    1528 B/op	      33 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    8688	    128192 ns/op	    2024 B/op	      44 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    7308	    155224 ns/op	    2888 B/op	      57 allocs/op
```

Old:
```
BenchmarkMakePartSet/nParts=1-12         	   16647	    106545 ns/op	   74169 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   10000	    106361 ns/op	  148329 B/op	      23 allocs/op
BenchmarkMakePartSet/nParts=3-12         	    6992	    337644 ns/op	  222587 B/op	      35 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    3488	    480109 ns/op	  296811 B/op	      47 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    2228	    557768 ns/op	  371404 B/op	      61 allocs/op
```

System wide, this is definitely not our issue (looks like roughly .1ms
per blockpart), but still definitely useful time to remove

---

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
@melekes
Copy link
Contributor
melekes commented May 29, 2024

@sergio-mena ^

ValarDragon added a commit to osmosis-labs/cometbft that referenced this pull request May 29, 2024
perf(types): 3x speedup MakePartSet (cometbft#3117)
mergify bot pushed a commit to osmosis-labs/cometbft that referenced this pull request May 29, 2024
<!--

Please add a reference to the issue that this PR addresses and indicate
which
files are most critical to review. If it fully addresses a particular
issue,
please include "Closes #XXX" (where "XXX" is the issue number).

If this PR is non-trivial/large/complex, please ensure that you have
either
created an issue that the team's had a chance to respond to, or had some
discussion with the team prior to submitting substantial pull requests.
The team
can be reached via GitHub Discussions or the Cosmos Network Discord
server in
the #cometbft channel. GitHub Discussions is preferred over Discord as
it
allows us to keep track of conversations topically.
https://github.com/cometbft/cometbft/discussions

If the work in this PR is not aligned with the team's current
priorities, please
be advised that it may take some time before it is merged - especially
if it has
not yet been discussed with the team.

See the project board for the team's current priorities:
https://github.com/orgs/cometbft/projects/1

-->

This PR adds some benchmarks, and significantly speeds up
types.MakePartSet, and Partset.AddPart. (Used by the block proposer, and
every consensus instance) It does so by doing two things:
- Saving mutexes on the newly created bit array, by defaulting every
value to True (rather than setting it in a loop that goes through a
mutex)
- Uses the same hash object throughout, and avoids an extra copy of
every leaf. (main speedup)

I do the same hash optimization for proof.Verify, which is used in the
add block part codepath for both the proposer and every full node.

New:
```
BenchmarkMakePartSet/nParts=1-12         	   38616	     29817 ns/op	     568 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   19888	     59866 ns/op	    1000 B/op	      22 allocs/op
BenchmarkMakePartSet/nParts=3-12         	   12979	     95691 ns/op	    1528 B/op	      33 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    8688	    128192 ns/op	    2024 B/op	      44 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    7308	    155224 ns/op	    2888 B/op	      57 allocs/op
```

Old:
```
BenchmarkMakePartSet/nParts=1-12         	   16647	    106545 ns/op	   74169 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   10000	    106361 ns/op	  148329 B/op	      23 allocs/op
BenchmarkMakePartSet/nParts=3-12         	    6992	    337644 ns/op	  222587 B/op	      35 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    3488	    480109 ns/op	  296811 B/op	      47 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    2228	    557768 ns/op	  371404 B/op	      61 allocs/op
```

System wide, this is definitely not our issue (looks like roughly .1ms
per blockpart), but still definitely useful time to remove

---

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit 7b1c1f8)
ValarDragon added a commit to osmosis-labs/cometbft that referenced this pull request May 31, 2024
…pr-83

perf(types): 3x speedup MakePartSet (cometbft#3117) (backport #83)
itsdevbear pushed a commit to berachain/cometbft that referenced this pull request Jul 4, 2024
<!--

Please add a reference to the issue that this PR addresses and indicate
which
files are most critical to review. If it fully addresses a particular
issue,
please include "Closes #XXX" (where "XXX" is the issue number).

If this PR is non-trivial/large/complex, please ensure that you have
either
created an issue that the team's had a chance to respond to, or had some
discussion with the team prior to submitting substantial pull requests.
The team
can be reached via GitHub Discussions or the Cosmos Network Discord
server in
the #cometbft channel. GitHub Discussions is preferred over Discord as
it
allows us to keep track of conversations topically.
https://github.com/cometbft/cometbft/discussions

If the work in this PR is not aligned with the team's current
priorities, please
be advised that it may take some time before it is merged - especially
if it has
not yet been discussed with the team.

See the project board for the team's current priorities:
https://github.com/orgs/cometbft/projects/1

-->

This PR adds some benchmarks, and significantly speeds up
types.MakePartSet, and Partset.AddPart. (Used by the block proposer, and
every consensus instance) It does so by doing two things:
- Saving mutexes on the newly created bit array, by defaulting every
value to True (rather than setting it in a loop that goes through a
mutex)
- Uses the same hash object throughout, and avoids an extra copy of
every leaf. (main speedup)

I do the same hash optimization for proof.Verify, which is used in the
add block part codepath for both the proposer and every full node.

New:
```
BenchmarkMakePartSet/nParts=1-12         	   38616	     29817 ns/op	     568 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   19888	     59866 ns/op	    1000 B/op	      22 allocs/op
BenchmarkMakePartSet/nParts=3-12         	   12979	     95691 ns/op	    1528 B/op	      33 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    8688	    128192 ns/op	    2024 B/op	      44 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    7308	    155224 ns/op	    2888 B/op	      57 allocs/op
```

Old:
```
BenchmarkMakePartSet/nParts=1-12         	   16647	    106545 ns/op	   74169 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   10000	    106361 ns/op	  148329 B/op	      23 allocs/op
BenchmarkMakePartSet/nParts=3-12         	    6992	    337644 ns/op	  222587 B/op	      35 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    3488	    480109 ns/op	  296811 B/op	      47 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    2228	    557768 ns/op	  371404 B/op	      61 allocs/op
```

System wide, this is definitely not our issue (looks like roughly .1ms
per blockpart), but still definitely useful time to remove

---

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit 7b1c1f8)
@melekes
Copy link
Contributor
melekes commented Jul 10, 2024

@mergify backport v1.x

Copy link
Contributor
mergify bot commented Jul 10, 2024

backport v1.x

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Jul 10, 2024
<!--

Please add a reference to the issue that this PR addresses and indicate
which
files are most critical to review. If it fully addresses a particular
issue,
please include "Closes #XXX" (where "XXX" is the issue number).

If this PR is non-trivial/large/complex, please ensure that you have
either
created an issue that the team's had a chance to respond to, or had some
discussion with the team prior to submitting substantial pull requests.
The team
can be reached via GitHub Discussions or the Cosmos Network Discord
server in
the #cometbft channel. GitHub Discussions is preferred over Discord as
it
allows us to keep track of conversations topically.
https://github.com/cometbft/cometbft/discussions

If the work in this PR is not aligned with the team's current
priorities, please
be advised that it may take some time before it is merged - especially
if it has
not yet been discussed with the team.

See the project board for the team's current priorities:
https://github.com/orgs/cometbft/projects/1

-->

This PR adds some benchmarks, and significantly speeds up
types.MakePartSet, and Partset.AddPart. (Used by the block proposer, and
every consensus instance) It does so by doing two things:
- Saving mutexes on the newly created bit array, by defaulting every
value to True (rather than setting it in a loop that goes through a
mutex)
- Uses the same hash object throughout, and avoids an extra copy of
every leaf. (main speedup)

I do the same hash optimization for proof.Verify, which is used in the
add block part codepath for both the proposer and every full node.

New:
```
BenchmarkMakePartSet/nParts=1-12         	   38616	     29817 ns/op	     568 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   19888	     59866 ns/op	    1000 B/op	      22 allocs/op
BenchmarkMakePartSet/nParts=3-12         	   12979	     95691 ns/op	    1528 B/op	      33 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    8688	    128192 ns/op	    2024 B/op	      44 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    7308	    155224 ns/op	    2888 B/op	      57 allocs/op
```

Old:
```
BenchmarkMakePartSet/nParts=1-12         	   16647	    106545 ns/op	   74169 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   10000	    106361 ns/op	  148329 B/op	      23 allocs/op
BenchmarkMakePartSet/nParts=3-12         	    6992	    337644 ns/op	  222587 B/op	      35 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    3488	    480109 ns/op	  296811 B/op	      47 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    2228	    557768 ns/op	  371404 B/op	      61 allocs/op
```

System wide, this is definitely not our issue (looks like roughly .1ms
per blockpart), but still definitely useful time to remove

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit f94dea5)

# Conflicts:
#	.changelog/unreleased/improvements/3117-significantly-speedup-make-partset.md
melekes added a commit that referenced this pull request Jul 10, 2024
This PR adds some benchmarks, and significantly speeds up
types.MakePartSet, and Partset.AddPart. (Used by the block proposer, and
every consensus instance) It does so by doing two things:
- Saving mutexes on the newly created bit array, by defaulting every
value to True (rather than setting it in a loop that goes through a
mutex)
- Uses the same hash object throughout, and avoids an extra copy of
every leaf. (main speedup)

I do the same hash optimization for proof.Verify, which is used in the
add block part codepath for both the proposer and every full node.

New:
```
BenchmarkMakePartSet/nParts=1-12         	   38616	     29817 ns/op	     568 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   19888	     59866 ns/op	    1000 B/op	      22 allocs/op
BenchmarkMakePartSet/nParts=3-12         	   12979	     95691 ns/op	    1528 B/op	      33 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    8688	    128192 ns/op	    2024 B/op	      44 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    7308	    155224 ns/op	    2888 B/op	      57 allocs/op
```

Old:
```
BenchmarkMakePartSet/nParts=1-12         	   16647	    106545 ns/op	   74169 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   10000	    106361 ns/op	  148329 B/op	      23 allocs/op
BenchmarkMakePartSet/nParts=3-12         	    6992	    337644 ns/op	  222587 B/op	      35 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    3488	    480109 ns/op	  296811 B/op	      47 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    2228	    557768 ns/op	  371404 B/op	      61 allocs/op
```

System wide, this is definitely not our issue (looks like roughly .1ms
per blockpart), but still definitely useful time to remove

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclo
8000
g](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request #3117 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
ValarDragon added a commit to osmosis-labs/cometbft that referenced this pull request Aug 19, 2024
…t#3470)

This PR adds some benchmarks, and significantly speeds up
types.MakePartSet, and Partset.AddPart. (Used by the block proposer, and
every consensus instance) It does so by doing two things:
- Saving mutexes on the newly created bit array, by defaulting every
value to True (rather than setting it in a loop that goes through a
mutex)
- Uses the same hash object throughout, and avoids an extra copy of
every leaf. (main speedup)

I do the same hash optimization for proof.Verify, which is used in the
add block part codepath for both the proposer and every full node.

New:
```
BenchmarkMakePartSet/nParts=1-12         	   38616	     29817 ns/op	     568 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   19888	     59866 ns/op	    1000 B/op	      22 allocs/op
BenchmarkMakePartSet/nParts=3-12         	   12979	     95691 ns/op	    1528 B/op	      33 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    8688	    128192 ns/op	    2024 B/op	      44 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    7308	    155224 ns/op	    2888 B/op	      57 allocs/op
```

Old:
```
BenchmarkMakePartSet/nParts=1-12         	   16647	    106545 ns/op	   74169 B/op	      12 allocs/op
BenchmarkMakePartSet/nParts=2-12         	   10000	    106361 ns/op	  148329 B/op	      23 allocs/op
BenchmarkMakePartSet/nParts=3-12         	    6992	    337644 ns/op	  222587 B/op	      35 allocs/op
BenchmarkMakePartSet/nParts=4-12         	    3488	    480109 ns/op	  296811 B/op	      47 allocs/op
BenchmarkMakePartSet/nParts=5-12         	    2228	    557768 ns/op	  371404 B/op	      61 allocs/op
```

System wide, this is definitely not our issue (looks like roughly .1ms
per blockpart), but still definitely useful time to remove

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request cometbft#3117 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0